Estimation of latent class linear mixed models — hlme (2024)


Source: R/hlme.R

hlme.Rd

This function fits linear mixed models and latent class linear mixed models (LCLMM), also known as growth mixture models or heterogeneous linear mixed models. The LCLMM assumes that the population is divided into a finite number of latent classes. Each latent class is characterised by a specific trajectory modelled by a class-specific linear mixed model. Both the latent class membership and the trajectory can be explained by covariates. This function is limited to a mixture of Gaussian outcomes. For other types of outcomes, please see function lcmm. For multivariate longitudinal outcomes, please see multlcmm.

Usage

hlme( fixed, mixture, random, subject, classmb, ng = 1, idiag = FALSE, nwg = FALSE, cor = NULL, data, B, convB = 1e-04, convL = 1e-04, convG = 1e-04, prior, pprior = NULL, maxiter = 500, subset = NULL, na.action = 1, posfix = NULL, verbose = FALSE, returndata = FALSE, var.time = NULL, partialH = FALSE, nproc = 1, clustertype = NULL)

Arguments

fixed

two-sided linear formula object for the fixed-effects in the linear mixed model. The response outcome is on the left of ~ and the covariates are separated by + on the right of ~. By default, an intercept is included. If no intercept, -1 should be the first term included on the right of ~.

mixture

one-sided formula object for the class-specific fixed effects in the linear mixed model (to specify only for a number of latent classes greater than 1). Among the list of covariates included in fixed, the covariates with class-specific regression parameters are entered in mixture separated by +. By default, an intercept is included. If no intercept, -1 should be the first term included.

random

optional one-sided formula for the random-effects in the linear mixed model. Covariates with a random-effect are separated by +. By default, an intercept is included. If no intercept, -1 should be the first term included.

subject

name of the covariate representing the grouping structure, given as a quoted character string (for example subject='ID').

classmb

optional one-sided formula describing the covariates in the class-membership multinomial logistic model. Covariates included are separated by +. By default, classmb=~1 if ng>1.

ng

optional number of latent classes considered. If ng=1 (by default), neither mixture nor classmb should be specified. If ng>1, mixture is required.

idiag

optional logical for the structure of the variance-covariance matrix of the random-effects. If FALSE, an unstructured variance-covariance matrix is considered (by default). If TRUE, a diagonal variance-covariance matrix is considered.

nwg

optional logical indicating if the variance-covariance of the random-effects is class-specific. If FALSE, the variance-covariance matrix is common over latent classes (by default). If TRUE, a class-specific proportional parameter multiplies the variance-covariance matrix in each class (the proportional parameter in the last latent class equals 1 to ensure identifiability).

cor

optional Brownian motion or autoregressive process modeling the correlation between the observations. "BM" or "AR" should be specified, followed by the time variable between brackets. By default, no correlation is added.

data

optional data frame containing the variables named in fixed, mixture, random, classmb and subject.

B

optional specification for the initial values for the parameters. Three options are allowed: (1) a vector of initial values is entered (the order in which the parameters are included is detailed in the Details section). (2) nothing is specified. A preliminary analysis involving the estimation of a standard linear mixed model is performed to choose initial values. (3) when ng>1, a hlme object is entered. It should correspond to the exact same structure of model but with ng=1. The program will automatically generate initial values from this model. This specification avoids the preliminary analysis indicated in (2). Note that due to possible local maxima, the B vector should be specified and several different starting points should be tried.

convB

optional threshold for the convergence criterion based on the parameter stability. By default, convB=0.0001.

convL

optional threshold for the convergence criterion based on the log-likelihood stability. By default, convL=0.0001.

convG

optional threshold for the convergence criterion based on the derivatives. By default, convG=0.0001.

prior

optional name of a covariate containing prior information about the latent class membership. The covariate should be an integer with values in 0,1,...,ng. Value 0 indicates no prior for the subject, while a value in 1,...,ng indicates that the subject belongs to the corresponding latent class.

pprior

optional vector specifying the names of the covariates containing the prior probabilities to belong to each latent class. These probabilities should be between 0 and 1 and should sum up to 1 for each subject.

maxiter

optional maximum number of iterations for the Marquardt iterative algorithm. By default, maxiter=500.

subset

a specification of the rows to be used: defaults to all rows. This can be any valid indexing vector for the rows of data or, if that is not supplied, a data frame made up of the variables used in formula.

na.action

Integer indicating how NAs are managed. The default is 1 for 'na.omit'. The alternative is 2 for 'na.fail'. Other options such as 'na.pass' or 'na.exclude' are not implemented in the current version.

posfix

Optional vector specifying the indices in vector B of the parameters that should not be estimated. Defaults to NULL: all parameters are estimated.

verbose

logical indicating if information about computation should be reported. Defaults to FALSE, as shown in Usage.

returndata

logical indicating if data used for computation should be returned. Defaults to FALSE: data are not returned.

var.time

optional character indicating the name of the time variable.

partialH

optional logical indicating if parameters can be dropped from the Hessian matrix to define convergence criteria.

nproc

the number of cores for parallel computation. Defaults to 1 (sequential mode).

clustertype

optional character indicating the type of cluster for parallel computation.
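The pprior requirement (probabilities between 0 and 1 that sum to 1 for each subject) can be checked before calling hlme. The following is a minimal base R sketch; the helper name check_pprior, the column names p1 and p2, and the tolerance are illustrative assumptions and not part of the lcmm package.

```r
# Hypothetical helper (not part of lcmm): checks that the columns named in
# `cols` hold valid prior class-membership probabilities on every row.
check_pprior <- function(data, cols, tol = 1e-8) {
  p <- as.matrix(data[, cols, drop = FALSE])
  in_range  <- all(p >= 0 & p <= 1)              # each probability in [0, 1]
  sums_to_1 <- all(abs(rowSums(p) - 1) < tol)    # rows sum to 1 (up to tol)
  in_range && sums_to_1
}

# Toy data with two hypothetical prior-probability columns
toy_ok  <- data.frame(ID = c(1, 1, 2), p1 = c(0.3, 0.3, 1), p2 = c(0.7, 0.7, 0))
toy_bad <- data.frame(ID = c(1, 2),    p1 = c(0.3, 0.6),    p2 = c(0.6, 0.6))

ok_result  <- check_pprior(toy_ok,  c("p1", "p2"))
bad_result <- check_pprior(toy_bad, c("p1", "p2"))
```

Because the data are in long format (one row per observation), a row-wise check also covers the per-subject requirement as long as the probabilities are repeated consistently within each subject.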

Value

The list returned is:

ns

number of grouping units in the dataset

ng

number of latent classes

loglik

log-likelihood of the model

best

vector of parameter estimates in the same order as specified in B and detailed in the Details section

V

if the model converged (conv=1 or 3), vector containing the upper triangle of the variance-covariance matrix of the estimates in best, with the exception of the variance-covariance parameters of the random-effects, for which V contains the variance-covariance estimates of the Cholesky transformed parameters displayed in cholesky. If conv=2, V contains the second derivatives of the log-likelihood.

gconv

vector of convergence criteria: 1. on the parameters, 2. on the likelihood, 3. on the derivatives

conv

status of convergence: =1 if the convergence criteria were satisfied, =2 if the maximum number of iterations was reached, =3 if the convergence criteria were satisfied with a partial Hessian matrix, =4 or 5 if a problem occurred during optimisation

call

the matched call

niter

number of Marquardt iterations

N

internal information used in related functions

idiag

internal information used in related functions

pred

table of individual predictions and residuals; it includes marginal predictions (pred_m), marginal residuals (resid_m), subject-specific predictions (pred_ss) and subject-specific residuals (resid_ss) averaged over classes, the observation (obs) and finally the class-specific marginal and subject-specific predictions (with the number of the latent class: pred_m_1, pred_m_2, ..., pred_ss_1, pred_ss_2, ...). If var.time is specified, the corresponding measurement time is also included.

pprob

table of posterior classification and posterior individual class-membership probabilities

Xnames

list of covariates included in the model

predRE

table containing individual predictions of the random-effects: a column per random-effect, a line per subject

cholesky

vector containing the estimates of the Cholesky transformed parameters of the variance-covariance matrix of the random-effects

data

the original data set (if returndata is TRUE)
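Since V is returned as a vector holding the upper triangle of a symmetric matrix, it usually has to be unpacked before use. The sketch below rebuilds the full matrix in base R. It assumes the triangle is stored column by column (the order used by upper.tri), which should be verified against the package before relying on it; the helper name unpack_V and the toy values are illustrative only.

```r
# Hypothetical helper (not part of lcmm): rebuilds a symmetric matrix from a
# vector holding its upper triangle, ASSUMING column-wise storage order.
unpack_V <- function(V, conv = 1) {
  # Per the description of V, it holds variance-covariance estimates only
  # when the model converged (conv = 1 or 3).
  if (!conv %in% c(1, 3)) stop("V is not a variance-covariance matrix here")
  # Solve p * (p + 1) / 2 = length(V) for the number of parameters p
  p <- (sqrt(8 * length(V) + 1) - 1) / 2
  stopifnot(p == round(p))
  m <- matrix(0, p, p)
  m[upper.tri(m, diag = TRUE)] <- V   # fill the upper triangle
  m <- t(m)                           # move it to the lower triangle
  m[upper.tri(m, diag = TRUE)] <- V   # fill the upper triangle again
  m
}

# Toy packed vector for 3 parameters (values are made up for illustration)
toy_V   <- c(4, 1, 9, 0.5, 2, 16)
toy_mat <- unpack_V(toy_V)
toy_se  <- sqrt(diag(toy_mat))        # standard errors from the diagonal
```

With a fitted model, the same call would take the V and conv elements of the returned list; the standard errors for the random-effect variance parameters would then refer to the Cholesky transformed parameters, as noted for V.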

Details

A. THE VECTOR OF PARAMETERS B

The parameters in the vector of initial values B or equivalently in the vector of maximum likelihood estimates best are included in the following order:

(1) ng-1 parameters are required for intercepts in the latent class membership model, and when covariates are included in classmb, ng-1 parameters should be entered for each covariate;

(2) for all covariates in fixed, one parameter is required if the covariate is not in mixture, ng parameters are required if the covariate is also in mixture;

(3) the variance of each random-effect specified in random (including the intercept) when idiag=TRUE, or the lower triangular part of the variance-covariance matrix of all the random-effects when idiag=FALSE;

(4) only when nwg=TRUE, ng-1 parameters are required for the ng-1 class-specific proportional coefficients in the variance-covariance matrix of the random-effects;

(5) when cor is specified, 1 parameter corresponding to the variance of the Brownian motion should be entered with cor=BM, and 2 parameters corresponding to the correlation and the variance parameters of the autoregressive process should be entered with cor=AR;

(6) the standard error of the residual error.
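The ordering rules (1) to (6) also determine the length that a user-supplied B must have. The following base R sketch counts the parameters block by block; the helper name count_B and its arguments are illustrative assumptions (not part of lcmm), and the counts must be supplied by hand from the model formulas.

```r
# Hypothetical helper (not part of lcmm): expected length of B following
# rules (1)-(6) above.
#   n_classmb_cov : covariates in classmb (intercept excluded)
#   n_fixed       : terms in fixed, intercept included
#   n_mixture     : terms of fixed that are also in mixture, intercept included
#   n_random      : random-effects, intercept included
#   cor           : NULL, "BM" or "AR"
count_B <- function(ng, n_classmb_cov, n_fixed, n_mixture, n_random,
                    idiag = FALSE, nwg = FALSE, cor = NULL) {
  n1 <- (ng - 1) * (1 + n_classmb_cov)                # (1) class membership
  n2 <- (n_fixed - n_mixture) + ng * n_mixture        # (2) fixed effects
  n3 <- if (idiag) n_random else n_random * (n_random + 1) / 2  # (3)
  n4 <- if (nwg) ng - 1 else 0                        # (4) proportional coefs
  n5 <- if (is.null(cor)) 0 else switch(cor, BM = 1, AR = 2)    # (5)
  n6 <- 1                                             # (6) residual error
  n1 + n2 + n3 + n4 + n5 + n6
}

# Y ~ Time * X1, random = ~Time, ng = 1 (model m1 in the Examples)
n_m1 <- count_B(ng = 1, n_classmb_cov = 0, n_fixed = 4, n_mixture = 0,
                n_random = 2)

# Same model with mixture = ~Time, classmb = ~X2 + X3, ng = 2 (model m2a)
n_m2a <- count_B(ng = 2, n_classmb_cov = 2, n_fixed = 4, n_mixture = 2,
                 n_random = 2)
```

These counts give 8 and 13, which agree with the "Number of parameters" lines reported for m1 and m2a in the Examples and with the 13 initial values passed to m2b.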

B. CAUTIONS

Some caution is needed when using the program:

(1) As the log-likelihood of a latent class model can have multiple maxima, a careful choice of the initial values is crucial for ensuring convergence toward the global maximum. The program can be run without entering the vector of initial values (see point 2). However, we recommend systematically entering initial values in B and trying different sets of initial values.

(2) The automatic choice of initial values we provide requires the estimation of a preliminary linear mixed model. The user should be aware that, first, this preliminary analysis can take time for large datasets and, second, the generated initial values can be very unlikely and may even converge slowly to a local maximum. This is the reason why several alternatives exist. The vector of initial values can be directly specified in B, or the initial values can be generated (automatically or randomly) from a model with ng=1. Finally, function gridsearch performs an automatic grid search.

(3) Convergence criteria are very strict as they are based on the derivatives of the log-likelihood in addition to the parameter stability and log-likelihood stability. In some cases, the program may not converge and may reach the maximum number of iterations (set by maxiter, 500 by default). In this case, the user should check that parameter estimates at the last iteration are not on the boundaries of the parameter space. If the parameters are on the boundaries of the parameter space, the identifiability of the model is critical. This may happen especially with splines parameters that may be too close to 0 (lower boundary) or classmb parameters that are too high or low (perfect classification). When identifiability of some parameters is suspected, the program can be run again from the former estimates by fixing the suspected parameters to their value with option posfix. This usually solves the problem. An alternative is to remove the parameters of the Beta or Splines link function from the inverse of the Hessian with option partialH. If not, the program should be run again with other initial values, with a higher maximum number of iterations or less strict convergence tolerances.

References

Proust-Lima C, Philipps V, Liquet B (2017). Estimation of Extended Mixed Models Using Latent Classes and Latent Processes: The R Package lcmm. Journal of Statistical Software, 78(2), 1-56. doi:10.18637/jss.v078.i02

Verbeke G and Lesaffre E (1996). A linear mixed-effects model with heterogeneity in the random-effects population. Journal of the American Statistical Association 91, 217-21

Muthen B and Shedden K (1999). Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics 55, 463-9

Proust C and Jacqmin-Gadda H (2005). Estimation of linear mixed models with a mixture of distribution for the random-effects. Computer Methods Programs Biomedicine 78, 165-73

See also

postprob, plot.hlme, summary, predictY

Author

Cecile Proust-Lima, Benoit Liquet and Viviane Philipps

cecile.proust-lima@inserm.fr

Examples

#### Example of a latent class model estimated for a varying number
# of latent classes:
# The model includes a subject- (ID) and class-specific linear
# trend (intercept and Time in fixed, random and mixture components)
# and a common effect of X1 and its interaction with time over classes
# (in fixed).
# The variance of the random intercept and slope are assumed to be equal
# over classes (nwg=F).
# The covariates X2 and X3 predict the class membership (in classmb).
#
# !CAUTION: initialization of mixed models with latent classes is
# of utmost importance because of the problem of multimodality of the likelihood.
# Calls m2a-m2d illustrate the different implementations for the
# initial values.

### homogeneous linear mixed model (standard linear mixed model)
### with correlated random-effects
m1<-hlme(Y~Time*X1,random=~Time,subject='ID',ng=1,data=data_hlme)
summary(m1)
#> Heterogenous linear mixed model
#>      fitted by maximum likelihood method
#>
#> hlme(fixed = Y ~ Time * X1, random = ~Time, subject = "ID", ng = 1,
#>     data = data_hlme)
#>
#> Statistical Model:
#>      Dataset: data_hlme
#>      Number of subjects: 100
#>      Number of observations: 326
#>      Number of latent classes: 1
#>      Number of parameters: 8
#>
#> Iteration process:
#>      Convergence criteria satisfied
#>      Number of iterations:  10
#>      Convergence criteria: parameters= 5e-10
#>                          : likelihood= 1.4e-09
#>                          : second derivatives= 1.3e-16
#>
#> Goodness-of-fit statistics:
#>      maximum log-likelihood: -790.51
#>      AIC: 1597.02
#>      BIC: 1617.86
#>
#>
#> Maximum Likelihood Estimates:
#>
#> Fixed effects in the longitudinal model:
#>
#>               coef      Se    Wald p-value
#> intercept 25.86093 0.77175  33.509 0.00000
#> Time      -0.32877 0.17121  -1.920 0.05482
#> X1         1.69460 1.00508   1.686 0.09179
#> Time:X1   -0.39408 0.22293  -1.768 0.07710
#>
#>
#> Variance-covariance matrix of the random-effects:
#>           intercept    Time
#> intercept  23.13781
#> Time        2.79713 1.10558
#>
#>                                coef      Se
#> Residual standard error:    0.96892 0.06095
#>

### latent class linear mixed model with 2 classes

# a. automatic specification from G=1 model estimates:
m2a<-hlme(Y~Time*X1,mixture=~Time,random=~Time,classmb=~X2+X3,subject='ID',
          ng=2,data=data_hlme,B=m1)

# b. vector of initial values provided by the user:
m2b<-hlme(Y~Time*X1,mixture=~Time,random=~Time,classmb=~X2+X3,subject='ID',
          ng=2,data=data_hlme,B=c(0.11,-0.74,-0.07,20.71,
                                  29.39,-1,0.13,2.45,-0.29,4.5,0.36,0.79,0.97))

# c. random draws from G = 1 model estimates:
m2c<-hlme(Y~Time*X1,mixture=~Time,random=~Time,classmb=~X2+X3,subject='ID',
          ng=2,data=data_hlme,B=random(m1))
#> Error in eval(cl$B[[2]], parent.env(environment())): object 'm1' not found

# d. gridsearch with 50 departures and 10 iterations of the algorithm
#    (see function gridsearch for details)
# \dontrun{
m2d <- gridsearch(rep = 50, maxiter = 10, minit = m1,
                  hlme(Y ~ Time * X1, mixture =~ Time, random =~ Time,
                       classmb =~ X2 + X3, subject = 'ID', ng = 2,
                       data = data_hlme))
# }

# summary of the estimation process
summarytable(m1, m2a, m2b, m2c)
#> Error in summarytable(m1, m2a, m2b, m2c): object 'm2c' not found

# summary of m2a
summary(m2a)
#> Heterogenous linear mixed model
#>      fitted by maximum likelihood method
#>
#> hlme(fixed = Y ~ Time * X1, mixture = ~Time, random = ~Time,
#>     subject = "ID", classmb = ~X2 + X3, ng = 2, data = data_hlme)
#>
#> Statistical Model:
#>      Dataset: data_hlme
#>      Number of subjects: 100
#>      Number of observations: 326
#>      Number of latent classes: 2
#>      Number of parameters: 13
#>
#> Iteration process:
#>      Convergence criteria satisfied
#>      Number of iterations:  21
#>      Convergence criteria: parameters= 3.9e-05
#>                          : likelihood= 3.4e-05
#>                          : second derivatives= 1.6e-10
#>
#> Goodness-of-fit statistics:
#>      maximum log-likelihood: -773.82
#>      AIC: 1573.64
#>      BIC: 1607.51
#>
#>
#> Maximum Likelihood Estimates:
#>
#> Fixed effects in the class-membership model:
#> (the class of reference is the last class)
#>
#>                      coef      Se    Wald p-value
#> intercept class1  0.10518 0.33392   0.315 0.75276
#> X2 class1        -0.73939 0.48976  -1.510 0.13112
#> X3 class1        -0.06818 0.19282  -0.354 0.72364
#>
#> Fixed effects in the longitudinal model:
#>
#>                      coef      Se    Wald p-value
#> intercept class1 20.70606 0.64002  32.352 0.00000
#> intercept class2 29.39253 0.54893  53.546 0.00000
#> Time class1      -1.00426 0.19584  -5.128 0.00000
#> Time class2       0.13102 0.17259   0.759 0.44778
#> X1                2.45174 0.64546   3.798 0.00015
#> Time:X1          -0.28770 0.20015  -1.437 0.15060
#>
#>
#> Variance-covariance matrix of the random-effects:
#>           intercept    Time
#> intercept   4.50085
#> Time        0.35984 0.79338
#>
#>                                coef      Se
#> Residual standard error:    0.96808 0.06065
#>

# posterior classification
postprob(m2a)
#>
#> Posterior classification:
#>   class1 class2
#> N     46     54
#> %     46     54
#>
#> Posterior classification table:
#>      --> mean of posterior probabilities in each class
#>         prob1  prob2
#> class1 0.9588 0.0412
#> class2 0.0325 0.9675
#>
#> Posterior probabilities above a threshold (%):
#>          class1 class2
#> prob>0.7  93.48 100.00
#> prob>0.8  93.48  92.59
#> prob>0.9  86.96  83.33
#>

# plot of predicted trajectories using some newdata
newdata<-data.frame(Time=seq(0,5,length=100),
                    X1=rep(0,100),X2=rep(0,100),X3=rep(0,100))
plot(predictY(m2a,newdata,var.time="Time"),legend.loc="right",bty="l")
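The goodness-of-fit figures printed in the two summaries can be reproduced by hand, which is useful when comparing models with different numbers of classes. The base R sketch below assumes the usual definitions AIC = 2p - 2logL and BIC = p log(N) - 2logL, with p the number of parameters and N the number of subjects; the helper name fit_criteria is illustrative, and the inputs are copied from the printed output above.

```r
# Hypothetical helper (not part of lcmm): AIC and BIC from the log-likelihood,
# the number of parameters and the number of subjects.
fit_criteria <- function(loglik, n_par, n_subjects) {
  c(AIC = 2 * n_par - 2 * loglik,
    BIC = n_par * log(n_subjects) - 2 * loglik)
}

# Values copied from the summaries of m1 (1 class) and m2a (2 classes)
crit_m1  <- round(fit_criteria(loglik = -790.51, n_par = 8,  n_subjects = 100), 2)
crit_m2a <- round(fit_criteria(loglik = -773.82, n_par = 13, n_subjects = 100), 2)

# Lower values indicate a better trade-off between fit and complexity
comparison <- rbind(m1 = crit_m1, m2a = crit_m2a)
print(comparison)
```

Under these assumptions the results match the printed AIC and BIC for both models, and both criteria favour the two-class model m2a in this example.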