Jointlcmm function - RDocumentation (2024)

Description

This function fits joint latent class mixed models for a longitudinaloutcome and a right-censored (possibly left-truncated) time-to-event. Thefunction handles competing risks and Gaussian or non Gaussian (curvilinear)longitudinal outcomes. For curvilinear longitudinal outcomes, normalizingcontinuous functions (splines or Beta CDF) can be specified as inlcmm.

Usage

Jointlcmm( fixed, mixture, random, subject, classmb, ng = 1, idiag = FALSE, nwg = FALSE, survival, hazard = "Weibull", hazardtype = "Specific", hazardnodes = NULL, hazardrange = NULL, TimeDepVar = NULL, link = NULL, intnodes = NULL, epsY = 0.5, range = NULL, cor = NULL, data, B, convB = 1e-04, convL = 1e-04, convG = 1e-04, maxiter = 100, nsim = 100, prior = NULL, pprior = NULL, logscale = FALSE, subset = NULL, na.action = 1, posfix = NULL, partialH = FALSE, verbose = FALSE, returndata = FALSE, var.time = NULL, nproc = 1, clustertype = NULL)

jlcmm( fixed, mixture, random, subject, classmb, ng = 1, idiag = FALSE, nwg = FALSE, survival, hazard = "Weibull", hazardtype = "Specific", hazardnodes = NULL, hazardrange = NULL, TimeDepVar = NULL, link = NULL, intnodes = NULL, epsY = 0.5, range = NULL, cor = NULL, data, B, convB = 1e-04, convL = 1e-04, convG = 1e-04, maxiter = 100, nsim = 100, prior = NULL, pprior = NULL, logscale = FALSE, subset = NULL, na.action = 1, posfix = NULL, partialH = FALSE, verbose = FALSE, returndata = FALSE, var.time = NULL, nproc = 1, clustertype = NULL)

Value

The list returned is:

loglik

log-likelihood of the model

best

vector of parameter estimates in the same order as specified inB and detailed in section details

V

if the model converged (conv=1 or 3), vector containingthe upper triangle matrix of variance-covariance estimates of Bestwith exception for variance-covariance parameters of the random-effects forwhich V contains the variance-covariance estimates of the Choleskytransformed parameters displayed in cholesky.If conv=2, V contains the second derivatives of the log-likelihood.

gconv

vector ofconvergence criteria: 1. on the parameters, 2. on the likelihood, 3. on thederivatives

conv

status of convergence: =1 if the convergencecriteria were satisfied, =2 if the maximum number of iterations was reached,=4 or 5 if a problem occured during optimisation

call

the matchedcall

niter

number of Marquardt iterations

pred

table ofindividual predictions and residuals; it includes marginal predictions(pred_m), marginal residuals (resid_m), subject-specific predictions(pred_ss) and subject-specific residuals (resid_ss) averaged over classes,the observation (obs) and finally the class-specific marginal andsubject-specific predictions (with the number of the latent class:pred_m_1,pred_m_2,...,pred_ss_1,pred_ss_2,...). If var.timeis specified, the corresponding measurement time is also included.

pprob

table ofposterior classification and posterior individual class-membershipprobabilities based on the longitudinal data and the time-to-event data

pprobY

table of posterior classification and posterior individualclass-membership probabilities based only on the longitudinal data

predRE

table containing individual predictions of the random-effects:a column per random-effect, a line per subject

cholesky

vectorcontaining the estimates of the Cholesky transformed parameters of thevariance-covariance matrix of the random-effects

scoretest

Statisticof the Score Test for the conditional independence assumption of thelongitudinal and survival data given the latent class structure. Under thenull hypothesis, the statistics is a Chi-square with p degrees of freedomwhere p indicates the number of random-effects in the longitudinal mixedmodel. See Jacqmin-Gadda and Proust-Lima (2009) for more details.

predSurv

table of predictions giving for the window of times to event(called "time"), the predicted baseline risk function in each latent class(called "RiskFct") and the predicted cumulative baseline risk function ineach latent class (called "CumRiskFct").

hazard

internal informationabout the hazard specification used in related functions

data

theoriginal data set (if returndata is TRUE)

Arguments

fixed

two-sided linear formula object for the fixed-effects in thelinear mixed model. The response outcome is on the left of ~ and thecovariates are separated by + on the right of the ~. Bydefault, an intercept is included. If no intercept, -1 should be thefirst term included on the right of ~.

mixture

one-sided formula object for the class-specific fixed effectsin the linear mixed model (to specify only for a number of latent classesgreater than 1). Among the list of covariates included in fixed, thecovariates with class-specific regression parameters are entered inmixture separated by +. By default, an intercept is included.If no intercept, -1 should be the first term included.

random

optional one-sided formula for the random-effects in thelinear mixed model. Covariates with a random-effect are separated by+. By default, an intercept is included. If no intercept, -1should be the first term included.

subject

name of the covariate representing the grouping structure(called subject identifier) specified with ''.

classmb

optional one-sided formula describing the covariates in theclass-membership multinomial logistic model. Covariates included areseparated by +. No intercept should be included in this formula.

ng

optional number of latent classes considered. If ng=1 (bydefault) no mixture nor classmb should be specified. Ifng>1, mixture is required.

idiag

optional logical for the structure of the variance-covariancematrix of the random-effects. If FALSE, a non structured matrix ofvariance-covariance is considered (by default). If TRUE a diagonalmatrix of variance-covariance is considered.

nwg

optional logical indicating if the variance-covariance of therandom-effects is class-specific. If FALSE the variance-covariancematrix is common over latent classes (by default). If TRUE aclass-specific proportional parameter multiplies the variance-covariancematrix in each class (the proportional parameter in the last latent classequals 1 to ensure identifiability).

survival

two-sided formula object. The left side of the formulacorresponds to a surv() object of type "counting" for right-censoredand left-truncated data (example: Surv(Time,EntryTime,Indicator)) orof type "right" for right-censored data (example:Surv(Time,Indicator)). Multiple causes of event can be considered inthe Indicator (0 for censored, k for cause k of event). The right side ofthe formula specifies the names of covariates to include in the survivalmodel with mixture() when the effect is class-specific (example:Surv(Time,Indicator) ~ X1 + mixture(X2) for a class-commoneffect of X1 and a class-specific effect of X2). In the presence ofcompeting events, covariate effects are common by default. Codecause(X3) specifies a cause-specific covariate effect for X3 on eachcause of event while cause1(X3) (or cause2(X3), ...) specifiesa cause-specific effect of X3 on the first (or second, ...) cause only.

hazard

optional family of hazard function assumed for the survivalmodel. By default, "Weibull" specifies a Weibull baseline risk function.Other possibilities are "piecewise" for a piecewise constant risk functionor "splines" for a cubic M-splines baseline risk function. For these twolatter families, the number of nodes and the location of the nodes should bespecified as well, separated by -. The number of nodes is enteredfirst followed by -, then the location is specified with "equi","quant" or "manual" for respectively equidistant nodes, nodes at quantilesof the times of event distribution or interior nodes entered manually inargument hazardnodes. It is followed by - and finally"piecewise" or "splines" indicates the family of baseline risk functionconsidered. Examples include "5-equi-splines" for M-splines with 5equidistant nodes, "6-quant-piecewise" for piecewise constant risk over 5intervals and nodes defined at the quantiles of the times of eventsdistribution and "9-manual-splines" for M-splines risk function with 9nodes, the vector of 7 interior nodes being entered in the argumenthazardnodes. In the presence of competing events, a vector of hazardsshould be provided such as hazard=c("Weibull","splines" with 2 causesof event, the first one modelled by a Weibull baseline cause-specific riskfunction and the second one by splines.

hazardtype

optional indicator for the type of baseline risk functionwhen ng>1. By default "Specific" indicates a class-specific baseline riskfunction. Other possibilities are "PH" for a baseline risk functionproportional in each latent class, and "Common" for a baseline risk functionthat is common over classes. In the presence of competing events, a vectorof hazardtypes should be given.

hazardnodes

optional vector containing interior nodes ifsplines or piecewise is specified for the baseline hazardfunction in hazard.

hazardrange

optional vector indicating the range of the survival times (that isthe minimum and maximum). By default, the range is defined according to theminimum and maximum observed values of the survival times. The option should beused only for piecewise constant and Splines hazard functions.

TimeDepVar

optional vector containing an intermediate timecorresponding to a change in the risk of event. This time-dependentcovariate can only take the form of a time variable with the assumption thatthere is no effect on the risk before this time and a constant effect on therisk of event after this time (example: initiation of a treatment to accountfor).

link

optional family of link functions to estimate. By default,"linear" option specifies a linear link function leading to a standardlinear mixed model (hom*ogeneous or heterogeneous as estimated inhlme). Other possibilities include "beta" for estimating a linkfunction from the family of Beta cumulative distribution functions,"thresholds" for using a threshold model to describe the correspondencebetween each level of an ordinal outcome and the underlying latent process,and "Splines" for approximating the link function by I-splines. For thislatter case, the number of nodes and the nodes location should be alsospecified. The number of nodes is first entered followed by -, thenthe location is specified with "equi", "quant" or "manual" for respectivelyequidistant nodes, nodes at quantiles of the marker distribution or interiornodes entered manually in argument intnodes. It is followed by- and finally "splines" is indicated. For example, "7-equi-splines"means I-splines with 7 equidistant nodes, "6-quant-splines" means I-splineswith 6 nodes located at the quantiles of the marker distribution and"9-manual-splines" means I-splines with 9 nodes, the vector of 7 interiornodes being entered in the argument intnodes.

intnodes

optional vector of interior nodes. This argument is onlyrequired for a I-splines link function with nodes entered manually.

epsY

optional definite positive real used to rescale the marker in(0,1) when the beta link function is used. By default, epsY=0.5.

range

optional vector indicating the range of the outcome (that isthe minimum and maximum). By default, the range is defined according to theminimum and maximum observed values of the outcome. The option should beused only for Beta and Splines transformations.

cor

optional brownian motion or autoregressive process modeling thecorrelation between the observations. "BM" or "AR" should be specified,followed by the time variable between brackets. By default, no correlationis added.

data

optional data frame containing the variables named infixed, mixture, random, classmb andsubject.

B

optional specification for the initial values for the parameters.Three options are allowed: (1) a vector of initial values is entered (theorder in which the parameters are included is detailed in detailssection). (2) nothing is specified. A preliminary analysis involving theestimation of a standard linear mixed model is performed to choose initialvalues. (3) when ng>1, a Jointlcmm object is entered. It should correspondto the exact same structure of model but with ng=1. The program willautomatically generate initial values from this model. This specificationavoids the preliminary analysis indicated in (2) Note that due to possiblelocal maxima, the B vector should be specified and several differentstarting points should be tried.

convB

optional threshold for the convergence criterion based on theparameter stability. By default, convB=0.0001.

convL

optional threshold for the convergence criterion based on thelog-likelihood stability. By default, convL=0.0001.

convG

optional threshold for the convergence criterion based on thederivatives. By default, convG=0.0001.

maxiter

optional maximum number of iterations for the Marquardtiterative algorithm. By default, maxiter=150.

nsim

optional number of points for the predicted survival curves andpredicted baseline risk curves. By default, nsim=100.

prior

optional name of a covariate containing a prior informationabout the latent class membership. The covariate should be an integer withvalues in 0,1,...,ng. Value O indicates no prior for the subject while avalue in 1,...,ng indicates that the subject belongs to the correspondinglatent class.

pprior

optional vector specifying the names of the covariates containing theprior probabilities to belong to each latent class. These probabilities should bebetween 0 and 1 and should sum up to 1 for each subject.

logscale

optional boolean indicating whether an exponential(logscale=TRUE) or a square (logscale=FALSE -by default) transformation isused to ensure positivity of parameters in the baseline risk functions. Seedetails section

subset

a specification of the rows to be used: defaults to all rows.This can be any valid indexing vector for the rows of data or if that is notsupplied, a data frame made up of the variable used in formula.

na.action

Integer indicating how NAs are managed. The default is 1for 'na.omit'. The alternative is 2 for 'na.fail'. Other options such as'na.pass' or 'na.exclude' are not implemented in the current version.

posfix

Optional vector specifying the indices in vector B of theparameters that should not be estimated. Default to NULL, all parameters areestimated.

partialH

optional logical for Piecewise and Splines baseline riskfunctions and Splines link functions only. Indicates whether the parameters of thebaseline risk or link functions can be dropped from the Hessian matrix todefine convergence criteria.

verbose

logical indicating if information about computation should bereported. Default to TRUE.

returndata

logical indicating if data used for computation should bereturned. Default to FALSE, data are not returned.

var.time

optional character indicating the name of the time variable.

nproc

the number cores for parallel computation.Default to 1 (sequential mode).

clustertype

optional character indicating the type of cluster for parallel computation.

Author

Cecile Proust Lima, Amadou Diakite and Viviane Philipps

cecile.proust-lima@inserm.fr

Details

A. BASELINE RISK FUNCTIONS

For the baseline risk functions, the following parameterizations wereconsidered. Be careful, parametrisations changed in lcmm_V1.5:

1. With the "Weibull" function: 2 parameters are necessary w_1 and w_2 sothat the baseline risk function a_0(t) = w_1^2*w_2^2*(w_1^2*t)^(w_2^2-1) iflogscale=FALSE and a_0(t) = exp(w_1)*exp(w_2)(t)^(exp(w_2)-1) iflogscale=TRUE.

2. with the "piecewise" step function and nz nodes (y_1,...y_nz), nz-1parameters are necesssary p_1,...p_nz-1 so that the baseline risk functiona_0(t) = p_j^2 for y_j < t =< y_j+1 if logscale=FALSE and a_0(t) = exp(p_j)for y_j < t =< y_j+1 if logscale=TRUE.

3. with the "splines" function and nz nodes (y_1,...y_nz), nz+2 parametersare necessary s_1,...s_nz+2 so that the baseline risk function a_0(t) =sum_j s_j^2 M_j(t) if logscale=FALSE and a_0(t) = sum_j exp(s_j) M_j(t) iflogscale=TRUE where M_j is the basis of cubic M-splines.

Two parametrizations of the baseline risk function are proposed(logscale=TRUE or FALSE) because in some cases, especially when theinstantaneous risks are very close to 0, some convergence problems mayappear with one parameterization or the other. As a consequence, werecommend to try the alternative parameterization (changing logscale option)when a joint latent class model does not converge (maximum number ofiterations reached) where as convergence criteria based on the parametersand likelihood are small.

B. THE VECTOR OF PARAMETERS B

The parameters in the vector of initial values B or in the vector ofmaximum likelihood estimates best are included in the followingorder: (1) ng-1 parameters are required for intercepts in the latent classmembership model, and if covariates are included in classmb, ng-1parameters should be entered for each one; (2) parameters for the baselinerisk function: 2 parameters for each Weibull, nz-1 for each piecewiseconstant risk and nz+2 for each splines risk; this number should bemultiplied by ng if specific hazard is specified; otherwise, ng-1 additionalproportional effects are expected if PH hazard is specified; otherwisenothing is added if common hazard is specified. In the presence of competingevents, the number of parameters should be adapted to the number of causesof event; (3) for all covariates in survival, ng parameters arerequired if the covariate is inside a mixture(), otherwise 1parameter is required. Covariates parameters should be included in the sameorder as in survival. In the presence of cause-specific effects, thenumber of parameters should be multiplied by the number of causes; (4) forall covariates in fixed, one parameter is required if the covariateis not in mixture, ng parameters are required if the covariate isalso in mixture. Parameters should be included in the same order asin fixed; (5) the variance of each random-effect specified inrandom (including the intercept) if idiag=TRUE and theinferior triangular variance-covariance matrix of all the random-effects ifidiag=FALSE; (6) only if nwg=TRUE, ng-1 parameters forclass-specific proportional coefficients for the variance covariance matrixof the random-effects; (7) the variance of the residual error.

C. CAUTION

Some caution should be made when using the program:

(1) As the log-likelihood of a latent class model can have multiple maxima,a careful choice of the initial values is crucial for ensuring convergencetoward the global maximum. The program can be run without entering thevector of initial values (see point 2). However, we recommend tosystematically enter initial values in B and try different sets ofinitial values.

(2) The automatic choice of initial values that we provide requires theestimation of a preliminary linear mixed model. The user should be awarethat first, this preliminary analysis can take time for large datatsets andsecond, that the generated initial values can be very not likely and evenmay converge slowly to a local maximum. This is a reason why severalalternatives exist. The vector of initial values can be directly specifiedin B the initial values can be generated (automatically or randomly)from a model with ng=. Finally, function gridsearch performsan automatic grid search.

(3) Convergence criteria are very strict as they are based on derivatives ofthe log-likelihood in addition to the parameter and log-likelihoodstability. In some cases, the program may not converge and reach themaximum number of iterations fixed at 150. In this case, the user shouldcheck that parameter estimates at the last iteration are not on theboundaries of the parameter space. If the parameters are on the boundariesof the parameter space, the identifiability of the model is critical. Thismay happen especially when baseline risk functions involve splines (valueclose to the lower boundary - 0 with logscale=F -infinity with logscale=F)or classmb parameters that are too high or low (perfect classification) orlinkfunction parameters. When identifiability of some parameters issuspected, the program can be run again from the former estimates by fixingthe suspected parameters to their value with option posfix. This usuallysolves the problem. An alternative is to remove the parameters of the Betaof Splines link function from the inverse of the Hessian with optionpartialH. If not, the program should be run again with other initialvalues. Some problems of convergence may happen when the instantaneousrisks of event are very low and "piecewise" or "splines" baseline riskfunctions are specified. In this case, changing the parameterization of thebaseline risk functions with option logscale is recommended (see paragraph Afor details).

References

Proust-Lima C, Philipps V, Liquet B (2017). Estimation of Extended Mixed Models Using Latent Classes and Latent Processes: The R Package lcmm. Journal of Statistical Software, 78(2), 1-56. doi:10.18637/jss.v078.i02

Lin, H., Turnbull, B. W., McCulloch, C. E. and Slate, E. H. (2002). Latentclass models for joint analysis of longitudinal biomarker and event processdata: application to longitudinal prostate-specific antigen readings andprostate cancer. Journal of the American Statistical Association 97, 53-65.

Proust-Lima, C. and Taylor, J. (2009). Development and validation of adynamic prognostic tool for prostate cancer recurrence using repeatedmeasures of post-treatment PSA: a joint modelling approach. Biostatistics10, 535-49.

Jacqmin-Gadda, H. and Proust-Lima, C. (2010). Score test for conditionalindependence between longitudinal outcome and time-to-event given theclasses in the joint latent class model. Biometrics 66(1), 11-9

Proust-Lima, Sene, Taylor and Jacqmin-Gadda (2014). Joint latent classmodels of longitudinal and time-to-event data: a review. Statistical Methodsin Medical Research 23, 74-90.

See Also

postprob, plot.Jointlcmm,plot.predict, epoce

Jointlcmm function - RDocumentation (2024)

FAQs

What is the Lcmm link function? ›

lcmm function estimates mixed models and latent class mixed models for different types of outcomes by assuming a parameterized link function for linking the outcome Y(t) with the underlying latent process L(t) it measures.

What is the latent class mixed effect model? ›

The Latent Class Linear Mixed Model (LCLMM) combines the features of the linear mixed model (LMM) with an additional component, which partitions the population into subpopulations or latent classes. This model has usually been specified with relatively simple, restrictive assumptions.

What is LCM communication? ›

LCM is a set of libraries and tools for message passing and data marshalling, targeted at real-time systems where high-bandwidth and low latency are critical.

What is the purpose of a link function? ›

Link functions are used to connect the outcome variable to the linear model (that is, the linear combination of the parameters estimated for each of the predictors in the model).

What does a mixed effect model tell you? ›

A Mixed Effects Model is a statistical test used to predict a single variable using two or more other variables. It also is used to determine the numerical relationship between one variable and others.

What is the latent class model of the LCM? ›

A latent class model (LCM) relates a set of observed discrete multivariate variables to a set of latent variables (latent variables are not directly observed but are rather inferred, mostly through a mathematical model, from other variables that are observed; e.g., quality of life or intelligence of a person is a ...

What is the three level mixed effect model? ›

In the three-level mixed-effects model, individuals are the subjects and measurements are repeated only across time while individuals are nested in the familial clusters.

What is the function of the link command? ›

The link utility is a Unix command line program that creates a hard link from an existing directory entry to a new directory entry. It does no more than call the link() system function. It does not perform error checking before attempting to create the link.

What is the function of the data link? ›

The data-link layer is responsible for transferring messages (or frame) from a given node to all other nodes in the CAN network. This layer handles bit stuffing and checksums for error handling, and after sending a message, waits for acknowledgment from the receivers. It is subdivided into two further layers: (a)

What does a log link function do? ›

The log link exponentiates the linear predictors. It does not log transform the outcome variable. Where μ=predicted value of Y given X, exp(β0) = the effect on the mean of μ when X=0, and exp(β1)= the multiplicative effect on the mean of Y for a one-unit increase in X.

What is the link function in Jet Reports? ›

LINK gives users advanced filtering capabilities in Jet Essentials. Using LINK allows users to tie together information from different tables. LINK is available for all connector types.

Top Articles
Latest Posts
Article information

Author: Duane Harber

Last Updated:

Views: 6567

Rating: 4 / 5 (71 voted)

Reviews: 94% of readers found this page helpful

Author information

Name: Duane Harber

Birthday: 1999-10-17

Address: Apt. 404 9899 Magnolia Roads, Port Royceville, ID 78186

Phone: +186911129794335

Job: Human Hospitality Planner

Hobby: Listening to music, Orienteering, Knapping, Dance, Mountain biking, Fishing, Pottery

Introduction: My name is Duane Harber, I am a modern, clever, handsome, fair, agreeable, inexpensive, beautiful person who loves writing and wants to share my knowledge and understanding with you.