you can use proc glimmix which is intended for the categorical outcome with random effect
Basic Features
The GLIMMIX procedure enables you to specify a generalized linear mixed model
and to perform confirmatory inference in such models. The syntax is similar to that
of the MIXED procedure and includes CLASS, MODEL, and RANDOM statements.
The following are some of the basic features of PROC GLIMMIX.
•
SUBJECT= and GROUP= options, which enable blocking of variance matricesand parameter heterogeneity
•
choice of linearization about expected values or expansion about current solutionsof best linear unbiased predictors
•
flexible covariance structures for random and residual random effects, includingvariance components, unstructured, autoregressive, and spatial structures
•
CONTRAST, ESTIMATE, LSMEANS and LSMESTIMATE statements,which produce hypothesis tests and estimable linear combinations of effects
•
NLOPTIONS statement, which enables you to exercise control over the numericaloptimization. You can choose techniques, update methods, line search
algorithms, convergence criteria, and more. Or, you can choose the default
optimization strategies selected for the particular class of model you are fitting
•
computed variables with SAS programming statements inside of PROCGLIMMIX (except for variables listed in the CLASS statement). These computed
variables can appear in the MODEL, RANDOM, WEIGHT, or FREQ
statements.
•
grouped data analysis•
user-specified link and variance functions•
choice of model-based variance-covariance estimators for the fixed effectsor empirical (sandwich) estimators to make analysis robust against misspecification
of the covariance structure and to adjust for small-sample bias
•
joint modeling for multivariate data. For example, you can model binary andnormal responses from a subject jointly and use random effects to relate (fuse)
the two outcomes.
•
multinomial models for ordinal and nominal outcomes•
univariate and multivariate low-rank smoothingAssumptions
The primary assumptions underlying the analyses performed by PROC GLIMMIX
are as follows:
•
If the model contains random effects, the distribution of the data conditionalon the random effects is known. This distribution is either a member of the
exponential family of distributions or one of the supplementary distributions
provided by the GLIMMIX procedure. In models without random effects, the
Notation for the Generalized Linear Mixed Model
7unconditional (marginal) distribution is assumed to be known for maximum
likelihood estimation, or the first two moments are known in the case of quasilikelihood
estimation.
•
The conditional expected value of the data takes the form of a linear mixedmodel after a monotonic transformation is applied.
•
The problem of fitting the GLMM can be cast as a singly or doubly iterativeoptimization problem. The objective function for the optimization is a function
of either the actual log likelihood, an approximation to the log likelihood, or
the log likelihood of an approximated model.
For a model containing random effects, the GLIMMIX procedure, by default, estimates
the parameters by applying pseudo-likelihood techniques as in Wolfinger and
O’Connell (1993) and Breslow and Clayton (1993). In a model without random
effects (GLM models), PROC GLIMMIX estimates the parameters by maximum
likelihood, restricted maximum likelihood, or quasi-likelihood. See the
“Singly orDoubly Iterative Fitting”
section on page 140 on when the GLIMMIX procedure appliesnoniterative, singly and doubly iterative algorithms, and the
“Default EstimationTechniques”
section on page 142 on the default estimation methods.Once the parameters have been estimated, you can perform statistical inferences for
the fixed effects and covariance parameters of the model. Tests of hypotheses for
the fixed effects are based on Wald-type tests and the estimated variance-covariance
matrix.
PROC GLIMMIX uses the Output Delivery System (ODS) for displaying and controlling
the output from SAS procedures. ODS enables you to convert any of the
output from PROC GLIMMIX into a SAS data set. See the
“ODS Table Names”section on page 160.
ODS statistical graphics are available with the GLIMMIX procedure. For more information,
see the PLOTS options of the PROC GLIMMIX and LSMEANS statements.
For more information on the ODS GRAPHICS statement, see Chapter 15, “Statistical
Graphics Using ODS” (
SAS/STAT User’s Guide).Notation for the Generalized Linear Mixed Model
This section introduces the mathematical notation used throughout the chapter to
describe the generalized linear mixed model (GLMM). See the
“Details” section onpage 108 for a description of the fitting algorithms and the mathematical-statistical
details.
The Basic Model
Suppose
Y represents the (n × 1) vector of observed data and is a (r × 1) vectorof random effects. Models fit by the GLIMMIX procedure assume that
E
[Y|] = g−1(X + Z)where
g(·) is a differentiable monotonic link function and g−1(·) is its inverse. Thematrix
Xis a (n×p) matrix of rank k, and Z is a (n×r) design matrix for the random8
The GLIMMIX Procedureeffects. The random effects are assumed to be normally distributed with mean
0 andvariance matrix
G.The GLMM contains a linear mixed model inside the inverse link function. This
model component is referred to as the linear predictor,
= X + Z
The variance of the observations, conditional on the random effects, is
var
[Y|] = A1/2RA1/2The matrix
A is a diagonal matrix and contains the variance functions of the model.The variance function expresses the variance of a response as a function of the mean.
The GLIMMIX procedure determines the variance function from the
DIST= option inthe MODEL statement or from the user-supplied variance function (see the
“ImpliedVariance Functions”
section on page 104). The matrix R is a variance matrix specifiedby the user through the RANDOM statement. If the conditional distribution of
the data contains an additional scale parameter, it is either part of the variance functions
or part of the
R matrix. For example, the gamma distribution with mean μ hasvariance function
a(μ) = μ2 and var[Y |] = μ2-. If your model calls for G-side randomeffects only (see below), the procedure models
R = -I, where I is the identitymatrix.
Table 9 on page 104 identifies the distributions for which - 1.G-side and R-side Random Effects
The GLIMMIX procedure distinguishes two types of random effects. Depending on
whether the variance of the random effect is contained in
Gor inR, these are referredto as “G-side” and “R-side” random effects. R-side effects are also called “residual”
effects. Simply put, if a random effect is an element of , it is a G-side effect;
otherwise, it is an R-side effect. Models without G-side effects are also known as
marginal (or population-averaged) models. Models fit with the GLIMMIX procedure
can have none, one, or more of each type of effect.
Note that an R-side effect in the GLIMMIX procedure is equivalent to a REPEATED
effect in the MIXED procedure. In the GLIMMIX procedure all random effects are
specified through the RANDOM statement.
The columns of
X are constructed from effects listed on the right-hand side in theMODEL statement. Columns of
Z and the variance matricesGandRare constructedfrom the RANDOM statement.
The
R matrix is by default the scaled identity matrix, R = -I. The scale parameter-
is set to one if the distribution does not have a scale parameter, for example, inthe case of the the binary, binomial, Poisson, and exponential distribution (see
Table9
on page 104). To specify a different R matrix, use the RANDOM statement withthe –RESIDUAL– keyword or the RESIDUAL option. For example, to specify that
the
Time effect for each patient is an R-side effect with a first-order autoregressivecovariance structure, use the RESIDUAL option:
PROC GLIMMIX Contrasted with Other SAS Procedures
9random time / type=ar(1) subject=patient residual;
To add a multiplicative overdispersion parameter, use the –RESIDUAL– keyword
random _residual_;
You specify the link function
g(·) with the LINK= option of the MODEL statementor with programming statements. You specify the variance function that controls the
matrix
A with the DIST= option of the MODEL statement or with programmingstatements.
Unknown quantities subject to estimation are the fixed-effects parameter vector
and the covariance parameter vector
that comprises all unknowns in G and R.The random effects
are not parameters of the model in the sense that they are notestimated. The vector
is a vector of random variables. The solutions for arepredictors of these random variables.
Some fitting algorithms require that the best linear unbiased predictors (BLUPs) of
be computed at every iteration.
Relationship with Generalized Linear Models
Generalized linear models (Nelder and Wedderburn 1972; McCullagh and Nelder
1989) are a special case of GLMMs. If
= 0 and R = -I, the GLMM reduces toeither a generalized linear model (GLM) or a GLM with overdispersion. For example,
if
Y is a vector of Poisson variables so that A is a diagonal matrix containingE
[Y] = μ on the diagonal, then the model is a Poisson regression model for - = 1and overdispersed relative to a Poisson distribution for
- > 1. Since the Poissondistribution does not have an extra scale parameter, you can model overdispersion by
adding the statement
random _residual_;
to your GLIMMIX statements. If the only random effect is an overdispersion effect,
PROC GLIMMIX fits the model by (restricted) maximum likelihood and not one of
the methods specific to GLMMs.
PROC GLIMMIX Contrasted with Other SAS Procedures
The GLIMMIX procedure generalizes the MIXED and GENMOD procedures in two
important ways. First, the response can have a nonnormal distribution. The MIXED
procedure assumes that the response is normally (Gaussian) distributed. Second,
the GLIMMIX procedure incorporates random effects in the model and so allows
for subject-specific (conditional) and population-averaged (marginal) inference. The
GENMOD procedure only allows for marginal inference.
The GLIMMIX and MIXED procedure are closely related; see the syntax and feature
comparison in the section
“Comparing PROC GLIMMIX with PROC MIXED”10
The GLIMMIX Procedureon page 138. The remainder of this section compares PROC GLIMMIX with the
GENMOD, NLMIXED, LOGISTIC, and CATMOD procedures.
The GENMOD procedure fits generalized linear models for independent data by
maximum likelihood. It can also handle correlated data through the marginal GEE
approach of Liang and Zeger (1986) and Zeger and Liang (1986). The GEE implementation
in the GENMOD procedure is a marginal method that does not incorporate
random effects. The GEE estimation in the GENMOD procedure relies on R-side covariances
only, and the unknown parameters in
R are estimated by the method ofmoments. The GLIMMIX procedure allows G-side random effects and R-side covariances.
The parameters are estimated by likelihood-based techniques.
Many of the fit statistics and tests in the GENMOD procedure are based on the likelihood.
In a GLMM it is not always possible to derive the log likelihood of the data.
Even if the log likelihood is tractable, it may be computationally infeasible. In some
cases, the objective function must be constructed based on a substitute model. In other
cases, only the first two moments of the marginal distribution can be approximated.
Consequently, obtaining likelihood-based tests and statistics is difficult in the majority
of generalized linear mixed models. The GLIMMIX procedure relies heavily on
linearization and Taylor-series techniques to construct Wald-type test statistics and
confidence intervals. Likelihood ratio tests and confidence intervals are not available
in the GLIMMIX procedure.
The NLMIXED procedure also fits generalized linear mixed models but the class of
models it can accommodate is more narrow. The NLMIXED procedure relies on approximating
the marginal log likelihood by integral approximation through Gaussian
quadrature. Like the GLIMMIX procedure, the NLMIXED procedure defines the
problem of obtaining solutions for the parameter estimates as an optimization problem.
The objective function for the NLMIXED procedure is the marginal log likelihood
obtained by integrating out the random effects from the joint distribution of
responses and random effects using quadrature techniques. Although these are very
accurate, the number of random effects that can be practically managed is limited.
Also, R-side random effects cannot be accommodated with the NLMIXED procedure.
The GLIMMIX procedure, on the other hand, determines the marginal log
likelihood as that of an approximate linear mixed model. This allows multiple random
effects, nested and crossed random effects, multiple cluster types, and R-side
random components. The disadvantage is a doubly iterative fitting algorithm and the
absence of a true log likelihood.
The LOGISTIC and CATMOD procedures also fit generalized linear models but accommodate
the independence case only. Binary, binomial, multinomial models for
ordered data, and generalized logit models that can be fit with PROC LOGISTIC,
can also be fit with the GLIMMIX procedure. The diagnostic tools and capabilities
specific to such data implemented in the LOGISTIC procedure go beyond the
capabilities of the GLIMMIX procedure.
Logistic Regressions with Random Intercepts
11Getting Started
Logistic Regressions with Random Intercepts
Researchers investigated the performance of two medical procedures in a multicenter
study. They randomly selected 15 centers for inclusion. One of the study goals
was to compare the occurrence of side effects for the procedures. In each center
nApatients were randomly selected and assigned to procedure “A,” and
nB patients wererandomly assigned to procedure “B”. The following DATA step creates the data set
for the analysis.
data multicenter;
input center group$ n sideeffect;
datalines;
1 A 32 14
1 B 33 18
2 A 30 4
2 B 28 8
3 A 23 14
3 B 24 9
4 A 22 7
4 B 22 10
5 A 20 6
5 B 21 12
6 A 19 1
6 B 20 3
7 A 17 2
7 B 17 6
8 A 16 7
8 B 15 9
9 A 13 1
9 B 14 5
10 A 13 3
10 B 13 1
11 A 11 1
11 B 12 2
12 A 10 1
12 B 9 0
13 A 9 2
13 B 9 6
14 A 8 1
14 B 8 1
15 A 7 1
15 B 8 0
;
The variable
group identifies the two procedures, n is the number of patients whoreceived a given procedure in a particular center, and
sideeffect is the number ofpatients who reported side effects.
If
YiA and YiB denote the number of patients in center i who report side effects forprocedures
A and B, respectively, then—for a given center—these are independent12
The GLIMMIX Procedurebinomial random variables. To model the probability of side effects for the two drugs,
iA and iB, you need to account for the fixed group effect and the random selectionof centers. One possibility is to assume a model that relates group and center effects
linearly to the logit of the probabilities:
log
iA1
− iA=
0 + A + ilog
iB1
− iB=
0 + B + iIn this model,
A − B measures the difference in the logits of experiencing sideeffects, and the
i are independent random variables due to the random selectionof centers. If you think of
0 as the overall intercept in the model, then the i arerandom intercept adjustments. Observations from the same center receive the same
adjustment, and these vary randomly from center to center with variance var
[i] = 2c
.Since
iA is the conditional mean of the sample proportion, E[YiA/niA|i] = iA,you can model the sample proportions as binomial ratios in a generalized linear mixed
model. The following statements request this analysis under the assumption of normally
distributed center effects with equal variance and a logit link function.
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
run;
The PROC GLIMMIX statement invokes the procedure. The CLASS statement instructs
the procedure to treat the variables
center and group as classification variables.The MODEL statement specifies the response variable as a sample proportion
using the
events/trials syntax. In terms of the previous formulas, sideeffect/n correspondsto
YiA/niA for observations from Group A and to YiB/niB for observationsfrom Group B. The SOLUTION option in the MODEL statement requests a listing
of the solutions for the fixed-effects parameter estimates. Note that because of the
events/trials
syntax, the GLIMMIX procedure defaults to the binomial distribution,and that distribution’s default link is the logit link. The RANDOM statement specifies
that the linear predictor contains an intercept term that randomly varies at the
level of the
center effect. In other words, a random intercept is drawn separately andindependently for each center in the study.
The results of this analysis are shown in the following tables.
The “Model Information Table” in
Figure 1 summarizes important information aboutthe model you fit and about aspects of the estimation technique.
Logistic Regressions with Random Intercepts
13The GLIMMIX Procedure
Model Information
Data Set WORK.MULTICENTER
Response Variable (Events) sideeffect
Response Variable (Trials) n
Response Distribution Binomial
Link Function Logit
Variance Function Default
Variance Matrix Blocked By center
Estimation Technique Residual PL
Degrees of Freedom Method Containment
Figure 1.
Model InformationPROC GLIMMIX recognizes the variables
sideeffect and n as the numerator anddenominator in the
events/trials syntax, respectively. The distribution—conditionalon the random center effects—is binomial. The marginal variance matrix is blockdiagonal,
and observations from the same center form the blocks. The default estimation
technique in generalized linear mixed models is residual pseudo-likelihood with
a subject-specific expansion (METHOD=RSPL).
In
Figure 2, the “Class Level Information” table lists the levels of the variablesspecified in the CLASS statement and the ordering of the levels. The “Number of
Observations” table displays the number of observations read and used in the analysis.
Class Level Information
Class Levels Values
center 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
group 2 A B
Number of Observations Read 30
Number of Observations Used 30
Number of Events 155
Number of Trials 503
Figure 2.
Class Level Information and Number of ObservationsThere are two variables listed in the CLASS statement. The
center variable hasfifteen levels, and the
group variable has two levels. Since the response is specifiedthrough the
events/trial syntax, the “Number of Observations” table also contains thetotal number of events and trials used in the analysis.
The “Dimensions” table in
Figure 3 lists the size of relevant matrices.14
The GLIMMIX ProcedureDimensions
G-side Cov. Parameters 1
Columns in X 3
Columns in Z per Subject 1
Subjects (Blocks in V) 15
Max Obs per Subject 2
Figure 3.
DimensionsThere are three columns in the
X matrix, corresponding to an intercept and the twolevels of the
group variable. For each subject (center), the Z matrix contains only anintercept column.
The “Optimization Information” table in
Figure 4 provides information about themethods and size of the optimization problem.
Optimization Information
Optimization Technique Dual Quasi-Newton
Parameters in Optimization 1
Lower Boundaries 1
Upper Boundaries 0
Fixed Effects Profiled
Starting From Data
Figure 4.
Optimization InformationThe default optimization technique for generalized linear mixed models is the Quasi-
Newton method. Because a residual likelihood technique is used to compute the objective
function, only the covariance parameters are participating in the optimization.
A lower boundary constraint is placed on the variance component for the random
center
effect. The solution for this variance cannot be less than zero.The “Iteration History” table in
Figure 5 displays information about the progress ofthe optimization process.
Logistic Regressions with Random Intercepts
15Iteration History
Objective Max
Iteration Restarts Subiterations Function Change Gradient
0 0 5 79.688580269 0.11807224 7.851E-7
1 0 3 81.294622554 0.02558021 8.209E-7
2 0 2 81.438701534 0.00166079 4.061E-8
3 0 1 81.444083567 0.00006263 2.311E-8
4 0 1 81.444265216 0.00000421 0.000025
5 0 1 81.444277364 0.00000383 0.000023
6 0 1 81.444266322 0.00000348 0.000021
7 0 1 81.44427636 0.00000316 0.000019
8 0 1 81.444267235 0.00000287 0.000017
9 0 1 81.444275529 0.00000261 0.000016
10 0 1 81.44426799 0.00000237 0.000014
11 0 1 81.444274843 0.00000216 0.000013
12 0 1 81.444268614 0.00000196 0.000012
13 0 1 81.444274277 0.00000178 0.000011
14 0 1 81.444269129 0.00000162 9.772E-6
15 0 0 81.444273808 0.00000000 9.102E-6
Convergence criterion (PCONV=1.11022E-8) satisfied.
Figure 5.
Iteration History and Convergence StatusAfter the initial optimization, the GLIMMIX procedure performed 15 updates before
the convergence criterion was met. At convergence, the largest absolute value of the
gradient was near zero. This indicates that the process stopped at an extremum of the
objective function.
The “Fit Statistics” table in
Figure 6 lists information about the fitted model.Fit Statistics
-2 Res Log Pseudo-Likelihood 81.44
Generalized Chi-Square 30.69
Gener. Chi-Square / DF 1.10
Figure 6.
Fit StatisticsTwice the negative of the residual log lilikelihood in the final pseudo-model equaled
81.44. The ratio of the generalized chi-square statistic and its degrees of freedom is
close to 1. This is a measure of the residual variability in the marginal distribution of
the data.
The “Covariance Parameter Estimates” table in
Figure 7 displays estimates andasymptotic estimated standard errors for all covariance parameters.
16
The GLIMMIX ProcedureCovariance Parameter Estimates
Standard
Cov Parm Subject Estimate Error
Intercept center 0.6176 0.3181
Figure 7.
Covariance Parameter EstimatesThe variance of the random center intercepts on the logit scale is estimated as
b2c
=0
.6176.The “Parameter Estimates” table in
Figure 8 displays the solutions for the fixed effectsin the model.
Solutions for Fixed Effects
Standard
Effect group Estimate Error DF t Value Pr > |t|
Intercept -0.8071 0.2514 14 -3.21 0.0063
group A -0.4896 0.2034 14 -2.41 0.0305
group B 0 . . . .
Figure 8.
Parameter EstimatesBecause of the fixed-effects parameterization used in the GLIMMIX procedure, the
“Intercept” effect is an estimate of
0 + B, and the “A” group effect is an estimateof
A − B, the log-odds ratio. The associated estimated probabilities of side effectsin the two groups are
b
A =1
1 + exp
{0.8071 + 0.4896}= 0
.2147b
B =1
1 + exp
{0.8071}= 0
.3085There is a significant difference between the two groups (
p=0.0305).The “Type III Tests of Fixed Effect” table in
Figure 9 displays significance tests forthe fixed effects in the model.
Type III Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
group 1 14 5.79 0.0305
Figure 9.
Type III Tests of Fixed EffectsLogistic Regressions with Random Intercepts
17Because the
group effect has only two levels, the p-value for the effect is the same asin the “Parameter Estimates” table, and the “
F Value” is the square of the “t Value”shown there.
You can produce the estimates of the average logits in the two groups and their predictions
on the scale of the data with the LSMEANS statement in PROC GLIMMIX.
ods select lsmeans;
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
lsmeans group / cl ilink;
run;
The LSMEANS statement requests the least-squares means of the
group effect on thelogit scale. The CL option requests their confidence limits. The ILINK option adds
estimates, standard errors, and confidence limits on the mean (probability) scale. The
table in
Figure 10 displays the results.The GLIMMIX Procedure
group Least Squares Means
Standard
group Estimate Error DF t Value Pr > |t| Alpha Lower Upper
A -1.2966 0.2601 14 -4.99 0.0002 0.05 -1.8544 -0.7388
B -0.8071 0.2514 14 -3.21 0.0063 0.05 -1.3462 -0.2679
group Least Squares Means
Standard
Error Lower Upper
group Mean Mean Mean Mean
A 0.2147 0.04385 0.1354 0.3233
B 0.3085 0.05363 0.2065 0.4334
Figure 10.
Least-squares MeansThe “Estimate” column displays the least-squares mean estimate on the logit scale,
and the “Mean” column represents its mapping onto the probability scale. The
“Lower” and “Upper” columns are 95% confidence limits for the logits in the two
groups. The “Lower Mean” and “Upper Mean” columns are the corresponding confidence
limits for the probabilities of side effects. These limits are obtained by inversely
linking the confidence bounds on the linear scale, and thus are not symmetric
about the estimate of the probabilities.
Basic Features
The GLIMMIX procedure enables you to specify a generalized linear mixed model
and to perform confirmatory inference in such models. The syntax is similar to that
of the MIXED procedure and includes CLASS, MODEL, and RANDOM statements.
The following are some of the basic features of PROC GLIMMIX.
•
SUBJECT= and GROUP= options, which enable blocking of variance matricesand parameter heterogeneity
•
choice of linearization about expected values or expansion about current solutionsof best linear unbiased predictors
•
flexible covariance structures for random and residual random effects, includingvariance components, unstructured, autoregressive, and spatial structures
•
CONTRAST, ESTIMATE, LSMEANS and LSMESTIMATE statements,which produce hypothesis tests and estimable linear combinations of effects
•
NLOPTIONS statement, which enables you to exercise control over the numericaloptimization. You can choose techniques, update methods, line search
algorithms, convergence criteria, and more. Or, you can choose the default
optimization strategies selected for the particular class of model you are fitting
•
computed variables with SAS programming statements inside of PROCGLIMMIX (except for variables listed in the CLASS statement). These computed
variables can appear in the MODEL, RANDOM, WEIGHT, or FREQ
statements.
•
grouped data analysis•
user-specified link and variance functions•
choice of model-based variance-covariance estimators for the fixed effectsor empirical (sandwich) estimators to make analysis robust against misspecification
of the covariance structure and to adjust for small-sample bias
•
joint modeling for multivariate data. For example, you can model binary andnormal responses from a subject jointly and use random effects to relate (fuse)
the two outcomes.
•
multinomial models for ordinal and nominal outcomes•
univariate and multivariate low-rank smoothingAssumptions
The primary assumptions underlying the analyses performed by PROC GLIMMIX
are as follows:
•
If the model contains random effects, the distribution of the data conditionalon the random effects is known. This distribution is either a member of the
exponential family of distributions or one of the supplementary distributions
provided by the GLIMMIX procedure. In models without random effects, the
Notation for the Generalized Linear Mixed Model
7unconditional (marginal) distribution is assumed to be known for maximum
likelihood estimation, or the first two moments are known in the case of quasilikelihood
estimation.
•
The conditional expected value of the data takes the form of a linear mixedmodel after a monotonic transformation is applied.
•
The problem of fitting the GLMM can be cast as a singly or doubly iterativeoptimization problem. The objective function for the optimization is a function
of either the actual log likelihood, an approximation to the log likelihood, or
the log likelihood of an approximated model.
For a model containing random effects, the GLIMMIX procedure, by default, estimates
the parameters by applying pseudo-likelihood techniques as in Wolfinger and
O’Connell (1993) and Breslow and Clayton (1993). In a model without random
effects (GLM models), PROC GLIMMIX estimates the parameters by maximum
likelihood, restricted maximum likelihood, or quasi-likelihood. See the
“Singly orDoubly Iterative Fitting”
section on page 140 on when the GLIMMIX procedure appliesnoniterative, singly and doubly iterative algorithms, and the
“Default EstimationTechniques”
section on page 142 on the default estimation methods.Once the parameters have been estimated, you can perform statistical inferences for
the fixed effects and covariance parameters of the model. Tests of hypotheses for
the fixed effects are based on Wald-type tests and the estimated variance-covariance
matrix.
PROC GLIMMIX uses the Output Delivery System (ODS) for displaying and controlling
the output from SAS procedures. ODS enables you to convert any of the
output from PROC GLIMMIX into a SAS data set. See the
“ODS Table Names”section on page 160.
ODS statistical graphics are available with the GLIMMIX procedure. For more information,
see the PLOTS options of the PROC GLIMMIX and LSMEANS statements.
For more information on the ODS GRAPHICS statement, see Chapter 15, “Statistical
Graphics Using ODS” (
SAS/STAT User’s Guide).Notation for the Generalized Linear Mixed Model
This section introduces the mathematical notation used throughout the chapter to
describe the generalized linear mixed model (GLMM). See the
“Details” section onpage 108 for a description of the fitting algorithms and the mathematical-statistical
details.
The Basic Model
Suppose
Y represents the (n × 1) vector of observed data and is a (r × 1) vectorof random effects. Models fit by the GLIMMIX procedure assume that
E
[Y|] = g−1(X + Z)where
g(·) is a differentiable monotonic link function and g−1(·) is its inverse. Thematrix
Xis a (n×p) matrix of rank k, and Z is a (n×r) design matrix for the random8
The GLIMMIX Procedureeffects. The random effects are assumed to be normally distributed with mean
0 andvariance matrix
G.The GLMM contains a linear mixed model inside the inverse link function. This
model component is referred to as the linear predictor,
= X + Z
The variance of the observations, conditional on the random effects, is
var
[Y|] = A1/2RA1/2The matrix
A is a diagonal matrix and contains the variance functions of the model.The variance function expresses the variance of a response as a function of the mean.
The GLIMMIX procedure determines the variance function from the
DIST= option inthe MODEL statement or from the user-supplied variance function (see the
“ImpliedVariance Functions”
section on page 104). The matrix R is a variance matrix specifiedby the user through the RANDOM statement. If the conditional distribution of
the data contains an additional scale parameter, it is either part of the variance functions
or part of the
R matrix. For example, the gamma distribution with mean μ hasvariance function
a(μ) = μ2 and var[Y |] = μ2-. If your model calls for G-side randomeffects only (see below), the procedure models
R = -I, where I is the identitymatrix.
Table 9 on page 104 identifies the distributions for which - 1.G-side and R-side Random Effects
The GLIMMIX procedure distinguishes two types of random effects. Depending on
whether the variance of the random effect is contained in
Gor inR, these are referredto as “G-side” and “R-side” random effects. R-side effects are also called “residual”
effects. Simply put, if a random effect is an element of , it is a G-side effect;
otherwise, it is an R-side effect. Models without G-side effects are also known as
marginal (or population-averaged) models. Models fit with the GLIMMIX procedure
can have none, one, or more of each type of effect.
Note that an R-side effect in the GLIMMIX procedure is equivalent to a REPEATED
effect in the MIXED procedure. In the GLIMMIX procedure all random effects are
specified through the RANDOM statement.
The columns of
X are constructed from effects listed on the right-hand side in theMODEL statement. Columns of
Z and the variance matricesGandRare constructedfrom the RANDOM statement.
The
R matrix is by default the scaled identity matrix, R = -I. The scale parameter-
is set to one if the distribution does not have a scale parameter, for example, inthe case of the the binary, binomial, Poisson, and exponential distribution (see
Table9
on page 104). To specify a different R matrix, use the RANDOM statement withthe –RESIDUAL– keyword or the RESIDUAL option. For example, to specify that
the
Time effect for each patient is an R-side effect with a first-order autoregressivecovariance structure, use the RESIDUAL option:
PROC GLIMMIX Contrasted with Other SAS Procedures
9random time / type=ar(1) subject=patient residual;
To add a multiplicative overdispersion parameter, use the –RESIDUAL– keyword
random _residual_;
You specify the link function
g(·) with the LINK= option of the MODEL statementor with programming statements. You specify the variance function that controls the
matrix
A with the DIST= option of the MODEL statement or with programmingstatements.
Unknown quantities subject to estimation are the fixed-effects parameter vector
and the covariance parameter vector
that comprises all unknowns in G and R.The random effects
are not parameters of the model in the sense that they are notestimated. The vector
is a vector of random variables. The solutions for arepredictors of these random variables.
Some fitting algorithms require that the best linear unbiased predictors (BLUPs) of
be computed at every iteration.
Relationship with Generalized Linear Models
Generalized linear models (Nelder and Wedderburn 1972; McCullagh and Nelder
1989) are a special case of GLMMs. If
= 0 and R = -I, the GLMM reduces toeither a generalized linear model (GLM) or a GLM with overdispersion. For example,
if
Y is a vector of Poisson variables so that A is a diagonal matrix containingE
[Y] = μ on the diagonal, then the model is a Poisson regression model for - = 1and overdispersed relative to a Poisson distribution for
- > 1. Since the Poissondistribution does not have an extra scale parameter, you can model overdispersion by
adding the statement
random _residual_;
to your GLIMMIX statements. If the only random effect is an overdispersion effect,
PROC GLIMMIX fits the model by (restricted) maximum likelihood and not one of
the methods specific to GLMMs.
PROC GLIMMIX Contrasted with Other SAS Procedures
The GLIMMIX procedure generalizes the MIXED and GENMOD procedures in two
important ways. First, the response can have a nonnormal distribution. The MIXED
procedure assumes that the response is normally (Gaussian) distributed. Second,
the GLIMMIX procedure incorporates random effects in the model and so allows
for subject-specific (conditional) and population-averaged (marginal) inference. The
GENMOD procedure only allows for marginal inference.
The GLIMMIX and MIXED procedure are closely related; see the syntax and feature
comparison in the section
“Comparing PROC GLIMMIX with PROC MIXED”10
The GLIMMIX Procedureon page 138. The remainder of this section compares PROC GLIMMIX with the
GENMOD, NLMIXED, LOGISTIC, and CATMOD procedures.
The GENMOD procedure fits generalized linear models for independent data by
maximum likelihood. It can also handle correlated data through the marginal GEE
approach of Liang and Zeger (1986) and Zeger and Liang (1986). The GEE implementation
in the GENMOD procedure is a marginal method that does not incorporate
random effects. The GEE estimation in the GENMOD procedure relies on R-side covariances
only, and the unknown parameters in
R are estimated by the method ofmoments. The GLIMMIX procedure allows G-side random effects and R-side covariances.
The parameters are estimated by likelihood-based techniques.
Many of the fit statistics and tests in the GENMOD procedure are based on the likelihood.
In a GLMM it is not always possible to derive the log likelihood of the data.
Even if the log likelihood is tractable, it may be computationally infeasible. In some
cases, the objective function must be constructed based on a substitute model. In other
cases, only the first two moments of the marginal distribution can be approximated.
Consequently, obtaining likelihood-based tests and statistics is difficult in the majority
of generalized linear mixed models. The GLIMMIX procedure relies heavily on
linearization and Taylor-series techniques to construct Wald-type test statistics and
confidence intervals. Likelihood ratio tests and confidence intervals are not available
in the GLIMMIX procedure.
The NLMIXED procedure also fits generalized linear mixed models but the class of
models it can accommodate is more narrow. The NLMIXED procedure relies on approximating
the marginal log likelihood by integral approximation through Gaussian
quadrature. Like the GLIMMIX procedure, the NLMIXED procedure defines the
problem of obtaining solutions for the parameter estimates as an optimization problem.
The objective function for the NLMIXED procedure is the marginal log likelihood
obtained by integrating out the random effects from the joint distribution of
responses and random effects using quadrature techniques. Although these are very
accurate, the number of random effects that can be practically managed is limited.
Also, R-side random effects cannot be accommodated with the NLMIXED procedure.
The GLIMMIX procedure, on the other hand, determines the marginal log
likelihood as that of an approximate linear mixed model. This allows multiple random
effects, nested and crossed random effects, multiple cluster types, and R-side
random components. The disadvantage is a doubly iterative fitting algorithm and the
absence of a true log likelihood.
The LOGISTIC and CATMOD procedures also fit generalized linear models but accommodate
the independence case only. Binary, binomial, multinomial models for
ordered data, and generalized logit models that can be fit with PROC LOGISTIC,
can also be fit with the GLIMMIX procedure. The diagnostic tools and capabilities
specific to such data implemented in the LOGISTIC procedure go beyond the
capabilities of the GLIMMIX procedure.
Logistic Regressions with Random Intercepts
11Getting Started
Logistic Regressions with Random Intercepts
Researchers investigated the performance of two medical procedures in a multicenter
study. They randomly selected 15 centers for inclusion. One of the study goals
was to compare the occurrence of side effects for the procedures. In each center
nApatients were randomly selected and assigned to procedure “A,” and
nB patients wererandomly assigned to procedure “B”. The following DATA step creates the data set
for the analysis.
data multicenter;
input center group$ n sideeffect;
datalines;
1 A 32 14
1 B 33 18
2 A 30 4
2 B 28 8
3 A 23 14
3 B 24 9
4 A 22 7
4 B 22 10
5 A 20 6
5 B 21 12
6 A 19 1
6 B 20 3
7 A 17 2
7 B 17 6
8 A 16 7
8 B 15 9
9 A 13 1
9 B 14 5
10 A 13 3
10 B 13 1
11 A 11 1
11 B 12 2
12 A 10 1
12 B 9 0
13 A 9 2
13 B 9 6
14 A 8 1
14 B 8 1
15 A 7 1
15 B 8 0
;
The variable
group identifies the two procedures, n is the number of patients whoreceived a given procedure in a particular center, and
sideeffect is the number ofpatients who reported side effects.
If
YiA and YiB denote the number of patients in center i who report side effects forprocedures
A and B, respectively, then—for a given center—these are independent12
The GLIMMIX Procedurebinomial random variables. To model the probability of side effects for the two drugs,
iA and iB, you need to account for the fixed group effect and the random selectionof centers. One possibility is to assume a model that relates group and center effects
linearly to the logit of the probabilities:
log
iA1
− iA=
0 + A + ilog
iB1
− iB=
0 + B + iIn this model,
A − B measures the difference in the logits of experiencing sideeffects, and the
i are independent random variables due to the random selectionof centers. If you think of
0 as the overall intercept in the model, then the i arerandom intercept adjustments. Observations from the same center receive the same
adjustment, and these vary randomly from center to center with variance var
[i] = 2c
.Since
iA is the conditional mean of the sample proportion, E[YiA/niA|i] = iA,you can model the sample proportions as binomial ratios in a generalized linear mixed
model. The following statements request this analysis under the assumption of normally
distributed center effects with equal variance and a logit link function.
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
run;
The PROC GLIMMIX statement invokes the procedure. The CLASS statement instructs
the procedure to treat the variables
center and group as classification variables.The MODEL statement specifies the response variable as a sample proportion
using the
events/trials syntax. In terms of the previous formulas, sideeffect/n correspondsto
YiA/niA for observations from Group A and to YiB/niB for observationsfrom Group B. The SOLUTION option in the MODEL statement requests a listing
of the solutions for the fixed-effects parameter estimates. Note that because of the
events/trials
syntax, the GLIMMIX procedure defaults to the binomial distribution,and that distribution’s default link is the logit link. The RANDOM statement specifies
that the linear predictor contains an intercept term that randomly varies at the
level of the
center effect. In other words, a random intercept is drawn separately andindependently for each center in the study.
The results of this analysis are shown in the following tables.
The “Model Information Table” in
Figure 1 summarizes important information aboutthe model you fit and about aspects of the estimation technique.
Logistic Regressions with Random Intercepts
13The GLIMMIX Procedure
Model Information
Data Set WORK.MULTICENTER
Response Variable (Events) sideeffect
Response Variable (Trials) n
Response Distribution Binomial
Link Function Logit
Variance Function Default
Variance Matrix Blocked By center
Estimation Technique Residual PL
Degrees of Freedom Method Containment
Figure 1.
Model InformationPROC GLIMMIX recognizes the variables
sideeffect and n as the numerator anddenominator in the
events/trials syntax, respectively. The distribution—conditionalon the random center effects—is binomial. The marginal variance matrix is blockdiagonal,
and observations from the same center form the blocks. The default estimation
technique in generalized linear mixed models is residual pseudo-likelihood with
a subject-specific expansion (METHOD=RSPL).
In
Figure 2, the “Class Level Information” table lists the levels of the variablesspecified in the CLASS statement and the ordering of the levels. The “Number of
Observations” table displays the number of observations read and used in the analysis.
Class Level Information
Class Levels Values
center 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
group 2 A B
Number of Observations Read 30
Number of Observations Used 30
Number of Events 155
Number of Trials 503
Figure 2.
Class Level Information and Number of ObservationsThere are two variables listed in the CLASS statement. The
center variable hasfifteen levels, and the
group variable has two levels. Since the response is specifiedthrough the
events/trial syntax, the “Number of Observations” table also contains thetotal number of events and trials used in the analysis.
The “Dimensions” table in
Figure 3 lists the size of relevant matrices.14
The GLIMMIX ProcedureDimensions
G-side Cov. Parameters 1
Columns in X 3
Columns in Z per Subject 1
Subjects (Blocks in V) 15
Max Obs per Subject 2
Figure 3.
DimensionsThere are three columns in the
X matrix, corresponding to an intercept and the twolevels of the
group variable. For each subject (center), the Z matrix contains only anintercept column.
The “Optimization Information” table in
Figure 4 provides information about themethods and size of the optimization problem.
Optimization Information
Optimization Technique Dual Quasi-Newton
Parameters in Optimization 1
Lower Boundaries 1
Upper Boundaries 0
Fixed Effects Profiled
Starting From Data
Figure 4.
Optimization InformationThe default optimization technique for generalized linear mixed models is the Quasi-
Newton method. Because a residual likelihood technique is used to compute the objective
function, only the covariance parameters are participating in the optimization.
A lower boundary constraint is placed on the variance component for the random
center
effect. The solution for this variance cannot be less than zero.The “Iteration History” table in
Figure 5 displays information about the progress ofthe optimization process.
Logistic Regressions with Random Intercepts
15Iteration History
Objective Max
Iteration Restarts Subiterations Function Change Gradient
0 0 5 79.688580269 0.11807224 7.851E-7
1 0 3 81.294622554 0.02558021 8.209E-7
2 0 2 81.438701534 0.00166079 4.061E-8
3 0 1 81.444083567 0.00006263 2.311E-8
4 0 1 81.444265216 0.00000421 0.000025
5 0 1 81.444277364 0.00000383 0.000023
6 0 1 81.444266322 0.00000348 0.000021
7 0 1 81.44427636 0.00000316 0.000019
8 0 1 81.444267235 0.00000287 0.000017
9 0 1 81.444275529 0.00000261 0.000016
10 0 1 81.44426799 0.00000237 0.000014
11 0 1 81.444274843 0.00000216 0.000013
12 0 1 81.444268614 0.00000196 0.000012
13 0 1 81.444274277 0.00000178 0.000011
14 0 1 81.444269129 0.00000162 9.772E-6
15 0 0 81.444273808 0.00000000 9.102E-6
Convergence criterion (PCONV=1.11022E-8) satisfied.
Figure 5.
Iteration History and Convergence StatusAfter the initial optimization, the GLIMMIX procedure performed 15 updates before
the convergence criterion was met. At convergence, the largest absolute value of the
gradient was near zero. This indicates that the process stopped at an extremum of the
objective function.
The “Fit Statistics” table in
Figure 6 lists information about the fitted model.Fit Statistics
-2 Res Log Pseudo-Likelihood 81.44
Generalized Chi-Square 30.69
Gener. Chi-Square / DF 1.10
Figure 6.
Fit StatisticsTwice the negative of the residual log lilikelihood in the final pseudo-model equaled
81.44. The ratio of the generalized chi-square statistic and its degrees of freedom is
close to 1. This is a measure of the residual variability in the marginal distribution of
the data.
The “Covariance Parameter Estimates” table in
Figure 7 displays estimates andasymptotic estimated standard errors for all covariance parameters.
16
The GLIMMIX ProcedureCovariance Parameter Estimates
Standard
Cov Parm Subject Estimate Error
Intercept center 0.6176 0.3181
Figure 7.
Covariance Parameter EstimatesThe variance of the random center intercepts on the logit scale is estimated as
b2c
=0
.6176.The “Parameter Estimates” table in
Figure 8 displays the solutions for the fixed effectsin the model.
Solutions for Fixed Effects
Standard
Effect group Estimate Error DF t Value Pr > |t|
Intercept -0.8071 0.2514 14 -3.21 0.0063
group A -0.4896 0.2034 14 -2.41 0.0305
group B 0 . . . .
Figure 8.
Parameter EstimatesBecause of the fixed-effects parameterization used in the GLIMMIX procedure, the
“Intercept” effect is an estimate of
0 + B, and the “A” group effect is an estimateof
A − B, the log-odds ratio. The associated estimated probabilities of side effectsin the two groups are
b
A =1
1 + exp
{0.8071 + 0.4896}= 0
.2147b
B =1
1 + exp
{0.8071}= 0
.3085There is a significant difference between the two groups (
p=0.0305).The “Type III Tests of Fixed Effect” table in
Figure 9 displays significance tests forthe fixed effects in the model.
Type III Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
group 1 14 5.79 0.0305
Figure 9.
Type III Tests of Fixed EffectsLogistic Regressions with Random Intercepts
17Because the
group effect has only two levels, the p-value for the effect is the same asin the “Parameter Estimates” table, and the “
F Value” is the square of the “t Value”shown there.
You can produce the estimates of the average logits in the two groups and their predictions
on the scale of the data with the LSMEANS statement in PROC GLIMMIX.
ods select lsmeans;
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
lsmeans group / cl ilink;
run;
The LSMEANS statement requests the least-squares means of the
group effect on thelogit scale. The CL option requests their confidence limits. The ILINK option adds
estimates, standard errors, and confidence limits on the mean (probability) scale. The
table in
Figure 10 displays the results.The GLIMMIX Procedure
group Least Squares Means
Standard
group Estimate Error DF t Value Pr > |t| Alpha Lower Upper
A -1.2966 0.2601 14 -4.99 0.0002 0.05 -1.8544 -0.7388
B -0.8071 0.2514 14 -3.21 0.0063 0.05 -1.3462 -0.2679
group Least Squares Means
Standard
Error Lower Upper
group Mean Mean Mean Mean
A 0.2147 0.04385 0.1354 0.3233
B 0.3085 0.05363 0.2065 0.4334
Figure 10.
Least-squares MeansThe “Estimate” column displays the least-squares mean estimate on the logit scale,
and the “Mean” column represents its mapping onto the probability scale. The
“Lower” and “Upper” columns are 95% confidence limits for the logits in the two
groups. The “Lower Mean” and “Upper Mean” columns are the corresponding confidence
limits for the probabilities of side effects. These limits are obtained by inversely
linking the confidence bounds on the linear scale, and thus are not symmetric
about the estimate of the probabilities.