SAS mixed logit modeling: what is the best tool to use?
I know there is GAUSS code for this, but practical GAUSS material is hard to find these days. So I would like to ask the experts: is it convenient to fit a mixed logit model in SAS?
vincentsg
回答于 2007/09/14 02:54

You can use PROC GLIMMIX, which is intended for categorical outcomes with random effects. The following excerpt from the SAS/STAT documentation describes the procedure.

Basic Features

The GLIMMIX procedure enables you to specify a generalized linear mixed model and to perform confirmatory inference in such models. The syntax is similar to that of the MIXED procedure and includes CLASS, MODEL, and RANDOM statements. The following are some of the basic features of PROC GLIMMIX; a short syntax sketch follows the list.

- SUBJECT= and GROUP= options, which enable blocking of variance matrices and parameter heterogeneity
- choice of linearization about expected values or expansion about current solutions of best linear unbiased predictors
- flexible covariance structures for random and residual random effects, including variance components, unstructured, autoregressive, and spatial structures
- CONTRAST, ESTIMATE, LSMEANS, and LSMESTIMATE statements, which produce hypothesis tests and estimable linear combinations of effects
- NLOPTIONS statement, which enables you to exercise control over the numerical optimization. You can choose techniques, update methods, line-search algorithms, convergence criteria, and more. Or, you can choose the default optimization strategies selected for the particular class of model you are fitting
- computed variables with SAS programming statements inside of PROC GLIMMIX (except for variables listed in the CLASS statement). These computed variables can appear in the MODEL, RANDOM, WEIGHT, or FREQ statements
- grouped data analysis
- user-specified link and variance functions
- choice of model-based variance-covariance estimators for the fixed effects, or empirical (sandwich) estimators to make the analysis robust against misspecification of the covariance structure and to adjust for small-sample bias
- joint modeling for multivariate data. For example, you can model binary and normal responses from a subject jointly and use random effects to relate (fuse) the two outcomes
- multinomial models for ordinal and nominal outcomes
- univariate and multivariate low-rank smoothing
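Since the original question is about mixed logit, here is a minimal, hedged sketch of how a random-effects multinomial (generalized logit) model might be requested with these features. The data set choicedata and the variables subject, choice, price, and brand are hypothetical, and the GROUP= usage on the RANDOM statement (one random intercept per nonreference response category) follows the pattern used in SAS documentation examples for generalized logit mixed models; check it against your SAS release.

/* Hedged sketch, not from the excerpt: nominal response with a            */
/* subject-level random intercept. Data set choicedata and its variables   */
/* are hypothetical placeholders.                                          */
proc glimmix data=choicedata;
   class subject brand choice;
   model choice = price brand / dist=multinomial link=glogit solution;
   random intercept / subject=subject group=choice;   /* one intercept per logit */
run;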

Assumptions

The primary assumptions underlying the analyses performed by PROC GLIMMIX are as follows:

- If the model contains random effects, the distribution of the data conditional on the random effects is known. This distribution is either a member of the exponential family of distributions or one of the supplementary distributions provided by the GLIMMIX procedure. In models without random effects, the unconditional (marginal) distribution is assumed to be known for maximum likelihood estimation, or the first two moments are known in the case of quasi-likelihood estimation.
- The conditional expected value of the data takes the form of a linear mixed model after a monotonic transformation is applied.
- The problem of fitting the GLMM can be cast as a singly or doubly iterative optimization problem. The objective function for the optimization is a function of either the actual log likelihood, an approximation to the log likelihood, or the log likelihood of an approximated model.

For a model containing random effects, the GLIMMIX procedure, by default, estimates the parameters by applying pseudo-likelihood techniques as in Wolfinger and O'Connell (1993) and Breslow and Clayton (1993). In a model without random effects (GLM models), PROC GLIMMIX estimates the parameters by maximum likelihood, restricted maximum likelihood, or quasi-likelihood. See the "Singly or Doubly Iterative Fitting" section for when the GLIMMIX procedure applies noniterative, singly, and doubly iterative algorithms, and the "Default Estimation Techniques" section for the default estimation methods.

Once the parameters have been estimated, you can perform statistical inference for the fixed effects and covariance parameters of the model. Tests of hypotheses for the fixed effects are based on Wald-type tests and the estimated variance-covariance matrix.

PROC GLIMMIX uses the Output Delivery System (ODS) for displaying and controlling the output from SAS procedures. ODS enables you to convert any of the output from PROC GLIMMIX into a SAS data set; see the "ODS Table Names" section. ODS statistical graphics are available with the GLIMMIX procedure. For more information, see the PLOTS options of the PROC GLIMMIX and LSMEANS statements, and see the chapter "Statistical Graphics Using ODS" in the SAS/STAT User's Guide for the ODS GRAPHICS statement.
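As a minimal sketch of this ODS machinery, the statements below route GLIMMIX results into data sets and enable ODS graphics for the multicenter example used later in this excerpt. The ODS table names ParameterEstimates and CovParms are assumptions based on common SAS/STAT naming, not taken from the excerpt; confirm them in the "ODS Table Names" section of your release.

/* Hedged sketch: capture GLIMMIX output tables as data sets via ODS.      */
/* Table names ParameterEstimates and CovParms are assumed here.           */
ods graphics on;
ods output ParameterEstimates=pe CovParms=cp;

proc glimmix data=multicenter;
   class center group;
   model sideeffect/n = group / solution;
   random intercept / subject=center;
run;

ods graphics off;

proc print data=pe; run;   /* fixed-effects solutions as a SAS data set */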

Notation for the Generalized Linear Mixed Model

This section introduces the mathematical notation used throughout the chapter to describe the generalized linear mixed model (GLMM). See the "Details" section for a description of the fitting algorithms and the mathematical-statistical details.

The Basic Model

Suppose Y represents the (n × 1) vector of observed data and γ is an (r × 1) vector of random effects. Models fit by the GLIMMIX procedure assume that

   E[Y | γ] = g⁻¹(Xβ + Zγ)

where g(·) is a differentiable monotonic link function and g⁻¹(·) is its inverse. The matrix X is an (n × p) matrix of rank k, and Z is an (n × r) design matrix for the random effects. The random effects γ are assumed to be normally distributed with mean 0 and variance matrix G.

The GLMM contains a linear mixed model inside the inverse link function. This model component is referred to as the linear predictor,

   η = Xβ + Zγ

The variance of the observations, conditional on the random effects, is

   var[Y | γ] = A^(1/2) R A^(1/2)

The matrix A is a diagonal matrix and contains the variance functions of the model. The variance function expresses the variance of a response as a function of the mean. The GLIMMIX procedure determines the variance function from the DIST= option in the MODEL statement or from the user-supplied variance function (see the "Implied Variance Functions" section). The matrix R is a variance matrix specified by the user through the RANDOM statement. If the conditional distribution of the data contains an additional scale parameter, it is either part of the variance functions or part of the R matrix. For example, the gamma distribution with mean μ has variance function a(μ) = μ² and var[Y | γ] = φμ². If your model calls for G-side random effects only (see below), the procedure models R = φI, where I is the identity matrix. Table 9 identifies the distributions for which φ ≡ 1.

G-side and R-side Random Effects

The GLIMMIX procedure distinguishes two types of random effects. Depending on whether the variance of the random effect is contained in G or in R, these are referred to as "G-side" and "R-side" random effects. R-side effects are also called "residual" effects. Simply put, if a random effect is an element of γ, it is a G-side effect; otherwise, it is an R-side effect. Models without G-side effects are also known as marginal (or population-averaged) models. Models fit with the GLIMMIX procedure can have none, one, or more of each type of effect.

Note that an R-side effect in the GLIMMIX procedure is equivalent to a REPEATED effect in the MIXED procedure. In the GLIMMIX procedure all random effects are specified through the RANDOM statement.

The columns of X are constructed from effects listed on the right-hand side in the MODEL statement. Columns of Z and the variance matrices G and R are constructed from the RANDOM statement.

The R matrix is by default the scaled identity matrix, R = φI. The scale parameter φ is set to one if the distribution does not have a scale parameter, for example, in the case of the binary, binomial, Poisson, and exponential distributions (see Table 9). To specify a different R matrix, use the RANDOM statement with the _RESIDUAL_ keyword or the RESIDUAL option. For example, to specify that the Time effect for each patient is an R-side effect with a first-order autoregressive covariance structure, use the RESIDUAL option:

   random time / type=ar(1) subject=patient residual;

To add a multiplicative overdispersion parameter, use the _RESIDUAL_ keyword:

   random _residual_;

You specify the link function g(·) with the LINK= option of the MODEL statement or with programming statements. You specify the variance function that controls the matrix A with the DIST= option of the MODEL statement or with programming statements.
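For illustration, a minimal sketch of setting the conditional distribution and link explicitly rather than relying on defaults; the data set claims and the variables cost, age, and region are hypothetical placeholders, not from the excerpt.

/* Hedged sketch: gamma response with a log link and a G-side random      */
/* intercept. Data set claims and its variables are hypothetical.         */
proc glimmix data=claims;
   class region;
   model cost = age / dist=gamma link=log solution;
   random intercept / subject=region;
run;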

Unknown quantities subject to estimation are the fixed-effects parameter vector β and the covariance parameter vector θ that comprises all unknowns in G and R. The random effects γ are not parameters of the model in the sense that they are not estimated. The vector γ is a vector of random variables; the solutions for γ are predictors of these random variables. Some fitting algorithms require that the best linear unbiased predictors (BLUPs) of γ be computed at every iteration.

Relationship with Generalized Linear Models

Generalized linear models (Nelder and Wedderburn 1972; McCullagh and Nelder 1989) are a special case of GLMMs. If γ = 0 and R = φI, the GLMM reduces to either a generalized linear model (GLM) or a GLM with overdispersion. For example, if Y is a vector of Poisson variables so that A is a diagonal matrix containing E[Y] = μ on the diagonal, then the model is a Poisson regression model for φ = 1 and overdispersed relative to a Poisson distribution for φ > 1. Since the Poisson distribution does not have an extra scale parameter, you can model overdispersion by adding the statement

   random _residual_;

to your GLIMMIX statements. If the only random effect is an overdispersion effect, PROC GLIMMIX fits the model by (restricted) maximum likelihood and not one of the methods specific to GLMMs.
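A minimal sketch of an overdispersed Poisson GLM fit this way; the data set counts and the variables y and x are hypothetical placeholders.

/* Hedged sketch: Poisson regression with a multiplicative overdispersion */
/* parameter. Data set counts and variables y, x are hypothetical.        */
proc glimmix data=counts;
   model y = x / dist=poisson link=log solution;
   random _residual_;    /* adds the scale parameter phi to R = phi*I */
run;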

PROC GLIMMIX Contrasted with Other SAS Procedures

The GLIMMIX procedure generalizes the MIXED and GENMOD procedures in two important ways. First, the response can have a nonnormal distribution. The MIXED procedure assumes that the response is normally (Gaussian) distributed. Second, the GLIMMIX procedure incorporates random effects in the model and so allows for subject-specific (conditional) and population-averaged (marginal) inference. The GENMOD procedure only allows for marginal inference.

The GLIMMIX and MIXED procedures are closely related; see the syntax and feature comparison in the section "Comparing PROC GLIMMIX with PROC MIXED". The remainder of this section compares PROC GLIMMIX with the GENMOD, NLMIXED, LOGISTIC, and CATMOD procedures.

The GENMOD procedure fits generalized linear models for independent data by maximum likelihood. It can also handle correlated data through the marginal GEE approach of Liang and Zeger (1986) and Zeger and Liang (1986). The GEE implementation in the GENMOD procedure is a marginal method that does not incorporate random effects. The GEE estimation in the GENMOD procedure relies on R-side covariances only, and the unknown parameters in R are estimated by the method of moments. The GLIMMIX procedure allows G-side random effects and R-side covariances, and the parameters are estimated by likelihood-based techniques.
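To make the contrast concrete, a minimal side-by-side sketch of a marginal GEE fit in GENMOD versus a conditional random-intercept fit in GLIMMIX for a clustered binary outcome. A long-format data set named patients, with one record per patient and a 0/1 variable sideeffect01, is assumed here; it is not part of the excerpt.

/* Hedged sketch: marginal (GEE) versus conditional (random-intercept)    */
/* analyses of clustered binary data. Data set patients is hypothetical.  */
proc genmod data=patients descending;
   class center group;
   model sideeffect01 = group / dist=binomial link=logit;
   repeated subject=center / type=exch;   /* R-side working correlation, method of moments */
run;

proc glimmix data=patients;
   class center group;
   model sideeffect01(event='1') = group / dist=binary link=logit solution;
   random intercept / subject=center;     /* G-side random effect, likelihood-based */
run;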

Many of the fit statistics and tests in the GENMOD procedure are based on the likelihood. In a GLMM it is not always possible to derive the log likelihood of the data. Even if the log likelihood is tractable, it may be computationally infeasible. In some cases, the objective function must be constructed based on a substitute model. In other cases, only the first two moments of the marginal distribution can be approximated. Consequently, obtaining likelihood-based tests and statistics is difficult in the majority of generalized linear mixed models. The GLIMMIX procedure relies heavily on linearization and Taylor-series techniques to construct Wald-type test statistics and confidence intervals. Likelihood ratio tests and confidence intervals are not available in the GLIMMIX procedure.

The NLMIXED procedure also fits generalized linear mixed models, but the class of models it can accommodate is narrower. The NLMIXED procedure relies on approximating the marginal log likelihood by integral approximation through Gaussian quadrature. Like the GLIMMIX procedure, the NLMIXED procedure defines the problem of obtaining solutions for the parameter estimates as an optimization problem. The objective function for the NLMIXED procedure is the marginal log likelihood obtained by integrating out the random effects from the joint distribution of responses and random effects using quadrature techniques. Although these are very accurate, the number of random effects that can be practically managed is limited. Also, R-side random effects cannot be accommodated with the NLMIXED procedure. The GLIMMIX procedure, on the other hand, determines the marginal log likelihood as that of an approximate linear mixed model. This allows multiple random effects, nested and crossed random effects, multiple cluster types, and R-side random components. The disadvantage is a doubly iterative fitting algorithm and the absence of a true log likelihood.
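For comparison, a minimal sketch of how the random-intercept logistic model from the Getting Started example below might be written for PROC NLMIXED, which maximizes the quadrature-approximated marginal log likelihood instead. The starting values in the PARMS statement are illustrative guesses, not taken from the excerpt.

/* Hedged sketch: the multicenter random-intercept logistic model fit by  */
/* adaptive Gaussian quadrature in NLMIXED. Starting values are guesses.  */
proc nlmixed data=multicenter qpoints=10;
   parms b0=-0.8 bA=-0.5 s2c=0.5;        /* illustrative starting values        */
   eta = b0 + bA*(group='A') + c;        /* linear predictor with center effect */
   p   = 1/(1 + exp(-eta));              /* inverse logit                       */
   model sideeffect ~ binomial(n, p);    /* events out of n trials              */
   random c ~ normal(0, s2c) subject=center;
run;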

The LOGISTIC and CATMOD procedures also fit generalized linear models but accommodate the independence case only. Binary, binomial, and multinomial models for ordered data, as well as generalized logit models, that can be fit with PROC LOGISTIC can also be fit with the GLIMMIX procedure. The diagnostic tools and capabilities specific to such data implemented in the LOGISTIC procedure go beyond the capabilities of the GLIMMIX procedure.

Getting Started

Logistic Regressions with Random Intercepts

Researchers investigated the performance of two medical procedures in a multicenter study. They randomly selected 15 centers for inclusion. One of the study goals was to compare the occurrence of side effects for the procedures. In each center, nA patients were randomly selected and assigned to procedure "A," and nB patients were randomly assigned to procedure "B." The following DATA step creates the data set for the analysis.

data multicenter;
   input center group$ n sideeffect;
   datalines;
1 A 32 14
1 B 33 18
2 A 30 4
2 B 28 8
3 A 23 14
3 B 24 9
4 A 22 7
4 B 22 10
5 A 20 6
5 B 21 12
6 A 19 1
6 B 20 3
7 A 17 2
7 B 17 6
8 A 16 7
8 B 15 9
9 A 13 1
9 B 14 5
10 A 13 3
10 B 13 1
11 A 11 1
11 B 12 2
12 A 10 1
12 B 9 0
13 A 9 2
13 B 9 6
14 A 8 1
14 B 8 1
15 A 7 1
15 B 8 0
;

The variable group identifies the two procedures, n is the number of patients who received a given procedure in a particular center, and sideeffect is the number of patients who reported side effects.

If YiA and YiB denote the number of patients in center i who report side effects for procedures A and B, respectively, then, for a given center, these are independent binomial random variables. To model the probability of side effects for the two procedures, πiA and πiB, you need to account for the fixed group effect and the random selection of centers. One possibility is to assume a model that relates group and center effects linearly to the logit of the probabilities:

   log{πiA / (1 − πiA)} = β0 + βA + γi
   log{πiB / (1 − πiB)} = β0 + βB + γi

In this model, βA − βB measures the difference in the logits of experiencing side effects, and the γi are independent random variables due to the random selection of centers. If you think of β0 as the overall intercept in the model, then the γi are random intercept adjustments. Observations from the same center receive the same adjustment, and these vary randomly from center to center with variance var[γi] = σ²c.

Since πiA is the conditional mean of the sample proportion, E[YiA/niA | γi] = πiA, you can model the sample proportions as binomial ratios in a generalized linear mixed model. The following statements request this analysis under the assumption of normally distributed center effects with equal variance and a logit link function.

proc glimmix data=multicenter;
   class center group;
   model sideeffect/n = group / solution;
   random intercept / subject=center;
run;

The PROC GLIMMIX statement invokes the procedure. The CLASS statement instructs the procedure to treat the variables center and group as classification variables. The MODEL statement specifies the response variable as a sample proportion using the events/trials syntax. In terms of the previous formulas, sideeffect/n corresponds to YiA/niA for observations from group A and to YiB/niB for observations from group B. The SOLUTION option in the MODEL statement requests a listing of the solutions for the fixed-effects parameter estimates. Note that because of the events/trials syntax, the GLIMMIX procedure defaults to the binomial distribution, and that distribution's default link is the logit link. The RANDOM statement specifies that the linear predictor contains an intercept term that randomly varies at the level of the center effect. In other words, a random intercept is drawn separately and independently for each center in the study.
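Since the original question concerns mixed logit (random coefficients), a minimal sketch of how this model could be extended so that the group (treatment) effect also varies randomly across centers. This is illustrative only; whether such a richer model is well identified with only two observations per center is not addressed by the excerpt.

/* Hedged sketch: random intercept and random group effect per center,    */
/* i.e., random coefficients on the logit scale. Illustrative only.       */
proc glimmix data=multicenter;
   class center group;
   model sideeffect/n = group / solution;
   random intercept group / subject=center type=vc;
run;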

The results of this analysis are shown in the following tables. The "Model Information" table in Figure 1 summarizes important information about the model you fit and about aspects of the estimation technique.


The GLIMMIX Procedure

Model Information
Data Set                      WORK.MULTICENTER
Response Variable (Events)    sideeffect
Response Variable (Trials)    n
Response Distribution         Binomial
Link Function                 Logit
Variance Function             Default
Variance Matrix Blocked By    center
Estimation Technique          Residual PL
Degrees of Freedom Method     Containment

Figure 1. Model Information

PROC GLIMMIX recognizes the variables sideeffect and n as the numerator and denominator in the events/trials syntax, respectively. The distribution, conditional on the random center effects, is binomial. The marginal variance matrix is block-diagonal, and observations from the same center form the blocks. The default estimation technique in generalized linear mixed models is residual pseudo-likelihood with a subject-specific expansion (METHOD=RSPL).

In Figure 2, the "Class Level Information" table lists the levels of the variables specified in the CLASS statement and the ordering of the levels. The "Number of Observations" table displays the number of observations read and used in the analysis.

Class Level Information
Class     Levels    Values
center        15    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
group          2    A B

Number of Observations Read     30
Number of Observations Used     30
Number of Events               155
Number of Trials               503

Figure 2. Class Level Information and Number of Observations

There are two variables listed in the CLASS statement. The center variable has fifteen levels, and the group variable has two levels. Since the response is specified through the events/trials syntax, the "Number of Observations" table also contains the total number of events and trials used in the analysis.

The "Dimensions" table in Figure 3 lists the size of relevant matrices.


Dimensions
G-side Cov. Parameters         1
Columns in X                   3
Columns in Z per Subject       1
Subjects (Blocks in V)        15
Max Obs per Subject            2

Figure 3. Dimensions

There are three columns in the X matrix, corresponding to an intercept and the two levels of the group variable. For each subject (center), the Z matrix contains only an intercept column.

The "Optimization Information" table in Figure 4 provides information about the methods and size of the optimization problem.

Optimization Information
Optimization Technique        Dual Quasi-Newton
Parameters in Optimization    1
Lower Boundaries              1
Upper Boundaries              0
Fixed Effects                 Profiled
Starting From                 Data

Figure 4. Optimization Information

The default optimization technique for generalized linear mixed models is the quasi-Newton method. Because a residual likelihood technique is used to compute the objective function, only the covariance parameters participate in the optimization. A lower boundary constraint is placed on the variance component for the random center effect; the solution for this variance cannot be less than zero.

The "Iteration History" table in Figure 5 displays information about the progress of the optimization process.


Iteration History

                                        Objective                      Max
Iteration   Restarts   Subiterations     Function        Change    Gradient
    0          0            5          79.688580269   0.11807224   7.851E-7
    1          0            3          81.294622554   0.02558021   8.209E-7
    2          0            2          81.438701534   0.00166079   4.061E-8
    3          0            1          81.444083567   0.00006263   2.311E-8
    4          0            1          81.444265216   0.00000421   0.000025
    5          0            1          81.444277364   0.00000383   0.000023
    6          0            1          81.444266322   0.00000348   0.000021
    7          0            1          81.44427636    0.00000316   0.000019
    8          0            1          81.444267235   0.00000287   0.000017
    9          0            1          81.444275529   0.00000261   0.000016
   10          0            1          81.44426799    0.00000237   0.000014
   11          0            1          81.444274843   0.00000216   0.000013
   12          0            1          81.444268614   0.00000196   0.000012
   13          0            1          81.444274277   0.00000178   0.000011
   14          0            1          81.444269129   0.00000162   9.772E-6
   15          0            0          81.444273808   0.00000000   9.102E-6

Convergence criterion (PCONV=1.11022E-8) satisfied.

Figure 5. Iteration History and Convergence Status

After the initial optimization, the GLIMMIX procedure performed 15 updates before the convergence criterion was met. At convergence, the largest absolute value of the gradient was near zero. This indicates that the process stopped at an extremum of the objective function.

The "Fit Statistics" table in Figure 6 lists information about the fitted model.

Fit Statistics
-2 Res Log Pseudo-Likelihood    81.44
Generalized Chi-Square          30.69
Gener. Chi-Square / DF           1.10

Figure 6. Fit Statistics

Twice the negative of the residual log likelihood in the final pseudo-model equaled 81.44. The ratio of the generalized chi-square statistic and its degrees of freedom is close to 1. This is a measure of the residual variability in the marginal distribution of the data.

The "Covariance Parameter Estimates" table in Figure 7 displays estimates and asymptotic estimated standard errors for all covariance parameters.

Covariance Parameter Estimates
                                  Standard
Cov Parm     Subject   Estimate      Error
Intercept    center      0.6176     0.3181

Figure 7. Covariance Parameter Estimates

The variance of the random center intercepts on the logit scale is estimated as σ̂²c = 0.6176.

The "Parameter Estimates" table in Figure 8 displays the solutions for the fixed effects in the model.

Solutions for Fixed Effects
                                 Standard
Effect       group   Estimate       Error    DF   t Value   Pr > |t|
Intercept            -0.8071       0.2514    14     -3.21     0.0063
group        A       -0.4896       0.2034    14     -2.41     0.0305
group        B        0            .          .       .        .

Figure 8. Parameter Estimates

Because of the fixed-effects parameterization used in the GLIMMIX procedure, the "Intercept" effect is an estimate of β0 + βB, and the "A" group effect is an estimate of βA − βB, the log odds ratio. The associated estimated probabilities of side effects in the two groups are

   π̂A = 1 / (1 + exp{0.8071 + 0.4896}) = 0.2147
   π̂B = 1 / (1 + exp{0.8071}) = 0.3085

There is a significant difference between the two groups (p = 0.0305).

The "Type III Tests of Fixed Effects" table in Figure 9 displays significance tests for the fixed effects in the model.

Type III Tests of Fixed Effects
           Num    Den
Effect      DF     DF    F Value    Pr > F
group        1     14       5.79    0.0305

Figure 9. Type III Tests of Fixed Effects

Because the group effect has only two levels, the p-value for the effect is the same as in the "Parameter Estimates" table, and the "F Value" is the square of the "t Value" shown there.

You can produce the estimates of the average logits in the two groups and their predictions on the scale of the data with the LSMEANS statement in PROC GLIMMIX.

ods select lsmeans;
proc glimmix data=multicenter;
   class center group;
   model sideeffect/n = group / solution;
   random intercept / subject=center;
   lsmeans group / cl ilink;
run;

The LSMEANS statement requests the least-squares means of the group effect on the logit scale. The CL option requests their confidence limits. The ILINK option adds estimates, standard errors, and confidence limits on the mean (probability) scale. The table in Figure 10 displays the results.

The GLIMMIX Procedure

group Least Squares Means
                  Standard
group  Estimate      Error   DF   t Value   Pr > |t|   Alpha     Lower     Upper
A       -1.2966     0.2601   14     -4.99     0.0002    0.05   -1.8544   -0.7388
B       -0.8071     0.2514   14     -3.21     0.0063    0.05   -1.3462   -0.2679

group Least Squares Means
                  Standard
                     Error     Lower     Upper
group      Mean       Mean      Mean      Mean
A        0.2147    0.04385    0.1354    0.3233
B        0.3085    0.05363    0.2065    0.4334

Figure 10. Least-squares Means

The "Estimate" column displays the least-squares mean estimate on the logit scale, and the "Mean" column represents its mapping onto the probability scale. The "Lower" and "Upper" columns are 95% confidence limits for the logits in the two groups. The "Lower Mean" and "Upper Mean" columns are the corresponding confidence limits for the probabilities of side effects. These limits are obtained by inversely linking the confidence bounds on the linear scale, and thus are not symmetric about the estimates of the probabilities.
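As a minimal sketch of that inverse-link mapping, the DATA step below reproduces the "Mean" columns of Figure 10 by applying the inverse logit to the logit-scale estimates and limits by hand. The numbers are the Figure 10 values; the data set name lsm_check is hypothetical.

/* Hedged sketch: apply the inverse link to the LSMEANS logits and their  */
/* confidence limits. Data set name lsm_check is hypothetical.            */
data lsm_check;
   input group $ estimate lower upper;
   mean       = 1/(1 + exp(-estimate));   /* inverse logit of the LS-mean */
   lower_mean = 1/(1 + exp(-lower));      /* inverse logit of lower limit */
   upper_mean = 1/(1 + exp(-upper));      /* inverse logit of upper limit */
   datalines;
A -1.2966 -1.8544 -0.7388
B -0.8071 -1.3462 -0.2679
;
run;

proc print data=lsm_check; run;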
