Models for handling sample selection or informative missingness have been developed for both cross-sectional and longitudinal (panel) data. For cross-sectional data, Heckman (1979) suggested a joint model for the response and sample selection processes in which the disturbances of the two processes are correlated. For longitudinal data, Hausman and Wise (1979) and Diggle and Kenward (1994) developed a model in which the continuous response (observed or unobserved), and possibly the lagged response, is a predictor of attrition or dropout. The Heckman model can be estimated using the heckman command in Stata, and the Diggle-Kenward model is available in the Oswald package running in S-PLUS. Both models can also be estimated using gllamm, with the advantage that the following three generalisations are possible. First, the models can be extended to multilevel settings where there may be unobserved heterogeneity between the clusters at the different levels in both the substantive and selection processes and where selection may operate at several levels. Second, the Heckman model can be modified for non-normal response processes. Third, both the Heckman and Diggle-Kenward models can be extended to situations where the substantive response is a latent variable measured by a number of indicators. I will show how the standard Heckman and Diggle-Kenward models are estimated in gllamm and give examples of all three types of generalisation of these standard models. The research was carried out jointly with Anders Skrondal and Andrew Pickles.

Sophia Rabe-Hesketh

Multilevel Selection Models using gllamm
Stata User Group Meeting in Maastricht, May 2002
Sophia Rabe-Hesketh
Department of Biostatistics and Computing
Institute of Psychiatry, London
Joint work with
Anders Skrondal, Norwegian Institute of Public Health
and
Andrew Pickles, The University of Manchester
Gllamm can be downloaded from
http://www.iop.kcl.ac.uk/iop/departments/biocomp/programs/gllamm.html
Slide 1
Outline
Brief introduction to GLLAMM
The Heckman model and extensions
The Hausman-Wise-Diggle-Kenward dropout model and extensions
Application: Cluster randomized study of sex education in Norwegian schools
Slide 2
Overview of GLLAMM models
• Response model: Generalised linear model conditional on latent variables
  – Linear predictor: latent variables as factors or random coefficients
  – Links and distributions
• Structural model: Equations for the latent variables
  – Regressions of latent variables on observed variables
  – Regressions of latent variables on other latent variables
• Distribution of the latent variables (disturbances)
  – Multivariate normal
  – Discrete
Slide 3
Linear Predictor in GLLAMM
\[ \eta = \beta' x + \sum_{l=2}^{L} \sum_{m=1}^{M_l} u_m^{(l)} \lambda_m^{(l)\prime} z_m^{(l)}, \qquad \lambda_{m1}^{(l)} = 1 \text{ for identification} \]
• Fixed part: $\beta' x$ as usual
• Random part:
  – $u_m^{(l)}$ is the $m$th latent variable at level $l$, $m = 1, \ldots, M_l$, $l = 2, \ldots, L$
  – $u_m^{(l)}$ can be a factor or a random coefficient
  – $z_m^{(l)}$ are variables and $\lambda_m^{(l)}$ are parameters
  – Unless regressions for the latent variables are specified, latent variables at different levels are independent, whereas latent variables at the same level may be correlated.
Slide 4
Random coeﬃcient models in GLLAMM
• One covariate multiplies each latent variable: $u_m^{(l)} z_m^{(l)}$ (with $\lambda_m^{(l)} = 1$)
• e.g. Latent growth curve model for individuals $j$ (level 2) observed at times $t_{ij}$, $i = 1, \ldots, n_j$ (level 1)
\[ \eta_{ij} = \beta_1 + \beta_2 t_{ij} + u_{1j}^{(2)} + u_{2j}^{(2)} t_{ij} \]
  $\beta_1$, $\beta_2$: mean intercept and slope
  $u_{1j}^{(2)}$, $u_{2j}^{(2)}$: random deviations of the subject-specific intercepts and slopes from their means
• The model can also be defined as
\[ \eta_{ij} = b_{1j} + b_{2j} t_{ij}, \qquad b_{1j} = \beta_1 + u_{1j}, \qquad b_{2j} = \beta_2 + u_{2j} \]
Slide 5
Factor models in GLLAMM
• A linear combination of dummy variables for the items multiplies each latent variable: $u_m^{(l)} \lambda_m^{(l)\prime} z_m^{(l)}$
• e.g. One-factor model for items $i$, $i = 1, \ldots, I$ (level 1) and subjects $j$ (level 2)
\[ \eta_{ij} = \beta_1 \delta_{1i} + \cdots + \beta_I \delta_{Ii} + u_j^{(2)} (\delta_{1i} + \lambda_2^{(2)} \delta_{2i} + \cdots + \lambda_I^{(2)} \delta_{Ii}) = \beta_i + u_j^{(2)} \lambda_i^{(2)}; \qquad \delta_{pi} = \begin{cases} 1 & \text{if } p = i \\ 0 & \text{otherwise} \end{cases} \]
  $\beta_i$: intercept for item $i$
  $u_j^{(2)}$: common factor
  $\lambda_i^{(2)}$: factor loading for item $i$, $\lambda_1^{(2)} = 1$

  unit j   item i   $\delta_{1i}$   $\delta_{2i}$   ...   $\delta_{Ii}$   $y_{ij}$
  1        1        1               0               ...   0               $y_{11}$
  1        2        0               1               ...   0               $y_{21}$
  ...      ...      ...             ...             ...   ...             ...
  1        I        0               0               ...   1               $y_{I1}$
Slide 6
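The dummy-variable construction on this slide can be checked numerically. The following is a minimal Python sketch (Python is used only for illustration; the intercepts, loadings and factor value are made-up numbers, and `linear_predictor` is a hypothetical helper name) that evaluates the expanded form with item dummies so it can be compared with the compact form $\beta_i + u_j^{(2)} \lambda_i^{(2)}$.

```python
def linear_predictor(i, beta, lam, u):
    """Expanded form: sum_p beta_p*delta_pi + u * sum_p lambda_p*delta_pi."""
    n_items = len(beta)
    # delta_pi = 1 if p == i, 0 otherwise (item dummies for the row of item i)
    delta = [1 if p == i else 0 for p in range(n_items)]
    fixed = sum(b * d for b, d in zip(beta, delta))
    loading = sum(l * d for l, d in zip(lam, delta))
    return fixed + u * loading

beta = [0.5, -0.2, 1.0]  # hypothetical item intercepts
lam = [1.0, 0.8, 1.3]    # hypothetical loadings, first one fixed at 1
u = 0.7                  # hypothetical value of the common factor

eta = [linear_predictor(i, beta, lam, u) for i in range(len(beta))]
for i, value in enumerate(eta):
    print(f"item {i + 1}: eta = {value:.3f}, beta_i + u*lambda_i = {beta[i] + u * lam[i]:.3f}")
```

Because exactly one dummy equals 1 in each row, the two columns printed for each item should agree.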
Heckman selection model
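• Selection equation (probit regression):
\[ y_{1j}^* = \gamma' z_j + \epsilon_{1j}; \quad \epsilon_{1j} \sim N(0, 1); \quad y_{1j} = I(y_{1j}^* > 0) \]
• Substantive equation (linear regression):
\[ y_{2j} = \begin{cases} \alpha' w_j + \epsilon_{2j}, \quad \epsilon_{2j} \sim N(0, \sigma^2) & \text{if } y_{1j} = 1 \\ \text{missing} & \text{if } y_{1j} = 0 \end{cases} \]
• Correlation
\[ \mathrm{cor}(\epsilon_{1j}, \epsilon_{2j}) = \rho \]
• Missingness or non-selection is
  – Completely at random if $\gamma = 0$ and $\rho = 0$
  – At random if $\rho = 0$
  – Informative if $\rho \neq 0$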
Slide 7
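To see why $\rho \neq 0$ makes non-selection informative, the model on this slide can be simulated. The sketch below is in Python rather than Stata and rests on assumptions that are not in the slides: scalar covariates $z$ and $w$ drawn as independent standard normals, no intercepts, and made-up parameter values. It compares the mean of $y_2$ among selected units with the mean over all units.

```python
import math
import random

def simulate_heckman(n, gamma, alpha, sigma, rho, seed=1):
    """Draw n units; return (all y2 values, y2 values of the selected units)."""
    rng = random.Random(seed)
    complete, selected = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, 1.0)
        e1 = rng.gauss(0.0, 1.0)
        # e2 has standard deviation sigma and correlation rho with e1
        e2 = sigma * (rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0))
        y1 = gamma * z + e1 > 0      # selection indicator I(y1* > 0)
        y2 = alpha * w + e2          # substantive response
        complete.append(y2)
        if y1:
            selected.append(y2)
    return complete, selected

def selection_bias(rho):
    """Mean of y2 among selected units minus mean of y2 among all units."""
    complete, selected = simulate_heckman(20000, gamma=1.0, alpha=1.0, sigma=1.0, rho=rho)
    return sum(selected) / len(selected) - sum(complete) / len(complete)

bias_informative = selection_bias(0.8)
bias_ignorable = selection_bias(0.0)
print(f"rho = 0.8: {bias_informative:.3f}   rho = 0.0: {bias_ignorable:.3f}")
```

With these settings the difference should be clearly positive when $\rho = 0.8$ and close to zero when $\rho = 0$, which is the distinction the last bullet draws.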
Heckman selection model as a GLLAMM model
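• Parameterize random part as
\[ \epsilon_{ij} = u_j^{(2)} \lambda_i + e_{ij}; \quad \mathrm{var}(u_j^{(2)}) = 1; \quad \lambda_1 = 1; \quad \mathrm{var}(e_{ij}) = \nu^2; \quad \mathrm{cov}(e_{1j}, e_{2j}) = 0 \]
• Write as a GLLAMM model
\[ \eta_{ij} = \beta_1' z_{ij} \delta_{1i} + \beta_2' w_{ij} \delta_{2i} + u_j^{(2)} (\delta_{1i} + \lambda_2 \delta_{2i}); \quad \mathrm{var}(u_j^{(2)}) = 1 \]
\[ y_{ij} \mid \eta_{ij} \sim \begin{cases} \mathrm{Bernoulli}(\Phi(\eta_{ij}/\nu)) & \text{if } i = 1 \text{ (Binomial with scaled probit link)} \\ N(\eta_{ij}, \nu^2) & \text{if } i = 2 \text{ (Gaussian with identity link)} \end{cases} \]
[Path diagram: the latent variable $u^{(2)}$ loads on $y_1^*$ with loading 1 and on $y_2$ with loading $\lambda_2$; $z$ affects $y_1^*$ with coefficient $\beta_1$ and $w$ affects $y_2$ with coefficient $\beta_2$; $e_1$ and $e_2$ are the residuals of $y_1^*$ and $y_2$.]
• Equivalences:
\[ \sigma^2 = \lambda_2^2 + \nu^2; \quad \rho = \frac{\lambda_2}{\sqrt{(\lambda_2^2 + \nu^2)(1 + \nu^2)}}; \quad \gamma = \frac{\beta_1}{\sqrt{1 + \nu^2}}; \quad \alpha = \beta_2 \]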
Slide 8
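The equivalences on this slide give a way to translate gllamm estimates back to the usual Heckman parameters. The following Python sketch codes them directly. The reverse mapping is not on the slide: it is obtained here by solving the first two equivalences for $\nu^2$ and $\lambda_2$, so treat it as a derived convenience. The function names and the numerical values are illustrative only.

```python
import math

def gllamm_to_heckman(beta1, beta2, lam2, nu):
    """Map (beta1, beta2, lambda_2, nu) to (gamma, alpha, sigma^2, rho)."""
    sigma2 = lam2 ** 2 + nu ** 2
    rho = lam2 / math.sqrt((lam2 ** 2 + nu ** 2) * (1.0 + nu ** 2))
    gamma = [b / math.sqrt(1.0 + nu ** 2) for b in beta1]
    alpha = list(beta2)
    return gamma, alpha, sigma2, rho

def heckman_to_gllamm_scale(sigma2, rho):
    """Solve sigma^2 = lam2^2 + nu^2 and the rho equation for (lam2, nu)."""
    nu2 = sigma2 * (1.0 - rho ** 2) / (1.0 + rho ** 2 * sigma2)
    lam2 = math.copysign(math.sqrt(sigma2 - nu2), rho)
    return lam2, math.sqrt(nu2)

# Hypothetical values: lambda_2 = 0.5 and nu = 1
gamma, alpha, sigma2, rho = gllamm_to_heckman([2.0, -1.0], [0.3], lam2=0.5, nu=1.0)
lam2_back, nu_back = heckman_to_gllamm_scale(sigma2, rho)
print(sigma2, rho, gamma, alpha, lam2_back, nu_back)
```

Applying the second function to the output of the first should return the original $\lambda_2$ and $\nu$, which is a quick consistency check on the algebra.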
Syntax for linear predictor in gllamm
    gllamm [varlist] [if exp] [in range], i(varlist) [nrf(numlist)
        eqs(eqnames) noconstant offset(varname) constraints(numlist)
        ... ]

i(varlist)            $L - 1$ variables identifying the hierarchical, nested clusters, from level 2 to $L$, e.g., i(pupil class school).
nrf(numlist)          $L - 1$ numbers specifying the numbers of latent variables $M_l$ at each level.
eqs(eqnames)          $M = \sum_l M_l$ equations for the $\lambda_m^{(l)\prime} z_m^{(l)}$ multiplying each latent variable. No constant is assumed unless explicitly included in the equation definition.
noconstant            no constant in the fixed part $\beta' x$.
offset(varname)       variable in the fixed part with regression coefficient set to 1.
constraints(numlist)  list of linear parameter constraints defined using the constraint define command.
Slide 9
Heckman selection model - linear predictor in gllamm
• heckman command in Stata
    heckman y2 w, select(y1 = z w)
• Linear predictor in gllamm
\[ \eta_{ij} = \beta_1 z_{ij} \delta_{1i} + \beta_2 w_{ij} \delta_{1i} + \beta_3 \delta_{1i} + \beta_4 w_{ij} \delta_{2i} + \beta_5 \delta_{2i} + u_j^{(2)} (\delta_{1i} + \lambda_2 \delta_{2i}) \]
• Data manipulation
    gen id = _n
    reshape long y, i(id) j(var)
    tab var, gen(i)                     /* i1 = delta_1i, i2 = delta_2i */
    gen z_i1 = z*i1
    gen w_i1 = w*i1
    gen w_i2 = w*i2
• gllamm command
    eq load: i1 i2                      /* for u_j^(2) (delta_1i + lambda_2 delta_2i) */
    constraint define 1 [id1]i1 = 1     /* sets var(u_j^(2)) = 1 */
    gllamm y z_i1 w_i1 i1 w_i2 i2, i(id) eqs(load) nocons constr(1) /*
        */ << more options >>
Slide 10
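The data manipulation on this slide stacks the two responses into one column and interacts each covariate with the response dummies. A small Python sketch of the same restructuring is given below for readers who want to see the resulting rows. The function `stack_heckman` and the two example records are hypothetical, and dropping rows with a missing response is an assumption made here to show only the rows that contribute to the likelihood.

```python
def stack_heckman(records):
    """records: dicts with keys y1, y2 (None if not selected), z, w."""
    long_rows = []
    for unit_id, rec in enumerate(records, start=1):
        for var, y in ((1, rec["y1"]), (2, rec["y2"])):
            if y is None:
                continue  # assumption: a missing response contributes no row
            i1, i2 = int(var == 1), int(var == 2)  # dummies delta_1i, delta_2i
            long_rows.append({
                "id": unit_id, "var": var, "y": y, "i1": i1, "i2": i2,
                "z_i1": rec["z"] * i1,   # z enters the selection equation only
                "w_i1": rec["w"] * i1,   # w in the selection equation
                "w_i2": rec["w"] * i2,   # w in the substantive equation
            })
    return long_rows

records = [
    {"y1": 1, "y2": 2.5, "z": 0.3, "w": 1.2},    # selected: y2 observed
    {"y1": 0, "y2": None, "z": -1.0, "w": 0.4},  # not selected: y2 missing
]
long_rows = stack_heckman(records)
for row in long_rows:
    print(row)
```

The selected unit contributes two rows (one per response) and the non-selected unit contributes only its selection row.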
Links and families in GLLAMM
• The conditional expectation of the response is ‘linked’ to the linear predictor
\[ g(E[y \mid x, u, z]) = \eta \]
• The conditional distribution of the response is from the exponential family
• The response variables may be of mixed type, requiring mixed links and families:
  Links: identity, reciprocal, logarithm, logit, probit, scaled probit, complementary log-log
  Families: Gaussian, gamma, Poisson, binomial
  Polytomous responses: ordinal logit, ordinal probit, ordinal complementary log-log, scaled ordinal probit, multinomial logit
• Heteroscedasticity: The dispersion parameter for the Gaussian and gamma families and the scale for the scaled probit link can depend on covariates: $\log \phi = \alpha' z^{(1)}$
Slide 11
Options for links and families in gllamm
    [ ... family(families) fv(varname) link(links) lv(varname) nats
        s(eqname) ... ]

family(families)  family or families to be used.
fv(varname)       variable whose values indicate which family applies to which observation.
link(links) and lv(varname)  analogous to family(families) and fv(varname).
nats              option to estimate the scale parameter directly instead of its logarithm.
s(eqname)         equation for the (log) scale parameter.
• Heckman selection model: links and families
    reshape long y, i(id) j(var)        /* var = 1 for y = y1 and 2 for y = y2 */
    tab var, gen(i)
    << more data manipulation >>
    gllamm y z_i1 w_i1 i1 w_i2 i2, nocons i(id) eq(load) constr(1) /*
        */ family(binom gauss) fv(var) link(sprobit ident) lv(var) /*
        */ nip(10) adapt
Slide 12
Multilevel Heckman selection model
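• Example: longitudinal data with observations at times $t$ on subjects $j$ where data are missing intermittently
• Add correlated subject-level random effects $u_{j1}^{(3)}$ for the selection model and $u_{j2}^{(3)}$ for the substantive model:
\[ \eta_{itj} = \beta_1' z_{tj} \delta_{1i} + \beta_2' w_{tj} \delta_{2i} + u_{tj}^{(2)} (\delta_{1i} + \lambda_2 \delta_{2i}) + u_{j1}^{(3)} \delta_{1i} + u_{j2}^{(3)} \delta_{2i}; \quad \mathrm{var}(u_{tj}^{(2)}) = 1 \]
[Path diagram: within occasion $t$ of subject $j$, $u^{(2)}$ loads on $y_1^*$ (loading 1) and on $y_2$ (loading $\lambda_2$), with residuals $e_1$ and $e_2$; at the subject level, the correlated random effects $u_1^{(3)}$ and $u_2^{(3)}$ affect $y_1^*$ and $y_2$ respectively; $z$ affects $y_1^*$ ($\beta_1$) and $w$ affects $y_2$ ($\beta_2$).]
• The variances of $u_{j1}^{(3)}$ and $u_{j2}^{(3)}$ are identified through the intraclass correlations in the selection and substantive models respectively.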
Slide 13
Multilevel Heckman selection model in gllamm
• Linear predictor
\[ \eta_{itj} = \beta_1 z_{tj} \delta_{1i} + \beta_2 w_{tj} \delta_{1i} + \beta_3 \delta_{1i} + \beta_4 w_{tj} \delta_{2i} + \beta_5 \delta_{2i} + u_{tj}^{(2)} (\delta_{1i} + \lambda_2 \delta_{2i}) + u_{j1}^{(3)} \delta_{1i} + u_{j2}^{(3)} \delta_{2i} \]
• gllamm command
    eq load: i1 i2                      /* for u_tj^(2) (delta_1i + lambda_2 delta_2i) */
    eq i1: i1                           /* for u_j1^(3) */
    eq i2: i2                           /* for u_j2^(3) */
    constraint define 1 [t1]i1 = 1      /* sets var(u_tj^(2)) = 1 */
    gllamm y z_i1 w_i1 i1 w_i2 i2, nocons i(t id) nrf(1 2) eq(load i1 i2) /*
        */ constr(1) family(binom gauss) fv(var) link(sprobit ident) lv(var) /*
        */ nip(19 15) ip(m) adapt
Slide 14
Hausman-Wise-Diggle-Kenward dropout model
• Longitudinal data at times $t = 1, 2, 3$ for subjects $j$. Subjects drop out at some time $t > 1$ and never return
• Substantive model (without autocorrelated errors)
\[ y_{tj} = \beta' x_{tj} + u_j + \epsilon_{tj}; \quad \epsilon_{tj} \sim N(0, \sigma^2); \quad u_j \sim N(0, \tau^2) \]
• Dropout model ($d_{tj} = 1$ if subject $j$ drops out at time $t > 1$)
\[ \mathrm{logit}(\Pr(d_{tj} = 1)) = \alpha_0 + \alpha_1 y_{tj}^* + \alpha_2 y_{t-1,j}; \qquad y_{tj}^* = \begin{cases} \text{observed } y_{tj} & \text{if } d_{tj} = 0 \\ \text{unobserved } y_{tj} & \text{if } d_{tj} = 1 \end{cases} \]
• Dropout is
  – Completely at random if $\alpha_1 = \alpha_2 = 0$
  – At random if $\alpha_1 = 0$
  – Informative if $\alpha_1 \neq 0$
Slide 15
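The role of $\alpha_1$ in the dropout model can be illustrated by simulation. The Python sketch below makes several simplifying assumptions that are not in the slides: two occasions only, no covariates ($\beta' x_{tj} = 0$), and made-up values for $\sigma$, $\tau$ and the $\alpha$ parameters. It reports how far the mean of the observed $y_{2j}$ is from the mean over all subjects.

```python
import math
import random

def dropout_bias(n, alpha0, alpha1, alpha2, sigma=1.0, tau=1.0, seed=2):
    """Mean of observed y2 minus mean of all y2 under the dropout model."""
    rng = random.Random(seed)
    observed, complete = [], []
    for _ in range(n):
        u = rng.gauss(0.0, tau)              # subject-level random intercept
        y1 = u + rng.gauss(0.0, sigma)
        y2 = u + rng.gauss(0.0, sigma)
        # Dropout at time 2 depends on the current (possibly unobserved)
        # response y2 and on the lagged response y1
        p_drop = 1.0 / (1.0 + math.exp(-(alpha0 + alpha1 * y2 + alpha2 * y1)))
        dropped = rng.random() < p_drop
        complete.append(y2)
        if not dropped:
            observed.append(y2)
    return sum(observed) / len(observed) - sum(complete) / len(complete)

bias_mcar = dropout_bias(20000, alpha0=0.0, alpha1=0.0, alpha2=0.0)
bias_informative = dropout_bias(20000, alpha0=0.0, alpha1=1.0, alpha2=0.0)
print(f"completely at random: {bias_mcar:.3f}   informative: {bias_informative:.3f}")
```

With $\alpha_1 > 0$, subjects with high $y_{2j}$ are more likely to drop out, so the observed mean should fall clearly below the complete-data mean, whereas with $\alpha_1 = \alpha_2 = 0$ the two should be close.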
Diagram for dropout model
[Two path diagrams. Left, complete data ($d_2 = d_3 = 0$): the random intercept $u$ affects $y_1$, $y_2$ and $y_3$ with loadings 1; each $x_t$ affects $y_t$ with coefficient $\beta$; $\epsilon_t$ is the residual of $y_t$; the dropout indicator $d_t$ depends on $y_t$ with coefficient $\alpha_1$ and on $y_{t-1}$ with coefficient $\alpha_2$. Right, dropout at time 2 ($d_2 = 1$): the same structure for times 1 and 2, except that $y_2$ is unobserved and is shown as the latent response $y_2^*$.]
Slide 16
Structural model in GLLAMM
Regressions of latent variables on other latent and explanatory variables
\[ u = B u + \Gamma w + \zeta \]
• $u = (u_1^{(2)}, u_2^{(2)}, \ldots, u_{M_2}^{(2)}, \ldots, u_1^{(l)}, \ldots, u_{M_l}^{(l)}, \ldots, u_{M_L}^{(L)})'$ ($M$ elements)
  – factors
  – random coefficients
• $B$ is an upper triangular $M \times M$ matrix of regression coefficients
• $\Gamma$ is an $M \times p$ matrix of regression coefficients
• $w$ is a $p$-dimensional vector of explanatory variables
• $\zeta$ is an $M$-dimensional vector of errors/disturbances (same level as corresponding elements in $u$).
Slide 17
Hausman-Wise-Diggle-Kenward dropout model in GLLAMM
[Two path diagrams. Left, complete data ($d_2 = d_3 = 0$): as in the diagram for the dropout model, with the random intercept now labelled $u^{(3)}$. Right, dropout at time 2 ($d_2 = 1$): the unobserved response at time 2 is represented by the latent variable $u_2^{(2)}$ with disturbance $\zeta_2^{(2)}$; $x_2$ affects $u_2^{(2)}$ with coefficient $\gamma_1 = \beta$; $u^{(3)}$ affects $u_2^{(2)}$ with coefficient $b_{12} = 1$; $d_2$ depends on $u_2^{(2)}$ with loading $\lambda_1 = \alpha_1$ and on $y_1$ with coefficient $\alpha_2$.]
Slide 18
Hausman-Wise-Diggle-Kenward dropout model in GLLAMM
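• Linear predictor
\[ \eta_{itj} = \delta_{1i} \left( \beta' x_{tj} + u_j^{(3)} \right) + \delta_{2i} \left( \alpha_0 + \alpha_1 y_{tj} (1 - d_{tj}) + \lambda_1 u_{tj}^{(2)} d_{tj} + \alpha_2 y_{t-1,j} \right) \]
\[ = \beta' x_{tj} \delta_{1i} + \alpha_0 \delta_{2i} + \alpha_1 y_{tj} \delta_{2i} (1 - d_{tj}) + \alpha_2 y_{t-1,j} \delta_{2i} + u_{tj}^{(2)} \lambda_1 \delta_{2i} d_{tj} + u_j^{(3)} \delta_{1i} \]
• Response process
\[ y_{itj} \mid \eta_{itj} \sim \begin{cases} N(\eta_{itj}, \sigma^2) & \text{if } i = 1 \text{ (Gaussian with identity link)} \\ \mathrm{Bernoulli}\left( \dfrac{\exp(\eta_{itj})}{1 + \exp(\eta_{itj})} \right) & \text{if } i = 2 \text{ (Binomial with logit link)} \end{cases} \]
• Structural model
\[ \begin{bmatrix} u_{tj}^{(2)} \\ u_j^{(3)} \end{bmatrix} = \begin{bmatrix} 0 & b_{12} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u_{tj}^{(2)} \\ u_j^{(3)} \end{bmatrix} + \begin{bmatrix} \gamma_1' \\ 0' \end{bmatrix} x_{tj} + \begin{bmatrix} \zeta_{tj}^{(2)} \\ \zeta_j^{(3)} \end{bmatrix} \]
• Constraints
\[ \lambda_1 = \alpha_1; \quad \gamma_1 = \beta; \quad \mathrm{var}(\zeta_{tj}^{(2)}) = \sigma^2 \]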
Slide 19
Options for the structural model
    [ ... bmatrix(matrix) geqs(eqnames) frload(numlist) ... ]

bmatrix(matrix)   $M \times M$ matrix of 1s and 0s. Elements equal to 0 indicate that the corresponding element in $B$ is 0; elements equal to 1 indicate that the corresponding element in $B$ should be estimated.
geqs(eqnames)     equations for regressions of latent variables on explanatory variables. The second character of each equation name indicates which latent variable is regressed on the predictors.
frload(numlist)   frees the first factor loading for the latent variables corresponding to numlist.
Slide 20
Estimating the Hausman-Wise-Diggle-Kenward model
• Data manipulation (data are in long form)
    gen y0 = cond(y~=.,y,0)                       /* lag 0: y_tj */
    sort id t
    qui by id: gen y1 = cond(_n>1,y[_n-1],0)      /* lag 1: y_t-1,j */
    gen d = y == .                                /* d_tj */
    sort id d t
    qui by id d: drop if d==1&_n>1                /* drop records after first missing */
    gen resp1 = y
    gen resp2 = d
    reshape long resp, i(id t) j(var)             /* var = 1,2 if resp = y,d */
    drop if var == 2 & t == 1                     /* no dropout at time 1 */
    tab var, gen(i)                               /* i1 = delta_1i, i2 = delta_2i */
    gen x_i1 = x*i1
    gen y0_i2d0 = y0*i2*(1-d)                     /* y_tj delta_2i (1 - d_tj) */
    gen y1_i2 = y1*i2
    gen i2d1 = i2*d
Slide 21
. list id var t resp y0_i2d0 y1_i2 i2d1 in 1/12

        id   var    t        resp     y0_i2d0       y1_i2   i2d1
  1.     1     1    1    2.657621           0           0      0
  2.     1     2    2           1           0    2.657621      1
  3.     2     1    1    1.423789           0           0      0
  4.     2     2    2           1           0    1.423789      1
  5.     3     1    1   -.7317839           0           0      0
  6.     3     1    2    -.597519           0           0      0
  7.     3     1    3    .8041697           0           0      0
  8.     3     2    2           0    -.597519   -.7317839      0
  9.     3     2    3           0    .8041697    -.597519      0
 10.     4     1    1    .4663057           0           0      0
 11.     4     1    2    1.797121           0           0      0
 12.     4     2    2           0    1.797121    .4663057      0
Slide 22
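The listing can be reproduced outside Stata to check one's reading of the data manipulation. The Python sketch below applies the same steps to subjects 1 and 3 from the listing. The helper `build_rows` is hypothetical, it assumes monotone dropout as stated for this model, and it omits rows whose response is missing, which is an assumption about how those rows end up absent from the listing.

```python
def build_rows(subjects):
    """subjects: {id: [y at t = 1, 2, ...]}, None marking a missing response.

    Returns tuples (id, var, t, resp, y0_i2d0, y1_i2, i2d1) sorted by id, var, t.
    """
    rows = []
    for sid, ys in subjects.items():
        for t, y in enumerate(ys, start=1):
            d = int(y is None)                       # d_tj
            if d == 1 and any(v is None for v in ys[:t - 1]):
                continue                             # drop records after first missing
            y0 = 0.0 if y is None else y             # lag 0, with missing set to 0
            ylag = ys[t - 2] if t > 1 else 0.0       # lag 1
            if y is not None:
                rows.append((sid, 1, t, y, 0.0, 0.0, 0))             # response row (var = 1)
            if t > 1:
                rows.append((sid, 2, t, d, y0 * (1 - d), ylag, d))   # dropout row (var = 2)
    return sorted(rows, key=lambda r: r[:3])

subjects = {
    1: [2.657621, None, None],             # drops out at time 2
    3: [-0.7317839, -0.597519, 0.8041697], # complete data
}
rows = build_rows(subjects)
for row in rows:
    print(row)
```

The printed rows can be compared with lines 1 to 2 and 5 to 9 of the listing: the dropout row at the time of dropout has y0_i2d0 = 0 and i2d1 = 1, so the unobserved response enters only through the latent variable.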
• Linear predictor
\[ \eta_{itj} = \beta' x_{tj} \delta_{1i} + \alpha_0 \delta_{2i} + \alpha_1 y_{tj} \delta_{2i} (1 - d_{tj}) + \alpha_2 y_{t-1,j} \delta_{2i} + u_{tj}^{(2)} \lambda_1 \delta_{2i} d_{tj} + u_j^{(3)} \delta_{1i} \]
• Syntax for gllamm
    eq u_2: i2d1                                  /* for eqs(): u_tj^(2) lambda_1 delta_2i d_tj */
    eq u_3: i1                                    /* for eqs(): u_j^(3) delta_1i */
    matrix B=(0,1\0,0)                            /* for bmatrix() */
    gen one = 1
    eq f1: x one                                  /* for geqs(): gamma_2 x_tj + gamma_1 */
    constraint def 1 [b1_2]_cons = 1              /* set b_12 = 1 */
    constraint def 2 [f1]one = [resp]i1           /* set gamma_1 = beta_1 */
    constraint def 3 [f1]x = [resp]x              /* set gamma_2 = beta_2 */
    constraint def 4 [t1]i2d1 = [s1]_cons         /* set var(zeta_tj^(2)) = var(epsilon_tj) */
    constraint def 5 [t1l]i2d1 = [resp]y0_i2d0    /* set lambda_1 = alpha_1 */
    gllamm resp x_i1 i1 y0_i2d0 y1_i2 i2, i(t id) eqs(u_2 u_3) /*
        */ nocons family(gauss binom) fv(var) link(ident logit) lv(var) /*
        */ bmat(B) geqs(f1) frload(1) nats constr(1/5) nip(7) adapt
Slide 23
Extensions of the Hausman-Wise-Diggle-Kenward model
Application
• Cluster randomised study of sex education in Norway
• Schools were randomised to receive sex education or not
• Assessments pre-randomisation and at 6 months and 18 months post-randomisation
• Three ordinal outcomes (5-point scale) measuring readiness to use contraception:
  “If my partner and I were about to have intercourse without either of us having mentioned contraception ...
  – ... I would have no problems saying that I have no contraception”
  – ... I would have no problems asking my partner whether he/she has contraception”
  – ... it would be easy for me to produce a condom (if I brought one)”
• 46 schools and 1183 pupils contributed to the analysis
Slide 24
Model
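• Factor model with ordinal logit link for three outcomes $i$ at time $t$ for pupil $j$ in school $k$
\[ y_{itjk}^* = \beta_i + u_{tjk}^{(2)} \lambda_i + [u_{jk}^{(3)} \cdot 0 + u_k^{(4)} \cdot 0] + \epsilon_{itjk}; \quad \beta_1 = 0 \]
\[ y_{itjk} = s \text{ if } \kappa_{s-1} < y_{itjk}^* \le \kappa_s; \quad s = 1, \ldots, 5; \quad -\infty = \kappa_0 < \kappa_1 < \cdots < \kappa_5 = \infty \]
• Substantive model: structural model for latent outcome $u_{tjk}^{(2)}$
\[ u_{tjk}^{(2)} = \gamma_1 x_{Ttjk} + \gamma_2 x_{Itjk} + \gamma_3 x_{Ttjk} x_{Itjk} + u_{jk}^{(3)} + \zeta_{tjk}^{(2)} \]
\[ u_{jk}^{(3)} = u_k^{(4)} + \zeta_{jk}^{(3)} \]
  where $x_{Ttjk}$ is time (0, 1, 3) and $x_{Itjk}$ is an indicator for the intervention group.
• Selection model
\[ \mathrm{logit}(\Pr(d_{tjk} = 1)) = \beta_6 + \alpha_0 u_{tjk}^{(2)} + \alpha_1 u_{t-1,jk}^{(2)} + \alpha_2 u_{t-2,jk}^{(2)} \]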
Slide 25
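The threshold part of the measurement model can be made concrete with a short Python sketch. It assumes a logistic distribution for $\epsilon_{itjk}$ (which is what the ordinal logit link implies) and uses made-up thresholds; `category` and `category_probs` are hypothetical helper names.

```python
import math

def logistic_cdf(x):
    """Logistic distribution function, with the infinite limits handled explicitly."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))

def category(y_star, kappa):
    """Return s such that kappa_{s-1} < y* <= kappa_s, for s = 1, ..., 5."""
    cuts = [-math.inf] + list(kappa) + [math.inf]
    for s in range(1, len(cuts)):
        if cuts[s - 1] < y_star <= cuts[s]:
            return s

def category_probs(eta, kappa):
    """Pr(y = s) = F(kappa_s - eta) - F(kappa_{s-1} - eta), with F logistic."""
    cuts = [-math.inf] + list(kappa) + [math.inf]
    return [logistic_cdf(cuts[s] - eta) - logistic_cdf(cuts[s - 1] - eta)
            for s in range(1, len(cuts))]

kappa = [-1.0, 0.0, 1.0, 2.0]   # hypothetical thresholds kappa_1, ..., kappa_4
probs = category_probs(0.5, kappa)
print(category(0.3, kappa), [round(p, 3) for p in probs])
```

Here `eta` stands for $\beta_i + u_{tjk}^{(2)} \lambda_i$; increasing it shifts probability mass towards the higher response categories.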
[Path diagram of the model. Level 4, school $k$: disturbance $\zeta_k$. Level 3, pupil $jk$: the pupil-level latent variable $u_{jk}$, with disturbance $\zeta_{jk}$, affects the occasion-specific factors $u_{1jk}$, $u_{2jk}$ and $u_{3jk}$. Each factor $u_{tjk}$ has a disturbance $\zeta_{tjk}$ and covariates $x_{tjk}$, is measured by three ordinal items, and is linked to the dropout indicators $d_{tjk}$ at the current and later occasions.]
Slide 26
                                   Model 1         Model 2         Model 3         Model 4
                                estimate   se   estimate   se   estimate   se   estimate   se
Selection model
  $\beta_6$                       -1.99   0.26    -2.10   0.29    -0.85   0.04    -1.09   0.06
  $\alpha_0$                       0.77   0.09     0.64   0.09       –      –        –      –
  $\alpha_1$                      -0.10   0.04       –      –        –      –     -0.07   0.03
  $\alpha_2$                      -0.20   0.05       –      –        –      –     -0.25   0.04
Substantive model
  $\gamma_1$ (time)                0.32   0.10     0.45   0.09    -0.06   0.09    -0.39   0.09
  $\gamma_2$ (interv.)            -0.91   0.26    -0.25   0.23    -0.28   0.24    -1.46   0.21
  $\gamma_3$ (time by interv.)     0.48   0.11     0.34   0.10     0.20   0.11     0.56   0.11
  var($\zeta_{tjk}^{(1)}$)         6.74   0.60     7.29   0.68     4.57   0.41     5.04   0.45
  var($\zeta_{jk}^{(2)}$)          5.30   0.57     4.29   0.98     3.72   0.43     3.51   0.39
Measurement model               Not shown
log-likelihood                  -8624.49        -8631.63        -8680.35        -8657.93
Slide 27
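The log-likelihoods in the table allow likelihood-ratio comparisons between the models. The Python sketch below does the arithmetic under assumptions that the slides do not state explicitly: that the models are nested in the way the dashes suggest (a dash meaning the parameter is fixed at zero), that all four models were fitted to the same data, and that the usual asymptotic chi-squared reference distribution applies. Only 1 and 2 degrees of freedom are handled, because those have simple closed-form tail probabilities.

```python
import math

# Log-likelihoods copied from the table
loglik = {1: -8624.49, 2: -8631.63, 3: -8680.35, 4: -8657.93}

def lr_test(ll_general, ll_restricted, df):
    """Likelihood-ratio statistic and chi-squared tail probability (df = 1 or 2)."""
    stat = 2.0 * (ll_general - ll_restricted)
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))   # Pr(chi2_1 > stat)
    elif df == 2:
        p = math.exp(-stat / 2.0)              # Pr(chi2_2 > stat)
    else:
        raise ValueError("only df = 1 or 2 are handled in this sketch")
    return stat, p

comparisons = {
    "Model 1 vs Model 2 (alpha_1 = alpha_2 = 0)": lr_test(loglik[1], loglik[2], 2),
    "Model 1 vs Model 4 (alpha_0 = 0)": lr_test(loglik[1], loglik[4], 1),
    "Model 4 vs Model 3 (alpha_1 = alpha_2 = 0)": lr_test(loglik[4], loglik[3], 2),
}
for label, (stat, p) in comparisons.items():
    print(f"{label}: LR = {stat:.2f}, p = {p:.2g}")
```

Under these assumptions, the comparison of Model 1 with Model 4 is the one that bears on whether dropout depends on the current latent outcome, that is, on $\alpha_0$.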
Empirical Bayes predictions for control (left) and intervention group (right) with standard errors
[Figure: two sets of panels plotting predicted factor score (vertical axis, -5 to 5) against time (horizontal axis, 0 to 3).]
Slide 28

http://fmwww.bc.edu/RePEc/dsug2002/select.pdf
