Nested data means that individual observations at one level (e.g., students) each belong to exactly one unit of a higher-level grouping variable, such as schools. Thus, students are nested within schools.
ICC
Intraclass correlation (ICC) is the ratio of the variance between groups to the total variance, denoted \({\rho}_I\) in the population. ICC ranges from 0 (no variance between clusters) to 1 (all variance lies between clusters, with no variance within clusters). Mathematically, ICC can be formulated as follows.
\[
{\rho}_I=\frac {\tau^2}{\tau^2+\sigma^2}
\]
Where
\(\tau^2\) is the population variance between clusters and \(\sigma^2\) is the population variance within clusters.
The higher the ICC value, the more diversity there is between groups/clusters.
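As a quick numerical sketch, the formula can be applied directly (the variance values below are hypothetical, chosen only for illustration):

```r
# Hypothetical variance components (for illustration only)
tau2 <- 4     # between-cluster variance
sigma2 <- 12  # within-cluster variance
icc <- tau2 / (tau2 + sigma2)
icc  # 0.25: 25% of the total variance lies between clusters
```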
Pitfalls of Ignoring Multilevel Data Structure
When nested data are analyzed without accounting for the multilevel structure, the assumption of independent errors is likely to be violated. For example, the math achievement scores of students who attend the same school will be more correlated than the scores of students who attend different schools. This may be because students in the same school share teachers, curriculum, and community, or for other reasons. Correlation within schools results in inaccurate estimates of the standard errors of the model parameters, which in turn can lead to statistical inference errors, such as p-values that are smaller than they should be, producing Type I errors.
In addition, we may miss important relationships involving each level in the data. As in our example, there are two sampling levels: students (level 1) nested within schools (level 2). Specifically, by not including information about schools, for example, we may miss important variables at the school level that help explain students’ math performance. Therefore, beyond the well-known problem of incorrectly estimating standard errors, we are also proposing a model that is not appropriate for understanding the outcome variable of interest. In the context of MLM, including variables at each level is relatively straightforward, as are interactions among variables at different levels. This greater model complexity can in turn lead to a better understanding of the phenomena being studied.
Random Intercept
The intercept in a simple regression model represents the average value of \(y\) when \(x\) is 0. In a simple regression model, there is one intercept for all individuals in the study population. However, when individuals are grouped, such as in classes, schools, or firms, there is the potential for separate intercepts for each group.
The level 1 regression model in multilevel modeling, with a random intercept for each group, is:
\[
y_{ij}=\beta_{0j}+\varepsilon_{ij}
\]
where the subscript \(ij\) denotes individual \(i\) in group \(j\).
Random intercept:
\[
\beta_{0j}=\gamma_{00}+U_{0j}
\]\(\gamma_{00}\) represents the average intercept value across all clusters, while \(U_{0j}\) represents the group-specific effect on the intercept. When the clusters are assumed to be random, \(U_{0j}\) can be described as the residual effect on \(y_{ij}\).
If the random intercept equation is substituted into a level 1 model that also includes a predictor \(x_{ij}\) with fixed slope \(\beta_1\), we obtain the full or composite model.
\[
y_{ij}=\gamma_{00}+U_{0j}+\beta_1x_{ij}+\varepsilon_{ij}
\] In MLM, analysis often begins with a null model, containing no predictors, which can be denoted as follows. The null model serves as a baseline for model building and model comparison.
\[
y_{ij}=\gamma_{00}+U_{0j}+\varepsilon_{ij}
\]
Random Slope
The random intercept model can be extended by accommodating one or more predictor variables. For example, adding one predictor \(x_{ij}\) with a fixed slope to the level 1 model, we obtain:
\[
y_{ij}=\gamma_{00}+\gamma_{10}x_{ij}+U_{0j}+\varepsilon_{ij}
\] The model now includes a predictor and a slope relating it to the dependent variable, \(\gamma_{10}\), where the subscript \(10\) denotes the first level 1 predictor. \(\gamma_{10}\) is defined similarly to \(\beta_1\) in the regression model: it measures the effect on \(y\) of a 1-unit change in \(x\). In this model, \(\gamma_{10}\) and \(\gamma_{00}\) are fixed effects, while \(\sigma^2\) and \(\tau^2\) are the variances of the random effects.
One implication of the model above is that the dependent variable is influenced by variation between individuals (\(\sigma^2\)), variation between groups (\(\tau^2\)), the average across all groups (\(\gamma_{00}\)), and the effect of the independent variable measured by \(\gamma_{10}\), which is common to all clusters.
In practice, there is no reason that the effect of \(x\) on \(y\) must be common to all clusters; a single \(\gamma_{10}\) shared by all clusters is only one possibility. Instead, each cluster can have its own slope, \(\gamma_{10}+U_{1j}\), where \(\gamma_{10}\) is the average relationship of \(x\) to \(y\) across clusters and \(U_{1j}\) is the cluster-specific variation in that relationship. \(U_{1j}\) is assumed to have a mean of 0, so the cluster slopes vary randomly around \(\gamma_{10}\). Thus the random slope model can be denoted by:
\[
y_{ij}=\gamma_{00}+\gamma_{10}x_{ij}+U_{0j}+U_{1j}x_{ij}+\varepsilon_{ij}
\]
According to the equation above, the fixed and random parts of the model can be written as follows.
Fixed model: \(\gamma_{00}+\gamma_{10}x_{ij}\)
Random model: \(U_{0j}+U_{1j}x_{ij}+\varepsilon_{ij}\)
The model in the equation above has an interaction between cluster and \(x\), so the relationship between \(x\) and \(y\) is not constant across all clusters.
Centering
Centering refers to the practice of subtracting the mean of a variable from each individual’s value. As a result, the centered variable has a sample mean of 0, and each individual’s centered score represents a deviation from the mean, rather than whatever meaning its raw value might have. In regression, centering is commonly used to reduce collinearity caused by the inclusion of interaction terms in a model. Such issues are also important to consider in MLM, where interactions are often used. We saw earlier that the intercept is the value of the dependent variable when the independent variable is set equal to 0. In many applications, the independent variable cannot reasonably be 0 (e.g., ESCS score), which essentially makes the intercept a value needed to fit the regression line but not one with an easily interpretable meaning. However, when \(x\) has been centered, the intercept takes the value of the dependent variable when the independent variable is at its mean. This is a much more useful interpretation for researchers in many situations, and another reason why centering is an important aspect of modeling, particularly in multilevel contexts.
Centering is commonly approached by calculating the difference between an individual’s score and the overall mean across the sample, known as grand mean centering. Another approach, called group mean centering, involves calculating the difference between an individual’s score and the mean of the group they belong to. For example, with students in schools, grand mean centering would involve finding the difference between each score and the overall mean across the entire sample, while group mean centering would entail finding the difference between each score and the mean of the student’s own school.
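To make the distinction concrete, here is a small base R sketch using made-up scores for two hypothetical schools:

```r
# Toy data: three students in each of two hypothetical schools
d <- data.frame(school = rep(c("A", "B"), each = 3),
                escs   = c(-1, 0, 1, 1, 2, 3))
# Grand mean centering: subtract the overall sample mean
d$escs_grand <- d$escs - mean(d$escs)
# Group mean centering: subtract each school's own mean
# (ave() returns the group mean, repeated for each member)
d$escs_group <- d$escs - ave(d$escs, d$school)
```

Note that a student at the mean of a low-ESCS school has `escs_group = 0` even though their `escs_grand` value is negative; the two versions answer different questions.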
The choice between these two approaches should be based on the hypothesized relationship between variables \(x\) and \(y\). Grand mean centering allows for the comparison of individuals across the sample, while group mean centering places each individual in their relative position within a group. For instance, using group mean centered ESCS in the analysis means investigating the relationship between students’ relative ESCS standing within their school and their mathematics achievement. In contrast, grand mean centering would examine the relationship between students’ relative standing in the overall sample on the ESCS and mathematics achievement.
Overview of Two-Level MLMs
Previously we learned about the random slope model:
\[
y_{ij}=\gamma_{00}+\gamma_{10}x_{ij}+U_{0j}+U_{1j}x_{ij}+\varepsilon_{ij}
\] For example, \(y_{ij}\) is the dependent variable (mathematics achievement), \(x_{ij}\) is the independent variable (ESCS score), and \(\varepsilon_{ij}\) and \(U_{0j}\), \(U_{1j}\) are the random errors at the student and school levels, respectively.
We can extend the random slope model above by including independent variables at both level 1 and level 2. In addition to estimating the relationship between ESCS scores and students’ mathematics achievement, we can also examine the extent to which the average ESCS at the school level is related to students’ mathematics achievement. This model has two parts: the first explains the relationship between individual ESCS (\(x_{ij}\)) and mathematics achievement, and the second explains the level 1 coefficients as a function of the level 2 predictor, namely the school’s average ESCS (\(Z_j\)).
The model at level 1 is: \[
y_{ij}=\beta_{0j}+\beta_{1j}x_{ij}+\varepsilon_{ij}
\] and at level 2: \[
\beta_{hj}=\gamma_{h0}+\gamma_{h1}Z_j+U_{hj}
\]
The addition in the level 2 equation is \(\gamma_{h1}Z_j\), where \(\gamma_{h1}\) is the slope relating the school’s average ESCS score \(Z_j\) to the level 1 coefficient \(\beta_{hj}\). In other words, the average school ESCS is directly related to the coefficient that links students’ ESCS scores to their mathematics achievement. The level 1 and level 2 equations above can be combined into one equation for the following two-level MLM.
\[
y_{ij}=\gamma_{00}+\gamma_{10}x_{ij}+\gamma_{01}Z_j+\gamma_{11}x_{ij}Z_j+U_{0j}+U_{1j}x_{ij}+\varepsilon_{ij}
\] Where \(\gamma_{00}\) is the intercept, \(\gamma_{10}\) is the fixed effect of variable \(x\) (ESCS), \(U_{0j}\) represents the random variation of the intercept between groups, and \(U_{1j}\) represents the random variation of the slope between groups. \(\gamma_{01}\) represents the fixed effect of the level 2 variable (mean ESCS), and \(\gamma_{11}x_{ij}Z_j\) is the cross-level interaction, that is, the interaction between the level 1 and level 2 predictors; in this case, the interaction between student ESCS scores and school mean ESCS scores.
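As a small base R sketch (toy numbers and hypothetical variable names), the level 2 predictor \(Z_j\) can be built by assigning each student their school’s mean ESCS:

```r
# Toy data: two students in each of two hypothetical schools
d <- data.frame(school = c(1, 1, 2, 2),
                escs   = c(-0.5, 0.5, 1.0, 2.0))
# Z_j: the school mean of ESCS, repeated for every student in school j
d$Z <- ave(d$escs, d$school)  # ave() defaults to the group mean
d$Z  # 0.0 0.0 1.5 1.5
```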
MLM (nlme Package)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nlme)
Attaching package: 'nlme'
The following object is masked from 'package:dplyr':
collapse
nlme is a package for fitting and comparing linear and nonlinear mixed effects models.
As in the previous examples, we will estimate the extent to which ESCS scores can be used to predict student mathematics achievement. Students are nested within schools, so a single-level linear regression model is less appropriate. School is a random effect, whereas ESCS score is a fixed effect. First, we will create a null model, with only an intercept and no independent variables, using the lme() function from the nlme package.
m0 <- lme(fixed = MATH ~ 1, random = ~1 | CNTSCHID, data = pisa_idn)
We can obtain output from this model by running the summary() function.
summary(m0)
Linear mixed-effects model fit by REML
Data: pisa_idn
AIC BIC logLik
13687.47 13702.97 -6840.736
Random effects:
Formula: ~1 | CNTSCHID
(Intercept) Residual
StdDev: 48.29751 44.85779
Fixed effects: MATH ~ 1
Value Std.Error DF t-value p-value
(Intercept) 363.6311 7.737214 1256 46.99768 0
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.85680001 -0.64361007 -0.06236587 0.61427017 4.59901503
Number of Observations: 1297
Number of Groups: 41
The null model provides estimates of the variance between individuals, \(\sigma^2\), and between clusters, \(\tau^2\), and these values can be used to estimate \(\rho_I\) (ICC). Based on the output above (squaring the reported standard deviations), the ICC of this model is
\[
\hat{\rho}_I=\frac{48.298^2}{48.298^2+44.858^2}\approx 0.537
\]
We interpret this value to mean that the correlation of math achievement scores among students within the same school is approximately \(0.537\).
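As a check, the ICC can be computed directly from the standard deviations reported in the null model output, squaring them to obtain the variance components:

```r
# Variance components from the m0 summary (reported StdDev values squared)
tau <- 48.29751    # between-school (intercept) SD
sigma <- 44.85779  # within-school (residual) SD
icc <- tau^2 / (tau^2 + sigma^2)
icc
```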
To fit the model with ESCS score as the independent variable, we run the following syntax in R.
m1 <- lme(fixed = MATH ~ ESCS, random = ~1 | CNTSCHID, data = pisa_idn)
In this example, fixed = MATH ~ ESCS specifies that math achievement is predicted by ESCS score as a fixed effect. The random = ~1 | CNTSCHID term indicates that only the intercept varies randomly across schools.
Fitting this model, which is saved in the output object m1, we obtain the following output.
summary(m1)
Linear mixed-effects model fit by REML
Data: pisa_idn
AIC BIC logLik
13678.34 13699 -6835.17
Random effects:
Formula: ~1 | CNTSCHID
(Intercept) Residual
StdDev: 46.73248 44.76844
Fixed effects: MATH ~ ESCS
Value Std.Error DF t-value p-value
(Intercept) 370.3298 7.832656 1255 47.28023 0.0000
ESCS 4.3259 1.474379 1255 2.93406 0.0034
Correlation:
(Intr)
ESCS 0.29
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.05909259 -0.63661764 -0.03356555 0.61869880 4.50856329
Number of Observations: 1297
Number of Groups: 41
From this output, we can see that ESCS is a significant predictor of MATH (\(t = 2.934, p < 0.05\)). In addition, the output shows that after accounting for the impact of ESCS, the estimated standard deviation of the intercepts across schools (\(\tau_0\)) is \(46.732\), the within-school standard deviation (\(\sigma\)) is \(44.768\), and the overall fixed intercept, \(\gamma_{00}\), is \(370.330\), which is the mean math achievement when the ESCS score is \(0\).
Finally, it is possible to estimate the proportion of variance in the outcome variable that is accounted for at each level of the model. With single-level OLS regression models, the proportion of response variable variance accounted for by the model is expressed as \(R^2\). In the context of multilevel modeling, \(R^2\) values can be estimated for each level of the model. For level 1, we can calculate
\[
R^2_1=\frac{\hat{\sigma}^2_{M0}-\hat{\sigma}^2_{M1}}{\hat{\sigma}^2_{M0}}=\frac{44.858^2-44.768^2}{44.858^2}\approx 0.004
\]
This result tells us that level 1 of model 1 explains less than \(1\%\) of the variance in math achievement above and beyond that accounted for in the null model. We can also calculate a level 2 \(R^2\):
\[
R^2_2=\frac{\left(\frac{\hat{\sigma}^2_{M0}}{B}+\hat{\tau}^2_{M0}\right)-\left(\frac{\hat{\sigma}^2_{M1}}{B}+\hat{\tau}^2_{M1}\right)}{\frac{\hat{\sigma}^2_{M0}}{B}+\hat{\tau}^2_{M0}}
\]
Where \(B\) is the average size of the level 2 units, or schools in this case. R provides us with the number of individuals in the sample, 1297, and the number of schools, 41, so \(B = 1297/41 \approx 31.63\). Substituting the variance estimates from the null model and model 1 yields a level 2 \(R^2\) of approximately \(0.062\), so model 1 explains about \(6\%\) of the between-school variance.
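The two pseudo-\(R^2\) values can be computed in R from the variance estimates in the m0 and m1 summaries (a sketch using the standard proportional-reduction-in-variance formulas of Raudenbush and Bryk):

```r
# Variance components from the m0 (null) and m1 summaries (StdDev squared)
sigma2_0 <- 44.85779^2; tau2_0 <- 48.29751^2  # null model
sigma2_1 <- 44.76844^2; tau2_1 <- 46.73248^2  # model with ESCS
B <- 1297 / 41  # average number of students per school

R2_level1 <- (sigma2_0 - sigma2_1) / sigma2_0
R2_level2 <- ((sigma2_0 / B + tau2_0) - (sigma2_1 / B + tau2_1)) /
             (sigma2_0 / B + tau2_0)
```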
Random coefficient models allow the impact of ESCS on math achievement to vary from one school to another. Since we want to allow the level 1 slope to vary randomly, we change the syntax to include ESCS in the random part of the model.
m2 <- lme(fixed = MATH ~ ESCS, random = ~ESCS | CNTSCHID, data = pisa_idn)
The random slope and intercept syntax will generate the following model summary:
summary(m2)
Linear mixed-effects model fit by REML
Data: pisa_idn
AIC BIC logLik
13665.78 13696.78 -6826.892
Random effects:
Formula: ~ESCS | CNTSCHID
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 54.737693 (Intr)
ESCS 7.816269 0.794
Residual 44.284027
Fixed effects: MATH ~ ESCS
Value Std.Error DF t-value p-value
(Intercept) 367.3046 9.068580 1255 40.50299 0.0000
ESCS 3.3524 1.948364 1255 1.72062 0.0856
Correlation:
(Intr)
ESCS 0.682
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.88087233 -0.63176272 -0.03797024 0.61085613 4.15094043
Number of Observations: 1297
Number of Groups: 41
We can see that ESCS is not a significant predictor of MATH (\(t = 1.721, p > 0.05\)) across schools. The estimated coefficient \(3.352\) corresponds to \(\gamma_{10}\) and is interpreted as the average impact of ESCS on math achievement across schools. In addition, the value \(7.816\) represents the estimated standard deviation of the ESCS coefficients across schools (\(\tau_1\)), while the estimated intercept standard deviation is \(\tau_0 = 54.738\) and the residual standard deviation is \(\sigma = 44.284\).
A model with two random slopes can be defined in much the same way as a model with a single random slope. As an example, suppose we are interested in determining whether the age of the student also impacts math achievement, and we want to allow this effect to vary from one school to another. The incorporation of two random slopes can be modeled as:
m3 <- lme(fixed = MATH ~ ESCS + age, random = ~ESCS + age | CNTSCHID, data = pisa_idn)
summary(m3)
Linear mixed-effects model fit by REML
Data: pisa_idn
AIC BIC logLik
13670.72 13722.38 -6825.362
Random effects:
Formula: ~ESCS + age | CNTSCHID
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 54.752138484 (Intr) ESCS
ESCS 7.839900933 0.795
age 0.002596453 0.001 0.002
Residual 44.287184439
Fixed effects: MATH ~ ESCS + age
Value Std.Error DF t-value p-value
(Intercept) 349.4355 21.164629 1254 16.510351 0.0000
ESCS 3.2511 1.953642 1254 1.664099 0.0963
age 1.1132 1.192002 1254 0.933918 0.3505
Correlation:
(Intr) ESCS
ESCS 0.341
age -0.903 -0.054
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.88290594 -0.63142742 -0.04355263 0.60787074 4.15109489
Number of Observations: 1297
Number of Groups: 41
The results show that age is not significantly related to math achievement (\(p=0.3505\)), though the positive coefficient suggests that older students scored slightly higher. In addition, the estimated random standard deviation of the age coefficients across schools (\(0.003\)) is much smaller than that of ESCS (\(7.840\)), which means that the relationship of ESCS to mathematics achievement varies more across schools than does the impact of age.
Interactions and Cross-Level Interaction Using nlme
Cross-level interactions occur when the influence of a level-1 variable on the outcome (e.g., students’ ESCS scores) differs depending on the values of a level-2 predictor (e.g., school mean ESCS). An interaction, whether within the same level or across levels, is simply the product of two predictors. Therefore, interactions and cross-level interactions in multilevel modeling are specified in much the same way as we saw with the lm() function. The following is an example of fitting an interaction model for two level-1 variables (ESCS and age).
m4 <- lme(fixed = MATH ~ ESCS + age + ESCS*age, random = ~1 | CNTSCHID, data = pisa_idn)
summary(m4)
Linear mixed-effects model fit by REML
Data: pisa_idn
AIC BIC logLik
13675.69 13706.68 -6831.844
Random effects:
Formula: ~1 | CNTSCHID
(Intercept) Residual
StdDev: 46.67385 44.78031
Fixed effects: MATH ~ ESCS + age + ESCS * age
Value Std.Error DF t-value p-value
(Intercept) 416.8774 75.75324 1253 5.503096 0.0000
ESCS 28.9531 28.80159 1253 1.005260 0.3150
age -2.9123 4.71029 1253 -0.618286 0.5365
ESCS:age -1.5474 1.80113 1253 -0.859137 0.3904
Correlation:
(Intr) ESCS age
ESCS 0.963
age -0.995 -0.966
ESCS:age -0.962 -0.999 0.967
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.06189321 -0.64189069 -0.03528411 0.61916180 4.51047998
Number of Observations: 1297
Number of Groups: 41
We can see from the output of model 4 (m4) that age (\(t=-0.618, p>0.05\)) and the interaction between age and ESCS (\(t=-0.859, p>0.05\)) are not significant predictors of math achievement.
Centering Predictors
Centering predictors can provide slightly easier interpretation of interaction terms and can also help alleviate multicollinearity arising from the inclusion of both main effects and interactions in the same model. Centering of predictors can be accomplished in R by creating new variables. For example, grand mean centered ESCS and age variables can be created with the following syntax:
pisa_idn <- pisa_idn %>% mutate(c_escs = ESCS - mean(ESCS, na.rm = TRUE), c_age = age - mean(age, na.rm = TRUE))
Then, they can be incorporated into the model in the same manner used earlier.
m5 <- lme(fixed = MATH ~ c_escs + c_age + c_escs*c_age, random = ~1 | CNTSCHID, data = pisa_idn)
summary(m5)
Linear mixed-effects model fit by REML
Data: pisa_idn
AIC BIC logLik
13675.69 13706.68 -6831.844
Random effects:
Formula: ~1 | CNTSCHID
(Intercept) Residual
StdDev: 46.67385 44.78031
Fixed effects: MATH ~ c_escs + c_age + c_escs * c_age
Value Std.Error DF t-value p-value
(Intercept) 364.0807 7.491264 1253 48.60070 0.0000
c_escs 4.3245 1.481585 1253 2.91883 0.0036
c_age -0.6063 2.223586 1253 -0.27266 0.7852
c_escs:c_age -1.5474 1.801127 1253 -0.85914 0.3904
Correlation:
(Intr) c_escs c_age
c_escs 0.011
c_age -0.023 -0.093
c_escs:c_age -0.028 -0.065 0.841
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.06189321 -0.64189069 -0.03528411 0.61916180 4.51047998
Number of Observations: 1297
Number of Groups: 41
The results show that the interaction is still not significant (\(t=-0.859, p>0.05\)), but ESCS is significant (\(t=2.919, p<0.05\)).
The lme4 Package
Random Intercept Models Using lme4
A second function for fitting such models is lmer() in the lme4 package. The lme4 package offers a slightly more streamlined syntax for fitting multilevel models and provides a more flexible framework for defining complex models. We would fit m1 using the following syntax:
library(lme4)
Loading required package: Matrix
Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':
expand, pack, unpack
Attaching package: 'lme4'
The following object is masked from 'package:nlme':
lmList
m1 <- lmer(MATH ~ ESCS + (1 | CNTSCHID), data = pisa_idn)
The model is defined in much the same way as with the lme() function. The primary difference in model syntax is that in lmer() the random effects are denoted by their appearance within parentheses, with the grouping variable that defines the nesting structure given after the | symbol, rather than through explicit assignment using the random argument.
We can obtain output from model 1 by running the summary() function.
summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: MATH ~ ESCS + (1 | CNTSCHID)
Data: pisa_idn
REML criterion at convergence: 13670.3
Scaled residuals:
Min 1Q Median 3Q Max
-3.0591 -0.6366 -0.0336 0.6187 4.5086
Random effects:
Groups Name Variance Std.Dev.
CNTSCHID (Intercept) 2184 46.73
Residual 2004 44.77
Number of obs: 1297, groups: CNTSCHID, 41
Fixed effects:
Estimate Std. Error t value
(Intercept) 370.330 7.833 47.280
ESCS 4.326 1.474 2.934
Correlation of Fixed Effects:
(Intr)
ESCS 0.290
m2 <- lmer(MATH ~ ESCS + age + (1 | CNTSCHID), data = pisa_idn)
summary(m2)
Linear mixed model fit by REML ['lmerMod']
Formula: MATH ~ ESCS + age + (1 | CNTSCHID)
Data: pisa_idn
REML criterion at convergence: 13667.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.0591 -0.6415 -0.0350 0.6204 4.5099
Random effects:
Groups Name Variance Std.Dev.
CNTSCHID (Intercept) 2181 46.70
Residual 2005 44.77
Number of obs: 1297, groups: CNTSCHID, 41
Fixed effects:
Estimate Std. Error t value
(Intercept) 354.299 20.815 17.021
ESCS 4.240 1.478 2.868
age 1.000 1.203 0.831
Correlation of Fixed Effects:
(Intr) ESCS
ESCS 0.174
age -0.927 -0.070
m3 <- lmer(MATH ~ ESCS + (ESCS | CNTSCHID), data = pisa_idn)
summary(m3)
Linear mixed model fit by REML ['lmerMod']
Formula: MATH ~ ESCS + (ESCS | CNTSCHID)
Data: pisa_idn
REML criterion at convergence: 13653.8
Scaled residuals:
Min 1Q Median 3Q Max
-2.8809 -0.6318 -0.0380 0.6109 4.1509
Random effects:
Groups Name Variance Std.Dev. Corr
CNTSCHID (Intercept) 2996.20 54.738
ESCS 61.09 7.816 0.79
Residual 1961.08 44.284
Number of obs: 1297, groups: CNTSCHID, 41
Fixed effects:
Estimate Std. Error t value
(Intercept) 367.305 9.069 40.503
ESCS 3.352 1.948 1.721
Correlation of Fixed Effects:
(Intr)
ESCS 0.682
m4 <- lmer(MATH ~ ESCS + age + (ESCS + age | CNTSCHID), data = pisa_idn)
boundary (singular) fit: see help('isSingular')
summary(m4)
Linear mixed model fit by REML ['lmerMod']
Formula: MATH ~ ESCS + age + (ESCS + age | CNTSCHID)
Data: pisa_idn
REML criterion at convergence: 13649.1
Scaled residuals:
Min 1Q Median 3Q Max
-2.8875 -0.6277 -0.0405 0.6017 4.1594
Random effects:
Groups Name Variance Std.Dev. Corr
CNTSCHID (Intercept) 7307.187 85.482
ESCS 60.987 7.809 0.56
age 5.973 2.444 -0.87 -0.08
Residual 1957.684 44.246
Number of obs: 1297, groups: CNTSCHID, 41
Fixed effects:
Estimate Std. Error t value
(Intercept) 351.110 25.293 13.882
ESCS 3.186 1.946 1.637
age 1.006 1.383 0.728
Correlation of Fixed Effects:
(Intr) ESCS
ESCS 0.272
age -0.935 -0.028
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
m5 <- lmer(MATH ~ ESCS + age + (ESCS | CNTSCHID) + (age | CNTSCHID), data = pisa_idn)
boundary (singular) fit: see help('isSingular')
summary(m5)
Linear mixed model fit by REML ['lmerMod']
Formula: MATH ~ ESCS + age + (ESCS | CNTSCHID) + (age | CNTSCHID)
Data: pisa_idn
REML criterion at convergence: 13665.2
Scaled residuals:
Min 1Q Median 3Q Max
-2.7938 -0.6342 -0.0410 0.6077 4.2489
Random effects:
Groups Name Variance Std.Dev. Corr
CNTSCHID (Intercept) 0.000 0.000
ESCS 42.920 6.551 NaN
CNTSCHID.1 (Intercept) 0.000 0.000
age 9.154 3.026 NaN
Residual 1970.832 44.394
Number of obs: 1297, groups: CNTSCHID, 41
Fixed effects:
Estimate Std. Error t value
(Intercept) 341.714 19.746 17.306
ESCS 4.148 1.841 2.253
age 1.765 1.308 1.350
Correlation of Fixed Effects:
(Intr) ESCS
ESCS 0.158
age -0.922 -0.057
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Additional Options
Parameter Estimation Method
By default, nlme and lme4 use restricted maximum likelihood estimation (REML). However, both also allow the use of maximum likelihood estimation (ML) instead. To change the estimation method, specify method = "ML" in the nlme package or REML = FALSE in the lme4 package.
m1_1 <- lme(fixed = MATH ~ ESCS, random = ~1 | CNTSCHID, data = pisa_idn, method = "ML")
Sometimes a correctly specified model will not reach a solution (converge) using the default settings. Often, this problem can be fixed by changing the default estimation controls using the control option. Convergence issues can frequently be resolved by increasing the model iteration limit (maxIter) or by changing the optimizer (opt). To specify which controls will be changed, R must be given a list of controls and their new values. For example, control = list(maxIter = 100, opt = "optim") would change the maximum number of iterations to 100 and the optimizer to "optim". (In lme4, controls are instead set with the lmerControl() function.)
m2_1 <- lme(fixed = MATH ~ ESCS, random = ~1 | CNTSCHID, data = pisa_idn, method = "ML", control = list(maxIter = 100, opt = "optim"))
m2_2 <- lmer(MATH ~ ESCS + (1 | CNTSCHID), data = pisa_idn, REML = FALSE, control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000)))
Chi Square Test for Comparing Model Fit
Model comparison information can be obtained through the anova() function. This function provides two different types of model fit information: AIC and BIC statistics, and the chi-square (likelihood ratio) test of model fit. Note that when the models being compared differ in their fixed effects, they should be fit with ML rather than REML before comparison.
anova(m1_1, m1_2)
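A self-contained sketch of such a comparison on simulated data (the variable names below are hypothetical, not the PISA data): two nested models are fit with ML and then compared.

```r
library(nlme)
set.seed(1)
# Simulate 20 groups of 10 observations with a group random intercept
g <- rep(1:20, each = 10)
x <- rnorm(200)
y <- 5 + 1.5 * x + rep(rnorm(20, sd = 2), each = 10) + rnorm(200)
d <- data.frame(g, x, y)

m_null <- lme(fixed = y ~ 1, random = ~1 | g, data = d, method = "ML")
m_x    <- lme(fixed = y ~ x, random = ~1 | g, data = d, method = "ML")
anova(m_null, m_x)  # AIC, BIC, and the likelihood ratio (chi-square) test
```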
Confidence Intervals for Parameter Estimates
The nlme package can be used to generate \(95\%\) confidence intervals for the fixed effects and for the standard deviations of the random effects via the intervals() function. The confidence intervals obtained for the random effects can be used to judge whether those effects differ meaningfully from zero.
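A self-contained sketch of intervals() on simulated data (hypothetical variable names, not the PISA data):

```r
library(nlme)
set.seed(2)
# Simulate 15 groups of 8 observations with a random intercept per group
g <- rep(1:15, each = 8)
x <- rnorm(120)
y <- 3 + 2 * x + rep(rnorm(15), each = 8) + rnorm(120)
d <- data.frame(g, x, y)

fit <- lme(fixed = y ~ x, random = ~1 | g, data = d)
intervals(fit)  # 95% CIs for fixed effects, random effect SDs, and residual SD
```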