Mixed models: model syntax
keywords power analysis, mixed models, multiple clusters, participants by stimuli
0.8.2
Here we discuss the rules for defining mixed models for mixed-model power analysis in PAMLj.
Model Syntax
Power analysis for mixed models requires specifying a mixed model with
all its expected coefficients. PAMLj
employs a custom syntax based on the standard formulas of the R package lme4 (Bates et al. 2015), modified
so that coefficients can be passed easily.
Model Terms
First, the model is defined using the formula syntax of the R package lme4 (Bates et al. 2015). For
instance, a simple random-intercept model may look like this:
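As a sketch, a formula consistent with the description that follows would be the one below. The outcome name y is an assumption made for illustration; x and clustervar are the names used in the text.

```text
y~1+x+(1|clustervar)
```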
Recall that 1 indicates the intercept: this model has a
fixed intercept, a fixed effect of x, and a random intercept
across a variable named clustervar. The definition of terms
closely follows the R syntax, with a few restrictions:
- Interactions are always defined with the colon (:), never with the star (*).
- Intercepts should always be defined with 1 or 0. For fixed effects, the intercept may be omitted, but a warning is issued. It is better to declare it explicitly.
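These restrictions can be illustrated with a short sketch. The outcome y and the second predictor z are hypothetical names. The fixed and random intercepts are declared explicitly with 1, and the interaction between x and z is written with the colon.

```text
y~1+x+z+x:z+(1|clustervar)
```

Writing x*z would not define the interaction here, presumably because the star is used to attach coefficients to terms.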
Model Coefficients
Second, one needs to input the coefficients of each term in the model
using the syntax value*x. Thus, the syntax:
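With y as an assumed outcome name, a formula matching the interpretation given next would be:

```text
y~1*1+.5*x+(1*1|clustervar)
```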
This means that we expect the fixed intercept to be \(1\), the fixed coefficient associated with \(x\) to be equal to \(.5\), and the variance of the random intercept to be \(1\). There are a few rules to follow:
- Each term should have a coefficient.
- Coefficients for categorical variables are passed with the bracket syntax, for example
[1,2,3]*x, where x has 4 levels and therefore requires 3 contrast variables to estimate its effect.
For instance, if x is categorical with 3 levels, the
syntax would be:
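A sketch of such a formula is given below. The outcome name y is assumed, and the two coefficient values are chosen purely for illustration. By the rule above, three levels require two contrast variables, hence two values within the brackets.

```text
y~1*1+[.5,.2]*x+(1*1|clustervar)
```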
If a model also has a random coefficient for the independent variable (random slopes), it should be added to the model with its expected coefficient. For instance:
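A sketch consistent with the explanation that follows is given below. The outcome name y and the fixed coefficients are assumed for illustration; the random-slope variance 1.5 is the value discussed in the text.

```text
y~1*1+.5*x+(1*1+1.5*x|clustervar)
```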
This indicates that x has a random slope whose variance is
\(1.5\) across levels of
clustervar. If the random slope concerns a categorical
variable, the variances of the contrast
variables representing that variable should be passed with the
bracket syntax:
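As a sketch, assuming an outcome named y and illustrative fixed coefficients for a four-level x, the variances discussed next would be passed as follows:

```text
y~1*1+[1,2,3]*x+(1*1+[1.1,2,3.1]*x|clustervar)
```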
This indicates a random coefficient for each of the three contrast variables
representing the effect of x, with variances \(1.1\), \(2\), and \(3.1\) respectively.
For categorical variables, indicate the sign of each coefficient within the brackets:
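A minimal sketch, with y as an assumed outcome name and coefficient values chosen only for illustration, in which the first and third contrasts of a four-level x are expected to be negative:

```text
y~1*1+[-1,2,-3]*x+(1*1|clustervar)
```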
Model Coefficients sign
Coefficients can be positive or negative. For continuous independent variables (terms), simply use the negative sign in the formula, as in:
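A sketch of a negative fixed effect of x is given below. It assumes an outcome named y and assumes that the minus sign is written directly in front of the coefficient in place of the plus; the value is illustrative.

```text
y~1*1-.5*x+(1*1|clustervar)
```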
Multiple clustering variables can be specified, for instance:
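As a sketch of a participants-by-stimuli design, the formula below declares one random intercept per clustering variable. The names y, participant, and stimulus are hypothetical, and all coefficient values are illustrative.

```text
y~1*1+.5*x+(1*1|participant)+(1*1|stimulus)
```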
Symbolic coefficients
PAMLj also allows passing symbolic coefficients in the formula. Symbolic coefficients are labels that can be used to refer to a term or coefficient in additional syntax lines. Here is an example:
In this example, a and b would refer to the
coefficients of x and z respectively.
Additional directives
The field Module Structure accepts additional directives that apply to the specified model. At the moment, only one directive is recognized.
test: alabel
This directive implies that the estimated sample size (Required
Number of cluster levels or Required N per cluster) is
solved for the effect whose label is alabel. By default,
PAMLj estimates the required
sample size for the smallest fixed effect in the model. If one wishes to
estimate the required sample size for a coefficient that is not the
smallest, simply use a symbolic coefficient for the target term and use the
test: directive. In practice:
This computes the required sample size for 2*z, even though
.5*x is the smaller effect.
Additional material
Details
Some more information about the module specs can be found here
Examples
Some worked out practical examples can be found here
Comments?
Got comments or issues, or spotted a bug? Please open an issue on PAMLj at GitHub or send me an email.