Process model 7 using Hayes Process macro with RStudio

In this post, I discuss Hayes Process Model 7 and illustrate its use when testing for first-stage moderated mediation using RStudio.

[For a demo of Model 58 which pivots off concepts and procedures discussed in this post, see this blog post.]

First, a (very) brief review of simple mediation...

When specifying a simple mediation model, part of the effect of an independent variable (X) on a dependent variable (Y) is presumed to be transmitted by way of a proposed mediator (M). In effect, the mediator serves as the mechanism (or process) by which X influences Y. The paths in the model below all represent direct effects of one variable on another. Path a represents the direct effect of X on M. Path b represents the direct effect of the mediator (M) on the dependent variable (Y). Path c' (c-prime) represents the direct effect of X on Y. The indirect effect of X on Y in the model is computed as the product of the coefficients for paths a and b: IE=a*b. This effect (if significantly different from zero) is typically regarded as evidence supporting a putative claim of mediation.  We can compute the total effect of X on Y as the sum of its direct and indirect effects: c' + a*b.
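To make the arithmetic concrete, here is a minimal sketch in R using simulated data (the variable names, sample size, and effect sizes are illustrative assumptions, not values from any real study). It estimates paths a, b, and c' with two ordinary regressions and shows that the total effect equals c' + a*b.

```r
# Simulated data: all names and effect sizes are illustrative assumptions
set.seed(123)
n <- 500
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)             # path a is 0.5 in the simulation
y <- 0.3 * x + 0.4 * m + rnorm(n)   # path c' is 0.3, path b is 0.4

fit_m     <- lm(m ~ x)       # M regressed on X (gives path a)
fit_y     <- lm(y ~ x + m)   # Y regressed on X and M (gives paths c' and b)
fit_total <- lm(y ~ x)       # Y regressed on X alone (gives the total effect)

a       <- unname(coef(fit_m)["x"])
b       <- unname(coef(fit_y)["m"])
c_prime <- unname(coef(fit_y)["x"])
c_total <- unname(coef(fit_total)["x"])

indirect <- a * b   # indirect effect of X on Y through M

# The last two entries agree: total effect = direct effect + indirect effect
round(c(a = a, b = b, c_prime = c_prime, indirect = indirect,
        direct_plus_indirect = c_prime + indirect, total = c_total), 4)
```

With ordinary least squares and the same cases in every regression, the two versions of the total effect match exactly, not just approximately.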

Important: This model cannot be used to 'prove' mediation, as there are many other conditions that must be met for that to occur. The model simply allows one to evaluate evidence for its consistency with a proposed mediation mechanism. For a much deeper dive on these interpretational issues, see Hayes (2023). 


The simple mediation model described above does not have any conditional effects in its specification. A conditional effect refers to situations where the effect of one variable on another within the system changes across levels of another variable. This encompasses the idea of statistical moderation. This concept is explored further in the next part of my discussion.

Testing for first-stage moderated mediation using Process model 7

In the conceptual model below, you can see that paths b and c' continue to be interpreted as direct effects. However, path a has an arrow pointing to it from variable W (the proposed moderator). This specification indicates the coefficient for path a should not be treated as constant (as in the case of the previous simple mediation model), but rather as a quantity that varies across levels of a moderating variable (W). Since the coefficient for path a is not a single number (but rather varies as a function of W), the effect of X on M at a given value of W is a conditional direct effect.

We saw in the simple mediation model that the indirect effect of X on Y is quantified as the product of paths a and b (i.e., a*b). If path a is proposed to vary as a function of W (as in the current model), the indirect effect of X on Y must also vary as a function of W. Therefore, instead of the indirect effect of X on Y being a single estimate (as in the case of the simple mediation model), it is distributed across a set of conditional indirect effects (CIEs), with each CIE dependent on the value of W. By adding the proposed moderator (W) to the previous simple mediation model (which was a process model), we end up with a conditional process model.

Since the indirect effect of X on Y is considered to vary as a function of the moderation of path a (the first stage of the mediation), this model can be described as a first-stage moderated mediation model. Hayes' (2023) template 7 is the one we can use when testing this type of model using the Process macro.



Statistically, the model can be broken down into two linear regressions, with the parameters from these two models quantifying the unconditional and conditional direct effects.


Equation 1 involves the regression of the proposed mediator onto X, W, and an interaction variable (formed as the product of X and W). Equation 2 involves the regression of Y onto X and M. [Correlations among the predictors in these models are not represented in the diagrams, but are assumed.]
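In symbols, the two regressions described above can be sketched as follows. The coefficient labels (a_1, a_2, a_3), the intercepts (i_M, i_Y), and the error terms are my own labeling assumptions, chosen to match the path names used in this post; covariates are left out for brevity.

```latex
\begin{align}
M &= i_M + a_1 X + a_2 W + a_3 XW + e_M  && \text{(Equation 1)} \\
Y &= i_Y + c' X + b M + e_Y              && \text{(Equation 2)}
\end{align}

% Collecting the terms in Equation 1 that involve X gives the
% conditional direct effect of X on M at a given value of W:
\begin{equation}
\theta_{X \rightarrow M} = a_1 + a_3 W
\end{equation}
```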

Each possible indirect effect of X on Y, conditional on W, is computed as the product of the direct effect of M on Y (path b) and the conditional direct effect of X on M (path a | W=w). Because each conditional indirect effect (CIE) is the indirect effect of X on Y at one particular level of the proposed moderator (W), a test of a CIE does not convey information about whether the indirect effect differs across levels of the moderator. Any test of a given CIE (i.e., the indirect effect of X on Y at W=w) is a test of whether that indirect effect is different from zero. It is NOT a formal test of moderation of indirect effects. The Index of Moderated Mediation (IMM), however, quantifies linear moderation of the indirect effect (Hayes, 2023). The value for this index indicates the linear change in the indirect effect for a one-unit increment on the proposed moderator. If the IMM is significantly different from zero (the population null effect), this would support the researcher's hypothesis that the indirect effect is moderated.
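The sketch below shows these quantities computed by hand in R with simulated data, so you can see what sits behind the Process output. Everything in it is an assumption for illustration: the data are simulated, model7_effects() is a hypothetical helper I wrote for this post (not part of the Process macro), and the interval is a simple percentile bootstrap.

```r
# Simulated data: variable names and effect sizes are illustrative assumptions
set.seed(2024)
n <- 400
x <- rnorm(n)
w <- rnorm(n)
m <- 0.4 * x + 0.2 * w + 0.3 * x * w + rnorm(n)   # Equation 1: path a depends on w
y <- 0.2 * x + 0.5 * m + rnorm(n)                 # Equation 2: paths c' and b
dat <- data.frame(x, w, m, y)

# Values of W at which the indirect effect is evaluated (16th, 50th, 84th percentiles)
w_vals <- unname(quantile(dat$w, probs = c(0.16, 0.50, 0.84)))

# Hypothetical helper: fits both equations, returns the IMM and the three CIEs
model7_effects <- function(d, w_values) {
  eq1 <- coef(lm(m ~ x * w, data = d))   # M on X, W, and X*W
  eq2 <- coef(lm(y ~ x + m, data = d))   # Y on X and M
  a1 <- eq1[["x"]]
  a3 <- eq1[["x:w"]]
  b  <- eq2[["m"]]
  c(imm = a3 * b,                        # index of moderated mediation
    cie = (a1 + a3 * w_values) * b)      # conditional indirect effects
}

est <- model7_effects(dat, w_vals)

# Percentile bootstrap: resample rows, re-estimate, take the 2.5% and 97.5% quantiles
boot_est <- replicate(1000, {
  idx <- sample(nrow(dat), replace = TRUE)
  model7_effects(dat[idx, ], w_vals)
})
boot_ci <- t(apply(boot_est, 1, quantile, probs = c(0.025, 0.975)))

round(cbind(estimate = est, boot_ci), 4)
```

Notice that the difference between any two CIEs is just the IMM multiplied by the distance between the two values of W, which is why a single test of the IMM can stand in for a test of whether the CIEs differ.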

Two final notes
  1. The model I have described above includes only a single proposed mediator. However, it IS possible to specify parallel mediators that are moderated by a single variable W using Process Model 7. 
  2. In theory (and in actuality), it is possible to specify any of the three paths (a, b, and c') as being moderated, with moderation of path a, b, or both converting a simple mediation model into a conditional process model. [Moderation of path c' alone would not produce a conditional process model, as it does not condition an indirect effect.] We have already seen that first-stage moderated mediation involves the specific case where only the first stage (path a) of the effect of X on Y is moderated by W. If, instead, path b is moderated by W (but not path a), the model would be a second-stage moderated mediation model. If both paths a and b are conditioned on W, then the model would be a first- and second-stage moderated mediation model. [Other permutations also exist involving different variables W and Z moderating different paths.] I will address these other types of models in future posts.

Example of process model 7 using Process macro with RStudio

For our example, we will be using data associated with an article by Zhou et al. (2016). We will not be testing the authors' original model. Rather, we will be testing the following conceptual model using Hayes Process macro template 7:


For this model, we are assuming an indirect effect of burnout on anxiety by way of positive coping. However, we are also speculating the direct effect of burnout on positive coping and the indirect effect of burnout on anxiety via positive coping is moderated by extraversion. Although not shown in this figure, we are including a covariate, neuroticism.

Download a copy of the Excel file containing the data for this example here

To learn about how to obtain a copy of the macro and activate it, go here: 



You can download a copy of the RStudio file containing the Process macro and the syntax I wrote (and describe next). To activate the macro, find it under the first tab in the top box, right click and choose Select All, and then press the Run button. To run the code (which I describe below), first import the Excel data file. Make sure the data frame (after importing) is named MODERATED_MEDIATION_EXAMPLE_SUB. The first line of code saves the data frame under a shorter name, mmdata.

Model Specification (syntax)

The syntax below was used to run the basic process model. The text in green consists of comments describing various arguments and options in Process.


Here, we are specifying model #7, the use of 95% confidence intervals, drawing 1000 bootstrap resamples, setting the seed number to 12345, including a single covariate (“neuroticism”), centering our X (burnout) and W (extraversion) variables, requesting data for plotting, and requesting simple slopes always (even when the interaction term in our model is not significant).
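For readers who cannot see the screenshot, here is a hedged sketch of what a call with those options might look like. Treat it as an approximation rather than my exact syntax: the column names "anxiety" and "neuroticism" are guesses (check them with names(mmdata)), and I am assuming that intprobe = 1 is the setting that requests simple slopes regardless of the interaction's p-value. The call only runs if the macro has been activated and the data imported.

```r
# Sketch only: assumes process.R has been run (so process() exists) and that the
# imported data frame MODERATED_MEDIATION_EXAMPLE_SUB is in the workspace.
ready <- exists("process", mode = "function") &&
  exists("MODERATED_MEDIATION_EXAMPLE_SUB")

if (ready) {
  mmdata <- MODERATED_MEDIATION_EXAMPLE_SUB   # shorter name for the data frame

  process(
    data     = mmdata,
    y        = "anxiety",        # consequent (column name is an assumption)
    x        = "burn",           # focal antecedent: burnout
    m        = "poscop",         # mediator: positive coping
    w        = "extraversion",   # moderator of path a
    cov      = "neuroticism",    # covariate (column name is an assumption)
    model    = 7,                # first-stage moderated mediation template
    center   = 1,                # mean-center variables that form the product term
    conf     = 95,               # 95% confidence intervals
    boot     = 1000,             # number of bootstrap resamples
    seed     = 12345,            # seed so the bootstrap results can be reproduced
    plot     = 1,                # request data for plotting
    intprobe = 1                 # assumed: probe the interaction at any p-value
  )
} else {
  message("Run process.R and import the Excel file before calling process().")
}
```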

For a 'real-time' presentation of specifying the model in Process (and also generating output, including simple slopes), see video below.



Here is an additional text file with the same code: file

Overview of output



Evaluating overall fit of regression model (equation 1)

Equation 1 involves the regression of positive coping (M; mediator) onto burnout (X), extraversion (W), the burnout*extraversion interaction (Int_1), and the neuroticism covariate. As a set, the predictors account for significant variation in positive coping, R-square=.1271, F(4,1124)=40.925, p<.001. The R-square value indicates the set of predictors in the model jointly account for 12.71% of the variation in positive coping.


Interpreting regression coefficients (equation 1)

The regression coefficients for burnout and extraversion are simple slopes (as opposed to main effects like you would discuss in ANOVA). The slope (b=-.2149) for burnout is the effect of burnout for (i.e., conditional upon) individuals scoring 0 on extraversion. Since extraversion was mean-centered, this is the predicted effect of burnout on positive coping for individuals at the mean on extraversion. The regression slope (b=.1650) for extraversion is the simple slope for the effect of extraversion for individuals scoring at the mean on burnout (since burnout is also mean-centered).

Summary: We can say that burnout negatively and significantly (b=-.2149, s.e.=.0184, p<.001) predicted positive coping for individuals at the mean on extraversion. Extraversion positively and significantly (b=.1650, s.e.=.0712, p=.0207) predicted positive coping for individuals at the mean on burnout.
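If the role of centering here feels abstract, the following sketch with simulated data may help (the variables and effect sizes are made up for illustration). It shows that the coefficient for X in a model with an X*W product term is the slope for X when W equals 0, and that mean-centering W simply moves that zero point to the mean of W without changing the interaction coefficient.

```r
# Simulated data: names, means, and effect sizes are illustrative assumptions
set.seed(42)
n <- 300
x <- rnorm(n, mean = 3)
w <- rnorm(n, mean = 5)
m <- 1 + 0.5 * x + 0.2 * w - 0.3 * x * w + rnorm(n)

# Uncentered model: coefficient for x is the slope of x when w = 0
raw <- coef(lm(m ~ x * w))

# Mean-centered model: coefficient for x_c is the slope of x at the mean of w
x_c <- x - mean(x)
w_c <- w - mean(w)
cen <- coef(lm(m ~ x_c * w_c))

# Slope of x at the mean of w, implied by the uncentered coefficients
slope_x_at_mean_w <- raw[["x"]] + raw[["x:w"]] * mean(w)

round(c(raw_x           = raw[["x"]],
        centered_x      = cen[["x_c"]],
        implied_at_mean = slope_x_at_mean_w,
        raw_int         = raw[["x:w"]],
        centered_int    = cen[["x_c:w_c"]]), 4)
```

The centered_x and implied_at_mean entries match, and the two interaction coefficients are identical, which is why centering changes how the lower-order slopes are read but not the test of moderation.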


Interpreting regression coefficients cont'd (equation 1)


 Although not of primary interest, the neuroticism covariate emerged as a negative and significant (b=-.1646, s.e.=.0674, p=.0148) predictor of positive coping.

Interpreting regression coefficients cont'd (equation 1)

The interaction between burnout and extraversion was statistically significant (b=.2698, s.e.=.0625, p<.001), which is consistent with proposed moderation of the path from burnout to positive coping. The slope for the interaction term represents the predicted linear change in the slope for X (burnout) for every unit change on the moderator (extraversion). For example, since we know the effect of burnout on positive coping is -.2149 at the mean on extraversion, then we can say that an increment of one unit on extraversion will result in a .2698 increment in the slope for burnout. The slope for burnout becomes -.2149+.2698 = .0549. A decrease of one unit relative to the mean for extraversion would produce a slope for burnout of -.2149-.2698 = -.4847.

Since interaction effects are symmetric, we could have instead postulated burnout as a moderator of the effect of extraversion. Had that been the case, the slope for the interaction would have represented the linear change for the effect of extraversion on positive coping for every unit increment on burnout. The slope for extraversion at one unit below the mean on burnout would be: .1650 - .2698 = -.1048. At one unit above the mean on burnout, the slope for extraversion would be: .1650 + .2698 = .4348.


Interpreting regression coefficients cont'd (equation 1)

The last boxed portion (above) is the change in R-square as a result of including the interaction term into the model. That is, the difference in R-square between a model with and without the interaction term is .0145 and is statistically significant (p<.001). By adding the interaction term into the model there is a significant increment in the variation accounted for, where we account for an additional 1.45% of the variation over and above a model where the interaction term is not included.
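You can reproduce this kind of test outside of Process by comparing two nested regressions. The sketch below uses simulated data (names and effect sizes are illustrative assumptions) and base R's anova() to test the change in R-square when the product term is added.

```r
# Simulated data: names and effect sizes are illustrative assumptions
set.seed(7)
n <- 500
x <- rnorm(n)
w <- rnorm(n)
m <- -0.2 * x + 0.15 * w + 0.25 * x * w + rnorm(n)

fit_main <- lm(m ~ x + w)   # model without the interaction term
fit_int  <- lm(m ~ x * w)   # model with the interaction term

# Increment in R-square due to the interaction
r2_change <- summary(fit_int)$r.squared - summary(fit_main)$r.squared

# F test comparing the two nested models
f_test <- anova(fit_main, fit_int)
f_stat <- f_test[2, "F"]

# t statistic for the interaction coefficient in the full model
t_int <- summary(fit_int)$coefficients["x:w", "t value"]

round(c(r2_change = r2_change, F = f_stat, t_squared = t_int^2), 4)
```

Because only one term is added, the F statistic for the change in R-square equals the squared t statistic for the interaction coefficient, so the two tests always give the same p-value.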


Interpreting simple slopes and test results (which should be done if interaction is statistically significant in the model above)

This table contains simple slopes that represent the conditional effect of burnout (our focal antecedent, X) on positive coping (mediator) at three levels (the 16th, 50th, and 84th percentiles) of extraversion (the moderator, W). The values in the Conditional column reflect percentiles of the mean-centered moderator variable.

For a person scoring at the 16th percentile of (centered) extraversion (i.e., P16 = -.3914), the simple slope was negative and statistically significant (b=-.3205, s.e.=.0308, p<.001).

For a person scoring at the 50th percentile of (centered) extraversion (i.e., P50 = .0372), the simple slope was negative and statistically significant (b=-.2049, s.e.=.0815, p<.001).

For a person scoring at the 84th percentile of (centered) extraversion (i.e., P84 = .3229), the simple slope was negative and statistically significant (b=-.1278, s.e.=.0271, p<.001).

As you can see, the slope for the effect of burnout on positive coping appears to become less negative (i.e., more positive) as we move from lower to higher levels of extraversion.
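These simple slopes follow directly from the Equation 1 coefficients. As a quick check, the sketch below plugs the rounded values reported in this post into the expression a1 + a3*W (the object names are mine, chosen for illustration).

```r
a1 <- -0.2149   # slope for burnout at the mean of extraversion (reported above)
a3 <-  0.2698   # burnout x extraversion interaction coefficient (reported above)

# Percentiles of mean-centered extraversion from the simple slopes table
w_vals <- c(P16 = -0.3914, P50 = 0.0372, P84 = 0.3229)

# Conditional effect of burnout on positive coping at each level of extraversion
cond_slopes <- a1 + a3 * w_vals
round(cond_slopes, 4)
#     P16     P50     P84
# -0.3205 -0.2049 -0.1278
```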


Data for visualizing conditional direct effect in model


Setting plot=1 in our earlier syntax resulted in this information being generated. The levels for the burn and extraversion variables are the 16th, 50th, and 84th percentiles of those (centered) variables. The 'poscop' variable contains predicted values for positive coping at each combination of levels for burnout and extraversion. This information will be used later to plot simple slopes.

Evaluating overall fit of regression model (equation 2)


Equation 2 involves the regression of anxiety (Y; the consequent) onto burnout (X), positive coping (M; the mediator), and the neuroticism covariate. Extraversion (W) and the interaction term do not appear in this equation because only path a is moderated in Model 7. The model summary for this equation reports the R-square and the overall F test for this set of predictors of anxiety.


Interpreting regression coefficients cont'd (equation 2)

In this portion of the model, there is no proposed moderation. The path from positive coping to anxiety is the ‘b’ path in traditional mediation analysis.

The direct effect of burnout on anxiety was positive and significant (b=.2536, s.e.=.0154, p<.001). The direct effect of positive coping on anxiety was negative and significant (b=-.128, s.e.=.0235, p<.001). The direct effect of neuroticism (our covariate) was positive and significant (b=.2758, s.e.=.0492, p<.001).


Interpreting results for test of moderated mediation


The IMM is used to test whether the indirect effect of X (burnout) on Y (anxiety) is moderated by W (extraversion). You can think of this as an omnibus test of differences among the conditional indirect effects (reported in the same part of the output). If zero does not fall within the 95% bootstrap confidence interval, then we infer a non-zero IMM in the population. That is, we infer that the indirect effect of burnout on anxiety is conditional on extraversion in the population. If zero falls between the lower and upper bounds, then we cannot conclude that the IMM differs from zero in the population, and so we lack evidence of moderation of the indirect effect. In the current case, zero does not fall between the lower and upper bounds of our confidence interval (i.e., -.0649 and -.0108, respectively). We infer the indirect effect of burnout on anxiety is conditioned upon level of extraversion in the population.


Conditional indirect effects



These are the conditional indirect effects of X (burnout) on Y (anxiety) at the 16th, 50th, and 84th percentiles of the moderator (extraversion). Each is tested using the bootstrap confidence interval. If zero falls outside a confidence interval, then the conditional indirect effect is significant. If zero falls between the lower and upper bound, it is not judged as being significant.

For cases falling at the 16th percentile on our (centered) moderator, the indirect effect of burnout is .0410, which is statistically significant. For cases falling at the 50th percentile on (centered) extraversion, the indirect effect of burnout is .0262, which is statistically significant. For cases falling at the 84th percentile on (centered) extraversion, the indirect effect is .0164, which is significant. As you can see, the conditional indirect effect appears to decrease as we move from lower to higher values on the (centered) extraversion variable. Essentially, the positive indirect effect of burnout on anxiety (via positive coping) is weaker for persons higher, as opposed to lower, in extraversion.
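As a check on where these numbers come from, the sketch below rebuilds the conditional indirect effects from the rounded coefficients reported in this post, using (a1 + a3*W)*b. It also computes the point estimate of the IMM as a3*b. That value is implied by the rounded coefficients rather than copied from the output, but it does fall inside the bootstrap interval reported above (-.0649 to -.0108). The object names are mine.

```r
a1 <- -0.2149   # burnout -> positive coping at the mean of extraversion
a3 <-  0.2698   # burnout x extraversion interaction (Equation 1)
b  <- -0.128    # positive coping -> anxiety (Equation 2)

# Percentiles of mean-centered extraversion
w_vals <- c(P16 = -0.3914, P50 = 0.0372, P84 = 0.3229)

# Conditional indirect effects of burnout on anxiety via positive coping
cie <- (a1 + a3 * w_vals) * b
round(cie, 4)
#    P16    P50    P84
# 0.0410 0.0262 0.0164

# Index of moderated mediation: change in the indirect effect per unit of W
imm <- a3 * b
round(imm, 4)
# -0.0345
```

The negative IMM matches the pattern in the table: the indirect effect shrinks as extraversion increases.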


Final portion of output contains additional notes related to model


Plotting simple slopes

To plot the simple slopes for path a, you will need the data for plotting that is located in our output. 


We will use the package ggplot2 to plot the slopes. You will first need to use the library() function to activate ggplot2 (if you haven't installed it yet, you'll need to do that). I provide 3 steps in the syntax below. Step 1 involves creating a data frame with the variables (I've renamed them a bit so they look more attractive in the plot). Step 2 requires converting the moderator (W; Extrav_c) into a factor variable. Step 3 contains the code for generating the plot of simple slopes. Note that each of the three variables in the data frame is included in the code, with Extrav_c used in relation to the 'group' argument.
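In case the screenshot of the syntax is not visible, here is a rough sketch of the three steps. It is not my exact code: the names Burnout_c and Pos_coping are assumptions, and the predicted values are placeholders built from the reported slopes with an arbitrary intercept of 0 and made-up burnout levels. Replace them with the values from the plotting section of your own Process output.

```r
# Step 1: data frame of values to plot.
# PLACEHOLDER DATA: burnout levels are hypothetical and the intercept is set to 0,
# so only the slopes (not the heights) of these lines mean anything.
burn_levels   <- c(-1, 0, 1)                    # hypothetical centered burnout levels
extrav_levels <- c(-0.3914, 0.0372, 0.3229)     # percentiles of centered extraversion
plot_df <- expand.grid(Burnout_c = burn_levels, Extrav_c = extrav_levels)
plot_df$Pos_coping <- with(
  plot_df,
  -0.2149 * Burnout_c + 0.1650 * Extrav_c + 0.2698 * Burnout_c * Extrav_c
)

# Step 2: convert the moderator into a factor so each level gets its own line
plot_df$Extrav_c <- factor(plot_df$Extrav_c, labels = c("P16", "P50", "P84"))

# Step 3: plot the simple slopes (skipped if ggplot2 is not installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_df,
              aes(x = Burnout_c, y = Pos_coping,
                  group = Extrav_c, linetype = Extrav_c)) +
    geom_line() +
    geom_point() +
    labs(x = "Burnout (centered)",
         y = "Predicted positive coping",
         linetype = "Extraversion")
  print(p)
}
```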




Highlight and run the code above to get the following plot.



References

Hayes, A.F. (2023). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach (3rd edition). New York: The Guilford Press. 

Zhou, J., Yang, Y., Qiu, X., Yang, X., Pan, H., Ban, B., et al. (2016). Relationship between anxiety and burnout among Chinese physicians: A moderated mediation model. PLoS ONE, 11(8), e0157013. doi:10.1371/journal.pone.0157013. Downloaded from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0157013





