MANCOVA

Multiple numerical vs. categorical, adjusted for covariates

Dr. Peng Zhao (✉ peng.zhao@xjtlu.edu.cn)

Department of Health and Environmental Sciences
Xi’an Jiaotong-Liverpool University

1 Learning objectives

  1. Why we use MANCOVA.
  2. Explain the results of MANCOVA.

2 Principles

2.1 Definition

Multivariate Analysis of Covariance (MANCOVA):

= multivariate ANCOVA = MANOVA with covariate(s)

Dependent variables: multiple continuous variables

Independent variables: one or multiple categorical variables, one or multiple continuous variables (covariates)

Analysis of the differences among group means for a linear combination of the dependent variables after adjusting for the covariate

Tests whether the independent variable(s) have a significant influence on the dependent variables once the influence of the covariate is removed (the covariate should preferably be highly correlated with the dependent variables)

One-way MANCOVA

Two-way MANCOVA
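
The difference between the one-way and two-way designs can be sketched as two model formulas. This is a minimal illustration rather than part of the original workflow: it assumes the Wage data set from the ISLR package, and jobclass is chosen here only as a hypothetical second factor.

```r
library(ISLR)  # provides the Wage data set

# One-way MANCOVA: one categorical factor (education) plus one covariate (year).
# The covariate is entered first so that, with the sequential sums of squares
# used by manova(), education is tested after adjusting for year.
one_way <- manova(cbind(wage, age) ~ year + education, data = Wage)

# Two-way MANCOVA: two categorical factors and their interaction, plus the covariate.
# jobclass is an illustrative second factor, not one used elsewhere in these notes.
two_way <- manova(cbind(wage, age) ~ year + education * jobclass, data = Wage)

summary(one_way)  # Pillai's trace is the default multivariate test
summary(two_way)
```

Because manova() tests terms sequentially, the order of terms in the formula can matter when the design is unbalanced.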

2.2 Assumptions

  • Independent Random Sampling: Each observation is independent of all other observations.
  • Level and Measurement of the Variables: The independent variables are categorical and the dependent variables are continuous or scale variables. Covariates are continuous.
  • Homogeneity of Variance: The variances (and covariances) of the dependent variables are equal across groups.
  • Normality: Within each group, each dependent variable follows a normal distribution, and any linear combination of the dependent variables is normally distributed.
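
Some of these assumptions can be screened numerically. The sketch below is one possible approach, not a prescribed procedure: it assumes the Wage data set from ISLR, uses leveneTest() from the car package for homogeneity of variance and shapiro.test() for within-group normality, and only covers the univariate versions of the assumptions.

```r
library(ISLR)  # provides the Wage data set
library(car)   # provides leveneTest()

# Homogeneity of variance: Levene's test for each dependent variable
levene_wage <- leveneTest(wage ~ education, data = Wage)
levene_age  <- leveneTest(age ~ education, data = Wage)

# Normality within each group: Shapiro-Wilk p-value per education level
shapiro_wage <- tapply(Wage$wage, Wage$education,
                       function(x) shapiro.test(x)$p.value)
shapiro_age  <- tapply(Wage$age, Wage$education,
                       function(x) shapiro.test(x)$p.value)

levene_wage
levene_age
shapiro_wage
shapiro_age
```

With samples as large as this one, such tests tend to flag even small departures, so the plots are worth checking alongside the p values.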

3 Workflow

3.1 Question

Q: Do wage and age differ among individuals with different education levels after adjusting for year?

  • Dependent variables
    • wage (continuous)
    • age (continuous)
  • Independent variables
    • education (categorical)
    • year (continuous, covariate)

3.2 Visualization

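library(tidyverse)  # ggplot2 for plotting
library(ISLR)       # Wage data set
library(car)

# Wage against age with a linear fit, one panel per year and education level
ggplot(Wage, aes(age, wage)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = lm) +
  facet_grid(year ~ education)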

3.3 Models

Fit the model with manova() or jmv::mancova().

# MANCOVA with an education-by-year interaction to check for parallel slopes
wage_manova <- manova(cbind(wage, age) ~ education * year, data = Wage)
wage_manova
Call:
   manova(cbind(wage, age) ~ education * year, data = Wage)

Terms:
                education    year education:year Residuals
wage              1226364   16807           2404   3976510
age                  4608     494            724    393723
Deg. of Freedom         4       1              4      2990

Residual standard errors: 36.4683 11.47519
Estimated effects may be unbalanced
# Univariate ANOVA table for each dependent variable
summary.aov(wage_manova)
 Response wage :
                 Df  Sum Sq Mean Sq  F value    Pr(>F)    
education         4 1226364  306591 230.5306 < 2.2e-16 ***
year              1   16807   16807  12.6374  0.000384 ***
education:year    4    2404     601   0.4519  0.771086    
Residuals      2990 3976510    1330                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Response age :
                 Df Sum Sq Mean Sq F value    Pr(>F)    
education         4   4608 1151.90  8.7477 5.108e-07 ***
year              1    494  494.13  3.7525   0.05282 .  
education:year    4    724  180.90  1.3738   0.24043    
Residuals      2990 393723  131.68                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(jmv)
# Same dependent variables, factor and covariate, without the interaction term
wage_manova2 <- jmv::mancova(data = Wage,
                             deps = vars(wage, age),
                             factors = education,
                             covs = year)
wage_manova2

 MANCOVA

 Multivariate Tests                                                                            
 ───────────────────────────────────────────────────────────────────────────────────────────── 
                                      value          F             df1    df2     p            
 ───────────────────────────────────────────────────────────────────────────────────────────── 
   education    Pillai's Trace        0.240590546    102.353675      8    5988    < .0000001   
                Wilks' Lambda           0.7605539    109.739015      8    5986    < .0000001   
                Hotelling's Trace     0.313326457    117.184095      8    5984    < .0000001   
                Roy's Largest Root    0.308447997    230.873326      4    2994    < .0000001   
                                                                                               
   year         Pillai's Trace        0.004790217      7.203065      2    2993     0.0007573   
                Wilks' Lambda           0.9952098      7.203065      2    2993     0.0007573   
                Hotelling's Trace     0.004813274      7.203065      2    2993     0.0007573   
                Roy's Largest Root    0.004813274      7.203065      2    2993     0.0007573   
 ───────────────────────────────────────────────────────────────────────────────────────────── 


 Univariate Tests                                                                                         
 ──────────────────────────────────────────────────────────────────────────────────────────────────────── 
                Dependent Variable    Sum of Squares    df      Mean Square    F             p            
 ──────────────────────────────────────────────────────────────────────────────────────────────────────── 
   education    wage                    1226364.4849       4    306591.1212    230.699570    < .0000001   
                age                        4607.5852       4      1151.8963      8.743336     0.0000005   
   year         wage                      16806.9907       1     16806.9907     12.646699     0.0003821   
                age                         494.1336       1       494.1336      3.750664     0.0528805   
   Residuals    wage                    3978914.2941    2994      1328.9627                               
                age                      394446.4359    2994       131.7456                               
 ──────────────────────────────────────────────────────────────────────────────────────────────────────── 
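
The four multivariate statistics reported by jmv::mancova() can also be requested from a base manova() fit through the test argument of summary(). The sketch below assumes the Wage data set from ISLR and a model without the interaction term. The object names fit, tests and multivariate are introduced here for illustration, and the numbers need not match the jmv table exactly because manova() uses sequential sums of squares.

```r
library(ISLR)  # provides the Wage data set

# Additive model: education as factor, year as covariate
fit <- manova(cbind(wage, age) ~ education + year, data = Wage)

# Test statistics supported by summary() for manova objects
tests <- c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")

# One table of statistics per test, each with rows for education, year and Residuals
multivariate <- lapply(tests, function(tst) summary(fit, test = tst)$stats)
names(multivariate) <- tests

multivariate
```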

3.4 Results

  • The education:year interaction is not significant (p = 0.77 for wage and p = 0.24 for age), so there is no evidence that the slopes differ among education groups, and they can be treated as parallel.
  • Both wage and age differ significantly among education groups (both p values are far below 0.05).
  • Wage also changes significantly with year (p = 0.00038), presumably for economic reasons, whereas age does not change significantly with year (p = 0.053).
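
To see what "adjusted for the covariate" means for the group means, one option is to predict both dependent variables for each education level at the mean value of year. This is a minimal sketch under assumptions not stated in the original: it uses the Wage data set from ISLR, an additive multivariate lm() fit (parallel slopes), and the illustrative names fit, grid and adjusted_means.

```r
library(ISLR)  # provides the Wage data set

# Multivariate linear model with parallel slopes (no interaction term)
fit <- lm(cbind(wage, age) ~ education + year, data = Wage)

# One row per education level, with the covariate fixed at its overall mean
grid <- data.frame(
  education = factor(levels(Wage$education), levels = levels(Wage$education)),
  year = mean(Wage$year)
)

# Covariate-adjusted group means: one row per education level, one column per response
adjusted_means <- predict(fit, newdata = grid)
rownames(adjusted_means) <- levels(Wage$education)

adjusted_means
```

Comparing these rows shows how the groups differ once every group is evaluated at the same value of year.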

4 Further readings