library(tidyverse)
library(ISLR)
library(car)
# Wage vs. age with a linear fit, one panel per year (rows) and education level (columns)
ggplot(Wage, aes(age, wage)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = lm) +
  facet_grid(year ~ education)
MANCOVA
Multiple numerical vs. categorical, adjusted for covariates
Dr. Peng Zhao (✉ peng.zhao@xjtlu.edu.cn)
Department of Health and Environmental Sciences
Xi’an Jiaotong-Liverpool University
1 Learning objectives
- Why we use MANCOVA.
- Explain the results of MANCOVA.
2 Principles
2.1 Definition
- Multivariate Analysis of Covariance (MANCOVA) = multivariate ANCOVA = MANOVA with covariate(s)
- Dependent variables: multiple continuous variables
- Independent variables: one or more categorical variables (factors) plus one or more continuous variables (covariates)
- Analyses the differences among group means for a linear combination of the dependent variables after adjusting for the covariate(s)
- Tests whether the independent variable(s) have a significant influence on the dependent variables once the influence of the covariate (preferably one highly correlated with the dependent variables) has been removed
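To make the definition concrete, the case of one factor and one covariate can be written as a multivariate linear model. The notation below is an assumption introduced for illustration and does not come from the slides: $\mathbf{y}_{ij}$ is the vector of the $p$ dependent variables for observation $j$ in group $i$, $\boldsymbol{\alpha}_i$ is the effect of group $i$, $x_{ij}$ is the covariate, and $\mathbf{H}$ and $\mathbf{E}$ are the hypothesis and error sums-of-squares-and-cross-products matrices.

```latex
\mathbf{y}_{ij} = \boldsymbol{\mu} + \boldsymbol{\alpha}_i
  + \boldsymbol{\beta}\,(x_{ij} - \bar{x}) + \boldsymbol{\varepsilon}_{ij},
\qquad \boldsymbol{\varepsilon}_{ij} \sim N_p(\mathbf{0}, \boldsymbol{\Sigma})

H_0:\ \boldsymbol{\alpha}_1 = \boldsymbol{\alpha}_2 = \dots = \boldsymbol{\alpha}_g,
\qquad
\Lambda_{\text{Wilks}} = \frac{|\mathbf{E}|}{|\mathbf{H} + \mathbf{E}|}
```

A value of $\Lambda_{\text{Wilks}}$ close to 0 suggests that the groups differ after the covariate adjustment, while a value close to 1 suggests little group effect.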
2.2 Assumptions
- Independent random sampling: each observation is independent of all other observations.
- Level and measurement of the variables: the independent variables are categorical and the dependent variables are continuous or scale variables. Covariates are continuous.
- Homogeneity of variance: the variance of each dependent variable (and, in the multivariate case, the covariance matrix of the dependent variables) is equal across groups.
- Normality: within each group, each dependent variable follows a normal distribution, and any linear combination of the dependent variables is normally distributed.
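Some of these assumptions can be screened in R before fitting the model. The following is a minimal sketch, assuming the Wage data from the ISLR package with education as the grouping factor; the object names are illustrative, and univariate checks of this kind do not establish multivariate normality or equal covariance matrices.

```r
library(ISLR)  # provides the Wage data set
library(car)   # provides leveneTest()

# Normality: Shapiro-Wilk p-value of each dependent variable
# within each education group (rows = groups, columns = variables)
normality_p <- sapply(c("wage", "age"), function(dv) {
  tapply(Wage[[dv]], Wage$education, function(x) shapiro.test(x)$p.value)
})
normality_p

# Homogeneity of variance across education groups,
# checked one dependent variable at a time
levene_wage <- leveneTest(wage ~ education, data = Wage)
levene_age  <- leveneTest(age ~ education, data = Wage)
levene_wage
levene_age
```

Small p-values point to a violated assumption, although with about 3000 observations even minor departures tend to be flagged, so plots are a useful complement.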
3 Workflow
3.1 Question
Q: Do wage and age differ among education levels after adjusting for the year in which the wage was recorded?
- Dependent variables
  - wage (continuous)
  - age (continuous)
- Independent variables
  - education (categorical)
  - year (continuous, covariate)
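Before modelling, it may help to confirm that the four variables have the types listed above. This is a small sketch assuming the Wage data from the ISLR package; `vars_used` is just an illustrative name.

```r
library(ISLR)  # provides the Wage data set

# Keep only the variables used in the MANCOVA
vars_used <- Wage[, c("wage", "age", "education", "year")]

str(vars_used)              # storage type of each variable
table(vars_used$education)  # number of observations per education level
range(vars_used$year)       # span of the covariate
```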
3.2 Visualization
3.3 Models
Use manova() from base R (package stats), or jmv::mancova().
wage_manova <- manova(cbind(wage, age) ~ education * year, data = Wage)
wage_manova
Call:
manova(cbind(wage, age) ~ education * year, data = Wage)
Terms:
                education    year education:year Residuals
wage              1226364   16807           2404   3976510
age                  4608     494            724    393723
Deg. of Freedom         4       1              4      2990

Residual standard errors: 36.4683 11.47519
Estimated effects may be unbalanced
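Printing the fitted object only shows sums of squares. The multivariate test statistics come from summary(), whose `test` argument selects among "Pillai" (the default), "Wilks", "Hotelling-Lawley" and "Roy". A minimal sketch, refitting the same model so that the block runs on its own:

```r
library(ISLR)  # provides the Wage data set

wage_manova <- manova(cbind(wage, age) ~ education * year, data = Wage)

# Multivariate tests for each term, using two of the available statistics
pillai_tests <- summary(wage_manova, test = "Pillai")
wilks_tests  <- summary(wage_manova, test = "Wilks")

pillai_tests
wilks_tests
```

Pillai's trace is often regarded as the most robust of the four statistics, which is presumably why it is the default.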
summary.aov(wage_manova)
 Response wage :
                 Df  Sum Sq Mean Sq  F value    Pr(>F)
education         4 1226364  306591 230.5306 < 2.2e-16 ***
year              1   16807   16807  12.6374  0.000384 ***
education:year    4    2404     601   0.4519  0.771086
Residuals      2990 3976510    1330
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Response age :
                 Df Sum Sq Mean Sq F value    Pr(>F)
education         4   4608 1151.90  8.7477 5.108e-07 ***
year              1    494  494.13  3.7525   0.05282 .
education:year    4    724  180.90  1.3738   0.24043
Residuals      2990 393723  131.68
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(jmv)
wage_manova2 <- jmv::mancova(data = Wage,
                             deps = vars(wage, age),
                             factors = education,
                             covs = year)
wage_manova2
MANCOVA
Multivariate Tests
─────────────────────────────────────────────────────────────────────────────────────────────
                                       value          F             df1    df2     p
─────────────────────────────────────────────────────────────────────────────────────────────
  education    Pillai's Trace          0.240590546    102.353675      8    5988    < .0000001
               Wilks' Lambda           0.7605539      109.739015      8    5986    < .0000001
               Hotelling's Trace       0.313326457    117.184095      8    5984    < .0000001
               Roy's Largest Root      0.308447997    230.873326      4    2994    < .0000001
  year         Pillai's Trace          0.004790217      7.203065      2    2993     0.0007573
               Wilks' Lambda           0.9952098        7.203065      2    2993     0.0007573
               Hotelling's Trace       0.004813274      7.203065      2    2993     0.0007573
               Roy's Largest Root      0.004813274      7.203065      2    2993     0.0007573
─────────────────────────────────────────────────────────────────────────────────────────────
Univariate Tests
────────────────────────────────────────────────────────────────────────────────────────────────────────
               Dependent Variable    Sum of Squares      df    Mean Square    F             p
────────────────────────────────────────────────────────────────────────────────────────────────────────
  education    wage                    1226364.4849       4    306591.1212    230.699570    < .0000001
               age                        4607.5852       4      1151.8963      8.743336     0.0000005
  year         wage                      16806.9907       1     16806.9907     12.646699     0.0003821
               age                         494.1336       1       494.1336      3.750664     0.0528805
  Residuals    wage                    3978914.2941    2994      1328.9627
               age                      394446.4359    2994       131.7456
────────────────────────────────────────────────────────────────────────────────────────────────────────
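As a check on how the multivariate table fits together, the F statistic for year can be recovered from its Hotelling's trace. Because year has a single hypothesis degree of freedom, the four statistics are equivalent and the conversion F = trace × df2 / df1 is exact. The numbers below are copied from the table; the variable names are only illustrative.

```r
# Values copied from the "year" rows of the Multivariate Tests table
hotelling_trace <- 0.004813274
df1 <- 2      # number of dependent variables (wage, age)
df2 <- 2993   # error degrees of freedom of the multivariate test

# F statistic and p-value implied by Hotelling's trace
f_value <- hotelling_trace * df2 / df1
p_value <- pf(f_value, df1, df2, lower.tail = FALSE)

# With one hypothesis degree of freedom, Pillai's trace and Wilks' lambda
# are simple functions of the same single eigenvalue
pillai <- hotelling_trace / (1 + hotelling_trace)
wilks  <- 1 / (1 + hotelling_trace)

round(c(F = f_value, p = p_value, Pillai = pillai, Wilks = wilks), 7)
```

The results agree with the year rows of the table up to rounding, which also explains why all four tests report the same F and p for year but not for education (4 hypothesis degrees of freedom).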
3.4 Results
- The interaction effect is not significant (p = 0.77 for wage and p = 0.24 for age), so the slopes on year can be treated as parallel across education groups.
- Both wage and age differ significantly among education groups (both p-values are far below 0.05).
- Wage also changes significantly over time (covariate year, p = 0.00038), possibly for economic reasons, whereas age does not change significantly with year (p = 0.053).
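Since the interaction is not significant, one possible follow-up is to refit the model without it. manova() uses sequential sums of squares, so the order of terms matters: entering the covariate first means that education is tested after adjusting for year. The sketch below is one way to do this, not necessarily the analysis intended in the slides; `wage_mancova` is an illustrative name.

```r
library(ISLR)  # provides the Wage data set

# Additive model with the covariate entered before the factor
wage_mancova <- manova(cbind(wage, age) ~ year + education, data = Wage)

summary(wage_mancova)      # multivariate tests (Pillai's trace by default)
summary.aov(wage_mancova)  # univariate follow-up for each dependent variable
```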