Our Experiment: Each eyetrackingR vignette uses the eyetrackingR package to analyze real data from a simple 2-alternative forced choice (2AFC) word recognition task administered to 19- and 24-month-olds. On each trial, infants were shown a picture of an animate object (e.g., a horse) and an inanimate object (e.g., a spoon). After inspecting the images, they disappeared and they heard a label referring to one of them (e.g., “The horse is nearby!”). Finally, the objects re-appeared on the screen and they were prompted to look at the target (e.g., “Look at the horse!”).
Overview of this vignette
In this vignette, we want to take an initial look at our data and perform the most fundamental eye-tracking analysis there is: a window analysis. This allows us to ascertain whether, in a certain window, infants looked more to the ‘animate’ when it was named than when the ‘inanimate’ was named.
Before performing this analysis, we’ll need to prepare and clean our dataset. Here we will to do this quickly and with few notes but, for more information, see the vignette on preparing your data.
set.seed(42) library("Matrix") library("lme4") library("ggplot2") library("eyetrackingR") data("word_recognition") data <- make_eyetrackingr_data(word_recognition, participant_column = "ParticipantName", trial_column = "Trial", time_column = "TimeFromTrialOnset", trackloss_column = "TrackLoss", aoi_columns = c('Animate','Inanimate'), treat_non_aoi_looks_as_missing = TRUE ) # subset to response window post word-onset response_window <- subset_by_window(data, window_start_time = 15500, window_end_time = 21000, rezero = FALSE)
## Avg. window length in new data will be 5500
# remove trials with > 25% of trackloss response_window_clean <- clean_by_trackloss(data = response_window, trial_prop_thresh = .25)
## Performing Trackloss Analysis... ## Will exclude trials whose trackloss proportion is greater than : 0.25 ## ...removed 33 trials.
# create Target condition column response_window_clean$Target <- as.factor( ifelse(test = grepl('(Spoon|Bottle)', response_window_clean$Trial), yes = 'Inanimate', no = 'Animate') )
Here, we’ll inspect the descriptive statistics for our data, and make some initial visualizations just to make sure everything is in order. We basically want a glance at each subject’s looking to the
Animate for each condition. To do so, we’ll:
- use the
describe_datafunction to generate a quick data summary (telling us the means and variance within each Target condition for each subject, and the number of trials contributed after cleaning above), and
- quickly visualize these differences by plotting a line for each subject. We expect they should all slope in the same direction, with lower mean looking to the Animate when the Inanimate was named than when the Animate was named.
(data_summary <- describe_data(response_window_clean, describe_column='Animate', group_columns=c('Target','ParticipantName')))
## Source: local data frame [54 x 9] ## Groups: Target [?] ## ## Target ParticipantName Mean SD Var Min Max ## (fctr) (fctr) (dbl) (dbl) (dbl) (dbl) (dbl) ## 1 Animate ANCAT18 0.3593750 0.4807572 0.23112745 0 1 ## 2 Animate ANCAT22 0.7907489 0.4069982 0.16564751 0 1 ## 3 Animate ANCAT23 0.8152260 0.3882681 0.15075211 0 1 ## 4 Animate ANCAT26 0.6991597 0.4590093 0.21068953 0 1 ## 5 Animate ANCAT39 0.8799058 0.3251993 0.10575459 0 1 ## 6 Animate ANCAT45 0.7153846 0.4515786 0.20392320 0 1 ## 7 Animate ANCAT50 0.8959350 0.3055935 0.09338736 0 1 ## 8 Animate ANCAT53 0.7265625 0.4460721 0.19898034 0 1 ## 9 Animate ANCAT55 0.9545455 0.2084066 0.04343330 0 1 ## 10 Animate ANCAT58 0.8554779 0.3517551 0.12373163 0 1 ## .. ... ... ... ... ... ... ... ## Variables not shown: N (int), NumTrials (int)
Performing a simple paired t-test
The first way to ask our question is to get a single mean proportion score (of looking to the Animate AOI) for each Target condition (Animate, Inanimate) for each subject and perform a paired t-test.
To do so, we will use the
make_time_window_data function to aggregate by participant and calculate this mean value. We also specify a couple additional predictors (Age, MCDI_Total [vocabulary score]) that we will use in a follow-up analysis.
Note that the dataframe returned by
make_time_window_data will actually give us a few different dependent variables to choose from:
Prop– The mean of raw proportion-looking
LogitAdjusted– The logit is defined as
log( Prop / (1 - Prop) ). This transformation attempts to map bounded
0,1data to the real number line. Unfortunately, for data that is exactly 0 or 1, this is undefined. One solution is add a very small value to any datapoints that equal 0, and subtract a small value to any datapoints that equal 1 (we use 1/2 the smallest nonzero value for this adjustment).
Elog– Another way of calculating a corrected logit transformation is to add a small value
epsilonto both the numerator and denominator of the logit equation (we use 0.5).
ArcSin– The arcsine-root transformation of the raw proportions, defined as
While the first DV is the most intuitive, you might consider using the others to account for the bounded nature of proportions. The boundedness of proportions is an issue because it can mean that CI’s around your estimates fall above 1 or below 0 (which is not possible) and that there is an inherent link between group means and variance (means closer to 0 and 1 will necessarily have lower variance than means around .5). These are violations of many parametric models, such as OLS regression and t-tests.
(EyetrackingR also lets you specify any arbitrary DV aside from the above. For example, you could analyze mean pupil-dilation by supplying a column name to the
other_dv_columns argument. This is true for
make_time_sequence_data as well.)
We will start out by using the ArcSin DV.
# aggregate by subject across the response window response_window_agg_by_sub <- make_time_window_data(response_window_clean, aois='Animate', predictor_columns=c('Target','Age','MCDI_Total'), summarize_by = "ParticipantName") # take a quick peek at data plot(response_window_agg_by_sub, predictor_columns="Target", dv = "ArcSin")
# show condition means describe_data(response_window_agg_by_sub, describe_column = "ArcSin", group_columns = "Target")
## Source: local data frame [2 x 7] ## ## Target Mean SD Var Min Max N ## (fctr) (dbl) (dbl) (dbl) (dbl) (dbl) (int) ## 1 Animate 1.0830125 0.2063904 0.04259698 0.6428499 1.570796 27 ## 2 Inanimate 0.6920932 0.2476029 0.06130718 0.0000000 1.135368 27
# simple paired t-test between conditions t.test(ArcSin ~ Target, data= response_window_agg_by_sub, paired=TRUE)
## ## Paired t-test ## ## data: ArcSin by Target ## t = 7.2705, df = 26, p-value = 1.012e-07 ## alternative hypothesis: true difference in means is not equal to 0 ## 95 percent confidence interval: ## 0.2803975 0.5014410 ## sample estimates: ## mean of the differences ## 0.3909192
That t-test looks pretty good. But we also want to know whether this effect of Target is mediated by age and vocabulary. To do so, we can fit a quick linear model:
# you should almost always sum-code and center your predictors when performing regression analyses response_window_agg_by_sub$AgeC <- response_window_agg_by_sub$Age - mean(response_window_agg_by_sub$Age) response_window_agg_by_sub$MCDI_TotalC <- response_window_agg_by_sub$MCDI_Total - mean(response_window_agg_by_sub$MCDI_Total) model <- lm(ArcSin ~ Target*AgeC*MCDI_TotalC, data=response_window_agg_by_sub) broom::tidy(model)
## term estimate std.error statistic ## 1 (Intercept) 1.0775721324 0.046238333 23.30473585 ## 2 TargetInanimate -0.4054231787 0.065390877 -6.19999601 ## 3 AgeC 0.0315232445 0.041326027 0.76279397 ## 4 MCDI_TotalC -0.0014166820 0.003332816 -0.42507058 ## 5 TargetInanimate:AgeC -0.0012671012 0.058443828 -0.02168067 ## 6 TargetInanimate:MCDI_TotalC -0.0040144872 0.004713314 -0.85173353 ## 7 AgeC:MCDI_TotalC 0.0009042498 0.002559896 0.35323691 ## 8 TargetInanimate:AgeC:MCDI_TotalC 0.0024107246 0.003620240 0.66590191 ## p.value ## 1 4.111432e-27 ## 2 1.450985e-07 ## 3 4.494804e-01 ## 4 6.727671e-01 ## 5 9.827965e-01 ## 6 3.987746e-01 ## 7 7.255245e-01 ## 8 5.087995e-01
This linear model shows no main effects or interactions of Target with those predictors, but confirms the large effect of Target.
Using mixed-effects models
A more rigorous approach is not to aggregate by participants but, instead, to aggregate by trials within participants and fit a linear mixed-effects model using lme4’s
lmer function. This predicts infants’ looking to the
Animate AOI based on the
Target condition of each trial while accounting for random intercepts and slope across
Trial (i.e., items) and
ParticipantName (i.e., subjects).
Here, for demonstration’s sake, we’ll predict the
Elog transformed DV.
Note that the
make_time_window_data call no longer summarizes by Participant. When the
summarize_by argument is not passed, this function will default to summarizing by trials within participants.
response_window_agg <- make_time_window_data(response_window_clean, aois='Animate', predictor_columns=c('Target','Age','MCDI_Total')) # sum-code and center predictors response_window_agg$TargetC <- ifelse(response_window_agg$Target == 'Animate', .5, -.5) response_window_agg$TargetC <- as.numeric(scale(response_window_agg$TargetC, center=TRUE, scale=FALSE)) # mixed-effects linear model on subject*trial data model_time_window <- lmer(Elog ~ TargetC + (1 + TargetC | Trial) + (1 | ParticipantName), data = response_window_agg, REML = FALSE) # cleanly show important parts of model (see `summary()` for more) (est <- broom::tidy(model_time_window, effects="fixed"))
## term estimate std.error statistic ## 1 (Intercept) 1.145093 0.2056680 5.567679 ## 2 TargetC 2.476711 0.3234191 7.657901
# use model comparison to attain p-values drop1(model_time_window,~.,test="Chi")
## Single term deletions ## ## Model: ## Elog ~ TargetC + (1 + TargetC | Trial) + (1 | ParticipantName) ## Df AIC LRT Pr(Chi) ## <none> 505.40 ## TargetC 1 512.35 8.9491 0.002776 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Those results look very much in line with the results of the simple t-test, though they are slightly more conservative.
Interpreting the estimates
The estimates from
lmer can be interpreted as follows:
- (Intercept) – what are the overall log-odds of looking at the Animate regardless of Target condition?
- TargetC – what is the difference in log-odds between our two conditions’ looking at the Animate?
Because we sum-coded our
TargetC predictor before fitting the model, we can calculate the looking to the Animate in the Animate target condition by taking the intercept estimate, +/- the slope estimate divided by two.
condition_estimate <- with(est, c(estimate[term=="(Intercept)"] + estimate[term=="TargetC"] / 2, estimate[term=="(Intercept)"] - estimate[term=="TargetC"] / 2))
And we can convert that back into a proportion value by reversing the transformation into log-odds:
##  0.9155565 0.4767013
You might notice that these are higher than we might have expected, based on the raw proportions (the actual proportion-looking for the animate condition was closer to .80). This could suggest our choice of transformation (empirical logit) was problematic. It’s important to choose a transformation that satisfies the assumptions of the model we are using. We can plot the fitted vs. residuals of our model using the
plot method for
The empirical-logit transformation appears to have created some problematic outliers, so we might want to consider another transformation, such as the adjusted logit.
model_time_window_logit <- lmer(LogitAdjusted ~ TargetC + (1 + TargetC | Trial) + (1 | ParticipantName), data = response_window_agg, REML = FALSE) plot(model_time_window_logit)
## Single term deletions ## ## Model: ## LogitAdjusted ~ TargetC + (1 + TargetC | Trial) + (1 | ParticipantName) ## Df AIC LRT Pr(Chi) ## <none> 382.11 ## TargetC 1 389.80 9.6917 0.001851 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
est_logit <- broom::tidy(model_time_window_logit, effects="fixed") condition_estimate_logit <- with(est_logit, c(estimate[term=="(Intercept)"] + estimate[term=="TargetC"] / 2, estimate[term=="(Intercept)"] - estimate[term=="TargetC"] / 2)) exp(condition_estimate_logit)/(1+exp(condition_estimate_logit))
##  0.8620134 0.4741567
See Wharton & Hui (2011) for more discussion of these issues.
Adding additional predictors
We can also throw in some additional predictors to the model, as we did above, so long as we specified these columns when we called
make_time_window_data. Here we will look at the effect of Age and MCDI_Total (after centering, of course):
response_window_agg$AgeC <- response_window_agg$Age - mean(response_window_agg$Age) response_window_agg$MCDI_TotalC <- response_window_agg$MCDI_Total - mean(response_window_agg$MCDI_Total) model_time_window_add_predictors <- lmer(Elog ~ TargetC*AgeC*MCDI_TotalC + (1 + TargetC | Trial) + (1 | ParticipantName), data = response_window_agg, REML = FALSE) # cleanly show important parts of model (see `summary()` for more) broom::tidy(model_time_window_add_predictors, effects="fixed")
## term estimate std.error statistic ## 1 (Intercept) 1.137212449 0.20098157 5.65829217 ## 2 TargetC 2.743117503 0.35312049 7.76821960 ## 3 AgeC 0.228878027 0.17309693 1.32225350 ## 4 MCDI_TotalC -0.033612643 0.01445055 -2.32604598 ## 5 TargetC:AgeC 0.080194937 0.30294334 0.26471926 ## 6 TargetC:MCDI_TotalC -0.001256492 0.02524465 -0.04977261 ## 7 AgeC:MCDI_TotalC 0.002632154 0.01056465 0.24914729 ## 8 TargetC:AgeC:MCDI_TotalC -0.029611893 0.01827399 -1.62043971
# use model comparison to attain p-values drop1(model_time_window_add_predictors,~.,test="Chi")
## Single term deletions ## ## Model: ## Elog ~ TargetC * AgeC * MCDI_TotalC + (1 + TargetC | Trial) + ## (1 | ParticipantName) ## Df AIC LRT Pr(Chi) ## <none> 509.26 ## TargetC 1 516.59 9.3312 0.002253 ** ## AgeC 1 508.95 1.6957 0.192848 ## MCDI_TotalC 1 512.14 4.8815 0.027146 * ## TargetC:AgeC 1 507.33 0.0695 0.792084 ## TargetC:MCDI_TotalC 1 507.26 0.0025 0.960331 ## AgeC:MCDI_TotalC 1 507.32 0.0620 0.803385 ## TargetC:AgeC:MCDI_TotalC 1 509.85 2.5966 0.107095 ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here we do see an effect of MCDI_Total (the participants’ productive vocabulary): participants with larger vocabularies looked less to the Animate overall (regardless of Target).
For more of a basic introduction to linear mixed-effects models, covering topics like p-values, confidence intervals, estimates, interpreting random effects, and logistic regression, see this tutorial.
For more information on the empirical-logit transformation, see Barr (2008).
For more information on the dangers of analyzing raw proportion means, see Dixon (2008) and Jaeger (2008).
A word of caution about window analyses
It is important to consider what we are losing in a window analysis. By collapsing across time within trials or subjects, we are completely removing the greatest asset of eye-tracking data: time. This can mask important properties of our data. Therefore, while window analyses are attractive in terms of parsimony and computational demands – and are often a necessary first step – other analyses may allow you to better understand and report your data.
Barr, D. J. (2008). Analyzing “visual world” eyetracking data using multilevel logistic regression. Journal of Memory and Language, 59, 457–474.
Dixon, P. (2008). Models of accuracy in repeated-measures designs. Journal of Memory and Language, 59(4), 447–456. http://doi.org/10.1016/j.jml.2007.11.004
Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(434-446). http://doi.org/10.1016/j.jml.2007.11.007
Warton, D. I., & Hui, F. K. (2011). The arcsine is asinine: the analysis of proportions in ecology. Ecology, 92(1), 3-10.