Packages and Helper functions

Packages

rm(list=ls())
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(lmerTest))
suppressPackageStartupMessages(library(lme4))
suppressPackageStartupMessages(library(Matrix))
suppressPackageStartupMessages(library(gridExtra))
suppressPackageStartupMessages(library(plyr))
suppressPackageStartupMessages(library(knitr))
suppressPackageStartupMessages(library(irr))
suppressPackageStartupMessages(library(cowplot))
theme_set(theme_bw())
opts_chunk$set(fig.width=8, fig.height=5, echo=TRUE, warning=FALSE, message=FALSE, cache=TRUE)
suppressPackageStartupMessages(library("kableExtra"))

Helper Functions

logodds

This function takes a proportion and returns the log odds.

logodds <- function(p){log(p/(1-p))}

inverse_logodds

This function takes a log odds and returns the equivalent proportion.

inverse_logodds <- function(lo) {exp(lo)/(1+exp(lo))}

myCenter

This function outputs the centered values of a variable, which can be a numeric variable, a factor, or a data frame. It was taken from Florian Jaeger's blog - https://hlplab.wordpress.com/2009/04/27/centering-several-variables/. From his blog:

• If the input is a numeric variable, the output is the centered variable.
• If the input is a factor, the output is a numeric variable with centered factor level values. That is, the factor's levels are converted into numerical values in their inherent order (if not specified otherwise, R defaults to alphanumerical order). More specifically, this centers any binary factor so that the value below 0 will be the 1st level of the original factor, and the value above 0 will be the 2nd level.
• If the input is a data frame or matrix, the output is a new matrix of the same dimension with the centered values and column names that correspond to the colnames() of the input preceded by "c" (e.g. "Variable1" will become "cVariable1").

myCenter = function(x) {
  if (is.numeric(x)) { return(x - mean(x, na.rm = T)) }
  if (is.factor(x)) {
    x = as.numeric(x)
    return(x - mean(x, na.rm = T))
  }
  if (is.data.frame(x) || is.matrix(x)) {
    m = matrix(nrow = nrow(x), ncol = ncol(x))
    colnames(m) = paste("c", colnames(x), sep = "")
    for (i in 1:ncol(x)) {
      m[,i] = myCenter(x[,i])
    }
    return(as.data.frame(m))
  }
}

lizCenter

This function provides a wrapper around myCenter, allowing you to center a specific list of variables from a data frame.

• x: data frame
• listfname: a list of the variables to be centered (e.g. list("variable1", "variable2"))

The output is a copy of the data frame with a column (always a numeric variable) added for each of the centered variables. These columns are labelled with each column's previous name, but with ".ct" appended (e.g., "variable1" will become "variable1.ct").

lizCenter = function(x, listfname) {
  for (i in 1:length(listfname)) {
    fname = as.character(listfname[i])
    x[paste(fname, ".ct", sep = "")] = myCenter(x[fname])
  }
  return(x)
}

lizContrasts

This function can be used to create two centered dummy variables which stand in place of a three-way factor (condition). This allows us to inspect each contrast separately, as well as their interactions with other factors. Other fixed effects in the model can be evaluated as the average effects across all levels of the factor.

The function takes a data frame (d), a factor from that data frame (condition), which must have three levels, and the name of the level of the factor which is to be used as the baseline for the contrasts (baselevel). For example, if d is a data frame with a factor "condition" with three levels ("lex_skew", "lex_noskew", and "mixed"), then lizContrasts(d, d$condition, "lex_noskew") returns a data frame with two (numeric) columns added, labelled "lex_noskew_VERSUS_lex_mixed" and "lex_noskew_VERSUS_lex_skew".

Wherever you would normally use "condition" in a formula in an LME, it can be replaced by (lex_noskew_VERSUS_lex_mixed + lex_noskew_VERSUS_lex_skew), e.g. ~ (a * condition) becomes ~ (a * (lex_noskew_VERSUS_lex_mixed + lex_noskew_VERSUS_lex_skew)).

lizContrasts = function(d, condition, baselevel)
{

condition = factor(condition)
condition = relevel(condition, baselevel)
condition = relevel(condition, baselevel)
a = contrasts(condition) - apply(contrasts(condition), 2, mean)
d$dummy1[condition == rownames(a)[1]] <- a[1,1]
d$dummy1[condition == rownames(a)[2]] <- a[2,1]
d$dummy1[condition == rownames(a)[3]] <- a[3,1]
d$dummy2[condition == rownames(a)[1]] <- a[1,2]
d$dummy2[condition == rownames(a)[2]] <- a[2,2]
d$dummy2[condition == rownames(a)[3]] <- a[3,2]
name1 = paste(baselevel, rownames(a)[2], sep = "_VERSUS_")
name2 = paste(baselevel, rownames(a)[3], sep = "_VERSUS_")
d[name1] = d$dummy1
d[name2] = d$dummy2
d$dummy1 <- NULL
d$dummy2 <- NULL

return(d)
}
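As a quick standalone sanity check of the centering and contrast helpers, the following sketch applies them to a hypothetical balanced toy data set (the helper definitions are repeated here so the snippet runs on its own):

```r
# Helper definitions repeated so this snippet is self-contained
logodds <- function(p){ log(p/(1-p)) }
inverse_logodds <- function(lo){ exp(lo)/(1+exp(lo)) }

myCenter = function(x) {
  if (is.numeric(x)) { return(x - mean(x, na.rm = TRUE)) }
  if (is.factor(x)) {
    x = as.numeric(x)
    return(x - mean(x, na.rm = TRUE))
  }
  if (is.data.frame(x) || is.matrix(x)) {
    m = matrix(nrow = nrow(x), ncol = ncol(x))
    colnames(m) = paste("c", colnames(x), sep = "")
    for (i in 1:ncol(x)) { m[,i] = myCenter(x[,i]) }
    return(as.data.frame(m))
  }
}

lizCenter = function(x, listfname) {
  for (i in 1:length(listfname)) {
    fname = as.character(listfname[i])
    x[paste(fname, ".ct", sep = "")] = myCenter(x[fname])
  }
  return(x)
}

lizContrasts = function(d, condition, baselevel) {
  condition = factor(condition)
  condition = relevel(condition, baselevel)
  a = contrasts(condition) - apply(contrasts(condition), 2, mean)
  d$dummy1[condition == rownames(a)[1]] <- a[1,1]
  d$dummy1[condition == rownames(a)[2]] <- a[2,1]
  d$dummy1[condition == rownames(a)[3]] <- a[3,1]
  d$dummy2[condition == rownames(a)[1]] <- a[1,2]
  d$dummy2[condition == rownames(a)[2]] <- a[2,2]
  d$dummy2[condition == rownames(a)[3]] <- a[3,2]
  name1 = paste(baselevel, rownames(a)[2], sep = "_VERSUS_")
  name2 = paste(baselevel, rownames(a)[3], sep = "_VERSUS_")
  d[name1] = d$dummy1
  d[name2] = d$dummy2
  d$dummy1 <- NULL
  d$dummy2 <- NULL
  return(d)
}

# Hypothetical toy data: balanced three-level condition plus a numeric covariate
toy <- data.frame(condition = rep(c("lex_noskew", "lex_skew", "mixed"), each = 2),
                  rt = c(2, 4, 6, 8, 10, 12))

toy <- lizCenter(toy, list("rt"))
mean(toy$rt.ct)                       # centered covariate has mean 0

toy <- lizContrasts(toy, toy$condition, "lex_noskew")
colnames(toy)                         # gains lex_noskew_VERSUS_lex_skew and lex_noskew_VERSUS_mixed
sum(toy$lex_noskew_VERSUS_lex_skew)   # each centered contrast sums to 0 in balanced data

logodds(0.5)                          # 0: chance (50%) in logodds space
inverse_logodds(logodds(0.8))         # round-trips back to 0.8
```

Because the contrasts are centered, each dummy codes one pairwise comparison against the baseline while leaving the intercept interpretable as the grand mean.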

get_coeffs

This function allows us to inspect particular coefficients from the output of an LME model by putting them in a table.

• x: the output returned when running lmer or glmer (i.e. an object of type lmerMod or glmerMod)
• list: a list of the names of the coefficients to be extracted (e.g. c(“variable1”, “variable1:variable2”))
get_coeffs <- function(x, list){ (as.data.frame(summary(x)$coefficients)[list,]) }

Bf

This function is equivalent to the Dienes (2008) calculator which can be found here: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/Bayes.htm. The code was provided by Baguley and Kaye (2010) and can be found here: http://www.academia.edu/427288/Review_of_Understanding_psychology_as_a_science_An_introduction_to_scientific_and_statistical_inference

Bf <- function(sd, obtained, uniform, lower = 0, upper = 1, meanoftheory = 0, sdtheory = 1, tail = 1){
  area <- 0
  if(identical(uniform, 1)){
    theta <- lower
    range <- upper - lower
    incr <- range / 2000
    for (A in -1000:1000){
      theta <- theta + incr
      dist_theta <- 1 / range
      height <- dist_theta * dnorm(obtained, theta, sd)
      area <- area + height * incr
    }
  } else {
    theta <- meanoftheory - 5 * sdtheory
    incr <- sdtheory / 200
    for (A in -1000:1000){
      theta <- theta + incr
      dist_theta <- dnorm(theta, meanoftheory, sdtheory)
      if(identical(tail, 1)){
        if (theta <= 0){
          dist_theta <- 0
        } else {
          dist_theta <- dist_theta * 2
        }
      }
      height <- dist_theta * dnorm(obtained, theta, sd)
      area <- area + height * incr
    }
  }
  LikelihoodTheory <- area
  Likelihoodnull <- dnorm(obtained, 0, sd)
  BayesFactor <- LikelihoodTheory / Likelihoodnull
  ret <- list("LikelihoodTheory" = LikelihoodTheory, "Likelihoodnull" = Likelihoodnull, "BayesFactor" = BayesFactor)
  ret
}

Bf_table

This works by calling the Bf function above on a data frame which has the following columns: contrast, std.Error, estimate, h1_sd, h1_motivation (some text), tails. It computes the BF for each row using std.Error and estimate as the model of the data, and H1 as a half normal with mean = 0, sd = h1_sd and tails = tails. The function allows for positive and negative h1_sd values. Since the Bf function requires a positive value for sdtheory, if h1_sd is negative both sdtheory and obtained are multiplied by -1.
Bf_table <- function(df) {
  Bfs = vector('double')
  estimates = vector('double')
  sterrors = vector('double')
  sdtheorys = vector('double')
  contrasts = as.character(df$contrast)
  motivation = as.character(df$h1_motivation)
  df$tails = as.numeric(df$tails)
  for (i in 1:nrow(df)){
    sd_error = df$std.Error[i]
    obtained = df$estimate[i]
    stdtheory = df$h1_sd[i]
    tail = df$tails[i]
    if(df$h1_sd[i] < 0) {
      stdtheory = df$h1_sd[i] * -1
      obtained = df$estimate[i] * -1
    }
    Bfs[i] = Bf(sd_error, obtained, uniform = 0, meanoftheory = 0, sdtheory = stdtheory, tail = tail)$BayesFactor
    estimates[i] = obtained
    sdtheorys[i] = stdtheory
    sterrors[i] = sd_error
  }

  df2 = data.frame(cbind(contrasts, round(sterrors,3), round(estimates,3), round(sdtheorys,3), motivation, df$tails, round(Bfs,3)))
  colnames(df2) = c("contrast", "std.Error", "estimate", "sdtheory", "h1 motivation", "tail", "Bf")
  return(df2)
}

Bf_range

This works with the Bf function above. It requires the obtained mean and SE for the current sample and works out what the BF would be for a range of predicted means (which are set to be sdtheoryrange, with meanoftheory = 0).

Bf_range <- function(sd, obtained, meanoftheory = 0, sdtheoryrange, tail = 1) {
  x = c(0)
  y = c(0)
  for(sdi in sdtheoryrange) {
    B = as.numeric(Bf(sd, obtained, meanoftheory = 0, uniform = 0, sdtheory = sdi, tail)$BayesFactor)
    # The following line corrects for the fact that the calculator does not correctly
    # compute the BF when sdtheory == 0; this ensures that if sdtheory == 0, BF = 1.
    if (sdi == 0) { B = 1 }
    x = append(x, sdi)
    y = append(y, B)
    output = cbind(x, y)
  }
  output = output[-1,]
  colnames(output) = c("sdtheory", "BF")
  return(output)
}

addBf_powercalc

This function takes as its input the output of Bf_table above (Bf_df), the number of participants in the current sample (N) and the maximum number of participants we would consider testing (max). It then works out what the minimal N would be to get a substantial BF, using the principle that the standard error is proportional to the square root of old-sample-N / new-sample-N (see Dienes video: https://www.youtube.com/watch?v=10Lsm_o_GRg "how many participants might I need"). It returns the table with an additional column saying how many participants would be needed to get substantial evidence for the null or for H1, or that we still wouldn't have substantial evidence either way even with the maximum number of participants.
addBf_powercalc <- function(Bf_df, N, max) {
  Number = vector()
  for (b in 1:nrow(Bf_df)){
    for(newN in N:max){
      BF = as.numeric(Bf(sd = as.numeric(as.character(Bf_df$std.Error[b])) * sqrt(N/newN),
                         obtained = as.numeric(as.character(Bf_df$estimate[b])),
                         meanoftheory = 0, uniform = 0,
                         sdtheory = as.numeric(as.character(Bf_df$sdtheory[b])),
                         tail = as.numeric(as.character(Bf_df$tail[b])))$BayesFactor)
      if(BF >= 3) {
        Number[b] = paste("evidence for H1 with", newN, "participants")
        break
      } else if (BF < (1/3)){
        Number[b] = paste("evidence for H0 with", newN, "participants")
        break
      } else if (max == newN) {
        Number[b] = paste("evidence still ambiguous with", max, "participants")
      }
    }
  }
  df2 = cbind(Bf_df, Number)
  colnames(df2)[ncol(df2)] = "N needed"
  return(df2)
}

addBf_ranges

This function takes as its input a data frame which is the output of the Bf_table function. Each row provides the values which give the model of the data. It also requires a range of values to test as the sd of H1. The function adds an additional column to this table which writes out the ranges of values within sdtheoryrange which meet the criteria for: (i) strong evidence for the null (BF <= 1/10); (ii) substantial evidence for the null (1/10 < BF <= 1/3); (iii) ambiguous (1/3 < BF < 3); (iv) substantial evidence for H1 (3 <= BF < 10); (v) strong evidence for H1 (BF >= 10).

addBf_ranges <- function(Bf_df, sdtheoryrange) {
  BFranges = vector()
  for (b in 1:nrow(Bf_df)){
    range = Bf_range(sd = as.numeric(as.character(Bf_df$std.Error[b])),
                     obtained = as.numeric(as.character(Bf_df$estimate[b])),
                     meanoftheory = 0, sdtheoryrange = sdtheoryrange,
                     tail = as.numeric(as.character(Bf_df$tail[b])))

from_table = vector()
to_table = vector()
cat = vector()
category_table = vector()

for(i in 1:nrow(range)) {       # go through each value in the range and categorize it

#i=1

#categorize current BF
if (range[i,2] <= (1/10)) {
cat[i] = "strong_null"       ## IS below or equal to 1/10
} else if (range[i,2] <= (1/3)) {
cat[i] = "subst_null"        ## NOT below or equal to 1/10, IS below or equal to 1/3
} else if (range[i,2] < 3) { ## NOT below or equal to 1/3, IS below 3
cat[i] = "ambiguous"
} else if (range[i,2] >= 10) { ## NOT below 3, IS above or equal to 10
cat[i] = "strong_h1"
} else {                     ## NOT below 3, NOT above or equal to 10
cat[i] = "subst_h1"
}

j = length(category_table)

if (i==1){                      # first one
category_table[j+1] = cat[i]
from_table[j+1] = range[i,1]

} else if (cat[i] != cat[i-1]) { # NOT the first one, IS one where need to start new range
to_table[j] = range[i-1,1]
category_table[j+1] = cat[i]
from_table[j+1] = range[i,1]
}

if (i == nrow(range)){        # if it's the last one, finish off the table
to_table[j] = range[i,1]
}
}

# go through the little table and turn it into a string of ranges
string = ""
for(i in 1:length(category_table)){
string = paste(string, category_table[i], ": From" , round(from_table[i],4) , "To" , round(to_table[i],4) , ";")
}

BFranges[b] = string
}
out = cbind(Bf_df, BFranges)
return(out)
}
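Before applying these helpers to real model output, the Bf calculator can be sanity-checked in isolation (the function is reproduced from above so this snippet runs on its own): an estimate of exactly 0 should lean towards the null (BF below 1), while an estimate several standard errors from 0 should strongly favour H1.

```r
# Bf reproduced from the helper functions above so this snippet is self-contained
Bf <- function(sd, obtained, uniform, lower = 0, upper = 1, meanoftheory = 0, sdtheory = 1, tail = 1){
  area <- 0
  if(identical(uniform, 1)){
    theta <- lower
    range <- upper - lower
    incr <- range / 2000
    for (A in -1000:1000){
      theta <- theta + incr
      dist_theta <- 1 / range
      height <- dist_theta * dnorm(obtained, theta, sd)
      area <- area + height * incr
    }
  } else {
    theta <- meanoftheory - 5 * sdtheory
    incr <- sdtheory / 200
    for (A in -1000:1000){
      theta <- theta + incr
      dist_theta <- dnorm(theta, meanoftheory, sdtheory)
      if(identical(tail, 1)){
        if (theta <= 0){ dist_theta <- 0 } else { dist_theta <- dist_theta * 2 }
      }
      height <- dist_theta * dnorm(obtained, theta, sd)
      area <- area + height * incr
    }
  }
  LikelihoodTheory <- area
  Likelihoodnull <- dnorm(obtained, 0, sd)
  list(LikelihoodTheory = LikelihoodTheory,
       Likelihoodnull = Likelihoodnull,
       BayesFactor = LikelihoodTheory / Likelihoodnull)
}

# Data centred exactly on zero: BF should fall below 1 (leaning towards the null)
B_null <- Bf(sd = 1, obtained = 0, uniform = 0, sdtheory = 1, tail = 1)$BayesFactor

# Estimate 5 standard errors from zero: BF should be well above 3 (evidence for H1)
B_h1 <- Bf(sd = 0.1, obtained = 0.5, uniform = 0, sdtheory = 0.5, tail = 1)$BayesFactor
```

Note that with `tail = 1` the half-normal H1 only puts mass on effects in the predicted direction, which is why negative h1_sd values are sign-flipped in Bf_table.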

Load and set up data sets

#setwd("datafiles")
#getwd()
setwd("..")
#Set up aptitude measure- i.e. pcpt pre-test scores
pcpt$condition = factor(pcpt$condition, levels = c("0", "1", "2"), labels = c("LV", "HV", "HVB"))
pcpt$session = factor(pcpt$session, levels = c("1", "2"), labels = c("Pre-test", "Post-test"))
ID_pre = with(subset(pcpt, session == "Pre-test"), aggregate(accuracy ~ subject + session + condition, FUN = mean))
ID_pre$session = NULL
ID_pre$aptitude = ID_pre$accuracy * 10

# Set up data for PI
PI$condition = factor(PI$condition, levels = c("0", "1", "2"), labels = c("LV", "HV", "HVB"))
PI$voicetype = factor(PI$voicetype, levels = c("nv1", "tv1"), labels = c("New voice", "Trained voice"))
PI2 = merge(ID_pre, PI)

# Set up production data
PRO$session = factor(PRO$session, levels = c("pretest", "posttest", "nativespeaker", "picturenaming"), labels = c("Pre-test", "Post-test", "Native-Speaker", "Picture-Naming"))

# Get rid of participant 48's production data and other unidentifiable trials with quality issues.
PRO = PRO[!(PRO$subject == "48"),]
PRO = PRO[PRO$tone != 0, ]
PC = subset(PRO, session == "Native-Speaker")
CWR = subset(PRO, session %in% c("Pre-test", "Post-test"))
CPN = subset(PRO, session == "Picture-Naming")

## Set up data for Word Repetition
NWR = CWR
NWR$condition = factor(NWR$condition, levels = c("0", "1", "2"), labels = c("LV", "HV", "HVB"))
NWR$wordtype = factor(NWR$wordtype, levels = c("trained", "untrained"), labels = c("Trained", "Untrained"))
NWR2 = merge(NWR, ID_pre)

## Set up data for Picture Naming
NPN = CPN
NPN$condition = factor(NPN$condition, levels = c("0", "1", "2"), labels = c("LV", "HV", "HVB"))
NPN2 = merge(ID_pre, NPN)

# Set up data for 3IO
dis$condition = factor(dis$condition, levels = c("0", "1", "2"), labels = c("LV", "HV", "HVB"))
dis$session = factor(dis$session, levels = c("1", "2"), labels = c("Pre-test", "Post-test"))
dis$trialtype = dis$voicetype = factor(dis$voicetype, levels = c("fff", "ffm", "fmf"), labels = c("Neutral", "Easy", "Hard"))
dis$wordtype = factor(dis$wordtype, levels = c("newword", "oldword"), labels = c("Untrained Item", "Trained Item"))
dis2 = merge(ID_pre, dis)
# Set up data for Training
train$condition = factor(train$condition, levels = c("0","2","1"), labels = c("LV", "HVB", "HV"))
train2 = merge(ID_pre, train)

Evidence for/against the hypothesis of greater generalization to novel voices/in production after multiple-voice training

Picture Identification

Run glmer model with just the novel-voice data and factor condition2 (two levels: LV versus HV+HVB)

PI2$condition2 = PI2$condition
PI2$condition2[PI2$condition=="HVB"]= "HV"
PI2.NewVoice = subset (PI2, voicetype == "New voice")
PI2.NewVoice= lizCenter(PI2.NewVoice, list("voicetype", "aptitude", "condition2"))
p_glmer = glmer(score ~  condition2.ct
+ (1|subject)
, family = "binomial",
control = glmerControl(optimizer = "bobyqa"),
data = PI2.NewVoice)
kable(round(summary(p_glmer)$coeff,3))

              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)      1.710       0.111   15.364     0.000
condition2.ct    0.130       0.228    0.570     0.569

Model of the data is the estimate and std.error for condition2.ct.

Value to inform H1: Work out a plausible maximum - and use half this as the estimate - using the intercept (grand mean), where:

• "LV" is the logodds of the proportion correct in the LV condition
• "HV" is the logodds of the proportion correct in the HV condition

Assumptions for plausible maximum:
- minimal performance in LV, i.e. at chance (note chance here is the logodds of 50%, i.e. 0)
- above chance in HV, meaning that HV drives all of the effect

thus:
condition = (HV - chance) - (LV - chance) = HV - 0 - 0 = HV
intercept = (HV + LV)/2 = (HV + 0)/2 = HV/2
2 * intercept = HV
condition = 2 * intercept

Since this is an estimate of the maximum value of condition, we halve it to get our estimate to use as the sd of the half normal:
sdtheory = intercept

Values for the Bayes factor calculation (computed below):

contrast = "PI, NovelVoice: LV versus HV"
std.Error = round(summary(p_glmer)$coeff["condition2.ct", "Std. Error"],3)
estimate = round(summary(p_glmer)$coeff["condition2.ct", "Estimate"],3)
h1_sd = round(summary(p_glmer)$coeff["(Intercept)", "Estimate"],3)
tails = 1
PI_table = data.frame(cbind(contrast, std.Error, estimate, h1_sd,tails))
kable(PI_table)
contrast std.Error estimate h1_sd tails
PI, NovelVoice: LV versus HV 0.228 0.13 1.71 1
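To make the plausible-maximum reasoning concrete, here is a worked numeric sketch (standalone; the intercept value 1.71 is taken from the model output above, and inverse_logodds is repeated from the helper functions):

```r
inverse_logodds <- function(lo) { exp(lo)/(1 + exp(lo)) }  # repeated so this runs standalone

intercept <- 1.71                   # grand mean from the model above, in logodds
chance    <- 0                      # logodds(0.5): chance performance assumed for LV
HV_max    <- 2 * intercept - chance # plausible maximum for the condition effect (= HV when LV is at chance)
sdtheory  <- HV_max / 2             # halved, so sdtheory = intercept = 1.71, the h1_sd in PI_table

inverse_logodds(intercept)          # grand mean as a proportion correct, roughly 0.85
```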

Picture Naming: tone accuracy

Run glmer model with factor condition2 (two levels: LV versus HV+HVB)

NPN2$condition2 = NPN2$condition
NPN2$condition2[NPN2$condition=="HVB"]= "HV"
NPN2 = lizCenter(NPN2, list("condition2"))
NPN_glmer = glmer(tone_score ~ condition2.ct
+ (1|subject)
, family = "binomial",
control = glmerControl(optimizer = "bobyqa"),
data = NPN2)
kable(round(summary(NPN_glmer)$coeff,3))

              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)     -0.023       0.080   -0.284     0.776
condition2.ct   -0.225       0.168   -1.341     0.180

Model of the data is the estimate and std.error for condition2.ct.

Value to inform H1: Work out a plausible maximum - and use half this as the estimate - using the intercept (grand mean), where "LV" is the logodds of the proportion correct in the LV condition and "HV" is the logodds of the proportion correct in the HV condition.

Assumptions for plausible maximum:
- chance production of tones in LV, i.e. 1/4 (if they were at chance in producing each of four tones) in logodds space
- above-chance performance in HV, meaning that HV drives all of the difference between the intercept and chance

thus:
condition = (HV - chance) - (LV - chance) = HV - logodds(1/4) - 0 = HV - logodds(1/4)
intercept = (HV + LV)/2
2 * intercept = HV + LV = HV + logodds(1/4)
HV = 2 * intercept - logodds(1/4)

Substituting in for HV:
condition = (2 * intercept - logodds(1/4)) - logodds(1/4) = 2 * (intercept - logodds(1/4))

Since this is an estimate of the maximum value of condition, we halve it to get our estimate to use as the sd of the half normal:
sdtheory = intercept - logodds(1/4)

contrast = "PN: LV versus HV"
std.Error = round(summary(NPN_glmer)$coeff["condition2.ct", "Std. Error"],3)
estimate = round(summary(NPN_glmer)$coeff["condition2.ct", "Estimate"],3)
h1_sd = round(summary(NPN_glmer)$coeff["(Intercept)", "Estimate"] - logodds(1/4),3)
tails = 1
PN_table = data.frame(cbind(contrast, std.Error, estimate, h1_sd,tails))
kable(PN_table)
contrast std.Error estimate h1_sd tails
PN: LV versus HV 0.168 -0.225 1.076 1
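The h1_sd in this table follows directly from the intercept and the chance level in logodds space (a standalone arithmetic check; logodds is repeated from the helper functions and the intercept value is taken from the model output above):

```r
logodds <- function(p){ log(p/(1-p)) }  # repeated so this runs standalone

intercept <- -0.023                     # grand mean from the model above, in logodds
sdtheory  <- intercept - logodds(1/4)   # chance = correctly producing 1 of 4 tones

round(sdtheory, 3)                      # 1.076, matching h1_sd in PN_table
```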

Picture Naming: pinyin accuracy

Run glmer model with factor condition2 (two levels: LV versus HV+HVB)

NPN2$condition2 = NPN2$condition
NPN2$condition2[NPN2$condition=="HVB"]= "HV"
#table(NPN2$condition2, NPN2$condition)
NPN2 = lizCenter(NPN2, list("condition2"))
NPN_pinyin_glmer = glmer(pinyin_score ~ condition2.ct
+ (1|subject)
, family = "binomial",
control = glmerControl(optimizer = "bobyqa"),
data = NPN2)
kable(round(summary(NPN_pinyin_glmer)$coeff,3))

              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)     -0.212       0.093   -2.289     0.022
condition2.ct    0.104       0.196    0.531     0.595

Model of the data is the estimate and std.error for condition2.ct.

Value to inform H1: Work out a plausible maximum - and use half this as the estimate - using the intercept (grand mean), where "LV" is the logodds of the proportion correct in the LV condition and "HV" is the logodds of the proportion correct in the HV condition.

Assumptions for plausible maximum:
- minimal performance in LV - this is actually 0%, but we can't compute this in logodds space; we therefore set it to be equivalent to one correct response (i.e. 1/72) in logodds space
- above-minimal performance in HV, meaning that HV drives all of the difference between the intercept and the minimum

thus:
condition = (HV - minimum) - (LV - minimum) = HV - logodds(1/72) - 0 = HV - logodds(1/72)
intercept = (HV + LV)/2
2 * intercept = HV + LV = HV + logodds(1/72)
HV = 2 * intercept - logodds(1/72)

Substituting in for HV:
condition = (2 * intercept - logodds(1/72)) - logodds(1/72) = 2 * (intercept - logodds(1/72))

Since this is an estimate of the maximum value of condition, we halve it to get our estimate to use as the sd of the half normal:
sdtheory = intercept - logodds(1/72)

contrast = "PN, pinyin: LV versus HV"
std.Error = round(summary(NPN_pinyin_glmer)$coeff["condition2.ct", "Std. Error"],3)
estimate = round(summary(NPN_pinyin_glmer)$coeff["condition2.ct", "Estimate"],3)
h1_sd = round(summary(NPN_pinyin_glmer)$coeff["(Intercept)", "Estimate"] - logodds(1/72),3)
tails = 1
PN_pinyin_table = data.frame(cbind(contrast, std.Error, estimate, h1_sd,tails))
kable(PN_pinyin_table)
contrast std.Error estimate h1_sd tails
PN, pinyin: LV versus HV 0.196 0.104 4.05 1

Word Repetition: tone accuracy

Run glmer model with factor condition2 (two levels: LV versus HV+HVB)

NWR2$condition2 = NWR2$condition
NWR2$condition2[NWR2$condition == "HVB"]= "HV"
NWR2 = lizCenter(NWR2, list("wordtype","condition2"))
nw_glmer = glmer(tone_score ~ wordtype.ct * condition2.ct * session
+ (wordtype.ct * session||subject)
, family="binomial",
control = glmerControl(optimizer = "bobyqa"),
data = NWR2)
kable(round(summary(nw_glmer)$coeff,3))

                                            Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                                    0.958       0.073   13.207     0.000
wordtype.ct                                   -0.020       0.070   -0.288     0.773
condition2.ct                                  0.060       0.153    0.394     0.693
sessionPost-test                               0.395       0.075    5.291     0.000
wordtype.ct:condition2.ct                      0.081       0.148    0.546     0.585
wordtype.ct:sessionPost-test                   0.132       0.104    1.263     0.207
condition2.ct:sessionPost-test                -0.108       0.157   -0.689     0.491
wordtype.ct:condition2.ct:sessionPost-test    -0.073       0.220   -0.331     0.741

Model of the data is the estimate and std.error for condition2.ct:sessionPost-test (condition by session).

Value to inform H1: Work out a plausible maximum - and use half this as the estimate - using the main effect of session (overall difference between pre and post), where:

• "LV.POST" is the logodds of the proportion correct in the LV condition at post-test
• "LV.PRE" is the logodds of the proportion correct in the LV condition at pre-test
• "HV.POST" is the logodds of the proportion correct in the HV condition at post-test
• "HV.PRE" is the logodds of the proportion correct in the HV condition at pre-test

Assumptions for plausible maximum:
- improvement from pre to post in HV
- NO improvement from pre to post in LV, i.e. LV.POST = LV.PRE, meaning all of the effect of session comes from the HV condition

thus:
condition by session = (HV.POST - HV.PRE) - (LV.POST - LV.PRE) = (HV.POST - HV.PRE)
session = ((HV.POST - HV.PRE) + (LV.POST - LV.PRE))/2 = (HV.POST - HV.PRE)/2
2 * session = (HV.POST - HV.PRE)
condition by session = 2 * session

Since this is an estimate of the maximum value of condition by session, we halve it to get our estimate to use as the sd of the half normal:
sdtheory = session (which is sessionPost-test in the above model)

contrast = "Word Repetition, Tone score: LV versus HV by session"
std.Error = round(summary(nw_glmer)$coeff["condition2.ct:sessionPost-test", "Std. Error"],3)
estimate = round(summary(nw_glmer)$coeff["condition2.ct:sessionPost-test", "Estimate"],3)
h1_sd = round(summary(nw_glmer)$coeff["sessionPost-test", "Estimate"],3)
tails = 1
wordrep_table = data.frame(cbind(contrast, std.Error, estimate, h1_sd,tails))
kable(wordrep_table)
contrast std.Error estimate h1_sd tails
Word Repetition, Tone score: LV versus HV by session 0.157 -0.108 0.395 1

Word Repetition: pinyin accuracy

Run glmer model with factor condition2 (two levels: LV versus HV+HVB)

nw_glmer_pinyin = glmer(pinyin_score ~ wordtype.ct * condition2.ct * session
+ (wordtype.ct * session||subject)
, family="binomial",
control = glmerControl(optimizer = "bobyqa"),
data = NWR2)
Warning: unable to evaluate scaled gradient
Warning: Model failed to converge: degenerate Hessian with 1 negative eigenvalues
kable(round(summary(nw_glmer_pinyin)$coeff,3))

                                            Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                                    0.181       0.047    3.871     0.000
wordtype.ct                                    0.204       0.066    3.067     0.002
condition2.ct                                 -0.019       0.099   -0.190     0.850
sessionPost-test                               0.152       0.045    3.366     0.001
wordtype.ct:condition2.ct                      0.069       0.140    0.494     0.621
wordtype.ct:sessionPost-test                   0.055       0.090    0.614     0.539
condition2.ct:sessionPost-test                -0.034       0.095   -0.356     0.722
wordtype.ct:condition2.ct:sessionPost-test    -0.013       0.189   -0.070     0.944

Model of the data is the estimate and std.error for condition2.ct:sessionPost-test (condition by session).

Value to inform H1: identical reasoning to tone accuracy above - assuming improvement from pre to post in HV but NO improvement in LV (LV.POST = LV.PRE), the plausible maximum for the condition-by-session interaction is 2 * session, which we halve:
sdtheory = session (which is sessionPost-test in the above model)

contrast = "Word Repetition, Pinyin score: LV versus HV by session"
std.Error = round(summary(nw_glmer_pinyin)$coeff["condition2.ct:sessionPost-test", "Std. Error"],3)
estimate = round(summary(nw_glmer_pinyin)$coeff["condition2.ct:sessionPost-test", "Estimate"],3)
h1_sd = round(summary(nw_glmer_pinyin)$coeff["sessionPost-test", "Estimate"],3)
tails = 1
wordrep_table_pinyin = data.frame(cbind(contrast, std.Error, estimate, h1_sd,tails))
kable(wordrep_table_pinyin)
contrast std.Error estimate h1_sd tails
Word Repetition, Pinyin score: LV versus HV by session 0.095 -0.034 0.152 1

Three-Interval Oddity (3IO)

Run glmer model with factor condition2 (two levels: LV versus HV+HVB)

dis2$condition2 = dis2$condition
dis2$condition2[dis2$condition == "HVB"]= "HV"
dis2 = lizCenter(dis2, list( "wordtype", "aptitude","condition2"))
dis2 = lizContrasts(dis2, dis2$condition, "LV")
dis2 = lizContrasts(dis2, dis2$trialtype, "Neutral")
d_glmer = glmer(score ~ session * wordtype.ct *  condition2.ct
+ (Neutral_VERSUS_Easy + Neutral_VERSUS_Hard): session
+ (Neutral_VERSUS_Easy + Neutral_VERSUS_Hard)
+ (session * wordtype.ct||subject)
, family = "binomial",
control = glmerControl(optimizer = "bobyqa"),
data = dis2)
kable(round(summary(d_glmer)$coeff,3))

                                            Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                                    0.396       0.059    6.711     0.000
sessionPost-test                               0.310       0.048    6.523     0.000
wordtype.ct                                   -0.314       0.064   -4.951     0.000
condition2.ct                                  0.061       0.125    0.486     0.627
Neutral_VERSUS_Easy                            0.400       0.079    5.091     0.000
Neutral_VERSUS_Hard                           -0.138       0.077   -1.805     0.071
sessionPost-test:wordtype.ct                   0.138       0.091    1.511     0.131
sessionPost-test:condition2.ct                -0.001       0.100   -0.014     0.989
wordtype.ct:condition2.ct                     -0.020       0.134   -0.150     0.881
sessionPost-test:Neutral_VERSUS_Easy          -0.270       0.113   -2.395     0.017
sessionPost-test:Neutral_VERSUS_Hard           0.117       0.111    1.056     0.291
sessionPost-test:wordtype.ct:condition2.ct     0.010       0.193    0.050     0.960

Model of the data is the estimate and std.error for sessionPost-test:condition2.ct (condition by session).

Value to inform H1 (note: this is identical to Word Repetition above): assuming improvement from pre to post in HV but NO improvement in LV (LV.POST = LV.PRE), the plausible maximum for the condition-by-session interaction is 2 * session, which we halve:
sdtheory = session (which is sessionPost-test in the above model)

contrast = "3 Int. Odd, Tone score: LV by HV by session"
std.Error = round(summary(d_glmer)$coeff["sessionPost-test:condition2.ct", "Std. Error"],3)
estimate = round(summary(d_glmer)$coeff["sessionPost-test:condition2.ct", "Estimate"],3)
h1_sd = round(summary(d_glmer)$coeff["sessionPost-test", "Estimate"],3)
tails = 1
discrim_table = data.frame(cbind(contrast, std.Error, estimate, h1_sd,tails))
kable(discrim_table)
contrast std.Error estimate h1_sd tails
3 Int. Odd, Tone score: LV by HV by session 0.1 -0.001 0.31 1

Compute BFs and ranges

#df = rbind(wordrep_table,discrim_table, wordrep_ap_table, discrim_ap_table)
df= rbind(PI_table, PN_table, PN_pinyin_table, wordrep_table, wordrep_table_pinyin, discrim_table)
df$std.Error = as.numeric(as.character(df$std.Error))
df$estimate = as.numeric(as.character(df$estimate))
df$h1_sd = as.numeric(as.character(df$h1_sd))
df$tails = as.numeric(as.character(df$tails))
df$h1_motivation = ""
df2 = Bf_table(df)
#kable(df2)