Looking while listening
This eye-tracking task is a simplified version of the visual world paradigm. Every trial presents a pair of familiar images/objects of roughly the same size (for example, a chair and a bath), accompanied by a pre-recorded Dutch sentence that asks the participant to look at one of these images (e.g., 'where is a chair?'). This paradigm, known as "looking while listening", was developed by Anne Fernald (cf. Fernald, Zangl, Portillo & Marchman, 2008). Note that we collect the data using an eye tracker (Tobii TX300, sampling at 300 Hz), which measures gaze direction objectively, as opposed to video recordings of the participant's eye movements.
There are typically two key variables that you can obtain:
1) Reaction time: how quickly participants respond to the verbal instruction. This includes only trials in which participants are fixating the distracter image at target word onset; the dependent variable (DV) is the latency with which participants switch to the target image.
2) Accuracy: after target word onset, the proportion of looking time to the target (e.g., how long they fixate the target 'chair' relative to the distracter 'bath', or relative to total looking time).
Note that researchers who report only accuracy data tend to refer to this paradigm as the inter-modal preferential looking paradigm (cf. Golinkoff, Ma, Song & Hirsh-Pasek, 2013). More complex measures come from growth curve analyses (with the proportion of target fixation on the y-axis and time as a continuous variable on the x-axis).
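The two measures listed above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used for this task: it assumes that each gaze sample has already been coded as falling on the target, on the distracter, or elsewhere (`None`), that samples are ordered by time, and that the whole period after target word onset is analysed (the actual analysis window is not specified here). The names `Sample`, `accuracy` and `reaction_time` are hypothetical.

```python
from typing import List, Optional, Tuple

# (time in ms from trial onset, AOI label: "target", "distracter" or None)
Sample = Tuple[float, Optional[str]]

TARGET_ONSET_MS = 3000.0  # target word onset, taken from the trial structure


def accuracy(samples: List[Sample], onset_ms: float = TARGET_ONSET_MS) -> Optional[float]:
    """Proportion of post-onset looking at the target, relative to target + distracter."""
    post = [aoi for t, aoi in samples if t >= onset_ms]
    n_target = post.count("target")
    n_distracter = post.count("distracter")
    total = n_target + n_distracter
    return n_target / total if total else None  # None: no looking at either image


def reaction_time(samples: List[Sample], onset_ms: float = TARGET_ONSET_MS) -> Optional[float]:
    """Latency (ms) of the first look to the target, for distracter-initial trials only."""
    post = [(t, aoi) for t, aoi in samples if t >= onset_ms]
    # Only trials where the child fixates the distracter at word onset are included.
    if not post or post[0][1] != "distracter":
        return None
    for t, aoi in post:
        if aoi == "target":
            return t - onset_ms
    return None  # the child never switched to the target


# Toy trial (assumed data): distracter at onset, switch to target 30 ms later.
demo = [
    (2990.0, "distracter"), (3000.0, "distracter"), (3010.0, "distracter"),
    (3020.0, None), (3030.0, "target"), (3040.0, "target"), (3050.0, "target"),
]
print(accuracy(demo), reaction_time(demo))
```

In practice the reaction-time measure is usually restricted to a plausible latency window; that choice is left out here because the protocol does not state one.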
Participants:
Participants (age range: 2 years, 0 months to 4 years, 11 months, 30 days) came to the Child Research Center for half a day to participate in a battery of tasks. The described task is always the last in our set of four eye-tracking tasks (1. Social gaze; 2. Gap-overlap; 3. Face pop-out; 4. Looking while listening).
Stimuli:
Visual stimuli: Objects were typical, highly frequent items from categories considered to be familiar to most Dutch infants by the age of 15 months. There were 12 categories, a subset of the 20 categories used in Junge et al. (2012), who presented these categories as examples of familiar items for Dutch monolingual 9-month-olds. We then created six object pairs: each object always appeared with another object that was matched in semantic class (e.g., both food items) and whose label did not share speech sounds with its own label (e.g., the two labels did not both start with the sound /b/).
| Pair | Category 1 (W1) | Category 2 (W2) | Class | Syllables | Phonetic W1 | Phonetic W2 |
| 1 | cookie | banana | food | 2 | /ˈkuki/ | /bə ˈnɑ:n/ |
| 2 | chair | bath | furniture | 1 | /stul/ | /bɑt/ |
| 3 | cat (poes) | baby | animate | 1, 2 | /pus/ | /ˈbe: bi:/ |
| 4 | dog (hond) | cow (koe) | animate | 1 | /hɔnt/ | /ku/ |
| 5 | shoe (schoen) | coat (jas) | clothing | 1 | /sxun/ | /jɑs/ |
| 6 | foot (voet) | hand | body parts | 1 | /vut/ | /hɑnt/ |
Each category pair was presented four times: twice with W1 as the target, and twice with W1 as the distracter (and W2 as the target). To avoid too much repetition of visual stimuli (and to keep the experiment interesting for the child), we selected two images per category from the set used in Junge et al. (2012). Thus each picture was presented twice, always paired with the same picture from the other category: once as target and once as distracter.
We used Photoshop CS6 to make the two images appear roughly the same size. Each object had to fall within an area of interest (AOI) of 730 x 820 pixels (see left image below). Objects appeared on a dark grey background (see, for example, the right image below).
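As an illustration of how such AOIs are typically used, the sketch below assigns a gaze point to the left or right AOI. The AOI size comes from the text and the screen resolution from the set-up description, but the AOI centre coordinates are hypothetical: this document does not state where the AOIs were placed on screen.

```python
from typing import Optional

SCREEN_W, SCREEN_H = 1920, 1080  # monitor resolution given in the general set-up
AOI_W, AOI_H = 730, 820          # AOI size given in the text

# ASSUMPTION: AOIs centred in the left and right screen halves, vertically centred.
AOI_CENTRES = {"left": (480, 540), "right": (1440, 540)}


def aoi_at(x: float, y: float) -> Optional[str]:
    """Return 'left' or 'right' if the gaze point (in pixels) falls inside that AOI."""
    for name, (cx, cy) in AOI_CENTRES.items():
        if abs(x - cx) <= AOI_W / 2 and abs(y - cy) <= AOI_H / 2:
            return name
    return None  # gaze is outside both AOIs (or off screen)


print(aoi_at(500, 600), aoi_at(1500, 400), aoi_at(960, 540))
```

Combined with the trial list (which side holds the target), the left/right label can then be recoded as target or distracter.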
 
Auditory stimuli:
A female native speaker of Dutch (35 years old; no children) produced the stimuli in a child-friendly manner in a sound-proof booth; the recordings were digitized at 44.1 kHz, mono. For each category, we recorded multiple utterances of the type carrier sentence + X. Target words were thus recorded in natural contexts and not spliced across different utterances; the speaker read the stimuli in a randomized order. Again, to keep the child's interest at a maximum, we varied the carrier sentence. There were three possible carrier sentences ("zie je een X" 'do you see an X'; "Kijk! Een X" 'Look! An X'; "waar is een…" 'where is an X?'), counterbalanced across trials. Using Praat, we edited the sound files and set the mean intensity of all waveforms to 75 dB (the maximum measured value in the pilot was 70.1 dB). We then added silence before the carrier sentence so that each target word starts at 3 s.
 
Each target word appeared twice in the experiment, always with different carrier sentences. The mean length of carrier sentences including target words was 2262.9 ms (range 1788–3129 ms; SD 419). The mean length of target words was 894 ms (range 618 ms for 'koe_waarIsEen' to 1194 ms for 'banaan_KijkEen'; SD 122).
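The amount of leading silence needed to align each target word at 3 s follows from simple arithmetic. The sketch below is an assumption about how this could be computed, not a description of the Praat procedure that was actually used; the function name and the example onset value are hypothetical.

```python
TARGET_ONSET_MS = 3000   # desired target word onset within the audio file
SAMPLE_RATE_HZ = 44100   # recording sample rate given in the text


def leading_silence_samples(onset_in_recording_ms: int,
                            sample_rate_hz: int = SAMPLE_RATE_HZ,
                            desired_onset_ms: int = TARGET_ONSET_MS) -> int:
    """Number of silent samples to prepend so the target word starts at desired_onset_ms."""
    pad_ms = desired_onset_ms - onset_in_recording_ms
    if pad_ms < 0:
        # The carrier phrase is longer than the desired onset; padding cannot fix this.
        raise ValueError("target word already starts after the desired onset")
    return round(pad_ms * sample_rate_hz / 1000)


# Hypothetical example: the target word starts 1200 ms into the raw recording,
# so 1800 ms of silence is needed.
print(leading_silence_samples(1200))
```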
Trial Structure:
Before every trial, a fixation star of 55 x 55 pixels is shown in the centre of the screen. The fixation star is on screen for at least 1.5 seconds. If 0.5 seconds of gaze samples are available within the 5 x 5° bounding box around the fixation star, the trial commences. After 3 seconds the trial starts regardless of the available gaze data.
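The gaze-contingent start rule can be sketched as below. This is an illustration under stated assumptions, not the experiment code (which ran in MATLAB): it assumes one gaze sample every 1/300 s, and it assumes that the 0.5 s of gaze inside the box is counted cumulatively from fixation onset rather than as one uninterrupted run, which the description leaves open. The function name is hypothetical.

```python
from typing import Iterable


def fixation_end_time(in_box: Iterable[bool],
                      rate_hz: int = 300,
                      min_duration_s: float = 1.5,
                      required_gaze_s: float = 0.5,
                      timeout_s: float = 3.0) -> float:
    """Time (s from fixation onset) at which the trial would start.

    in_box: one boolean per gaze sample, True if the sample is valid and falls
    inside the bounding box around the fixation star.
    """
    required_samples = round(required_gaze_s * rate_hz)  # 150 samples at 300 Hz
    n_in_box = 0
    for i, hit in enumerate(in_box):
        t = (i + 1) / rate_hz  # time by which this sample has arrived
        if t >= timeout_s:
            break  # timeout: start regardless of gaze data
        n_in_box += bool(hit)
        if t >= min_duration_s and n_in_box >= required_samples:
            return t
    return timeout_s


# A child who looks at the star throughout starts after the 1.5 s minimum;
# a child who never looks starts at the 3 s timeout.
print(fixation_end_time([True] * 900), fixation_end_time([False] * 900))
```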
Trial: A picture with the paired images is presented for 5 s, together with the matched audio file. The audio file is edited such that the target word onset occurs at 3000 ms from audio/picture onset.
Design
The experiment consisted of 24 trials (12 target categories x 2 target positions, left/right). Each picture pair appeared twice, with each picture serving once as distracter and once as target. Trials were presented in a pseudo-random order that was fixed for every child. We counterbalanced target position across object pairs. There was no direct picture repetition or word repetition across consecutive trials. Targets appeared on the same side no more than twice in a row, and the same carrier sentence was likewise used no more than twice in a row.
| trial | carrier | target | pair (left-right) | picture_token | position_target (1 = left; 2 = right) |
| 1 | Where | banana | cookie-banana | 1 | 2 |
| 2 | See | chair | chair-bath | 1 | 1 | 
| 3 | Look | baby | baby-cat | 1 | 1 | 
| 4 | Where | dog | cow-dog | 1 | 2 | 
| 5 | Look | coat | shoe-coat | 1 | 2 | 
| 6 | See | foot | foot-hand | 1 | 1 | 
| 7 | Look | cookie | banana-cookie | 2 | 2 | 
| 8 | See | bath | bath-chair | 2 | 1 | 
| 9 | Where | cat | cat-baby | 2 | 1 | 
| 10 | Where | cow | dog-cow | 2 | 2 | 
| 11 | See | shoe | coat-shoe | 2 | 2 | 
| 12 | Look | hand | hand-foot | 2 | 1 | 
| 13 | See | cat | baby-cat | 1 | 2 | 
| 14 | Look | cow | cow-dog | 1 | 1 | 
| 15 | Where | chair | bath-chair | 2 | 2 | 
| 16 | Look | banana | banana-cookie | 2 | 1 | 
| 17 | See | hand | foot-hand | 1 | 2 | 
| 18 | Where | bath | chair-bath | 1 | 2 | 
| 19 | Look | shoe | shoe-coat | 1 | 1 | 
| 20 | See | cookie | cookie-banana | 1 | 1 |
| 21 | Where | foot | hand-foot | 2 | 2 | 
| 22 | Look | dog | dog-cow | 2 | 1 | 
| 23 | Where | baby | cat-baby | 2 | 2 | 
| 24 | See | coat | coat-shoe | 2 | 1 | 
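The ordering constraints described under Design can be checked mechanically against the trial list. The sketch below transcribes the table above and tests each constraint; the tuple layout and the names `TRIALS`, `max_run` and `check_design` are illustrative choices, not part of the original experiment code.

```python
from collections import defaultdict
from itertools import groupby

# (carrier, target, left picture, right picture, picture token, target position 1=left/2=right)
TRIALS = [
    ("Where", "banana", "cookie", "banana", 1, 2), ("See", "chair", "chair", "bath", 1, 1),
    ("Look", "baby", "baby", "cat", 1, 1), ("Where", "dog", "cow", "dog", 1, 2),
    ("Look", "coat", "shoe", "coat", 1, 2), ("See", "foot", "foot", "hand", 1, 1),
    ("Look", "cookie", "banana", "cookie", 2, 2), ("See", "bath", "bath", "chair", 2, 1),
    ("Where", "cat", "cat", "baby", 2, 1), ("Where", "cow", "dog", "cow", 2, 2),
    ("See", "shoe", "coat", "shoe", 2, 2), ("Look", "hand", "hand", "foot", 2, 1),
    ("See", "cat", "baby", "cat", 1, 2), ("Look", "cow", "cow", "dog", 1, 1),
    ("Where", "chair", "bath", "chair", 2, 2), ("Look", "banana", "banana", "cookie", 2, 1),
    ("See", "hand", "foot", "hand", 1, 2), ("Where", "bath", "chair", "bath", 1, 2),
    ("Look", "shoe", "shoe", "coat", 1, 1), ("See", "cookie", "cookie", "banana", 1, 1),
    ("Where", "foot", "hand", "foot", 2, 2), ("Look", "dog", "dog", "cow", 2, 1),
    ("Where", "baby", "cat", "baby", 2, 2), ("See", "coat", "coat", "shoe", 2, 1),
]


def max_run(values):
    """Length of the longest run of identical consecutive values."""
    return max(len(list(group)) for _, group in groupby(values))


def check_design(trials):
    """Return a dict mapping each design constraint to True/False."""
    carriers_by_target = defaultdict(list)
    sides_by_target = defaultdict(list)
    targets_by_picture_pair = defaultdict(list)
    for carrier, target, left, right, token, pos in trials:
        carriers_by_target[target].append(carrier)
        sides_by_target[target].append(pos)
        targets_by_picture_pair[(frozenset((left, right)), token)].append(target)
    pairs = [frozenset((left, right)) for _, _, left, right, _, _ in trials]
    return {
        "target is on the coded side": all(
            (left, right)[pos - 1] == target for _, target, left, right, _, pos in trials),
        "same side at most twice in a row": max_run([t[5] for t in trials]) <= 2,
        "same carrier at most twice in a row": max_run([t[0] for t in trials]) <= 2,
        "no object pair on consecutive trials": max_run(pairs) == 1,
        "each target twice, different carriers": all(
            len(c) == 2 and c[0] != c[1] for c in carriers_by_target.values()),
        "each target once left, once right": all(
            sorted(s) == [1, 2] for s in sides_by_target.values()),
        "each picture pair twice, each picture once as target": all(
            len(t) == 2 and t[0] != t[1] for t in targets_by_picture_pair.values()),
    }


for constraint, ok in check_design(TRIALS).items():
    print(f"{constraint}: {ok}")
```

A check of this kind is useful whenever the fixed order is edited by hand, since a single swapped row can silently break the counterbalancing.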
A video attention getter can be played whenever a child's attention starts waning. The current trial is skipped from the moment the key for the attention getter is pressed.
General set-up
Infants sit in a car seat (10-month-olds; R3) approximately 65 cm away from the eye tracker. Testing occurs in a small, bright room without windows (300-350 lux; temperature 18-25 °C).
The Tobii TX300 eye tracker (Tobii Technology, Stockholm, Sweden) with an integrated 23-inch monitor (1920 by 1080 pixels; 60 Hz refresh rate) was used to record eye movements. The Tobii TX300 ran at 300 Hz and communicated with MATLAB (version R2015b, MathWorks Inc., Natick, MA, USA) and the Psychtoolbox (version 3.0.12; Brainard, 1997) running on a MacBook Pro (OS X 10.9) via the Tobii SDK.
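Given this monitor and the viewing distance of approximately 65 cm, the 5 x 5° bounding box around the fixation star can be converted to pixels. The sketch below is a rough estimate only: it assumes square pixels, that the 23-inch diagonal is the active display area, and that the child sits exactly 65 cm from the centre of the screen. The function name is hypothetical.

```python
import math


def degrees_to_pixels(angle_deg: float,
                      distance_cm: float = 65.0,
                      diagonal_in: float = 23.0,
                      res_px=(1920, 1080)) -> float:
    """Approximate on-screen size (pixels) of a centred stimulus spanning angle_deg."""
    width_px, height_px = res_px
    diagonal_px = math.hypot(width_px, height_px)
    px_per_cm = diagonal_px / (diagonal_in * 2.54)  # assumes square pixels
    size_cm = 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)
    return size_cm * px_per_cm


box_px = degrees_to_pixels(5.0)
print(f"5 degrees is roughly {box_px:.0f} pixels")
```

Under these assumptions the box is roughly 214 pixels wide, about four times the size of the 55 x 55 pixel fixation star.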
An operator-controlled calibration was run, consisting of colored expanding and contracting spirals presented at the four corners and the center of the screen. The spirals were accompanied by a sound. A webcam was used to monitor the participant. When the operator judged the participant to be looking at the spiral, a button was pressed, after which the spiral contracted and that point was calibrated. Details of the calibration stimuli are given in Hessels et al. (2015). The operator judged the calibration output from the Tobii SDK, after which a decision was made to accept the calibration or to re-calibrate.
Once the child was calibrated, the experimenter closed the curtain that divided the room into two halves and sat in the other half of the room, behind a desk with the stimulus Mac laptop. The experimenter could also see the child via a closed-circuit camera.
After the calibration, the experiment began.
Reference:
Hessels, R. S., Andersson, R., Hooge, I. T. C., Nyström, M., & Kemner, C. (2015). Consequences of eye color, positioning, and head movement for eye-tracking data quality in infant research. Infancy, 20, 601–633.
