Speech-evoked cortical auditory responses in children with normal hearing

A Almeqbel, PhD, MSc, BSc Audiology

Hearing and Speech Sciences Department, Health Science Center, Kuwait University, Kuwait

Corresponding author: A Almeqbel ([email protected])

Objective. Cortical auditory-evoked potentials (CAEPs), an objective measure of human speech encoding in individuals with normal or impaired auditory systems, can be used to assess the outcomes of hearing aids and cochlear implants in infants, or in young children who cannot co-operate for behavioural speech discrimination testing. The current study aimed to determine whether naturally produced speech stimuli /m/, /g/ and /t/ evoke distinct CAEP response patterns that can be reliably recorded and differentiated, based on their spectral information and whether the CAEP could be an electrophysiological measure to differentiate between these speech sounds.

Method. CAEPs were recorded from 18 school-aged children with normal hearing, tested in two groups: younger (5 - 7 years) and older children (8 - 12 years).

Results. Cortical responses differed in their P1 and N2 latencies and amplitudes in response to /m/, /g/ and /t/ sounds (from low-, mid- and high-frequency regions, respectively). The largest amplitude of the P1 and N2 components was for /g/ and the smallest was for /t/. The P1 latency did not differ significantly between these speech sounds in either age group. The N2 latency differed significantly across the speech sounds in the younger group but not in the older group. The N2 latency for /g/ was the earliest in both groups.

Conclusion. This study demonstrates that spectrally different speech sounds evoke distinct CAEP response patterns. CAEP latencies and amplitudes may therefore provide an objective indication that such sounds are encoded differently at the cortical level.

Keywords: cortical auditory-evoked potentials (CAEPs), electroencephalogram (EEG), P1 and N2 components, children with normal hearing

S Afr J CD 2013;60:38-43. DOI:10.7196/SAJCD.132


Cortical auditory-evoked potentials (CAEPs) are electroencephalographic (EEG) responses that are well suited to studying the neural representation of time-varying signals, including acoustic cues important for speech perception (Taylor & Baldeweg, 2002). The CAEP technique, developed by Davis (1939), is a non-invasive tool that can provide objective information on the functioning of the central auditory nervous system.

One component of the CAEP is the P1-N1-P2 complex, made up of three peaks, the latencies and amplitudes of which are determined partly by the characteristics of the stimulus used to evoke them. Traditionally, the P1-N1-P2 complex has been described in terms of amplitude and latency. Amplitude refers to the strength of the responses (measured in microvolts) and latency refers to the time after stimulus onset (measured in milliseconds). The number of recruited neurons, extent of neuronal activation and synchrony of the neural responses all contribute to the resulting morphology of the CAEPs.

The CAEPs are believed to reflect the activity of excitatory postsynaptic potentials at the level of the thalamus and higher auditory cortex (primary auditory cortex and association areas) (Cone-Wesson & Wunderlich, 2003; Ponton & Don, 2003; Purdy, Kelly & Thorne, 2001; Sharma & Dorman, 2006; Tremblay, Piskosz & Souza, 2003; Wunderlich & Cone-Wesson, 2006). Each peak of the CAEP waveform appears to originate from multiple neural generators (Ponton & Don, 2003; Sharma & Dorman, 2006). Research has shown that P1 is a positive peak occurring at around 50 ms post stimulus onset, N1 is a negative peak occurring at 80 - 100 ms, and P2, the next positive peak, follows at around 175 - 200 ms (Näätänen & Picton, 1987). Individual components are labelled and can be seen as distinct peaks on the waveform. The complex is also called the vertex potential, because it was originally best recorded at the vertex (Davis, Mast, Youshie & Zerlin, 1966).

Early studies focused on how the P1-N1-P2 complex is evoked by acoustic changes in the auditory environment. For instance, some of the earliest findings were systematic changes in the P1-N1-P2 complex as stimulus intensity was varied (Davis et al., 1966). Since that time, the P1-N1-P2 complex has been used to study neural detection of consonant-vowel transition, periodicity changes, amplitude envelope variation and spectral contrast change (Ostroff, Martin & Boothroyd, 1998). Because these evoked potentials are sensitive to the acoustic parameters of speech, the response is frequently used to study the neural detection of spectral and temporal cues in people with impaired speech understanding.

In young adults with normal hearing, the CAEPs contain two main vertex positive peaks, commonly termed P1 and P2, and one main vertex negative peak, N1 (Purdy et al., 2001). One other negative peak, N2, frequently follows P2. The response represents combined activity from a number of generators in the higher auditory nervous system. The P1-N1-P2 complex has been shown to be sensitive to the acoustic parameters of the speech signal, and as a result it is frequently used to study the neural detection of spectral or temporal cues in populations with impaired speech understanding.

In children, the N1 component of the CAEPs is not consistently present until around 9 years of age, and the maturation of CAEPs is very complex. The maturation and developmental course of the CAEPs are important, because changes in children’s CAEPs continue well into adolescence. The maturation of the thalamocortical portions of the central auditory system can be assessed by recording age-related changes in the neurophysiological responses evoked by auditory stimulation. In contrast to adult CAEP waveforms, which are dominated by N1, children’s waveforms are typically dominated by P1 (Sharma & Dorman, 2006). This positive peak typically occurs at a longer latency of 100 - 300 ms post stimulus onset (Kraus & McGee, 1994; Sharma & Dorman, 2006). Sharma, Kraus, McGee and Nicol (1997) demonstrated that the development of the P1 component can vary between 14 and 26 years of age, and beyond. The age at which the N1 and P2 components first appear in young children is debated. Some authors have noted that they appear at around 3 - 8 years, while others have suggested that they are absent in all young children (Ceponiene, Rinne & Näätänen, 2002; Pang & Taylor, 2000; Sharma et al., 1997) and that the N1 peak is not consistently present until age 9 years (Ponton, Don, Eggermont, Waring & Masuda, 1996). Therefore, N1 cannot be used as a reliable measure in young children, which reflects the ongoing maturation of the neural processes generating the N1 response (Wunderlich, Cone-Wesson & Shepherd, 2006).

Many studies have demonstrated that CAEP component peak latencies are shorter in adults than in young infants, children or adolescents. Sharma et al. (1997) compared the maturation of CAEPs in 86 normal-hearing children aged 6 - 15 years with 10 normal-hearing adults. They identified clear age-related changes in the P1 and N1 components of CAEPs. The maturation changes included a decrease in the latencies of both P1 and N1 and a decrease in the amplitude of P1 with increasing age.

Sharma, Dorman and Spahr (2002a) then studied the maturational changes in the latency of the P1 component in 136 normal-hearing children, aged from 1 month to 20 years. The authors found that the latency of P1 decreases with increasing age, and that this continues until around 20 years of age. The changes in P1 latency occur at a more gradual rate in the second decade of life than the rapid decrease seen earlier (Sharma et al., 2002a). The maturational changes reported in the components of CAEP waveforms appear to be consistent with the general development of the central nervous system. Specifically, a general decrease was observed in all CAEP components, which appears to agree with the positive changes in myelination and synaptogenesis that continue during the first 20 years of life (Sharma et al., 1997).

Cunningham, Nicol, Zecker and Kraus (2000) recorded CAEPs to synthesised /ga/ in 150 participants aged 5 - 78 years. They showed that the P1-N1-P2 complex changes throughout the period from childhood to old age, and that the P1 latency did not change significantly during the primary school years (5 - 7, 8 - 10 and 11 - 12 years). Additionally, P1 latency was similar for those aged 19 - 27 and 55 - 78 years. P1 amplitude was not significantly different between those aged 5 - 7 and 8 - 10 years, but it was considerably larger for these younger children than for older children and adults (11 - 12, 13 - 15, 19 - 27 and 55 - 78 years). Moreover, the authors found that N1 presence was variable in children but was more consistent in both younger and older adults. On the other hand, the N2 component was seen more consistently in the younger children than in the adult participants.

Similarly, Ceponiene et al. (2002) reported that children’s CAEPs are dominated by P1 and N2 peaks. They compared CAEPs in eight 4-year-olds, eight 9-year-olds and 17 young adults (mean age 26 years). They used complex tones of 100 ms duration, presented bilaterally through loudspeakers to evoke the CAEPs at a 700 ms inter-stimulus interval (ISI). Their findings demonstrated that the 4-year-olds and 9-year-olds showed P1-N2 responses, whereas adults showed P1-N1-P2-N2 responses.

Wunderlich et al. (2006) studied the maturation of CAEPs in three paediatric groups: (i) newborns with normal hearing (less than 7 days old); (ii) toddlers (13 - 41 months old); and (iii) young children (4 - 6 years old). They compared these with the CAEPs of nine adult participants. The researchers showed that the peak latencies of all CAEP components were considerably shorter in adult participants than in children. This trend is typical of most auditory-evoked potentials, which decrease in latency with maturational changes in myelination and synaptic efficiency. The above studies all illustrate that CAEP maturation is complex and that the negative N1 component is not well developed in children; rather, the dominant negative component in children is N2.

The development of speech and language in children with prelingual hearing impairment depends on the perception and discrimination of a broad spectrum of speech sounds. The recording of CAEPs to human speech sounds in infants and young children can provide an index of the functional integrity of the regions of the cortex which process those acoustically complex stimuli that are critical for developing normal speech and language (Novak et al., 1989). For example, Rance, Cone-Wesson, Wunderlich & Dowell (2002) found that the development of reasonable speech perception performance in children with auditory neuropathy spectrum disorder (ANSD) was correlated with P1 and N2 of normal latency, amplitude and morphology, whereas the absence of CAEPs was associated with poor speech recognition scores. For these reasons, CAEPs are thought to reflect the functional integrity of the auditory pathways involved in processing complex speech stimuli (Novak et al., 1989; Ostroff et al., 1998; Tremblay, Friesen, Martin & Wright, 2003).

In the study reported here, we evaluated the P1 and N2 of the CAEP components as a function of three speech sounds - /m/, /g/ and /t/ - of low-, mid- and high-frequency regions, respectively, at the level of the auditory cortex in school-aged children with normal hearing. We aimed to determine whether the naturally produced speech stimuli /m/, /g/ and /t/ evoked distinct CAEP response patterns that could be reliably recorded in normal-hearing children, and whether the CAEP could be an electrophysiological measure to differentiate between these speech sounds.

Method

Objective

The main aim of the study was to determine whether the naturally produced speech stimuli /m/, /g/ and /t/, with different frequency patterns, evoked distinct P1 and N2 response patterns in terms of morphology, latencies and amplitudes that could be reliably recorded in children with normal hearing. We also investigated whether the CAEP is an objective tool that can measure the ability to detect and discriminate between each of these different speech sounds in normal-hearing children, which, in turn, provides us with normative data.

Research design

A cross-sectional research design was employed to describe the basic analysis of CAEP amplitudes and latencies for different speech sounds.

Participants

We recruited 18 school-aged children (8 males, 10 females) with normal hearing, aged 5 - 12 years (mean 8.77, standard deviation (SD) 2.28). Participants had pure-tone air conduction thresholds within normal limits (≤15 dB HL at octave frequencies from 250 Hz to 8 kHz), normal middle-ear function (shown by normal tympanograms), and no history of hearing problems or difficulties in understanding speech in quiet or noisy environments. Parents of all children reported no previous history of language or reading problems. All participants scored ≥90% on the Manchester Junior Word Lists (MJWL) (Watson, 1957). Prior to testing, the children and their parents signed a consent form.

Data collection

All participants had an otoscopic evaluation prior to testing to check for occluding wax or any sign of tympanic membrane abnormality. Air conduction and bone conduction pure-tone thresholds were determined with a calibrated clinical audiometer (AC33 Interacoustics 2 channel Audiometer) using a modified version of the Hughson and Westlake procedure (Carhart & Jerger, 1959). Tympanometry was undertaken with a calibrated immittance audiometer (GSI-Tympstar V2, calibrated as per ANSI, 1987) using a standard 226 Hz probe tone. Speech perception was assessed using the MJWL (Watson, 1957).

Materials

We used the natural speech stimuli /m/, /g/ and /t/ to evoke CAEP in the participants. We chose naturally produced speech stimuli, as recommended by Picton, Alain, Otten, Ritter and Achim (2000), because our goal was to apply the results to speech perception in everyday life.

The speech stimuli /mum/, /guy/ and /two/ were produced by an English female speaker. The stimuli were of 30 ms duration and were ramped with 5 ms rise and fall times to prevent any audible click arising from the rapid onset or offset of the waveform. They were generated using a 16-bit digital-to-analogue converter with a sampling frequency of 44.1 kHz, low-pass filtered and recorded in mono. The speech stimuli /m/, /g/ and /t/ included very little vowel transition and were chosen because they have a spectral emphasis in the low-, mid- and high-frequency regions, respectively. They therefore have the potential to provide information about the perception of speech sounds at the level of the auditory cortex in different frequency regions. The frequency spectra for the /m/, /g/ and /t/ tokens are shown in Figure 1.
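
As an illustration only, the onset and offset ramping described above can be sketched as follows. The ramp shape is not stated in the text, so a linear ramp is assumed here; the function name `apply_linear_ramp` and the 1 kHz tone standing in for a recorded speech token are hypothetical.

```python
import math


def apply_linear_ramp(samples, sample_rate_hz, ramp_ms):
    """Scale the first and last ramp_ms of a signal with a linear rise and fall."""
    n_ramp = sample_rate_hz * ramp_ms // 1000
    n = len(samples)
    if 2 * n_ramp > n:
        raise ValueError("ramps are longer than the signal")
    out = list(samples)
    for i in range(n_ramp):
        gain = i / n_ramp          # 0 at the outermost sample, approaching 1
        out[i] *= gain             # rise at the onset
        out[n - 1 - i] *= gain     # fall at the offset (mirror image)
    return out


# Hypothetical 30 ms, 1 kHz tone sampled at 44.1 kHz (not a study stimulus).
sample_rate_hz = 44100
n_samples = sample_rate_hz * 30 // 1000
tone = [math.sin(2 * math.pi * 1000 * i / sample_rate_hz) for i in range(n_samples)]
ramped = apply_linear_ramp(tone, sample_rate_hz, ramp_ms=5)
```

Only the first and last 5 ms are attenuated; the central 20 ms of the token are left unchanged.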


Fig. 1. Smoothed frequency spectra for the /m/, /g/ and /t/ tokens.


Speech sounds were edited using the programme Cool Edit 2000. The inter-stimulus interval (ISI), measured from onset to onset, was 1030 ms. Details are summarised in Table 1.



Stimulus presentation

All speech stimuli used in the electrophysiology procedure were presented at 65 dB sound pressure level (SPL), as measured at the participant’s head, which approximates normal conversational level. Presentation was via a loudspeaker placed 1 metre from the participant’s seat at 0° azimuth.

Set-up

Participants sat on a comfortable chair in a quiet room at an electrophysiology clinic and watched a DVD of their own choice. The volume was muted and subtitles were activated so that participants would be engaged with the movie and pay no attention to the stimuli. All participants were instructed to relax, not to pay attention to the sounds being presented, and not to fall asleep while the CAEPs were recorded.

Data acquisition

A Neuroscan 32-channel NeuAmps evoked potential system was used for evoked potential recording. All sounds were presented using the Neuroscan STIM 2 stimulus presentation system.

Recording

Evoked potentials were recorded in continuous mode (gain 500, filter 0.1 - 100 Hz) and digitised at an analogue-to-digital sampling rate of 1 000 Hz using Scan (version 4.3). Gold electrodes were placed at C3, C4 and Cz, with the reference electrode (A2) on the right mastoid and the ground electrode on the contralateral mastoid. Electrode impedances were maintained below 5 kΩ for all participants. The test session, including electrode application and evoked response recording, lasted about 20 minutes. None of the participants showed signs of fatigue during the testing. All sound levels presented were calibrated using Bruel & Kjaer SLM 2.

Off-line data analysis

Epochs spanning -100 to 500 ms relative to stimulus onset were extracted from the continuous EEG file. Any responses on scalp electrodes exceeding ±100 µV were rejected. Prior to averaging, the epochs were baseline corrected using the 100 ms prestimulus period, which established a baseline to correct for the DC level of the background EEG activity. The averaged waveforms were digitally band-pass filtered from 1 to 30 Hz (zero-phase-shift filter with a 12 dB/octave slope) to smooth the waves for the final figures.
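
The epoching, baseline correction, artefact rejection and averaging steps described above can be sketched as follows. This is a minimal illustration and not the Neuroscan processing used in the study: the function name `epoch_and_average`, the order of baseline correction and rejection, and the synthetic single-channel recording are all assumptions, and the 1 - 30 Hz band-pass filter is omitted.

```python
def epoch_and_average(eeg_uv, onsets, sample_rate_hz=1000,
                      pre_ms=100, post_ms=500, reject_uv=100.0):
    """Cut epochs around each stimulus onset, baseline-correct, reject, and average.

    eeg_uv : continuous single-channel recording in microvolts
    onsets : sample indices of the stimulus onsets
    Returns (average_waveform, number_of_accepted_epochs).
    """
    pre = pre_ms * sample_rate_hz // 1000
    post = post_ms * sample_rate_hz // 1000
    accepted = []
    for onset in onsets:
        if onset - pre < 0 or onset + post > len(eeg_uv):
            continue                               # epoch runs off the recording
        epoch = eeg_uv[onset - pre:onset + post]
        baseline = sum(epoch[:pre]) / pre          # mean of the prestimulus window
        epoch = [v - baseline for v in epoch]      # remove the DC offset
        if max(abs(v) for v in epoch) > reject_uv:
            continue                               # artefact beyond +/-100 microvolts
        accepted.append(epoch)
    if not accepted:
        raise ValueError("no epochs survived artefact rejection")
    n = len(accepted)
    average = [sum(column) / n for column in zip(*accepted)]
    return average, n


# Synthetic recording (hypothetical): a 5 microvolt DC offset, a 3 microvolt
# deflection 100 ms after each onset, and one large artefact in the second epoch.
eeg = [5.0] * 3000
onsets = [500, 1500, 2500]
for onset in onsets:
    eeg[onset + 100] += 3.0
eeg[1600] += 500.0
average, n_accepted = epoch_and_average(eeg, onsets)
```

In this synthetic case the second epoch is discarded because of the artefact, so the average is built from the two remaining epochs and the DC offset is removed by the baseline correction.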

Larger amplitude responses are recorded from the midline scalp locations, and neural patterns to speech stimuli have been reported to be similar at electrode locations surrounding the midline (Tremblay et al., 2003). Therefore the peak P1 and N2 latencies were detected based on the recordings from the Cz electrode site (Cz refers to the vertex midline placement). Individual participants had two averaged CAEP waveforms, resulting in a grand average waveform of 200 (i.e. 100 x 2) sweeps for each stimulus presentation.

P1 was measured as the highest positive peak after stimulus onset, and N2 as the negative component immediately following P1, usually present around 200 - 250 ms after stimulus presentation.
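
A simple automated version of this peak-picking rule is sketched below for illustration; in the study itself the peaks were identified visually. The function `pick_p1_n2` is hypothetical, it approximates N2 as the most negative point after P1 (a simplification of "the immediate negative component"), and the waveform is synthetic.

```python
import math


def pick_p1_n2(waveform_uv, sample_rate_hz=1000, pre_ms=100):
    """Return ((p1_latency_ms, p1_amplitude), (n2_latency_ms, n2_amplitude)).

    waveform_uv is an averaged epoch that starts pre_ms before stimulus onset.
    """
    onset = pre_ms * sample_rate_hz // 1000
    post = waveform_uv[onset:]                     # samples from stimulus onset on
    # P1: highest positive peak after stimulus onset.
    p1_idx = max(range(len(post)), key=lambda i: post[i])
    # N2 (simplified): most negative point that follows P1.
    n2_idx = min(range(p1_idx, len(post)), key=lambda i: post[i])
    to_ms = 1000 / sample_rate_hz
    return (p1_idx * to_ms, post[p1_idx]), (n2_idx * to_ms, post[n2_idx])


def bump(t, centre, width, height):
    """Gaussian-shaped deflection used to build a synthetic waveform."""
    return height * math.exp(-((t - centre) / width) ** 2)


# Synthetic average (hypothetical values): a positive peak near 95 ms and a
# negative peak near 230 ms, sampled every 1 ms from -100 to 499 ms.
waveform = [bump(t, 95, 25, 4.0) + bump(t, 230, 40, -5.0) for t in range(-100, 500)]
(p1_lat, p1_amp), (n2_lat, n2_amp) = pick_p1_n2(waveform)
```

Latency is read off as the time of the peak relative to stimulus onset and amplitude as the voltage at that sample, matching the definitions of latency and amplitude used throughout the paper.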

Participants were divided into two groups, a younger group aged 5 - 7 years and an older group aged 8 - 12 years, and the results are discussed separately below.

Data analysis

For each participant, the individual grand average waveform was computed and its peaks were visually identified by two experts in electrophysiology. The data were then transferred to Excel and analysed statistically using SPSS (version 18). The small group sizes necessitated the use of non-parametric analyses.

Results

CAEP: Younger children (5 - 7-year age group)

The CAEP components P1 and N2 were easily recognised for all speech stimuli, /m/, /g/ and /t/, at the Cz electrode for all children in this group (Fig. 2).



Fig. 2. Grand averaged CAEP waveforms in response to the three speech stimuli in the 5 - 7-year age group.


P1 amplitude

Mean and SD peak amplitudes for the P1 component in response to /m/, /g/ and /t/ speech stimuli for the 5 - 7-year age group are shown in Table 2. Visual inspection revealed robust P1 amplitudes for /m/, /g/ and /t/ of about 2.7, 4.5 and 2.1 μV, respectively. Changes in P1 amplitude were observed as a function of the three speech stimuli, which differ in frequency information. The Friedman test was performed to compare the mean P1 amplitude in response to the three speech stimuli /m/, /g/ and /t/. The results indicated a statistically significant difference in P1 amplitude of the CAEP across the speech stimuli (chi-squared=6.000, p<0.05). Inspection of the median values showed changes in P1 amplitude of the CAEP across /m/ (3.2), /g/ (4.4) and /t/ (2.7).
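
For readers unfamiliar with the Friedman test, the statistic can be computed as in the sketch below. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the helper names are hypothetical, no tie correction is applied, the p-value uses the asymptotic chi-squared approximation, and the amplitude values are invented for the example.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_test(rows):
    """rows: one list per participant, one value per condition."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:                               # rank within each participant
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    chi2 = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    # With k = 3 conditions there are 2 degrees of freedom, for which the
    # chi-squared upper-tail probability is exactly exp(-chi2 / 2).
    p = math.exp(-chi2 / 2) if k == 3 else None
    return chi2, p


# Hypothetical P1 amplitudes in microvolts (columns: /m/, /g/, /t/); not study data.
amplitudes = [
    [3.1, 4.3, 2.6],
    [2.9, 4.8, 2.1],
    [3.5, 4.1, 2.4],
    [2.2, 3.9, 2.6],
    [3.0, 5.0, 1.9],
    [2.8, 4.6, 2.0],
]
chi2, p = friedman_test(amplitudes)
```

Because the test ranks the three conditions within each participant, it is insensitive to overall amplitude differences between children, which suits small samples with large inter-individual variability.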



The Wilcoxon signed ranks test was performed to test (post hoc analysis) for a significant difference in the means of pairs of speech stimuli. Results revealed a significant difference in mean of the P1 amplitude between the speech sounds /g/ and /t/ (Z= -2.375, p≤0.05), and the speech sounds /g/ and /m/ (Z= -2.261, p≤0.05).
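
The Z value reported for each pairwise comparison can be illustrated with the normal approximation to the Wilcoxon signed-rank statistic. The sketch below is an assumption-laden illustration, not the SPSS implementation: the function names are hypothetical, no tie or continuity correction is applied, and the paired amplitudes are invented.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_z(x, y):
    """Normal-approximation Z for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # zero differences are dropped
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])    # rank the absolute differences
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_pos, w_neg)                             # smaller signed-rank sum
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / sd


# Hypothetical paired P1 amplitudes in microvolts for six children; not study data.
amp_g = [4.3, 4.8, 4.1, 3.9, 5.0, 4.6]
amp_t = [2.6, 2.1, 2.5, 2.7, 1.9, 2.0]
z = wilcoxon_z(amp_g, amp_t)
```

In this invented example every child shows a larger response to /g/ than to /t/, so the smaller rank sum is zero and Z takes its most extreme value for six pairs, approximately -2.20.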

P1 latency

Mean and SD peak latencies for P1 in response to /m/, /g/ and /t/ for the 5 - 7-year age group are shown in Table 2. Visual inspection revealed robust P1 latencies for /m/, /g/ and /t/ of about 93, 91 and 100 ms, respectively. Changes in P1 latency were observed as a function of the three speech sounds, which differ in frequency information. The Friedman test was performed to compare the mean P1 latency in response to the three speech tokens /m/, /g/ and /t/. The results indicated no statistically significant difference in P1 latency of the CAEP across the speech sounds (chi-squared=2.333, p>0.05). Inspection of the median values showed changes in P1 latency of the CAEP across /m/ (100), /g/ (96) and /t/ (105).

N2 amplitude

Mean and SD peak amplitudes for N2 in response to /m/, /g/ and /t/ speech stimuli for the 5 - 7-year age group are shown in Table 2. Visual inspection revealed robust N2 amplitudes for /m/, /g/ and /t/ of about 4.3, 5.5 and 3.8 μV, respectively. Changes in N2 amplitude were observed as a function of the three speech sounds, which differ in frequency information. The Friedman test was performed to compare the mean N2 amplitude as a function of the three speech stimuli /m/, /g/ and /t/. The results indicated a statistically significant difference in N2 amplitude of the CAEP across the speech sounds (chi-squared=5.900, p<0.05). Inspection of the median values showed changes in N2 amplitude of the CAEP across /m/ (3.6), /g/ (5.9) and /t/ (3.5).

The Wilcoxon signed ranks test was performed to test (post hoc analysis) which pairs of speech stimuli were significantly different. Results revealed a significant difference in N2 amplitude in the younger age group between the speech sounds /g/ and /t/ (Z= -2.201, p≤0.05), and the speech sounds /g/ and /m/ (Z= -2.176, p≤0.05).

N2 latency

Mean and SD N2 latencies in response to /m/, /g/ and /t/ speech sounds for the 5 - 7-year age group are shown in Table 2. Visual inspection revealed robust N2 latencies for /m/, /g/ and /t/ of about 233, 219 and 245 ms, respectively. As expected in this age group, there was no evidence of an N1 component in the majority of children. In both the individual and grand average responses, changes in N2 latency were observed as a function of the three speech sounds, which differ in frequency region. The Friedman test was performed to compare the mean N2 latency as a function of the three speech tokens /m/, /g/ and /t/. The results indicated a statistically significant difference in N2 latency of the CAEP across the speech stimuli (chi-squared=7.000, p<0.05). Inspection of the median values showed changes in N2 latency of the CAEP across /m/ (235), /g/ (222) and /t/ (250).

Finally, the Wilcoxon signed ranks test was performed to test (post hoc analysis) which pairs of speech stimuli were significantly different. Results revealed a significant difference in N2 latency in the younger age group between the speech sounds /g/ and /t/ (Z= -2.201, p≤0.05), and the speech sounds /g/ and /m/ (Z= -2.131, p≤0.05).

CAEP: Older children (8 - 12-year age group)

P1 amplitude

Mean and SD peak amplitudes for P1 in response to /m/, /g/ and /t/ speech sounds for the 8 - 12-year age group are shown in Table 3. Visual inspection revealed smaller P1 amplitude of /m/, /g/ and /t/ at about 3.0, 3.1 and 1.2 μV, respectively. Changes in P1 amplitude were observed as a function of three different speech sounds that differ in frequency region.



The Friedman test was performed to compare the mean P1 amplitude as a function of the three speech tokens /m/, /g/ and /t/. Results indicated a statistically significant difference in the mean P1 amplitude of the CAEP across the speech tokens (chi-squared=7.185, p<0.05). Inspection of the median values showed changes in P1 amplitude of the CAEP across /m/ (2.8), /g/ (3.7) and /t/ (2.3).

The Wilcoxon signed ranks test was then performed to test (post hoc analysis) which pairs of speech stimuli were significantly different. Results showed a significant difference in P1 amplitude in the older group between the speech sounds /g/ and /t/ (Z= -2.197, p≤0.05), and between /m/ and /t/ (Z= -2.201, p≤0.05).

P1 latency

Mean and SD peak latencies of P1 in response to /m/, /g/ and /t/ for the 8 - 12-year age group are shown in Table 3. Visual inspection revealed robust P1 latencies for /m/, /g/ and /t/ of about 93, 96 and 94 ms, respectively. Changes in P1 latency were observed as a function of the three speech sounds, which differ in frequency region. The Friedman test was performed to compare the mean P1 latency as a function of the three speech tokens /m/, /g/ and /t/. Results indicated no statistically significant difference in P1 latency of the CAEP across the speech tokens (chi-squared=2.909, p>0.05). Inspection of the median values showed changes in P1 latency of the CAEP across /m/ (92), /g/ (95) and /t/ (90).

N2 amplitude

Mean and SD peak amplitudes for N2 in response to /m/, /g/ and /t/ speech sounds for the 8 - 12-year age group are shown in Table 3. Visual inspection revealed smaller N2 amplitudes for /m/, /g/ and /t/ of about 4.1, 4.7 and 2.3 μV, respectively. Changes in N2 amplitude were observed as a function of the three speech sounds, which differ in frequency region. The Friedman test was performed to compare the mean N2 amplitude as a function of the three speech tokens /m/, /g/ and /t/. Results indicated a statistically significant difference in N2 amplitude of the CAEP across the speech tokens (chi-squared=11.143, p<0.005). Inspection of the median values showed changes in N2 amplitude of the CAEP across /m/ (4.8), /g/ (5.9) and /t/ (3.4).

Post hoc analysis using the Wilcoxon signed ranks test, performed to identify which pairs of speech stimuli were significantly different, revealed a significant difference in N2 amplitude in the older group only between the speech sounds /g/ and /t/ (Z= -2.371, p≤0.05).

N2 latency

The mean and SD latencies of N2 in response to /m/, /g/ and /t/ for the 8 - 12-year age group are shown in Table 3. Visual inspection revealed robust N2 latencies for /m/, /g/ and /t/ of about 249, 237 and 244 ms, respectively. Changes in N2 latency were observed as a function of the three speech sounds, which differ in frequency region. The Friedman test was performed to compare the mean N2 latency as a function of the three speech tokens /m/, /g/ and /t/. Results indicated no statistically significant difference in N2 latency of the CAEP across the speech tokens (chi-squared=2.905, p>0.05). Inspection of the median values showed changes in N2 latency of the CAEP across /m/ (235), /g/ (232) and /t/ (240).

Test and retest reliability

Spearman’s rho was calculated for each stimulus individually to test the correlation between the first and second trials. The test and retest conditions were significantly correlated for every stimulus (r≥0.5, p<0.05).
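
As an illustration of the test-retest measure, Spearman's rho is the Pearson correlation computed on the ranks of the two trials. The sketch below uses hypothetical function names and invented latency values; it is not the SPSS analysis used in the study and does not compute a p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical P1 latencies in ms from a first and second trial; not study data.
trial_1 = [93, 91, 100, 96, 105, 98]
trial_2 = [95, 90, 102, 97, 101, 99]
rho = spearman_rho(trial_1, trial_2)
```

A value of rho close to 1 indicates that children keep nearly the same rank order of latencies or amplitudes across the two trials.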


Discussion

The main aim of this study was to determine whether different speech sounds that encompass three spectral information frequencies (low, mid and high) could be differentiated from each other based on response latency and amplitude of the CAEP measures, in particular the P1 and N2 components, in school-aged children with normal hearing. Robust P1 and N2 responses were obtained from all participants for all speech stimuli tested. Notably, there was a distinct difference in morphology of the CAEPs between the two groups (younger vs. older): the younger group had a more robust and larger P1 component than the older group. The N2 component was also larger in amplitude in the younger group. In the older group of children, a small amplitude but a definite N1 peak was revealed in particular for the speech sounds /g/ and /m/ (Fig. 3). This finding supports the ongoing maturation of the CAEPs in childhood until they reach the adult-like CAEP waveforms; however, the maturation aspect of the CAEP is not discussed here (for reviews, see Eggermont & Ponton, 2003; Fox, Anderson, Reid, Smith & Bishop, 2010; Gilley, Sharma, Dorman & Martin, 2005; Novak et al., 1989; Pang & Taylor, 2000; Pasman, Rotteveel, Massen & Visco, 1999; Ponton et al., 1996; Ponton, Eggermont, Kwong & Don, 2000; Ponton, Eggermont, Khosla, Kwong & Don, 2002; Sharma & Dorman, 2006; Sharma et al., 1997; Sharma, Dorman & Spahr, 2002b; Sharma et al., 2002a; Wunderlich et al., 2006).


Fig. 3. Grand averaged CAEP waveforms in response to the three speech stimuli in the 8 - 12-year age group.

In the younger group, P1 amplitude changed significantly as a function of the /m/, /g/ and /t/ speech sounds. The largest amplitude was for /g/ and the smallest amplitude was for /t/. The statistical analysis revealed that a significant mean difference of P1 amplitude occurred between /g/ and /t/ and between /g/ and /m/. P1 latency changed slightly as a function of these speech stimuli; however, the change was not significant.

N2 amplitude in the younger group changed significantly as a function of these sounds, and the significant changes were between the speech sounds /g/ and /t/ and between /g/ and /m/. There was no significant difference between /m/ and /t/. The largest N2 amplitude was for /g/ and the smallest was for /t/. N2 latency also changed significantly as a function of the speech sounds /g/, /m/ and /t/; the significant changes were between /g/ and /t/ and between /g/ and /m/.

In the older group, there was significant mean difference in P1 amplitude between /g/ and /t/ and between /m/ and /t/. However, there was no significant mean difference in P1 latency across these speech stimuli. N2 amplitude in the older group showed a significant difference only between /g/ and /t/. N2 latency changed as a function of these speech sounds; however, the changes were not significant.

Taken together, the results indicate that the amplitude of the P1 and N2 in both groups showed significant changes in mean in response to /m/ (low-), /g/ (mid-) and /t/ (high-frequency) sounds. The largest amplitude of the P1 and N2 component was always for the speech sound /g/ and the smallest was for the speech sound /t/. The P1 latency in both age groups, however, showed no significant difference between any of these speech sounds. The N2 latency showed a significant change in the younger group but not in the older group. Moreover, the N2 latency of the speech sound /g/ was always noted earlier in both groups.

There are different morphology, latency and amplitude P1 and N2 responses evoked by the different speech tokens that encompass different frequency patterns recorded in children with normal hearing. Therefore the CAEP can be used as an objective measure to detect and discriminate between each of these different speech sounds.

Research on the tonotopic organisation of the primary auditory cortex and the effect of the stimulus frequency on the auditory cortex has found that the cortical areas that respond to low-frequency auditory information are located more superficially (closer to the surface of the scalp) than cortical regions that respond to high frequencies (located deeper beneath the scalp) (Jacobson, Lombardi, Gibbens, Ahmed & Newman, 1992; Wunderlich et al., 2006; Yetkin, Roland, Christensen & Purdy, 2004). Lower-frequency speech sounds such as /m/ and /g/ may therefore activate more superficial cortical regions and so produce a larger amplitude than higher-frequency speech sounds such as /t/, which might activate deeper cortical regions and thus have a smaller amplitude when surface scalp recording electrodes are used.

The results from this study demonstrate that the frequency organisation of the auditory system can be revealed in both amplitude and latency measures of the CAEP. Specifically, we have demonstrated that the P1 and N2 amplitudes for both younger and older groups are larger in response to the /g/ speech sound than to the /m/ or /t/ speech sounds. We have also shown that the peak latency of N2 is earlier in response to the /g/ speech sound than to the /m/ and /t/ speech sounds.

These findings demonstrate that speech stimuli carrying low-, mid- and high-frequency information, such as /m/, /g/ and /t/, are encoded differently at the cortical level. On the basis of their latency and amplitude responses, it may be assumed that each of the three speech sounds activates a different neuronal population in the auditory cortex. Thus CAEPs evoked by these speech sounds could be used as an objective test to evaluate speech discrimination.

Conclusion

The cortical auditory-evoked response can be elicited by speech sounds that represent three frequency groups, namely low, mid and high frequencies: /m/, /g/ and /t/, respectively. Interestingly, the /g/ speech sound evoked larger amplitudes and shorter N2 latencies than the /m/ and /t/ speech sounds, which are dominated by low- and high-frequency energy, respectively. Thus the P1 and N2 responses of the CAEP may be used to objectively measure differences in the neural encoding and perception of spectrally different speech sounds. In the current study we have described and provided normative data from school-aged children in response to three speech sounds that differ in spectral information. This objective measure therefore provides a tool to investigate the neurophysiological processes that underlie our ability to perceive speech. Further studies should test these results in populations that are difficult to assess using behavioural responses, and investigate the responses evoked by these sounds in a disordered population.

References

Carhart, R. & Jerger, J. (1959). Preferred method for clinical determination of pure tone thresholds. J Speech Hear Disord, 24, 330-345.

Ceponiene, R., Rinne, T. & Näätänen, R. (2002). Maturation of cortical sound processing as indexed by event-related potentials. Clin Neurophysiol, 113(6), 870-882.

Cunningham, J., Nicol, T., Zecker, S. & Kraus, N. (2000). Speech-evoked neurophysiologic responses in children with learning problems: Development and behavioral correlates of perception. Ear and Hearing, 21(6), 554-568.

Cone-Wesson, B.K. & Wunderlich, J.L. (2003). Auditory evoked potentials from the cortex: Audiology application. Current Opinion in Otolaryngology & Head & Neck Surgery, 11(5), 372-377.

Davis, P.A. (1939). Effects of acoustic stimuli on the waking human brain. Journal of Neurophysiology, 2(1), 494-499.

Davis, H., Mast, T., Yoshie, N. & Zerlin, S. (1966). The slow response of the human cortex to auditory stimuli: Recovery process. Clin Neurophysiol, 21(2), 105-113.

Eggermont, J. & Ponton, C. (2003). Auditory-evoked potential studies of cortical maturation in normal hearing and implanted children: Correlations with changes in structure and speech perception. Acta Otolaryngol, 123, 249-252.

Kraus, N. & McGee, L. (1994). Mismatch negativity in the assessment of central auditory function. Am J Audiol, 3(2), 39-51.

Fox, A., Anderson, M., Reid, C., Smith, T. & Bishop, D. (2010). Maturation of auditory temporal integration and inhibition assessed with event-related potentials (ERPs). BMC Neuroscience, 11, 49.

Gilley, P., Sharma, A., Dorman, M. & Martin, K. (2005). Developmental changes in refractoriness of the cortical auditory evoked potentials. Clin Neurophysiol, 116, 648-657.

Jacobson, G., Lombardi, D., Gibbens, N., Ahmed, B. & Newman, C. (1992). The effects of stimulus frequency and recording site on the amplitude and latency of multichannel cortical evoked potentials (CAEP) component N1. Ear Hear, 13, 300-306.

Novak, G., Kurtzberg, D., Judith, A., Kreuzer, G., Herbert, G. & Vaughan, J. (1989). Cortical response to speech sounds and their formants in normal infants: Maturational sequence and spatiotemporal analysis. Clin Neurophysiol, 73, 295-305.

Näätänen, R. & Picton, T. (1987). The N1 wave of the human electric and magnetic responses to sound: A review and an analysis of the component structure. Psychophysiology, 24(4), 375-425.

Ostroff, J.M., Martin, B.A. & Boothroyd, A. (1998). Cortical evoked response to acoustic change within a syllable. Ear Hear, 19(4), 290-297.

Pang, E. & Taylor, M. (2000). Tracking the development of N1 from age 3 to adulthood: An examination of speech and non-speech stimuli. Clin Neurophysiol, 111, 388-397.

Pasman, J., Rotteveel, J., Massen, B. & Visco, Y. (1999). The maturation of auditory cortical evoked responses between (preterm) birth and 14 years of age. Eur J Paediatr Neuro, 3, 79-82.

Picton, T., Alain, C., Otten, L., Ritter, W. & Achim, A. (2000). Mismatch negativity: Different water in the same river. Audiol Neurootol, 5, 111-139.

Ponton, C.W. & Don, M. (2003). Cortical auditory evoked potentials recorded from cochlear implant users: Methods and applications. In H.E. Cullington (Ed.), Cochlear implants: Objective measures. London: Whurr Publishers.

Ponton, C.W., Don, M., Eggermont, J., Waring, M. & Masuda, A. (1996). Maturation of human cortical auditory function: Differences between normal-hearing children and children with cochlear implants. Ear Hear, 17(5), 430-437.

Ponton, C., Eggermont, J., Khosla, D., Kwong, B. & Don, M. (2002). Maturation of human central auditory system activity: Separating auditory evoked potentials by dipole source modeling. Clin Neurophysiol, 113, 407-420.

Ponton, C., Eggermont, J., Kwong, B. & Don, M. (2000). Maturation of human central auditory system activity: Evidence from multi-channel evoked potentials. Clin Neurophysiol, 111, 220-236.

Purdy, S.C., Kelly, A.S. & Thorne, P.R. (2001). Auditory evoked potentials as measures of plasticity in humans. Audiol Neurootol, 6(4), 211-215.

Rance, G., Cone-Wesson, B., Wunderlich, J. & Dowell, R. (2002). Speech perception and cortical event related potentials in children with auditory neuropathy. Ear Hear, 23, 239-253.

Sharma, A. & Dorman, F. (2006). Central auditory development in children with cochlear implants: Clinical implications. Adv Otorhinolaryngol, 64, 66-88.

Sharma, A., Dorman, F. & Spahr, A. (2002a). A sensitive period for the development of the central auditory system in children with cochlear implants: Implications for age of implantation. Ear Hear, 23, 532-539.

Sharma, A., Dorman, F. & Spahr, A. (2002b). Rapid development of cortical auditory evoked potentials after early cochlear implantation. Neuroreport, 13, 1365-1368.

Sharma, A., Kraus, N., McGee, T. & Nicol, T. (1997). Developmental changes in P1 and N1 central auditory responses elicited by consonant-vowel syllables. Clin Neurophysiol, 104(6), 540-545.

Taylor, M.J. & Baldeweg, T. (2002). Application of EEG, ERP and intracranial recordings to the investigation of cognitive functions in children. Developmental Science, 5(3), 318-334.

Tremblay, K., Friesen, L., Martin, B. & Wright, R. (2003). Test-retest reliability of cortical evoked potentials using naturally produced speech sounds. Ear Hear, 24, 225-232.

Tremblay, K., Piskosz, M. & Souza, P. (2003). Effects of age-related hearing loss on the neural representation of speech cues. Clin Neurophysiol, 114, 1332-1343.

Watson, T. (1957). Speech audiometry in children. In A. Ewing (Ed.), Educational guidance and the deaf child. Manchester: Manchester University Press.

Wunderlich, J.L. & Cone-Wesson, B.K. (2006). Maturation of CAEP in infants and children: A review. Hear Res, 212(1-2), 212-223.

Wunderlich, J., Cone-Wesson, B. & Shepherd, R. (2006). Maturation of the cortical auditory evoked potential in infants and young children. Hear Res, 212, 185-202.

Yetkin, Z., Roland, P., Christensen, W. & Purdy, D. (2004). Silent functional magnetic resonance imaging (fMRI) of tonotopicity and stimulus intensity coding in human primary auditory cortex. The Laryngoscope, 114, 512-518.