Background
Audio recordings and transcriptions are recommended for English language learners (ELLs)
working to improve their spoken English intelligibility (Gorsuch,
Meyers, Pickering, & Griffee, 2013). By analyzing these
transcriptions, students can understand what they are not communicating
clearly and can more easily see how they need to improve their
communication. For students who think that their speech is already
sufficiently comprehensible, international teaching assistants (ITAs) in
particular, these transcriptions may serve as a wake-up call that they
have room for improvement (Wallace, 2013).
While this activity provides insight into a student’s speech,
transcribing the speech itself is not a pedagogical goal, and because it
is time-consuming, many students do not do it well; some even skip this
crucial step. With Google’s voice-to-text software, however, many
students are now able to save time and receive feedback instantly.
In my TESOL 2015 presentation, the audience will learn how to
record and have speech transcribed simultaneously, how to correct the
transcript, and how to mark the transcription for features of discourse
intonation. The audience will then analyze parts of a transcription to
learn what pronunciation issues the student may face. Finally,
suggestions will be given on how this assignment may be graded.
How to Record and Have Speech Transcribed Simultaneously
As of now, it is still necessary to use two different
applications for recording and transcribing. Due to the complexity and
coordination required to complete this step successfully, it is
recommended that this activity be done during class so the instructor
can offer guidance and train the learners in the effective use of the
technology (Hubbard, 2013). Furthermore, the audio should be recorded
through a headset, because a built-in microphone picks up ambient noise
that lowers the quality of the transcription.
Steps
1. Open the Google Web Speech API Demonstration. When you speak, Google Web
Speech will transcribe what you say. Do a test run of this first by
clicking the microphone icon (you may need to use the Chrome web
browser; you may need to “allow” the microphone). It is working when
there is a pulsing red dot behind the microphone icon. Your words will
appear in the box as you speak.
2. Open an audio recording application to record your
voice (Audacity and QuickTime are both reliable) and make a test recording to
ensure that the sound is clear. If you use Audacity, make sure that
you are able to export the file as an MP3, because computers without
Audacity installed cannot open .AUP (Audacity-specific) project files. If
you cannot export the file as an MP3, you will need to download
LAMElib first. This will enable you to export the AUP file as
an MP3, which can be played in other programs.
3. Resize your windows so that you can see both your browser
and audio recorder. You will speak for 1.5–2 minutes on a topic of
your choice (a sample topic follows). Once you are ready to begin, click the
microphone icon in the browser, then click the record button in the
audio recorder. Click “stop” when you are finished recording, then click
“copy and paste” in your web browser (below the text box). Here is a
sample self-introduction topic:
- Give your full name and your name preference (“My name is ___, but you can call me ___”).
- Say where you are from and which languages you speak.
- State your major and area of study.
- Discuss a hobby (something you like to do).
- (If you still have time) Describe your happiest day, one of
the most interesting things you have ever done, something that really
has surprised you about the United States, or something you would like
to do some day and why.
4. Export or save this audio recording as
“YourNameBaseRec.mp3”. If you cannot save it as an MP3, MOV or MP4 files
work as well. Copy and paste Google Web Speech’s transcription of your
voice into a document, and name it “YourNameTranscription”.
Correcting the Transcription and Marking It for Discourse Intonation
Comparing Google Web Speech’s transcription to the corrected
transcription can be both interesting and useful. For that reason,
students should copy and paste the transcription twice: once for
revision, and once for comparison.
Correcting the Transcription
Even with native speakers, Google Web Speech makes the
occasional mistake (especially with names of people and places), so it
is important to listen to the recording in order to make any
corrections. This corrected transcription needs to mirror exactly what
was said in the recording, and that includes the following:
- False starts or recasts (th-th-the, I run-I ran)
- Brief pauses (,) and silences / hesitation (…)
- Sentence boundaries (? . !)
Be aware that if a speaker has a very strong accent, Google’s
accuracy will be low, and it may be easier for the person to transcribe
his or her speech directly from the recording rather than to correct
Google’s transcription.
Marking Discourse Intonation
The transcription should show the reader not only what was
said, but also how it was said. Depending on the students’
knowledge and the goals of the class, the following discourse
intonation features can also be marked:
- Prominence (write the STRESSED words in uppercase letters)
- Pitch movement at the end of a statement
- Tone choice or key
Please see Gorsuch et al. (2013) for an explanation of these features and suggestions on how to transcribe them.
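Once prominence has been marked in uppercase, the prominent words can also be pulled out of the transcript automatically. The following is only a minimal sketch, assuming the uppercase convention described above; the function name `prominent_words` and the sample sentence are invented for illustration, and acronyms written in capitals (for example, a country abbreviation) would be counted as prominent by mistake.

```python
import re

def prominent_words(marked_transcript: str) -> list[str]:
    """Return the words written entirely in uppercase (two or more letters).

    The single-letter pronoun "I" is skipped on purpose, since it is always
    capitalized and says nothing about prominence.
    """
    return re.findall(r"\b[A-Z][A-Z'’]*[A-Z]\b", marked_transcript)

# Invented sample sentence, marked by hand for prominence.
sample = "I STUDY chemical ENGINEERING, and I like to PLAY soccer."
print(prominent_words(sample))
```

A list like this lets a student or instructor see at a glance which words carried the stress in a given stretch of speech.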
Analyzing the Transcription
Students analyze their recordings and transcriptions to
determine what they need to improve to be better understood when
speaking. Even if they are unable to pinpoint specific segmental or
suprasegmental errors, students usually notice speech rate (too fast or
too slow), fluency, fillers, whether there are clear sentence
boundaries, grammatical and lexical mistakes, and which words might be
mispronounced. Some students find through the act of correcting Google’s
transcription that even they have difficulty understanding some of what
was said.
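Two of the features that students usually notice, speech rate and fillers, can be estimated from the corrected transcription with a few lines of code. This is a rough sketch rather than a validated measure: the filler list, the function names, and the sample sentence are all assumptions made for illustration, and the recording length must be supplied by hand (for example, read from the audio recorder).

```python
import re

# Hypothetical filler list; adjust it to the fillers your students actually use.
FILLERS = {"um", "uh", "er", "ah", "hmm"}

def tokenize(transcript: str) -> list[str]:
    """Lowercase the transcript and return its words, ignoring punctuation."""
    return re.findall(r"[a-z0-9'’]+", transcript.lower())

def speech_rate_wpm(transcript: str, duration_seconds: float) -> float:
    """Words per minute, counting every token (fillers and false starts included)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(tokenize(transcript)) * 60 / duration_seconds

def count_fillers(transcript: str) -> dict[str, int]:
    """Count how often each filler from FILLERS occurs in the transcript."""
    counts: dict[str, int] = {}
    for word in tokenize(transcript):
        if word in FILLERS:
            counts[word] = counts.get(word, 0) + 1
    return counts

# Invented sample: a corrected transcript of a six-second stretch of speech.
sample = "My name is, um, Lin… I-I am from, uh, um, Taipei."
print(speech_rate_wpm(sample, 6.0))
print(count_fillers(sample))
```

Because false starts and fillers are counted as words, the rate reflects everything the speaker produced, not only the fluent portion.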
Taking the analysis a step further, students can contrast their
corrected transcription with Google’s interpretation to reveal other areas of
potential miscommunication. For example, if Google Web Speech
transcribed “the person page” but the student actually said “the
percentage,” we could guess that the student likely did not reduce the
last syllable and might have even misplaced the word stress. This
comparison is often useful as long as the student’s accent is not too
strong, but it may not be useful to every
learner. Furthermore, instructors should be warned that Google Web
Speech interprets whatever it hears, and does not take into
consideration the context of the topic or the setting. To illustrate, a
student once said “you will find that in 1983,” but Google transcribed
it as “you will find sh[*]t on a 1983.” Although the result contains a
possibly offensive curse word, the discrepancy suggests that the student
did not pronounce the “th” or the subsequent vowel clearly and did not link
“in-1983.” The instructor can determine whether or not it is useful to
compare the students’ revised transcripts with Google’s after looking at
the two versions.
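For longer transcripts, the places where Google’s version departs from the corrected version can be listed automatically instead of being found by eye. The sketch below uses Python’s standard `difflib` module to align the two texts word by word; the function names are hypothetical, and the sample phrases are the “percentage” example given above. It only shows where the two versions differ. Interpreting why they differ remains the job of the student and the instructor.

```python
import difflib
import re

def words(text: str) -> list[str]:
    """Lowercase the text and return its words, ignoring punctuation."""
    return re.findall(r"[a-z0-9'’]+", text.lower())

def transcript_differences(corrected: str, machine: str) -> list[tuple[str, str]]:
    """Pair each stretch the student actually said with what the software heard.

    Returns (corrected_words, machine_words) tuples for every span where the
    two transcripts disagree; an empty string means nothing on that side.
    """
    said = words(corrected)
    heard = words(machine)
    matcher = difflib.SequenceMatcher(None, said, heard, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((" ".join(said[i1:i2]), " ".join(heard[j1:j2])))
    return differences

corrected = "the percentage"   # what the student actually said
machine = "the person page"    # what Google Web Speech transcribed
print(transcript_differences(corrected, machine))
```

Each pair in the result marks a candidate spot for the kind of analysis described above, such as checking syllable reduction or word stress.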
Once students complete their analysis, they practice improving
their delivery based on it. They then make a second recording
on the same topic, but without looking at the transcription.
Afterward, students comment on how they practiced for the second
recording, what they feel they improved in the second recording, and
what they feel they still need to improve. This way, the instructor can
gain insight into students’ practice strategies as well as their
understanding of pronunciation and discourse intonation features.
Grading
Although I have assigned this activity for students to do on
their own once they are familiar with the process, the quality of the work is
often better when done as an in-class activity. This saves time for busy students, and
through the instructor’s guidance, students can deepen their analysis by
focusing on the topic at hand (intonation variation, phrasal stress,
reduction). That said, if it is done as an in-class activity, grading is done at the
instructor’s discretion (letter grade, complete/incomplete, comments
only). If it is an assignment, instructors can use a rubric and give
feedback, generalized or specific. A grading rubric will be shared at
the TESOL convention during my session.
Conclusion
Because self-monitoring while speaking can be difficult, this
activity allows learners to slow down and really hear and see what they
said and how. Analyzing their speech for what they did well and what
they need to improve helps students to better understand where they have
improved and what to work on next. Furthermore, if the students’ speech
is sufficiently clear, Google Web Speech can save them much time, as
there is no need to transcribe from scratch.
References
Gorsuch, G., Meyers, C., Pickering, L., & Griffee, D. (2013). English communication for international teaching
assistants (2nd ed.). Long Grove, IL: Waveland
Press.
Hubbard, P. (2013). Making a case for learner training in
technology enhanced language learning environments. CALICO
Journal, 30(2). Retrieved from http://journals.sfu.ca/CALICO/index.php/calico/article/view/945
Wallace, L. (2013). Taking the first step: International
teaching assistants’ motivation to improve their spoken English
intelligibility. ITAIS Newsletter. Retrieved from http://newsmanager.commpartners.com/tesolitais/issues/2013-08-19/1.html
Dr. Lara Wallace is a lecturer and the English Language Improvement
Program pronunciation lab coordinator in Ohio University’s Department of
Linguistics. She will present at TESOL on 26 March 2015 at 1 pm in room
205B at the Toronto Convention Centre.