June 2020
EVIDENCE FROM AUTHENTIC ITA DISCOURSE AND TOEFL IBT SPEAKING RESPONSES IN SUPPORT OF ITA ASSESSMENT

Elena Cotos, Iowa State University, Ames, Iowa, USA

As international teaching assistant (ITA) professionals, we experience multiple challenges related to ITA assessment and training. Variations in institutional operations as well as ITAs’ disciplinary backgrounds and teaching responsibilities are demanding issues; addressing these issues requires that we design and exercise both needs-responsive and systematic testing and instructional practices. Testing, in particular, is a very labor- and cost-intensive endeavor. It is also higher stakes for students because their proficiency level may determine their eligibility for a teaching assistantship, stipend amount, progress in the academic program, and so on. However, expertise and resources are generally limited, and we are constantly in search of feasible and scalable solutions, including making use of information that is readily available.

Using results from computer-based standardized language proficiency tests like the TOEFL iBT seems to be a particularly appealing solution, as universities already accept these scores for admission purposes and such information is typically accessible to ITA program administrators. Thus, many of us have been seriously considering using TOEFL iBT Speaking scores for ITA screening and certification, possibly setting cutoff scores between 23 and 28 or higher. Undoubtedly, this is a reasonable use of the TOEFL iBT Speaking scores. However, such secondary use of these scores must be validated empirically to support this test as an appropriate measure of speaking ability in contexts representative of those where ITAs engage in different forms of instruction. Both decision makers and practitioners need to know whether, or to what extent, TOEFL iBT Speaking can assess language skills needed to teach subject content in instructional settings (as opposed to language skills needed for academic study).

For ITA professionals to assess ITAs in a valid way and provide further language training to help them effectively perform their teaching duties, validation research needs to supply empirical evidence to support TOEFL iBT Speaking “as a measure of speaking ability in instructional settings and the use of the scores for making decisions about teaching assistantship (TA) assignments” (Xi, 2007, p. 319). For that, researchers need to encompass an important aspect—the target domain of language use. However, compared to other topics (e.g., intelligibility and comprehensibility, cultural awareness, perceptions and attitudes towards ITAs, ITA training, university policies for ITA practice), descriptions of the language features characteristic of ITA discourse in instructional settings are rare, with previous studies in this vein being limited to discourse markers and textual features in classroom talk.

These issues motivated me to investigate the ITA target language domain and determine if TOEFL iBT Speaking scores can be used for ITA screening and certification purposes. Making judgements about the appropriateness and usefulness of these scores to either complement or replace institutional ITA assessments required evidence of whether the language elicited by this test’s tasks can be identified in authentic discourse produced by ITAs with different instructional roles. This, in turn, required first providing a comprehensive description of authentic ITA instructional discourse; that is, what ITAs do with language to accomplish teaching tasks and convey propositional content in instructional encounters with their undergraduate students.

Consequently, my study at the intersection of ITA instruction and testing focused on functional language, which is defined in systemic functional linguistics as “language that is doing some job in some context” (Halliday & Hasan, 1989, p. 10). Following the theoretical tenets of systemic functional linguistics to analyze ITAs’ use of functional language, my research group adopted and further refined a heuristic known as the knowledge framework (Mohan, 1989). This framework reflects how teachers may integrate language and content through six knowledge structures (KSs), which are conceptualized as pairs indicative of background knowledge and action knowledge. Each KS shown in Table 1 is associated with specific language functions (italicized) that are instantiated by specific language choices (exemplified in parentheses).[1]

Table 1. Knowledge Structures and Language Functions

Background Knowledge: KS1 Classification
classifying (contain, be a kind/type of)
defining (be called, mean)

Action Knowledge: KS2 Description
describing (there is/are/was/were, show)
comparing (similar to, unlike)
exemplifying (in this case, such as)
quantifying (“number”, many)
spatial positioning (inside, in front of)

Background Knowledge: KS3 Principles
explaining (in other words, reason why)
predicting (will, probably)
concluding (in short, to sum up)
hypothesizing (if…then, assume)
demonstrating cause-effect (because of, affect)
setting rules (as established, based on)
specifying means (how, using)
specifying ends (in order to, such that)

Action Knowledge: KS4 Sequence
reporting (say, according to)
indicating order (before, secondly)
indicating process (complete, lift)
instructing (look, yes/no/wh– questions)
narrating (tell, give an account of)

Background Knowledge: KS5 Evaluation
evaluating (good, disapprove)
conceiving ideas (reflect, mislead)
making judgments (criticize, doubt)

Action Knowledge: KS6 Choice
making choices (decide, would rather)
presenting options (either…or, instead of)
expressing desire (want to, let me)
advising (suggest, might want to)
expressing opinions (certainly, disagree)
presenting arguments (claim, advocate)

KS = knowledge structure.

Based on Mohan (1989) and Cotos (in press).

These KS and language function categories were examined in two corpora: an ITA speech corpus and a TOEFL iBT speech corpus. The corpus of ITA speech contained 119 texts (311,613 words) and was collected from 52 ITAs in 16 disciplines in three curriculum genres (laboratory, recitation, and lecture settings). The TOEFL iBT Speaking data, which was provided by the Educational Testing Service, served to create a principled compilation of 2,738 spoken responses (311,570 words) to both independent and integrated tasks gathered from 481 speakers.

Both corpora were annotated in terms of KSs and respective language functions by three coders, with different measures indicating acceptable levels of reliability (Cohen’s Kappa 0.75, Light’s Kappa 0.82, Conger’s Kappa 0.82). Then, we extracted subsets of annotated units per ITA curriculum genre, per ITA discipline, and per TOEFL iBT Speaking tasks 1–2 (independent) and 3–6 (integrated). Relative frequencies of the annotated units were calculated for each KS and language function, representing the proportion of a particular category over the total frequency of units annotated within the subsets. Subsequently, we analyzed these data using correspondence analysis to identify relative associations with curriculum genres, disciplines, and test tasks.[2] For the disciplines, we were able to use data only from chemistry, physics, and English because the other disciplines did not contain a sufficient number of texts for statistical analysis.
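The relative-frequency step described above can be illustrated with a minimal sketch. The counts below are invented for illustration and are not the study's data, and the function names are hypothetical. The second function computes standardized residuals under independence, which only hint at the kind of association that correspondence analysis explores; the analysis itself goes further by decomposing a matrix of (scaled) residuals of this kind, a step omitted here.

```python
from collections import Counter
from math import sqrt

# Hypothetical annotated units as (subset, knowledge structure) pairs.
# These counts are invented for illustration; they are NOT the study's data.
annotations = (
    [("lab", "KS4")] * 6 + [("lab", "KS5")] * 4
    + [("lecture", "KS4")] * 5 + [("lecture", "KS5")] * 5
)


def relative_frequencies(units):
    """Proportion of each category over all annotated units in its subset."""
    pair_counts = Counter(units)
    subset_totals = Counter(subset for subset, _ in units)
    return {
        (subset, category): count / subset_totals[subset]
        for (subset, category), count in pair_counts.items()
    }


def standardized_residuals(units):
    """(observed - expected) / sqrt(expected), assuming independence."""
    n = len(units)
    pair_counts = Counter(units)
    subset_totals = Counter(subset for subset, _ in units)
    category_totals = Counter(category for _, category in units)
    residuals = {}
    for subset in subset_totals:
        for category in category_totals:
            expected = subset_totals[subset] * category_totals[category] / n
            observed = pair_counts[(subset, category)]  # 0 if never annotated
            residuals[(subset, category)] = (observed - expected) / sqrt(expected)
    return residuals


rel = relative_frequencies(annotations)
res = standardized_residuals(annotations)
```

In this toy data, a positive residual for a subset and category pair means the category occurs more often in that subset than independence would predict, and a negative residual means it occurs less often.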

The frequency analysis of the ITA speech corpus showed that all knowledge framework categories were relatively equally distributed across the three curriculum genres and the three disciplines. The most frequent KS was KS4 (sequence), with instructing, not surprisingly, being the most commonly employed language function, followed by evaluating (KS5), describing (KS2), and expressing opinions (KS6). The ITAs made use of a variety of other functions, but those of KS1 (classification) were used the least. Furthermore, the correspondence analysis went beyond comparative frequencies to determine whether particular KSs and language functions could be related to particular curriculum genres and disciplines. Table 2 summarizes the associations found.

The analysis of the TOEFL iBT speech corpus allowed for a comparison of test-takers’ responses to speaking tasks and ITAs’ classroom discourse. Overall, all the KSs and language functions identified in ITAs’ speech were also found in test-takers’ responses. KS2 (description), KS5 (evaluation), and KS6 (choice) occurred with similar frequencies in the subsets of both corpora, with KS1 (classification) again being used the least by both ITAs and TOEFL test-takers. However, KS3 (principles) was more frequent in the TOEFL iBT speech corpus, while KS4 (sequence) appeared much more often in the ITA speech corpus. The language functions that were most prominent in both corpora were evaluating (KS5), describing (KS2), expressing opinions (KS6), predicting (KS3), indicating process (KS4), and indicating order (KS4); similarly infrequent were classifying and defining (KS1). Not surprisingly, instructing (KS4) was the most common function in ITAs’ speech but barely used by test-takers. Also, ITAs used explaining (KS3) twice as often as test-takers. Despite the similarity in frequency distributions, the associations between the language functions and comparable subsets of the two corpora were not quite the same (see Table 2). Specifically, defining (KS1), comparing (KS2), exemplifying (KS2), demonstrating cause-effect (KS3), making judgements (KS5), presenting arguments (KS6), and making choices (KS6) were associated only with the TOEFL tasks. The inference to be made from this is that the task prompts determine the test-takers’ use of functional language to a considerable extent.

Table 2. Associations of Knowledge Structures and Language Function Categories in the International Teaching Assistants Speech Corpus and the TOEFL iBT Speech Corpus Subsets

ITA Speech Corpus

Curriculum genres
KS1 Classification – recitation
KS5 Evaluation – lab
classifying (KS1) – recitation
specifying ends (KS3) – lab
narrating (KS4) – lecture
setting rules (KS3) – recitation
reporting (KS4) – lecture

Disciplines
KS1 Classification – Physics
KS3 Principles – Chemistry
KS4 Sequence – English
KS6 Choice – Physics
classifying (KS1) – Physics
spatial positioning (KS2) – Physics
quantifying (KS2) – Chemistry
setting rules (KS3) – Physics
hypothesizing (KS3) – Chemistry
concluding (KS3) – Physics and Chemistry
narrating (KS4) – English
reporting (KS4) – English

TOEFL iBT Speech Corpus

Independent tasks
KS5 Evaluation – Tasks 1–2
comparing (KS2) – Tasks 1–2
demonstrating cause-effect (KS3) – Tasks 1–2

Integrated tasks
KS3 Principles – Tasks 3–6
defining (KS1) – Tasks 3–6
exemplifying (KS2) – Tasks 3–6
demonstrating cause-effect (KS3) – Tasks 3–6
reporting (KS4) – Tasks 3–6
narrating (KS4) – Tasks 3–6
presenting arguments (KS6) – Tasks 3–6
making judgements (KS5) – Tasks 3–6
making choices (KS6) – Tasks 3–6

This snapshot of descriptive and comparative insights obtained through corpus-based analyses allows for the following conclusions:

  • ITAs in different disciplines use a wide range of KSs and language functions (six and 29, respectively) when performing different instructional roles. Their use of functional language is largely similar across instructional settings, although there may be associational differences depending on the discipline and curriculum genre.
  • TOEFL iBT Speaking can elicit most KSs and language functions identified in ITAs’ discourse in comparable ways. However, the test tasks do not elicit instructing (KS4).

From a validity standpoint, these conclusions support the assumption that the spoken performance of TOEFL iBT Speaking test-takers may contain language functions similar to those used in the target domain and, therefore, the use of scores for ITA assessment-related purposes is plausible. Nevertheless, major implications rest with the institutional, in-house ITA testing and subsequent ITA training. In-house test items may need to be revised such that the construct definition includes functional language. Similarly, the curriculum may need to be adjusted to account for and create opportunities for practicing the characteristics of functional language in curriculum genres (e.g., recitation – classification, lab – evaluation) and in the disciplines (e.g., physics – classification and choice; chemistry – principles; English – sequence).

Recommended Readings

Cotos, E., & Chung, Y.-R. (2019). Functional language in curriculum genres: Implications for screening international teaching assistants. Journal of English for Academic Purposes, 41, 100766. https://doi.org/10.1016/j.jeap.2019.06.009

Cotos, E., & Chung, Y.-R. (2018). Domain description: Validating the interpretation of TOEFL iBT® Speaking scores for international teaching assistant screening and certification purposes. TOEFL Research Reports, No. RR-85. Educational Testing Service. https://doi.org/10.1002/ets2.12233

References

Cotos, E. (in press). Corpus-based knowledge framework analysis: A deliberation of methodology and outcomes. In T. Slater (Ed.), Social practices in higher education: A knowledge framework approach to linguistic research and teaching. Equinox.

Halliday, M. A. K., & Hasan, R. (1989). Language, context and text: Aspects of language in a social-semiotic perspective. Oxford University Press.

Mohan, B. A. (1989). Knowledge structures and academic discourse. Word, 40, 99–115. https://doi.org/10.1080/00437956.1989.11435799

Xi, X. (2007). Validating TOEFL® iBT Speaking and setting score requirements for ITA screening. Language Assessment Quarterly, 4, 318–351. https://doi.org/10.1080/15434300701462796


Elena Cotos is an associate professor of applied linguistics and the director of the Center for Communication Excellence at Iowa State University. Her select works can be accessed through the Digital Repository.

[1] More detailed descriptors can be found in Cotos (in press).

[2] Correspondence analysis is a statistical method used in exploratory investigations of associational relationships among multiple categorical variables.