I just want to share a few comments on Alan Davies’s contribution to Language Testing Reconsidered.

Alan Davies is another researcher that I met at LTRC 2008. He is very knowledgeable on the subject of language testing, and he is interested in exploring and answering deep-level issues related to language testing, not simply debating which statistical method is best for a certain analysis. His questions do not result in simple answers (which may be why many novice attendees at LTRC did not seem interested in what he had to say), but I admire his drive to raise these socially responsible issues even though he is far enough along in his career that he could simply sit back and rehash his previous research.

In this chapter, Davies explains the revisions in the test that eventually became IELTS. It is interesting to note that Davies seems almost aloof from the language testing ideologies that he describes. It’s as if he is fully aware of the perspective outlined in Bachman’s chapter but doesn’t even bother to get into the debate because, from his perspective, the approaches are all variations of the same thing: a new name and a new method, but still resulting in the same kind of measure.

So although my take on Davies’s chapter is that he objectively details the evolution of testing while implying that it’s all rather silly, I paid extra attention to his comments about defining academic language and how current trends aim to assess it.

“There is some consensus in the notion of an integrated set of language skills required to socialize students into the acquisition of academic language: ‘writing…is not…a stand-alone skill but part of the whole process of text response and creation; when students use both reading and writing in crucial ways, they can become a part of the academic conversation – they signal their response to academic ideas and invite others to respond to their ideas in turn’ (Hamp-Lyons and Kroll, 1997, p. 19)” (p. 75).

If we believe that involving students in the academic conversation is important, then using test tasks that simulate that conversation (such as integrated tasks) could get closer to that reality. Of course, it’s possible that Davies is indifferent to these kinds of tasks: we may still end up with the same kinds of scores whether or not we use these better tasks. It may have more to do with face validity and pandering to stakeholders than it has to do with measuring a better construct (p. 84).

Then why do we do it? If not for better scores, what value do better tests hold? Davies comes to the same conclusion that I do: better tests have a more positive impact on learning, aka washback. If we test in better ways, then students will focus on learning more valuable aspects of language rather than simply learning what they need to pass the test. “In other words, test validity must now take account of washback or, even more widely of test impact (Hawkey, 2006)” (p. 84).

