Computer adaptive assessment, fairer approaches to assessing English learners, virtual worlds for assessing science inquiry--will the next generation of state assessments embrace these ideas to give a fuller sense of what students are learning? This morning, PACE of California and the Rennie Center of Massachusetts released three papers exploring these frontiers in assessment. I flew to D.C. to introduce a panel discussion of the three papers, collectively called The Road Ahead for State Assessments.
Mark Reckase of Michigan State University started off the session by discussing the promise and difficulty of computer adaptive assessment, which is used in tests as diverse as the GRE for graduate school and the Army's vocational aptitude test. In computer adaptive assessment, the difficulty level of later questions is determined by how a student answers earlier questions. Because the test's difficulty adjusts to the student, it can discriminate more finely among students' skill and knowledge levels with fewer questions. Computer adaptive tests are also better at distinguishing student performance at the tails of the ability distribution. A conventional test has only a few very easy or very difficult questions, with most questions in the middle; a computer adaptive test can offer many questions at about the level that matches the student's knowledge. But such a test requires a very large question bank and lots of computer hardware that is regularly updated. One of the two new large assessment consortia, the SMARTER Balanced Assessment Consortium, or SBAC, plans to include computer adaptive testing in the model it rolls out in 2014.
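The adaptive loop Reckase described can be sketched in a few lines of code. This is a deliberately simplified illustration, not how operational tests like the GRE work: real systems estimate ability with item response theory models, while here the question bank, the step-size update rule, and the 0-to-1 ability scale are all invented for the example. Each question is chosen near the current ability estimate, and the estimate moves up after a correct answer and down after an incorrect one:

```python
def run_adaptive_test(bank, answers_correct, start=0.5, step=0.25):
    """Toy adaptive test: pick each question closest in difficulty to the
    current ability estimate, then nudge the estimate after each answer."""
    ability = start
    asked = []
    for correct in answers_correct:
        # Choose the unused question whose difficulty best matches the estimate.
        q = min((item for item in bank if item not in asked),
                key=lambda item: abs(item["difficulty"] - ability))
        asked.append(q)
        # Move the estimate toward harder or easier questions.
        ability += step if correct else -step
        step *= 0.8  # shrink adjustments as the estimate stabilizes
        ability = min(max(ability, 0.0), 1.0)
    return ability, [q["id"] for q in asked]

# Hypothetical five-item bank with difficulties spread from easy to hard.
bank = [{"id": i, "difficulty": d}
        for i, d in enumerate([0.1, 0.3, 0.5, 0.7, 0.9])]
estimate, order = run_adaptive_test(bank, [True, True, False])
```

With two correct answers followed by a miss, the sketch serves items 2, 3, then 4, moving quickly toward the hard end of the bank; this is the sense in which adaptive tests reach the tails of the ability distribution with few questions.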
Robert Linquanti of WestEd next discussed the complex question of testing English language learners in content areas other than English proficiency. He pointed out that group reporting on how "English learners" are performing is always problematic, because unlike other demographic subgroups, the subgroup "English learners" is a revolving door. New immigrants arrive all the time, while those who have finally mastered English (and are therefore likely to be higher performers in other content areas as well) are moved out of the category.
Lack of academic English knowledge can lower students' test scores even when they actually understand content well. Linquanti suggested a number of accommodations, such as English dictionaries or glossaries, use of "plain English," math or science tests involving very little language, and others, to allow better estimates of what English learners actually understand in content areas such as math and science. He also suggested that while English proficiency should be reported as a function of time in the English learning system, proficiency in other content areas should be reported as a function of both time in the system and English proficiency.
Finally, Chris Dede and Jody Clarke-Midura discussed the possibility of measuring complex learning in science through immersive virtual environments. In their research, students choose an avatar that interacts with a scientific problem in a virtual world, generating hypotheses, gathering data, making inferences, and arguing from evidence. These virtual performance assessments can be more consistent and less expensive than hands-on performance assessments, and they offer promise for understanding how students use the tools of scientific inquiry. Dede and Clarke-Midura suggest that such assessments should form part of a comprehensive state assessment system.
The challenge for SBAC and the other assessment consortium, PARCC, is that they need to create a complete testing system in two subject areas for use in 2014. With such a short timeline, the temptation will be great to settle for systems only a little better than what we have rather than pursue groundbreaking innovation. But such innovation is what we need to reverse the curriculum-narrowing impact of current standardized tests and to encourage the teaching of more complex thinking and problem-solving skills. Educators interested in these issues should definitely take a look at the Rennie Center/PACE report.