Smarter Balanced tests are still a work in progress
Doug McRae
The Smarter Balanced Assessment Consortium provided a sneak peek for their final computer-adaptive tests in early October, tests to be administered to roughly 25 percent of the country's grade 3-8 and grade 11 students in spring 2015 to measure, initially, status and, eventually, growth in achievement on the new Common Core academic standards for English Language Arts and Mathematics. The peek reveals the prospective tests are a work in progress – tests that I believe won't be ready for prime time until at least spring 2016.
The sneak peek was provided via the process Smarter Balanced is using to determine "cut scores" for test results, or essentially how many test questions a student must answer correctly to be labeled proficient.
Smarter Balanced officials have yet to decide how the achievement categories will be labeled. They have indicated they will have four achievement categories for their results, which for now are just labeled Category 1, 2, 3, 4. For this commentary, I will use the labels below basic, basic, proficient, advanced or even A, B, C, D as substitutes for the concept of achievement categories.
The Smarter Balanced process involves structured judgments for test questions they plan to use in the spring of 2015. Judgments were elicited from volunteers who signed up for a three-hour online session to review actual test questions and provide a judgment where a Category 3 or proficient "cut score" should be placed. The results of the online exercise were to be provided to more than 500 teachers and others nominated by 17 states to participate on "in-person" panels in mid-October to undergo formal cut-score-setting exercises for the 14 tests being developed by Smarter Balanced.
The Smarter Balanced process also involved two panels (one each for ELA and Math) to coordinate proposed cut scores across grade levels. Recommended cut scores were to be endorsed by Smarter Balanced member states on Nov. 6, but that portion of the process, approval of the recommended cut scores, has been delayed.
We should be reminded that the actual Smarter Balanced tests for spring 2015 have not yet been finalized. Analyses from the Smarter Balanced field tests that students took in spring 2014, designed primarily to qualify test questions for use in final tests, have not yet been completed.
But the exercise that I participated in did provide a set of test questions that mirrored Smarter Balanced plans for their final tests, and a set of questions that mirrored the proposed balance between multiple-choice (and other test questions that can be scored electronically) and open-ended test questions that are needed to test many of the new Common Core academic standards in depth.
So, with care taken not to disclose any of the secure material involved in the online exercise, what were the observations of this experienced K-12 testing program designer?
I did the online exercise for grade 3 English Language Arts, and for this grade level and content area traditional multiple-choice questions dominated. In fact, 84 percent of the questions were either multiple-choice or "check-the-box" questions that could be electronically scored, and these questions were very similar or identical to traditional "bubble" tests. But 16 percent of the questions were open-ended questions, which many observers say are needed to measure Common Core standards.
The online exercise used a set of test items with the questions bundled in sequence by order of difficulty, from easy questions to difficult questions. The exercise asked the participant to identify the first item in the sequence that a Category 3 or proficient student would have less than a 50 percent chance to answer correctly. I identified that item after reviewing about 25 percent of the items to be reviewed. If a Category 3 or proficient cut score is set at only 25 percent of the available items or score points for a test that has primarily multiple-choice questions, clearly that cut score invites a strategy of randomly marking the answer sheet. The odds are that if a student uses a random marking strategy, he or she will get a proficient score quite often. This circumstance would result in many random (or invalid and unreliable) scores from the test, and reduce the overall credibility of the entire testing program.
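The random-guessing risk described above can be illustrated with a back-of-the-envelope binomial calculation. The test length (40 items) and the four-option multiple-choice format are hypothetical numbers chosen for illustration, not figures from the actual exercise:

```python
from math import comb

def p_at_least(n, k, p):
    """Probability of at least k successes in n independent trials,
    each with success probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_items = 40                 # hypothetical test length
cut = round(0.25 * n_items)  # proficient cut placed at 25% of the items
p_guess = 0.25               # chance of guessing a 4-option item correctly

prob = p_at_least(n_items, cut, p_guess)
print(f"Chance a purely random marker reaches the cut: {prob:.0%}")
```

Under these assumptions the expected score from pure guessing lands exactly at the cut, so a random marker clears it more often than not – the "quite often" the commentary warns about.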
It troubled me greatly that many of the test questions later in the sequence appeared to be far easier than the item I identified as marking a Category 3 or proficient cut score, per the directions for the online exercise. I found at least a quarter of the remaining items to be easier, including a cluster of clearly easier items placed about two-thirds of the way into the entire sequence. This calls into question whether or not the sequence of test questions used by Smarter Balanced was indeed in difficulty order from easy to difficult items. If the sequence used was not strictly ordered from easy to hard test questions, then the results of the entire exercise have to be called into serious question.
There were several additional concerns about the Smarter Balanced cut-score-setting exercise this October that are too technical for full discussion in this commentary. Briefly, the exercise appeared not to include any use of "consequence" information that typically is included in a robust cut-score-setting process. Consequence information is estimated information on what percent of students will fall in each performance category, given the cut scores being recommended. I also questioned whether the spring 2014 Smarter Balanced field test data were used to guide the exercise in any significant way. Indeed, since the 2014 Smarter Balanced field test was essentially an item-tryout exercise, an exercise designed to qualify test questions for use in final tests, it did not generate the type of data needed for final cut score determinations in a number of significant ways.
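To make the "consequence" (impact) data idea concrete, here is a minimal sketch of how such percentages are computed from a score distribution and a set of candidate cut scores. The scores and cut points are hypothetical values for illustration, not Smarter Balanced data:

```python
import bisect

def impact_data(scores, cuts):
    """Percent of students landing in each performance category,
    given ascending cut scores (lower bounds of Categories 2, 3, 4)."""
    ordered = sorted(scores)
    n = len(ordered)
    # index of the first score reaching each cut score
    edges = [0] + [bisect.bisect_left(ordered, c) for c in cuts] + [n]
    return [100 * (hi - lo) / n for lo, hi in zip(edges, edges[1:])]

scores = [12, 18, 22, 25, 27, 30, 33, 35, 38, 44]  # hypothetical scale scores
cuts = [20, 28, 36]                                # hypothetical cut scores
print(impact_data(scores, cuts))  # → [20.0, 30.0, 30.0, 20.0]
```

Run against the full field-test score distribution, this kind of table lets panelists see whether proposed cuts would, say, label half the state below basic before the cuts are finalized.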
Smarter Balanced calls their 2015 test administration an "operational" test. But any operational test needs more than qualified test questions to yield valid scores. It must also have valid scoring rules to generate meaningful scores for students, for teachers, for parents and valid aggregate scores for schools, districts and important subgroups of students.
It is quite clear to me that the cut-score-setting exercises conducted by Smarter Balanced this month will not produce final or valid cut scores for timely use with spring 2015 Smarter Balanced tests. Spring 2015 tests will instead be benchmark tests (to use test development parlance), tests that yield information that then can be used to generate valid cut scores. That exercise will have to wait for September 2015 at the earliest. The Smarter Balanced website recognizes this by labeling the cut scores recommended in October 2014 as "preliminary" cut scores, to be validated by spring 2015 data.
California plans to use the cut scores recommended by the panels that met in October for disseminating millions of test scores in spring 2015. These plans are faced with the prospect that those scores will have to be "recalled" and replaced with true or valid scores just months after incorrect scores are disseminated. This is not a pretty picture for any large-scale statewide assessment program.
The bottom line: Smarter Balanced tests are still a work in progress. I think it will be spring 2016 before Smarter Balanced tests will be able to generate valid, meaningful test scores in a timely manner for California students.
• • •
Doug McRae is a retired educational measurement specialist who has served as an educational testing company executive in charge of design and development of K-12 tests widely used across the United States, as well as an adviser on the initial design and development of California's STAR assessment system.
The opinions expressed in this commentary represent solely those of the author. EdSource welcomes commentaries representing various points of view. If you would like to submit a commentary, please contact us.
Source: https://edsource.org/2014/smarter-balanced-tests-are-still-a-work-in-progress/69828