Cross-posted at Education Week.

Forgive me if I am a little cynical about the eternal dance between measurement and accountability when it comes to reporting on the progress and achievement of American school children.  From the beginning, the leaders of our state education systems have invited testing experts to help them set the cut points for passing or not passing the state tests.  They listen gravely to the advice of the experts, then ask them how many students will fail at the recommended cut point and set a new one at a point that is politically tolerable.

The heads of municipal school systems for a long time picked the test they would use to report student performance from vendors who offered to compare the performance of their students to that of any of many different student bodies elsewhere.  The superintendent would pick the comparison that would make their district look the best.  All the insiders knew that was how it worked.  Only the public was fooled.

When George W. Bush became President, he wanted to hold every state, district and school to a common national standard.  He couldn’t get what he wanted, but he did the next best thing.  He required all the states to participate in the National Assessment of Educational Progress, “The Nation’s Report Card.”  For the first time, the performance of all the states could be compared on a common metric.

Well, that was interesting.  President Bush’s signature education program, No Child Left Behind, required each state to set its own standards for student performance and then commit to reaching those standards by 2014.  These standards, of course, could be different.  It turned out, rather famously, that the states claiming to make the most progress toward reaching their standards were those that performed the worst on NAEP.  The governors of the states that ended up with egg on their face were among the strongest supporters of the development of the Common Core State Standards.  They did not want to be embarrassed that way again.

But, as we all know now, not all state versions of the Common Core are the same and there are a number of states that have not embraced the Common Core in any form.  And the state consortia formed to create common assessments of the Common Core have withered on the vine, so there is now no prospect that the Common Core and its associated tests will enable all schools, districts and states to compare their performance to all the others on a common, honest metric.

That leaves NAEP.  The way NAEP is done permits the observer to compare scores among states for the grade levels and subjects it assesses.  But what do those scores mean?  In an attempt to answer that question, the NAEP Governing Board, starting a quarter century ago, settled on three distinct levels of performance: NAEP Basic, NAEP Proficient and NAEP Advanced.  The Board’s policy statements define those terms and describe the process by which the Board will decide on the cut scores that demarcate the boundaries between performance levels.

What makes this moment special is that NAEP is now, today, engaged in the first major revision of the procedures by which those standards are set.  You have a chance to weigh in.  Look here to see how you can make your views known to the Board.

This policy is really important, because the views that Americans have about the performance of their schools are significantly affected by the generous press attention that the NAEP reports routinely get.  But what those reports mean has a lot to do with how NAEP defines performance.  That is what this blog is about.

Let’s take the meaning of the word “proficient.”  The new draft standards say proficient means “…solid academic performance for each NAEP assessment. Students reaching this level have demonstrated competency over challenging subject matter, including subject-matter knowledge, application of such knowledge to real world situations, and analytical skills appropriate to the subject matter.”

Hmmmm…  What does “solid” mean?  Who defines what it means to be “competent”?  What is terribly “challenging” to one child might be super easy for another.  This definition is quite obviously a matter of judgment.  And that is the way the issue is treated by the draft.  It says a panel of “subject matter experts will be convened to recommend achievement level cut scores….” What really counts here is their opinion.

And then, of course, it says that these subject matter experts do not decide on the cut scores, but instead make recommendations to the full NAEP Board.  It explicitly directs that the Board have information on the effects of setting the cut scores at different levels—that is, how many students are likely to be found proficient.  That strongly suggests that political judgment will play a decisive role in cut score setting, just as it has always done at the state level.

But then the document says that these judgments need to be “valid.”  You would think that would mean coming up with empirical data showing that students said to be proficient actually are, in some commonsense meaning of the word, proficient.  But it does not mean that.  It means that there is empirical evidence that students who are said to be proficient do in fact have the capacities specified in the definition.  But that is circular.  “What do you mean by challenging?”  Answer: “Whatever I have measured.”

What would make it uncircular?  Answer:  Knowing whether a student is proficient or not would have some meaning for me if I knew whether that student could do something in particular that is important to me or the student.  For example, whether that student is ready for college or ready for a career.  Now, you say, fine, I can go with that, but can you?

Ready for which college?  The fly-by-night “institution” down the street that offers no instruction but is ready to take your college loan money this afternoon or Michigan State?  Ready for which career? A career as a cashier in a fast food chain or a career in finance on Wall Street?   The Department of Education’s draft provides no guidance on any of these points.

Nor does the history here give this reader confidence that the process described will get this right unless such guidance is provided. The NAEP Governing Board has done a good job of sponsoring research that correlates scores on NAEP with certain college outcomes and workers’ incomes.  But that does nothing to tell students, parents or teachers or even state policy makers what students need to study or how well they have to do to be college and career ready. If the members of the Board think that one can only be proficient in mathematics at the 12th-grade level if one has demonstrated a thorough command of the topics typically included in an Algebra II course, then the people who construct the test will include a lot of Algebra II questions on the test and policy makers will tell the schools everyone has to take Algebra II. If the Governing Board says that, in their judgement, that is what proficient means, who is to say they are wrong?

Actually, me.  One does not have to have mastered the content of the typical Algebra II course to succeed in college or career. This is how I know that:

First, a very large fraction of high school students going to college in the United States either do all their college work in a community college or take their first two years of a four-year college program in a community college.

Second, the nation’s primary provider of career training, meaning vocational education and training, is the nation’s community colleges.

Third, successfully completing the first year of a typical community college program is a good predictor of the likelihood that the individual will successfully complete a two-year degree program or acquire an occupational certificate of value to an employer.

Fourth, it follows from “1,” “2” and “3” above that, if one cannot succeed in the first year of a typical community college program, one’s chances of succeeding in further college or career are slim, and the converse is also true.  So, one could reasonably say that whether or not the high school student is ready for success in the first year of a typical community college program is a very good measure of the degree to which the student is “ready for college and career.”   It does not mean the student has a high probability of success in the first-year program at Stanford or in the first year of a program designed to train medical technicians to administer and interpret sonograms, but it does specify a standard of proficiency that is specific and broadly applicable, a standard that would have intuitive appeal to millions of American students, parents and college admissions officers.  In most advanced industrial countries, there are one or more high school leaving credentials that are matched to the requirements for going on to university or to advanced occupational training.  Would it not make sense for “proficient” to mean just that, using the demands of credit-bearing courses in the first year of our community colleges as the benchmark standard?  To do this right, the National Assessment Governing Board would have to know not just what cut score to use on a general test of mathematics, but what topics in mathematics would have to be mastered to what level to enable the student to succeed in College Mathematics or College Algebra.

Our organization has done the research needed to establish the benchmarks for such a standard, at least with respect to reading, writing and numeracy.  We did not do it the way NAEP has done it, by asking “experts” what they think the standard ought to be.  Industrial psychologists found out years ago that that approach almost never actually ends up with descriptions of what is required to do a particular job or the level and kind of education and training needed to do that job.  Instead they study the job itself, the way real people do it and then use that information to figure out what sort of education and training they need.

That’s what we did.  We gathered the most widely used texts in the most commonly taken initial credit-bearing courses in a randomly selected set of community colleges and asked leading reading experts to determine their reading level.  We asked for graded writing samples from typical assignments given to students along with scored exams and had those reviewed by leading writing experts.  And we gathered the texts for the courses called College Mathematics and College Algebra and had them reviewed by the nation’s leading mathematics experts.

It turns out that College Mathematics and College Algebra are mostly topics covered in Algebra I, and a little geometry, statistics and probability.  Students leaving high school do not need to be proficient in Algebra II in order to study Algebra I.  The first-year texts are mostly written at the 12th-grade level, but a large fraction of our high school graduates cannot comprehend what is written in them.  Many of the community college instructors told us that they do not assign writing to their students because the students cannot write and the instructors do not think they were hired to teach basic writing.  Yes, some of the NAEP benchmark standards for mathematics are well above any reasonable definition of “proficient.” But a standard of proficiency that was based on what it would actually take to succeed in the first year of community college would be way below the global standard for college level work in the advanced industrialized nations.

If I were on the NAEP Board, I would press for setting proficiency standards based on what empirical data—not anyone’s “expert opinion”—tell us about the content and performance requirements for success in the first year of the typical community college.  I would urge my fellow board members to adopt a policy for reporting to the American people on how many high school students reach that standard at the end of high school and how many students are on a trajectory to reach that standard in elementary and middle school.  I would push NAEP to tell the American people that this benchmark should be used by the states to set a target for what their students should achieve by the end of tenth grade, because that would represent a level of achievement for students of that age comparable to the level achieved by most students in the top-performing countries by that time and there is no reason why we should expect less.

And lastly, and most importantly, I would tell them that NAEP is the last redoubt, the last remaining hope that the United States will have an instrument that we can use to get an honest measure of how our students are doing.  If we lose that check, if anyone can say whatever they like about how our students are doing, ignorance will not be bliss.