Statistic of the Month: 2011 TIMSS and PIRLS Results

By Emily Wicken

In December, the results of the 2011 administration of the Trends in International Mathematics and Science Study (TIMSS) and Progress in International Reading Literacy Study (PIRLS) were published in three separate reports, each examining international performance in reading (at the fourth grade level), math (at the fourth and eighth grade levels) and science (at the fourth and eighth grade levels).  These assessments provide a picture of international student performance in the years before a student reaches the age of 15, which is the age at which students take the OECD’s Programme for International Student Assessment (PISA).  However, there are some central differences between the TIMSS/PIRLS and PISA assessments.  Michael Martin, the Co-Executive Director of the TIMSS and PIRLS Study Center at Boston College, notes that while PISA is intended to measure a student’s general skills in the arenas of reading, math and science, TIMSS and PIRLS are more focused on content mastery.  Additionally, Jack Buckley, the commissioner of the National Center for Education Statistics, has pointed out that the countries participating in the two sets of assessments vary – the TIMSS and PIRLS groups are smaller and represent a mixture of countries at different levels of economic development as compared to the participants in PISA.

Because of the differences between the assessments, the countries that are in the top ten or fifteen of the TIMSS and PIRLS rankings are somewhat different from the top performers on the most recent administration of PISA, in 2009.  While league tables of the top countries based on their average scores always garner the most press when the results of international assessments are released, we decided to take a more in-depth look at what level of proficiency students in the top fifteen countries are actually reaching in these subjects.

The IEA has established four “international benchmarks” on its score scale for these assessments.  While the score scale for both PIRLS and TIMSS runs from 0 to 1000, the vast majority of scores fall between 300 and 700.  The IEA has identified a score of 400 as the “low” international benchmark, indicating that students at this score point have been educated to a “basic” level.  Beyond that, there is a score of 475, or “intermediate”; a score of 550, or “high”; and a score of 625, or “advanced.”  Below, we have plotted the percentage of students at each benchmark in the top fifteen countries on the 2011 administration of PIRLS and TIMSS.  This is useful when thinking about the top performers, because it shows, in a clearer way perhaps than the average scale score, what students in each country are really able to do.
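
To make the benchmark cut scores concrete, here is a minimal sketch in Python of how a scale score maps to the highest benchmark reached, and how a set of scores turns into the kind of percentages plotted in the charts.  The cut scores are the ones quoted above; the function names, the "none" label, and the sample scores are hypothetical, and the sketch assumes that a score exactly at a cut point counts as reaching that benchmark.

```python
from bisect import bisect_right

# Cut scores quoted in the article for the four international benchmarks.
BENCHMARKS = [(400, "low"), (475, "intermediate"), (550, "high"), (625, "advanced")]


def highest_benchmark(score):
    """Return the highest benchmark a score reaches, or None if it is below 400.

    Assumption: a score equal to a cut score counts as reaching that benchmark.
    """
    cuts = [cut for cut, _ in BENCHMARKS]
    idx = bisect_right(cuts, score)  # number of cut scores at or below `score`
    return BENCHMARKS[idx - 1][1] if idx else None


def benchmark_shares(scores):
    """Return the percentage of scores whose highest benchmark is each level."""
    counts = {"none": 0}
    counts.update({name: 0 for _, name in BENCHMARKS})
    for score in scores:
        counts[highest_benchmark(score) or "none"] += 1
    return {name: 100 * n / len(scores) for name, n in counts.items()}


# Hypothetical scores, for illustration only (not real TIMSS or PIRLS data).
sample_scores = [350, 420, 480, 560, 640]
print(benchmark_shares(sample_scores))
```

Each student is counted once, at the highest benchmark reached, so the shares sum to 100 percent; adding the “high” and “advanced” shares gives the kind of combined figure cited in the discussion of each chart.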

Chart1
In the fourth grade PIRLS reading assessment, a student who reaches the “low” international benchmark is able to “locate and retrieve an explicitly stated detail” in a literary text, and “locate and reproduce explicitly stated information … at the beginning of the text” in an informational text.  By contrast, at the “advanced” international benchmark, students are able to “integrate ideas and evidence across a text,” and “distinguish and interpret complex information from different parts of a text,” among other skills.

The chart above, like the others to follow, is organized from top to bottom in the order of average scale score.  However, a country’s rank by average scale score does not always match its rank by the percentage of students reaching the “advanced” benchmark.  In this case it does not, though Hong Kong does have the highest proportion of students meeting either the “high” benchmark or “advanced” benchmark – 67 percent – while in the United States, just 56 percent of students meet those levels.  The tail of students either meeting the “low” benchmark or not meeting a benchmark is also significantly smaller in the top three countries – Hong Kong, the Russian Federation, and Finland (7, 8 and 8 percent, respectively) – than in the majority of the other countries.  This more specific data on student performance is useful in terms of thinking about a country’s overall performance, because it gives a clearer sense, potentially, of the equity of the school system, and the ability of the system to educate all students – or any students – to high levels.  It also demonstrates that there are clear differences in student performance between the top handful of countries and the rest of the countries rounding out the top ten or fifteen.

Chart2
For fourth grade math, in order to reach the “low” benchmark, a student must be able to demonstrate “basic mathematical knowledge,” such as adding and subtracting integers and being able to recognize familiar shapes.  At the “advanced” benchmark, a student must have an understanding of how to apply their knowledge, for example, by solving word problems with multiple steps, and they must show some understanding of more difficult concepts like fractions and decimals.

In the case of TIMSS fourth grade math, the percentage of students reaching the “advanced” benchmark does track the country’s average scale score, at least for the top six performers.  This chart indicates very clearly how well the East Asian countries do compared to the rest of the world in instilling advanced-level math skills in their students, even at an early age, with about a third of students or more reaching the “advanced” benchmark in Singapore, Korea, Hong Kong, Taiwan and Japan, and an overwhelming majority reaching either the “advanced” or “high” benchmarks in all cases.  These countries also have the smallest proportions of students who failed to meet the most basic level.  By contrast, starting with Northern Ireland, which is in sixth place in the overall league table in this subject, the other countries have higher proportions of students failing to reach at least the “intermediate” benchmark, and generally much lower proportions of students reaching the “advanced” benchmark.

Chart3
In eighth grade math, students at the “low” benchmark “have some knowledge of whole numbers and decimals, operations, and basic graphs.”  At the “advanced” level, students are able to demonstrate many mathematical skills, such as solving linear equations, reasoning with geometric figures, and expressing generalizations algebraically.

The pattern in proficiency seen in the TIMSS fourth grade math results continues in the TIMSS eighth grade math results.  Andreas Schleicher from the OECD and US Education Secretary Arne Duncan have commented on the drop in math and science skills from fourth grade to eighth grade in the United States, and the data bears this out.  In fourth grade, 47 percent of American students met either the “high” or “advanced” benchmarks; in eighth grade, just 30 percent of students did.  Furthermore, twice as many American students – 8 percent – failed to meet any benchmarks in eighth grade as in fourth grade.  In Singapore, however, the share of students meeting the “advanced” or “high” benchmark holds steady at 78 percent in both grades, and the other East Asian countries also do not lose any substantial ground.  Taiwan increases the share of students at the “advanced” level from 30 percent in fourth grade to about half (49 percent) in eighth grade.

Chart4
In fourth grade science, students at the “low” benchmark “show some elementary knowledge of life, physical and earth sciences,” and “demonstrate knowledge of some simple facts … interpret simple diagrams, complete simple tables, and provide short written responses to questions requiring factual information.”  At the “advanced” benchmark, students can “apply knowledge and understanding of scientific processes … and show some knowledge of the process of scientific inquiry.”  Additionally, “they have a beginning ability to interpret results in the context of a simple experiment, reason and draw conclusions from descriptions and diagrams, and evaluate and support an argument.”

On the TIMSS fourth grade science assessment, the East Asian countries do not dominate in terms of student proficiency at the “advanced” benchmark as completely as they do in math, although perennial top performers South Korea and Singapore still top the list on this measure.  Fewer students across the board seem to have reached the “advanced” benchmark in science as compared to reading and math.  The United States seems to have a particular problem in this subject, with 19 percent of students either failing to meet any benchmark or only meeting the “low” benchmark.

Chart5
At the eighth grade level in science, students meeting the “low” benchmark are expected to “recognize some basic facts from the life and physical sciences,” and can display this knowledge by “interpret[ing] simple diagrams, complet[ing] simple tables, and apply[ing] basic knowledge.”  Students at the “advanced” level can “communicate an understanding of complex and abstract concepts in biology, chemistry, physics and earth sciences.”  They also “understand basic features of scientific investigation … [and] combine information from several sources to solve problems and draw conclusions, and … provide written explanations to communicate scientific knowledge.”

As in fourth grade science, fewer students across the board seem to reach the “advanced” benchmark.  The United States sees a 5 percentage point decline in the share of students reaching the “advanced” benchmark from fourth to eighth grade, and a 4 percentage point decline in the share reaching the “high” benchmark.  This is compounded by a large jump in the percentage of students who either do not meet any benchmarks (7 percent compared to 4 percent) or meet only the “low” benchmark (20 percent compared to 15 percent) – together, more than a quarter of all US eighth grade students.

A separate, but equally interesting, set of data from the 2011 PIRLS results is the level of proficiency of students in two types of reading – literary and informational – as compared to a country’s overall score.  Debates over the value of each type of reading as emphasized in a curriculum have been raging for some time now, and while the PIRLS data does not settle this debate, it does provide interesting new fodder for the discussion.

Chart6
The chart above depicts the overall average reading score on PIRLS, which is administered to fourth grade students, for the top fifteen systems on that assessment, as well as the average score on the literary reading tasks and on the informational reading tasks.  The top performing countries (Hong Kong, the Russian Federation, Finland and Singapore) all have average informational reading scores that are higher than or equal to their overall reading score, with literary reading scores somewhat lower than or equal to both the overall score and the informational score.  By contrast, the United States, Northern Ireland, Denmark, Ireland, Canada and England all display the opposite trend – literary reading scores that are higher, often by a statistically significant margin, than either their informational reading scores or their overall scores.  In the case of the United States, Ireland and Northern Ireland, there is also a statistically significant difference between the lower informational reading score and the overall score.

This suggests that informational reading may, in fact, aid a student’s overall reading skills, at least as measured by the PIRLS assessment.  It is notable that several East Asian countries, including Singapore, Hong Kong, and Taiwan, all of which traditionally do very well in the math and science assessments, also have students who perform better on informational reading tasks than on literary reading tasks.  In the case of Hong Kong and Singapore, this results in a very high overall score.  In Taiwan, the informational reading score is extremely high compared to the literary reading score, and actually fairly comparable to Singapore’s informational reading score.  However, in this case the literary reading score of Taiwan’s students brings the overall score down, suggesting a need for balance.  On that front, Finland seems to have gotten it just right; the informational, literary and overall scores are indistinguishable from one another, and are all very high.