Archive for the 'Assessment' Category

Don’t Be Rotten to the Core

Wednesday, October 2nd, 2013

I thought I’d take this opportunity, while the federal government is shut down over the question of its own power to legislate, to talk about another somewhat controversial initiative, namely the Common Core State Standards.

It should be noted that this is not a simple left-right issue. At a recent conference, I heard Kim Marshall joke that he never thought he’d see national standards because “the right doesn’t like national, and the left doesn’t like standards.” So, as you might expect, the Common Core seems to be embraced by moderates in both parties, while being attacked by extremists on both sides. Teachers and parents, who are the most directly affected by the changes, express the same range of opinions as policymakers and pundits. So, the discussion continues.

To get a sense of the issues involved, as well as the general tone, check out this New York Times editorial by Bill Keller, and this response by Susan Ohanian.

For the record, I agree with the Bill Keller editorial (you can just change that “K” into an “H” and we’re good). I’m a fan of the Common Core, though I have a number of concerns about the way it’s being implemented. But I respect the opinions of many who oppose it, and understand the quite valid reasons why they do. Unfortunately, most of the rhetoric that I encounter against the initiative is either focused on areas that have very little to do with the standards themselves or based in a fog of misinformation.

Now, if you’ve read the standards, and you honestly believe that we should not want our students to be able to cite evidence from informational texts to support an argument, I’m very willing to have that conversation. If you think the Common Core shifts aren’t the right direction for our students, I’m very willing to have that conversation. If you have a problem with emphasizing literacy in the content areas, I’m very willing to have that conversation. That’s just not the conversation I’ve been hearing about the Common Core, and if we’re going to discuss these very large-scale changes in the way they deserve to be discussed, we need to clear the air of distractions and distortions.

With that in mind, I present the Top Ten Most Common Objections to the Common Core, and my responses to them. This is meant to be the beginning of a conversation and not the last word, so please feel free to continue the discussion in the comments section below.

1. The Common Core is too rigorous. The standards are not developmentally appropriate.

I think we’re feeling that now because we’re transitioning into these standards from a less rigorous system. If students come in on grade level, what they’re being asked to learn in each year is very reasonable. The problem is that we’re so far from that “if,” that the standards can often seem very unreasonable. Add to that a rushed implementation, complete with career-destroying and school-closing accountability, and the Common Core expectations can leave a very bad taste in our mouths.

What’s more, the Common Core includes qualitative shifts as well as quantitative shifts, so students will be as unfamiliar with the new ways of learning as their teachers are. The good news is that each year we implement the Common Core, students will become more used to Common Core ideas such as text-based answers and standards of mathematical practice, and will be better prepared for the work of their grade each year. It will likely get worse before it gets better, but I do think there is a light at the end of the tunnel, and it will at least become visible in the next year or two.

2. The Common Core is not rigorous enough. My state had better standards before.

Well, the standards are meant to represent only the minimum of where students need to be in their grade level in order to be on track for college and career readiness by the end of Grade 12. So if you can meet these standards and then exceed them, more power to you. States that adopt the Common Core are also free to supplement it with standards of their own, up to an additional 15%.

So here in New York State, we added Pre-K standards that aren’t in the national version, we put in additional standards throughout the documents (including Responding to Literature standards in ELA and teaching money in early-grade math classrooms), and we still retain the state-wide content standards in social studies and science that students need to pass their Regents. And even where states are slacking, high-performing schools won’t suddenly lower their standards just because they can. That’s not how they became high-performing schools in the first place.

3. The Common Core is a mandated top-down program that infringes on state control of schools.

The Common Core is not mandated by the federal government. States can choose to adopt the Common Core or opt out. I hesitate to present the most blindingly obvious of proofs, but here we go: not all of the states adopted the standards. Some states chose to opt out. That should be proof enough that states can opt out if they want to.

Did the federal government sweeten the deal by adding Race to the Top incentives for states that adopted the Common Core? Yes. But that’s bribery, not coercion. You can say no to a bribe, even if you need the money. And this wasn’t even that much of a bribe, as everyone knew there were only going to be a limited number of states that won Race to the Top funding and Common Core adoption was far from a guarantee.

Whether you love or hate the Common Core, it was your state legislature that adopted the standards, and the credit or blame should be placed there. States are just as capable of having cynical self-serving politicians as the federal government is, and don’t let anyone tell you otherwise. But some states may have genuinely adopted the Common Core to improve education for their students, even if you don’t think it will.

Frankly, I’m no more a fan of Race to the Top than I was of No Child Left Behind. I don’t think states should have to compete for education funding. And there were other incentives in the Race to the Top formula I had issues with, like the charter school expansions. But these are criticisms of federal education policy, and not the Common Core standards themselves.

4. The Common Core is a result of the corporate reform movement that’s undermining public education.

Maybe.

I don’t think the standards do undermine public education, though, and I believe the people who actually put them together are earnest in their attempts to improve it. I’m not blind to some of the strange bedfellows involved with the process, but if an idea leads to good things, I don’t care where it comes from. This is an argument that just doesn’t work on its own. Just because Bill Gates funded it, it isn’t necessarily Windows 7. Zing!

5. The Common Core is only about testing and accountability.

I hate to break it to you, but the testing and accountability movement has been around a lot longer than the Common Core. We’re already teaching to the test, so it makes sense to design a better test, one worth teaching to. You can read about early attempts to align New York’s state-wide exams to the Common Core in this article, and I’m quoted towards the end, but the bottom line is that they didn’t go very well.

Two multi-state consortiums are now hard at work to build a better test, though this turns out to be a tougher job than they originally thought. They are talking about having students take the state-wide (actually, consortium-wide) tests on computers, which means that every school needs to have computers. That could be a logistical nightmare in itself, but it could also mean more funding for computers in schools.

In New York City, teachers are being evaluated through a system that uses test scores, in one form or another, as 40% of a teacher’s score, while the other 60% will be based on the Danielson Framework. In my opinion, that’s a vast improvement over using test scores alone, which even the Gates-funded MET study doesn’t endorse.

6. The Common Core is a conspiracy to keep the poor uneducated.

No, that’s what we have now. There is a strong correlation between socioeconomic status and academic achievement. Having a set of common standards is one step in the process of attempting to close that gap.

7. The Common Core replaces literature with government manuals.

That simply isn’t true. There is an entire section of the standards that covers Reading Literature. There may be a government manual listed somewhere in the examples of informational texts, but it’s disingenuous to hold that up as the centerpiece of Common Core expectations for student reading. Anyone who makes this argument is either unfamiliar with the standards, or uninterested in engaging in a serious discussion about them.

8. People are making money from the Common Core!

This is true, inasmuch as we need people to write tests, publish classroom materials, and train teachers. But we would have needed this anyway, Common Core or no.

Liberals tend to think that everyone should do their jobs with the purest of motives, and if someone’s profiting from something, it must be an evil conspiracy. Conservatives tend to believe the opposite: that if you made money from an idea, then that proves the idea had market value, and those who improve the system deserve to profit from their innovations. I take a more neutral view of profit’s correlation with good in the world. I work in teacher training, and the Common Core affects what I teach, but not how often I teach or how much money I make. I have no financial interest in defending the Common Core.

Keller’s editorial estimates the costs of the new tests at about $29 per student, in a system that spends over $9,000 per student in a year. You might not like the Common Core for other reasons, but cost alone can’t be the reason to oppose it.
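To put those two figures side by side, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above; the variable names are mine, and because the editorial says spending is "over" $9,000, the result should be read as an approximate upper bound on the testing share, not a precise figure.

```python
# Figures as quoted from the Keller editorial (both approximate).
test_cost_per_student = 29        # dollars, estimated cost of the new tests
spending_per_student = 9_000      # dollars per year; the editorial says "over" this

# Share of annual per-student spending that would go to the new tests.
# Since actual spending is above $9,000, the true share is at most this.
testing_share = test_cost_per_student / spending_per_student

print(f"Testing share of per-student spending: about {testing_share:.2%}")
```

In other words, the tests come to roughly a third of one percent of what is already being spent per student each year.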

9. These Common Core-aligned materials I have are bad.

I don’t doubt it. But just because a product claims to be “Common Core-aligned,” it doesn’t mean that it is Common Core-endorsed. I have no end of problems with the range of “Common Core-aligned” curricula being rolled out by New York City alone. This is not a function of poor standards, but rather poor implementation.

By the way, a lot of the Common Core-aligned materials were delayed getting into schools this year, even as teachers were required to start using them. You don’t have to convince me that we’re having implementation problems.

And I spent last summer modifying my own organization’s social studies curricula to be Common Core-aligned, and I feel strongly that our products improved immensely because of it.

10. The Common Core is untested, and shouldn’t be implemented on such a large scale without a pilot program.

This is from Reign of Error author Diane Ravitch, and she makes a fair point. But nothing’s written in stone. The standards will work in some ways and need mending in others. And where they need mending, we’ll mend them. Ten years from now, we may come to see the current version of the Common Core as a really good first draft. Or we may remember it as New Coke. There’s no way to know until we try it out. That can be used as an argument for it as well as against it.

I do think that we should do everything we can do to make it work. That’s the only way we’ll really know if it doesn’t.

Honorable Mention: President Obama is for it, and therefore I must be against it.

Hey, look! Someone over there is getting health care.

I really do see a lot of parallels between the Affordable Care Act debacle and the Common Core controversy. Tea party Republicans want to talk about how Obamacare will destroy the economy and force the government between you and your doctor and lead to the apocalypse, but they really oppose it on ideological principles. If they would talk about their principles, we could have an honest debate, but they know these principles sound cold and selfish, so they obfuscate. Common Core opponents dance around the actual changes being made in education because most of them make sense. The real concern, as I see it, is the danger of the larger corporate-funded movement to use testing and data to prove the ineffectiveness of public education in order to move to a privatized free-market system.

That’s a concern worth discussing directly, and I’m very willing to have that conversation.

The Wager

Sunday, April 28th, 2013

The year was 2002. I was teaching an advanced graduate course on Shakespeare, and I chose to give my final exam as a take-home. The questions included true/false, short answer, extended response, and one long essay.

I mentioned this while having dinner one night with friends. Brian, who runs a successful business he built himself, scoffed at the very notion of a take-home final in the age of the Internet. Couldn’t the students just look up all of the answers? This was around the time when people were starting to use “Google” as a verb, and many students were more tech-savvy than their professors. I assured Brian that the test would still be challenging as a take-home, but he remained unconvinced.

Brian offered me a wager. He would take the exam along with my students, despite not having taken the course or even knowing very much about Shakespeare. As long as he could research and plagiarize as much as he wanted, he claimed he could pass my final. I accepted the bet.

In the weeks to come, Brian became consumed with the task. He researched each question, writing and rewriting answers to perfection. He put way more time into that final than any of the students, and he plagiarized without shame. But he completed the final on the same schedule as the students, and ended up scoring a 91 out of a possible 100 points. This was slightly below the class average, but he clearly won the bet.

However, he did admit that, in order to be successful on the final, he had to learn a whole lot about Shakespeare along the way. He may not have taken the course, but he ended up doing much of the work he would have had to do anyway, engaging with the material throughout the process.

It’s worth noting at this point that the exam only represented 10% of the final grade. Much more of the course was about participation in class discussions and completing projects. But with Brian’s self-guided work, he was able to earn 9.1% of the course grade without ever setting foot in my classroom. Had he attempted some of the projects, and applied the same level of drive to them, he could have earned even more points, learning even more about Shakespeare in the process.
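For anyone who wants to check the arithmetic, here is a minimal sketch of how an exam score turns into course-grade points. The function name and the assumption that the course grade is a simple weighted sum are mine, for illustration only; the 91 and the 10% weight come from the story above.

```python
def course_points(score_out_of_100: float, weight: float) -> float:
    """Points contributed to the course grade by one weighted component."""
    if not 0 <= weight <= 1:
        raise ValueError("weight must be a fraction between 0 and 1")
    return score_out_of_100 * weight

# Brian's result: 91 out of 100 on an exam worth 10% of the course grade.
brian_points = course_points(91, 0.10)

print(f"Brian earned {brian_points:.1f} of 100 possible course points")
```

The remaining 90 points were tied to discussions and projects, which is why the exam alone could never have carried him through the course.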

This is a good way to think about assessment. We define what students should be able to do after a unit of study, and we define a way to measure whether or not they’ve learned it. The unit of study, then, should be designed to help students succeed in the measurement. If that sounds too much like teaching to the test, that’s fine, but then we should start designing tests worth teaching to.

This is the idea of the performance task. Rather than having students fill out multiple-choice bubble sheets, they do authentic tasks. They understand how the skills they are learning in school are applied in the real world. And when students show they are able to transfer their learning into unfamiliar contexts, as they should in any good performance task, they demonstrate deep understanding of the skills and concepts being covered.

So, if a student can succeed in the teacher-created assessment before the instruction, is the instruction really necessary? If students can take the initiative to demonstrate that they have met the same learning goals some other way, shouldn’t they get credit for it? And if real-world authenticity is the aim, shouldn’t students be able to use the same tools a real-world businessman would use when working toward the same goal?

These are questions we’re now grappling with in assessment. But I thank Brian for giving me a head start in thinking about them so many years ago.

In the Zone

Wednesday, March 6th, 2013

As we begin implementing the Common Core State Standards this year, many of the schools I advise are having very similar problems with grade-level readiness. This isn’t a new problem, to be sure, but it has become intensified by Common Core expectations. The Common Core standards are more rigorous than last year’s New York State standards, so even students who were on grade level last year have some catching up to do. Also, built into the DNA of the Common Core is the idea of a “staircase of complexity” in which students must master the standards of the prior year before they are ready for the standards of the current year. In other words, they must master the 5th-grade standards in order to become 6th-grade ready.

For example, students in Kindergarten learn to state an opinion (“My favorite book is…”). In Grade 1, they provide a reason for their opinion. In Grade 4, they support their reasons with information, while in Grade 6 they write arguments to support claims with reasons and evidence. In math, students are expected to be effortlessly fluent in addition and subtraction by the end of Grade 2, so they will be ready to begin fractions in Grade 3. By the end of Grade 5, their understanding of fractions is thorough enough to begin algebra in the 6th grade. It’s a well-structured progression that brings students step by step from Kindergarten to college and career readiness, with each grade building on the learning that has accrued through the previous years of instruction.

What happens, then, during the first year of implementation? Our students aren’t even coming in on grade level based on the old standards, let alone the more rigorous standards demanded by (and required for) the Common Core. Our 6th graders aren’t coming in having mastered fractions or the opinion essay. Their reading levels do not prepare them to approach the complex texts in the new reading band levels, which themselves are set higher than previous levels by the Common Core (as can be seen in the chart at the bottom of page 8 of the ELA Appendix A):

And this problem is even more profound in high school, where the high-stakes Regents Exams are looming, and many students aren’t even prepared to read the instructions.

In a December 2011 keynote titled “What Must Be Done in the Next Two Years” (you can download the transcript here), David Coleman, the architect of the Common Core Standards, addresses the idea of grade-level readiness. He’s a brilliant man who speaks with a persuasive confidence, but he’s on the wrong side of this particular issue.

But for your sakes, the really exciting thing is for the first time there’s a measure in the standards that insists that students at each level are encountering texts of adequate complexity.

Nonetheless, you could nonetheless be defeated, because the most popular instructional practice for students who are behind is to replace their core reading with leveled text at their level, right? So if you were to actually look at what your kids are being given, they are constantly matched in this seeming noble idea that you should match everything they read to where they are today, often called a proximal zone of development, et cetera.

Let me be rather clear. Leveled readers and reading at your own level has a crucial role to play for kids in terms of their vocabulary growth, their love of reading, and has a very important role, so I’m not saying kind of just get rid of it. But what I am saying is the core of instruction, if we want kids to catch up, has to be the deliberate study of sufficiently complex texts, again and – we cannot exclude students from that and expect them to magically catch up. That’s a scaffolded environment, do you get me? Where their frustration – they are expected to be frustrated. That frustration is managed. It’s part of the classroom community, and they engage repeatedly in dealing with things that are more difficult than they can handle.

First of all, it’s the Zone of Proximal Development (ZPD), not the “proximal zone of development, et cetera.” I’m less bothered by his mixing the words around than I am by the “et cetera,” as if to say “yeah, there was more but I couldn’t be bothered to absorb it.” The ZPD is the range between what a child can do independently and what that same child can do with support. The concept was first described by Soviet psychologist Lev Vygotsky in the 1930s, and has had a profound impact on developmental psychology and learning science. You can’t be dismissive of the ZPD in one breath, and then go on to recommend scaffolding in the next. The very idea of scaffolding is based on a Vygotskian model of development. The term was introduced by American psychologist Jerome Bruner, and it refers to the supports that we provide students within their ZPD to help them achieve at higher levels. As the metaphor suggests, once students can do these tasks independently, we can remove the scaffolding.

Coleman’s right that there should be managed frustration. If students read texts that are too easy for them, they may enjoy those texts, but it’s not the best way to support reading progress. When students have to read within their ZPD, they feel a frustration we might accurately describe as growing pains. They experience a stretch, and in that stretch, learning can actually help drive cognitive development. If, on the other hand, the material is above the upper limit of their ZPD, they will not experience that productive frustration. They will simply shut down and not attempt to read the material at all. And there is no amount of scaffolding that will make it possible. Think of a weight you can lift easily, a weight that requires some effort to lift, and a weight you can’t budge at all. Which of those three weights would you choose if you wanted to promote muscle growth?

So if you have students who are one or two grades below grade level, it might be worth trying to push them in the way Coleman describes. But students who are four, five, six years below their grade level aren’t going to be reading on grade level by the end of the year no matter whose philosophical outlook you subscribe to. Nobody is expecting them to “magically catch up.” The idea is to support them in making the greatest progress possible. It is Coleman who is invoking magic when he expects that these students will be able to catch up simply through teacher patience, student frustration, and intense scaffolding.

But if anybody should be a proponent of Vygotsky, it’s David Coleman himself, for Vygotsky provides a clear developmental framework for the Common Core. If learning really can drive development, and I believe it can, then having a rigorous set of standards defined for each grade level organized into a staircase of complexity makes a lot of sense. If we adhere to these standards from Kindergarten, making sure that students receive support in a multi-tiered Response to Intervention system to ensure that they remain on grade level at the end of each year, then the Common Core might actually be a blueprint for making sure that our students are well prepared for the rigors of college and the workplace by the end of Grade 12. Wouldn’t it be a shame if that were all true and the Common Core really is a better way of doing business, but nobody ever knew it because the implementation was so badly botched?

So what can we do? If I were in charge of implementation, I would have had two years of bridge standards before fully adopting the Common Core. If the 5th-grade NYS standards say ABC and the 8th-grade Common Core standards say JKL, then we develop a logical DEF for 6th grade and a 7th-grade GHI that allow us to incrementally meet the higher standards. Instead, we’re going right from 5th-grade NYS to 6th-grade Common Core, and even students who were on grade level last year are being left behind. The folks at the New York City Department of Education, for their part, seem to understand the difficulties involved, and are trying to make the changes as gradually as possible to support teachers. But no such support is available for students, as the level of rigor expected for them is coming from Albany, and is out of the city’s hands.

I can’t tell you what the statewide assessments are going to look like at the end of this year, but I’m pretty sure the students are going to be expected to read on what is now considered grade level, and this is the problem. What do you do if you have 8th-grade students reading on a 4th-grade level, when you know you are going to be accountable for them passing an 8th-grade test at the end of the year? One option is, as Coleman describes, to give them 8th-grade reading selections anyway, have them read fewer overall texts, and heavily scaffold the texts being read. Another option is to try to give them two years of instruction in a year, committing to bring them from a 4th-grade level to 6th-grade level. Neither strategy will prepare them to read on the 8th-grade level by test time, but I prefer the latter method. It’s better to make meaningful progress in the time that you have than to squander the opportunity by fumbling around with inappropriately difficult texts. I understand, respect, and even admire Coleman’s desire to get everyone on grade level. It’s not going to happen this year.

Given that some of the quantitative targets may not be possible this year, another option is to focus on the qualitative shifts. Give students more exposure to informational texts. Give them more complex texts than they are reading now. Have them read more independently, and give them opportunities to cite evidence from the things they read to support their writing. These are all Common Core-aligned shifts, and can be implemented right away, regardless of student reading levels.

Finally, teachers can make a big difference by differentiating instruction. Some students may have higher upper bounds in their ZPD than might be apparent at first. And if you’ve agreed with me up until now, follow me the rest of the way. It’s important for teachers to challenge their students to the highest extent possible for them. Students will push back, but being a teacher means encouraging students to do more than they ever thought they could. Now is the time to do that. Please don’t mistake my nuanced understanding of cognitive development for timidity. I’ve taught Shakespeare, in the original language, to low-performing 5th graders. But to do that, I had to have some confidence that my learning goals were within their Zone of Proximal Development. And when they were, it turned out that it was possible!

As for the end-of-the-year tests, the whole state is in the same bind, so relative success is still very much in reach given the right strategies. Students feel growing pains, and so do teachers. But that pain just means that we’re working outside of our comfort zone, and are instead in a zone that is more conducive to growth.

Science!

Monday, January 7th, 2013

Today, I worked with science teachers on their performance tasks. Actually, I’ve been doing a lot of consulting this year on performance tasks, which are the hot new trend in assessment.

A performance task is an opportunity for students to demonstrate that they can independently apply the skills they’ve learned in a real-world context. So it’s like a post-test, only instead of multiple-choice questions, students have to do an authentic activity. Teachers examine the resulting student work with a rubric to measure whether or not students have learned the skills, and they can then use this information to plan future instruction. It’s much more effective than standardized-testing data in diagnosing student needs, though I do admit it is much more time-consuming.

This year, I’ve been working a lot with social studies and science teachers. Because of the Common Core shifts, these teachers are now required to teach literacy skills. There are no actual content standards in social studies or science in the Common Core; all of the standards for these subject areas are literacy standards. There are science content standards, the Next Generation Science Standards, currently under development. When they are completed, states will have the option of adopting them in the same way they adopted Common Core. But until then, science content standards come from the states, and literacy standards from the Common Core are applied across the curriculum.

Now, I actually like the idea of literacy across the curriculum, but it is a big adjustment for science and social studies teachers, and so the schools where I consult have asked me to work with these teachers to help them infuse literacy skills into their curriculum and their assessments, particularly the performance tasks that New York City is requiring them to administer this year.

I have had a lot of experience working with social studies teachers in the past, but I’m probably working more with science teachers this year than I ever have before. And that’s fantastic, because I get the opportunity to learn a lot of new things. I also get the chance to yell “Science!” like Magnus Pyke a lot. No, I don’t really do that, but it would be fun.

One of the science teachers I worked with today swears by a website for an organization called Urban Advantage. It has some great resources for teaching middle-school science with an inquiry-based approach. I like the way that their materials scaffold scientific writing, which is my focus this year.

Another science teacher I worked with today showed me the PhET website, which has some really compelling interactive simulations in the sciences. I watched 7th-grade students run a simulation on density, in which they had to determine the mass and volume of various mystery substances and identify them from a list of materials and their densities.

Science!

Shakespeare and the Common Core

Sunday, January 6th, 2013

Across the United States, education is undergoing a sea-change (into something rich and strange) surrounding the adoption of something called the Common Core State Standards.

Standards are simply a list of what students should be able to do by the end of each grade. Traditionally, these have been defined by states, as required by the No Child Left Behind Act of 2001. States still define their own standards, but, in an unprecedented act of coordination, 45 states (plus the District of Columbia and a few of the territories) have adopted the Common Core as their state standards. Full adoption has been targeted for next year, though New York has started phasing in significant portions of it this year.

Love it or hate it, the Common Core represents a new direction in pedagogical thinking, both qualitatively and quantitatively. Personally, I think the Common Core standards are a lot better than the existing New York State Standards, but we’re going to have to suffer through a difficult transition period before we can reap the benefits of that improvement. Right now is probably the most difficult time, as we have to deal with students who are not starting on what the new structure defines as grade-level, a lack of Common Core-aligned teaching materials, and uncertainty surrounding precisely how these new standards will be assessed. May you live in interesting times.

As with anything new and complex, there are going to be a number of misconceptions floating around about it. One of the most prevalent I’ve seen is that the Common Core eliminates (or at least de-emphasizes) literature, in favor of informational texts. In particular, many are convinced that Shakespeare will be replaced entirely by non-fiction, as public education descends into a Dickensian nightmare of Shakespeare-deprived conformity and standardization.

In fact, Shakespeare is mandated by the Common Core.

The confusion seems to stem from a chart that appears on page 5 of the English Language Arts Standards document, outlining the percentages of literary vs. informational texts included in the National Assessment of Educational Progress:

(Click for a larger image.)

The Common Core is explicit about aligning curricula with this framework, but it is just as explicit about how that alignment should be distributed:

Fulfilling the Standards for 6–12 ELA requires much greater attention to a specific category of informational text—literary nonfiction—than has been traditional. Because the ELA classroom must focus on literature (stories, drama, and poetry) as well as literary nonfiction, a great deal of informational reading in grades 6–12 must take place in other classes if the NAEP assessment framework is to be matched instructionally.

So, despite the canard that high-school English classes will only be allowed to teach literature 30% of the time, the 70% informational text requirement refers to the entirety of student reading across the curriculum. Given that one of the major shifts is an increase in reading and writing in the content areas, the ratio makes sense.

Let’s say that, over the course of a particular unit, a high-school English teacher is assigning 3 literary texts and 1 informational text. That means that (text length aside) students are reading 75% literature in English class. And if this is the only reading the students are doing, then they are reading 75% literature overall. But now imagine that, during the same timeframe, they are also reading 2 informational texts in social studies, 2 informational texts in science, and 2 informational texts in all of their other classes combined. They are still reading 75% literature in English class, but literature now makes up only 30% of their reading overall (3 of 10 texts).
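
The same arithmetic can be checked in a few lines. This is only a sketch of the example above: it counts every text equally, as the example does, and the function and variable names are invented for illustration.

```python
def literary_share(literary: int, informational: int) -> float:
    """Fraction of texts that are literary, counting each text equally."""
    total = literary + informational
    return literary / total if total else 0.0


# Counts from the example above: (literary, informational) texts per class.
reading = {
    "English": (3, 1),
    "social studies": (0, 2),
    "science": (0, 2),
    "all other classes": (0, 2),
}

english_share = literary_share(*reading["English"])

# Pool every class together to get the share across the whole curriculum.
total_lit = sum(lit for lit, _ in reading.values())
total_info = sum(info for _, info in reading.values())
overall_share = literary_share(total_lit, total_info)

print(f"English class: {english_share:.0%} literature")        # 75%
print(f"Across all classes: {overall_share:.0%} literature")   # 30%
```

The English class ratio never changes; only the denominator grows as the other classes add informational reading.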

And, far from being lost in the informational-text shuffle, Shakespeare now becomes the man of the hour. As the only author explicitly required by the Common Core, Shakespeare must be taught in grades 11 and 12 (see page 38, right column, Standards 4 and 7). Shakespeare is also included in the recommended texts for grades 9 and 10 (see page 58, left column, center). And Shakespeare is not excluded for younger students either, as the standards outline only the minimum of what must be taught in each grade. The Common Core does stress using authentic texts, so updated-language versions of Shakespeare would be frowned upon, but that’s actually an adjustment I can get behind.

There is a lot of controversy surrounding the Common Core, and a lot of objections surrounding the new changes. Some of these objections are legitimate, and some are not. I look forward to continuing that conversation as the implementation develops. But rest assured that Shakespeare isn’t going anywhere.

It’s a Poor Workman Who Blames Yogi Berra: Artificial Intelligence and Jeopardy!

Wednesday, February 23rd, 2011

Last week, an IBM computer named Watson beat Ken Jennings and Brad Rutter, the two greatest Jeopardy! players of all time, in a nationally televised event. The Man vs. Machine construct is a powerful one (I’ve even used it myself), as these contests have always captured progressive imaginations. Are humans powerful enough to build a rock so heavy, not even we can lift it?

Watson was named for Thomas J. Watson, IBM’s first president. But he could just as easily have been named after John B. Watson, the American psychologist who is considered to be the father of behaviorism. Behaviorism is a view of psychology that disregards the inner workings of the mind and focuses only on stimuli and responses. This input leads to that output. Watson was heavily influenced by the salivating dog experiments of Ivan Pavlov, and was himself influential in the operant conditioning experiments of B.F. Skinner. Though there are few strict behaviorists today, the movement was quite dominant in the early 20th century.

The behaviorists would have loved the idea of a computer playing Jeopardy! as well as a human. They would have considered it a validation of their theory that the mind could be viewed as merely generating a series of predictable outputs when given a specific set of inputs. Playing Jeopardy! is qualitatively different from playing chess. The rules of chess are discrete and unambiguous, and the possibilities are ultimately finite. As Noam Chomsky argues, language possibilities are infinite. Chess may one day be solved, but Jeopardy! never will be. So Watson’s victory here is a significant milestone.

Much has been made of whether or not the contest was “fair.” Well, of course it wasn’t fair. How could that word possibly have any meaning in this context? There are things computers naturally do much better than humans, and vice versa. The question instead should have been in which direction the unfairness would be decisive. Some complained that the computer’s superior buzzer speed gave it the advantage, but buzzer speed is the whole point.

Watson has to do three things before buzzing in: 1) understand what question the clue is asking, 2) retrieve that information from its database, and 3) develop a sufficient confidence level for its top answer. In order to achieve a win, IBM had to build a machine that could do those things fast enough to beat the humans to the buzzer. Quick reflexes are an important part of the game to be sure, but if that were the whole story, computers would have dominated quiz shows decades ago.
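
The third step can be pictured as a simple threshold check. What follows is a hypothetical sketch, not a description of how IBM built Watson: the `Candidate` type, the `decide_buzz` function, and the 50% cutoff are all assumptions made for illustration. The confidence figures are the on-screen percentages quoted later in this post.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    response: str
    confidence: float  # 0.0 to 1.0


# Hypothetical cutoff; the threshold Watson actually used is not given here.
BUZZ_THRESHOLD = 0.5


def decide_buzz(
    candidates: Sequence[Candidate], threshold: float = BUZZ_THRESHOLD
) -> Optional[str]:
    """Return the top response if its confidence clears the threshold, else None."""
    if not candidates:
        return None
    top = max(candidates, key=lambda c: c.confidence)
    return top.response if top.confidence >= threshold else None


# Candidate lists as displayed on screen for two clues discussed in this post.
hedgehog = [Candidate("Keratin", 0.99), Candidate("Porcupine", 0.36), Candidate("Fur", 0.08)]
eminem = [Candidate("50 Cent", 0.39), Candidate("Marshall Mathers", 0.20), Candidate("Dr. Dre", 0.14)]

print(decide_buzz(hedgehog))  # Keratin
print(decide_buzz(eminem))    # None: 39% is below the assumed cutoff
```

Under this assumed cutoff the sketch would buzz on the first clue and stay silent on the second, but a different threshold would change that, so treat the behavior as illustrative only.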

To my way of thinking, it’s actually the comprehensive database of information that gives Watson the real edge. We may think of Ken and Brad as walking encyclopedias, but that status was hard earned. Think of the hours upon hours they must have spent studying classical composers, vice-presidential nicknames, and foods that start with the letter Q. Even a prepared human might temporarily forget the Best Picture Oscar winner for 1959 when the moment comes, but Watson never will. (It was Ben-Hur.)

In fact, given what I could see, Watson’s biggest challenge seemed to be understanding what the clue was asking. To avoid the complications introduced by Searle’s Chinese Room thought experiment, we’ll adopt a behaviorist, pragmatic definition of “understanding” and take it to mean that Watson is able to give the correct response to a clue, or at least a reasonable guess. (After all, you can understand a question and still get it wrong.) Watching the show on television, we are able to see Watson’s top three responses, and his confidence level for each. This gives us remarkable insight into the machine’s process, allowing us a deeper level of analysis.

A lot of my own work lately has been in training school-based data inquiry teams how to examine testing data to learn where students need extra help, and that work involves examining individual testing items. So naturally, when I see three responses to a prompt, I want to figure out what they mean. In this case, Watson was generating the choices rather than simply choosing among them, but that actually makes them more helpful in sifting through his method.

One problem I see a lot in schools is that students are often unable to correctly identify what kind of answer the question is asking for. In as much as Watson has what we would call a student learning problem, this is it. When a human is asked to come up with three responses to a clue, all of the responses would presumably be of the correct answer type. See if you can come up with three possible responses to this clue:

Category: Hedgehog-Podge
Clue: Hedgehogs are covered with quills or spines, which are hollow hairs made stiff by this protein

Watson correctly answered Keratin with a confidence rating of 99%, but his other two answers were Porcupine (36%) and Fur (8%). I would have expected all three candidate answers to be proteins, especially since the words “this protein” ended the clue. In many cases, the three potential responses seemed to reflect three possible questions being asked rather than three possible answers to a correct question, for example:

Category: One Buck or Less
Clue: In 2002, Eminem signed this rapper to a 7-figure deal, obviously worth a lot more than his name implies

Ken was first to the buzzer on this one and Alex confirmed the correct response, both men pronouncing 50 Cent as “Fiddy Cent” to the delight of humans everywhere. Watson’s top three responses were 50 Cent (39%), Marshall Mathers (20%), and Dr. Dre (14%). This time, the words “this rapper” prompted Watson to consider three rappers, but not three potential rappers that could have been signed by Eminem in 2002. It was Dr. Dre who signed Eminem, and Marshall Mathers is Eminem’s real name. So again, Watson wasn’t considering three possible answers to a question; he was considering three possible questions. And alas, we will never know if Watson would have said “Fiddy.”

It seemed as though the more confident Watson was in his first guess, the more likely the second and third guesses would be way off base:

Category: Familiar Sayings
Clue: It’s a poor workman who blames these

Watson’s first answer Tools (84%) was correct, but his other answer candidates were Yogi Berra (10%) and Explorer (3%). However Watson is processing these clues, it isn’t the way humans do it. The confidence levels seemed to be a pretty good predictor of whether or not a response was correct, which is why we can forgive Watson his occasional lapses into the bizarre. Yeah, he put down Toronto when the category was US Cities, but it was a Final Jeopardy, where answers are forced, and his multiple question marks were an indicator that his confidence was low. Similarly cornered in a Daily Double, he prefaced his answer with “I’ll take a guess.” That time, he got it right. I’m just looking into how the program works, not making excuses for Watson. After all, it’s a poor workman who blames Yogi Berra.

But the fact that Watson interpreted so many clues accurately was impressive, especially since Jeopardy! clues sometimes contain so much wordplay that even the sharpest of humans need an extra moment to unpack what’s being asked, and understanding language is our thing. Watson can’t hear the other players, which means he can’t eliminate their incorrect responses when he buzzes in second. It also means that he doesn’t learn the correct answer unless he gives it, which makes it difficult for him to catch on to category themes. He managed it pretty well, though. After stumbling blindly through the category “Also on Your Computer Keys,” Watson finally caught on for the last clue:

Category: Also on Your Computer Keys
Clue: Proverbially, it’s “where the heart is”

Watson’s answers were Home is where the heart is (20%), Delete Key (11%), and Elvis Presley quickly changed to Encryption (8%). The fact that Watson was considering “Delete Key” as an option means that he was starting to understand that all of the correct responses in the category were also names of keys on the keyboard.

Watson also is not emotionally affected by game play. After giving the embarrassingly wrong answer “Dorothy Parker” when the Daily Double clue was clearly asking for the title of a book, Watson just jumped right back in like nothing had happened. A human would likely have been thrown by that. And while Alex and the audience may have laughed at Watson’s precise wagers, that was a cultural expectation on their part. There’s no reason a wager needs to be rounded off to the nearest hundred, other than the limitations of human mental calculation under pressure. This wasn’t a Turing test. Watson was trying to beat the humans, not emulate them. And he did.

So where does that leave us? Computers that can understand natural language requests and retrieve information accurately could make for a very interesting decade to come. As speech recognition improves, we might start to see computers that can hold up their end of a conversation. Watson wasn’t hooked up to the Internet, but developing technologies could be. The day may come when I have a Bluetooth headset hooked up to my smartphone and I can just ask it questions like the computer on Star Trek. As programs get smarter about interpreting language, it may be easier to make connections across ideas, creating a new kind of Web. One day, we may even say “Thank you, Autocorrect.”

It’s important to keep in mind, though, that these will be human achievements. Humans are amazing. Humans can organize into complex societies. Humans can form research teams and develop awesome technologies. Humans can program computers to understand natural language clues and access a comprehensive database of knowledge. Who won here? Humanity did.

Ken Jennings can do things beyond any computer’s ability. He can tie his shoes, ride a bicycle, develop a witty blog post comparing Proust translations, appreciate a sunset, write a trivia book, raise two children, and so on. At the end of the tournament, he walked behind Watson and waved his arms around to make it look like they were Watson’s arms. That still takes a human.

UPDATE: I’m told (by no less of an authority than Millionaire winner Ed Toutant) that Watson was given the correct answer at the end of every clue, after it was out of play. I had been going crazy wondering where “Delete Key” came from, and now it makes a lot more sense. Thanks, Ed!

Item of the Week

Monday, January 24th, 2011

This week’s testing item is a favorite of mine to use as an example, because it illustrates just how careful we need to be when looking at standardized testing data.

We will be looking at Item 16 on the 2009 New York State Grade 6 Exam. The performance indicator is “5.G14 Calculate perimeter of basic geometric shapes drawn on a coordinate plane (rectangles and shapes composed of rectangles having sides with integer lengths and parallel to the axes).” You can click the figure below to enlarge.



What is this question testing? Does it fit the performance indicator? Which of the wrong answers would you predict students would choose the most often? Why? What would students need to know and be able to do to answer this question correctly?

Item of the Week

Monday, January 17th, 2011

In this somewhat new blog feature, I will offer up a question from the statewide examinations that New York City students take each year. The purpose of this will not be for you to try to provide the correct answer, but rather to join me in examining the question. What does it tell us about student understanding? What does each of the wrong answers mean? What is this question testing? What is it really testing? What would students need to know and be able to do to answer this question correctly?

I gave a workshop for data teams on Friday. Three of the groups were examining last year’s 4th grade ELA scores, which I knew meant that we’d be talking about Abigail. In my visits to schools, I’ve found that students who took this exam had a lot of trouble on questions relating to this poem (click to enlarge):

Students had trouble on a number of the questions, but we will just look at one: Item 21 on the 2010 New York State Grade 4 ELA Exam:



The intended performance indicator is “Make predictions, draw conclusions, and make inferences about events and characters,” but we can be the judge of that.

What is this question testing? Does it fit the performance indicator? Which of the wrong answers would you predict students would choose the most often? Why? What would students need to know and be able to do to answer this question correctly?

Item of the Week

Monday, January 10th, 2011

I thought it might be fun to try something new with the “Question of the Week” feature here on the blog. Instead of asking my readers a question, I will offer up a question from the statewide examinations that New York City students take each year.

The purpose of this will not be for you to try to provide the correct answer, but rather to join me in examining the question. What does it tell us about student understanding? What does each of the wrong answers mean? What is this question testing? What is it really testing? What would students need to know and be able to do to answer this question correctly?

Sound like fun?

To differentiate this feature from the Question of the Week, I’ll call this the Item of the Week, which is what we call questions in the parlance of standardized testing.

Today’s item comes from the 2010 New York State Grade 4 Mathematics Exam. The strand is Measurement and the performance indicator is “4.M04 Select tools and units appropriate to the mass of the object being measured (grams and kilograms).” You can click the image for a larger view.

I like the layering of this question. First of all, the student needs to know which units measure mass and which don’t. If they answer A or D, they don’t. But to choose between B and C, students need to have some idea of how much a gram really is.

Sometimes these questions will have distractor answers that use numbers from the problem to try to trick students into choosing them. But there are no numbers in this problem. And all of the answers use the same number.

The trick here is in the first sentence. The fact that Mr. Patel moved his chair across the room is not relevant. But if you don’t know what “mass” means, that first sentence might trick you into thinking you are looking for a distance, in which case you might choose D. This assumes, of course, that you have no idea how long a kilometer is.

All in all, it seems like a pretty fair question that tests what it purports to test. In practice, it turned out to be one of the harder items for New York City students taking this exam.

As always, I invite further discussion.

Shakespeare Teacher: The Book!

Wednesday, September 1st, 2010

I am proud to announce that I have recently published a chapter in this book on teaching literature through technology. You can ignore the description; it seems to have been inadvertently switched with that of this book. Neither page describes my chapter, but you can read the abstract on the publisher’s page, or I could just tell you what it’s about.

Unlike this blog, the book chapter is actually about teaching Shakespeare! No riddles. No anagrams. No politics. (Well, maybe a little bit of politics.)

Here is the basic idea. I begin by citing experts who are skeptical of the ability of elementary school students to do Shakespeare. Specifically, I discuss the Dramatic Age Stages chart created by Richard Courtney.

Courtney describes “The Role Stage” as lasting from ages twelve to eighteen, at which point students are capable of a number of new skills that I would consider essential for understanding Shakespeare in a meaningful way. These skills include the ability to think abstractly, to understand causality, to interpret symbols, to articulate moral decisions, and to understand how a character relates to the rest of the play. So based on this chart, I would have to conclude that a student younger than twelve would not be ready to appreciate Shakespeare in these ways.

But Courtney bases his chart on the framework of developmental phases of Swiss psychologist Jean Piaget. These phases describe what a lone child can demonstrate under testing conditions. A more accurate and nuanced way of looking at development is provided in the work of Soviet psychologist Lev Vygotsky, who described a “Zone of Proximal Development” (ZPD), which is a range between what a child can demonstrate in isolation, and what the same child can do under more social conditions.

So I wondered if fifth-grade students (aged 10) would have some of the skills associated with “The Role Stage” somewhere in their ZPD. If so, a collaborative class project should provide enough scaffolding to develop those skills and allow ten-year-old students to understand and appreciate Shakespeare on that level.

So I developed and implemented a unit to teach Macbeth to a fifth-grade class in the South Bronx, using process-based dramatic activities, a stage production of the play performed for their school, and a web-based study guide to apply what they had learned. The idea was to use collaborative projects to get the kids to work together to make collective sense of the play. I then examined their written work for evidence that they had displayed the skills associated with “The Role Stage” in Courtney’s chart, and I was able to find a great deal of it.

I also create a three-dimensional rubric to assess the students’ work over the course of the unit. I say a three-dimensional rubric because I use the same eight categories in all three rubrics, but they develop over time to reflect the increased sophistication that I expect the students to demonstrate. I then compare the students’ performance-based rubric scores to their reading test scores to demonstrate that standardized testing paints only a very limited picture of what a student can achieve. (I did say that it had a little bit of politics.)

Anyway, that’s what my chapter was about. I just saved you $180! And I’m hoping to return to a regular blogging schedule soon, so more content should be on the way.