Archive for the 'Data' Category

Don’t Be Rotten to the Core

Wednesday, October 2nd, 2013

I thought I’d take this opportunity, while the federal government is shut down over the question of its own power to legislate, to talk about another somewhat controversial initiative, namely the Common Core State Standards.

It should be noted that this is not a simple left-right issue. At a recent conference, I heard Kim Marshall joke that he never thought he’d see national standards because “the right doesn’t like national, and the left doesn’t like standards.” So, as you might expect, the Common Core seems to be embraced by moderates in both parties, while being attacked by extremists on both sides. Teachers and parents, who are the most directly affected by the changes, express the same range of opinions as policymakers and pundits. So, the discussion continues.

To get a sense of the issues involved, as well as the general tone, check out this New York Times editorial by Bill Keller, and this response by Susan Ohanian.

For the record, I agree with the Bill Keller editorial (you can just change that “K” into an “H” and we’re good). I’m a fan of the Common Core, though I have a number of concerns about the way it’s being implemented. But I respect the opinions of many who oppose it, and understand the quite valid reasons why they do. Unfortunately, most of the rhetoric that I encounter against the initiative is either focused on areas that have very little to do with the standards themselves, or based in a fog of misinformation.

Now, if you’ve read the standards, and you honestly believe that we should not want our students to be able to cite evidence from informational texts to support an argument, I’m very willing to have that conversation. If you think the Common Core shifts aren’t the right direction for our students, I’m very willing to have that conversation. If you have a problem with emphasizing literacy in the content areas, I’m very willing to have that conversation. That’s just not the conversation I’ve been hearing about the Common Core, and if we’re going to discuss these very large-scale changes in the way they deserve to be discussed, we need to clear the air of distractions and distortions.

With that in mind, I present the Top Ten Most Common Objections to the Common Core, and my responses to them. This is meant to be the beginning of a conversation and not the last word, so please feel free to continue the discussion in the comments section below.

1. The Common Core is too rigorous. The standards are not developmentally appropriate.

I think we’re feeling that now because we’re transitioning into these standards from a less rigorous system. If students come in on grade level, what they’re being asked to learn in each year is very reasonable. The problem is that we’re so far from that “if,” that the standards can often seem very unreasonable. Add to that a rushed implementation, complete with career-destroying and school-closing accountability, and the Common Core expectations can leave a very bad taste in our mouths.

What’s more, the Common Core includes qualitative shifts as well as quantitative shifts, so students will be as unfamiliar with the new ways of learning as their teachers are. The good news is that each year we implement the Common Core, students will become more used to Common Core ideas such as text-based answers and standards of mathematical practice, and will be better prepared for the work of their grade each year. It will likely get worse before it gets better, but I do think there is a light at the end of the tunnel, and it will at least become visible in the next year or two.

2. The Common Core is not rigorous enough. My state had better standards before.

Well, the standards are meant to represent only the minimum of where students need to be in their grade level in order to be on track for college and career readiness by the end of Grade 12. So if you can meet these standards and then exceed them, more power to you. States that adopt the Common Core are also free to change up to 15%, and to add additional standards as well.

So here in New York State, we added Pre-K standards that aren’t in the national version, we put in additional standards throughout the documents (including Responding to Literature standards in ELA and teaching money in early-grade math classrooms), and we still retain the state-wide content standards in social studies and science that students need to pass their Regents. And even where states are slacking, a high-performing school won’t suddenly lower their standards just because they can. That’s not how they became high-performing schools in the first place.

3. The Common Core is a mandated top-down program that infringes on state control of schools.

The Common Core is not mandated by the federal government. States can choose to adopt the Common Core or opt out. I hesitate to present the most blindingly obvious of proofs, but here we go: not all of the states adopted the standards. Some states chose to opt out. That should be proof enough that states can opt out if they want to.

Did the federal government sweeten the deal by adding Race to the Top incentives for states that adopted the Common Core? Yes. But that’s bribery, not coercion. You can say no to a bribe, even if you need the money. And this wasn’t even that much of a bribe, as everyone knew there were only going to be a limited number of states that won Race to the Top funding and Common Core adoption was far from a guarantee.

Whether you love or hate the Common Core, it was your state legislature that adopted the standards, and the credit or blame should be placed there. States are just as capable of having cynical self-serving politicians as the federal government is, and don’t let anyone tell you otherwise. But some states may have genuinely adopted the Common Core to improve education for their students, even if you don’t think it will.

Frankly, I’m no more a fan of Race to the Top than I was of No Child Left Behind. I don’t think states should have to compete for education funding. And there were other incentives in the Race to the Top formula I had issues with, like the charter school expansions. But these are criticisms of federal education policy, and not the Common Core standards themselves.

4. The Common Core is a result of the corporate reform movement that’s undermining public education.

Maybe.

I don’t think the standards do undermine public education, though, and I believe the people who actually put them together are earnest in their attempts to improve it. I’m not blind to some of the strange bedfellows involved with the process, but if an idea leads to good things, I don’t care where it comes from. This is an argument that just doesn’t work on its own. Just because Bill Gates funded it, it isn’t necessarily Windows 7. Zing!

5. The Common Core is only about testing and accountability.

I hate to break it to you, but the testing and accountability movement has been around a lot longer than the Common Core. We’re already teaching to the test, so it makes sense to design a better test, one worth teaching to. You can read about early attempts to align New York’s state-wide exams to the Common Core in this article, and I’m quoted towards the end, but the bottom line is that they didn’t go very well.

Two multi-state consortiums are now hard at work to build a better test, though this turns out to be a tougher job than they originally thought. They are talking about having students take the state-wide (actually, consortium-wide) tests on computers, which means that every school needs to have computers. That could be a logistical nightmare in itself, but it could also mean more funding for computers in schools.

In New York City, teachers are being evaluated through a system that uses test scores, in one form or another, as 40% of a teacher’s score, while the other 60% will be based on the Danielson Framework. In my opinion, that’s a vast improvement over using test scores alone, which even the Gates-funded MET study doesn’t endorse.

6. The Common Core is a conspiracy to keep the poor uneducated.

No, that’s what we have now. There is a strong correlation between socioeconomic status and academic achievement. Having a set of common standards is one step in the process of attempting to close that gap.

7. The Common Core replaces literature with government manuals.

That simply isn’t true. There is an entire section of the standards that covers Reading Literature. There may be a government manual listed somewhere in the examples of informational texts, but it’s disingenuous to hold that up as the centerpiece of Common Core expectations for student reading. Anyone who makes this argument is either unfamiliar with the standards, or uninterested in engaging in a serious discussion about them.

8. People are making money from the Common Core!

This is true, inasmuch as we need people to write tests, publish classroom materials, and train teachers. But we would have needed this anyway, Common Core or no.

Liberals tend to think that everyone should do their jobs with the purest of motives, and if someone’s profiting from something, it must be an evil conspiracy. Conservatives tend to believe the opposite: that if you made money from an idea, then that proves the idea had market value, and those who improve the system deserve to profit from their innovations. I take a more neutral view of profit’s correlation with good in the world. I work in teacher training, and the Common Core affects what I teach, but not how often I teach or how much money I make. I have no financial interest in defending the Common Core.

Keller’s editorial estimates the costs of the new tests at about $29 per student, in a system that spends over $9,000 per student in a year. You might not like the Common Core for other reasons, but cost alone can’t be the only reason to oppose it.

9. These Common Core-aligned materials I have are bad.

I don’t doubt it. But just because a product claims to be “Common Core-aligned,” it doesn’t mean that it is Common Core-endorsed. I have no end of problems with the range of “Common Core-aligned” curricula being rolled out by New York City alone. This is not a function of poor standards, but rather poor implementation.

By the way, a lot of the Common Core-aligned materials were delayed getting into schools this year, even as teachers were required to start using them. You don’t have to convince me that we’re having implementation problems.

And I spent last summer modifying my own organization’s social studies curricula to be Common Core-aligned, and I feel strongly that our products improved immensely because of it.

10. The Common Core is untested, and shouldn’t be implemented on such a large scale without a pilot program.

This is from Reign of Error author Diane Ravitch, and she makes a fair point. But nothing’s written in stone. The standards will work in some ways and need mending in others. And where they need mending, we’ll mend them. Ten years from now, we may come to see the current version of the Common Core as a really good first draft. Or we may remember it as New Coke. There’s no way to know until we try it out. That can be used as an argument for it as well as against it.

I do think that we should do everything we can do to make it work. That’s the only way we’ll really know if it doesn’t.

Honorable Mention: President Obama is for it, and therefore I must be against it.

Hey, look! Someone over there is getting health care.

I really do see a lot of parallels between the Affordable Care Act debacle and the Common Core controversy. Tea party Republicans want to talk about how Obamacare will destroy the economy and force the government between you and your doctor and lead to the apocalypse, but they really oppose it on ideological principles. If they would talk about their principles, we could have an honest debate, but they know these principles sound cold and selfish, so they obfuscate. Common Core opponents dance around the actual changes being made in education because most of them make sense. The real concern, as I see it, is the danger of the larger corporate-funded movement to use testing and data to prove the ineffectiveness of public education in order to move to a privatized free-market system.

That’s a concern worth discussing directly, and I’m very willing to have that conversation.

Science!

Monday, January 7th, 2013

Today, I worked with science teachers on their performance tasks. Actually, I’ve been doing a lot of consulting this year on performance tasks, which is the hot new trend in assessment.

A performance task is an opportunity for students to demonstrate that they can independently apply the skills they’ve learned in a real-world context. So it’s like a post-test, only instead of multiple-choice questions, students have to do an authentic activity. Teachers examine the resulting student work with a rubric to measure whether or not students have learned the skills, and they can then use this information to plan future instruction. It’s much more effective than standardized-testing data in diagnosing student needs, though I do admit it is much more time-consuming.

This year, I’ve been working a lot with social studies and science teachers. Because of the Common Core shifts, these teachers are now required to teach literacy skills. There are no actual content standards in social studies or science in the Common Core; all of the standards for these subject areas are literacy standards. There are science content standards currently under development by Next Generation. When they are completed, states will have the option of adopting them in the same way they adopted Common Core. But until then, science content standards come from the states, and literacy standards from the Common Core are applied across the curriculum.

Now, I actually like the idea of literacy across the curriculum, but it is a big adjustment for science and social studies teachers, and so the schools where I consult have asked me to work with these teachers to help them infuse literacy skills into their curriculum and their assessments, particularly the performance tasks that New York City is requiring them to administer this year.

I have had a lot of experience working with social studies teachers in the past, but I’m probably working more with science teachers this year than I ever have before. And that’s fantastic, because I get the opportunity to learn a lot of new things. I also get the chance to yell “Science!” like Magnus Pyke a lot. No, I don’t really do that, but it would be fun.

One of the science teachers I worked with today swears by a website for an organization called Urban Advantage. It has some great resources for teaching middle-school science with an inquiry-based approach. I like the way that their materials scaffold scientific writing, which is my focus this year.

Another science teacher I worked with today showed me the PhET website, which has some really compelling interactive simulations in the sciences. I watched 7th-grade students run a simulation on density, in which they had to determine the mass and volume of various mystery substances and identify them from a list of materials and their densities.
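For readers who want to see the arithmetic behind that activity, here is a minimal sketch of the reasoning the students were doing: divide mass by volume, then find the closest match in a reference list. The function names and the reference table are my own assumptions for illustration, using approximate textbook densities, and are not taken from the PhET simulation itself.

```python
# Approximate densities in g/cm^3. This table is an illustrative assumption,
# not the list of materials used in the simulation.
KNOWN_DENSITIES = {
    "water": 1.00,
    "aluminum": 2.70,
    "iron": 7.87,
    "lead": 11.34,
    "gold": 19.30,
}


def density(mass_g, volume_cm3):
    """Return density in g/cm^3 from a measured mass and volume."""
    if volume_cm3 <= 0:
        raise ValueError("volume must be positive")
    return mass_g / volume_cm3


def identify_substance(mass_g, volume_cm3, table=KNOWN_DENSITIES):
    """Return the name in the table whose density is closest to the measurement."""
    measured = density(mass_g, volume_cm3)
    return min(table, key=lambda name: abs(table[name] - measured))


# A mystery block with a mass of 54 g and a volume of 20 cm^3.
mystery = identify_substance(mass_g=54.0, volume_cm3=20.0)
print(mystery)
```

Picking the closest match, rather than demanding an exact one, leaves room for the measurement error students will always have.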

Science!

Shakespeare Anagram: Hamlet

Saturday, November 10th, 2012

From Hamlet:

I cannot live to hear the news from England;
But I do prophesy the election lights
On Fortinbras.

Shift around the letters, and it becomes:

Math prognosticator Nate Silver predicted the whole state finishing roll, one-none.

Fun hobby!
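For anyone who wants to check the letters, here is a minimal sketch of an anagram checker. It ignores case, spaces, and punctuation and compares letter counts. The function names are hypothetical, chosen just for this example. Note that the letters only balance when the closing "Fun hobby!" is counted as part of the anagram.

```python
from collections import Counter


def letter_counts(text):
    """Count letters only, ignoring case, spaces, and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def is_anagram(first, second):
    """Two texts are anagrams if they use exactly the same letters."""
    return letter_counts(first) == letter_counts(second)


original = (
    "I cannot live to hear the news from England; "
    "But I do prophesy the election lights On Fortinbras."
)
rearranged = (
    "Math prognosticator Nate Silver predicted the whole state "
    "finishing roll, one-none. Fun hobby!"
)

print(is_anagram(original, rearranged))
```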

Shakespeare Anagram: Romeo and Juliet

Saturday, September 15th, 2012

From Romeo and Juliet:

But, let them measure us by what they will,
We’ll measure them a measure, and be gone.

Shift around the letters, and it becomes:

The melee damage-buy seems mutual where Rahm blew a test-result law by the union.

Blog Log

Tuesday, May 3rd, 2011

Last week, I participated in a blogging project sponsored by the Shakespeare Birthplace Trust, who encouraged bloggers to post about the influence Shakespeare has had on our lives. They’ve linked up all of our contributions on one page, and it’s worth checking out. Whether you’re a fan of Shakespeare or not, it’s exciting to read people who are passionate about something writing about how they became passionate about it.

Also, be sure to check out this fantastic song parody from Bardfilm. I missed it among all the birthday excitement, but found it again via a nod from the Shakespeare Geek.

In post-birthday blogging news, I’ve been asked to write a monthly post on using data for school improvement for both the company I work for and our partner organization. If you want to get a glimpse into what I actually do for a living – anagramming passages from Shakespeare doesn’t pay what it should – check out my first installment here or here.

A Choice to Make

Wednesday, April 13th, 2011

There is so much wrong with this article by Eric Hanushek that I fear that anything less than a line-for-line rebuttal will be woefully inadequate as a response. Out of consideration for my readers, I will refrain from providing one, and will rather try to focus on the most important points. Hanushek, of course, is the Stanford economist whose lurch into the field of education has driven much of the recent misguided effort towards “Reform” in today’s educational system. His article does a good job of summarizing his most crucial arguments, so it’s worth some time examining.

The title of the piece is “Valuing Teachers” and a brilliantly disingenuous title it is. Rather than using the word as we might use it (placing a high value on teachers), he is using it as an economist might (assessing the value of teachers). He is measuring how much teachers are worth. According to Hanushek, better teachers result in higher incomes for their students later in life. To make his case, he uses a series of unscientific leaps of logic that will yield easily to a few moments of rationality.

He notes that “a student with achievement (as measured by test performance in high school) that is one standard deviation above average can later in life expect to take in 10 to 15 percent higher earnings per year.” I have no reason to doubt his numbers.

But Hanushek is making the classic blunder of confusing correlation with causation. Do higher test scores in school directly cause higher incomes? Or is it possible that they may have common contributing factors? What about factors that the student brings in, such as intelligence, stamina, and motivation? Is it possible that parental income can be a factor in both standardized testing scores and future income? Hanushek’s famous value-added study attempted to isolate these factors, but he seems content to ignore them when citing this achievement/income connection.
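To make the point concrete, here is a small simulation sketch. Everything in it is invented for illustration: a single "background" factor (a stand-in for something like parental income) feeds both test scores and later income, and test scores have no direct effect on income at all. It says nothing about Hanushek's actual data. It only shows that a shared cause is enough to produce a healthy correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Generate made-up data where one hidden factor drives both outcomes."""
    rng = random.Random(seed)
    background = [rng.gauss(0, 1) for _ in range(n)]
    # Scores and income each depend on background plus independent noise.
    # Neither depends on the other.
    scores = [b + rng.gauss(0, 1) for b in background]
    income = [b + rng.gauss(0, 1) for b in background]
    return background, scores, income


background, scores, income = simulate()

# Correlation between scores and income, ignoring the shared cause.
raw = pearson(scores, income)

# Remove the shared cause (its coefficient is 1 in this toy model) and
# correlate what is left over.
adjusted = pearson(
    [s - b for s, b in zip(scores, background)],
    [i - b for i, b in zip(income, background)],
)

print(f"raw correlation: {raw:.2f}")
print(f"after removing the common factor: {adjusted:.2f}")
```

In this toy setup the raw correlation is substantial, and it all but vanishes once the common factor is taken out, even though scores never influenced income at any point.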

And, as Diana Senechal points out, “there is no evidence (as far as I know) that students in the highest percentiles in high school are those who made the greatest gains on their standardized tests over the years. In fact, I suspect that most of them did pretty well on those tests all along.”

Using future income as a measure of teacher quality is even more outrageous than using test scores. How much does a Stanford professor make compared to a Wall Street hedge fund manager? Is that a function of the quality of education they received? In the interest of full disclosure, I should mention that I make significantly less than LeBron James. Did he have better teachers?

Hanushek’s solution is to “contemplate asking 5 to 10 percent of teachers to find a job at which they are more effective so they can be replaced by teachers of average productivity.” (Note to my boss: if it should ever become necessary to fire me, I would request that you instead contemplate asking me to find a job at which I am more effective.)

Hanushek’s solution – fire the bad teachers – is very simple, but it requires several assumptions that I don’t think we should be so quick to grant.

Assumptions

  1. Standardized tests accurately measure student achievement.
  2. The teachers whose students don’t make progress on the tests are the bad teachers.
  3. There is a line of average teachers at the door waiting to be hired.
  4. No factors other than teacher quality are significant.

Peruse this list, and note that Hanushek’s plan falls apart if even one of these assumptions is false. In fact, they all are.

Assumption: Standardized tests accurately measure student achievement.

False. The tests that students are given are deeply flawed indeed. Many of the questions do not test what they purport to test, and test-taking itself has become its own skill set that schools ignore at their own peril. If we’re careful, we can use some of the results to identify areas in need of improvement. But the tests on the whole are way too idiosyncratic to use the overall scores as a basis for high-stakes decision making.

Assumption: The teachers whose students don’t make progress on the tests are the bad teachers.

False. In an August 2010 paper for the Economic Policy Institute, a team of highly distinguished education researchers laid out the case against the use of student test scores to evaluate teachers. Bottom line: It doesn’t work. Test scores are simply an ineffective statistical measure for identifying bad teachers. If you don’t find twenty pages of research from a panel of experts compelling, then you can read about this well-respected hard-working teacher who got slammed by a statistical formula.

Assumption: There is a line of average teachers at the door waiting to be hired.

False. In fact, teacher recruitment and retention is becoming a serious problem. A McKinsey study, Closing the Talent Gap, describes the decline in the teaching profession’s ability to compete in the labor market.

However, I suspect there is a bit of condescension towards the profession of teaching when we assume we can just go out and hire average teachers. The implication is that the average person would make an average teacher, rather than acknowledging that teaching requires a particular set of qualities (e.g., diligence, patience, intelligence, and a calling to want to do it) for someone to even be an average teacher. To glibly say that we can just fire the bad teachers and hire average ones is unintentionally insulting.

Assumption: No factors other than teacher quality are significant.

False. Hanushek anticipates this rebuttal, and is kind enough to provide examples of other factors that are not significant:

The initiatives we have emphasized in policy discussions—class-size reduction, curriculum revamping, reorganization of school schedule, investment in technology—all fall far short of the impact that good teachers can have in the classroom. Moreover, many of these interventions can be very costly.

Costly? I thought we were discussing what is most effective. Aren’t we having a national education crisis? Hanushek has moved past his role as researcher and now is making policy judgements. Danny Westneat argues effectively against the idea that class size is irrelevant, so I don’t have to. Teachers already know the importance of class size, and I suspect that the Reformers do as well. Similarly, other initiatives we take to improve education, costly or no, are based on research and accumulation of best practices. Even if we let Hanushek fire all of the bad teachers, we would still want to implement successful education initiatives. Sorry.

Neither side is happy with our current educational system. But Reformers seem to offer nothing but slapdash solutions that keep expenses low but ignore the facts on the ground. It seems, then, we have a choice to make. Do we want to have a public education system in this country? Many do not, and would rather see the free market take over education. Charter schools seem to be a first step in that direction, and I think the Reformers who tout them have become, wittingly or unwittingly, somewhat of a stalking horse for the movement against public education. Diane Ravitch, in her eloquent response to Waiting for Superman, discusses why charter schools aren’t the panacea they’re often held up as. She also discusses the impact of poverty on student achievement, and the dangers of ignoring it in the national discussion. Paying teachers more? Keeping class size down? Addressing the needs of high-poverty schools? It all seems so… costly.

That’s what it’s going to take, though. If we want a high-quality public education system, we’re going to have to pay for it. These may be troubled economic times, but really it’s just a question of priorities. If we’re going to have public education at all, we need to increase, not decrease, funding for it. We need to increase it by a lot. Reformer “solutions” only distract from the real issue. They want us to look at charter schools, but if we look closely enough, we’ll see that the most successful charter schools are able to spend much more per student than the public schools who are expected to emulate them.

And so, we must choose between abolishing public education and funding it adequately. Abolishing it is not really a choice at all, and would lead to an even worse crisis than we have now. But, if we can adjust our priorities and give our students the schools they deserve, then, as Dan Quayle said, “We are going to have the best educated American people in the world.” (Should we be blaming his teachers?)

It’s Funny Because It’s Not Funny

Sunday, March 6th, 2011

I recently saw a particularly poignant graffito etched on a friend’s Facebook wall:

A public union employee, a tea party activist and a CEO are sitting at a table with a plate of a dozen cookies in the middle of it. The CEO takes 11 of the cookies, turns to the tea partier and says, “Watch out for that union guy. He wants a piece of your cookie.”

And while this might easily refer to any number of anti-labor sentiments, it seems most appropriate as a reaction to the current – inexplicable – War on Teachers that has been raging in the media lately.

If you haven’t seen last Thursday’s Daily Show, you really need to go watch it. In a brilliant piece at the top of the show, Jon Stewart demonstrates the hypocrisy of the right-wing talking heads when talking about teachers. Later, he interviews education truth-teller Diane Ravitch, who lays out the rest of the argument.

If you want to understand the conversations surrounding education reform, then – as Tom Tomorrow says in this week’s strip – that’s all you need to know.

It’s a Poor Workman Who Blames Yogi Berra: Artificial Intelligence and Jeopardy!

Wednesday, February 23rd, 2011

Last week, an IBM computer named Watson beat Ken Jennings and Brad Rutter, the two greatest Jeopardy! players of all time, in a nationally televised event. The Man vs. Machine construct is a powerful one (I’ve even used it myself), as these contests have always captured progressive imaginations. Are humans powerful enough to build a rock so heavy, not even we can lift it?

Watson was named for Thomas J. Watson, IBM’s first president. But he could just as easily have been named after John B. Watson, the American psychologist who is considered to be the father of behaviorism. Behaviorism is a view of psychology that disregards the inner workings of the mind and focuses only on stimuli and responses. This input leads to that output. Watson was heavily influenced by the salivating dog experiments of Ivan Pavlov, and was himself influential in the operant conditioning experiments of B.F. Skinner. Though there are few strict behaviorists today, the movement was quite dominant in the early 20th century.

The behaviorists would have loved the idea of a computer playing Jeopardy! as well as a human. They would have considered it a validation of their theory that the mind could be viewed as merely generating a series of predictable outputs when given a specific set of inputs. Playing Jeopardy! is qualitatively different from playing chess. The rules of chess are discrete and unambiguous, and the possibilities are ultimately finite. As Noam Chomsky argues, language possibilities are infinite. Chess may one day be solved, but Jeopardy! never will be. So Watson’s victory here is a significant milestone.

Much has been made of whether or not the contest was “fair.” Well, of course it wasn’t fair. How could that word possibly have any meaning in this context? There are things computers naturally do much better than humans, and vice versa. The question instead should have been in which direction would the unfairness be decisive. Some complained that the computer’s superior buzzer speed gave it the advantage, but buzzer speed is the whole point.

Watson has to do three things before buzzing in: 1) understand what question the clue is asking, 2) retrieve that information from its database, and 3) develop a sufficient confidence level for its top answer. In order to achieve a win, IBM had to build a machine that could do those things fast enough to beat the humans to the buzzer. Quick reflexes are an important part of the game to be sure, but if that were the whole story, computers would have dominated quiz shows decades ago.

To my way of thinking, it’s actually the comprehensive database of information that gives Watson the real edge. We may think of Ken and Brad as walking encyclopedias, but that status was hard earned. Think of the hours upon hours they must have spent studying classical composers, vice-presidential nicknames, and foods that start with the letter Q. Even a prepared human might temporarily forget the Best Picture Oscar winner for 1959 when the moment comes, but Watson never will. (It was Ben-Hur.)

In fact, given what I could see, Watson’s biggest challenge seemed to be understanding what the clue was asking. To avoid the complications introduced by Searle’s Chinese Room thought experiment, we’ll adopt a behaviorist, pragmatic definition of “understanding” and take it to mean that Watson is able to give the correct response to a clue, or at least a reasonable guess. (After all, you can understand a question and still get it wrong.) Watching the show on television, we are able to see Watson’s top three responses, and his confidence level for each. This gives us remarkable insight into the machine’s process, allowing us a deeper level of analysis.

A lot of my own work lately has been in training school-based data inquiry teams how to examine testing data to learn where students need extra help, and that work involves examining individual testing items. So naturally, when I see three responses to a prompt, I want to figure out what they mean. In this case, Watson was generating the choices rather than simply choosing among them, but that actually makes them more helpful in sifting through his method.

One problem I see a lot in schools is that students are often unable to correctly identify what kind of answer the question is asking for. Inasmuch as Watson has what we would call a student learning problem, this is it. When a human is asked to come up with three responses to a clue, all of the responses would presumably be of the correct answer type. See if you can come up with three possible responses to this clue:

Category: Hedgehog-Podge
Clue: Hedgehogs are covered with quills or spines, which are hollow hairs made stiff by this protein

Watson correctly answered Keratin with a confidence rating of 99%, but his other two answers were Porcupine (36%) and Fur (8%). I would have expected all three candidate answers to be proteins, especially since the words “this protein” ended the clue. In many cases, the three potential responses seemed to reflect three possible questions being asked rather than three possible answers to a correct question, for example:

Category: One Buck or Less
Clue: In 2002, Eminem signed this rapper to a 7-figure deal, obviously worth a lot more than his name implies

Ken was first to the buzzer on this one and Alex confirmed the correct response, both men pronouncing 50 Cent as “Fiddy Cent” to the delight of humans everywhere. Watson’s top three responses were 50 Cent (39%), Marshall Mathers (20%), and Dr. Dre (14%). This time, the words “this rapper” prompted Watson to consider three rappers, but not three potential rappers that could have been signed by Eminem in 2002. It was Dr. Dre who signed Eminem, and Marshall Mathers is Eminem’s real name. So again, Watson wasn’t considering three possible answers to a question; he was considering three possible questions. And alas, we will never know if Watson would have said “Fiddy.”
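To make the answer-type idea concrete, here is a minimal Python sketch. It is emphatically not how Watson works. The type labels in ANSWER_TYPES are hand-assigned for illustration, and the names expected_type and matching_candidates are hypothetical. The sketch pulls the word after “this” out of a clue and keeps only the candidates whose label matches it.

```python
import re

# Hypothetical type labels, hand-assigned for illustration only.
ANSWER_TYPES = {
    "Keratin": "protein",
    "Porcupine": "animal",
    "Fur": "covering",
    "50 Cent": "rapper",
    "Marshall Mathers": "rapper",
    "Dr. Dre": "rapper",
}

HEDGEHOG_CLUE = (
    "Hedgehogs are covered with quills or spines, "
    "which are hollow hairs made stiff by this protein"
)
EMINEM_CLUE = (
    "In 2002, Eminem signed this rapper to a 7-figure deal, "
    "obviously worth a lot more than his name implies"
)


def expected_type(clue):
    """Return the word following the last 'this' in the clue, or None."""
    matches = re.findall(r"\bthis (\w+)", clue)
    return matches[-1] if matches else None


def matching_candidates(clue, candidates):
    """Keep only the candidates whose assumed label matches the clue's answer type."""
    wanted = expected_type(clue)
    return [name for name in candidates if ANSWER_TYPES.get(name) == wanted]


hedgehog_matches = matching_candidates(HEDGEHOG_CLUE, ["Keratin", "Porcupine", "Fur"])
eminem_matches = matching_candidates(
    EMINEM_CLUE, ["50 Cent", "Marshall Mathers", "Dr. Dre"]
)
```

On the hedgehog clue, only Keratin survives the check. On the Eminem clue, all three candidates survive, since all three are rappers. Matching the answer type is necessary, but it doesn’t tell you which question is actually being asked.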

It seemed as though the more confident Watson was in his first guess, the more likely the second and third guesses would be way off base:

Category: Familiar Sayings
Clue: It’s a poor workman who blames these

Watson’s first answer Tools (84%) was correct, but his other answer candidates were Yogi Berra (10%) and Explorer (3%). However Watson is processing these clues, it isn’t the way humans do it. The confidence levels seemed to be a pretty good predictor of whether or not a response was correct, which is why we can forgive Watson his occasional lapses into the bizarre. Yeah, he put down Toronto when the category was US Cities, but it was a Final Jeopardy, where answers are forced, and his multiple question marks were an indicator that his confidence was low. Similarly cornered in a Daily Double, he prefaced his answer with “I’ll take a guess.” That time, he got it right. I’m just looking into how the program works, not making excuses for Watson. After all, it’s a poor workman who blames Yogi Berra.

But the fact that Watson interpreted so many clues accurately was impressive, especially since Jeopardy! clues sometimes contain so much wordplay that even the sharpest of humans need an extra moment to unpack what’s being asked, and understanding language is our thing. Watson can’t hear the other players, which means he can’t eliminate their incorrect responses when he buzzes in second. It also means that he doesn’t learn the correct answer unless he gives it, which makes it difficult for him to catch on to category themes. He managed it pretty well, though. After stumbling blindly through the category “Also on Your Computer Keys,” Watson finally caught on for the last clue:

Category: Also on Your Computer Keys
Clue: Proverbially, it’s “where the heart is”

Watson’s answers were Home is where the heart is (20%), Delete Key (11%), and Elvis Presley quickly changed to Encryption (8%). The fact that Watson was considering “Delete Key” as an option means that he was starting to understand that all of the correct responses in the category were also names of keys on the keyboard.

Watson also is not emotionally affected by game play. After giving the embarrassingly wrong answer “Dorothy Parker” when the Daily Double clue was clearly asking for the title of a book, Watson just jumped right back in like nothing had happened. A human would likely have been thrown by that. And while Alex and the audience may have laughed at Watson’s precise wagers, that was a cultural expectation on their part. There’s no reason a wager needs to be rounded off to the nearest hundred, other than the limitations of human mental calculation under pressure. This wasn’t a Turing test. Watson was trying to beat the humans, not emulate them. And he did.

So where does that leave us? Computers that can understand natural language requests and retrieve information accurately could make for a very interesting decade to come. As speech recognition improves, we might start to see computers that can hold up their end of a conversation. Watson wasn’t hooked up to the Internet, but developing technologies could be. The day may come when I have a Bluetooth headset hooked up to my smart phone and I can just ask it questions like the computer on Star Trek. As programs get smarter about interpreting language, it may be easier to make connections across ideas, creating a new kind of Web. One day, we may even say “Thank you, Autocorrect.”

It’s important to keep in mind, though, that these will be human achievements. Humans are amazing. Humans can organize into complex societies. Humans can form research teams and develop awesome technologies. Humans can program computers to understand natural language clues and access a comprehensive database of knowledge. Who won here? Humanity did.

Ken Jennings can do things beyond any computer’s ability. He can tie his shoes, ride a bicycle, develop a witty blog post comparing Proust translations, appreciate a sunset, write a trivia book, raise two children, and so on. At the end of the tournament, he walked behind Watson and waved his arms around to make it look like they were Watson’s arms. That still takes a human.

UPDATE: I’m told (by no less of an authority than Millionaire winner Ed Toutant) that Watson was given the correct answer at the end of every clue, after it was out of play. I had been going crazy wondering where “Delete Key” came from, and now it makes a lot more sense. Thanks, Ed!

Accountability

Tuesday, February 1st, 2011

I was talking to my graduate students about the literacy standards last night, and predictably got pulled off on a tangent about accountability. I found myself making a point that I’ve alluded to before, but it’s worth making explicit now.

Robert Benchley famously said “There are two kinds of people in the world: those who divide the world into two kinds of people, and those who don’t.” I will put myself in the former category when I say that, generally, there are two kinds of people who talk about standards and accountability.

The first believes that anything worth doing is worth doing well. In order to make sure we’re doing the best job we can, it’s important to measure our results, so we can identify areas for potential improvement and apply strategies for intervention where they will do the most good.

The second believes that taxpayer-funded education is one of the evils of socialism and must be eradicated. In order to make the necessary changes, evidence must be gathered that the public education system is a failure, so that arguments to turn education over to the free market will be more persuasive.

And my point was that, when you hear someone talking about standards and accountability, it’s important to know which of these two groups that person is in.

Item of the Week

Monday, January 17th, 2011

In this somewhat new blog feature, I will offer up a question from the statewide examinations that New York City students take each year. The purpose of this will not be for you to try to provide the correct answer, but rather to join me in examining the question. What does it tell us about student understanding? What does each of the wrong answers mean? What is this question testing? What is it really testing? What would students need to know and be able to do to answer this question correctly?

I gave a workshop for data teams on Friday. Three of the groups were examining last year’s 4th grade ELA scores, which I knew meant that we’d be talking about Abigail. In my visits to schools, I’ve found that students who took this exam had a lot of trouble on questions relating to this poem:

Students had trouble on a number of the questions, but we will just look at one: Item 21 on the 2010 New York State Grade 4 ELA Exam:



The intended performance indicator is “Make predictions, draw conclusions, and make inferences about events and characters,” but we can be the judge of that.

What is this question testing? Does it fit the performance indicator? Which of the wrong answers would you predict students would choose the most often? Why? What would students need to know and be able to do to answer this question correctly?