Archive for the 'Game Theory' Category

It’s a Poor Workman Who Blames Yogi Berra: Artificial Intelligence and Jeopardy!

Wednesday, February 23rd, 2011

Last week, an IBM computer named Watson beat Ken Jennings and Brad Rutter, the two greatest Jeopardy! players of all time, in a nationally televised event. The Man vs. Machine construct is a powerful one (I’ve even used it myself), as these contests have always captured progressive imaginations. Are humans powerful enough to build a rock so heavy, not even we can lift it?

Watson was named for Thomas J. Watson, IBM’s first president. But he could just as easily have been named after John B. Watson, the American psychologist who is considered to be the father of behaviorism. Behaviorism is a view of psychology that disregards the inner workings of the mind and focuses only on stimuli and responses. This input leads to that output. Watson was heavily influenced by the salivating dog experiments of Ivan Pavlov, and was himself influential in the operant conditioning experiments of B.F. Skinner. Though there are few strict behaviorists today, the movement was quite dominant in the early 20th century.

The behaviorists would have loved the idea of a computer playing Jeopardy! as well as a human. They would have considered it a validation of their theory that the mind could be viewed as merely generating a series of predictable outputs when given a specific set of inputs. Playing Jeopardy! is qualitatively different from playing chess. The rules of chess are discrete and unambiguous, and the possibilities are ultimately finite. As Noam Chomsky argues, language possibilities are infinite. Chess may one day be solved, but Jeopardy! never will be. So Watson’s victory here is a significant milestone.

Much has been made of whether or not the contest was “fair.” Well, of course it wasn’t fair. How could that word possibly have any meaning in this context? There are things computers naturally do much better than humans, and vice versa. The question instead should have been in which direction the unfairness would be decisive. Some complained that the computer’s superior buzzer speed gave it the advantage, but buzzer speed is the whole point.

Watson has to do three things before buzzing in: 1) understand what question the clue is asking, 2) retrieve that information from its database, and 3) develop a sufficient confidence level for its top answer. In order to achieve a win, IBM had to build a machine that could do those things fast enough to beat the humans to the buzzer. Quick reflexes are an important part of the game to be sure, but if that were the whole story, computers would have dominated quiz shows decades ago.

To my way of thinking, it’s actually the comprehensive database of information that gives Watson the real edge. We may think of Ken and Brad as walking encyclopedias, but that status was hard earned. Think of the hours upon hours they must have spent studying classical composers, vice-presidential nicknames, and foods that start with the letter Q. Even a prepared human might temporarily forget the Best Picture Oscar winner for 1959 when the moment comes, but Watson never will. (It was Ben-Hur.)

In fact, given what I could see, Watson’s biggest challenge seemed to be understanding what the clue was asking. To avoid the complications introduced by Searle’s Chinese Room thought experiment, we’ll adopt a behaviorist, pragmatic definition of “understanding” and take it to mean that Watson is able to give the correct response to a clue, or at least a reasonable guess. (After all, you can understand a question and still get it wrong.) Watching the show on television, we are able to see Watson’s top three responses, and his confidence level for each. This gives us remarkable insight into the machine’s process, allowing us a deeper level of analysis.

A lot of my own work lately has been in training school-based data inquiry teams how to examine testing data to learn where students need extra help, and that work involves examining individual testing items. So naturally, when I see three responses to a prompt, I want to figure out what they mean. In this case, Watson was generating the choices rather than simply choosing among them, but that actually makes them more helpful in sifting through his method.

One problem I see a lot in schools is that students are often unable to correctly identify what kind of answer the question is asking for. In as much as Watson has what we would call a student learning problem, this is it. When a human is asked to come up with three responses to a clue, all of the responses would presumably be of the correct answer type. See if you can come up with three possible responses to this clue:

Category: Hedgehog-Pogde
Clue: Hedgehogs are covered with quills or spines, which are hollow hairs made stiff by this protein

Watson correctly answered Keratin with a confidence rating of 99%, but his other two answers were Porcupine (36%) and Fur (8%). I would have expected all three candidate answers to be proteins, especially since the words “this protein” ended the clue. In many cases, the three potential responses seemed to reflect three possible questions being asked rather than three possible answers to a correct question, for example:

Category: One Buck or Less
Clue: In 2002, Eminem signed this rapper to a 7-figure deal, obviously worth a lot more than his name implies

Ken was first to the buzzer on this one and Alex confirmed the correct response, both men pronouncing 50 Cent as “Fiddy Cent” to the delight of humans everywhere. Watson’s top three responses were 50 Cent (39%), Marshall Mathers (20%), and Dr. Dre (14%). This time, the words “this rapper” prompted Watson to consider three rappers, but not three potential rappers that could have been signed by Eminem in 2002. It was Dr. Dre who signed Eminem, and Marshall Mathers is Eminem’s real name. So again, Watson wasn’t considering three possible answers to a question; he was considering three possible questions. And alas, we will never know if Watson would have said “Fiddy.”

It seemed as though the more confident Watson was in his first guess, the more likely the second and third guesses would be way off base:

Category: Familiar Sayings
Clue: It’s a poor workman who blames these

Watson’s first answer Tools (84%) was correct, but his other answer candidates were Yogi Berra (10%) and Explorer (3%). However Watson is processing these clues, it isn’t the way humans do it. The confidence levels seemed to be a pretty good predictor of whether or not a response was correct, which is why we can forgive Watson his occasional lapses into the bizarre. Yeah, he put down Toronto when the category was US Cities, but it was a Final Jeopardy, where answers are forced, and his multiple question marks were an indicator that his confidence was low. Similarly cornered in a Daily Double, he prefaced his answer with “I’ll take a guess.” That time, he got it right. I’m just looking into how the program works, not making excuses for Watson. After all, it’s a poor workman who blames Yogi Berra.

But the fact that Watson interpreted so many clues accurately was impressive, especially since Jeopardy! clues sometimes contain so much wordplay that even the sharpest of humans need an extra moment to unpack what’s being asked, and understanding language is our thing. Watson can’t hear the other players, which means he can’t eliminate their incorrect responses when he buzzes in second. It also means that he doesn’t learn the correct answer unless he gives it, which makes it difficult for him to catch on to category themes. He managed it pretty well, though. After stumbling blindly through the category “Also on Your Computer Keys,” Watson finally caught on for the last clue:

Category: Also on Your Computer Keys
Clue: Proverbially, it’s “where the heart is”

Watson’s answers were Home is where the heart is (20%), Delete Key (11%), and Elvis Presley quickly changed to Encryption (8%). The fact that Watson was considering “Delete Key” as an option means that he was starting to understand that all of the correct responses in the category were also names of keys on the keyboard.

Watson also is not emotionally affected by game play. After giving the embarrassingly wrong answer “Dorothy Parker” when the Daily Double clue was clearly asking for the title of a book, Watson just jumped right back in like nothing had happened. A human would likely have been thrown by that. And while Alex and the audience may have laughed at Watson’s precise wagers, that was a cultural expectation on their part. There’s no reason a wager needs to be rounded off to the nearest hundred, other than the limitations of human mental calculation under pressure. This wasn’t a Turing test. Watson was trying to beat the humans, not emulate them. And he did.

So where does that leave us? Computers that can understand natural language requests and retrieve information accurately could make for a very interesting decade to come. As speech recognition improves, we might start to see computers who can hold up their end of a conversation. Watson wasn’t hooked up to the Internet, but developing technologies could be. The day may come when I have a bluetooth headset hooked up to my smart phone and I can just ask it questions like the computer on Star Trek. As programs get smarter about interpreting language, it may be easier to make connections across ideas, creating a new kind of Web. One day, we may even say “Thank you, Autocorrect.”

It’s important to keep in mind, though, that these will be human achievements. Humans are amazing. Humans can organize into complex societies. Humans can form research teams and develop awesome technologies. Humans can program computers to understand natural language clues and access a comprehensive database of knowledge. Who won here? Humanity did.

Ken Jennings can do things beyond any computer’s ability. He can tie his shoes, ride a bicycle, develop a witty blog post comparing Proust translations, appreciate a sunset, write a trivia book, raise two children, and so on. At the end of the tournament, he walked behind Watson and waved his arms around to make it look like they were Watson’s arms. That still takes a human.

UPDATE: I’m told (by no less of an authority than Millionaire winner Ed Toutant) that Watson was given the correct answer at the end of every clue, after it was out of play. I had been going crazy wondering where “Delete Key” came from, and now it makes a lot more sense. Thanks, Ed!

Conundrum: Russian Roulette

Tuesday, January 25th, 2011

In Russian Roulette, a six-chambered revolver is loaded with one round, the cylinder is spun to place the round in a random position, and participants take turns pointing the gun to their heads and pulling the trigger until one player loses.

Imagine you are playing this game (for whatever reason) with one other person, but do not wish to die.

1. Assume there is one round and the cylinder is spun only once, at the beginning of the game. Is it better to go first or second?

2. Assume there is one round and the cylinder is spun after each player’s turn. Is it better to go first or second?

3. Assume there are two rounds in random position and the cylinder is spun only once, at the beginning of the game. Is it better to go first or second?

4. Assume there are two rounds in random position. The first player shoots an empty chamber. You have the option of shooting the gun as is, or spinning the cylinder first. Which do you choose?

5. Assume there are two rounds in a random position – but you are told that the two rounds are in consecutive chambers. The first player shoots an empty chamber. You have the option of shooting the gun as is, or spinning the cylinder first. Which do you choose?

6. Assume there are two rounds in a random position – but you are told that the two rounds are in consecutive chambers. The cylinder is spun only once, at the beginning of the game. Is it better to go first or second?
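If you’d rather verify your answers to the first two questions than derive them on paper, a quick Monte Carlo sketch might look like this (the function name, structure, and trial count are arbitrary choices of mine, not the only way to model the game):

```python
# Monte Carlo check for questions 1 and 2: one round in a
# six-chambered revolver, two players alternating trigger pulls.
import random

def first_player_loss_rate(respin, trials=200_000):
    """Fraction of games in which the player who goes first loses."""
    losses = 0
    for _ in range(trials):
        bullet = random.randrange(6)   # the spin places the round at random
        chamber = 0
        player = 0                     # 0 = goes first, 1 = goes second
        while chamber != bullet:       # empty chamber: survive, pass the gun
            chamber += 1
            if respin:                 # question 2: spin after every turn
                bullet = random.randrange(6)
                chamber = 0
            player ^= 1
        if player == 0:
            losses += 1
    return losses / trials

print(first_player_loss_rate(respin=False))  # question 1
print(first_player_loss_rate(respin=True))   # question 2
```

Questions 3 through 6 can be checked the same way by tracking two bullet positions instead of one.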

These are pure probability questions, for entertainment purposes only. Shakespeare Teacher in no way condones the use of firearms in this manner.

Conundrum: Solved Games

Tuesday, December 11th, 2007

A game is considered to be “solved” when all of the possible moves have been mapped out in a mathematical tree and thus the perfect set of moves can be determined regardless of an opponent’s play.

Tic-Tac-Toe is a pretty easy one. You solved this as a kid. There are three opening moves – corner, edge, center. And then you work from there.
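To make “you work from there” concrete, here’s a sketch of what solving a game by brute force actually means: a minimal minimax search over the full Tic-Tac-Toe game tree (illustrative Python; the board representation is my own, not any official solver’s):

```python
# Exhaustively solve Tic-Tac-Toe by minimax.
# Board is a tuple of 9 cells: 'X', 'O', or ' '. X moves first.
from functools import lru_cache

LINES = [(0,1,2), (3,4,5), (6,7,8),   # rows
         (0,3,6), (1,4,7), (2,5,8),   # columns
         (0,4,8), (2,4,6)]            # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value under perfect play: +1 X wins, -1 O wins, 0 draw."""
    w = winner(board)
    if w:
        return 1 if w == 'X' else -1
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if not moves:
        return 0  # board full with no winner: draw
    nxt = 'O' if player == 'X' else 'X'
    results = [value(board[:i] + (player,) + board[i+1:], nxt)
               for i in moves]
    return max(results) if player == 'X' else min(results)

print(value((' ',) * 9, 'X'))  # 0: perfect play from both sides is a draw
```

The same tree-search idea applies to Connect Four and checkers; the only thing that changes is how astronomically large the tree gets.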

Connect Four was solved in 1988. That’s because those new-fangled computer thingies were starting to get some real power behind them. If you want to play Connect Four against the best opponent you’ve ever played in your life, check out the applet on John’s Connect Four Playground which is programmed to play flawlessly, based on a database of pre-determined best moves. But if you go first, and play just as flawlessly, you can beat it.

Checkers was solved this past April by researchers from the University of Alberta. You can play against Chinook, which will play flawlessly, but the best you can hope for is a draw. It doesn’t matter how amazingly good you are at checkers. You will never win. For me, there’s something a little disturbing about that.

Could chess be next? There are an incredibly large number of possible games, but it must be finite. And if it’s finite, then the tree must conceptually exist even if nobody has been able to come close to mapping it yet. Some see chess playing ability as intuitive and creative, and not merely a number crunching process. But if number crunching continues to get better, it might evolve to the point where we get a chess-playing program as unbeatable as Chinook.

To be clear, we’re not talking about a really, really good chess-playing program. We have that now. We’re talking about a program that can access an exhaustive database of pre-determined best moves in order to ensure the most favorable outcome possible.

What do you think?

Will computers ever solve chess?

Conundrum: A Fair Deal

Tuesday, June 5th, 2007

I often like to come up with games of chance. There have been times in my life when this has been profitable, but mostly I’m just interested in questions of statistics and probability.

I had considered the math behind putting together my own Deal or No Deal style game, but with greatly reduced suitcase amounts and with a cost to play. Determining a fair cost (one which I would agree to if I were the player or the banker) at first seems like a hopelessly difficult problem, but the math is actually quite simple. The player has the option of keeping the initial suitcase until the end, and the banker has the option of offering whatever small amount he wants. At any given time the chosen suitcase is worth the average of all unopened cases. The banker certainly isn’t going to offer more, and if the player accepts less it’s just because he’s hedging his bets. The cost to play should be the average of all of the cases, whatever they may be.

A couple of months ago, while discussing the Two Envelopes problem, we briefly discussed what’s known as the Monty Hall problem, after the host of Let’s Make A Deal. Thinking of that problem has inspired another gambling proposition which is this week’s Conundrum.

Let’s continue to call our two gamblers the banker and the player. The banker has three boxes and hides a $10 bill in one of the boxes and a $1 bill in each of the other two. The player pays a set amount to the banker and chooses one of the three boxes. The banker must then open one of the other two boxes and show the player a $1 bill. Then the player can decide whether to keep the contents of the box he chose or switch to the other unopened box.

What would be the fair amount for the player to pay the banker to play this game?
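For anyone who wants to check an answer empirically before peeking at the comments, here’s a rough Monte Carlo sketch of the game as described above (the function name and trial count are my own choices):

```python
# Simulate the three-box game: one box holds $10, two hold $1.
# The banker always opens an unchosen box containing a $1 bill.
import random

def expected_value(switch, trials=200_000):
    """Average winnings for a player who always stays or always switches."""
    total = 0
    for _ in range(trials):
        boxes = [10, 1, 1]
        random.shuffle(boxes)
        pick = random.randrange(3)
        # banker reveals a $1 bill from one of the other two boxes
        reveal = next(i for i in range(3) if i != pick and boxes[i] == 1)
        if switch:
            pick = next(i for i in range(3) if i not in (pick, reveal))
        total += boxes[pick]
    return total / trials

print(expected_value(switch=False))  # always keep the first box
print(expected_value(switch=True))   # always switch
```

The fair price would then be the expected value under whichever strategy the player is best off committing to.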

UPDATE: Question solved by David. See comments for the answer.

Conundrum: Two Boxes

Tuesday, April 17th, 2007

Researchers in Germany are working on a way to predict the intentions of human subjects by observing their brain activity. Damn!

For some reason it’s a little disturbing to me that something as personal and ephemeral as an intention can have a physiological manifestation that can be measured. Or maybe I’m just disturbed that they are now starting to measure it. What new “mind reading” technologies might be developed from this science? Could it become prosecutable to merely intend to commit a crime? Intent is already used as a legal concept, and attempted murder is considered a crime, even if nobody is hurt as a result. Could market researchers measure the intent of potential consumers? Will we one day have little handheld devices that can measure intent at a poker table or when our daughter’s date arrives to pick her up?

It all reminds me of a thought experiment made popular by Robert Nozick, which will be this week’s Conundrum. Before we get to it, though, it might be helpful to consider another thought experiment known as Kavka’s Toxin.

Let’s say I offer you $100,000 if you can form an intention to drink a particular toxin. This toxin will make you violently ill for about five or six hours, after which you will be perfectly fine. You’d drink it for the money, but you’re not being asked to drink it. You’re being asked to intend to drink it. After you have the money, you are free to change your mind and not drink it. The question is, can you actually form a genuine intention of doing something unpleasant that you will have no motivation to do?

Turn that one over in your mind for a few moments before moving on to this week’s Conundrum, Newcomb’s Problem.

Imagine there are two boxes, Box A and Box B. You will have the option of choosing to take both boxes, or to take Box B alone. You will keep what you find inside. Box A is transparent and contains one thousand dollars. Box B is opaque. A super-intelligent alien scientist with a proven track record of accurately predicting human behavior has analyzed you and has secretly made a prediction about which you will choose. If he believes you will choose Box B alone, he has put one million dollars inside. If he believes you will take both boxes, then he has left Box B empty. Which do you choose?

The super-intelligent scientist has run this trial with several hundred other humans, and has made a correct prediction each time. The only people who have ended up with the million are the ones who chose Box B alone. On the other hand, our alien friend has already made his prediction and left. Your choice can no longer affect the amounts that are in the boxes. You may as well take them both, right?

Fans of game theory might recognize this as a variation of the Prisoner’s Dilemma. Game theory would likely suggest that you flip a coin, so we’re going to disallow that option. You must rely on reasoning alone.

Unlike last week’s math puzzler, this one doesn’t have a right or wrong answer. It’s a thought experiment designed to test your conceptions of free will vs. determinism.

Or as Nozick put it:

To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly.

It will be interesting to hear how people answer this.

Will you take both boxes, or Box B alone?

Feel free to answer the question, or continue the discussion of any of the topics covered above.

Conundrum: Two Envelopes

Tuesday, April 10th, 2007

I overheard this once on a train and was never able to figure it out. Maybe someone here can help me.

Imagine I have two envelopes and I tell you truthfully that both contain money and that one envelope contains twice as much money as the other. I offer you your choice of envelope and you choose one of them without opening it.

Now I ask you if you would like to switch envelopes. You chose yours randomly, so it’s a 50/50 chance whether the other envelope contains half as much or twice as much. So, if the amount you now have is x, there’s a 50 percent chance that switching would get you 2x and a 50 percent chance it will get you x/2. You have twice as much to gain as you have to lose, regardless of how much is in your envelope, so it makes sense mathematically to switch envelopes.

But of course, this is ridiculous, since you have no new information about the two envelopes than you had before. Once you’ve made that switch, by the same logic, you should want to switch again. This much seems obvious. So where’s the flaw in the math above?
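One way to convince yourself that the always-switch argument must be flawed, even before locating the flaw, is to simulate the game with the amounts fixed in advance (a quick sketch; the $100/$200 pair is an arbitrary choice of mine):

```python
# Two envelopes with fixed contents: one holds `small`, the other 2*small.
# If the always-switch argument were sound, the switcher would average more.
import random

def trial(switch, small=100):
    envelopes = [small, 2 * small]
    random.shuffle(envelopes)       # your initial pick is random
    return envelopes[1] if switch else envelopes[0]

trials = 200_000
ev_keep = sum(trial(False) for _ in range(trials)) / trials
ev_switch = sum(trial(True) for _ in range(trials)) / trials
print(ev_keep, ev_switch)  # both approach 150: switching gains nothing
```

Whatever the flaw in the verbal argument is, the expectation of switching and keeping come out identical once the amounts are pinned down.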

By the way, I consulted our good friend Wikipedia before posting this, and it was little help. It just mumbled something about Bayesian Decision Theory and said the problem would be easy if I were a mathematician. It then went on to pose a harder problem in which you can look inside one of the envelopes, and an even harder problem that was way over my head at 5:30 am. Thanks, Wikipedia.

The Prisoner’s Dilemma

Wednesday, February 28th, 2007

Via Prospero’s Books, I found this article about robots being used to simulate evolution. I’ve read about similar projects simulating evolution through competing artificial intelligence programs, using the “Prisoner’s Dilemma” scenario as the competitive task. The Prisoner’s Dilemma, for those who are unfamiliar, breaks down as some variation of this:

You and a partner are both correctly arrested for two crimes, one major and one minor, and are put in separate rooms. Executive Assistant District Attorney Jack McCoy comes to visit you and offers you a deal: testify against your partner for the major crime, your partner will get twenty years, and you’ll walk for both crimes. However, his lovely assistant is right now offering the same deal to your partner. If you both confess, you’ll both get five years. If your partner confesses and you don’t, you’ll get the twenty, and he’ll walk. If neither of you confess, McCoy can’t make his case for the major crime, but he’ll make sure you both do two years for the minor one. What’s the right play?

Well, logically speaking, regardless of what your partner ends up doing, you’re better off confessing. But if you both confess, you both end up worse off than if you had both kept your mouths shut. If you had had the chance to communicate with each other, you might have chosen differently. The fact that you don’t know what your idiot partner is going to do while gazing into the eyes of the lovely ADA means that you can’t afford to take any chances, and neither can he. You both end up doing the nickel, even though neither of you had to.

In this example, you only get to play the game once. If you play some version of the Prisoner’s Dilemma with the same person repeatedly, your choices can affect future outcomes. In a sense, the choices you make are a form of communication. Only the very last time you play do you revert back to the original cutthroat scenario. (And since everybody knows this will be the case, the next-to-last iteration can also be cutthroat. How far back does this reasoning work?) There is actually a twenty-year-old Iterated Prisoner’s Dilemma competition for artificial intelligence programs, and the winning strategy has long been the simple Tit-for-Tat. But there’s now a new champion, though it seems to me to be a bit of a cheat. Read the article and let me know what you think.
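For readers who want to experiment, here’s a minimal sketch of an iterated match between Tit-for-Tat and an always-confess strategy, using the sentences from the Jack McCoy scenario above (the strategy interface is my own simplification; lower total years is better):

```python
# Iterated Prisoner's Dilemma with the sentences from the story:
# 'C' = stay silent (cooperate with your partner), 'D' = confess.
YEARS = {('C', 'C'): (2, 2),     # both silent: two years each
         ('C', 'D'): (20, 0),    # you silent, partner confesses
         ('D', 'C'): (0, 20),
         ('D', 'D'): (5, 5)}     # both confess: five years each

def tit_for_tat(opponent_history):
    # cooperate first, then copy the opponent's previous move
    return opponent_history[-1] if opponent_history else 'C'

def always_defect(opponent_history):
    return 'D'

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b = [], []      # each strategy sees the OTHER's history
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        years_a, years_b = YEARS[(a, b)]
        total_a += years_a
        total_b += years_b
        hist_a.append(a)
        hist_b.append(b)
    return total_a, total_b

print(play(tit_for_tat, tit_for_tat))    # mutual cooperation: (200, 200)
print(play(tit_for_tat, always_defect))  # TFT burned once, then matches
```

Against itself, Tit-for-Tat settles into mutual silence; against a pure confessor, it takes the sucker’s payoff exactly once and then defends itself, which is a big part of why it held up so well in tournaments.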

The Prisoner’s Dilemma is an illustration of one of the central concepts of a branch of mathematics called “game theory.” Game theory allows us to make mathematical computations in decision making, even when all of the factors are not known. Think of two generals, one trying to choose a target to attack, the other deciding how to deploy defensive forces. Each knows the other is intelligent and out there making his decision. That’s game theory. If you were to meet someone anywhere in the world outside of the United States, but you couldn’t plan with that person ahead of time, where would you go? Would it surprise you to learn that almost everyone makes the same choice? (Post your answer in the comments section, if you like.) That’s game theory too.

With a branch of mathematics that can take unknown variables into account, a computer’s functionality can be increased significantly. Obviously computers that are powerful enough can play chess, but game theory allows them to play poker as well. There’s already a Texas Hold ‘Em Tournament for Artificial Intelligence programs. Imagine putting all of these programs into a giant simulated Texas Hold ‘Em Tournament where the losing programs died out and the winning programs created offspring with the possibility of mutation. We might evolve the ultimate strategy. And when we do, the first round of drinks are on me!

But as computers get more powerful, imagine other simulations we may be able to run, and what understandings we might be able to gain from these experiments. Evolution has proved itself to be a mighty force in the past. Once all of the data from Web 2.0 is compiled, maybe it will be allowed to evolve into Web 3.0. It’s not about computers becoming super-sentient and ruling over humans. It’s about humans developing and using new tools that can increase our capacity for growth. And if evolution has taught us nothing else, it has taught us that.