Socialize Me: Computers vs Humans at Jeopardy!

Friday, February 11, 2011

Computers vs Humans at Jeopardy! - Are You Ready?

I've been anxiously waiting for this. We are just days away from the grand event where IBM's supercomputer (a.k.a. Watson) will take on previous Jeopardy! champions head-on. Last month, there was a practice round between Watson and the former champions, and Watson won. I can't wait to see who wins the real competition for $1 million.

On Wednesday of this week, PBS aired a special called NOVA: Smartest Computer on Earth. If you missed it, check out the live blog replay. One of the interesting moments was a question the NOVA show put to the IBM researchers who created Watson: were they scared about how Watson learns (Skynet anyone?). The researchers said they are not too worried because they know how it learns and Watson can only do what it was programmed to do.

IBM's Researchers in Conjunction with Universities

Earlier today, an IBM press release announced that Watson was created not only by IBM's awesome research team but also in partnership with several universities, including MIT, Carnegie Mellon, RPI, and UMass, among others. I'm a big fan of MIT and artificial intelligence, so when I read that MIT was involved my interest was piqued even more.

A team of researchers from Massachusetts Institute of Technology, led by Boris Katz, principal research scientist at MIT's Computer Science and Artificial Intelligence Laboratory, pioneered an online natural language question answering system called START, which has the ability to answer questions with high precision using information from semi-structured and structured information repositories. The underlying contribution to the Watson system is the ability to break down the question into simple sub-questions for responses to be quickly collected and then fused back together to come up with an answer. The Watson system architecture also took advantage of the object-property-value data model pioneered by MIT, which enables the information in semi-structured data sources to be effectively retrieved in response to natural language questions.
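To make that a bit more concrete for myself, here is a tiny Python sketch of the two ideas in that paragraph: storing facts as object-property-value entries, and answering a nested question by solving the inner sub-question first and feeding its answer into the outer one. This is only my own toy illustration, not START's or Watson's actual code. The FACTS table and the lookup and answer_nested functions are hypothetical names I made up, and the real systems parse natural language rather than taking the pieces pre-split like this.

```python
# Toy object-property-value store. The entries and names are illustrative
# assumptions, not data or code from START or Watson.
FACTS = {
    ("Massachusetts", "capital"): "Boston",
    ("Boston", "founded"): "1630",
}


def lookup(obj, prop):
    """Return the value stored for (object, property), or None if unknown."""
    return FACTS.get((obj, prop))


def answer_nested(outer_prop, inner_prop, obj):
    """Answer 'What is the <outer_prop> of the <inner_prop> of <obj>?'.

    The question is split into two sub-questions: the inner one is answered
    first, and its answer becomes the object of the outer one.
    """
    inner_answer = lookup(obj, inner_prop)
    if inner_answer is None:
        return None  # the inner sub-question could not be answered
    return lookup(inner_answer, outer_prop)


if __name__ == "__main__":
    # "When was the capital of Massachusetts founded?"
    print(answer_nested("founded", "capital", "Massachusetts"))
```

If either sub-question has no entry in the table, the function returns None instead of guessing, which is roughly the "Sorry, I don't know" behavior you would expect from a system like this.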

Learning More From MIT

I was lucky enough to get a chance to "sit down" with Boris Katz last night. As we kicked off the call, we found that we have tons of things in common. It turns out that I used his technology as part of my thesis work at MIT. It also turns out that Ask Jeeves has licensed his patent, and Ask Jeeves was also the company that acquired the start-up I worked at shortly before joining IBM. So we are pretty sure that we at least worked in the same building at one point at MIT.

So this brought up the question, when did START (SynTactic Analysis using Reversible Transformations) go live? I learned that START went online in 1993 about the time the web started. Mr. Katz said that he immediately got value out of it as people from all over the world were using it and he was getting valuable research data for free (crowdsourcing at its best 18 years ago!!). As I got ready to interview Mr. Katz, I played with START to see how it worked. Here's what I asked and the responses from the computer:

Q: Who is better the Red Sox or the Yankees?
A: Sorry - I don't know whether or not the Red Sox are better than the yankees.
Q: What is the weather for NYC this Saturday?
A: It gave me real-time info from weather.com
Q: Who is luis benitez?
A: I am sorry to say I don't know who Luis Benitez is. (Notice that the name is capitalized, so it detected my name but just didn't know who I was. I also detect some humor in its answers.)

I asked Mr. Katz who approached whom when it came time to incorporate START into Watson. It turns out IBM approached him, as IBM was trying to replicate results that other companies claimed to have achieved in the area of natural language processing. Since other companies weren't giving IBM their code, IBM, in conjunction with other institutions and universities, decided to create the Open Advancement of Question Answering Initiative (OAQA). That's where Mr. Katz got involved, giving talks and demos to IBM researchers and software engineers.

Clearly START provided some great intellectual capital to Watson, so I asked Mr. Katz if he had to change some of the code for it to be integrated into Watson. It turns out that Mr. Katz didn't provide the code since, he says, IBM's software engineers are far superior to undergraduates who barely work a couple of hours a week on it. (funny)

Looking Towards the Future

Watson and START sound like amazing technologies that could prove to be the next big search engine. Mr. Katz says that the technology is still in a "playground" phase. "It's important to society because it's going to bring new students to the field, but I think we are still very far from understanding language and answering questions or respond properly to any query", says Mr. Katz. And this is evident, according to him, when you see Watson give silly responses. There's still tons of work that needs to happen, and he's evaluating how to make the system better, whether by giving it vision or touch, partnering with the cognitive sciences, etc.

This brought me to my next question. I told him how some people were referring to Watson as the beginning of Skynet since Watson has some learning capabilities in it. Mr. Katz says that Watson uses machine learning technologies with history from the past 20 years of Jeopardy! It uses this history to pick the best answer from the up to 1000 candidate answers it sometimes generates. If you think about it, this is not how humans think. When you are asked a question, you don't evaluate 1000 past responses/alternatives and try to pick the best one. You just know. According to Mr. Katz, "the real question is, can we build a machine that can scratch the surface and be even as smart as a 3-year-old? In some sense computers can do things faster than us like multiply two large numbers together. But 3-year-olds can understand language and execute commands like no other machine. How long will it take to build a 3-year-old? Frankly, I don't think anyone knows".
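For the curious, here is how I picture the "pick the best of many candidate answers" step, as a minimal Python sketch. Everything in it is my own assumption: the feature names, the hand-picked WEIGHTS, and the confidence and best_answer functions are hypothetical, and a real system would learn its weights from past questions rather than have them typed in. Mr. Katz did not describe Watson's actual features or model.

```python
# Hypothetical weights for combining evidence scores. In a learned system
# these would be fitted from past questions; here they are hand-picked.
WEIGHTS = {"keyword_match": 0.5, "type_match": 0.3, "source_reliability": 0.2}


def confidence(features):
    """Weighted sum of a candidate's evidence scores (missing scores count as 0)."""
    return sum(weight * features.get(name, 0.0) for name, weight in WEIGHTS.items())


def best_answer(candidates):
    """Return the candidate with the highest combined confidence."""
    return max(candidates, key=lambda c: confidence(c["features"]))


# Two made-up candidates; a real run might have up to a thousand.
candidates = [
    {"answer": "candidate A",
     "features": {"keyword_match": 0.6, "type_match": 0.1, "source_reliability": 0.5}},
    {"answer": "candidate B",
     "features": {"keyword_match": 0.7, "type_match": 0.9, "source_reliability": 0.8}},
]

if __name__ == "__main__":
    winner = best_answer(candidates)
    print(winner["answer"], round(confidence(winner["features"]), 2))
```

The point is just that the machine scores every alternative and takes the top one, which is exactly the part that feels so different from how a person "just knows".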

Mr. Katz says he hopes that in 20 years this technology can be used to help society by doing things like driving cars and handling simple tasks. Some of this can already be seen today in devices that can recognize certain commands and execute them. IBM certainly has plans for this technology.

Where To Learn More

By the way, if you are like me and like videos, check out the video collection for more on how Watson works here. Watson, of course, has its own Facebook and Twitter accounts (though it's not clear if it's Watson itself responding or humans). Finally, Watson has its own social media aggregator, which makes it very easy to see what everyone is saying about Watson on one page, including photos, videos, blogs, tweets, etc. If you want a single place to catch all Watson-related news, the aggregator is for you.

What do you think? Is Watson going to win the 3-day competition? Where would you like to see Watson applied in real life?