trinetizen

on social media, journalism, tech, design and other stuff

Location: Kuala Lumpur, Malaysia

Hi. I'm a former journalist and Malaysian correspondent for CNet, ZDnet, Newsbytes (Washington Post-Newsweek Interactive wire agency), Nikkei Electronics Asia and AsiaBizTech.com. I also previously contributed to The Star, The Edge, The New Straits Times, The New Zealand Herald and various magazines. Currently, I train and advise managers and executives on strategies to optimize their use of social media and online channels to reach customers. My company, Trinetizen Media, runs media training workshops on social media, media relations, investor relations, corporate blogging, multimedia marketing, online advertising, multimedia journalism and crisis communications. You can connect with me on Facebook, LinkedIn, Twitter or Google+.

Monday, March 09, 2009

Wolfram Alpha: Google + Wikipedia?



There is buzz about genius entrepreneur Stephen Wolfram's planned unveiling of Wolfram Alpha in May 2009. Nova Spivack, who was privy to a private demo, reveals some details on TechCrunch about the "computational knowledge engine" he touts as The Next Big Thing:

It doesn’t simply return documents that (might) contain the answers, like Google does, and it isn’t just a giant database of knowledge, like the Wikipedia. It doesn’t simply parse natural language and then use that to retrieve documents, like Powerset, for example. Instead, Wolfram Alpha actually computes the answers to a wide range of questions — like questions that have factual answers such as "What country is Timbuktu in?" or "How many protons are in a hydrogen atom?" or "What is the average rainfall in Seattle?"

Think about that for a minute. It computes the answers. Wolfram Alpha doesn’t simply contain huge amounts of manually entered pairs of questions and answers, nor does it search for answers in a database of facts. Instead, it understands and then computes answers to certain kinds of questions.

Wolfram Alpha is a system for computing the answers to questions. To accomplish this it uses built-in models of fields of knowledge, complete with data and algorithms, that represent real-world knowledge.

For example, it contains formal models of much of what we know about science — massive amounts of data about various physical laws and properties, as well as data about the physical world.

Based on this you can ask it scientific questions and it can compute the answers for you, even if it has not been programmed explicitly to answer each question you might ask it.

But science is just one of the domains it knows about — it also knows about technology, geography, weather, cooking, business, travel, people, music, and more.

It also has a natural language interface for asking it questions. This interface allows you to ask questions in plain language, or even in various forms of abbreviated notation, and then provides detailed answers.

The vision seems to be to create a system which can do for formal knowledge (all the formally definable systems, heuristics, algorithms, rules, methods, theorems, and facts in the world) what search engines have done for informal knowledge (all the text and documents in various forms of media).
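Spivack's distinction between looking answers up and computing them can be illustrated with a toy sketch. This is not Wolfram|Alpha's actual design; the function names and data tables below are hypothetical, chosen only to show curated, structured data paired with formulas that derive values nobody stored:

```python
# Curated, immediately computable data: small structured tables
# rather than documents to be searched.
ATOMIC_NUMBERS = {"hydrogen": 1, "helium": 2, "carbon": 6}
COUNTRY_OF_CITY = {"Timbuktu": "Mali", "Seattle": "United States"}

def country_of(city):
    """Answer 'What country is X in?' by lookup on curated data."""
    return COUNTRY_OF_CITY[city]

def protons_in(element):
    """Answer 'How many protons are in X?' by lookup on curated data."""
    return ATOMIC_NUMBERS[element.lower()]

def neutrons_in(element, mass_number):
    """A *computed* answer: neutrons = mass number - atomic number.
    The result is derived by a formula, not stored anywhere."""
    return mass_number - protons_in(element)

print(country_of("Timbuktu"))     # Mali
print(protons_in("hydrogen"))     # 1
print(neutrons_in("carbon", 14))  # 8 (carbon-14)
```

The last call is the point: no question-answer pair for "how many neutrons are in carbon-14" exists in the data, yet the system answers it by combining curated facts with an algorithm.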


In his own blog post, Wolfram describes what he is trying to achieve:

Fifty years ago, when computers were young, people assumed that they’d quickly be able to handle all these kinds of things.

And that one would be able to ask a computer any factual question, and have it compute the answer.

But it didn’t work out that way. Computers have been able to do many remarkable and unexpected things. But not that.

I’d always thought, though, that eventually it should be possible. And a few years ago, I realized that I was finally in a position to try to do it.

I had two crucial ingredients: Mathematica and NKS. With Mathematica, I had a symbolic language to represent anything—as well as the algorithmic power to do any kind of computation. And with NKS, I had a paradigm for understanding how all sorts of complexity could arise from simple rules.

But what about all the actual knowledge that we as humans have accumulated?

A lot of it is now on the web—in billions of pages of text. And with search engines, we can very efficiently search for specific terms and phrases in that text.

But we can’t compute from that. And in effect, we can only answer questions that have been literally asked before. We can look things up, but we can’t figure anything new out.

So how can we deal with that? Well, some people have thought the way forward must be to somehow automatically understand the natural language that exists on the web. Perhaps getting the web semantically tagged to make that easier.

But armed with Mathematica and NKS I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.

It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.

But, OK. Let’s say we succeed in creating a system that knows a lot, and can figure a lot out. How can we interact with it?

The way humans normally communicate is through natural language. And when one’s dealing with the whole spectrum of knowledge, I think that’s the only realistic option for communicating with computers too.

Of course, getting computers to deal with natural language has turned out to be incredibly difficult. And for example we’re still very far away from having computers systematically understand large volumes of natural language text on the web.

But if one’s already made knowledge computable, one doesn’t need to do that kind of natural language understanding.

All one needs to be able to do is to take questions people ask in natural language, and represent them in a precise form that fits into the computations one can do.

Of course, even that has never been done in any generality. And it’s made more difficult by the fact that one doesn’t just want to handle a language like English: one also wants to be able to handle all the shorthand notations that people in every possible field use.

I wasn’t at all sure it was going to work. But I’m happy to say that with a mixture of many clever algorithms and heuristics, lots of linguistic discovery and linguistic curation, and what probably amount to some serious theoretical breakthroughs, we’re actually managing to make it work.

Pulling all of this together to create a true computational knowledge engine is a very difficult task.

It’s certainly the most complex project I’ve ever undertaken. Involving far more kinds of expertise—and more moving parts—than I’ve ever had to assemble before.

And—like Mathematica, or NKS—the project will never be finished.

But I’m happy to say that we’ve almost reached the point where we feel we can expose the first part of it.

It’s going to be a website: www.wolframalpha.com. With one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms.

We’re all working very hard right now to get Wolfram|Alpha ready to go live.
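The question-interpretation step Wolfram describes — taking a plain-language question and representing it "in a precise form that fits into the computations one can do" — can be sketched minimally. This is a hypothetical illustration, not Wolfram|Alpha's linguistics: real coverage of free-form English (and field-specific shorthand) is vastly harder than matching a couple of fixed patterns.

```python
import re

# Map a handful of question shapes to (predicate, argument) pairs
# that a computation backend could evaluate.
PATTERNS = [
    # "What country is X in?" -> ("country_of", X)
    (re.compile(r"what country is (.+?) in\??$", re.I), "country_of"),
    # "How many protons are in a(n) X atom?" -> ("proton_count", X)
    (re.compile(r"how many protons are in an? (\w+) atom\??$", re.I), "proton_count"),
]

def parse(question):
    """Turn a natural-language question into a precise symbolic form,
    or None if the question is not understood."""
    for pattern, predicate in PATTERNS:
        match = pattern.match(question.strip())
        if match:
            return (predicate, match.group(1))
    return None

print(parse("What country is Timbuktu in?"))
# ('country_of', 'Timbuktu')
print(parse("How many protons are in a hydrogen atom?"))
# ('proton_count', 'hydrogen')
```

Once a question is in this symbolic form, answering it reduces to dispatching the predicate to the curated data and algorithms — which is exactly why Wolfram argues that making knowledge computable sidesteps full natural-language understanding of the web.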


LINKS:
Wolfram Alpha computes answers to factual questions. This is going to be big.
Wolfram|Alpha is coming
God, Stephen Wolfram and Everything Else
