Guest Author Digvijay Lamba, of Kosmix
November 30th, 1900
Having read of the recent death of the Irish writer Oscar Wilde, a student at Harvard University is looking for information on Wilde's life and achievements. How does he access this information? He consults the library catalogue, skims through newspapers and magazines, and writes letters to his colleagues.
After a few weeks of research he publishes a tribute to the life and works of Oscar Wilde.
August 5th, 2000
A century later, having read of the recent death of the English actor and writer Sir Alec Guinness, a student at Harvard University is looking for information on Guinness's life and achievements. The Wikipedia movement has not yet started, so the researcher turns to the recently formed google.com search engine. After a few hours of searching for information, posting on bulletin boards and groups, and browsing websites dedicated to Sir Alec Guinness, he is ready with a tribute to the actor's life and works.
The year 2100
What will searching for information look like in the year 2100? If progress continues at the same rate as over the last century, a student in 2100 should be able to go from the thought of writing the essay to the complete essay in a mind-boggling 16 seconds. And the rate of progress is increasing.
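One way to read that figure is as a simple extrapolation: take the speedup between 1900 and 2000 and apply it once more to the year 2000 research time. The sketch below does that arithmetic. The article says only "a few weeks" and "a few hours", so the inputs of six weeks and two hours are assumptions chosen for illustration, and the function name `extrapolate` is hypothetical.

```python
WEEK = 7 * 24 * 3600  # seconds in a week
HOUR = 3600           # seconds in an hour


def extrapolate(earlier_s: float, later_s: float) -> float:
    """Apply the speedup seen between two eras once more to the later era."""
    speedup = earlier_s / later_s
    return later_s / speedup


# Assumed inputs: the article gives no exact durations.
research_1900 = 6 * WEEK   # "a few weeks" of research in 1900
research_2000 = 2 * HOUR   # "a few hours" of research in 2000

research_2100 = extrapolate(research_1900, research_2000)

print(f"speedup per century: {research_1900 / research_2000:.0f}x")
print(f"projected research time in 2100: {research_2100:.1f} seconds")
```

Under these assumed inputs the speedup is 504x and the projection comes out at about 14 seconds, the same order of magnitude as the 16 seconds quoted above. Different readings of "a few" shift the result, but it stays in the range of seconds to a minute or so.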
Even with the best of information, making predictions about the future is a messy business. We can, however, look at the trends and learn from them. What must search look like in 2100 to allow one to go from thought to essay in 16 seconds? I assume that a person will be able to specify an information requirement in some form, and a machine will instantly create an answering essay tailored to his or her needs. Instead of presenting a set of links, this answer will be an automated Wikipedia-like page that contains not just objective encyclopedic information but also subjective views, statistics, and several other kinds of information. Further, it will be possible to specify the extent of information you need, the aspects of the topic you need covered, the tone of the content, the target audience, and several other features that a student would use to make an essay better.
So you can, for example, ask “Give me the list of symptoms of diabetes”, “What is the phone number of my local Wal-Mart?”, “Write a 2 paragraph summary of the Harry Potter series”, “Write a two page essay on the scientific basis of speech in apes as mentioned in the book Congo by Michael Crichton”, or “I recently heard about White Holes and want to learn more about the subject and related interesting things”. Current technology can come close to answering the first few questions, but it gets harder as the questions get more complex. An ideal information extraction system would not only answer all these questions but also tailor the answers to your needs.
This may sound like a far-off dream, but we are clearly moving in a direction where a machine will automatically create the perfect article that precisely and completely covers the searched topic.
A search engine of the future
While search engines like Google, Yahoo, and Microsoft Live handle the first few questions above, human-created content sites like Wikipedia are trying to do a better job with the later, more complex questions by writing answers to the most frequently asked ones. It is, however, clear that the system of the future will have to automate what Wikipedia is doing, go beyond it, and do so in several different ways in order to satisfy every user’s need.
Let us try to understand the basic structure of this hypothetical system. On one end we have the user’s query with some extended specification. On the other end we have an extremely large amount of available content.
The first step for this system is to understand the query better. We take the user’s question and determine which subjects it concerns, what kinds of information the user wants, what tone the answering essay should take, and how extensive and deep the returned content should be. For the Congo question above, we know the user wants information on the book Congo and on the scientific basis of speech in apes. We also know he wants a two page essay and is interested in more authoritative scientific sources.
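To make this query-understanding step concrete, here is a minimal sketch of what its output might look like for the Congo question. Everything in it is an assumption for illustration: the `QuerySpec` structure, its field names, the `understand_query` function, and the tiny `KNOWN_SUBJECTS` vocabulary are all hypothetical, and simple keyword matching stands in for the far richer language understanding a real system would need.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QuerySpec:
    """Hypothetical container for what the system learns from a query."""
    subjects: List[str] = field(default_factory=list)  # topics of interest
    form: Optional[str] = None    # kind of answer: essay, summary, list
    tone: str = "general"         # e.g. "scientific"
    extent: Optional[str] = None  # e.g. "two page"


# Toy vocabulary standing in for real subject recognition (assumption).
KNOWN_SUBJECTS = ["Congo", "speech in apes", "Harry Potter", "diabetes"]


def understand_query(text: str) -> QuerySpec:
    """Extract subjects, form, extent, and tone with naive keyword rules."""
    spec = QuerySpec()
    lowered = text.lower()

    # Subjects: any known topic that appears in the query.
    spec.subjects = [s for s in KNOWN_SUBJECTS if s.lower() in lowered]

    # Form: the first mention of a recognized kind of answer.
    form = re.search(r"\b(essay|summary|list)\b", lowered)
    if form:
        spec.form = form.group(1)

    # Extent: phrases such as "two page" or "2 paragraph".
    extent = re.search(r"\b(\w+)[ -](page|paragraph)\b", lowered)
    if extent:
        spec.extent = f"{extent.group(1)} {extent.group(2)}"

    # Tone: a single keyword check, purely illustrative.
    if "scientific" in lowered:
        spec.tone = "scientific"
    return spec


query = ("Write a two page essay on the scientific basis of speech in apes "
         "as mentioned in the book Congo by Michael Crichton")
spec = understand_query(query)
print(spec)
```

For the Congo question, this toy version picks out the subjects "Congo" and "speech in apes", the form "essay", the extent "two page", and a scientific tone, which matches the reading described above.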
The rest of this article is on the Kosmix blog:
















