A few good tips from Mark Johnson on rating search engines.
Please read the entire article on his blog Deliberate Ambiguity.
While demoing Live Search at the Web 2.0 Expo, people continually asked the same questions: “What makes Live different?” or “Show me some features that will make me want to switch from my search engine” or the extremely confrontational “Why do you think you’re better than Google?”
My first instinct was to dive in and show people the coolest features in Live Search (e.g., demoing Virtual Earth with an Xbox controller) or to let them play around with their own queries.
However, given my experience working for several startup search engines, I’ve come to realize that it’s extremely difficult to convince someone that you’re better than another engine with words, features, or a few carefully chosen queries.
So, after a while, I started my demos with a caveat about the nature of a search engine: I implored my audience to try out Live Search for a week because, in the words of the immortal LeVar Burton of Reading Rainbow, “you don’t have to take my word for it.”
Is this a cop-out?
Why is demoing search so hard?
Search “Features:”
Common mistakes when evaluating a search solution:
* A few good/bad results don’t mean that all results will be good/bad
* It’s hard to select a representative cross section of queries
* What you think is “good” may not be good for the majority of users
* Queries are out of context
* People tend to focus on the first result
There are probably countless other mistakes that are made during solo evaluations of search. Therefore, search engines big and small realize that problems of ranking and relevance – the core of any search project – are solved only by lots and lots and lots of data from lots and lots and lots of people. To solve this data problem, we need to collect data from real users. For example, we run many thousands of queries past human judges and look at mountains of click data from the production site. After applying advanced statistical techniques to this data, we get the information we need to create algorithms that take your few (misspelled) words and turn them into a useful page of results.
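To make the "lots and lots of data" point concrete, here is a minimal sketch of my own, not anything from Mark's article or from Live Search. It assumes each query is simply judged "good" or "bad" and uses a Wilson score interval, one standard way to estimate how uncertain a measured proportion is. The function name and the sample counts are hypothetical.

```python
import math


def wilson_interval(good, total, z=1.96):
    """Approximate 95% Wilson score interval for the share of 'good' queries.

    good  -- number of queries judged good (hypothetical judgments)
    total -- number of queries judged overall
    z     -- normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if total <= 0:
        raise ValueError("need at least one judged query")
    p = good / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


# Made-up counts: the same 80% "good" rate, seen in a quick demo
# versus a large judged query set.
for good, total in [(8, 10), (8000, 10000)]:
    low, high = wilson_interval(good, total)
    print(f"{good}/{total} good: plausible range {low:.2f} to {high:.2f}")
```

With ten hand-picked queries, the plausible range for the true share of good results runs from about one half to over 0.9, which is too wide to separate two decent engines. With ten thousand judged queries, it narrows to less than a percentage point on either side of 0.8.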
As one of my colleagues at Powerset always likes to remind me: this is rocket science.