Parametric Search







Our Guest Author this week is Philip James, founder and CEO of the neat wine search engine Snooth.

His topic today is Parametric Search.


Google’s a great search engine and I use it daily. When I’m not exactly sure what I’m looking for, nothing else comes close to making sense of the random strings I throw at it. But when I want to buy the new Harry Potter book, I go to Amazon, select books from the drop-down and simply type in the eponymous hero’s name.

What Google does well is searching of text documents. Luckily for them, HTML pages are essentially text documents. But when the HTML page is merely an electronic representation of a physical item, Google often comes up short.

Go tell it to find the most relevant ‘document’ for the words “Harry Potter” and it dutifully returns movie times, the official site, the Wikipedia entry and so on. Even with the search “Harry Potter book” the top link (a news result) still isn’t what I want. It’s a simple example, but ramp it up and things break down fast.

We do wine search. Come to our site and type in “big cab” and you’ll get a list of big, fruity, cabernet sauvignon wines, which I hope is what you were looking for at that point. Google can’t hope to guess your intentions based off of just those two words.

“But that’s just vertical search,” I hear you cry. Fine, but it gets much better. Staying with wine: some bright spark decided to call his Chenin Blanc wine “Chardonnay No Way Cuvee”. Unsurprisingly, this shows up in most search engines’ results for “Chardonnay”. I think we can all agree that a user searching for “Chardonnay” wants a wine that’s actually made with Chardonnay.

Parametric Search sidesteps this by setting this specific wine’s attributes as Varietal=Chenin Blanc and Brand=Chardonnay No Way and then performing a search on the attributes only. You type in Chardonnay, and the algorithm finds Chardonnay in the Varietal table, then searches only for wines which have Varietal set as Chardonnay. As we know it’s wine you’re looking for, it’s a logical step to then allow you to filter the results by vintage, price and so on.
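To make that concrete, here is a rough sketch of the idea in Python. It is not our production code, just an illustration: the Wine record, the parametric_search and naive_text_search functions, and every wine other than “Chardonnay No Way” (along with all the vintages and prices) are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    brand: str
    varietal: str
    vintage: int
    price: float


# Hypothetical catalogue: vintages, prices and all brands except
# "Chardonnay No Way" are invented for illustration.
WINES = [
    Wine("Chardonnay No Way", "Chenin Blanc", 2005, 14.0),
    Wine("Hillside Estate", "Chardonnay", 2004, 22.0),
    Wine("Stone Creek", "Chardonnay", 2006, 11.5),
    Wine("Old Oak", "Cabernet Sauvignon", 2003, 30.0),
]


def naive_text_search(query, wines):
    """Match the query anywhere in the wine's text, brand name included."""
    term = query.strip().lower()
    return [w for w in wines if term in f"{w.brand} {w.varietal}".lower()]


def parametric_search(query, wines, max_price=None, vintage=None):
    """Resolve the query against the varietal attribute only, then filter."""
    term = query.strip().lower()
    # Stand-in for the "Varietal table": the set of known varietal values.
    known_varietals = {w.varietal.lower() for w in wines}
    if term not in known_varietals:
        return []
    results = [w for w in wines if w.varietal.lower() == term]
    # Optional attribute filters, applied after the varietal match.
    if max_price is not None:
        results = [w for w in results if w.price <= max_price]
    if vintage is not None:
        results = [w for w in results if w.vintage == vintage]
    return results


if __name__ == "__main__":
    print([w.brand for w in naive_text_search("Chardonnay", WINES)])
    print([w.brand for w in parametric_search("Chardonnay", WINES)])
    print([w.brand for w in parametric_search("Chardonnay", WINES, max_price=15)])
```

The text search drags in the Chenin Blanc because of its brand name, while the attribute search never looks at the brand at all. A real system would also have to map a loose query like “big cab” onto attributes, which this sketch does not attempt.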

The limitations of this are just as obvious. Without good data you can’t extract the attributes, rendering parametric search useless, and unless you can determine the user’s intent you need to stick to a single vertical and let the users signal their intent by coming to your site.

Parametric search is really the tool of the vertical search engine, and companies like Kayak use it very well. The true panacea would be a generic search engine that could add relevant parametric search options to the results pane once it had determined the user’s intent from the initial query. Until then, vertical search engines will be able to carve out their own niches.

One Response to “Parametric Search”

  1. Falafulu Fisi Says:

    Philip James said…
    [What Google does well is searching of text documents. Luckily for them html pages are essentially text documents. But, when the html page is merely an electronic representation of a physical item, Google often comes up short.]

    Phil, I want to mention that Google currently uses two completely different datasets to compute document rankings. The first is a matrix (rows and columns) of links (e.g. doc-A links to doc-B, doc-C, etc.; doc-B links to doc-D, doc-A, doc-E, etc.). The second is the term-by-document matrix (the frequency of the terms that appear in each document). The link matrix is processed by their famous PageRank algorithm, and the term-by-document matrix is processed by a matrix algebra (matrix factorisation) algorithm. One common such algorithm is SVD (Singular Value Decomposition), although other matrix factorisation algorithms have been published more recently in the literature. Applying SVD in this way is called Latent Semantic Indexing (finding the similarity amongst the contents of documents). I don’t know how Google combines the results of the two computations for its final ranking of documents.

    You might be interested in SVD, which has been applied in product recommender systems such as yours:

    “Application of Dimensionality Reduction in Recommender System”
    http://www.grouplens.org/papers/pdf/webKDD00.pdf

    Philip James said…
    [We do wine search. Come to our site and type in “big cab” and you’ll get a list of big, fruity, cabernet sauvignon wines, which I hope is what you were looking for at that point. Google can’t hope to guess your intentions based off of just those two words.]

    The reason for this is that recommender systems work on one type of data, i.e. user-item ratings or users’ surfing history, whereas Google works on two different types of data (links and contents). Also, a recommender system for a site like yours deals with a much smaller dataset than Google’s.

    I am very familiar with most recommender system algorithms and have implemented a few of them. If you’re interested in the latest dimensionality reduction algorithms for recommender systems that have been published in the literature recently, I am happy to point them out to you.
