A search engine for data mining? No, just mining.

August 10th, 2009 by Charles S. Knight
Posted in Verticals | No Comments »

SearchMining.net is a search engine created specifically for the mining industry. SearchMining.net searches the highest quality and most relevant mining websites and allows our users to search through them with the click of a button. By only indexing websites that are highly relevant to the mining industry, we ensure that users save time and get a better result. Source: SearchMining.net

Q and A with real-time search engine OneRiot

August 10th, 2009 by Guest Author
Posted in CEO Views, Guest Authors, Realtime | No Comments »


A short exchange with Tobias Peggs of OneRiot:

Q: Which search strategies do you think hold the most potential for advertisers/marketers?

OneRiot: Firstly, advertisers/marketers will benefit from being where the users are. And increasingly that is realtime search. Industry stats suggest that 40% of searches would be best answered by results from realtime search engines (these are people trying to find out “what’s going on right now” for something as heavyweight as “Iran Election” or as entertaining as “Britney Spears”). Stats aside, you can see the stellar growth of services like Twitter (which shows you the conversation around a query) and OneRiot (which shows you fresh, socially-relevant web content related to the query) underlining that fact.

Secondly, advertisers/marketers will benefit from a particular user behavior found on realtime search engines. That is: users tend to search many more times per day for the same query than they do on a traditional search engine. The reason is simple. On a realtime search engine, the results change in realtime to keep up with the latest news and buzz. So users keep searching to stay on top of the latest information. Therefore advertisers / marketers working with realtime search engines will be offered many more opportunities to monetize and/or engage the user during a day.

Q: From a business perspective, how difficult is it to actually get a viable engine?

OneRiot: It’s difficult. Firstly you have to address a user need that is not met adequately by the big traditional search engines. Realtime search does that. Users want it, and traditional search engines struggle to deliver it. So users are actively looking for alternative solutions that meet the need – and we’re able to satisfy that.

But secondly you need to deliver a very good “search baseline” just to meet the users’ basic expectations of a search engine. For example, all searchers expect clean titles, spam and porn filters, fast page return times, the ability to scale and handle multi-millions of queries per day, etc. Users have come to expect all this and more from a search engine – so yours needs to deliver that *and* deliver a compelling and differentiated experience.

We have a clearly differentiated product that serves a distinct need in the market, that comes with a great “search baseline”. All of these are critical elements to success.

Q: What kind of $ is needed for R&D before launch? SearchMe, for instance, reportedly had $44M and its CEO said it would take a total of $100M. Does that sound like it’s on target?

OneRiot: Clearly there’s a lot of investment required up front to build a search service that delivers relevant results, at speed and at scale. And that investment needs to continue – because the web you are indexing is always getting bigger and more interesting. However, once you’ve got a marketable product, the $ required to “launch” depends a lot on your Go-To-Market strategy.

Our approach is to grow the business by opening up an API which allows 3rd party developers to take our realtime search results to their users. For example, Microsoft uses our API in a “realtime version of IE” which is bundled with OneRiot search. Microsoft is now distributing this product from Microsoft.com – and we obviously get search traffic through that. We launched our API in June, and already have more than 40 partners in our program. It certainly requires an investment to make sure you have a robust partner API and support program, but in our opinion this is a robust approach to growth.

The key with building a search product is to understand the feedback loop with the users. The more people search with your service, the more you understand what they are looking for, and the more you can tune your engine to meet their needs. If you don’t have search volume initially, you can’t work that feedback loop and it’s hard to improve the underlying engine. By deciding to open our API, we’re getting that search query volume very quickly, and we’re able to work that feedback loop.

All of our users – whether at OneRiot.com, or whether they see our results presented on a partner site – benefit. Likewise, so would advertisers / marketers. Not only does a better product help drive user engagement (“eyeballs”), but, when you get down to some search business specifics, the better targeted our search results are for a particular query, the better the potential is for advertisers / marketers to work with us and effectively monetize that.

Job search smart with UpMo’s Intelligent Job Hunt

August 10th, 2009 by Charles S. Knight
Posted in Verticals | No Comments »

UpMo

The Science Behind Career Success

Let’s face it. Getting the job you really want can be painful. It’s emotional, involves critical decision-making and most certainly impacts your family planning and financial future. With nearly six unemployed Americans for every job opening, there’s fierce competition for new opportunities. Nowadays, to get noticed and hired, you need a distinct, almost unfair, tactical advantage.

Sorry, job boards don’t work. The old ways of landing a job waste time in the aggressive pursuit of largely the wrong opportunities. Consider the conventional approach: input a few keywords, define generic search criteria on a big job board, click “Search,” and then comb through hundreds, if not thousands, of loosely relevant listings. Good luck!


Find the right job smarter, faster. UpMo has done its research (analyzed data and talked to hiring managers, placement firms and successful professionals) to determine the criteria for “the right match.” The result: the Intelligent Job Hunt™, a highly personalized agent that identifies and prioritizes job opportunities based upon the important dimensions of your career including your actual career path, professional network, specific goals, and desired mentors.


How does it work? The Intelligent Job Hunt™ weighs important factors about you—the unique mix of information that paints your detailed professional portrait—and uses expert insights, data-models and a rich algorithmic engine to pinpoint the precise activities you should be doing and the specific job opportunities you should be pursuing to increase your chances of getting the job you want.

With the Intelligent Job Hunt™, you can expose real job matches that embody the next step on your career path—plus lateral moves, opportunities in other industries, jobs you can access by tapping your professional network, and even previews of what could be your next-next opportunity.

Listen, job search should not be a full-time job.

Hunt smart. Get ready. Hunt smart. Move up.

UpMo Trial Membership – Try it now here.

Source: UpMo.com

My update from people search engine Yasni.de

August 10th, 2009 by Charles S. Knight
Posted in Global, People, Updates, Verticals | No Comments »

www.yasni.de

Hello Charles Knight,

There is news about your person profile, and new search results for you!

Note: From now on you will receive detailed statistics about the visitors to your person profile. You will also see the average across all yasni users for comparison. To increase your visitor count, you can simply include the link to your yasni profile in your email signature or on your website!

Your profile link: http://person.yasni.de/charles-knight-105320.htm

Your statistics for the week 03.08.2009 – 10.08.2009 (33)

There are 198 new results for your name (view them!)

VIP Rank 390

No change

What you should still do today: invite acquaintances!

New search results for your saved names: Charles Knight

The name “Charles Knight” was searched 0 times last week (total: 6).

703 new results for Charles Knight.

Source: Yasni.de

Four Things to Consider for Choosing a Real-Time Search Engine

August 10th, 2009 by Guest Author
Posted in Guest Authors, Realtime | No Comments »

By John Park, Founder of Feedmil

As the web continues to grow on an unprecedented scale, the way people consume information is evolving too. Content is increasingly being delivered in real-time streams, and it is now common to discover information by subscribing to RSS feeds, following Twitter status updates, or reading Facebook streams. More recently, real-time search engines that attempt to find results based on what is happening now have emerged as a new means to tap into the various streams of information flowing on the web.

While there is plenty of ongoing debate about what real-time search is, the limitations of existing real-time search engines, and how they will evolve, this article focuses on the key issues to consider when assessing the quality and performance of the current generation of real-time search engines.

Jumping into the real-time streams:
Following a popular person on Twitter is certainly fun, but discovering information on a topic of ongoing interest by following people has its pains too. You will receive not only the posts you like but also many that are not so relevant: most tweets express sentiments like “Dinner at Red Lobster was great.” rather than providing useful information. With so many tweets sprouting from the people you follow, it is becoming difficult to keep on top of your favorite topic.

Things get slightly better when you use real-time search services that retrieve results from microblogs. They are simple to use and have a straightforward interface that presents results chronologically. But it then becomes very hard to keep up with the never-ending flow of volatile information. To make matters worse, the stream of microblog search results usually contains similar content (see the Twitter search results below) and is often full of dittos (like retweets). You are likely to miss the important piece of information you really want unless you keep scrutinizing the streams flying by very carefully.

[Screenshot: Twitter search results]

That is, while this type of monitoring-based approach to real-time search does a good job of showing what is going on right now for various events (e.g., the Iran election) and what is being said about specific companies, it has a very low signal-to-noise ratio. So when you jump into the real-time stream, there is a danger of drowning in information overload.
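
To make the redundancy problem concrete, here is a minimal Python sketch of one way to collapse retweets and near-identical posts before showing them. It is an illustration only, not a description of how any particular search engine works; the function names and the normalization rules (stripping "RT @user:" prefixes, links and punctuation) are assumptions chosen for the example.

```python
import re


def normalize(text: str) -> str:
    """Reduce a post to a canonical form so retweets and near-copies collide."""
    text = text.lower()
    # Strip one or more leading "RT @user:" prefixes (assumed retweet convention).
    text = re.sub(r"^(rt\s+@\w+:?\s*)+", "", text)
    # Drop links, since shortened URLs often differ between copies.
    text = re.sub(r"https?://\S+", "", text)
    # Replace punctuation with spaces, then squeeze whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(text.split())


def collapse_duplicates(posts: list[str]) -> list[tuple[str, int]]:
    """Keep the first post of each group and count how many copies were seen."""
    first: dict[str, str] = {}
    counts: dict[str, int] = {}
    for post in posts:
        key = normalize(post)
        if key not in first:
            first[key] = post
        counts[key] = counts.get(key, 0) + 1
    return [(first[key], counts[key]) for key in first]
```

The copy count that falls out of this is itself a useful signal: a post repeated many times can be shown once with its count instead of filling the page.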

Real-time web is much bigger than twitter:
The Twitter sphere is big and growing rapidly, but as indicated in the recent survey by Sysomos (http://sysomos.com/insidetwitter/mostactiveusers/#most-followers), only 5% of Twitter users account for 75% of all activity, one quarter of all tweets are generated by bots, and users are quite skewed demographically. Yet many current real-time search engines look only at microblogs such as Twitter and FriendFeed, and some search only posts that contain links, as a way of filtering.

In fact, the microblog sphere still falls well short of representing what is happening on the web, since many non-microblog sources also generate streams. These include blogs, news, podcasts, and social media posts such as YouTube uploads, Digg submissions and Delicious bookmarks. For instance, the recent tragic death of Michael Jackson was first reported by the entertainment site TMZ.com, not by tweets.

Considering that much breaking news is still first released by public and social media rather than microblogs, and that certain deeper stories require research and investigation that only professionals can do, it is worth trying search engines that retrieve results from broader sources, not only microblogs.

Recency alone is not enough:
With so many live posts, photos, videos, and tweets generated on the fly for a current hot topic, what would you prefer to read first? This question is related to the problem of ranking the real-time search results. Currently most real-time search engines do this by simply matching the query against strings appearing in microblog posts and putting the most recent results at the top. They then keep pushing those results down in a flowing river of streams as new results that match the query arrive.

The downside of this approach is that the top-ranked entries may not be topically relevant, and sometimes they turn out to be spam. Furthermore, there is a high possibility that the top-ranked results are redundant because they are all talking about the same thing.

One solution is to factor in a quality assessment that identifies the most popular, most authoritative, most linked-to, or most retweeted items. But this kind of filtering requires further processing, which decreases the freshness of the information. An effective alternative would be to present topically relevant search results from reliable, quality sources in a timely manner. Nevertheless, balancing the tension among recency, relevance, and quality is not an easy problem.
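
One simple way to picture that balance is a weighted score in which recency decays over time while relevance and quality stay fixed. The Python sketch below is only an illustration under stated assumptions: the `Result` fields, the weights and the one-hour half-life are invented for the example and would need tuning against real data.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    relevance: float    # 0..1, topical match to the query (assumed precomputed)
    quality: float      # 0..1, e.g. source authority or inbound links (assumed)
    age_seconds: float  # time elapsed since publication


def score(result: Result, half_life: float = 3600.0,
          w_rel: float = 0.5, w_qual: float = 0.3, w_rec: float = 0.2) -> float:
    """Blend relevance, quality and an exponentially decaying recency term."""
    # Recency is 1.0 for a brand-new item and halves every `half_life` seconds.
    recency = 0.5 ** (result.age_seconds / half_life)
    return w_rel * result.relevance + w_qual * result.quality + w_rec * recency


def rank(results: list[Result]) -> list[Result]:
    """Order results from highest to lowest blended score."""
    return sorted(results, key=score, reverse=True)
```

With these weights, a two-hour-old report from a reliable source outranks a ten-second-old post that barely matches the query, which a purely chronological listing would put first.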

Keeping in sync with the web in seconds:
Real-time search engines are expected to produce results for a given query from various sources as they find them. Accordingly, one of the biggest challenges in real-time search is to minimize latency: the time lag between the creation of some data and its being indexed.

To achieve this, existing real-time search engines already keep their indexes in memory instead of on hard disk, and distribute them across multiple machines to parallelize indexing.
On the technology side, traditional pull-style RSS and Atom feeds will increasingly be replaced by push-oriented protocols such as XMPP and PubSubHubbub, making it possible to notify interested parties of new content instantly.

Therefore, as CPU and memory get cheaper, the critical issue is likely to become a matter of hardware investment rather than software technology, and economically viable real-time search services that can sync with the web on a scale of seconds are likely to emerge.
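
As a rough illustration of in-memory indexing and of latency as defined above, the following Python sketch keeps a tiny inverted index in memory and records, for each item, the gap between its creation and the moment it became searchable. The class and method names are hypothetical, and a production engine would shard this structure across many machines instead of using a single dictionary.

```python
import time
from collections import defaultdict


class InMemoryIndex:
    """A toy inverted index held entirely in memory (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(list)  # term -> ids of documents containing it
        self.docs = {}                     # doc id -> (created_at, text)
        self.latencies = []                # seconds from creation to indexing

    def add(self, doc_id, text, created_at, now=None):
        """Index a document and record how long after creation that happened."""
        now = time.time() if now is None else now
        self.docs[doc_id] = (created_at, text)
        for term in set(text.lower().split()):
            self.postings[term].append(doc_id)
        self.latencies.append(now - created_at)

    def search(self, query, limit=10):
        """Return ids of documents containing every query term, newest first."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set(self.postings.get(terms[0], []))
        for term in terms[1:]:
            hits &= set(self.postings.get(term, []))
        newest_first = sorted(hits, key=lambda d: self.docs[d][0], reverse=True)
        return newest_first[:limit]
```

In a push-style setup, `add` would be called from the handler that receives a notification of new content, so the recorded latency is dominated by delivery time and not by a polling interval.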

Feedmil: a new real-time search engine

In an attempt to address the issues raised above, we have developed a new real-time feed search engine, feedmil.com, that can find topic-focused, quality streaming sources as well as current hot streams for a topical query. Compared to other existing real-time search engines, it has the broadest possible coverage, encompassing all sorts of streams including microblogs, blogs, podcasts, and public and social media.

As for the ranking of real-time streams, feedmil takes a different approach that goes beyond a simple buzz-monitoring tool, balancing users’ need to see results in real time with the necessity of discovering information from spam-free, quality sources. As shown in the following screenshot, feedmil retrieves quality sources instead of individual posts, and combines them with a column of search results ranked by recency.

[Screenshot: Feedmil search results]

That is, search results in feedmil are grouped by their streaming sources. This lets users find the right sources for real-time information, and interested users can subscribe to or follow those sources so that real-time streams are pushed to them directly.
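
The idea of grouping results by source can be sketched in a few lines of Python. This is a generic illustration and not feedmil's actual implementation: the tuple layout `(source, timestamp, text)` and the choice to order sources by their most recent post are assumptions made for the example.

```python
from collections import defaultdict


def group_by_source(posts):
    """Group (source, timestamp, text) posts by source.

    Returns a list of (source, items) pairs, where items are (timestamp, text)
    tuples sorted newest first, and sources are ordered by their latest post.
    """
    groups = defaultdict(list)
    for source, timestamp, text in posts:
        groups[source].append((timestamp, text))
    for items in groups.values():
        items.sort(reverse=True)  # newest post first within each source
    # Order sources by the timestamp of their most recent post.
    return sorted(groups.items(), key=lambda pair: pair[1][0][0], reverse=True)
```

A real ranking would likely order sources by some quality measure and not only by activity; recency is used here just to keep the example short.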

In addition, the popularity filter provided at feedmil allows users to quickly explore streaming sources by specifying whether they want the most popular or more surprising results. In this way, users have a better chance of finding feeds that are outside the mainstream or that cover niche subjects.

Feedmil’s goal is to help people search real-time information matching their immediate as well as ongoing interests in the easiest and most efficient way possible, enabling them to keep track of topics they care about.
Although in its infancy, real-time search is one of the hottest areas in the search industry right now, and there appear to be many challenges ahead before its potential is fully realized. Feedmil is not there yet, but we believe that every new development at feedmil.com is a step in that direction.

So, what’s your favorite real-time search engine?