Pakistani search engine Faadi

June 2nd, 2009 by Steffen Schilke
Posted in Global | 5 Comments »

Faadi is a search engine from Pakistan and could be very boring if you see it as another meta search engine which is aggregating search results from the top 10 search engines including, e.g., Google, Yahoo, Alltheweb, Altavista, A9 and Fast to provide results for Web, Images, Audio, Videos, News, Files, Wikipedia, Blogs, Mail, Download and Video. One point worth mentioning is that they de-duplicate the search results.
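To illustrate what de-duplication means for a meta search engine, here is a minimal Python sketch. It is not Faadi's actual method, which is not documented; the URL normalization rules, the round-robin merge, and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparison key (hypothetical rule set:
    ignore the scheme, a leading "www." and a trailing slash)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def merge_results(result_lists):
    """Interleave ranked lists round-robin, keeping only the first
    occurrence of each normalized URL."""
    seen = set()
    merged = []
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results):
                key = normalize_url(results[rank])
                if key not in seen:
                    seen.add(key)
                    merged.append(results[rank])
    return merged


# Made-up result lists from two hypothetical underlying engines.
engine_a = ["http://www.example.pk/", "http://news.example.pk/cricket"]
engine_b = ["http://example.pk", "http://example.org/about"]
merged = merge_results([engine_a, engine_b])
print(merged)
```

Both engines return the same home page in slightly different spellings, so it appears only once in the merged list. A real engine would probably also compare page content, since the same document can live under unrelated URLs.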

But the catch, and their USP, is that they have their own search engine index covering Pakistani web sites! A vertical search based on the Pakistani web is something at least 176,242,949 people (July 2009 est. – The World Factbook) can dig – especially as the average age in this country is about 20-something. The national language is Urdu, but a number of other languages are spoken (source: Wikipedia):

1. Punjabi (44.15%)
2. Pashto (15.42%)
3. Sindhi (14.1%)
4. Seraiki (10.53%)
5. Urdu (7.57%)
6. Balochi (3.57%)
7. Others (4.66%)

Faadi has its own spider, which crawls Pakistani web sites and collects information in its own index (not a Google custom search, as many use at the moment). They collect information about sites in Pakistan of all kinds and topics (including Educational, Entertainment, Informative). Unfortunately, they do not mention on their site whether they only index .pk sites or whether they crawl the whole internet to identify other web sites with Pakistani content.

If you consider that Pakistan is an Islamic Republic you know where the problem could start. The English web pages seem pretty open but even bigger search engines have committed to reduce the number of shown search results in other countries. I did not want to write a certain word but I assume you can imagine.

I think that, besides the big players' market share, there are reasonable market shares to be had in search markets that use a language as their vertical, especially if that language is not (yet) covered by Google. Anyhow, certain languages are covered by Google only because they have a large number of engineers from that cultural background on their team, or because those languages have a big penetration in the internet world (hence advertisement profit).

Steffen followed up:

Dear Faadi Team,

So your algorithm works with a language identification for the languages spoken in Pakistan? I.e., Punjabi, Sindhi, Siraiki, Pashtu, Urdu , Balochi, Hindko, Brahui, English (official; lingua franca of Pakistani elite and most government ministries), Burushaski and others? Do you also index Pakistani pages in English?

For the .pk domains do you use the zone files or how do you get hold of the Pakistani domains? Do you also search other domains for content from or about Pakistan or owned by people / companies from Pakistan or in one of the Pakistan languages?

Do you include everything in your index which your spider comes along or do you select what will be presented as a search result?

BTW: I typed an English search term and selected Pakistan, but I got the same search results when I selected International?!?

Kind regards,

Steffen

The reply:

Dear Steffen,

Thanks for Posting of Faadi Search Engine News. but sir you have use some wrong wording that is effecting our goodwill. that News is transfer to all other Blogs, News agencies and press releases.

Please check this and remove this sentence…

Faadi is a search engine from Pakistan and could be very boring if you see it as another meta search engine which is aggregating search results from the Top 10 search engine including,…

e.g. ” and could be very boring ” please remove it and mention that Faadi search engine has thumbnail options, its user friendly design- it is meta search engine. but soon we are launching Index / Spider search engine – it is under testing and debugging mode.

We are providing Custom search results for Pakistan. We have an algorithm to search the Pakistan Websites by Languages + By .Pk Domains. and also the site created from Pakistan.

Thanks,

Changez
Marketing Manager
Faadi.com

Editor’s note: There is no reason to alter Steffen’s review.

Microsoft Bing Video Search – A Quick Overview

June 2nd, 2009 by Guest Author
Posted in Guest Authors, Majors, Verticals, Video | 4 Comments »

By Christophor Rick.

Original post here.

Microsoft has launched their new Google competitor, bing, complete with video search and everything.

The thing is, I don’t know what all the hype is about. People are talking about it like it’s going to be some actual competition to Google and from what I can see it’s nothing special.

Some blogs are touting the awesomeness of the live video previews on mouseover. Sure, that’s sort of cool (it doesn’t work in Firefox, which is no big surprise, is it?). Blinkx gives you a small preview of all the videos in the results, albeit without sound, so that’s not so groundbreaking.

The bing video search gives you some options to filter down your results by length, screen size, resolution and source. Source is most interesting to me as it gives you the list of sources it’s apparently pulling from:

* MSN
* AOL
* MTV
* Hulu
* ESPN
* YouTube
* MySpace (?!)
* DailyMotion
* MetaCafe

Not a complete who’s who of video platforms and sharing sites but not the smallest cross-section I’ve seen in a video search either.

So I did one of my standard searches, “Wolverine,” and received a whopping 157,000 results from bing, as compared to 80,000 from Blinkx and a lowly 17,200 results in Google Video. So perhaps they’re doing something right.

Or are they? After some further digging I realized that some of the videos in the search results were actually showing up multiple times, for example the Wolverine profile from the latest incarnation of X-men animated series. In fact some results even showed up multiple times on the same page of search results so it seems that they have some further filtering to do. This might be chalked up to the same video being on multiple sites but don’t they have a way of checking that and combining the results?

Another strange thing is that bing, from Microsoft, is using Flash from Adobe and not Silverlight from Microsoft. Now if you think about that, it’s not all that strange. They want the widest possible adoption of the service most likely and Silverlight isn’t the predominant video player on the web so they had to go with the strongest player in the market. After all, how many of you would choose to install a video player plugin over just going someplace that already let you search based on a plugin you probably already had installed?

With regard to the advanced search capabilities, Google still gives more options, and that means easier searching overall in my book. I can drill down to exactly what I’m looking for far more precisely. Plus, Google offers some serious keyword search filtering, and with the coming of HTML 5 and the video tag I think that’s going to put them in the forefront, though the others will have plenty of time to bolster their keyword search options before it becomes widely accepted. Microsoft knows that it won’t become the world leader in search overnight, but they’re hoping that bing will gain some widespread use.

Of course what is bing at its most basic? It’s LIVE search right? It’s LIVE search with a new coat of paint, some additional options and a new name. And doesn’t it look a lot like Google from the top left corner of the browser? Almost the same list of options even.

Sure it’s another place that will catalog and index all of our video content and some people will always prefer Microsoft over the others so it will probably find its niche in the market.

Scientific Literature Library and Search Engine CiteSeer

June 2nd, 2009 by Charles S. Knight
Posted in Verticals | No Comments »

CiteSeer was the first digital library and search engine to provide automated citation indexing and citation linking using the method of autonomous citation indexing.

CiteSeer was developed in 1997 at the NEC Research Institute, Princeton, New Jersey, by Steve Lawrence, Lee Giles and Kurt Bollacker. The service transitioned to the Pennsylvania State University’s College of Information Sciences and Technology in 2003. Since then, the project has been led by Lee Giles with technical and administrative direction by Isaac Councill.

After serving as a public search engine for nearly ten years, CiteSeer, originally intended as a prototype only, began to scale beyond the capabilities of its original architecture. Since its inception, the original CiteSeer grew to index over 750,000 documents and served over 1.5 million requests daily, pushing the limits of the system’s capabilities. Based on an analysis of problems encountered by the original system and the needs of the research community, a new architecture and data model were developed for the “Next Generation CiteSeer,” or CiteSeerx, in order to continue the CiteSeer legacy into the foreseeable future.

CiteSeerx is a scientific literature digital library and search engine that focuses primarily on the literature in computer and information science. CiteSeerx aims to improve the dissemination of scientific literature and to provide improvements in functionality, usability, availability, cost, comprehensiveness, efficiency, and timeliness in the access of scientific and scholarly knowledge.

Rather than creating just another digital library, CiteSeerx attempts to provide resources such as algorithms, data, metadata, services, techniques, and software that can be used to promote other digital libraries. CiteSeerx has developed new methods and algorithms to index PostScript and PDF research articles on the Web. CiteSeerx provides the following features:

* Autonomous Citation Indexing (ACI) – CiteSeer uses ACI to automatically create a citation index that can be used for literature search and evaluation. Compared to traditional citation indices, ACI provides improvements in cost, availability, comprehensiveness, efficiency, and timeliness.
* Citation statistics – CiteSeer computes citation statistics and related documents for all articles cited in the database, not just the indexed articles.
* Reference linking – As with many online publishers, CiteSeer allows browsing the database using citation links. However, CiteSeer performs this automatically.
* Citation context – CiteSeer can show the context of citations to a given paper, allowing a researcher to quickly and easily see what other researchers have to say about an article of interest.
* Awareness and tracking – CiteSeer provides automatic notification of new citations to given papers, and new papers matching a user profile.
* Related documents – CiteSeer locates related documents using citation and word based measures and displays an active and continuously updated bibliography for each document.
* Full-text indexing – CiteSeer indexes the full-text of the entire articles and citations. Full boolean, phrase and proximity search is supported.
* Query-sensitive summaries – CiteSeer provides the context of how query terms are used in articles instead of a generic summary, improving the efficiency of search.
* Up-to-date – CiteSeer is regularly updated based on user submissions and regular crawls.
* Powerful search – CiteSeer uses fielded search to allow complex queries over content, and allows the use of author initials to provide more flexible name search.
* Harvesting of articles – CiteSeer automatically harvests research papers from the Web.
* Metadata – CiteSeer automatically extracts and provides metadata from all indexed articles.
* Personal Content Portal

Source: CiteSeer
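As a rough illustration of the “citation statistics” idea in the list above (statistics for every cited work, not just the indexed ones), here is a toy Python sketch. It is not CiteSeer's implementation: real autonomous citation indexing has to extract and match citation strings from PostScript and PDF files, whereas this sketch assumes the reference lists are already resolved to identifiers. All paper names are made up.

```python
from collections import defaultdict

# Hypothetical toy corpus: indexed paper id -> works it cites.
# "external_x" is cited but not itself part of the index.
references = {
    "paper_a": ["paper_b", "external_x"],
    "paper_b": ["external_x"],
    "paper_c": ["paper_a", "paper_b", "external_x"],
}


def build_citation_index(references):
    """Invert the reference lists: cited work -> sorted list of citing papers."""
    cited_by = defaultdict(set)
    for citing, cited_works in references.items():
        for cited in cited_works:
            cited_by[cited].add(citing)
    return {work: sorted(citers) for work, citers in cited_by.items()}


citation_index = build_citation_index(references)
citation_counts = {work: len(citers) for work, citers in citation_index.items()}

for work, count in sorted(citation_counts.items(), key=lambda item: -item[1]):
    print(f"{work}: cited by {count} indexed paper(s)")
```

Because the index is keyed by the cited work, the non-indexed “external_x” still gets a citation count, and the same structure supports browsing from a paper to everything that cites it.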

Distributed, Federated, and Faceted search – what’s the dif?

June 2nd, 2009 by Guest Author
Posted in Federated Search, Guest Authors | 1 Comment »

By Guest Author Sol Lederman
The Federated Search Blog

Charles asked for someone to explain the difference between distributed search, federated search, and faceted search. And, he wanted the explanation to be in layman’s terms. These “definition” articles are actually tougher to write than they look because it can be tough to define something I know about but never consider articulating. But, whining aside, here’s my attempt at defining the terms:

Distributed search

Distributed search queries a number of sources simultaneously.

Federated search

Federated search queries a number of sources simultaneously.

Faceted search

Faceted search guides users to the best results for their queries via a refinement process. First, the user performs a text search as he would in Google or another popular search engine. Then, the results found in the text search are organized by categories, more technically known as facets. For example, a faceted search for “Mexico” might offer users the opportunity to refine results by such facets as topic (history, language), language (English, Spanish), date published, etc. Note that in faceted search there can be different ways to categorize the same content. (I want to thank Daniel Tunkelang for contributing to this definition.)
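For readers who like to see it in code, the two steps of this definition (text search first, then refinement by facet) might look roughly like the Python sketch below. The documents, field names, and the naive substring matching are invented for illustration and do not describe any particular product.

```python
from collections import Counter

# Hypothetical mini-collection; each document carries facet values.
documents = [
    {"title": "A History of Mexico", "topic": "history", "language": "English"},
    {"title": "A History of Mexico (Spanish edition)", "topic": "history", "language": "Spanish"},
    {"title": "Languages of Mexico", "topic": "language", "language": "English"},
    {"title": "A History of Peru", "topic": "history", "language": "English"},
]


def text_search(docs, query):
    """Step 1: plain text search (here a naive substring match on the title)."""
    q = query.casefold()
    return [doc for doc in docs if q in doc["title"].casefold()]


def facet_counts(results, facet):
    """Step 2a: organize the results by one facet and count each value."""
    return Counter(doc[facet] for doc in results)


def refine(results, facet, value):
    """Step 2b: narrow the results to a facet value the user picked."""
    return [doc for doc in results if doc[facet] == value]


hits = text_search(documents, "Mexico")
topic_facet = facet_counts(hits, "topic")
language_facet = facet_counts(hits, "language")
spanish_hits = refine(hits, "language", "Spanish")

print(dict(topic_facet), dict(language_facet))
print([doc["title"] for doc in spanish_hits])
```

The same three hits are grouped once by topic and once by language, which is the point about there being different ways to categorize the same content.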

I know of no difference between “federated search” and “distributed search.”

Faceted search is a fundamentally different beast than the other two. Faceted search is an approach to refining results and to guiding users to the best results from a set of results once that set has been found.

Federated and distributed search are approaches to getting those results in the first place, by searching multiple content sources.
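A bare-bones Python sketch of that idea, sending one query to several sources at the same time and merging what comes back, could look like this. The three “sources” are in-memory stand-ins invented for the example; a real federated search engine would call remote search interfaces and would also have to rank and de-duplicate the merged results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote content sources.
SOURCES = {
    "library_catalog": ["Mexico: A History", "Peru Travel Guide"],
    "news_archive": ["Mexico election results", "Weather report"],
    "image_bank": ["Sunset over Mexico City"],
}


def search_source(name, query):
    """Search one source; returns (source name, title) pairs."""
    q = query.casefold()
    return [(name, title) for title in SOURCES[name] if q in title.casefold()]


def federated_search(query):
    """Query every source simultaneously and merge the per-source hits."""
    names = list(SOURCES)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map yields results in the order of `names`, not completion order.
        per_source = pool.map(lambda name: search_source(name, query), names)
        return [hit for hits in per_source for hit in hits]


results = federated_search("mexico")
for source, title in results:
    print(f"{source}: {title}")
```

Each result keeps the name of the source it came from, so a faceted interface could later offer “source” as one more facet for refining the merged list.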

If you enjoy these questions of defining and naming things you may like this article and this one too.

Taptu reaches 1,000,000 Mobile Searches Daily

June 2nd, 2009 by Charles S. Knight
Posted in Mobile, Verticals | No Comments »

With more than one million searches daily and 3.4 million unique users in April, mobile search engine Taptu is delivering the most user-friendly mobile search by eliminating the frustrations and hassle of desktop search engines on mobile devices.

Taptu recognizes the importance of delivering a high-quality experience to its users by providing an easy and seamless search consisting of only mobile-friendly results and the opportunity for users to share and discover new content.

“With the increase in Taptu searches, users have made it clear that there is a distinct need for a mobile-only search engine with results best viewed on mobile devices,” said Steve Ives, Founder and CEO of Taptu. “I am excited by the exponential growth of Taptu searches and proud of the team for reaching this milestone.

We’re looking forward to expanding our footprint in the mobile search market with the upcoming launch of the Taptu iPhone application.”

Taptu improves mobile search by resolving one simple question: If a Web site does not work well on a mobile device, then why should it show up in a mobile search? Taptu only crawls the mobile-friendly web and users are only presented with relevant sites that are optimized for viewing on their mobile devices. Furthermore, Taptu provides users with a visual preview of each site in their search results, enabling them to determine if the site is relevant to them before selecting the result. Taptu understands that mobile search isn’t always about finding the needle in a haystack, and discovery is an essential part of the Taptu service with recommended related searches and “Top Searches of the Day.”