Make yourself more VISIBLE with VIREL

July 23rd, 2008 by Charles S. Knight
Posted in News

VIREL is a website-friendly crawler for microformats. It focuses primarily on collecting open microformats that are published embedded in websites.

It searches websites for microformats and follows links from one website to another to continue its search across the WWW. To learn more about microformats, take a look at http://www.microformats.org/. Right now VIREL searches for contact information (hCard) and calendar information (hCalendar).
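As a rough illustration of what collecting hCards from a page involves, here is a minimal Python sketch. It is not VIREL's code: the class name `HCardParser`, the helper `extract_hcards`, and the sample markup are all made up for this example. It only reads the text content of a handful of hCard properties inside an element with class "vcard", and it leaves out link following and hCalendar entirely.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HCardParser(HTMLParser):
    """Hypothetical helper: collects a few hCard properties from class="vcard" blocks."""

    PROPERTIES = ("fn", "org", "tel", "email", "url")

    def __init__(self):
        super().__init__()
        self.cards = []    # finished cards, one dict per vcard element
        self._card = None  # the card currently being read, if any
        self._depth = 0    # nesting depth inside the current vcard
        self._open = []    # (depth, property) pairs currently capturing text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._card is None:
            if "vcard" in classes:
                self._card = {}
                self._depth = 1
            return
        self._depth += 1
        for prop in self.PROPERTIES:
            if prop in classes:
                self._card.setdefault(prop, "")
                self._open.append((self._depth, prop))

    def handle_endtag(self, tag):
        if self._card is None or tag in VOID_TAGS:
            return
        # Stop capturing for properties opened at the depth that is closing.
        self._open = [(d, p) for d, p in self._open if d != self._depth]
        self._depth -= 1
        if self._depth == 0:
            self.cards.append({k: v.strip() for k, v in self._card.items()})
            self._card = None

    def handle_data(self, data):
        if self._card is not None:
            for _, prop in self._open:
                self._card[prop] += data


def extract_hcards(html):
    """Return a list of dicts, one per hCard found in the HTML string."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


# Made-up sample markup, not taken from any real site.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example University</span>
  <a class="email" href="mailto:jane@example.org">jane@example.org</a>
  <span class="tel">+1 555 0100</span>
</div>
"""

if __name__ == "__main__":
    for card in extract_hcards(SAMPLE):
        print(card)
```

A real crawler would also need to read values from attributes such as `href`, handle nested properties, and respect robots.txt; the sketch skips all of that to keep the idea visible.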

To create an hCard, you can use the online hCard Creator from microformats.org or another third-party online hCard creator. For calendar entries, there is a third-party online hCalendar creator.

If you would like to make your microformats accessible to VIREL, you can submit your URL to it. VIREL checks this URL at regular intervals for new or updated information embedded in microformats. You can submit your URL here: Submit. A search for ‘Wolfgang’ would look like this:

The search for ‘Wolfgang’ took 0.0199 Seconds and produced 4 matches:

Wolfgang Wiese found at www.webkongress.uni-erlangen.de 44% Integrity
http://www.webkongress.uni-erlangen.de/kontakt.shtml
Last Update: 2008-07-03

hCard vCard DataMatrix

Phone: 09131 / 85 – 28326
E-Mail: wolfgang.wiese@rrze.uni-erlangen.de

Wolfgang Bartelme found at www.factorycity.net 32% Integrity
http://www.factorycity.net/projects/microformats-icons
Last Update: 2008-07-09

hCard vCard DataMatrix

Alternative Search Results Part II

July 23rd, 2008 by Guest Author
Posted in Guest Authors


Dmitri Soubbotin, CEO of Semantic Engines, the maker of SenseBot!

Part II

Semantic approach to search
Semantic search engines use various NLP and text mining methods to “understand” what a Web page is about and extract the meaning from it. A few of the most popular ones are described in this article. The idea is to give a satisfactory answer to the user’s query, bypassing the need to dig into often unrelated sources. Most of the engines try to give a direct and accurate answer to a question raised by a user, e.g. “Who built the Empire State Building?”

There is also an interesting article by Amit Singhal on the Google Blog. It clearly shows that Google uses linguistic technologies and is making headway in understanding user intent. We believe that Google still uses semantic technology only marginally, as opposed to semantic engines, which use it as the foundation of their search.

The niche that SenseBot occupies within the semantic search engine family is in attempting to provide an overview and facilitate overall understanding of the subject, as opposed to finding hard facts and answering direct questions.

With the advent of the Semantic Web, we expect a boost for semantic search engines. Semantic tagging will help search engines decrease ambiguity and identify what a given page is about, allowing for more relevant results. There is also a possibility that semantic search engines may help the Semantic Web materialize, by analyzing existing Web pages and identifying representative tags for them that can then be transformed into RDF or other formats. This would fall within the bottom-up approach to building the Semantic Web (see Alex Iskold’s article on this subject). This is one type of application that SenseBot Web services can be used for.

The challenge that semantic search engines typically face is the ambiguity of generic content, e.g. Web pages found by Google. Human languages are inherently ambiguous, so “spider” can be construed as a biological species, a movie title, or a data processing system. Sources about all of these meanings would still cohabit the first page of results. Not surprisingly, the long-awaited demo of Powerset turned out to be based on Wikipedia, a highly controlled and homogeneously written set of content, which simplifies the task of semantic analysis.

In general, much higher quality can be achieved in content verticals and within an enterprise, where the content is narrowed down, and the taxonomy is defined or at least implicitly present. For example, check out the artery surgery summary. It is quite informative even though based on a generic Web search; yet within a medical portal it could have been of even higher quality.

The glamorous first page and beyond.
It feels intuitive that the first page of search results grabs the bulk of user attention. According to a recent study by iProspect, 68% of search engine users click a search result within the first page of results. One reason for this may be the good ranking of sources performed by search engines. However, another plausible reason is the narrowing attention span of users: exploring another page of results can be seen as a disproportionate burden. If the answer is not found on the first page, users may believe that the chances of finding it beyond that page are slim and not worth the time. So the search is abandoned, or the user settles for whatever quality of answer was found within the first page.

A summary of the first page of results could be very useful for users, especially for informational queries (see examples in Part I). A quick, at-a-glance introduction to the topic through the summary may save time, and also expose better, content-richer sources.

It is not surprising that businesses spare no expense on SEO to get their sites on the first page, among those coveted 10 listings. For queries where the knowledge area is relatively narrow and structured, and the number of authoritative sources is small, this may be acceptable. But for less structured or highly dynamic content areas, it means that a valuable source may become effectively invisible to users just because it is ranked #11.

This is where semantic search engines can really shine, by scouting the back pages and extracting valuable items from them. SenseBot’s In-depth Search can go up to 10 pages of results (100 Web sources) deep, and produce a summary of the most relevant sources. It is sometimes eye-opening to see a little gem of content from the 4th or 5th page of results take its proud place in a summary.
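To make the idea of summarizing across ranked sources a little more concrete, here is a toy Python sketch. It is not SenseBot's method, which the article does not describe. It swaps in a plain word-frequency scorer: each sentence is scored by how common its words are across all sources, and the top sentences are returned with the rank of the source they came from. The function names, the stopword list, and the sample texts are all invented for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "are",
             "it", "for", "on", "by", "with", "as", "that", "this"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(sources, max_sentences=2):
    """Return the highest-scoring (rank, sentence) pairs across all sources.

    `sources` is a list of texts ordered by search rank (index 0 is rank 1).
    A sentence scores higher when its words are frequent across all sources.
    """
    sentences = []
    for rank, text in enumerate(sources, start=1):
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                sentences.append((rank, sentence))

    freq = Counter(w for _, s in sentences for w in tokenize(s))

    def score(item):
        words = tokenize(item[1])
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return sorted(sentences, key=score, reverse=True)[:max_sentences]


# Invented sample texts standing in for fetched result pages.
SOURCES = [
    "Buy cheap shoes today.",
    "Spiders are arachnids. Spiders spin silk webs.",
    "Many spiders spin webs from silk to catch prey.",
]

if __name__ == "__main__":
    for rank, sentence in summarize(SOURCES):
        print(f"[source ranked #{rank}] {sentence}")
```

Because the scorer ignores where a source sits in the ranking, a sentence from a low-ranked page can outrank one from the top result, which is the effect the paragraph above describes.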

Cognition Launches Semantic MEDLINE

July 23rd, 2008 by Charles S. Knight
Posted in Alts, News

Cognition, a next-generation Semantic Natural Language Processing (NLP) company, announces a quantum improvement in the application of NLP technology with the introduction of Semantic MEDLINE™, a semantic search service for MEDLINE, the database of 18 million article abstracts on complex health information published by the National Library of Medicine. This new free service at www.SemanticMEDLINE.com enables complex health and life science material to be rapidly and efficiently discovered with greater precision and completeness. This marks the first time that users can employ a natural, conversational sentence structure to find the most complex studies within the MEDLINE dataset.

SemanticMEDLINE is powered by Cognition’s Semantic NLP™ technology, which incorporates word and phrase knowledge to comprehend the meaning and nuances of the English language. Cognition’s Semantic Map, the most complete and comprehensive available today, enables the Search process to be based on meaning, rather than statistical word pattern matching, and therefore returns more complete and relevant results.

“Cognition’s Semantic NLP is the first and only technology to combine all of the key linguistic elements to unravel the complexity of language and optimize semantic understanding of ambiguous content. The foundation behind this capability is our comprehensive Semantic Map of the English language,” said Scott Jarus, CEO of Cognition Technologies. “SemanticMEDLINE’s results are far more comprehensive and thorough when compared with PubMed’s native Search results because of two unique capabilities: an understanding of synonymy and the ability to understand meaning and context reasoning.”

With traditional keyword search engines, such as those used by Google, Yahoo! and others, finding the best medical research document within complex datasets such as MEDLINE is very difficult without the use of complex Boolean queries and a deep understanding of the many permutations of technical synonymy. Cognition’s Semantic MEDLINE has the ability to target and locate these types of data that are otherwise hidden in masses of information because of its comprehensive Semantic Map (particularly deep within the health sciences discipline) and its unique ability to “understand” the meaning behind words, phrases and idioms.
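To give a sense of the manual work that keyword search pushes onto the user, the following Python sketch builds the kind of Boolean query described above from hand-written synonym lists. It says nothing about how Cognition's technology works. The function name `boolean_query` and the synonym lists are illustrative assumptions, and real search interfaces each have their own query syntax.

```python
def boolean_query(concepts):
    """Build an AND-of-ORs query string.

    `concepts` is a list of synonym lists. Every concept must match (AND),
    and a concept matches through any one of its synonyms (OR).
    Multi-word terms are wrapped in double quotes.
    """
    groups = []
    for synonyms in concepts:
        quoted = ['"%s"' % s if " " in s else s for s in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hand-picked synonyms, assumed for this example only.
CONCEPTS = [
    ["heart attack", "myocardial infarction"],
    ["aspirin", "acetylsalicylic acid"],
]

if __name__ == "__main__":
    print(boolean_query(CONCEPTS))
    # prints: ("heart attack" OR "myocardial infarction") AND (aspirin OR "acetylsalicylic acid")
```

The user has to know every synonym in advance and keep the lists current, which is the burden a synonym-aware engine claims to remove.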

“Cognition’s SemanticMEDLINE.com is an important new tool for researchers in the medical and biotech community,” said Dr. Betsy Goldsmith, Professor of Biochemistry at UT Southwestern Medical Center, Dallas. “Cognition’s MEDLINE Search capabilities save users hours hunting for desired documents within complex content. It is helpful in conducting complex research, planning experiments, grant creation and writing technical papers.” Dr. Goldsmith is a collaborative developer of Cognition’s SemanticMEDLINE through a sponsored grant provided by Cognition Technologies.

For more information regarding Cognition’s Semantic NLP technology, visit www.cognition.com.  To semantically search medical or scientific data through MEDLINE, visit www.SemanticMEDLINE.com.