What is Concept Searching?

June 17th, 2009 by Guest Author
Posted in Guest Authors, Semantic, Verticals | 1 Comment »

What Is Concept Searching by Herb Roitblat Transcript by ESIBytes™

Karl Schieneman-Interviewer

Herb Roitblat-Guest

K: Hello everyone.  Welcome to another edition of ESI Bytes.  This is Karl Schieneman, Director of Legal Analytics and Review at JurInnov.  I’m real excited about today’s show as it’s an area that I work in.  We’re going to talk about concept searching and what people mean when they say, “concept searching.”  It means so many different things (depending on who’s saying it).  We have with us one of the pioneers of the concept searching and search engine fields – Herb Roitblat, or Dr. Herb Roitblat we could say.  He’s a PhD and co-founder and principal at Orcatec LLC.  Before starting at Orcatec, Herb was also the Executive Vice President and Chief Scientist (and co-founder) of Dolphin Search – one of the early search engines in the field.  Herb led the design of the Dolphin Search review tools.  He’s part of the team that brought concept searching and native file review to the e-discovery industry.  Herb’s recognized as an expert in cognitive search, information management, data mining, statistics and e-discovery processes.  He’s been writing about data mining and how technology can ease the burden of e-discovery currently (as well as for years).  Herb, thanks for joining us on the show.

H: Thanks for having me.  (It’s) a great pleasure. 
 
K: Let’s start off…I always like to ask everyone on the show…how did you first become interested in electronic discovery?
 
H: It kind of happened by accident.  We were trying to use some technologies for doing knowledge management and ended up finding out that lawyers needed to do discovery more than they needed to do knowledge management.  We had the technology to do it, so we responded to those two things.  At the time it seemed easy and then we learned what it really was all about.  It hasn’t been easy for at least 10 years. 

K: Okay.  Let’s dive into the topic here.  We’ve heard a lot about concepts searching over the past few years in electronic discovery.  Help us out here (and help the listeners) – what is concept searching?

 H:  Basically, concept searching is using meaning to help find responsive documents.  There are a number of approaches to using that meaning, but they all revolve around the same idea.  Instead of just looking for strings of letters as words, rather, let’s look for words as meaningful things.  We can identify what the words mean using a number of different tools that we can talk about in a bit.  Once we identify that meaning we should be a whole lot better at identifying what the documents are about and which documents are responsive and which ones aren’t.

K: Is the term, “concept searching” overused at this point?  Are there different people attaching different meanings to it?

H: There are somewhat different meanings attached to “concept searching”.  I don’t know that it’s necessarily being overused so much as there are a variety of tools you can use to get at concept searching.  For example, you could use a thesaurus.  You’re familiar from your intermediate school days (maybe high school) with using Roget’s Thesaurus – other ways of saying things.  In fact, people are very creative in how they say things.  Your job as a searcher is to undo that creativity in the sense of trying to figure out how they could have said something and how you can find it afterwards.  You could also use a taxonomy.  A taxonomy is a hierarchical list of categories.  In a taxonomy, if you’re interested in, say, cars – searching for the word “cars” – you might also be interested in documents that are in supersets of the category “cars”, such as documents that talk about vehicles.  If you’re interested in various things, you can move up and down the hierarchy and find things that name something at either a higher or lower level.  A third kind of system for doing concept search involves an ontology.  An ontology is like a taxonomy in that it points to things that are related to one another, but it’s different in that it isn’t required to be just hierarchical.  You can talk about things that are associated.  For example, “lawyer” and “attorney” are synonyms.  You’d find that in a thesaurus, but they’re also related words for a legal professional of some sort.  The legal professional also might be called other things.  There are other words that are associated with “lawyer”, such as “judge” and “case” and “matter”.  You might be interested in documents that talk about those words when you search for a particular word like “lawyer”.  There’s yet another approach to concept searching – this is the one that I’ve tended to follow, and that’s a machine learning kind of approach.
Rather than having somebody sit down and explicitly design a taxonomy or an ontology, you can let the documents tell you what words are related.  In this we follow, say, the philosopher Wittgenstein, who argued that the meaning of a word is its use in the language – that’s pretty much right.  Back to our “lawyer” example, any document that has the word “lawyer” in it is also likely to have things like “Esq.” and “judge” and “case” and “matter”.  Conversely, documents that talk about “judge” and “case” and “matter” are likely to be about lawyers, whether the word “lawyer” appears in them or not.  All of these different approaches try to use meaning to help get at what you’re searching for.  The way they do it is essentially query expansion.  So if you search for “lawyer”, you can use any of these approaches.  What’s going to happen behind the scenes (sometimes where you can see it and sometimes where you can’t) is going to be a search for “lawyer” + “judge” + “matter” + whatever other words your system tells you are associated.  What that’s going to do is bring back documents that focus on the meaning of the word at the very top of your list, and it’s going to find documents that you wouldn’t have otherwise thought of.  It’s going to search for these other words in context, and use that context to find the documents that you might not know to look for (even if they don’t have that particular word in them).
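To make the query-expansion idea in the answer above concrete, here is a minimal Python sketch. It is not Orcatec's or Dolphin Search's method; it only assumes that "related" means "appears in the same documents", and the tiny corpus, the stopword list, and the function names (build_cooccurrence, expand_query, search) are all hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real collections would be far larger.
DOCS = [
    "lawyer argued case before judge",
    "lawyer raised matter with judge",
    "judge heard case and matter",
    "lawyer reviewed case and matter",
    "pilot landed plane safely",
]
# Assumed stopword list, just enough for this corpus.
STOPWORDS = {"before", "with", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def build_cooccurrence(docs):
    """For each word, count how many documents it shares with each other word."""
    cooc = defaultdict(Counter)
    for doc in docs:
        words = tokenize(doc)
        for w in words:
            for other in words:
                if other != w:
                    cooc[w][other] += 1
    return cooc


def expand_query(term, cooc, top_n=3):
    """Return the term plus its top_n co-occurring words (ties broken alphabetically)."""
    ranked = sorted(cooc.get(term, Counter()).items(), key=lambda kv: (-kv[1], kv[0]))
    return [term] + [word for word, _ in ranked[:top_n]]


def search(terms, docs):
    """Return indices of matching documents, best match first."""
    scored = []
    for i, doc in enumerate(docs):
        score = len(tokenize(doc) & set(terms))
        if score:
            scored.append((-score, i))
    return [i for _, i in sorted(scored)]


cooc = build_cooccurrence(DOCS)
expanded = expand_query("lawyer", cooc)
plain_hits = search(["lawyer"], DOCS)
expanded_hits = search(expanded, DOCS)
print(expanded)
print(plain_hits, expanded_hits)
```

In this sketch the plain keyword search misses the document "judge heard case and matter" because the word "lawyer" never appears in it, while the expanded query finds it through the associated words. A production system would weight terms rather than count them equally.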

Please read the entire interview on ESI BYTES:

Search for Aviation Jobs with AviationJobSearch.com

June 17th, 2009 by Charles S. Knight
Posted in Job Search, Verticals | 1 Comment »

Jobseekers, search for your next aviation job using the method that best suits your needs. You can search the database by selecting jobs in certain locations or in specific job categories, or you can use the advanced search and enter keywords to narrow your search.

If you know what you are looking for, you can browse the database by showing all jobs in a particular location or job category.

Advanced aviation job search
Search for aviation jobs by entering keywords and narrowing your search by location, job category, contract type and age of advert.

Source: AviationJobSearch.com

Search Engine Yauba to Launch Iranian Version

June 17th, 2009 by Charles S. Knight
Posted in Global, News | No Comments »

Yauba has just informed AltSearchEngines that it will be accelerating the development of an Iranian (Farsi) version of its Privacy Safe Search Engine.  Over the past several days, traffic from Iran to http://www.yauba.com has increased more than 300%, driven primarily by Iranians seeking to anonymously access uncensored information about the Iranian elections.

According to Sameer Khan, a vice president at Yauba, “A large number of individuals from Iran have written to thank us for the Yauba service, as it has allowed them to not only search for, but also visit, foreign news sites that have been blocked by the Iranian government.  The Farsi version will allow even more people to freely and privately access information. To achieve this goal we are making a public Twitter appeal for Farsi experts to assist with the Iranian version of Yauba.”

Yauba is justly proud of the fact that it uses no cookies, does not store any personal information and allows users to browse third party sites on a completely anonymous basis.  It is known for having one of the shortest privacy policies of any major Internet service and is currently available in English, French, Italian, Portuguese, and Russian.

So do you speak and write Farsi?  Leave a comment here and we will pass it on to Yauba!

Bing Search Improves on the Search Results Page

June 17th, 2009 by Guest Author
Posted in Guest Authors, Majors, Reviews | 1 Comment »

So Bing has a good name, one that lets you replace “To Google Something” with “To Bing Something!”, and it has been getting tons of positive media coverage. But from the searcher’s point of view, what is the big fuss? Well, the good news is that Microsoft has done its homework this time, and Bing is definitely a better search engine than the old Live search service.

By and large, it is a very similar search experience to Google in terms of display layout and results relevancy, which is a good thing. However, Bing has some cool innovations that Google currently does not have. I think they not only improve on the Google design but really help the searcher, and I give Microsoft and the Bing Search team credit for thinking beyond the Google offering.

Sure, the Bing search screen has a nice interactive background that changes every day, but for me, it’s all about the search results page!

Microsoft research apparently identified that one of the biggest frustrations in search today is clicking through to a page only to find it is not the right one, so helping searchers identify the correct page from the results page became a main area of focus for the Bing team.

I am happy to say that the focus on delivering useful information on the results page itself has been a good one, and I think it has delivered real, tangible benefits for searchers.

Now on to the Bing search results layout…

Please read the rest of this article here:

By Felix

Expert System adds Semantic Search for Twitter

June 17th, 2009 by Charles S. Knight
Posted in Updates | No Comments »

Expert System, a leading provider of semantic software that searches, discovers, classifies and interprets unstructured text information, recently announced it has developed a new semantic search Web site for Twitter content.

With Expert System’s COGITO® engine, hash-tag topics and Twitter users are tagged quickly and accurately, making them easy to locate among all updates within the Twitterverse.

“The site is intended for users to avoid the lengthy process of reviewing ‘who said what’ within a hash tag before deciding to join, respond or read further,” said Luca Scagliarini, vice president of business development at Expert System. “This site graphically shows what people are talking about topically and who is talking about those things most intensively. In this way, a rich understanding of the hash-tag composition is revealed without resorting to the cumbersome review process using the current Twitter interface.”

Semantic technology makes more accurate Twitter hash-tag search possible since it understands the conversations categorically no matter how differently authors express themselves. For example, in a hash tag about global warming, one “tweet” may express the thought that “poverty causes global warming,” while another may express the idea that “global warming causes poverty.” Current keyword technology in virtually all Twitter interfaces would not be able to distinguish between these two very different concepts since the same words are used. However, Expert System’s semantic COGITO engine understands and records the logic of what causes what.
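The keyword limitation described above can be shown in a few lines of Python. This is only a toy sketch and says nothing about how COGITO works internally: the bag_of_words and causal_relation helpers are hypothetical, and splitting on the word "causes" is an assumed stand-in for real semantic parsing.

```python
from collections import Counter


def bag_of_words(text):
    """Keyword view: count words and ignore their order."""
    return Counter(text.lower().split())


def causal_relation(text):
    """Toy relational view: split on ' causes ' to get (cause, effect), or None."""
    cause, sep, effect = text.lower().partition(" causes ")
    if not sep:
        return None
    return (cause.strip(), effect.strip())


tweet_a = "poverty causes global warming"
tweet_b = "global warming causes poverty"

# Both tweets use exactly the same words, so the keyword view cannot tell them apart.
same_keywords = bag_of_words(tweet_a) == bag_of_words(tweet_b)

# Keeping the direction of the relation separates the two claims.
relation_a = causal_relation(tweet_a)
relation_b = causal_relation(tweet_b)

print(same_keywords, relation_a, relation_b)
```

The two tweets are identical as word counts but differ once the direction of "causes" is recorded, which is the distinction the press release attributes to semantic analysis.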

“Semantic technology can understand as humans do, and cut across the infinite ways of expressing similar things, tally them up and display the wisdom that crowds possess,” Scagliarini said. “Expert System built this site to demonstrate that the capture, enhancement and documentation of social wisdom is facilitated by semantic technology.”

Policy makers, researchers, and everyday users will find the value of Twitter much more apparent with the Expert System site. The company has applied the same technique to a variety of public and private information sets to gain efficiencies, new understanding, and early warnings about the world around us.

For access to the Twitter semantic search site, contact Brooke Aker, CEO of Expert System USA at baker <at> expertsystem <dot> net.