
by Hope Leman
novo|seek is an information extraction system developed by the Spanish IT company Bioalma for searching published knowledge in biomedical literature. Last month, I had the opportunity to talk in depth about novo|seek with Ramón Alonso-Allende, the Marketing and Business Development Director at Bioalma.
I first heard about novo|seek earlier this year through the editor of AltSearchEngines.com, Charles Knight. I then began to see references to it on Twitter and posts from novo|seek in The Life Scientists room on FriendFeed. I tried out novo|seek and wrote a glowing review for AltSearchEngines. I then spoke with Ramón Alonso-Allende by phone, he in Madrid and I at my home in Oregon. During our interview he said some fascinating things, which are included below.
The Interview
First of all, you began by discussing the origins of Bioalma, the parent firm of novo|seek. You mentioned a research scientist who helped found the company and who developed some key techniques in biomedical search. Could you please tell us who that person is, what search tools he developed and how they led to the development of novo|seek?
The technology base of novo|seek originated at the National Center of Biotechnology and was largely invented by Prof. Alfonso Valencia, a former director of the Protein Design Group at the NCB, who now serves as Director of the Structural Biology and Biocomputing Programme at the Spanish National Center for Cancer Research (CNIO).
This technology took shape when Bioalma began examining the literature on predicting the function of genes. We found that text mining had not been implemented as a solution as much as it could have been, and we saw an opportunity: text mining technologies reveal the relations among concepts within the literature, relations that would be difficult to uncover by reading a wide range of articles in search of what you’re looking for. We saw the potential for text mining as a way to extract the information stored within scientific literature.
What is your own professional background and role in novo|seek?
I’ve been working in the biomedical IT sector for 10 years. I initially joined Bioalma in the scientific research unit and now lead the marketing and business development team. I started my career as a CRA developing new electronic data collection systems and then moved to the Protein Design Group at the National Center of Biotechnology to coordinate the bioinformatic development of a European molecular biology project, during which I developed molecular data integration systems. I have also participated in other European projects developing data integration interfaces. I have a degree in Pharmacy from the Universidad San Pablo in Madrid and an MBA from the Instituto de Empresa, Madrid.
Who is the primary audience for novo|seek? Basic scientists? Frontline medical providers? Can you give us some real-world examples? Say I’m a molecular biologist. How would I use novo|seek? Say I am a practicing neurologist: how would I use novo|seek to help me determine the best treatment for a 40-year-old man with longstanding epilepsy whom I have never treated before?
novo|seek is aimed at a variety of audiences within the biomedical and medical community, including medical doctors and students, medical librarians, and biomedical scientists and researchers.
For example, scientists researching a cure for a disease, or studying a particular gene, can use novo|seek to find the most relevant and most comprehensive information. Medical doctors conducting patient-related queries are another example: novo|seek lets them run searches that bring up the most relevant information immediately, without a labor-intensive search. For medical librarians, novo|seek provides the comprehensive, up-to-date search results they require, as it is equipped to process and catalogue thousands of new articles per day.
There are several ways to perform a search, but we should start as a regular user, namely from the main search box. If we type in the main search terms as specified for this research, such as “longstanding epilepsy”, we get 46 results (as of July 6th, 2009).

At this point, when the results are sorted by date in novo|seek, they are very similar to PubMed’s. The next step is to make the search more precise by adding the filter “epilepsy” to the current search. To do this, we click on the first concept in the category “Diseases or Syndromes” in the left sidebar. We now have 34 results in Medline.
One of the main issues we face is that “longstanding epilepsy” is not recognized as a disease and therefore is not mapped. This is why both “longstanding” and “epilepsy” appear in bold in the results. Nevertheless, we sort the results by relevance and start looking through them. The first one looks promising.
If it is not, we can keep scrolling for more results.
Novice researchers can use the concepts sidebar to refine the search even further. Advanced users, on the other hand, could use the following query to retrieve results for a 40-year-old male suffering from longstanding epilepsy:
“adult”[mesh] “male”[mesh] longstanding epilepsy
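As a minimal sketch of how such an advanced query could be assembled programmatically, the snippet below joins MeSH-tagged terms and free text into a single string of the same shape as the query above. The helper name `build_query` is hypothetical and is not part of novo|seek or PubMed; the sketch only builds the string and does not send it to any service.

```python
def build_query(mesh_terms, free_text):
    """Join MeSH-tagged terms and free text into one space-separated query.

    Hypothetical helper for illustration only: it formats a string and
    makes no assumptions about how a search engine parses it.
    """
    # Wrap each controlled-vocabulary term in quotes and tag it with [mesh].
    tagged = [f'"{term}"[mesh]' for term in mesh_terms]
    # Append the free-text part unchanged, apart from trimming whitespace.
    return " ".join(tagged + [free_text.strip()])


query = build_query(["adult", "male"], "longstanding epilepsy")
print(query)
```

Keeping the MeSH filters separate from the free text makes it easy to swap in other filters without retyping the whole query.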
This interview is continued on the Next Generation Science blog here:

















