Stealth Report: Pre-Alpha Search Engine Snagsta

June 10th, 2008 by Charles S. Knight
Posted in News | No Comments »

SNAGSTA

Find Factz, Get a free Powerset T-shirt!

June 10th, 2008 by Charles S. Knight
Posted in Uncategorized | No Comments »

Posted by Mark Johnson

Now that Summer is here, you need a hip, snazzy Club Powerset t-shirt!

Here’s the deal: look for something cool, unique, interesting or surprising in the Factz section of Powerset. Write a blog post with a link to the search, an explanation of what’s cool, and maybe even a screenshot. If you’re feeling motivated, you can post a few searches to increase your chances. On Friday, June 20, Powerset will post links to the best blog posts and send each of those users a Club Powerset t-shirt (we’ll limit the number of t-shirts we give away to the double digits). How easy is that?

The goal is to find queries that show off Powerset’s Factz, which are aggregated from across Wikipedia articles to summarize a topic. When you click a word in Factz, Powerset will show you the sentence it was derived from. Note that oftentimes, Factz come from Wikipedia articles that are different from the topic you were searching for. Try a famous person like Jon Stewart, a company like Atari, or even just a noun like cat. By default, Powerset shows the top three relations, but clicking the “More” button on the bottom will expand to all of the Factz that Powerset found.

Sometimes you can find fun Factz on just a topical query. But you can also use Factz to generate lists. {warning: brief grammar lesson follows!} Notice that all Factz are in a subject-verb-object format called “triples”. You’ll get Factz if you ask a simple question in one of these forms:

* whom/what do [subject] [verb] (try what do italians eat or what did the FDA ban)
* who/what [verb] [object] (try who beheaded charles i or who climbed mount everest)
* [name] and [name] (try cat and mouse or man and woman)
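
To make the three query forms concrete, here is a minimal Python sketch that turns each form into a subject-verb-object pattern and matches it against a tiny list of triples. This is not how Powerset works internally: the `Triple` type, the regular expressions, the `parse_query` and `lookup` helpers, and the toy data are all assumptions made up for illustration.

```python
import re
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """A subject-verb-object fact; None acts as a wildcard in query patterns."""
    subject: Optional[str]
    verb: Optional[str]
    obj: Optional[str]


# Toy data invented for this sketch; not taken from Powerset's index.
FACTZ = [
    Triple("leibniz", "invent", "calculus"),
    Triple("leibniz", "invent", "binary system"),
    Triple("italians", "eat", "pasta"),
    Triple("cat", "chase", "mouse"),
]

# Form 1: whom/what do [subject] [verb]  -> the object is unknown
OBJECT_QUESTION = re.compile(
    r"^(?:whom|what) (?:do|does|did) (?P<subject>.+) (?P<verb>\w+)$"
)
# Form 2: who/what [verb] [object]       -> the subject is unknown
SUBJECT_QUESTION = re.compile(r"^(?:who|what) (?P<verb>\w+) (?P<obj>.+)$")
# Form 3: [name] and [name]              -> the verb is unknown
PAIR_QUERY = re.compile(r"^(?P<a>.+) and (?P<b>.+)$")


def parse_query(query: str) -> Optional[Triple]:
    """Map a query onto a triple pattern, or None if no form matches."""
    q = query.lower().strip().rstrip("?")
    # Form 1 is tried first, because "what do ..." would also fit form 2.
    m = OBJECT_QUESTION.match(q)
    if m:
        return Triple(m["subject"], m["verb"], None)
    m = SUBJECT_QUESTION.match(q)
    if m:
        return Triple(None, m["verb"], m["obj"])
    m = PAIR_QUERY.match(q)
    if m:
        return Triple(m["a"], None, m["b"])
    return None


def lookup(pattern: Triple) -> List[Triple]:
    """Return every stored triple that agrees with the non-wildcard slots."""
    return [
        fact
        for fact in FACTZ
        if all(want is None or want == got for want, got in zip(pattern, fact))
    ]


pattern = parse_query("What did Leibniz invent")
print([fact.obj for fact in lookup(pattern)])  # ['calculus', 'binary system']
```

The sketch compares verbs literally, so a query such as who beheaded charles i would only match a stored verb spelled "beheaded"; a real system would need to normalize verb forms, which is left out here.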

You can often use a simple topical query to guess what questions will have good answers. For example, a query for Leibniz returns a number of things that he invented. Therefore, the query What did Leibniz invent turns out to be an excellent Factz query. Try this for any topic you find interesting if you’re not sure what questions to ask.

To get in the running for a Club Powerset t-shirt, leave a comment in the blog and send us an e-mail at feedback@powerset.com with “Screenshots” in the title, your address, and the size of t-shirt you want. If you post your screenshot to Flickr, tag it with powersetscreenshots.

We look forward to a few surprises, a few a-has, and hopefully a lot of laughs.

Earth911 searches for recycling centers near you

June 10th, 2008 by Charles S. Knight
Posted in Reviews | No Comments »


Earth 911 delivers actionable local information on recycling and product stewardship that empowers consumers to act locally, live responsibly and contribute to sustainability.

Both the Earth911.com site and 1-800-CLEANUP toll-free hotline are provided at no cost to the user or taxpayer. Earth 911 centralizes information and resources into a single user-friendly, neutral and non-governmental network.

If you like the web, I bet you’ll love South Korea!

June 10th, 2008 by Guest Author
Posted in Global, Guest Authors | No Comments »

Guest Author Joop Dorresteijn
Original post on TheNextWeb


That is, if you like to be connected to the Internet all day while enjoying the fastest connections in the world! Enter the hyperconnected society, with an astounding 90 percent of the country connected with 3G and nationwide coverage of a South Korean version of WiMAX.

How and why did South Korea become an overlord in Internet speed? In short, the South Korean government introduced a number of policy instruments to stimulate technological learning, aimed at strengthening the international competitiveness of the economy. The government launched a five-year plan to create a ubiquitous networked world in 1995, meaning that the country developed a stunning 1.5 billion dollar wireless network to stimulate the use of the Internet.

Today, South Korea is the most connected country on earth, but the funny thing is that we hardly hear anything about Korea’s web scene. This made us curious about what websites are popular over there, and whether Korea has a web 2.0 scene. To find out, we reviewed the three most visited websites in Korea and interviewed Chang W. Kim, Korean web 2.0 guru and initiator of the Open Web Asia ‘08 conference.

The top three websites in South Korea

When we look at the Alexa list of most visited websites in South Korea, we find a social network, search engine and internet portal. They all have western equivalents, but are slightly different.

1. Cyworld
Social network Cyworld is a trendsetter in e-commerce, generating an astounding amount of revenue last year, surpassing that of almighty Facebook. The majority of its cash flow comes from digital presents, not advertisements, with the biggest share coming from the ability to buy your friend a song. That service alone made Cyworld the second-biggest music store in the world, behind iTunes, in 2006.

2. Naver

Search engine Naver is probably the world’s best at localized results. For the last few years, we have seen increasing numbers of search engines popping up on the Internet, using advanced algorithms to come up with better results. Not in South Korea, at least not at Naver. Employees analyze, index and produce content for their search engine by hand. Despite this labor intensity, Naver accounts for 70% of the search queries in its country, leaving Google far behind with just 2%.

3. Daum
At number three we find Internet portal Daum. The most interesting thing about Daum is the combination of user generated content and the connections to mass media networks and companies. ‘Web stars’ from the video portal make appearances on dedicated TV programs. Also, companies are realizing the power of the new media, and show increasing interest in organizing contests, offering prizes for original video uploads. It seems that Daum is innovative in its approach to user generated content, and might be an interesting thing to look at as an ‘old media company’.

All these websites offer refreshing approaches to income, localization and user generated content. However, none of these top South Korean websites has yet become successful in Europe or the US, and we hardly hear from South Korean web companies on TechCrunch or thenextweb.org, or meet many South Korean friends at conferences in Europe. To me, this doesn’t add up. I suspect that all this connectivity would at least create an internet culture. So what about the web 2.0 culture in South Korea?

The startup scene in South Korea

We approached Chang W. Kim to find out more about the web scene in South Korea. Chang is a blogger at Web 2.0 Asia and CEO of the biggest blog portal in South Korea. He explains: “I think it can be said that the startup industry in Korea is still relatively small, although there is a web climate in Seoul: entrepreneurs meet at barcamp meetings, we have something similar to lunch 2.0 and there are conferences. However, from the investors’ side, there are not enough good startups in this region.”

Chang has written before about why so few South Korean companies get ‘Techcrunched’. He thinks that it’s related to the lack of effort to bring Asian Web 2.0 innovations to the attention of the rest of the world. “Less effort to get these companies known, less attention to Asian Web 2.0 industry, less venture money flowing in, less number of startups, and so on.”

We also asked Chang if he would see South Korean/European web 2.0 projects work together in the future: “Frankly, when Koreans think about web 2.0, they usually think about Silicon Valley, not much about Europe. But if a company is good, I don’t think there is particularly a reason why Korean companies should NOT work with Europeans.”

Business opportunities in a networked world

South Korea is considered a broadband laboratory: a place that allows us to look for answers on how the Internet may evolve in the future. However, bridging the gap between the West and Asia might be difficult. Could it be that South Korean culture simply doesn’t translate to the western world? For example, if we talk about culture, we find that the design of Korean websites is generally busier and cuter than that of their western equivalents. Also, South Korean websites are more aimed at being social and work differently than Facebook. Most South Koreans consider Facebook too complex to work with. This is something for western entrepreneurs to consider before launching a web application in South Korea. Services like Web2Asia and the research that I will conduct in the next few months might help entrepreneurs to be successful in Korea.

Sources

1. Burns, S. (2007, 11 09). VNUnet. Retrieved 06 10, 2008, from VNU.co.uk: http://www.vnu.co.uk/vnunet/news/2203144/korea-portal-naver-sales-rise?vnu_lt=vnu_art_related_articles
2. Businessweek. (2008, 06 10). Businessweek, Investing. Retrieved 06 10, 2008, from Businessweek.com: http://investing.businessweek.com/research/stocks/snapshot/snapshot.asp?symbol=035720.KQ
3. Eriksson, S. (2005, 1 2). State policy for technological innovation in East Asia: A comparative study of South Korea and Taiwan. Asian Geographer (23), pp. 61-91.
4. IOL technology. (2008, 05 28). Google not king in South Korea. Retrieved 06 10, 2008, from iol Technology: http://www.ioltechnology.co.za/article_page.php?from=rss_IOLTechHome&iSectionId=2883&iArticleId=4425561
5. Kim, C.-W. (2008, 06 10). Web 2.0 Asia. Retrieved from http://www.web20asia.com/
6. Stern, A. (2008, 05 29). Akamai Releases State of the Internet report. Retrieved 06 10, 2008, from CenterNetworks: http://www.centernetworks.com/akamai-state-internet
7. Townsend, A. (2004). Seoul: Birth of a broadband metropolis. New York.
8. Fitzpatrick, M. (2008, 05 15). Korea is totally wired. Retrieved 05 18, 2008, from The Sydney Morning Herald.
9. Taylor, C. (2006, 06 14). The future is in South Korea. CNN Business 2.0. Retrieved 05 18, 2008.

Using Semantics to Improve Machine Translation

June 10th, 2008 by Charles S. Knight
Posted in Guest Authors | 3 Comments »


By Kathleen Dahlgren, Ph.D., CTO
Cognition Technologies, Inc.
www.cognition.com

Machine translation technologies currently use one of several statistical algorithms to guess the translation by similarity to known translations. This is a gradient process that produces a set of proposed translations, ordered by the algorithm’s guess as to how likely each translation is to be the correct one. For example, in translating the Spanish sentence “Los obreros trataban de terminar el edificio a tiempo” into English, a ranking of guessed translations might be:

(1) “The workers tried of finish the building in good weather.”
(2) “The workers treated the finish the building at time.”
(3) “The workers tried to terminate the building on time.”
(4) “The workers tried to finish the building with a steady pace.”
(5) “The workers tried to finish the building on time.”

By employing a Semantic Natural Language Processing (NLP) technology, like that developed by Cognition Technologies, the process can use the technology’s deep “understanding” of language to significantly reduce the number of bad guesses coming from the statistical machine translation algorithms. The technology then enables the process to select the most structurally and semantically plausible translations from among the suggested alternatives.

Semantic NLP technology can rule out some of the translations because they cannot be parsed (as in examples (1) and (2) above). It is unlikely that an ungrammatical sentence is a good translation. Cognition’s Semantic NLP (which includes a complete semantic map of the English language) discovers word and phrase meanings, so it can rule out other translations as being semantically implausible, as in (3) above (i.e. buildings aren’t “terminated”, either in the sense of being fired, or in the sense of completing an electrical circuit). Finally, by ranking the semantic plausibility of the remaining translations, Cognition’s deep Semantic NLP can decide that (4) and (5) are good translations, but that (5) is more likely to be accurate.
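
The three steps described above (discard what cannot be parsed, discard what is semantically implausible, rank the rest) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cognition’s implementation: the `Candidate` class, the `rerank` function, the 0.5 threshold, and the hand-assigned `parses` flags and `plausibility` scores are all hypothetical stand-ins for what a real parser and semantic model would compute.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    parses: bool         # stand-in for a real parser's verdict (assumed value)
    plausibility: float  # stand-in for a semantic score in [0, 1] (assumed value)


def rerank(candidates: List[Candidate], min_plausibility: float = 0.5) -> List[Candidate]:
    """Filter and order candidate translations in three steps."""
    # Step 1: drop candidates that cannot be parsed at all.
    grammatical = [c for c in candidates if c.parses]
    # Step 2: drop candidates whose meaning is implausible.
    plausible = [c for c in grammatical if c.plausibility >= min_plausibility]
    # Step 3: rank what is left, most plausible first.
    return sorted(plausible, key=lambda c: c.plausibility, reverse=True)


# Flags and scores below are made up to mirror the five example translations.
CANDIDATES = [
    Candidate("The workers tried of finish the building in good weather.", False, 0.0),
    Candidate("The workers treated the finish the building at time.", False, 0.0),
    Candidate("The workers tried to terminate the building on time.", True, 0.1),
    Candidate("The workers tried to finish the building with a steady pace.", True, 0.6),
    Candidate("The workers tried to finish the building on time.", True, 0.9),
]

ranked = rerank(CANDIDATES)
for candidate in ranked:
    print(candidate.text)
```

With these assumed scores, the first two candidates fall out at the parsing step, the third at the plausibility step, and the remaining two are returned with “on time” ahead of “with a steady pace”.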

In a prototype implementation within a commercial automated translation engine, Cognition’s Semantic NLP was able to eliminate 80% of 1,000 suggested parses for each of 50 sentences. This task showed the power and value of deep NLP for improving statistical language translation.

Conclusion

The accuracy of automated machine translation technology depends on an understanding of language, yet it lacks the resources to achieve a high rate of understanding. Cognition’s Semantic NLP™ can give automated machine translation an understanding of word and sentence meaning that no other technology can.

Dr. Kathleen Dahlgren is the Founder and Chief Technology Officer of Cognition Technologies. She began her career as a professor of computational linguistics at Pitzer College of the Claremont Colleges and then worked for IBM at their Los Angeles Scientific Center, focusing on building a “natural language understanding system.” Dr. Dahlgren has a Ph.D. in Linguistics and a post-doctorate in Computer Science from the University of California, Los Angeles. She has published a number of scholarly articles on the subjects of linguistics and computer science, and is the author of Naive Semantics for Natural Language Understanding. She is the co-author of Cognition’s seminal patent (1998), and she received the Small Business Innovation Award from the U.S. Army in 1995. Currently, she is also an adjunct professor of Linguistics at the University of California, Los Angeles.