Researchers from the Allen Institute for AI have built a computer system capable of teaching itself many facets of broad concepts by scouring and analyzing search engines using natural language processing and computer vision techniques.
Meet the algorithm that can learn “everything about anything”
Posted on May 23, 2014 - 10:16 AM | by Derrick Harris
The most recent advances in artificial intelligence research are pretty staggering, thanks in part to the abundance of data available on the web. We’ve covered how deep learning is helping create self-teaching and highly accurate systems for tasks such as sentiment analysis and facial recognition, but there are also models that can solve geometry and algebra problems, predict whether a stack of dishes is likely to fall over and (from the team behind Google’s word2vec) understand entire paragraphs of text. (Hat tip to frequent commenter Oneasum for pointing out all these projects.)
One of the more interesting projects is a system called LEVAN, which is short for Learn EVerything about ANything and was created by a group of researchers from the Allen Institute for Artificial Intelligence and the University of Washington. One of them, Carlos Guestrin, is also co-founder and CEO of a data science startup called GraphLab. What’s really interesting about LEVAN is that it’s neither human-supervised nor, like many deep learning systems, unsupervised, but what its creators call “webly supervised.”
What that means, essentially, is that LEVAN uses the web to learn everything it needs to know. It scours Google Books Ngrams to learn common phrases associated with a particular concept, then searches for those phrases in web image repositories such as Google Images, Bing and Flickr. For example, LEVAN now knows that “heavyweight boxing,” “boxing ring” and “ali boxing” are all part of the larger concept of “boxing,” and it knows what each one looks like.
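To make the first step concrete, here is a minimal sketch of how phrase discovery might work. It is not LEVAN's actual code: the function name `discover_subconcepts`, the `min_count` cutoff and the counts in `toy_counts` are all made up for illustration, and the follow-up image search is left out entirely.

```python
from typing import Dict, List


def discover_subconcepts(concept: str,
                         ngram_counts: Dict[str, int],
                         min_count: int = 100) -> List[str]:
    """Return n-grams that mention `concept` as a whole word, most frequent first."""
    concept = concept.lower()
    hits = [
        (phrase, count)
        for phrase, count in ngram_counts.items()
        if concept in phrase.lower().split()   # whole-word match only
        and phrase.lower() != concept          # skip the bare concept itself
        and count >= min_count                 # drop rare phrases
    ]
    # Sort by descending count, then alphabetically for a stable order.
    hits.sort(key=lambda pc: (-pc[1], pc[0]))
    return [phrase for phrase, _ in hits]


# Hypothetical counts standing in for an n-gram corpus (not real data).
toy_counts = {
    "heavyweight boxing": 5200,
    "boxing ring": 4100,
    "ali boxing": 900,
    "boxing day sale": 40,   # below the frequency cutoff
    "shoebox": 7000,         # contains "box" but not the word "boxing"
}

subconcepts = discover_subconcepts("boxing", toy_counts)
print(subconcepts)
```

In a real pipeline, each surviving phrase would then become an image search query, which is where the visual side of the learning comes in.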
More impressive still is that because LEVAN uses text and image references to teach itself concepts, it’s also able to learn when words or phrases mean the same thing. So while it might learn, for example, that “Mohandas Gandhi” and “Mahatma Gandhi” are both sub-concepts of “Gandhi,” it will also learn after analyzing enough images that they’re the same person.
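One way to picture that merging step is to group sub-concepts whose images look alike. The sketch below is an assumption-laden toy, not the method the LEVAN researchers describe: it pretends each sub-concept has already been boiled down to a single feature vector (the numbers in `toy_features` are invented), and it uses cosine similarity with an arbitrary threshold to decide what counts as "the same."

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_similar(features: Dict[str, Sequence[float]],
                  threshold: float = 0.95) -> List[List[str]]:
    """Group names whose feature vectors are at least `threshold` similar."""
    names = sorted(features)
    parent = {n: n for n in names}  # union-find forest, one tree per group

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(features[a], features[b]) >= threshold:
                parent[find(b)] = find(a)  # link the two groups

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical averaged image descriptors (made-up numbers).
toy_features = {
    "mahatma gandhi": [0.90, 0.10, 0.20],
    "mohandas gandhi": [0.88, 0.12, 0.21],
    "gandhi statue": [0.10, 0.90, 0.30],
}

merged = merge_similar(toy_features)
print(merged)
```

With these toy vectors, the two names for the same person end up in one group while the statue stays on its own. The threshold does all the work here, so a real system would need to tune it (or use something smarter) to avoid merging things that merely look similar.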
So far, LEVAN has modeled 150 different concepts and more than 50,000 sub-concepts, and has annotated more than 10 million images with information about what’s in them and what’s happening in them. The project website lets you examine its findings for each concept and download the models.
According to a recent presentation by one of its creators, LEVAN was designed to run nicely on the Amazon Web Services cloud, yet another sign of how fast the AI space is moving. Computer science skills and math knowledge are one impediment to broadly accessible AI, but those can be addressed by SDKs, APIs, and other methods of abstracting complexity. However, training AI models can require a lot of computing power, something that is easily available to the likes of Facebook and Google but that everyday users might need to offload to the cloud.