Can Thought Vectors Deliver Human-Level Reasoning?
By transforming words into vectors that can be processed with deep learning techniques, Google pushed the accuracy of its search engine to new heights. Now we’re on the cusp of realizing the next logical step: the creation of thought vectors that may open the door to the sort of natural reasoning that has so far eluded AI.
One person who’s closely tracking the progress in natural language processing (NLP) is Hadayat Seddiqi, director of machine learning at legal tech company InCloudCounsel. Seddiqi foresees a day when AI will deliver semantic understanding of large amounts of written text via thought vectors.
We’re not quite there yet, however. According to Seddiqi, reaching an advanced level of semantic understanding in AI will require several key milestones.
The first milestone was the use of keywords for search engines, which Seddiqi says was one of the earliest use cases of AI that understands natural language. There are two main challenges with keywords: describing what you’re looking for, and finding items that are similar to your description, he tells Datanami.
“The description problem was first solved by using explicit keywords,” he says. “Then, the search for similar items was reduced to finding items that were exactly the same. This obviously doesn’t allow for wiggle room, which is necessary given the fuzzy nature of language.”
What we really want, he continues, is a way to expand the description with synonyms and other semantically related words. “But this requires a lot of manual effort, which doesn’t scale because mapping out every word and its relationships is a gargantuan effort that isn’t even all that well defined.”
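The two stages Seddiqi describes can be made concrete with a toy search sketch: exact keyword matching, then hand-built synonym expansion. The documents and the `SYNONYMS` table below are hypothetical examples, invented to show why the manual approach doesn’t scale.

```python
# Toy document collection (hypothetical examples).
docs = {
    1: "contract termination clause",
    2: "ending an agreement early",
    3: "payment schedule terms",
}

def exact_search(query_words, docs):
    """Return ids of documents containing every query word verbatim."""
    return [i for i, text in docs.items()
            if all(w in text.split() for w in query_words)]

# Exact matching misses doc 2, even though it expresses the same idea.
print(exact_search(["contract", "termination"], docs))  # [1]

# Manual synonym expansion: the hand-curated mapping effort that,
# as Seddiqi notes, is gargantuan and never really complete.
SYNONYMS = {"contract": {"contract", "agreement"},
            "termination": {"termination", "ending"}}

def expanded_search(query_words, docs):
    """Match a document if each query word, or a listed synonym, appears."""
    results = []
    for i, text in docs.items():
        words = set(text.split())
        if all(words & SYNONYMS.get(w, {w}) for w in query_words):
            results.append(i)
    return results

print(expanded_search(["contract", "termination"], docs))  # [1, 2]
```

The expanded search now finds the semantically related document, but only because a person wrote out the synonym table by hand, which is precisely the bottleneck word vectors remove.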
The next milestone on the path to semantic AI was reached several years ago with the advent of word vectors.
With a word vector, a single word is transformed into a column of hundreds of numbers that represent that word within the context of other words around it. It’s a more evolved approach to NLP than older methods, such as one-hot encoding, writes Jayesh Bapu Ahire in “Introduction to Word Vectors.”
“In essence, traditional approaches to NLP, such as one-hot encodings, do not capture syntactic (structure) and semantic (meaning) relationships across collections of words and, therefore, represent language in a very naive way,” Ahire writes.
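Ahire’s point can be illustrated with a small sketch. The dense vectors below are invented for demonstration, not real trained embeddings (which would come from training a model such as word2vec or GloVe on a large corpus), but they show why one-hot encodings are “naive”: every distinct word is orthogonal to every other, so similarity is invisible.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a)) *
            math.sqrt(sum(y * y for y in b)))
    return dot / norm

# One-hot encoding: each word gets its own axis, so near-synonyms
# like "contract" and "agreement" look completely unrelated.
vocab = ["contract", "agreement", "banana"]
one_hot = {w: [1.0 if i == j else 0.0 for j in range(len(vocab))]
           for i, w in enumerate(vocab)}
print(cosine(one_hot["contract"], one_hot["agreement"]))  # 0.0

# Dense word vectors (hypothetical 3-d values): related words point
# in similar directions, so semantic similarity is recoverable.
dense = {"contract":  [0.9, 0.8, 0.1],
         "agreement": [0.8, 0.9, 0.2],
         "banana":    [0.1, 0.0, 0.9]}
print(cosine(dense["contract"], dense["agreement"]) >
      cosine(dense["contract"], dense["banana"]))  # True
```

Real word vectors run to hundreds of dimensions, as the article notes, but the principle is the same: distance in the vector space stands in for relatedness in meaning.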
By applying this concept on Web-scale data, Seddiqi says, we’re able to automate what previously required the manual effort of skilled people.
The next milestone is thought vectors, which Seddiqi also calls “sentence vectors.”
“It’s not always the case that you can describe what you’re looking for with just a collection of keywords,” he says. “Sometimes you want to express a more complex idea, which you can do by putting certain words in a particular order, otherwise known as a phrase or sentence.”
The thought vector extends the word vector approach to longer strings of text, essentially building a single vector for a sentence out of the vectors of its words. AI researcher Geoff Hinton is credited with driving this advance.
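One of the simplest ways to scale word vectors up to sentence vectors is to average them, as in the sketch below. This is not the specific sequence-modeling approach Hinton’s work describes, and the 3-d vectors are invented for illustration, but it shows how a word-level technique can be lifted to whole phrases.

```python
import math

# Hypothetical trained word vectors (real ones have hundreds of dims).
WORD_VECS = {
    "terminate": [0.90, 0.10, 0.00],
    "the":       [0.10, 0.10, 0.10],
    "contract":  [0.80, 0.20, 0.10],
    "end":       [0.85, 0.15, 0.05],
    "agreement": [0.75, 0.25, 0.10],
    "eat":       [0.00, 0.90, 0.10],
    "lunch":     [0.05, 0.85, 0.20],
}

def sentence_vector(sentence):
    """Average the word vectors of a sentence's words."""
    vecs = [WORD_VECS[w] for w in sentence.lower().split()]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

s1 = sentence_vector("terminate the contract")
s2 = sentence_vector("end the agreement")
s3 = sentence_vector("eat the lunch")

# Two sentences sharing no words still come out similar in meaning.
print(cosine(s1, s2) > cosine(s1, s3))  # True
```

Averaging discards word order, which is exactly the information a “more complex idea” may depend on; that is why more sophisticated sentence encoders process the word vectors as an ordered sequence rather than a bag.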
“This enables the expression of rich semantics that weren’t available before, opening the door to a type of search not previously possible,” Seddiqi says. “This technology was found to work in 2018, and it’s not unreasonable to say it will mature and become more widely adopted within the next few years based on current research progress.”
We’re probably a few years away from broad adoption of thought vectors. In the meantime, we can take some time to ponder the ramifications of this exciting new technology, and how it will impact business.
Seddiqi says there’s a clear pattern of hierarchy emerging in the progression of word vectors to thought vectors.
“We’re getting closer to AI understanding ideas at a sentence level using similar techniques from the word level and scaling them up,” he says. “This opens up exciting applications for AI understanding ideas requiring paragraphs, entire documents, or even entire books.”
While NLP software is progressing at a rapid clip, there are sizable hardware barriers that stand between the present and the age of machines with human-like reasoning capabilities.
Hinton hinted at the future of thought vectors in a 2015 speech to the Royal Society in London, where the Google researcher said:
“If we convert a sentence into a vector that captures the meaning of the sentence, then Google can do much better searches; they can search based on what’s being said in a document. Also, if you can convert each sentence in a document into a vector, then you can take that sequence of vectors and [try to model] natural reasoning. And that was something that old fashioned AI could never do.”
One could theoretically take all the documents the world has ever generated, encode them as thought vectors, and run them through a massive neural network. If one could do this, the resulting AI model could be expected to have the same level of reasoning as a human, Hinton said.
“To understand it at a human level, we’re probably going to need human level resources and we have trillions of connections [in our brains],” Hinton continued. “But the biggest networks we have built so far only have billions of connections. So we’re a few orders of magnitude off, but I’m sure the hardware people will fix that.”