AI firms will soon exhaust most of the internet’s data
Can they create more?
In 2006 Fei-Fei Li, then at the University of Illinois, now at Stanford University, saw how mining the internet might help to transform AI research. Linguistic research had identified 80,000 “noun synonym sets”, or synsets: groups of synonyms that described the same sort of thing. The billions of images on the internet, Dr Li reckoned, must offer hundreds of examples of each synset. Assemble enough of them and you would have an AI training resource far beyond anything the field had ever seen. “A lot of people are paying attention to models,” she said. “Let’s pay attention to data.” The result was ImageNet.
This article appeared in the Schools brief section of the print edition under the headline “Mining the net”