Researchers at the non-profit AI research group OpenAI just wanted to train their new text generation software to predict the next word in a sentence. It blew away all of their expectations and was so good at mimicking human writing that they’ve decided to pump the brakes on the research while they explore the damage it could do.
Elon Musk has been clear that he believes artificial intelligence is the “biggest existential threat” to humanity. Musk is one of the primary funders of OpenAI, and though he has taken a backseat role at the organization, its researchers appear to share his concerns about opening a Pandora’s box of trouble. This week, OpenAI shared a paper covering its latest work on text generation technology, but it is deviating from its standard practice of releasing the full research to the public out of fear that the technology could be abused by bad actors. Rather than releasing the fully trained model, it’s releasing a smaller model for researchers to experiment with.
The researchers used 40GB of data pulled from 8 million web pages to train the GPT-2 software. That’s ten times the amount of data they used for the first iteration of GPT. The dataset was pulled together by trawling through Reddit and selecting links to articles that had more than three upvotes. When the training process was complete, they found that the software could be fed a small amount of text and convincingly continue writing at length based on the prompt. It has trouble with “highly technical or esoteric types of content” but when it comes to more conversational writing it generated “reasonable samples” 50 percent of the time.
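To make the link-selection step concrete, here is a minimal Python sketch of that kind of filter. It is an illustration only, not OpenAI’s actual pipeline: the `Post` record, the `select_links` function, and the example URLs are all hypothetical, and dropping duplicate links is an added assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one Reddit submission (not OpenAI's schema)."""
    url: str
    upvotes: int


def select_links(posts, threshold=3):
    """Return outbound URLs from posts with more than `threshold` upvotes.

    Duplicate URLs are kept only once, in first-seen order. That
    de-duplication is an assumption made for this sketch.
    """
    seen = set()
    selected = []
    for post in posts:
        if post.upvotes > threshold and post.url not in seen:
            seen.add(post.url)
            selected.append(post.url)
    return selected


# Made-up sample data for illustration.
posts = [
    Post("https://example.com/a", 12),
    Post("https://example.com/b", 3),   # not more than three upvotes: dropped
    Post("https://example.com/a", 40),  # duplicate link: dropped
    Post("https://example.com/c", 4),
]
print(select_links(posts))
```

The idea behind a filter like this is that upvotes act as a rough, human-supplied signal that a linked page is worth reading, which is cheaper than reviewing millions of pages by hand.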
In one example, the software was fed this paragraph:
In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.
Based on those two sentences, it was able to continue writing this whimsical news story for another nine paragraphs in a fashion that could have believably been written by a human being. Here are the next few paragraphs produced by the machine:
The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.
Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow.
GPT-2 is remarkably good at adapting to the style and content of the prompts it’s given. The Guardian was able to take the software for a spin and tried out the first line of George Orwell’s Nineteen Eighty-Four: “It was a bright cold day in April, and the clocks were striking thirteen.” The program picked up on the tone of the selection and proceeded with some dystopian science fiction of its own:
I was in my car on my way to a new job in Seattle. I put the gas in, put the key in, and then I let it run. I just imagined what the day would be like. A hundred years from now. In 2045, I was a teacher in some school in a poor part of rural China. I started with Chinese history and history of science.
The OpenAI researchers found that GPT-2 performed very well when it was given tasks that it wasn’t necessarily designed for, like translation and summarization. In their report, the researchers wrote that they simply had to prompt the trained model in the right way for it to perform these tasks at a level comparable to that of specialized models. After analyzing a short story about an Olympic race, the software was able to correctly answer basic questions like “What was the length of the race?” and “Where did the race begin?”
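As a rough illustration of what prompting “in the right way” can look like, the Python sketch below builds plain-text prompts for three such tasks. The function names are hypothetical and the exact cue strings (`TL;DR:`, `english = french` pairs, `Q:`/`A:`) are assumptions loosely modeled on the kinds of cues described in the research. The sketch only assembles text; it does not call any model.

```python
def summarization_prompt(article: str) -> str:
    """Append a 'TL;DR:' cue so a language model is nudged to summarize."""
    return f"{article.strip()}\nTL;DR:"


def translation_prompt(examples, sentence: str) -> str:
    """Show a few 'english = french' pairs, then leave the last one open."""
    lines = [f"{english} = {french}" for english, french in examples]
    lines.append(f"{sentence} =")
    return "\n".join(lines)


def question_prompt(passage: str, question: str) -> str:
    """Follow a passage with a question and an open 'A:' for the answer."""
    return f"{passage.strip()}\nQ: {question}\nA:"


# Made-up inputs for illustration.
print(summarization_prompt("The race was held on Friday. It drew a large crowd."))
print(translation_prompt([("good morning", "bonjour")], "thank you"))
print(question_prompt("The race was 100 meters long.",
                      "What was the length of the race?"))
```

In each case the model is still only predicting the next word. The prompt is simply arranged so that the most natural continuation happens to be a summary, a translation, or an answer.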
These excellent results have freaked the researchers out. One concern they have is that the technology would be used to turbo-charge fake news operations. The Guardian published a fake news article written by the software along with its coverage of the research. The article is readable and contains fake quotes that are on topic and realistic. The grammar is better than a lot of what you’d see from fake news content mills. And according to The Guardian’s Alex Hern, it only took 15 seconds for the bot to write the article.
Other potential abuses the researchers listed include automating phishing emails, impersonating others online, and automating the production of harassment. But they also believe that there are plenty of beneficial applications to be discovered. For instance, it could be a powerful tool for developing better speech recognition programs or dialogue agents.
OpenAI plans to engage the AI community in a dialogue about its release strategy and hopes to explore potential ethical guidelines to direct this type of research in the future. The researchers said they will have more to discuss in public in six months.