DeepMind says its new AI coding engine is as good as an average human programmer

February 3, 2022

DeepMind has created an AI system named AlphaCode that it says “writes computer programs at a competitive level.” The Alphabet subsidiary tested its system against coding challenges used in human competitions and found that its program achieved an “estimated rank” placing it within the top 54 percent of human coders. The result is a significant step forward for autonomous coding, says DeepMind, though AlphaCode’s skills are not necessarily representative of the sort of programming tasks faced by the average coder.

Oriol Vinyals, principal research scientist at DeepMind, told The Verge over email that the research was still in the early stages but that the results brought the company closer to creating a flexible problem-solving AI — a program that can autonomously tackle coding challenges that are currently the domain of humans only. “In the longer-term, we’re excited by [AlphaCode’s] potential for helping programmers and non-programmers write code, improving productivity or creating new ways of making software,” said Vinyals.

AlphaCode was tested against challenges curated by Codeforces, a competitive coding platform that shares weekly problems and issues rankings for coders similar to the Elo rating system used in chess. These challenges are different from the sort of tasks a coder might face while making, say, a commercial app. They’re more self-contained and require a wider knowledge of both algorithms and theoretical concepts in computer science. Think of them as very specialized puzzles that combine logic, maths, and coding expertise.

In one example challenge that AlphaCode was tested on, competitors are asked to find a way to convert one string of random, repeated s and t letters into another string of the same letters using a limited set of inputs. Competitors cannot, for example, just type new letters but instead have to use a “backspace” command that deletes several letters in the original string.
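To give a feel for this kind of puzzle, here is a minimal Python sketch of one way a backspace challenge could be solved. The exact rules are an assumption, since the description above does not spell them out: the sketch assumes a source string is typed one character at a time, that any key press may be swapped for a backspace that deletes the most recently typed character (or does nothing if the text is empty), and that the goal is to decide whether a target string can be produced. The function name `can_type` and the parameter names are hypothetical, and this is not presented as AlphaCode's own solution.

```python
def can_type(source: str, target: str) -> bool:
    """Return True if `target` can be produced while typing `source`,
    under the assumed rule that any key press may be replaced by a
    backspace that removes the previously typed character."""
    i = len(source) - 1  # position in source, scanned from the end
    j = len(target) - 1  # position in target, scanned from the end
    while i >= 0:
        if j >= 0 and source[i] == target[j]:
            # Keep this character: it matches the next needed target letter.
            i -= 1
            j -= 1
        else:
            # Drop this character: pressing backspace in its place also
            # erases the character typed just before it, so skip two.
            i -= 2
    # Success only if every target letter was matched.
    return j < 0


print(can_type("ababa", "ba"))   # True
print(can_type("ababa", "bb"))   # False
```

Scanning from the end is a design choice that keeps the check to a single pass: each mismatch forces a backspace, which under the assumed rule always costs two characters of the source.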

Ten of these challenges were fed into AlphaCode in exactly the same format they’re given to humans. AlphaCode then generated a large number of possible answers and winnowed these down by running the code and checking the output, just as a human competitor might. “The whole process is automatic, without human selection of the best samples,” Yujia Li and David Choi, co-leads of the AlphaCode paper, told The Verge over email.
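The generate-then-filter idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DeepMind's pipeline: the candidates here are hand-written toy functions rather than model output, and the names `passes_examples`, `filter_candidates`, and `EXAMPLES` are hypothetical.

```python
from typing import Callable, List, Tuple

# Assumed shapes for this sketch: a candidate maps the problem's input
# text to output text, and an example pairs an input with its expected output.
Candidate = Callable[[str], str]
Example = Tuple[str, str]


def passes_examples(candidate: Candidate, examples: List[Example]) -> bool:
    """Run a candidate on every example; a wrong answer or a crash fails it."""
    for given, expected in examples:
        try:
            if candidate(given) != expected:
                return False
        except Exception:
            return False
    return True


def filter_candidates(candidates: List[Candidate],
                      examples: List[Example]) -> List[Candidate]:
    """Keep only the candidates that reproduce every example output."""
    return [c for c in candidates if passes_examples(c, examples)]


# Toy task for illustration: reverse the input string.
def identity(text: str) -> str:
    return text  # wrong answer


def reverse_slice(text: str) -> str:
    return text[::-1]  # correct answer


def crashes(text: str) -> str:
    return text[len(text)]  # always raises IndexError


EXAMPLES = [("abc", "cba"), ("ab", "ba")]
survivors = filter_candidates([identity, reverse_slice, crashes], EXAMPLES)
print([c.__name__ for c in survivors])  # ['reverse_slice']
```

No person picks a winner at any point in this sketch; candidates survive or fail purely on whether their output matches the examples.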

AlphaCode was tested on 10 challenges that had been tackled by 5,000 users on the Codeforces site. On average, it ranked within the top 54.3 percent of responses, and DeepMind estimates that this gives the system a Codeforces Elo of 1238, which places it within the top 28 percent of users who have competed on the site in the last six months.

“I can safely say the results of AlphaCode exceeded my expectations,” Codeforces founder Mike Mirzayanov said in a statement shared by DeepMind. “I was sceptical [sic] because even in simple competitive problems it is often required not only to implement the algorithm, but also (and this is the most difficult part) to invent it. AlphaCode managed to perform at the level of a promising new competitor.”

DeepMind notes that AlphaCode’s skill set is currently applicable only within the domain of competitive programming but that its abilities open the door to creating future tools that make programming more accessible and, one day, fully automated.

Many other companies are working on similar applications. For example, Microsoft and the AI lab OpenAI have adapted the latter’s language-generating program GPT-3 to function as an autocomplete program that finishes strings of code. (Like GPT-3, AlphaCode is also based on an AI architecture known as a transformer, which is particularly adept at parsing sequential text, both natural language and code.) For the end user, these systems work just like Gmail’s Smart Compose feature, suggesting ways to finish whatever you’re writing.

A lot of progress has been made developing AI coding systems in recent years, but these systems are far from ready to just take over the work of human programmers. The code they produce is often buggy, and because the systems are usually trained on libraries of public code, they sometimes reproduce material that is copyrighted.

In one study of an AI programming tool named Copilot developed by code repository GitHub, researchers found that around 40 percent of its output contained security vulnerabilities. Security analysts have even suggested that bad actors could intentionally write and share code with hidden backdoors online, which then might be used to train AI programs that would insert these errors into future programs.

Challenges like these mean that AI coding systems will likely be integrated slowly into the work of programmers — starting as assistants whose suggestions are treated with suspicion before they are trusted to carry out work on their own. In other words: they have an apprenticeship to carry out. But so far, these programs are learning fast.
