A neural network that spots similarities between programs could help computers code themselves

Computer programming has never been easy. The first coders wrote programs out by hand, scrawling symbols onto graph paper before converting them into large stacks of punched cards that could be processed by the computer. One mark out of place and the whole thing might have to be redone.

Nowadays coders use an array of powerful tools that automate much of the job, from catching errors as you type to testing the code before it’s deployed. But in other ways, little has changed. One silly mistake can still crash a whole piece of software. And as systems get more and more complex, tracking down these bugs gets more and more difficult. “It can sometimes take teams of coders days to fix a single bug,” says Justin Gottschlich, a computer scientist at Intel.

That’s why some people think we should just get machines to program themselves. Automated code generation has been a hot research topic for a number of years. Microsoft is building basic code generation into its widely used software development tools, Facebook has made a system called Aroma that autocompletes small programs, and DeepMind has developed a neural network that can come up with more efficient versions of simple algorithms than those devised by humans. Even OpenAI’s GPT-3 language model can churn out simple pieces of code, such as web page layouts, from natural-language prompts.

Gottschlich and his colleagues call this machine programming. Working with a team from Intel, MIT and the Georgia Institute of Technology in Atlanta, he has developed a system called Machine Inferred Code Similarity, or MISIM, that can extract the meaning of a piece of code—what the code is telling the computer to do—in much the same way as natural-language processing (NLP) systems can read a paragraph written in English. 

MISIM can then suggest other ways the code might be written, offering corrections and ways to make it faster or more efficient. The tool’s ability to understand what a program is trying to do lets it identify other programs that do similar things. In theory, this approach could be used by machines that write their own software, drawing on a patchwork of preexisting programs with minimal human oversight or input.

MISIM works by comparing snippets of code with millions of other programs it has already seen, taken from a large number of online repositories. First it translates the code into a form that captures what it does but ignores how it is written, because two programs written in very different ways sometimes do the same thing. MISIM then uses a neural network to find other code that has a similar meaning. In a preprint, Gottschlich and his colleagues report that MISIM is up to 40 times more accurate than previous code-similarity systems, including Aroma.
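
To make the “different text, same meaning” point concrete, here is a toy illustration in Python. It is only a sketch of the idea: MISIM compares learned representations of code rather than running tests, and the function names and behavioral check below are invented for this example.

```python
# Two syntactically different programs with the same meaning. A purely
# structural comparison (text diff, token matching) would treat them as
# unrelated; a semantic comparison should judge them near-identical.

def sum_even_squares_loop(numbers):
    """Sum the squares of the even numbers, written as an explicit loop."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total

def sum_even_squares_functional(numbers):
    """The same computation, written as a single generator expression."""
    return sum(n ** 2 for n in numbers if n % 2 == 0)

if __name__ == "__main__":
    # A semantic-similarity model would map both snippets to nearby points
    # in an embedding space; here we stand in for that idea by checking
    # that they behave identically on a handful of sample inputs.
    test_inputs = [[], [1, 2, 3, 4], [7, 7, 8], list(range(20))]
    assert all(
        sum_even_squares_loop(xs) == sum_even_squares_functional(xs)
        for xs in test_inputs
    )
    print("Both snippets behave identically on the sample inputs.")
```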

MISIM is an exciting step forward, says Veselin Raychev, CTO at the Swiss-based company DeepCode, whose bug-catching tools—among the most advanced on the market—use neural networks trained on millions of programs to suggest improvements to coders as they write.

But machine learning is still not great at predicting whether or not something is a bug, says Raychev. That’s because it is hard to teach a neural network what is or isn’t an error unless it has been labeled as such by a human.

There’s a lot of interesting research being done with deep neural networks and bug fixing, he says, “but practically they’re not there yet, by a very big margin.” Typically AI bug-catching tools produce lots of false positives, he says.

MISIM gets around this by using machine learning to spot similarities between programs rather than identifying bugs directly. By comparing a new program with an existing piece of software that is known to be correct, it can alert the coder to important differences that could be errors.
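
As a rough sketch of that comparison idea (not MISIM’s actual mechanism; the reference, candidate, and `flag_divergence` functions below are hypothetical), a tool could surface places where new code disagrees with an implementation that is trusted to be correct:

```python
# Toy illustration: a newly written snippet is compared against a known-good
# reference, and any behavioral divergence is flagged for the coder to review.

def reference_average(values):
    """Known-good reference: the mean of a non-empty list."""
    return sum(values) / len(values)

def candidate_average(values):
    """New code that looks similar but hides a subtle bug
    (an off-by-one in the divisor)."""
    return sum(values) / (len(values) - 1)

def flag_divergence(reference, candidate, test_inputs):
    """Return the inputs where the candidate disagrees with the reference."""
    return [
        (xs, reference(xs), candidate(xs))
        for xs in test_inputs
        if reference(xs) != candidate(xs)
    ]

if __name__ == "__main__":
    suspicious = flag_divergence(
        reference_average, candidate_average, [[2.0, 4.0], [1.0, 2.0, 3.0]]
    )
    for inputs, expected, got in suspicious:
        print(f"Possible bug: {inputs} -> expected {expected}, got {got}")
```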

Intel plans to use the tool as a code recommendation system for developers in-house, suggesting alternative ways to write code that are faster or more efficient. But because MISIM is not tied to the syntax of a specific programming language, there is much more it could potentially do. For example, it could be used to translate code written in an old language like COBOL into a more modern language like Python. This matters because a lot of institutions, including the US government, still rely on software written in languages that few coders know how to maintain or update.

Ultimately, Gottschlich thinks this idea could be applied to natural language. Combined with NLP, the ability to work with the meaning of code separately from its textual representation could one day let people write software simply by describing what they want to do in words, he says. 

“Building little apps for your phone, or things like that that will help your everyday life—I think those are not too far off,” says Gottschlich. “I would like to see 8 billion people create software in whatever way is most natural for them.”