AI helps find bugs: Intel’s open source tool ControlFlag

Reducing the time and cost of debugging

Debugging is essential to software development: almost all large-scale software must be debugged to eliminate as many errors as possible.

For most developers, however, the process is time-consuming and largely manual. Fixing a single software defect can take days, weeks, or even months, and by some estimates up to 50% of software development time is spent on debugging. This is because most errors require semantic analysis to identify and evaluate the root cause, and even the most advanced debugging systems cannot perform this analysis effectively.

Justin Gottschlich, chief artificial intelligence scientist at Intel Labs, said: “Although some progress has been made in automated debugging research over the past few decades, existing tools still cannot reliably detect increasingly complex software errors. This is a key reason why debugging remains a largely human-driven process.”

Debugging is also expensive. According to Intel, the IT industry spent approximately US$2 trillion in 2020 on activities related to debugging code, accounting for about half of the average IT budget.

“Super Power” ControlFlag

ControlFlag is reportedly part of Intel’s Machine Programming Research (MPR) project, whose overall goal is to reduce the time required to develop software by a factor of 1,000 through automation. As one step in that direction, Gottschlich’s team is working on eventually extending ControlFlag so that it can automatically fix the errors it detects.

Since launching the machine learning tool last year, Intel has tested it on various software systems with encouraging results. “When we initially designed the system, we didn’t expect it to find highly complex defects,” said Gottschlich. “However, given its self-supervised design, ControlFlag was able to find highly complex and subtle software flaws, and even those of us who built it were shocked.”

The Intel team used an “unsupervised” learning method so that ControlFlag can detect errors across a broader range of repositories. The system learns coding patterns from more than 1 billion lines of unlabeled source code. This lets it achieve high accuracy and even adapt to a developer’s style, so that it can tell genuine software anomalies apart from mere variations in programming style.

ControlFlag works with any programming language that has control structures (such as C/C++). It can keep learning from unlabeled source code and improve as new data is introduced. Although it cannot automatically fix the errors it finds, the tool can suggest potential corrections to developers.
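To make the idea concrete, here is a minimal sketch in Python of pattern-based anomaly detection over C-style `if` conditions. It is not ControlFlag’s implementation: the toy corpus, the regular expressions, the `min_count` threshold, and the function names `abstract`, `learn_patterns`, and `find_anomalies` are all assumptions made for illustration. The sketch reduces each condition to an abstract pattern, counts how often each pattern appears in unlabeled code, flags rare patterns, and suggests the closest common pattern as a possible correction.

```python
import difflib
import re
from collections import Counter

# Hypothetical toy corpus standing in for a large body of unlabeled code.
CORPUS = [
    "if (x == 0) { return; }",
    "if (count == 10) { reset(); }",
    "if (ptr != NULL) { use(ptr); }",
    "if (n == 1) { stop(); }",
    "if (buf != NULL) { release(buf); }",
    "if (flag = 1) { run(); }",  # assignment where a comparison is likely meant
]

# Simplification: only matches conditions without nested parentheses.
IF_CONDITION = re.compile(r"\bif\s*\(([^()]*)\)")
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||[^\s\w]")


def abstract(condition):
    """Replace identifiers and numbers with placeholders, keep operators."""
    out = []
    for tok in TOKEN.findall(condition):
        if tok == "NULL":
            out.append("NULL")
        elif tok[0].isdigit():
            out.append("NUM")
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("VAR")
        else:
            out.append(tok)
    return " ".join(out)


def learn_patterns(snippets):
    """Count how often each abstract condition pattern occurs."""
    counts = Counter()
    for snippet in snippets:
        for cond in IF_CONDITION.findall(snippet):
            counts[abstract(cond)] += 1
    return counts


def find_anomalies(snippets, counts, min_count=2):
    """Flag conditions whose pattern is rare and suggest the closest common one."""
    common = [p for p, c in counts.items() if c >= min_count]
    reports = []
    for snippet in snippets:
        for cond in IF_CONDITION.findall(snippet):
            pattern = abstract(cond)
            if counts[pattern] < min_count:
                nearest = difflib.get_close_matches(pattern, common, n=1, cutoff=0.0)
                reports.append((cond, pattern, nearest[0] if nearest else None))
    return reports


if __name__ == "__main__":
    learned = learn_patterns(CORPUS)
    for cond, pattern, suggestion in find_anomalies(CORPUS, learned):
        print(f"unusual condition '{cond}' ({pattern}); closest common pattern: {suggestion}")
```

No labels are needed in this sketch: a pattern counts as “normal” simply because it is frequent in the corpus. That also means a rare but correct idiom would be flagged, which is one way false positives can arise in this kind of approach.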

Gottschlich also said that, after running ControlFlag on only two proprietary software repositories so far, the team has discovered more than 300 defects in production-quality, deployed programs. Last year, for example, ControlFlag detected a code anomaly in Client URL (cURL), a software project that is used to transfer data over various network protocols more than 1 billion times a day. After the anomaly was reported, the cURL developers confirmed ControlFlag’s finding and reworked the code to fix the problem.

Continuously improving ControlFlag

The Intel team has learned a great deal over the past year of developing ControlFlag. Gottschlich believes that two key areas for improvement are reducing the number of false positives reported by the tool and integrating more advanced semantic analyzers into ControlFlag’s reasoning.

As part of Intel’s machine programming tool suite, ControlFlag will continue to evolve. “ControlFlag’s progress is unlikely to stop,” Gottschlich emphasized. “This is mainly because with the development of software programming languages, hardware description languages and computing devices, ControlFlag also needs to evolve to keep up with them.”

At the same time, Intel’s MPR team is working on other projects focused on simplifying software development. Last year, for example, the company released a tool, jointly developed with a Massachusetts Institute of Technology laboratory, that studies code snippets to understand what the software is intended to do. The system is called MISIM (Machine Inferred Code Similarity). It uses a pre-existing catalog of code to infer the intent behind a new algorithm, and it helps engineers develop software by suggesting alternative ways to program it or options for making the code more efficient.
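The catalog-lookup idea can be sketched in a few lines of Python. This is a deliberately crude illustration and not MISIM’s method: it compares snippets by cosine similarity over normalized token counts, which captures only surface structure, whereas the article describes MISIM as inferring intent. The `CATALOG` entries and the names `features`, `cosine`, and `most_similar` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical catalog mapping a description of intent to a known snippet.
CATALOG = {
    "sum of a list (loop)": "total = 0\nfor v in values:\n    total += v",
    "maximum of a list (loop)": (
        "best = values[0]\nfor v in values:\n    if v > best:\n        best = v"
    ),
    "reverse a string": "result = text[::-1]",
}

KEYWORDS = {"for", "in", "if", "while", "return"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def features(code):
    """Token counts with identifiers and numbers replaced by placeholders."""
    normalized = []
    for tok in TOKEN.findall(code):
        if tok in KEYWORDS:
            normalized.append(tok)
        elif tok[0].isdigit():
            normalized.append("NUM")
        elif tok[0].isalpha() or tok[0] == "_":
            normalized.append("ID")
        else:
            normalized.append(tok)
    return Counter(normalized)


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(snippet, catalog):
    """Return the catalog label whose snippet is closest to the new snippet."""
    query = features(snippet)
    scored = [(cosine(query, features(code)), label) for label, code in catalog.items()]
    score, label = max(scored)
    return label, score


if __name__ == "__main__":
    new_code = "acc = 0\nfor item in data:\n    acc += item"
    label, score = most_similar(new_code, CATALOG)
    print(f"closest catalog entry: {label} (similarity {score:.2f})")
```

Once the closest catalog entry is known, a tool could attach suggestions to it, for instance pointing out that a hand-written summation loop might be replaced by a built-in. Renaming variables does not change the match here because identifiers are normalized before comparison.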

Gottschlich expects that MISIM will one day work together with ControlFlag. “When the right ideas are fused together, we envision a more powerful new system that would be able to detect all the defects that ControlFlag can currently detect, as well as hundreds of defects that currently cannot be detected due to their potential complexity,” Gottschlich said.
