PRIME - Programming with Millions of Examples
We present PRIME, a tool that uses static specification mining to extract specifications of library APIs from a large number of code fragments that use them. The collected samples are then used for code completion and verification and, with the help of data mining techniques, are aggregated into use cases and sorted by popularity and complexity.
Programming is increasingly a matter of using frameworks and libraries, most of which are designed to support a wide range of usage scenarios. Typically, a programmer needs only part of a library's functionality, but must still navigate its extensive interface (API) to find out how to implement the desired behavior. Instead of working through complicated library code and documentation, programmers often rely on code examples from client programs that use the library. Such examples can usually be obtained easily from library documentation, from other programmers, or through a myriad of search engines and other online tools. Making sense of this vast number of examples, however, can be extremely challenging. Code fragments that use the API of interest may appear in slightly different contexts and are often interleaved with irrelevant code, making it hard for a programmer to tease out the relevant details. Furthermore, any given code sample may use the API in an erroneous or sub-optimal way. These factors make it hard for a human to benefit from the large amount of available information. Using a combination of program analysis and machine learning techniques, PRIME mines library specifications from a large collection of client code that uses the library, allowing programmers to write new code against the library even when they are not familiar with it. The results from PRIME can also be used for automatic completion or verification of new code that uses the API.
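To make the idea of aggregating samples and ranking them by popularity more concrete, here is a minimal Python sketch. It is not PRIME's algorithm: PRIME relies on static analysis of client code, whereas this toy assumes each sample has already been reduced to a flat list of API method names. The function names `rank_usages` and `complete`, and the file-like API in the sample data, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def rank_usages(samples):
    """Group identical call sequences and order them by popularity.

    Assumption: each sample is a list of API method names. Ties in
    popularity are broken by sequence length (a crude stand-in for
    complexity), then lexicographically to keep the order deterministic.
    """
    counts = Counter(tuple(sample) for sample in samples)
    return sorted(counts.items(), key=lambda kv: (-kv[1], len(kv[0]), kv[0]))


def complete(prefix, ranked):
    """Suggest the remaining calls of the most popular sequence that
    starts with `prefix`, or None if no mined sequence matches."""
    prefix = tuple(prefix)
    for sequence, _count in ranked:
        if sequence[:len(prefix)] == prefix:
            return list(sequence[len(prefix):])
    return None


# Hypothetical mined samples for an imaginary file-like API.
samples = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
]

ranked = rank_usages(samples)
suggestion = complete(["open"], ranked)
```

With this data, the most frequent sequence is `open, read, close`, so a programmer who has written only `open` would be offered `read` followed by `close`. A realistic system would also have to handle irrelevant interleaved code and partially matching contexts, which this sketch ignores.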