Skip to main content


ArtForm is a concolic analysis tool for JavaScript validation code attached to web forms.

This is a collaboration with Casper Svenning Jensen and Anders Møller, Aarhus University.

Our prototype tool is available on GitHub. Please contact Ben Spencer for support installing and using it.

Form Analysis

ArtForm Manual Mode: Allows manual interaction with the page and trace recording.

When crawling the web, data is often hidden behind forms with validation constraints. We aim to analyse JavaScript code attached to forms and infer their integrity constraints, retrieve or model the hidden data, and improve the efficiency of searching or data-extraction tools.

  • Dynamic features of JavaScript and real-world code make static analysis difficult.
  • We use concolic analysis and perform a symbolic execution driven by concrete runs.
  • ArtForm is built on Artemis, a tool for automated testing of JavaScript applications.

ArtForm Architecture

ArtForm Architecture: An instrumented version of WebKit records symbolic information which is used to generate new concrete
            inputs to test.

Interface Analyser

  • Chooses the entry-point.
  • Could be buttons, links, images or inputs and forms may cover the entire page.
  • ArtForm uses DIADEM for page analysis.

Test Driver

  • Loads the test page with no saved state.
  • Injects the given inputs into the form.
  • Simulates real click on submit button.

Instrumented WebKit Browser

  • Controls all code execution and events.
  • Allows access to the DOM implementation and page information.

Symbolic Interpreter

  • Tracks both concrete and symbolic values.
  • Form inputs are initially symbolic variables and symbolic information is propagated as they are used.
  • Records the path taken and the symbolic branches observed.

Constraint Solver

  • Solves the path constraint to generate the next input values to test.
  • Translates our internal constraints to input for a third-party solver.
  • ArtForm uses CVC4.

Trace Classifier

  • Decides whether a trace was a successful submission or not.
  • Based on alerts, changes to the DOM and page loads.

Path Tree

  • Stores information about the previous runs.

Search Procedure

  • Chooses the next path to explore.
  • We are Investigating good search strategies to find interesting parts of the tree more quickly.



Search form for

The execution tree for the

Symbolic branches explored: 629
Traces explored: 412
Successful submissions: 248
Failed submissions: 164
Constraints solved: 692

Dark green, numbered: Event handlers
Light blue, two children: Symbolic branches
Right/green edges: Condition is true, branch taken
Left/red edges: Condition is false, branch not taken
Green 'S': Successful submission
Red 'F': Failed submission
Purple 'U': Unreachable path (constraint was unsatisfiable)
Orange '!': Alert message
Light green 'L': New page loaded

Full execution tree (2.4MB png)

Page Analysis Tools

As well as our main analysis, we have developed some tools to help manual analysis of web forms and their validation code. Manual interactions with a form are recorded as both concrete and symbolic traces, which are linked to the original page source code.

Path Trace Report: Shows event handlers and functions called in the recorded trace. Coverage
            Report: Shows which lines in the JavaScript source were covered or contained symbolic information.


  • Event handlers
    • Each field has its own validation function and may depend on other fields.
    • Triggering them in-order is a useful guess, but may miss some information.
    • Checking dependence between the handlers is difficult.
    • Testing all orderings is not feasible.
  • Forms may be updated dynamically.
  • Select boxes and radio buttons have implied constraints found in the DOM.
  • JavaScript’s coercion semantics and NaN are difficult to model and solve.
  • Some constraint types are not supported by the solver or our translation.
  • JavaScript minification and obfuscation.
  • Long loops and repeated code.

Future Work

  • Continue testing on real-world sites to identify common patterns which we do not handle well.
  • Track event handler dependencies and use partial order methods to choose appropriate orderings to test.
  • Use heuristics or static analysis to target the search and speed up exploration of the most interesting parts of the trees.
  • Work on converting the trees to useful descriptions of the constraints.
  • Infer higher-level constraints such as set containment which are not shown directly in our traces.
  • New features for our translator and the solver, for example more complex regular expression tests or string inequalities.

Principal Investigator


Share this: