clangml: OCaml bindings for clang.

clangml provides bindings for all versions of clang, from 3.4 to 8.0.0.

Introduction

clangml is a complete rewrite of the previous clangml (clangml versions <4.0.0): the bindings now rely on automatically generated C stubs to libclang, with some extensions where libclang is incomplete. Unlike old clangml versions, the versions of clangml from 4.0.0 are independent of the version of the clang library: any version of clangml from 4.0.0 can be built with any version of the clang library in the supported interval. Currently, all versions of clang, from 3.4 to 8.0.0, are supported.

However, clangml is statically linked to libclang, so clangml must be rebuilt for each version of libclang it is to run with. In addition, the low-level bindings are automatically generated from libclang’s headers, and their signatures can change from one version of libclang to another.

The high-level bindings (Clang.Ast, Clang.Type, Clang.Expr, Clang.Stmt, Clang.Decl and Clang.Enum_constant) provide abstractions that are essentially independent of the libclang version. These abstractions mainly aim to provide an algebraic datatype representation of the Clang abstract syntax tree (AST). Note that the way clang parses files can differ from one version to another (in particular, some features of the C/C++ languages are only supported by recent versions of clang; see some examples in the Clang__ast module documentation).

Installation

clangml is installable via opam: opam install clangml.

Manual installation requires a bootstrapped source directory. Commits from branch snapshot are bootstrapped: a new snapshot is committed by continuous integration after every successful build from master.

Snapshot tarball: https://gitlab.inria.fr/tmartine/clangml/-/archive/snapshot/clangml-snapshot.tar.gz

To build clangml from snapshot or from a bootstrapped source directory, you may either:

* execute ./configure && make && make install (this method is recommended if you have to pass some options to configure);
* execute opam pin add git+https://gitlab.inria.fr/tmartine/clangml.git#snapshot.

To bootstrap the repository from a development branch (e.g., master), execute ./bootstrap.sh first, then ./configure && make && make install as usual.

clangml’s configure relies on llvm-config to find clang’s library. By default, llvm-config is searched in PATH, and you may specify a path with ./configure --with-llvm-config=....

clangml requires some dependencies: opam install dune stdcompat ppx_deriving visitors. Additionally, to run make tests: opam install ocamlcodoc.

libclang and other external dependencies can be installed with the opam depext plugin:

opam pin add -n git+https://gitlab.inria.fr/tmartine/clangml.git#snapshot
opam depext -i clangml

(The -n option asks opam pin not to install clangml directly, and the -i option asks opam depext to install clangml once dependencies are installed.)

Usage

The module Clang provides direct bindings to most of the symbols defined by libclang. To match OCaml conventions, camel-case symbols have been renamed to lower-case symbols with underscores, and clang_ prefixes have been removed. Additional bindings have been defined in libclang_extensions.h for some parts of clang’s API that are not covered by libclang.
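
The renaming convention described above can be illustrated with a small, self-contained sketch. The function ocaml_name_of_libclang_symbol is a hypothetical helper written for this README only: it is not part of clangml and is not how stubgen is implemented. It simply strips a leading clang_ prefix and turns camel case into lower case with underscores. How runs of consecutive capitals (acronyms) are handled here is an assumption and may differ from the names actually generated.

```ocaml
(* Hypothetical helper, not part of clangml: mimics the naming convention
   described above for a single libclang symbol name. *)
let ocaml_name_of_libclang_symbol (symbol : string) : string =
  let prefix = "clang_" in
  let prefix_length = String.length prefix in
  (* Drop the clang_ prefix when it is present. *)
  let stripped =
    if String.length symbol >= prefix_length
       && String.sub symbol 0 prefix_length = prefix
    then String.sub symbol prefix_length (String.length symbol - prefix_length)
    else symbol
  in
  let buffer = Buffer.create (String.length stripped + 4) in
  String.iteri
    (fun i c ->
      if c >= 'A' && c <= 'Z' then begin
        (* Insert an underscore only when the upper-case letter follows a
           lower-case letter or a digit (assumption about acronyms: a run of
           capitals is lowered without inner underscores). *)
        if i > 0 then begin
          let previous = stripped.[i - 1] in
          if (previous >= 'a' && previous <= 'z')
             || (previous >= '0' && previous <= '9')
          then Buffer.add_char buffer '_'
        end;
        Buffer.add_char buffer (Char.lowercase_ascii c)
      end
      else Buffer.add_char buffer c)
    stripped;
  Buffer.contents buffer

(* Print the name that the convention predicts for two libclang symbols. *)
let () =
  List.iter
    (fun symbol ->
      Printf.printf "%s -> %s\n" symbol (ocaml_name_of_libclang_symbol symbol))
    [ "clang_createIndex"; "clang_getCursorSpelling" ]
```

Under this convention, a libclang function such as clang_createIndex would be looked up as create_index in the module Clang. The generated interface clang__bindings.mli remains the reference for the exact names.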

The module Clang.Ast provides a higher-level interface to clang’s AST. The function Clang.Ast.parse_file returns the AST of a file and Clang.Ast.parse_string returns the AST of a string. You may try these functions in the OCaml toplevel to explore the resulting data structure.

The module Clang.Ast includes in particular the module Clang__ast which declares the algebraic data types that represent the AST. The module Clang__ast uses ppx_deriving and visitors to make the data structure comparable, showable and visitable. The documentation of most of the nodes contains examples that can be used as references for how syntactic constructions are parsed, and that are extracted with ocamlcodoc and serve as unit tests with dune runtest (or, equivalently, make tests).

The modules Clang.Type, Clang.Expr, Clang.Stmt, Clang.Decl and Clang.Enum_constant provide sub-modules Set and Map as well as high-level abstractions over some of libclang’s bindings.

In particular:

Generating a new seed

Three files, clang_stubs.c, clang__bindings.ml and clang__bindings.mli, are generated for each version of LLVM by the stubgen tool (sub-directory stubgen).

To generate these files for a given version of LLVM, you may run: stubgen --llvm-config=$PATH_TO_LLVM_CONFIG $TARGET_PATH.

stubgen depends on pcre and cmdliner.