module Cil:CIL API Documentation. An html version of this document can be found at http://hal.cs.berkeley.edu/cilsig..end
val initCIL : unit -> unitCil.msvcMode.val cilVersion : stringval cilVersionMajor : intval cilVersionMinor : intval cilVersionRevision : intFrontc.parse: string -> unit -> Cil.file. This function must be given
the name of a preprocessed C file and will return the top-level data
structure that describes a whole source file. By default the parsing and
elaboration into CIL is done as for GCC source. If you want to use MSVC
source you must set the Cil.msvcMode to true and must also invoke the
function Frontc.setMSVCMode: unit -> unit.Cil.file using the following iterators: Cil.mapGlobals,
Cil.iterGlobals and Cil.foldGlobals. You can also use the
Cil.dummyFile when you need a Cil.file as a placeholder. For each
global item CIL stores the source location where it appears (using the
type Cil.location)type file = {
|
mutable fileName : |
(* | The complete file name | *) |
|
mutable globals : |
(* | List of globals as they will appear in the printed file | *) |
|
mutable globinit : |
(* | An optional global initializer function. This is a function where
you can put stuff that must be executed before the program is
started. This function, is conceptually at the end of the file,
although it is not part of the globals list. Use Cil.getGlobInit
to create/get one. | *) |
|
mutable globinitcalled : |
(* | Whether the global initialization function is called in main. This should always be false if there is no global initializer. When you create a global initialization CIL will try to insert code in main to call it. This will not happen if your file does not contain a function called "main" | *) |
typecomment =location * string
type global =
| |
GType of |
(* | A typedef. All uses of type names (through the TNamed constructor)
must be preceded in the file by a definition of the name. The string
is the defined name and always not-empty. | *) |
| |
GCompTag of |
(* | Defines a struct/union tag with some fields. There must be one of
these for each struct/union tag that you use (through the TComp
constructor) since this is the only context in which the fields are
printed. Consequently nested structure tag definitions must be
broken into individual definitions with the innermost structure
defined first. | *) |
| |
GCompTagDecl of |
(* | Declares a struct/union tag. Use as a forward declaration. This is printed without the fields. | *) |
| |
GEnumTag of |
(* | Declares an enumeration tag with some fields. There must be one of
these for each enumeration tag that you use (through the TEnum
constructor) since this is the only context in which the items are
printed. | *) |
| |
GEnumTagDecl of |
(* | Declares an enumeration tag. Use as a forward declaration. This is printed without the items. | *) |
| |
GVarDecl of |
(* | A variable declaration (not a definition). If the variable has a function type then this is a prototype. There can be several declarations and at most one definition for a given variable. If both forms appear then they must share the same varinfo structure. A prototype shares the varinfo with the fundec of the definition. Either has storage Extern or there must be a definition in this file | *) |
| |
GVar of |
(* | A variable definition. Can have an initializer. The initializer is updateable so that you can change it without requiring to recreate the list of globals. There can be at most one definition for a variable in an entire program. Cannot have storage Extern or function type. | *) |
| |
GFun of |
(* | A function definition. | *) |
| |
GAsm of |
(* | Global asm statement. These ones can contain only a template | *) |
| |
GPragma of |
(* | Pragmas at top level. Use the same syntax as attributes | *) |
| |
GText of |
(* | Some text (printed verbatim) at top level. E.g., this way you can put comments in the output. | *) |
Cil.typ.
Among types we differentiate the integral types (with different kinds
denoting the sign and precision), floating point types, enumeration types,
array and pointer types, and function types. Every type is associated with
a list of attributes, which are always kept in sorted order. Use
Cil.addAttribute and Cil.addAttributes to construct list of
attributes. If you want to inspect a type, you should use
Cil.unrollType or Cil.unrollTypeDeep to see through the uses of
named types.Cil.bitsSizeOf, the alignment of a type
(in bytes) Cil.alignOf_int, and can convert an offset into a start and
width (both in bits) using the function Cil.bitsOffset. At the moment
these functions do not take into account the packed attributes and
pragmas.type typ =
| |
TVoid of |
(* | Void type. Also predefined as Cil.voidType | *) |
| |
TInt of |
(* | An integer type. The kind specifies the sign and width. Several
useful variants are predefined as Cil.intType, Cil.uintType,
Cil.longType, Cil.charType. | *) |
| |
TFloat of |
(* | A floating-point type. The kind specifies the precision. You can
also use the predefined constant Cil.doubleType. | *) |
| |
TPtr of |
(* | Pointer type. Several useful variants are predefined as
Cil.charPtrType, Cil.charConstPtrType (pointer to a
constant character), Cil.voidPtrType,
Cil.intPtrType | *) |
| |
TArray of |
(* | Array type. It indicates the base type and the array length. | *) |
| |
TFun of |
(* | Function type. Indicates the type of the result, the name, type
and name attributes of the formal arguments (None if no
arguments were specified, as in a function whose definition or
prototype we have not seen; Some [] means void). Use
Cil.argsToList to obtain a list of arguments. The boolean
indicates if it is a variable-argument function. If this is the
type of a varinfo for which we have a function declaration then
the information for the formals must match that in the
function's sformals. Use Cil.setFormals, or
Cil.setFunctionType, or Cil.makeFormalVar for this
purpose. | *) |
| |
TNamed of |
(* | The use of a named type. Each such type name must be preceded
in the file by a GType global. This is printed as just the
type name. The actual referred type is not printed here and is
carried only to simplify processing. To see through a sequence
of named type references, use Cil.unrollType or
Cil.unrollTypeDeep. The attributes are in addition to those
given when the type name was defined. | *) |
| |
TComp of |
(* | The most delicate issue for C types is that recursion that is possible by
using structures and pointers. To address this issue we have a more
complex representation for structured types (struct and union). Each such
type is represented using the Cil.compinfo type. For each composite
type the Cil.compinfo structure must be declared at top level using
GCompTag and all references to it must share the same copy of the
structure. The attributes given are those pertaining to this use of the
type and are in addition to the attributes that were given at the
definition of the type and which are stored in the Cil.compinfo. | *) |
| |
TEnum of |
(* | A reference to an enumeration type. All such references must
share the enuminfo among them and with a GEnumTag global that
precedes all uses. The attributes refer to this use of the
enumeration and are in addition to the attributes of the
enumeration itself, which are stored inside the enuminfo | *) |
| |
TBuiltin_va_list of |
(* | This is the same as the gcc's type with the same name | *) |
Cil.isIntegralType,
Cil.isArithmeticType,
Cil.isPointerType,
Cil.isFunctionType,
Cil.isArrayType.
There are two easy ways to scan a type. First, you can use the
Cil.existsType to return a boolean answer about a type. This function
is controlled by a user-provided function that is queried for each type that is
used to construct the current type. The function can specify whether to
terminate the scan with a boolean result or to continue the scan for the
nested types.
The other method for scanning types is provided by the visitor interface (see
Cil.cilVisitor).
If you want to compare types (or to use them as hash-values) then you should
use instead type signatures (represented as Cil.typsig). These
contain the same information as types but canonicalized such that simple Ocaml
structural equality will tell whether two types are equal. Use
Cil.typeSig to compute the signature of a type. If you want to ignore
certain type attributes then use Cil.typeSigWithAttrs.
type ikind =
| |
IChar |
(* | char | *) |
| |
ISChar |
(* | signed char | *) |
| |
IUChar |
(* | unsigned char | *) |
| |
IInt |
(* | int | *) |
| |
IUInt |
(* | unsigned int | *) |
| |
IShort |
(* | short | *) |
| |
IUShort |
(* | unsigned short | *) |
| |
ILong |
(* | long | *) |
| |
IULong |
(* | unsigned long | *) |
| |
ILongLong |
(* | long long (or _int64 on Microsoft Visual C) | *) |
| |
IULongLong |
(* | unsigned long long (or unsigned _int64 on Microsoft
Visual C) | *) |
type fkind =
| |
FFloat |
(* | float | *) |
| |
FDouble |
(* | double | *) |
| |
FLongDouble |
(* | long double | *) |
type attribute =
| |
Attr of |
(* | An attribute has a name and some optional parameters. The name should not start or end with underscore. When CIL parses attribute names it will strip leading and ending underscores (to ensure that the multitude of GCC attributes such as const, __const and __const__ all mean the same thing.) | *) |
typeattributes =attribute list
Cil.addAttribute and Cil.addAttributes to insert attributes in an
attribute list and maintain the sortedness.type attrparam =
| |
AInt of |
(* | An integer constant | *) |
| |
AStr of |
(* | A string constant | *) |
| |
ACons of |
(* | Constructed attributes. These
are printed foo(a1,a2,...,an).
The list of parameters can be
empty and in that case the
parentheses are not printed. | *) |
| |
ASizeOf of |
(* | A way to talk about types | *) |
| |
ASizeOfE of |
|||
| |
ASizeOfS of |
(* | Replacement for ASizeOf in type signatures. Only used for attributes inside typsigs. | *) |
| |
AAlignOf of |
|||
| |
AAlignOfE of |
|||
| |
AAlignOfS of |
|||
| |
AUnOp of |
|||
| |
ABinOp of |
|||
| |
ADot of |
(* | a.foo * | *) |
| |
AStar of |
(* | a | *) |
| |
AAddrOf of |
(* | & a * | *) |
| |
AIndex of |
(* | a1a2 | *) |
| |
AQuestion of |
(* | a1 ? a2 : a3 * | *) |
Cil.compinfo describes the definition of a
structure or union type. Each such Cil.compinfo must be defined at the
top-level using the GCompTag constructor and must be shared by all
references to this type (using either the TComp type constructor or from
the definition of the fields.
If all you need is to scan the definition of each
composite type once, you can do that by scanning all top-level GCompTag.
Constructing a Cil.compinfo can be tricky since it must contain fields
that might refer to the host Cil.compinfo and furthermore the type of
the field might need to refer to the Cil.compinfo for recursive types.
Use the Cil.mkCompInfo function to create a Cil.compinfo. You can
easily fetch the Cil.fieldinfo for a given field in a structure with
Cil.getCompField.
type compinfo = {
|
mutable cstruct : |
(* | True if struct, False if union | *) |
|
mutable cname : |
(* | The name. Always non-empty. Use Cil.compFullName to get the full
name of a comp (along with the struct or union) | *) |
|
mutable ckey : |
(* | A unique integer. This is assigned by Cil.mkCompInfo using a
global variable in the Cil module. Thus two identical structs in two
different files might have different keys. Use Cil.copyCompInfo to
copy structures so that a new key is assigned. | *) |
|
mutable cfields : |
(* | Information about the fields. Notice that each fieldinfo has a pointer back to the host compinfo. This means that you should not share fieldinfo's between two compinfo's | *) |
|
mutable cattr : |
(* | The attributes that are defined at the same time as the composite
type. These attributes can be supplemented individually at each
reference to this compinfo using the TComp type constructor. | *) |
|
mutable cdefined : |
(* | This boolean flag can be used to distinguish between structures that have not been defined and those that have been defined but have no fields (such things are allowed in gcc). | *) |
|
mutable creferenced : |
(* | True if used. Initially set to false. | *) |
Cil.mkCompInfo to
make one and use Cil.copyCompInfo to copy one (this ensures that a new
key is assigned and that the fields have the right pointers to parents.).Cil.fieldinfo structure is used to describe
a structure or union field. Fields, just like variables, can have
attributes associated with the field itself or associated with the type of
the field (stored along with the type of the field).type fieldinfo = {
|
mutable fcomp : |
(* | The host structure that contains this field. There can be only one
compinfo that contains the field. | *) |
|
mutable fname : |
(* | The name of the field. Might be the value of Cil.missingFieldName
in which case it must be a bitfield and is not printed and it does not
participate in initialization | *) |
|
mutable ftype : |
(* | The type | *) |
|
mutable fbitfield : |
(* | If a bitfield then ftype should be an integer type and the width of the bitfield must be 0 or a positive integer smaller or equal to the width of the integer type. A field of width 0 is used in C to control the alignment of fields. | *) |
|
mutable fattr : |
(* | The attributes for this field (not for its type) | *) |
|
mutable floc : |
(* | The location where this field is defined | *) |
GEnumTag for each of
of these.type enuminfo = {
|
mutable ename : |
(* | The name. Always non-empty. | *) |
|
mutable eitems : |
(* | Items with names and values. This list should be non-empty. The item values must be compile-time constants. | *) |
|
mutable eattr : |
(* | The attributes that are defined at the same time as the enumeration
type. These attributes can be supplemented individually at each
reference to this enuminfo using the TEnum type constructor. | *) |
|
mutable ereferenced : |
(* | True if used. Initially set to false | *) |
GEnumTag for each of
of these.type typeinfo = {
|
mutable tname : |
(* | The name. Can be empty only in a GType when introducing a composite
or enumeration tag. If empty cannot be referred to from the file | *) |
|
mutable ttype : |
(* | The actual type. This includes the attributes that were present in the typedef | *) |
|
mutable treferenced : |
(* | True if used. Initially set to false | *) |
Cil.varinfo
structure. A global Cil.varinfo can be introduced with the GVarDecl or
GVar or GFun globals. A local varinfo can be introduced as part of a
function definition Cil.fundec.
All references to a given global or local variable must refer to the same
copy of the varinfo. Each varinfo has a globally unique identifier that
can be used to index maps and hashtables (the name can also be used for this
purpose, except for locals from different functions). This identifier is
constructor using a global counter.
It is very important that you construct varinfo structures using only one
of the following functions:
Cil.makeGlobalVar : to make a global variableCil.makeTempVar : to make a temporary local variable whose name
will be generated so that to avoid conflict with other locals. Cil.makeLocalVar : like Cil.makeTempVar but you can specify the
exact name to be used. Cil.copyVarinfo: make a shallow copy of a varinfo assigning a new name
and a new unique identifiervarinfo is also used in a function type to denote the list of formals.type varinfo = {
|
mutable vname : |
(* | The name of the variable. Cannot be empty. It is primarily your
responsibility to ensure the uniqueness of a variable name. For local
variables Cil.makeTempVar helps you ensure that the name is unique. | *) |
|
mutable vtype : |
(* | The declared type of the variable. | *) |
|
mutable vattr : |
(* | A list of attributes associated with the variable. | *) |
|
mutable vstorage : |
(* | The storage-class | *) |
|
mutable vglob : |
(* | True if this is a global variable | *) |
|
mutable vinline : |
(* | Whether this varinfo is for an inline function. | *) |
|
mutable vdecl : |
(* | Location of variable declaration. | *) |
|
mutable vid : |
(* | A unique integer identifier. This field will be
set for you if you use one of the Cil.makeFormalVar,
Cil.makeLocalVar, Cil.makeTempVar, Cil.makeGlobalVar, or
Cil.copyVarinfo. | *) |
|
mutable vaddrof : |
(* | True if the address of this variable is taken. CIL will set these
flags when it parses C, but you should make sure to set the flag
whenever your transformation create AddrOf expression. | *) |
|
mutable vreferenced : |
(* | True if this variable is ever referenced. This is computed by
removeUnusedVars. It is safe to just initialize this to False | *) |
|
mutable vdescr : |
(* | For most temporary variables, a description of what the var holds. (e.g. for temporaries used for function call results, this string is a representation of the function call.) | *) |
|
mutable vdescrpure : |
(* | Indicates whether the vdescr above is a pure expression or call. Printing a non-pure vdescr more than once may yield incorrect results. | *) |
type storage =
| |
NoStorage |
(* | The default storage. Nothing is printed | *) |
| |
Static |
|||
| |
Register |
|||
| |
Extern |
Cil.exp. There are several
interesting aspects of CIL expressions:
Integer and floating point constants can carry their textual representation. This way the integer 15 can be printed as 0xF if that is how it occurred in the source.
CIL uses 64 bits to represent the integer constants and also stores the width
of the integer type. Care must be taken to ensure that the constant is
representable with the given width. Use the functions Cil.kinteger,
Cil.kinteger64 and Cil.integer to construct constant
expressions. CIL predefines the constants Cil.zero,
Cil.one and Cil.mone (for -1).
Use the functions Cil.isConstant and Cil.isInteger to test if
an expression is a constant and a constant integer respectively.
CIL keeps the type of all unary and binary expressions. You can think of that type qualifying the operator. Furthermore there are different operators for arithmetic and comparisons on arithmetic types and on pointers.
Another unusual aspect of CIL is that the implicit conversion between an
expression of array type and one of pointer type is made explicit, using the
StartOf expression constructor (which is not printed). If you apply the
AddrOf}constructor to an lvalue of type T then you will be getting an
expression of type TPtr(T).
You can find the type of an expression with Cil.typeOf.
You can perform constant folding on expressions using the function
Cil.constFold.
type exp =
| |
Const of |
(* | Constant | *) |
| |
Lval of |
(* | Lvalue | *) |
| |
SizeOf of |
(* | sizeof(<type>). Has unsigned int type (ISO 6.5.3.4). This is not
turned into a constant because some transformations might want to
change types | *) |
| |
SizeOfE of |
(* | sizeof(<expression>) | *) |
| |
SizeOfStr of |
(* | sizeof(string_literal). We separate this case out because this is the only instance in which a string literal should not be treated as having type pointer to character. | *) |
| |
AlignOf of |
(* | This corresponds to the GCC __alignof_. Has unsigned int type | *) |
| |
AlignOfE of |
|||
| |
UnOp of |
(* | Unary operation. Includes the type of the result. | *) |
| |
BinOp of |
(* | Binary operation. Includes the type of the result. The arithmetic conversions are made explicit for the arguments. | *) |
| |
CastE of |
(* | Use Cil.mkCast to make casts. | *) |
| |
AddrOf of |
(* | Always use Cil.mkAddrOf to construct one of these. Apply to an
lvalue of type T yields an expression of type TPtr(T). Use
Cil.mkAddrOrStartOf to make one of these if you are not sure which
one to use. | *) |
| |
StartOf of |
(* | Conversion from an array to a pointer to the beginning of the array.
Given an lval of type TArray(T) produces an expression of type
TPtr(T). Use Cil.mkAddrOrStartOf to make one of these if you are
not sure which one to use. In C this operation is implicit, the
StartOf operator is not printed. We have it in CIL because it makes
the typing rules simpler. | *) |
type constant =
| |
CInt64 of |
(* | Integer constant. Give the ikind (see ISO9899 6.1.3.2) and the
textual representation, if available. (This allows us to print a
constant as, for example, 0xF instead of 15.) Use Cil.integer or
Cil.kinteger to create these. Watch out for integers that cannot be
represented on 64 bits. OCAML does not give Overflow exceptions. | *) |
| |
CStr of |
(* | String constant. The escape characters inside the string have been already interpreted. This constant has pointer to character type! The only case when you would like a string literal to have an array type is when it is an argument to sizeof. In that case you should use SizeOfStr. | *) |
| |
CWStr of |
(* | Wide character string constant. Note that the local interpretation
of such a literal depends on Cil.wcharType and Cil.wcharKind.
Such a constant has type pointer to Cil.wcharType. The
escape characters in the string have not been "interpreted" in
the sense that L"A\xabcd" remains "A\xabcd" rather than being
represented as the wide character list with two elements: 65 and
43981. That "interpretation" depends on the underlying wide
character type. | *) |
| |
CChr of |
(* | Character constant. This has type int, so use charConstToInt to read the value in case sign-extension is needed. | *) |
| |
CReal of |
(* | Floating point constant. Give the fkind (see ISO 6.4.4.2) and also the textual representation, if available. | *) |
| |
CEnum of |
(* | An enumeration constant with the given value, name, from the given
enuminfo. This is used only if Cil.lowerConstants is true
(default). Use Cil.constFoldVisitor to replace these with integer
constants. | *) |
type unop =
| |
Neg |
(* | Unary minus | *) |
| |
BNot |
(* | Bitwise complement (~) | *) |
| |
LNot |
(* | Logical Not (!) | *) |
type binop =
| |
PlusA |
(* | arithmetic + | *) |
| |
PlusPI |
(* | pointer + integer | *) |
| |
IndexPI |
(* | pointer + integer but only when
it arises from an expression
e[i] when e is a pointer and
not an array. This is semantically
the same as PlusPI but CCured uses
this as a hint that the integer is
probably positive. | *) |
| |
MinusA |
(* | arithmetic - | *) |
| |
MinusPI |
(* | pointer - integer | *) |
| |
MinusPP |
(* | pointer - pointer | *) |
| |
Mult |
|||
| |
Div |
(* | / | *) |
| |
Mod |
(* | % | *) |
| |
Shiftlt |
(* | shift left | *) |
| |
Shiftrt |
(* | shift right | *) |
| |
Lt |
(* | < (arithmetic comparison) | *) |
| |
Gt |
(* | > (arithmetic comparison) | *) |
| |
Le |
(* | <= (arithmetic comparison) | *) |
| |
Ge |
(* | > (arithmetic comparison) | *) |
| |
Eq |
(* | == (arithmetic comparison) | *) |
| |
Ne |
(* | != (arithmetic comparison) | *) |
| |
BAnd |
(* | bitwise and | *) |
| |
BXor |
(* | exclusive-or | *) |
| |
BOr |
(* | inclusive-or | *) |
| |
LAnd |
(* | logical and. Unlike other
expressions this one does not
always evaluate both operands. If
you want to use these, you must
set Cil.useLogicalOperators. | *) |
| |
LOr |
(* | logical or. Unlike other
expressions this one does not
always evaluate both operands. If
you want to use these, you must
set Cil.useLogicalOperators. | *) |
a[0][1][2]might involve 1, 2 or 3 memory reads when used in an expression context, depending on the declared type of the variable
a. If a has type int
[4][4][4] then we have one memory read from somewhere inside the area
that stores the array a. On the other hand if a has type int *** then
the expression really means * ( * ( * (a + 0) + 1) + 2), in which case it is
clear that it involves three separate memory operations.
An lvalue denotes the contents of a range of memory addresses. This range
is denoted as a host object along with an offset within the object. The
host object can be of two kinds: a local or global variable, or an object
whose address is in a pointer expression. We distinguish the two cases so
that we can tell quickly whether we are accessing some component of a
variable directly or we are accessing a memory location through a pointer.
To make it easy to
tell what an lvalue means CIL represents lvalues as a host object and an
offset (see Cil.lval). The host object (represented as
Cil.lhost) can be a local or global variable or can be the object
pointed-to by a pointer expression. The offset (represented as
Cil.offset) is a sequence of field or array index designators.
Both the typing rules and the meaning of an lvalue is very precisely specified in CIL.
The following are a few useful function for operating on lvalues:
Cil.mkMem - makes an lvalue of Mem kind. Use this to ensure
that certain equivalent forms of lvalues are canonized.
For example, *&x = x. Cil.typeOfLval - the type of an lvalueCil.typeOffset - the type of an offset, given the type of the
host. Cil.addOffset and Cil.addOffsetLval - extend sequences
of offsets.Cil.removeOffset and Cil.removeOffsetLval - shrink sequences
of offsets.Mem(AddrOf(Mem a, aoff)), off = Mem a, aoff + off Mem(AddrOf(Var v, aoff)), off = Var v, aoff + off AddrOf (Mem a, NoOffset) = a
typelval =lhost * offset
type lhost =
| |
Var of |
(* | The host is a variable. | *) |
| |
Mem of |
(* | The host is an object of type T when the expression has pointer
TPtr(T). | *) |
Cil.lval.type offset =
| |
NoOffset |
(* | No offset. Can be applied to any lvalue and does not change either the starting address or the type. This is used when the lval consists of just a host or as a terminator in a list of other kinds of offsets. | *) |
| |