=========================================================================== GENERAL DESCRIPTION =========================================================================== This directory contains the formalism of Softbound's safety, which has been completely mechanized using the Coq proof assistant. The code run with Coq 8.1 (Feb. 2007), 8.1pl3 (Dec. 2007) and 8.2beta4 (Aug. 2008). The formalism includes: * A non-standard operational semantics for C programs. This semantics tracks information about which memory addresses have been allocated and the (run-time) types of values. Crucially, this semantics yields an "error" state whenever a bad C program would cause a spatial-safety violation. The approach adopted is similar to CCured's formalism, but the C language considered here includes the address-of operator & and a named structure type to define recursive types. * A predicate that models the invariants of stack and memory enforced by the Softbound instrumentation and meta-data propagation process. * A proof about the preservation and progress (safety) of the type system. The code in the diretory softbound is the formalism of softbound. The code in the directory softbound++ defines and proves a system more than the imple- mentation of softbound. Softbound has only one sort representation of pointers with the base and bound metadata information. This formalism also contains a SAFE pointer and a SEQ pointer, which are similar to CCured's pointer categories. SAFE pointers are not involved in any pointer arithmetic and in casts. SEQ pointers are not involved in any casts. The pointer representation of Softbound is called TAME pointer in this proof. TAME pointers can be involved in any pointer atithmetic and in casts. The proof proves the safety of a system more complicated than Softbound. The system defaults to Softbound if all pointers are TAME pointers. =========================================================================== CONTENTS =========================================================================== * DIRECTORY : Types.v DESCRIPTION : Definition of types * DIRECTORY : Envs.v DESCRIPTION : Definition of environment, such as Mem, Stack, APIs (read, write, copy, malloc) * DIRECTORY : CSyntax.v DESCRIPTION : Definition of syntax * DIRECTORY : CSemantics.v DESCRIPTION : Operational semantics * DIRECTORY : Axioms.v DESCRIPTION : Axioms of APIs (read, write, copy, malloc) * DIRECTORY : Libs.v DESCRIPTION : Supporting material lemmas * DIRECTORY : Props.v DESCRIPTION : Interesting Lemmas, such as progress and preservation =========================================================================== =========================================================================== Comments of PLDI reviews =========================================================================== * In Section 3.1, the definition for check() given would seem to suffer from an integer overflow. The user can write programs that contain casts such as: (* int[999999999])ptr Where adding the size of the base type will cause the value to wrap around. Thus out-of-bounds pointers can pass the check. Early versions of CCured suffered from such a weakness. Since your formal proof is not visible, I cannot tell if you are as well. Even if your formal proof is visible, it may not correctly model addition on 32-bit numbers. Answer: The integer overflow issue is not a problem in the formal proof. About the integer operations, the formal proof uses Basic Peano arithmetic and Binary integers (Coq libs EqNat, Compare_dec, Omega, Mult and Zdiv). Peano natural numbers is built from zero O and successor operation S. Binary integers, an extension from Peano natural numbers, includes both natural numbers and negative numbers. Those definitions and lemmas are totally abstract, and do not depend on any machine representation of integers, such as the size of an integer, and how to represent a negative number. http://coq.inria.fr/library-eng.html * The operational semantics in Section 4.2: the claims of preservation and progress would seem to suggest that every well-typed command should either result in OK, OutOfMem or Abort. Consider this well-typed program fragment: * ((void *)0) = 7 Neither of the two pointer dereference rules at the bottom of Page 5 would seem to handle this case, since "read(E.M) l" will not equal "some v_(b,e)" for any v_(b,e). Thus the execution will seem to get stuck instead of correctly reaching Abort. This would be a misreading on my part or a simple typo in the presentation; without the checkable proof one cannot be certain. Answer: Section 4.1 defines the syntax and grammar of the fragment of C used in the proof. According to those definitions, * ((void *)0) = 7 should be rewritten into void* _temp; ... _temp = (void *)0; //stmt 1 * _temp = 7; //stmt 2 The constant 0 is casted into a temporary variable _temp with type void *, and then _temp is derefered and assigned as the value 7. stmt1 propagates the meta data (base and bound) information to _temp. The meta data of _temp are 0_(0,0) after stmt1, and stored indiced by _temp. The formalism checks the out-of-bounds in stmt2 at the dereference of _temp. According to the semantics at the bottom of Page 5, _temp is evaluated into the address of _temp, l, first, and then read(E.M) l returns some 0_(0, 0). A runtime assertion will fail, because 0 <= 0 /\ 0 + sizeof(void) <= 0 does not hold.