Bounded Model Checking for C Programs in an Enterprise Environment
1. Bounded Model Checking for C Programs in an Enterprise Environment
Michael Tautschnig
Amazon Web Services & Queen Mary University of London
2. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Customer: I would like to get a guarantee that there are no security bugs in this software.
3. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
“Software”
4. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
“Software” ecosystem of
can’t be published,
but …
5. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Ample Open-Source Software “out there”
• Debian (http://sources.debian.net/stats/, 21st October 2016)
  • 26,900 source packages
  • 13,736,903 individual source files
  • 1,276,743,654 lines of source code (any programming language)
  • 45.5% (approx. 500M) C code, 22.2% C++, 5.6% shell, 4.7% Java
• SourceForge, GitHub, CodePlex, ...: how to automate any kind of analysis?
• Distributions (Red Hat, Ubuntu/Debian, SUSE, … - but also industrial setups)!
  • Software organised in source packages
  • Uniform interface to access/download packages
  • Uniform build interface, dependency management
6. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
How?
7. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Building one Source Package: Compiler Tool-chain
• For now: C source code only
• goto-cc (part of CBMC distribution)
• Uses compiler’s (here: GCC’s) preprocessor
• Own C parser/front end (no Cil, LLVM, EDG, ...)
• Supports GCC, Visual Studio, CodeWarrior, ARM-CC dialects and command line options
• Builds intermediate representation understood by CBMC/CProver tools
• Linking of compiled files/archives/libraries
8. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Supporting arbitrary Build Systems
• Builds are performed in chroot environments
• /usr/bin/gcc and /usr/bin/ld replaced by scripts invoking goto-cc (+ more work)
• Key procedure:
1. Run real compiler/linker (gcc/ld)
2. Compile/link using goto-cc
3. Add result as additional ELF section
• Resulting file remains executable
• Stable under file renaming, archiving, etc.
• Linking stage extracts intermediate representation from extra ELF section
[Diagram: a single ELF file holding the x86 binary and the CProver IR]
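The key procedure above can be sketched as a small planning function that a replacement /usr/bin/gcc script might use. This is a minimal sketch under stated assumptions, not the scripts from the talk: the path `/usr/bin/gcc.real`, the section name `goto-cc`, and the `.goto` suffix are hypothetical, and only the `-o file` and `-c` forms of the command line are handled. It relies on goto-cc accepting GCC-style options and on `objcopy --add-section name=file`, which updates the object file in place when no output file is given.

```python
from pathlib import Path

REAL_GCC = "/usr/bin/gcc.real"  # hypothetical: where the real compiler was moved
IR_SECTION = "goto-cc"          # assumed name of the extra ELF section


def output_file(args):
    """Guess the file gcc would write for this (simplified) command line."""
    if "-o" in args:
        return args[args.index("-o") + 1]
    if "-c" in args:
        sources = [a for a in args if a.endswith(".c")]
        return Path(sources[0]).stem + ".o"
    return "a.out"


def plan_wrapped_compile(args):
    """Return the three commands the wrapper would run, in order."""
    out = output_file(args)
    ir = out + ".goto"  # hypothetical temporary file for the IR

    # Drop "-o <file>" so that goto-cc can be pointed at the IR file instead.
    stripped, skip = [], False
    for arg in args:
        if skip:
            skip = False
        elif arg == "-o":
            skip = True
        else:
            stripped.append(arg)

    return [
        [REAL_GCC, *args],                                        # 1. real compiler/linker
        ["goto-cc", *stripped, "-o", ir],                         # 2. compile/link using goto-cc
        ["objcopy", "--add-section", f"{IR_SECTION}={ir}", out],  # 3. attach IR to the ELF file
    ]
```

Each command list could be passed to `subprocess.run`. Because the IR travels inside the ELF file, renaming or archiving the object file keeps both representations together.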
9. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Building Thousands of Packages
10. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Infrastructure: (Ab-)using Jenkins
Scripts, notes, configuration: https://github.com/tautschnig/cprover-debian
• Jenkins master: 4 cores, 64 GB
• 5 slave nodes: each 64 cores, 256 GB memory
• Ultimate Debian Database: package versions, bugs
• Debian mirror: source archives
• Link labels in the diagram: SQL, SSH, FTP
11. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Current per-package Work Flow
[Workflow diagram; boxes:]
• Compile, link
• Store archive of all object files/executables
• dump-c: create human-readable C code from IR
• Add generic assertions (pointer checks, arithmetic overflow, no-NaN, ...)
• Run CBMC w/ unwinding bound 1, Z3/MiniSat (DAC’03, TACAS’04, CAV’13)
• Loop acceleration (CAV’13)
• Re-compile using goto-cc
• Static weak memory cycles (TOPLAS/PLDI’14)
• Re-compile using gcc (errors not fatal)
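The "add generic assertions" and "run CBMC" boxes can be illustrated by the command line such a job might assemble. This is a hedged sketch: the flags `--unwind`, `--pointer-check`, `--signed-overflow-check`, `--nan-check` and `--z3` are CBMC command-line options, but mapping them onto the checks named on the slide is an assumption, and the actual pipeline may add the assertions in a separate instrumentation step. The function name and the file name `prog.goto` are hypothetical.

```python
def cbmc_command(goto_binary, unwind=1, solver=None):
    """Build a CBMC invocation with generic checks and a fixed unwinding bound."""
    cmd = [
        "cbmc", goto_binary,
        "--unwind", str(unwind),    # unwinding bound (1 in the workflow above)
        "--pointer-check",          # pointer checks
        "--signed-overflow-check",  # arithmetic overflow
        "--nan-check",              # no-NaN
    ]
    if solver == "z3":
        cmd.append("--z3")          # otherwise CBMC uses its built-in SAT back end
    return cmd


print(" ".join(cbmc_command("prog.goto", solver="z3")))
```

With an unwinding bound of 1 each loop body is explored at most once, so any counterexample found is reachable within that bound, while the absence of one says nothing about deeper iterations.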
12. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Results?
13. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Exercising Language Front Ends
[Per-package workflow diagram, as on slide 11]
14. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Exercising Language Front Ends
• Many bug fixes and improvements to the parser, type checker
• Re-engineering of parts of the linker
• Bug fixes in IR construction
• Compilation (without further analysis steps) of entire archive: ~2 days
• > 250 GB of compressed archives of IR object files/executables
• 10314 archives available: http://theory.eecs.qmul.ac.uk/debian+mole/pkgs/
15. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Results Relevant to Practitioners: Bug Reports
• Key feature: type checking at link time
• 844 bugs reported, 530 already fixed by developers
• Hundreds still to be reported
• http://bugs.debian.org/cgi-bin/pkgreport.cgi?users=mt@debian.org&tag=goto-cc&archive=both
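Type checking at link time matters because a C compiler sees one translation unit at a time and an ordinary linker matches symbols by name only, so inconsistent declarations of the same function can go unnoticed. The following is a minimal sketch of the idea, not goto-cc's implementation: the `Decl` record and the category strings are hypothetical, and types are compared as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decl:
    unit: str          # translation unit that declares or defines the symbol
    name: str          # symbol name
    return_type: str
    params: tuple      # parameter types, in order


def link_time_conflicts(decls):
    """Compare each declaration with the first one seen for the same symbol."""
    first_seen = {}
    problems = []
    for decl in decls:
        ref = first_seen.setdefault(decl.name, decl)
        if ref is decl:
            continue  # first declaration of this symbol
        if len(ref.params) != len(decl.params):
            problems.append((decl.name, "parameter counts differ", ref.unit, decl.unit))
        elif ref.return_type != decl.return_type:
            problems.append((decl.name, "conflicting return types", ref.unit, decl.unit))
        elif ref.params != decl.params:
            problems.append((decl.name, "conflicting types", ref.unit, decl.unit))
    return problems


decls = [
    Decl("a.c", "f", "int", ("int",)),
    Decl("b.c", "f", "int", ("int", "int")),  # caller assumes an extra parameter
]
print(link_time_conflicts(decls))
```

Each reported tuple names the symbol, the kind of mismatch, and the two translation units that disagree.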
16. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Reporting bugs
17. Automated Testing using SMID | Michael Tautschnig
Where are the cats?
• CAV’14: J. Alglave, D. Kroening, V. Nimal, D. Poetzl: Don't sit on the fence: A static analysis approach to automatic fence insertion
• PLDI’14/TOPLAS: J. Alglave, L. Maranget, M. Tautschnig: Herding Cats - Modelling, simulation, testing, and data-mining for weak memory (cited in Linux Weekly News and C/C++ WG21/N4036)
18. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Focus on improving/developing Methods
[Per-package workflow diagram, as on slide 11]
19. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
TOPLAS/PLDI’14: analysing 200 million LOC for potential weak memory susceptibility
20. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Automated Information Leak Detection
21. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Analysing the Patched Version
22. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Overall Analysis Status (preliminary!)
[Per-package workflow diagram, as on slide 11]
23. Bounded Model Checking for C Programs in an Enterprise Environment | Michael Tautschnig
Overall Analysis Status (preliminary!)
• In addition to 314 bugs reported and not yet fixed: 4915 packages with error reports - top causes:
  1789  CBMC counterexamples (including several using loop acceleration)
  1711  Loop acceleration bugs
   200  Floating point support in Z3 back end
   198  Type-inconsistent access to heap with symbolic offset
   129  CBMC Out-of-memory
    54  Parameter counts differ
    48  Conflicting array sizes
    46  Conflicting types
    42  Conflicting struct types
    32  Conflicting return types (byte size)
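To put the counts above in proportion, the arithmetic can be sketched as follows. The numbers are taken from the slides; the sketch assumes each package is counted under a single cause, which the slide does not state, and the helper name `shares` is hypothetical.

```python
TOTAL_PACKAGES = 4915  # packages with error reports

TOP_CAUSES = {
    "CBMC counterexamples": 1789,
    "Loop acceleration bugs": 1711,
    "Floating point support in Z3 back end": 200,
    "Type-inconsistent access to heap with symbolic offset": 198,
    "CBMC Out-of-memory": 129,
    "Parameter counts differ": 54,
    "Conflicting array sizes": 48,
    "Conflicting types": 46,
    "Conflicting struct types": 42,
    "Conflicting return types (byte size)": 32,
}


def shares(causes, total):
    """Percentage of all packages with error reports, per cause."""
    return {name: round(100.0 * count / total, 1) for name, count in causes.items()}


covered = sum(TOP_CAUSES.values())  # packages explained by the ten listed causes
open_bugs = 844 - 530               # reported minus already fixed

for name, pct in shares(TOP_CAUSES, TOTAL_PACKAGES).items():
    print(f"{pct:5.1f}%  {name}")
print(f"{covered} of {TOTAL_PACKAGES} packages fall under the listed causes")
```

Under that assumption the ten listed causes account for 4249 of the 4915 packages, and the 314 unfixed bugs match the difference between the 844 reported and the 530 already fixed.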