Sherri Sparks
University of Central Florida
USA
Shawn Embleton
University of Central Florida
USA
Ryan Cunningham
University of Central Florida
USA
Cliff Zou
University of Central Florida
USA
We present an extension of traditional "black box" fuzz testing using a genetic algorithm based upon a Dynamic Markov Model fitness heuristic. This heuristic allows us to "intelligently" guide input selection based upon feedback concerning the "success" of past inputs that have been tried. We use dynamic program instrumentation to gather runtime information about each input’s progress on the control flow graph, and using this information, we calculate and assign it a “fitness” value. Inputs which make more runtime progress on the control flow graph or explore new, previously unexplored regions receive a higher fitness value. Eventually, the inputs achieving the highest fitness are “mated” (e.g. combined using various operators) to produce a new generation of inputs. This process continues iteratively until either a designated node (for example, a node containing a potentially vulnerable system call) is reached or the test has run for a user specified maximum number of generations. Unlike many software testing tools, our implementation is strictly based upon binary code and does not require that source code be available. Our evaluation on a Windows server program shows that this approach is superior to random black box fuzzing for increasing code coverage and depth of penetration into program control flow logic. As a result, the technique may be beneficial to the development of future automated vulnerability analysis tools.
Keywords: Software Security, Vulnerability Analysis, Code Coverage, Fuzz Testing, Dynamic Markov Chain, Genetic Algorithm