PASAN: Automatic Patch and Signature Generation for Network Buffer-Overflow Attacks
2005 - 2006, SUNY Stony Brook
Introduction

Control-hijacking attacks usurp control of applications and potentially of their underlying machines. Although many dynamic checking systems can effectively detect and prevent control-hijacking attacks, they share one weakness: the lack of a post-attack response that prevents recurrence of the same attack and its variants. An ideal post-attack response system should generate a characterizing signature for the detected attack, which the front-end firewall can use to block the attack and potentially its variants, and a patch that seals the security hole the detected attack exploits.
PASAN takes a program transformation approach to the problem of automating attack detection, identification (i.e., generating an attack signature) and repair (i.e., generating a fix). More concretely, PASAN takes an application's source code and augments it with additional instructions that check for tampered control-sensitive data structures and record sufficiently detailed execution state from which to derive, at repair time, an identifying signature for a detected attack and its corresponding patch. Conceptually, the recorded execution state contains enough information to reconstruct the data and control dependencies that actually occur at run time. When it detects a control-hijacking attack, PASAN pinpoints the target address (for example, a return address) that the attack tampered with. At repair time, PASAN computes a backward slice from the tampered target address, through the dynamic data and control dependencies in the execution state log, back to the input packets. It uses the resulting slice to identify the portions of the input packets that collectively contribute to the successful manipulation of the tampered target address. The bytes thus identified form a matching signature that the firewall can use to filter out the detected attack.
Although PASAN produces attack signatures automatically, these signatures are accurate: they keep both the false positive rate and the false negative rate low, because a PASAN signature can contain multiple disjoint byte sequences, each of which is characterized by a regular expression and/or a length constraint. To the best of our knowledge, no other system achieves this level of signature accuracy. Similarly, although patch creation is automated, the patches that PASAN delivers resemble human-written patches, so they are more likely to be merged into the original source code tree without additional modifications.
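The backward-slicing step described above can be illustrated with a minimal sketch. It is not PASAN's implementation: the log format, the location names such as `pkt[72]` and `ret_addr`, and the function `backward_slice` are all hypothetical, and a real dynamic dependency log would be ordered in time and would include control dependencies. The sketch only shows how following recorded dependencies backward from a tampered location yields the contributing input bytes.

```python
from collections import deque

def backward_slice(deps, target):
    """Return the input-packet locations that `target` transitively depends on.

    `deps` is a hypothetical, simplified execution log: it maps each written
    location to the list of locations its value was computed from.
    """
    seen = {target}
    queue = deque([target])
    inputs = set()
    while queue:
        loc = queue.popleft()
        sources = deps.get(loc)
        if sources is None:
            # No recorded definition: a leaf. Keep it only if it is an input byte.
            if loc.startswith("pkt["):
                inputs.add(loc)
            continue
        for src in sources:
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return inputs

# Hypothetical log: two packet bytes flow through a buffer into the return address.
log = {
    "ret_addr": ["buf[68]", "buf[69]"],
    "buf[68]": ["pkt[72]"],
    "buf[69]": ["pkt[73]"],
    "status": ["pkt[0]"],
}
print(sorted(backward_slice(log, "ret_addr")))  # ['pkt[72]', 'pkt[73]']
```

Note that `pkt[0]` is excluded from the slice because it only influences `status`, not the tampered return address.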

Signature Generation

PASAN generates attack signatures in the form of multiple patterns in a byte stream, each of which can be characterized by a regular expression and/or a length constraint. Initially, every byte in the input byte stream that leads to an attack is marked as irrelevant. PASAN's attack identification algorithm then marks as relevant all bytes in the input byte stream that contribute to the detected attack. These relevant bytes are specified explicitly in the resulting signature, whereas each irrelevant byte is represented as a don't-care character, meaning that it may or may not be present in other attacks that exploit the same vulnerability as the detected attack.
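As a rough illustration of the relevant/don't-care distinction, the sketch below turns a packet and a set of relevant byte offsets into a regular expression. The function name `build_signature`, the sample packet, and the choice to collapse each run of irrelevant bytes into `.*` are assumptions made for this example, not PASAN's actual signature format.

```python
import re

def build_signature(packet: bytes, relevant: set) -> "re.Pattern":
    """Build a byte-level regex: relevant bytes are literal, the rest don't-care.

    Each run of irrelevant bytes becomes `.*`, reflecting that those bytes
    may or may not be present in a variant of the attack.
    """
    parts = []
    in_gap = False
    for i in range(len(packet)):
        if i in relevant:
            parts.append(re.escape(packet[i:i + 1]))
            in_gap = False
        elif not in_gap:
            parts.append(b".*")
            in_gap = True
    # DOTALL lets the don't-care parts span newline bytes as well.
    return re.compile(b"".join(parts), re.DOTALL)

# Hypothetical packet: only the method prefix and the terminator are relevant.
packet = b"GET /AAAA\n"
sig = build_signature(packet, {0, 1, 2, 3, 9})

print(sig.match(b"GET /BBBBBBBB\n") is not None)  # True
print(sig.match(b"POST /x\n") is not None)        # False
```

A signature built this way has several disjoint literal sequences separated by gaps; a length constraint on a gap would have to be added separately.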

Patch Generation and Testing

Starting from the corrupted return address associated with a buffer overflow attack, PASAN first produces a patch and then tests the patch against the original attack packets to ensure that it fixes the original vulnerability as well as related vulnerabilities.
The current PASAN prototype can handle the following three types of buffer overflow vulnerabilities: (1) a buffer overflow caused by an unsafe libc function such as strcpy(); (2) a buffer overflow caused by an array-copying loop that eventually corrupts the return address; (3) a buffer overflow that does not corrupt the return address. Only array bounds checking can detect type (3) vulnerabilities.

Example of Automatically Generated Signatures and Patches

ghttpd is a web server with a known vulnerability (Bugtraq ID 5960).
ghttpd attack: This attack uses an excessively long URL. PASAN identifies the attack packet and finds that ghttpd examines only the first four characters, "GET ", and ignores the rest of the packet, which holds the actual URL. The attack overwrites the return address through a strcpy() function call. The length constraint generation algorithm correctly identifies the terminating character '\n' and the maximum size allowed for the URL.
In its first iteration, the patch generation algorithm patches the vsprintf() call in the function Log(). The patch testing algorithm then detects an off-by-N bug in the sprintf() call. Only BCC can detect this off-by-N bug, because the bug does not overwrite the return address.
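The ghttpd signature described above combines a literal prefix with a length constraint, which a filter might check roughly as follows. This is a sketch under stated assumptions: the function `matches_signature` is hypothetical, and `MAX_URL_LEN` is a placeholder value, since the actual maximum URL size that PASAN derives is not given here.

```python
MAX_URL_LEN = 200  # placeholder; the real limit derived by PASAN is not stated

def matches_signature(packet: bytes, max_len: int = MAX_URL_LEN) -> bool:
    """Return True if the packet looks like the over-long-URL attack.

    The check has two parts: the literal prefix "GET " and a length
    constraint on the bytes between the prefix and the terminator '\n'.
    """
    if not packet.startswith(b"GET "):
        return False
    rest = packet[4:]
    end = rest.find(b"\n")
    # If no terminator is present, count everything after the prefix.
    url_len = len(rest) if end == -1 else end
    return url_len > max_len

print(matches_signature(b"GET " + b"A" * 300 + b"\n"))  # True
print(matches_signature(b"GET /index.html\n"))          # False
```

Because the URL content itself is don't-care, the check does not depend on the specific payload bytes, only on the prefix and the length.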

Paper
References

  • Attack identification: Buttercup, Autograph, Nemean, Polygraph, Arbor, Covers, Xu et al.
  • Signatures: DACODA.