Auditing open source software

Monday, October 8, 2007 4:13 PM



Google encourages its employees to contribute back to the open source community, and Google's Security Team is no exception. Let's look at some interesting open source vulnerabilities that were located and fixed by members of Google's Security Team. It is interesting to classify and aggregate the code flaws leading to the vulnerabilities, to see if any particular type of flaw is more prevalent.
  1. JDK. In May 2007, I released details on an interesting bug in the ICC profile parser in Sun's JDK. The bug is particularly interesting because it could be exploited by an evil image, whereas most previous JDK bugs required the user to run a whole evil applet. The key parts of the code which demonstrate the bug are as follows:

    TagOffset = SpGetUInt32 (&Ptr);
    if (ProfileSize < TagOffset)
      return SpStatBadProfileDir;
    ...
    TagSize = SpGetUInt32 (&Ptr);
    if (ProfileSize < TagOffset + TagSize)
      return SpStatBadProfileDir;
    ...
    Ptr = (KpInt32_t *) malloc ((unsigned int)numBytes+HEADER);

    Both TagSize and TagOffset are untrusted unsigned 32-bit values pulled out of images being parsed. They are added together, causing a classic integer overflow condition and the bypass of the size check. A subsequent additional integer overflow in the allocation of a buffer leads to a heap-based buffer overflow.
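
    As a rough illustration of how both checks could be written so that nothing wraps around, here is a minimal sketch. It is not the fix Sun shipped; the function names tag_in_bounds and alloc_with_header are hypothetical, and the sketch assumes size_t is at least 32 bits wide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns true if the tag [offset, offset + size)
   lies inside the profile. The test is rearranged as a subtraction, so
   no intermediate value can wrap around the way offset + size can. */
bool tag_in_bounds(uint32_t profile_size, uint32_t tag_offset, uint32_t tag_size)
{
    if (tag_offset > profile_size)
        return false;
    /* Safe: tag_offset <= profile_size, so the subtraction cannot underflow. */
    return tag_size <= profile_size - tag_offset;
}

/* Hypothetical helper: allocates num_bytes + header bytes, or returns NULL
   if the 32-bit sum would wrap. Assumes size_t is at least 32 bits wide. */
void *alloc_with_header(uint32_t num_bytes, uint32_t header)
{
    if (num_bytes > UINT32_MAX - header)
        return NULL;
    return malloc((size_t)num_bytes + header);
}
```

    With a 100-byte profile, a tag at offset 16 with size 0xFFFFFFF0 passes the original addition-based check, because the 32-bit sum wraps to 0. The subtraction-based check rejects it.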

  2. gunzip. In September 2006, my colleague Tavis Ormandy reported some interesting vulnerabilities in the gunzip decompressor. They are triggered when an evil compressed archive is decompressed. A lot of programs will automatically pass compressed data through gunzip, making it an interesting attack. The key parts of the code which demonstrate one of the bugs are as follows:

    ush count[17], weight[17], start[18], *p;
    ...
    for (i = 0; i < (unsigned)nchar; i++) count[bitlen[i]]++;

    Here, the stack-based array "count" is indexed by values in the "bitlen" array. These values come straight from the incoming untrusted compressed data, and were not checked for being within the bounds of the "count" array. This led to corruption of data on the stack.
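
    The missing check is small. The sketch below shows one way the counting loop could validate each index before using it. It is an illustration rather than the patch that was applied to gunzip; the function name count_bit_lengths, the MAX_BITLEN constant, and the assumption that "bitlen" holds unsigned bytes are all hypothetical.

```c
#include <stddef.h>

typedef unsigned short ush;
typedef unsigned char uch;

/* count[] has 17 slots, so the valid indices are 0..16. */
#define MAX_BITLEN 16

/* Hypothetical helper: tallies how many codes use each bit length.
   Returns 0 on success, or -1 as soon as a length from the untrusted
   input would index past the end of count[]. */
int count_bit_lengths(const uch *bitlen, size_t nchar, ush count[MAX_BITLEN + 1])
{
    size_t i;

    for (i = 0; i <= MAX_BITLEN; i++)
        count[i] = 0;

    for (i = 0; i < nchar; i++) {
        if (bitlen[i] > MAX_BITLEN)
            return -1;  /* reject the archive instead of writing out of bounds */
        count[bitlen[i]]++;
    }
    return 0;
}
```

    A caller would treat a -1 return as a corrupt archive and stop decompressing.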


  3. libtiff. In August 2006, Tavis reported a range of security vulnerabilities in the libtiff image parsing library. A lot of image manipulation programs and services use libtiff if they handle TIFF format files, so an evil TIFF file could compromise a lot of desktops or even servers. The key parts of the code which demonstrate one of the bugs are as follows:

    if (sp->cinfo.d.image_width != segment_width ||
        sp->cinfo.d.image_height != segment_height) {
      TIFFWarningExt(tif->tif_clientdata, module,
        "Improper JPEG strip/tile size, expected %dx%d, got %dx%d",

    Here, a TIFF file containing a JPEG image is being processed. In this case, both the TIFF header and the embedded JPEG image contain their own copies of the width and height of the image in pixels. The check above notices when these values differ, issues a warning, and continues. The destination buffer for the pixels is allocated based on the TIFF header values, and it is filled based on the JPEG values. This leads to a buffer overflow if a malicious image file contains a JPEG with larger dimensions than those in the TIFF header. Presumably the intent here was to support broken files where the embedded JPEG had smaller dimensions than those in the TIFF header. However, the consequences of larger dimensions than those in the TIFF header had not been considered.
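
    One way to keep the tolerance for undersized JPEGs while refusing oversized ones is to make the comparison directional. The following is a minimal sketch of that idea, not the change that went into libtiff; jpeg_fits_segment is a hypothetical name, and the sketch assumes both images use the same number of bytes per pixel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if decoding the embedded JPEG cannot
   write past a buffer sized for segment_width x segment_height pixels.
   A smaller embedded image is tolerated; a larger one in either dimension
   must be rejected rather than merely warned about. */
bool jpeg_fits_segment(uint32_t jpeg_width, uint32_t jpeg_height,
                       uint32_t segment_width, uint32_t segment_height)
{
    return jpeg_width <= segment_width && jpeg_height <= segment_height;
}
```

    A decoder using this check would warn and continue when the dimensions merely differ, but fail the decode when the function returns false.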

We can draw some interesting conclusions from these bugs. The specific vulnerabilities are integer overflows, out-of-bounds array accesses and buffer overflows. However, the general theme is using an integer from an untrusted source without adequately sanity-checking it. Integer abuse issues are still very common in code, particularly code which is decoding untrusted binary data or protocols. We recommend being careful using any such code until it has been vetted for security (by extensive code auditing, fuzz testing, or preferably both). It is also important to watch for security updates for any decoding software you use, and keep patching up to date.

6 comments:

Security Retentive said...

Can you comment on how these defects were detected - code review (manual or automated), fuzz testing, other?

David Thiel's paper and talk from Black Hat this year covered file-specific fuzzers for media formats and such, and the major lesson for me is that building file-specific fuzzers that subtly tweak file formats is laborious.

I'd be interested to know how you detected these, and what your experience is in finding these types of defects via various methods.

Frances said...

I hate GMail. GMail is the Biggest Spammer of all. I have had nothing but trouble with Gmail and I intend to get rid of Gmail.

Yash said...

As someone said earlier; it would be great to know how Google works on auditing its projects. Possibly even releasing or demonstrating some tools.

Google has some of the best resources and talent; it could do great things for the Security Community.

On a side-note: Gmail is definitely the best mail service I've used. In regards to spam; I went from 40-50 spam mails a day to 0 after forwarding my mail through Gmail.

--
Yash Kadakia
CTO, Security Brigade
http://www.securitybrigade.com
Penetration Testing, PCI DSS Compliance, Security Consulting etc.
