Secure unserialize()

PHP is designed to process HTTP requests and respond with an HTTP response. HTTP is a stateless protocol, but the web applications we build require state: usually, we need to (at least) keep track of the user that is currently working with the application. On the server side, sessions are used to keep (at least some) application state. When PHP’s built-in session management is used, the session ID is automatically detected from the session cookie that the browser usually sends along with every request. PHP also automatically loads any session data that was stored while processing the previous request. Sessions can contain not only scalar data, but also arrays and objects.

To save session data, it must be converted to a format that can be stored, for example on disk. PHP’s serialization mechanism takes care of this, and it is also available directly through the built-in serialize() function, which converts data to a string. In the following example, we use serialize() on an object:

$something = new Something;
var_dump(serialize($something));

class Something
{
    private int $foo = 42;
}

The result is:

string(47) "O:9:"Something":1:{s:14:"\000Something\000foo";i:42;}"

We can now pass this string to the unserialize() function to re-create the object:

var_dump(
    unserialize(
        "O:9:\"Something\":1:{s:14:\"\000Something\000foo\";i:42;}"
    )
);

class Something
{
    private int $foo;
}

To make this work, the class Something must be defined. The definition need not be exactly the same as when serializing, but both definitions must be compatible. As you can see, we have left out the default value for foo, but things still work out:

class Something#1 (1) {
  private $foo =>
  int(42)
}

When PHP encounters an undefined class on unserializing a string, you will end up with this:

class __PHP_Incomplete_Class#1 (2) {
  public $__PHP_Incomplete_Class_Name =>
  string(9) "Something"
  private $foo =>
  int(42)
}
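Before PHP gives up and falls back to an incomplete class, it asks any registered autoloader for the missing class. The following sketch makes that visible; the class name Missing and the recording autoloader are made up for illustration, and the autoloader deliberately loads nothing:

```php
<?php
declare(strict_types=1);

// Record every class name the autoloader is asked for.
$requested = [];

spl_autoload_register(function (string $class) use (&$requested): void {
    // Hypothetical autoloader: it only takes notes and defines no class.
    $requested[] = $class;
});

// Hand-written serialized string that names a class that is not defined.
$object = unserialize('O:7:"Missing":0:{}');

var_dump($requested);
var_dump($object instanceof __PHP_Incomplete_Class);
```

Checking the result with instanceof __PHP_Incomplete_Class is a simple way to find out whether a class definition was available.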

Serializing and unserializing data can be extremely useful. When communicating between different applications, a format like JSON or XML might be the better choice, but within PHP, serialize() and unserialize() work well and do not require additional mapping effort.
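To illustrate the mapping effort, here is a small comparison. The Point class is a made-up example; the sketch only shows what each function keeps when called from outside the class:

```php
<?php
declare(strict_types=1);

// Hypothetical value object with one private and one public property.
final class Point
{
    private int $x = 1;
    public int $y = 2;
}

$point = new Point;

// json_encode() only sees public properties and does not record the class name.
$json = json_encode($point);

// serialize() records the class name and the private property as well,
// so unserialize() can rebuild an equal Point without any mapping code.
$copy = unserialize(serialize($point));

var_dump($json);
var_dump($copy instanceof Point);
var_dump($copy == $point);
```

Turning the JSON string back into a Point would require code that knows the class and its properties.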

Any convenience comes at a price, though. In this case, the price is a security concern: when unserializing a string that contains an object description, the __wakeup() interceptor method is automatically called on that object:

$something  = new Something;
$serialized = serialize($something);

unserialize($serialized);

class Something
{
    public function __wakeup()
    {
        var_dump('code is being executed');
    }
}

In this case, unserializing data leads to code execution:

string(22) "code is being executed"

It is hard to tell whether this is a bug or a feature. There may be some valid use cases for performing certain actions like re-opening a file, or re-connecting to a database when unserializing an object, which is exactly what the __wakeup() interceptor method was designed for. On the other hand, an attacker might be able to fabricate a serialized string that leads to the unexpected execution of code in the scope of a process that has some user’s privileges, and almost certainly contains some sensitive data about this user.
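The important point is that the string alone decides which class is instantiated. In the following sketch, the payload is written by hand and no object is ever serialized; the static counter is only a stand-in for a real side effect such as re-opening a file:

```php
<?php
declare(strict_types=1);

// Stand-in for any application class that implements __wakeup().
final class Something
{
    public static int $wakeups = 0;

    public function __wakeup(): void
    {
        // Placeholder for a side effect that runs while unserializing.
        self::$wakeups++;
    }
}

// Hand-written payload, as an attacker might submit it.
$payload = 'O:9:"Something":0:{}';

$object = unserialize($payload);

var_dump($object instanceof Something);
var_dump(Something::$wakeups);
```

Any class that is defined (or can be autoloaded) at the time unserialize() runs is a candidate, not just the classes the application intended to store.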

To mitigate this problem, PHP 7 added an optional second parameter to unserialize(): an options array whose allowed_classes entry lists the classes that may be unserialized. Objects of any other class are not instantiated, so their code cannot be executed. To specify the allowed class names, you should use the ::class constant that was introduced in PHP 5.5. It is available for every class and always contains the fully qualified class name:

$object = unserialize(
    serialize(new Something),
    ['allowed_classes' => [Something::class]]
);
var_dump($object);

class Something
{
    public function __wakeup()
    {
        var_dump('code executed');
    }
}

Since Something is listed as an allowed class, unserializing will work, and the __wakeup() method will be executed:

string(13) "code executed"
class Something#1 (0) {
}

Let us try to unserialize a class that is not allowed:

$object = unserialize(
    serialize(new SomethingElse),
    ['allowed_classes' => [Something::class]]
);
var_dump($object);

class Something
{
    public function __wakeup()
    {
        var_dump('code executed');
    }
}

class SomethingElse
{
    public function __wakeup()
    {
        var_dump('code executed');
    }
}

This yields an incomplete class, even though SomethingElse was defined:

class __PHP_Incomplete_Class#1 (1) {
  public $__PHP_Incomplete_Class_Name =>
  string(13) "SomethingElse"
}
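The list of allowed classes is not limited to the outermost value; it applies to every object in the serialized data. A minimal sketch, with the made-up classes Allowed and Blocked:

```php
<?php
declare(strict_types=1);

// Hypothetical classes: only Allowed is put on the allow list below.
final class Allowed
{
    public string $name = 'ok';
}

final class Blocked
{
    public string $name = 'no';
}

// One array that contains an object of each class.
$serialized = serialize(['a' => new Allowed, 'b' => new Blocked]);

$result = unserialize($serialized, ['allowed_classes' => [Allowed::class]]);

// The allowed object is restored; the other one becomes an incomplete class.
var_dump($result['a'] instanceof Allowed);
var_dump($result['b'] instanceof __PHP_Incomplete_Class);
```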

No code has been executed. In addition, the result of serializing such an incomplete class is fascinating:

$object = unserialize(
    serialize(new SomethingElse),
    ['allowed_classes' => [Something::class]]
);
var_dump(serialize($object));

class Something
{
    public function __wakeup()
    {
        var_dump('code executed');
    }
}

class SomethingElse
{
    public function __wakeup()
    {
        var_dump('code executed');
    }
}

As you can see, incomplete classes will be serialized as the original class:

string(25) "O:13:"SomethingElse":0:{}"

This allows you to retrieve the data safely, without executing code, and still re-serialize the object back to the same representation.
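One way to get at the stored values is to cast the incomplete class to an array. The following sketch assumes a variant of SomethingElse with a public property bar, which is not part of the examples above:

```php
<?php
declare(strict_types=1);

// Variant of SomethingElse with some data; the property bar is an assumption.
final class SomethingElse
{
    public int $bar = 7;

    public function __wakeup(): void
    {
        echo 'code executed', PHP_EOL;
    }
}

$serialized = serialize(new SomethingElse);

// No class is allowed, so __wakeup() is never called.
$object = unserialize($serialized, ['allowed_classes' => false]);

// The array cast exposes the stored values without calling any method.
$data = (array) $object;

var_dump($data['__PHP_Incomplete_Class_Name']);
var_dump($data['bar']);

// Serializing the incomplete class again restores the original string.
var_dump(serialize($object) === $serialized);
```

Note that private and protected properties show up in the array under their mangled names, just like in the serialized string.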

Trying to call a method on an incomplete class, however, will lead to a fatal error. To get such an incomplete class, we disallow unserializing any class by setting allowed_classes to false:

$object = unserialize(
    serialize(new Something),
    ['allowed_classes' => false]
);
var_dump($object);

$object->run();

class Something
{
    public function run()
    {
        var_dump('method executed');
    }
}

First, unserialize() returns an incomplete class. Then we try to call the method, which leads to a fatal error:

class __PHP_Incomplete_Class#1 (1) {
  public $__PHP_Incomplete_Class_Name =>
  string(9) "Something"
}
PHP Fatal error:  main(): The script tried to execute a method
or access a property of an incomplete object. Please ensure that
the class definition "Something" of the object you are trying to
operate on was loaded _before_ unserialize() gets called or
provide a __autoload() function to load the class definition
in ...

After all, the result is an instance of __PHP_Incomplete_Class and not an instance of Something.
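To avoid running into this error, we can check the type of the result before using it. The helper function unserializeSomething() below is a hypothetical sketch, not part of PHP; it either returns a Something or throws an exception:

```php
<?php
declare(strict_types=1);

final class Something
{
    public function run(): string
    {
        return 'method executed';
    }
}

/**
 * Hypothetical helper: returns a Something or throws,
 * but never hands an incomplete class to the caller.
 */
function unserializeSomething(string $serialized): Something
{
    $object = unserialize($serialized, ['allowed_classes' => [Something::class]]);

    if (!$object instanceof Something) {
        throw new UnexpectedValueException(
            'Serialized data does not contain a Something object'
        );
    }

    return $object;
}

// An allowed object can be used as usual.
echo unserializeSomething(serialize(new Something))->run(), PHP_EOL;

// Any other class is rejected before a method call can fail.
try {
    unserializeSomething(serialize(new stdClass));
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

The return type declaration documents the guarantee, so calling code does not have to deal with incomplete classes at all.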