Preloading
In the “Bytecode Caching” section we already discussed how OpCache, PHP’s built-in and default solution for bytecode optimization and caching, can be used to avoid compiling PHP source code into PHP bytecode for each request, over and over again. While this already provides quite a substantial performance boost, compilation is not the only thing that the PHP interpreter repeatedly has to do before it can actually run the code.
By default, OpCache checks whether the original source file has been modified and thus requires recompilation. Although this check can be disabled (opcache.validate_timestamps=0) for production systems to reduce the amount of I/O operations, there is still a cost associated with fetching the bytecode from OpCache’s shared memory into the context of the current request and preparing it for execution.
This is, at least in part, because PHP compiles and caches each file independently and logically separated from other files. PHP therefore needs to re-establish the links between classes, interfaces, and traits to prepare the code for execution in the context of the current request. This includes, for instance, checking that the language’s rules for inheritance and trait usage are followed.
When preloading is used, this “linking” is no longer performed redundantly and on demand in the context of each request. It is instead performed once, and only once, during server startup. This makes preloading conceptually quite different from traditional file-based bytecode caching. For instance, changes to preloaded source files will not have any effect until the server is restarted. In other words: the PHP interpreter will behave as if opcache.validate_timestamps=0 was configured.
Furthermore, no autoload callback will be triggered for classes,
interfaces, and traits that are preloaded.
Preloading requires a so-called preload script that is controlled by two php.ini configuration directives:
opcache.preload_user=www
opcache.preload=/path/to/preload_script.php
opcache.preload_user is used to configure the name of the system user under which the preload script is executed. This is important as most services, at least initially, run as the root super-user and no PHP script, not even “just” a preload script, should be run with such extensive privileges.
opcache.preload specifies the path to a regular PHP script that is automatically executed on server startup. While the entire power of the PHP language is available to be used in this script, you should only (pre)load your classes, interfaces, and traits here.
The simplest thing that could possibly work as a preload script is a (potentially long) list of require statements:
require __DIR__ . '/MyClassA.php';
require __DIR__ . '/MyClassB.php';
require __DIR__ . '/MyClassC.php';
// ...
There is a catch, though, to using require: this statement not only loads and compiles a file but also executes any code in the file’s global scope. This can lead to unintended side effects that can be avoided by using the opcache_compile_file() function instead:
opcache_compile_file(__DIR__ . '/MyClassA.php');
opcache_compile_file(__DIR__ . '/MyClassB.php');
opcache_compile_file(__DIR__ . '/MyClassC.php');
// ...
opcache_compile_file() only loads and compiles a file but does not execute any code in the file’s global scope.
Another difference is that opcache_compile_file() can load files in any order. Suppose a file MyClassA.php declares a class named MyClassA, and a file MyClassB.php declares a class named MyClassB that extends MyClassA. With include, include_once, require, and require_once, MyClassA.php has to be loaded before MyClassB.php. opcache_compile_file() does not care about the order in which MyClassA.php and MyClassB.php are loaded.
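Because opcache_compile_file() does not depend on load order, a preload script does not have to list every file by hand. The following is a minimal sketch, not a definitive implementation: findPhpFiles() is a hypothetical helper, and the src directory next to the preload script is an assumed project layout.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the paths of all *.php files below $directory.
 *
 * @return list<string>
 */
function findPhpFiles(string $directory): array
{
    $files = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            $files[] = $file->getPathname();
        }
    }

    // Sorted only to make the result deterministic.
    sort($files);

    return $files;
}

// Assumed layout: the code to preload lives in a src directory next to this script.
$sourceDirectory = __DIR__ . '/src';

// Guarded so the script does nothing when the directory or OpCache is missing.
if (is_dir($sourceDirectory) && function_exists('opcache_compile_file')) {
    foreach (findPhpFiles($sourceDirectory) as $file) {
        opcache_compile_file($file);
    }
}
```

Note that this compiles every PHP file it finds, so the directory should contain only declarations that are safe to preload.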
It is important to note that only include, include_once, require, and require_once support the conditional declaration of functions, classes, interfaces, and traits like so:
if (true) {
    require __DIR__ . '/MyClassA.php';
}
When a preloaded file is loaded again later using include, include_once, require, or require_once, its code outside the declaration of functions, classes, interfaces, and traits will still be executed. Any functions, classes, interfaces, or traits will not be redefined, though. Using include_once and require_once does not prevent a preloaded file from being loaded again.
All variables, objects, and resources that may be created or opened by the preload script will be garbage-collected after server startup. They will not be available in the requests later on.
It is also important to realize that the order in which files get loaded is very likely to be relevant: to compile a unit of code, all its dependencies need to be resolvable. For that to work, a parent class, a trait, or an interface needs to be known before they can be extended, used, or implemented.
For units of code with unresolvable dependencies, PHP will still keep the bytecode of the file but will otherwise refuse to preload the unit of code itself:
NOTICE: PHP message: PHP Warning: Can't preload unlinked class Foo: Unknown interface DemoInterface in ...
At the time of writing, finding errors such as the one shown above is unfortunately a bit tricky. The problem arises during startup and not at request time, so looking at the error log of a PHP-FPM pool, for instance, does not help. Neither a configured opcache.error_log nor the general FPM error_log contains these warnings. We have to either start PHP-FPM in the foreground (php-fpm -F) to see them as they occur or query systemd’s journal (for instance using journalctl -e -u php-fpm -g "Can't preload") in case PHP-FPM is run as a systemd service.
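As a complement to reading the startup output, it may help to check at request time whether a given class actually ended up in the preloaded set. The sketch below assumes that opcache_get_status() reports a preload_statistics entry with a classes list when preloading is active; verify that key layout against your PHP version. isClassPreloaded() is a hypothetical helper, and Foo is the class name from the warning above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reports whether $class is listed in the preload
 * statistics of an opcache_get_status() result.
 */
function isClassPreloaded(array $status, string $class): bool
{
    // Assumed key layout; an absent key is treated as "nothing preloaded".
    $preloaded = $status['preload_statistics']['classes'] ?? [];

    foreach ($preloaded as $name) {
        // Class names are case-insensitive in PHP.
        if (strcasecmp((string) $name, ltrim($class, '\\')) === 0) {
            return true;
        }
    }

    return false;
}

// Meant to run inside a web request served by the preloading PHP instance.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;

$fooIsPreloaded = is_array($status) && isClassPreloaded($status, 'Foo');
```

If $fooIsPreloaded is false even though the preload script references the file declaring Foo, the startup output is the place to look for the reason.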
Depending on the code base, not all dependency problems may be resolvable. If classes have missing (third-party) dependencies, are dynamically generated at runtime, or are conditionally defined, preloading them is not possible. Another reason for preloading to fail is a type compatibility check that the engine cannot perform at that point (see the section on Method Compatibility).
To use such a unit of code at request time, the file declaring it has to be made available either by registering a conventional autoloader or through explicit require or include statements.
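A conventional autoloader for this purpose could look like the following sketch. It assumes that a class name maps directly to a file path below a base directory; registerFallbackAutoloader() and the src directory are hypothetical names chosen for illustration. Since no autoload callback is triggered for preloaded classes, this callback only runs for classes that could not be preloaded.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: registers an autoloader that maps a class name to
 * <baseDirectory>/<Namespace>/<Class>.php.
 */
function registerFallbackAutoloader(string $baseDirectory): void
{
    spl_autoload_register(
        static function (string $class) use ($baseDirectory): void {
            $file = $baseDirectory . '/' . str_replace('\\', '/', $class) . '.php';

            // Ignore unknown classes so other autoloaders can still try.
            if (is_file($file)) {
                require $file;
            }
        }
    );
}

// Assumed layout: class files live in a src directory next to this script.
registerFallbackAutoloader(__DIR__ . '/src');
```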
It is important to note that the preload functionality operates on the instance level. When, for example, PHP-FPM is configured with multiple pools then all pools share a common preload cache. This is very convenient for many use cases, but it might lead to unexpected behavior or even cause security problems.
By the very nature of preloading, the name of a unit of code has to be unique per instance. It is not possible to have multiple versions of the same class preloaded at the same time in the same instance. That means projects served via separate pools cannot have their own version of a class preloaded individually under the same name.
Sharing an instance of PHP with preloaded code requires full trust among all sharing parties. Since no files are read at runtime, file permissions that traditionally would have protected one project from accessing files of another are no longer effective. This may have unexpected security implications.