Behind the Scenes

Many of the enhancements brought about by PHP 7 – from performance improvements to a reduced memory footprint, from increased maintainability of the PHP runtime’s C source code to new language features that make your daily work more convenient – are enabled by changes “behind the scenes” and “under the hood”. To better understand these changes, let us shine a light on how PHP compiles and executes code!

One of the key characteristics of a programming language is how programs written in it are executed. Compiled languages and interpreted languages are the categories commonly used for this distinction. Most developers will immediately think of C, C++, C#, and Java as examples of compiled languages and of Perl, PHP, Python, and Ruby as examples of interpreted languages.

A compiler is software that translates source code from a higher-level language such as C or Java into a lower-level language such as native machine code or Java bytecode, respectively. Thinking in terms of compiled versus interpreted, however, is not very helpful when reasoning about programming languages, because the runtime environment of an interpreted language – its interpreter – also contains a compiler.

Another way to characterize languages is to distinguish ahead-of-time compilation from on-demand compilation. Programming languages such as C, C++, C#, and Java require an explicit compilation step before a program can be executed. The compiler has to be invoked, either on the command line or from within an IDE, to compile the source code into a so-called binary.

In the case of C and C++, that binary usually contains native machine code, while the compilers for C# and Java generate binaries that contain bytecode for their respective virtual machines, the Common Language Runtime (CLR) and the Java Virtual Machine (JVM). Bytecode is a compiled representation of a program that can be executed by a program called a virtual machine or software interpreter.
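To make the idea of a virtual machine executing bytecode more concrete, here is a minimal sketch of a toy stack machine in Python. The opcode names PUSH, ADD, and MUL and the instruction format are invented for this illustration; they do not correspond to the instruction sets of the CLR, the JVM, or PHP, which are far more elaborate.

```python
# Toy bytecode: each instruction is an (opcode, operand) pair.
# The opcode names PUSH, ADD and MUL are invented for this sketch.
PUSH, ADD, MUL = "PUSH", "ADD", "MUL"


def run(bytecode):
    """Execute the instructions one by one on an operand stack."""
    stack = []
    for opcode, operand in bytecode:
        if opcode == PUSH:
            stack.append(operand)
        elif opcode == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Hand-written bytecode for the expression (2 + 3) * 4
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
result = run(program)
```

The virtual machine never sees the original source text. It only walks a flat list of simple instructions, which is what makes bytecode cheap to execute compared to parsing source code again.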

Interpreted languages stand in contrast to compiled languages, that is, to languages that require an explicit, ahead-of-time compilation step. Perl, PHP, Python, and Ruby, for instance, do not need an explicit compilation step; instead, their interpreter implicitly compiles the source code when needed.
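The compiler hidden inside an interpreter can be observed directly in Python, one of the interpreted languages named above, because its standard interpreter exposes the compilation step through the built-in compile() function. The following sketch separates the two steps that normally happen implicitly. It illustrates the general principle only and is not how PHP exposes its compiler; the variable names price and quantity are made up for the example.

```python
import types

source = "price * quantity"

# Step 1: compilation. compile() returns a code object; nothing is
# evaluated yet, so the variables do not even need to exist at this point.
code = compile(source, "<example>", "eval")
assert isinstance(code, types.CodeType)

# The compiled instructions are stored as raw bytes.
bytecode = code.co_code

# Step 2: execution of the compiled code with concrete variable values.
total = eval(code, {"price": 3, "quantity": 4})
```

When a script is run the usual way, the interpreter performs both steps on its own, which is why no separate compiler invocation is ever needed.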

Early implementations of interpreted languages compiled programs line by line. After a line of code was compiled, it was executed immediately. If a line of code was to be executed multiple times, for instance because it was part of the body of a loop, then it had to be recompiled each time. This is how PHP 3 worked, by the way. Since PHP 4, the PHP interpreter compiles PHP source code to an intermediate representation called PHP bytecode, which is then executed.
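The cost difference between the two strategies can be sketched by counting compilation steps. The example below uses Python's built-in compile() and exec() as stand-ins for a language runtime; the helper compile_line and the counter stats are hypothetical names introduced for this illustration, and the code is a simplified model, not a reconstruction of PHP 3 or PHP 4.

```python
stats = {"compiles": 0}


def compile_line(line):
    """Compile one line of source and count how often that happens."""
    stats["compiles"] += 1
    return compile(line, "<line>", "exec")


loop_body = "total = total + 2"
iterations = 5

# Strategy 1: compile the line again every time it is executed.
stats["compiles"] = 0
env = {"total": 0}
for _ in range(iterations):
    exec(compile_line(loop_body), env)
recompiled = (env["total"], stats["compiles"])

# Strategy 2: compile once up front, then execute the compiled code repeatedly.
stats["compiles"] = 0
env = {"total": 0}
code = compile_line(loop_body)
for _ in range(iterations):
    exec(code, env)
compiled_once = (env["total"], stats["compiles"])
```

Both strategies compute the same result, but the first one compiles the loop body once per iteration, while the second one compiles it exactly once regardless of how often the loop runs.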