How JavaScript engines achieve great performance

By Robin Heggelund Hansen

JavaScript is an impressive technology. Not because it’s particularly well-designed (it isn’t). Not because almost every single consumer device with internet access in the world has executed a JavaScript program. Instead, JavaScript is impressive because almost every single feature of the language makes it a nightmare to optimize and yet, it is fast.

Think about it. There is no type information. Every single object can gain and lose properties over the lifetime of the program. There are six(!) different kinds of falsy values, and every number is a 64-bit float. As if that wasn’t enough, JavaScript is expected to execute quickly, so you can’t spend a lot of time analyzing and optimizing it either.

And yet, JavaScript is fast.

How can this be?

In this article, we’re going to look closer at a few techniques that different JavaScript engines use to achieve good runtime performance. Keep in mind that I’m purposefully leaving a few details out, and simplifying things. It’s not a goal of this article that you learn how things work exactly, but that you understand enough to comprehend the theory behind the experiments we’ll perform later in this series.

When your browser downloads JavaScript, its top priority is to get it running as quickly as possible. It does this by translating the code to bytecode (virtual machine instructions), which is then handed over to an interpreter, or virtual machine, that knows how to execute it.

You might question why the browser would convert JavaScript to virtual machine instructions instead of actual machine instructions. It’s a good question. In fact, converting straight to machine instructions is what V8 (Chrome’s JavaScript engine) used to do until recently.

A virtual machine for a specific programming language is usually an easier compilation target because it has a closer relation to the source language. An actual machine has a much more generic instruction set, and so it requires more work to translate the programming language to work well with those instructions. This difficulty means compilation takes longer, which again means it takes longer for the JavaScript to start executing.

As an example, a virtual machine that understands JavaScript is also likely to understand JavaScript objects. Because of this, the virtual instructions required to execute a statement like object.x might be one or two instructions. An actual machine, with no understanding of how JavaScript objects work, will need a lot more instructions to figure out where .x resides in memory and how to get it.
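To make the contrast concrete, here is a toy interpreter sketch. The instruction names like `GET_PROP` are invented for illustration and are not real V8 bytecode; the point is only that property access is a single virtual instruction for the VM, while a real CPU would need many instructions to do the same job.

```javascript
// A toy interpreter. Instruction names (LOAD, GET_PROP, RETURN) are
// invented for illustration; they are not real engine bytecode.
function run(instructions, env) {
  var acc; // a single accumulator register
  for (var i = 0; i < instructions.length; i++) {
    var op = instructions[i][0];
    var arg = instructions[i][1];
    switch (op) {
      case "LOAD":     acc = env[arg]; break; // read a variable
      case "GET_PROP": acc = acc[arg]; break; // one instruction for obj.x
      case "RETURN":   return acc;
    }
  }
}

// Equivalent of: return object.x;
var program = [["LOAD", "object"], ["GET_PROP", "x"], ["RETURN"]];
run(program, { object: { x: 42 } }); // evaluates to 42
```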

The problem with a virtual machine is that it is, well, virtual. It doesn’t exist. The instructions cannot be executed directly, but must be interpreted at runtime. Interpreting code will always be slower than executing code directly.

There’s a tradeoff here. Faster compilation time versus faster runtime. In many cases, faster compilation is a good tradeoff to make. The user is unlikely to care whether a single button click takes 20 or 40 milliseconds to execute, especially if the button is only pressed once. Compiling the JavaScript quickly, even if the resulting code is slower to execute, will let the user see and interact with the page faster.

There are, however, situations that are computationally expensive: games, syntax highlighting, or calculating the fizzbuzz string of a thousand numbers. In these cases, the combined time of compiling and then executing machine instructions is likely to be lower than the time spent interpreting bytecode. So how do JavaScript engines handle these situations?

Whenever the JavaScript engine detects that a function is hot (that is, executed many times) it hands that function over to an optimizing compiler. This compiler translates the virtual machine instructions into actual machine instructions. What’s more, since the function has already been run several times, the optimizing compiler can make several assumptions based on previous runs. In other words, it can perform speculative optimizations to make even faster code.
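The tiering idea can be sketched as a call counter. Note that the `makeTiered` helper and the threshold of 100 are invented for illustration; real engines rely on much richer profiling data than a single counter.

```javascript
// Sketch of tiered execution: run the slow version until the function
// is "hot", then switch to a compiled version. makeTiered and the
// threshold of 100 are invented for illustration.
function makeTiered(interpreted, compile) {
  var calls = 0;
  var optimized = null;
  return function (n) {
    if (optimized) return optimized(n); // hot: use the fast version
    calls++;
    if (calls >= 100) optimized = compile(interpreted);
    return interpreted(n); // cold: keep interpreting
  };
}

var square = makeTiered(
  function (n) { return n * n; },                   // interpreter tier
  function (slow) { return function (n) { return n * n; }; } // stand-in
);                                                  // optimizing compiler

for (var i = 0; i < 200; i++) square(3); // becomes "hot" along the way
```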

What happens if, later on, these speculations turn out to be wrong? The JavaScript engine can simply delete the optimized, but wrong, function, and revert to using the unoptimized version. Once the function has been run several more times, it can attempt to pass it to the optimizing compiler again, this time with even more information that it can use for speculative optimizations.
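A sketch of that guard-and-bail-out pattern is shown below. The explicit `typeof` guard is illustrative only; real engines emit much cheaper checks than this.

```javascript
// Speculative optimization sketch: the fast path assumes both
// arguments are numbers, and bails out to a generic version when
// that assumption turns out to be wrong.
function genericAdd(a, b) {
  return a + b; // full JavaScript `+` semantics: coercion and all
}

var add = function optimizedAdd(a, b) {
  if (typeof a !== "number" || typeof b !== "number") {
    add = genericAdd;        // deoptimize: discard the speculative version
    return genericAdd(a, b); // finish this call the slow way
  }
  return a + b; // fast path: plain number addition
};

add(1, 2);     // fast path; returns 3
add("1", "2"); // guard fails: deoptimizes, still returns "12"
```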

Now that we know that frequently run functions use information from previous executions during optimization, the next thing to explore is what kind of information this is.

Almost everything in JavaScript is an object. Unfortunately, JavaScript objects are tricky things to teach a machine to deal with. Let’s look at the following code:

```javascript
function addFive(obj) {
  return obj.method() + 5;
}
```

A function is pretty straightforward to translate to machine instructions, as is returning from a function. But a machine doesn’t know what objects are, so how would you translate accessing the method property of obj?

It would help to know what obj looks like, but in JavaScript we can never really be certain. Any object can have a method property added to, or removed from, it. Even when it does exist, we cannot actually be certain if it is a function, much less what calling it returns.
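That uncertainty is easy to demonstrate: every assumption the engine might want to make about obj.method can be invalidated at runtime.

```javascript
var obj = { method: function () { return 10; } };

obj.method();              // fine: returns 10
obj.method = "hello";      // now it isn't a function anymore
delete obj.method;         // now it doesn't exist at all
obj.method = function () { // back again, but returning a string,
  return "ten";            // so code that adds 5 would now concatenate
};
```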

Let’s attempt to translate the above code to a subset of JavaScript that doesn’t have objects, to get an idea of what translating to machine instructions might be like.

First, we need a way to represent objects. We also need a way to retrieve values from one. Arrays are trivial to support in machine code, so we might go with a representation like this:

```javascript
// An object like { method: function() {} }
// could be represented as:
// [ [ "method" ],       // property names
//   [ function() {} ] ] // property values

function lookup(obj, name) {
  for (var i = 0; i < obj[0].length; i++) {
    if (obj[0][i] === name) return i;
  }
  return -1;
}
```

With this, we can attempt to make a naive implementation of addFive:

```javascript
function addFive(obj) {
  var propertyIndex = lookup(obj, "method");
  var property = propertyIndex < 0 ? undefined : obj[1][propertyIndex];

  if (typeof(property) !== "function") {
    throw NotAFunction(obj, "method");
  }

  var callResult = property(/* this */ obj);
  return callResult + 5;
}
```

Of course, this doesn’t work in the case where obj.method() returns something other than a number, so we need to tweak the implementation a little:

```javascript
function addFive(obj) {
  var propertyIndex = lookup(obj, "method");
  var property = propertyIndex < 0 ? undefined : obj[1][propertyIndex];

  if (typeof(property) !== "function") {
    throw NotAFunction(obj, "method");
  }

  var callResult = property(/* this */ obj);

  if (typeof(callResult) === "string") {
    return stringConcat(callResult, "5");
  } else if (typeof(callResult) !== "number") {
    throw NotANumber(callResult);
  }

  return callResult + 5;
}
```

This would work, but I hope it’s apparent that this code could skip a few steps (and thus be faster) if we could somehow know ahead of time what the structure of obj is, and what the type of method is.

All the major JavaScript engines keep track of an object’s shape in some way. In Chrome, this concept is known as hidden classes. It’s what we will call it in this article as well.

Let’s start by looking at the following snippet of code:

```javascript
var obj = {};      // empty object
obj.x = 1;         // shape has now changed to include an `x` property
obj.toString = function() { return "TODO"; }; // shape changes
delete obj.x;      // shape changes again
```

If we were to translate this to machine instructions, how would we keep track of the object’s shape as new properties are added and removed? If we use the previous example’s idea of representing objects as arrays, it might look something like this:

```javascript
var emptyObj__Class = [
  null, // No parent hidden class
  [],   // Property names
  []    // Property types
];

var obj = [
  emptyObj__Class, // Hidden class of `obj`
  []               // Property values
];

var obj_X__Class = [
  emptyObj__Class, // Contains same properties as empty object
  ["x"],           // As well as one property called `x`
  ["number"]       // Where `x` is a number
];

obj[0] = obj_X__Class; // Shape changes
obj[1].push(1);        // Value of `x`

var obj_X_ToString__Class = [
  obj_X__Class, // Contains same properties as previous shape
  ["toString"], // And one property called `toString`
  ["function"]  // Where `toString` is a function
];

obj[0] = obj_X_ToString__Class; // Shape changes
obj[1].push(function() { return "TODO"; }); // `toString` value

var obj_ToString__Class = [
  null, // Starting from scratch when deleting `x`
  ["toString"],
  ["function"]
];

obj[0] = obj_ToString__Class;
obj[1] = [obj[1][1]];
```

If we were to generate virtual machine instructions like this, we would now have a way to track what an object looks like at any given time. However, this on its own doesn’t help much; we also need to store this information somewhere it can be put to use.

Whenever JavaScript code performs property access on an object, the JavaScript engine stores that object’s hidden class, as well as the result of the lookup (the mapping of property name to index) in a cache. These caches are known as inline caches, and they serve two important purposes:

  • When executing bytecode, they speed up property access if the object involved has a hidden class that is in the cache.
  • During optimization, they record which types of objects have been involved when accessing an object property, which helps the optimizing compiler generate code specially suited for those types.
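A very rough model of a monomorphic inline cache is sketched below, assuming each object carries a hiddenClass tag with a names array. This is a big simplification of the real, engine-internal structures.

```javascript
// Rough model of an inline cache at a single property-access site.
// `hiddenClass` and its `names` array are simplified stand-ins for
// the engine-internal structures.
function makePropertyAccess(name) {
  var cachedClass = null;
  var cachedIndex = -1;
  return function (obj) {
    if (obj.hiddenClass === cachedClass) {
      return obj.values[cachedIndex]; // cache hit: no lookup at all
    }
    // Cache miss: do the slow lookup, then remember the result.
    cachedClass = obj.hiddenClass;
    cachedIndex = obj.hiddenClass.names.indexOf(name);
    return obj.values[cachedIndex];
  };
}

var pointClass = { names: ["x", "y"] };
var getX = makePropertyAccess("x");

getX({ hiddenClass: pointClass, values: [1, 2] }); // miss: fills the cache
getX({ hiddenClass: pointClass, values: [7, 8] }); // hit: same hidden class
```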

Inline caches have a limit on how many hidden classes they store information on. This preserves memory, but also makes sure that performing lookups in the cache is fast. If retrieving an index from the inline cache takes longer than retrieving the index from the hidden class, the cache serves no purpose.

From what I can tell, inline caches will keep track of 4 hidden classes at most, at least in Chrome. After this, the inline cache will be disabled and the information will instead be stored in a global cache. The global cache is also limited in size, and once it has reached its limit, newer entries will overwrite older ones.

To best utilize inline caches, and aid the optimizing compiler, one should try to write functions that only perform property access on objects of a single type. If a function sees more types than that, the performance of the generated code will be sub-optimal.
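In practice, that means constructing objects so they share a shape. In engines like V8, two objects only share a hidden class if their properties were added in the same order, so a constructor-style helper keeps call sites monomorphic:

```javascript
// Objects created with the same property order share a hidden class
// (in engines like V8), so functions that only ever see points from
// makePoint stay monomorphic.
function makePoint(x, y) {
  return { x: x, y: y }; // always the same shape
}

function lengthSquared(point) {
  return point.x * point.x + point.y * point.y;
}

lengthSquared(makePoint(3, 4)); // returns 25

// By contrast, adding properties in varying orders produces different
// hidden classes, making lengthSquared's property access polymorphic:
var p = {};
p.y = 4; // different insertion order than makePoint,
p.x = 3; // so (in V8) a different hidden class
lengthSquared(p); // still returns 25, but via a slower path
```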

A separate, but significant, kind of optimization is inlining. In short, this optimization replaces a function call with the implementation of the called function. An example:

```javascript
function map(fn, list) {
  var newList = [];
  for (var i = 0; i < list.length; i++) {
    newList.push(fn(list[i]));
  }
  return newList;
}

function incrementNumbers(list) {
  return map(function(n) { return n + 1; }, list);
}

incrementNumbers([1, 2, 3]); // returns [2, 3, 4]
```

After inlining, the code might end up looking something like this:

```javascript
function incrementNumbers(list) {
  var newList = [];
  var fn = function(n) { return n + 1; };
  for (var i = 0; i < list.length; i++) {
    newList.push(fn(list[i]));
  }
  return newList;
}

incrementNumbers([1, 2, 3]); // returns [2, 3, 4]
```

One benefit of this is that a function call has been removed. An even bigger benefit is that the JavaScript engine now has even more insight into what the function actually does. Based on this new version, the JavaScript engine might decide to perform inlining again:

```javascript
function incrementNumbers(list) {
  var newList = [];
  for (var i = 0; i < list.length; i++) {
    newList.push(list[i] + 1);
  }
  return newList;
}

incrementNumbers([1, 2, 3]); // returns [2, 3, 4]
```

Another function call has been removed. What’s more, the optimizer might now speculate that incrementNumbers is only ever called with a list of numbers as an argument. It might also decide to inline the incrementNumbers([1, 2, 3]) call itself, and discover that list.length is 3, which again might lead to:

```javascript
var list = [1, 2, 3];
var newList = [];
newList.push(list[0] + 1);
newList.push(list[1] + 1);
newList.push(list[2] + 1);
list = newList;
```

In short, inlining enables optimizations that would not have been possible to perform across function boundaries.

There are limits to what can be inlined, however. Inlining duplicates code, producing larger functions that require more memory. The JavaScript engine therefore has a budget for how big a function can get before it skips inlining altogether.

Some function calls are also difficult to inline, particularly calls to a function that was passed in as an argument. Such a call can only be inlined if it is always the same function being passed. While relying on that might strike you as strange, it often ends up being true precisely because of inlining, as the map example above shows.

JavaScript engines have many tricks to improve runtime performance, many more than what has been covered here. However, the optimizations described in this article apply to most browsers, and are easy to verify if they’re being applied. Because of this, we’re mainly going to be focusing on these optimizations when we attempt to improve Elm’s runtime performance.

But before we start trying to optimize anything, we need a way of identifying what code can be improved. The tools that give us this information are the topic of the next article.

I’m not the first to try to explain how JavaScript engines work. Here are a few articles that go into more depth, and that explain similar concepts in different ways: