Memory Analysis 101
This is an introduction to memory analysis. Terms and notions described here are used in the Heap Profiler UI and the corresponding documentation. You need to be familiar with them to use the tool effectively.
Common Terms
This section describes common terms used in memory analysis, and is applicable to a variety of memory profiling tools for different languages. If you already have experience with, say, Java or .NET memory profilers, chances are high that you are familiar with them.
Object Sizes
Memory can be held by an object in two ways: directly by the object itself, and implicitly by holding references to other objects, thus preventing them from being automatically disposed of by a garbage collector (GC for short).
The size of memory that is held by the object itself is called shallow size. Typical JavaScript objects have some memory reserved for their description and for storing immediate values.
Usually, only arrays and strings can have significant shallow sizes. However, strings often have their main storage in renderer memory, exposing only a small wrapper object on the JavaScript heap.
Nevertheless, even a small object can hold a large amount of memory indirectly, by preventing other objects from being disposed of by the automatic garbage collection process. The size of memory that will be freed once the object itself is deleted, along with its dependent objects that become unreachable from GC roots as a result, is called retained size.
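The difference between shallow and retained size can be illustrated with a small sketch. It is a toy model, not how the Heap Profiler computes sizes: the heap is assumed to be a plain dictionary of references, and the names heap_refs, shallow, reachable and retained_size, as well as the byte counts, are hypothetical. The retained size of an object is taken to be the total shallow size of everything that stops being reachable from the roots once that object is removed.

```python
from collections import deque

def reachable(graph, roots, removed=None):
    """Return the set of nodes reachable from roots, ignoring the node `removed`."""
    queue = deque(r for r in roots if r != removed)
    seen = set(queue)
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child != removed and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def retained_size(graph, shallow, roots, obj):
    """Total shallow size of the objects that become unreachable when obj is deleted."""
    before = reachable(graph, roots)
    after = reachable(graph, roots, removed=obj)
    return sum(shallow[n] for n in before - after)

# Hypothetical heap: 'a' is the only holder of 'big'; 'shared' is held by both 'a' and 'b'.
heap_refs = {
    "root": ["a", "b"],
    "a": ["big", "shared"],
    "b": ["shared"],
}
shallow = {"root": 0, "a": 16, "b": 16, "big": 1000, "shared": 200}

size_a = retained_size(heap_refs, shallow, ["root"], "a")  # 'a' itself plus 'big'
size_b = retained_size(heap_refs, shallow, ["root"], "b")  # only 'b' itself
```

In this model the small object a retains the large object big, while shared counts toward neither a nor b, because deleting either one alone leaves it reachable through the other.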
Retaining Paths
The heap is a network of interconnected objects. In the mathematical world, this structure is called a graph. A graph is constructed from nodes connected by means of edges. Both nodes and edges have labels: nodes (objects) are labelled using the name of the constructor function that was used to build them, and edges are labelled using the names of properties.
The sequence of edges that needs to be traversed in order to reach one object from another is called a path. Usually, we are only interested in simple paths, i.e. paths that do not go through the same node twice.
We call a retaining path any path from GC roots to a particular object. If there are no such paths, the object is called unreachable and is subject to disposal during garbage collection.
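As a rough illustration of the definitions above, the following sketch finds one shortest retaining path with a breadth-first search. The heap layout, the constructor names and the function shortest_retaining_path are hypothetical and exist only for this example; edges carry property names, as described in the text.

```python
from collections import deque

def shortest_retaining_path(edges, root, target):
    """Return the property names along one shortest path from root to target.

    edges maps a node to a list of (property_name, child) pairs.
    Returns None if target is unreachable from root.
    """
    parent = {root: None}  # node -> (previous node, edge label)
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while parent[node] is not None:
                node, label = parent[node]
                path.append(label)
            return list(reversed(path))
        for label, child in edges.get(node, ()):
            if child not in parent:
                parent[child] = (node, label)
                queue.append(child)
    return None

# Hypothetical heap: 'Array' is retained through two different paths.
heap_edges = {
    "(GC roots)": [("window", "Window")],
    "Window": [("app", "App"), ("cache", "Cache")],
    "App": [("items", "Array")],
    "Cache": [("items", "Array")],
    "Detached": [("self", "Detached")],  # references itself, but no root leads to it
}

path_to_array = shortest_retaining_path(heap_edges, "(GC roots)", "Array")
path_to_detached = shortest_retaining_path(heap_edges, "(GC roots)", "Detached")
```

Here path_to_detached is None: the Detached object references itself, but no path from the GC roots leads to it, so it would be subject to collection.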
Dominators
The dominator of an object A is an object that is present on every simple path from the root to A. This means that if the dominator is removed from the heap (with all of its references cut), A becomes unreachable from GC roots and will be disposed of.
Dominators form a tree structure, because each object other than the root has exactly one immediate dominator (the dominator closest to it). A dominator may lack direct references to an object it dominates; that is, the dominator tree is not a spanning tree of the graph.
Collection-like objects may retain large amounts of memory when they dominate other objects. Such nodes of the tree are called accumulation points.
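The dominator relation can be sketched with the classic iterative set-intersection scheme. This is a simple textbook approach chosen for readability, not necessarily what any profiler uses; the function immediate_dominators and the example graph are hypothetical, and every node is assumed to be reachable from the root.

```python
def immediate_dominators(graph, root):
    """Map each non-root node to its immediate dominator.

    graph maps a node to a list of the nodes it references.
    Assumption: every node is reachable from root.
    """
    nodes = set(graph) | {c for children in graph.values() for c in children}
    preds = {n: set() for n in nodes}
    for parent, children in graph.items():
        for child in children:
            preds[child].add(parent)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in nodes - {root}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True

    # Strict dominators of a node form a chain; the closest one has the largest set.
    idom = {}
    for n in nodes - {root}:
        strict = dom[n] - {n}
        idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom

# Hypothetical heap: 'c' is referenced by both 'a' and 'b', and 'c' references 'd'.
refs = {"root": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"]}
idom = immediate_dominators(refs, "root")
```

In this example the immediate dominator of c is root, even though root holds no direct reference to c. This is the situation the text describes when it says the dominator tree is not a spanning tree of the graph.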
V8 Specifics
In this section we describe some memory-related topics that are specific to the V8 JavaScript virtual machine. Reading them might help you understand why heap snapshots look the way they do.
JavaScript Object Representation
Numbers can be represented either as immediate 31-bit integer values (called small integers, or SMIs for short) or as heap objects (called heap numbers). The latter are used for values that can't fit into the SMI form, e.g. doubles, or for cases when a value needs to be boxed, e.g. for setting properties on it.
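To make the 31-bit limit concrete, the sketch below checks whether a number falls inside a signed 31-bit range. It is a range check only, under the assumption that the 31 bits include the sign; the names SMI_MIN, SMI_MAX and fits_smi_range are hypothetical, and the exact width used by a real V8 build may differ by platform.

```python
SMI_BITS = 31  # width stated in the text; assumed to include the sign bit
SMI_MIN = -(1 << (SMI_BITS - 1))     # -1073741824
SMI_MAX = (1 << (SMI_BITS - 1)) - 1  #  1073741823

def fits_smi_range(value):
    """Return True if value is a whole number within the signed 31-bit range."""
    if isinstance(value, float):
        if not value.is_integer():
            return False  # fractional values need a heap number
        value = int(value)
    return SMI_MIN <= value <= SMI_MAX
```

Under this model, 1073741823 fits in the immediate form while 1073741824 and 0.5 would have to be stored as heap numbers.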
String content can be stored either in the VM heap or externally in the renderer's memory. Content received from the Web (e.g. script sources) doesn't get copied onto the VM heap. Instead, a wrapper object is created and used to access the external storage.
When two strings are concatenated, their contents are initially stored separately and are joined only logically, using an object called a cons string. The contents of a cons string are actually joined only when needed, e.g. when a substring of the joined string needs to be constructed.
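The lazy joining described above can be modelled with a small class. This is a toy illustration of the idea, not V8's implementation; the class ConsString and its methods flatten and substring are hypothetical names chosen for this sketch.

```python
class ConsString:
    """Toy model of a logically joined string: two parts kept separate until needed."""

    def __init__(self, left, right):
        self.left = left    # str or ConsString
        self.right = right  # str or ConsString
        self.flat = None    # joined content, filled in on first demand

    def __len__(self):
        return len(self.left) + len(self.right)

    def flatten(self):
        """Physically join the parts, once, and cache the result."""
        if self.flat is None:
            parts = []
            stack = [self]
            while stack:  # iterative walk, so long concatenation chains don't recurse deeply
                node = stack.pop()
                if isinstance(node, ConsString) and node.flat is None:
                    stack.append(node.right)  # pushed first, so it is visited second
                    stack.append(node.left)
                elif isinstance(node, ConsString):
                    parts.append(node.flat)
                else:
                    parts.append(node)
            self.flat = "".join(parts)
            self.left, self.right = self.flat, ""
        return self.flat

    def substring(self, start, end):
        """Taking a substring is one of the operations that forces the join."""
        return self.flatten()[start:end]

joined = ConsString(ConsString("foo", "bar"), "baz")
was_flat_before = joined.flat is not None  # nothing has been joined yet
piece = joined.substring(2, 7)             # triggers the actual join
```

Concatenation itself only allocates a small node, so building a long string piece by piece stays cheap until some operation actually needs the contiguous content.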
Arrays are used extensively in the V8 VM for storing large amounts of data. Dictionaries (sets of key-value pairs) are backed by arrays. Thus, arrays are the basic building block for JavaScript objects. A typical JavaScript object possesses two arrays: one for storing named properties, another for storing numeric elements. When the number of properties is very small, they can be stored internally in the JavaScript object itself.
A map object describes an object's kind and its layout. For example, maps are used to describe implicit object hierarchies, as described here.
Object Groups
Each native object group is made up of objects that hold mutual references to each other. Consider, for example, a DOM subtree, where every node has a link to its parent and links to its next child and next sibling, thus forming a connected graph. Note that native objects are not represented in the JavaScript heap, which is why they have zero size. Instead, wrapper objects are created. Each wrapper object holds a reference to the corresponding native object, for redirecting commands to it. In turn, an object group holds wrapper objects. However, this doesn't create an uncollectable cycle, as the GC is smart enough to release object groups whose wrappers are no longer referenced. But forgetting to release a single wrapper will hold the whole group and its associated wrappers.
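The all-or-nothing behaviour of object groups can be summarised in a few lines. This is a deliberately simplified model with hypothetical names (live_groups, dom_groups and the wrapper labels); it only encodes the rule stated above, that a group stays alive as long as at least one of its wrappers is still referenced.

```python
def live_groups(groups, referenced_wrappers):
    """Return the names of groups that have at least one referenced wrapper.

    groups maps a group name to the set of wrapper names it holds.
    """
    return {name for name, wrappers in groups.items()
            if wrappers & referenced_wrappers}

# Hypothetical DOM subtrees, each exposed to JavaScript through three wrappers.
dom_groups = {
    "sidebar": {"sidebar_div", "sidebar_ul", "sidebar_li"},
    "dialog": {"dialog_div", "dialog_button", "dialog_label"},
}

# The page kept a reference to a single button wrapper and released everything else.
still_referenced = {"dialog_button"}
kept = live_groups(dom_groups, still_referenced)
```

In this model the sidebar group can be released, while the single remaining reference to dialog_button keeps the entire dialog group and all of its wrappers alive.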