Finding a memory leak in a Node server using the Chrome inspector

Shaohao Lin
5 min read · Mar 29, 2021

Recently, I enabled server-side data fetching using React-Query's SSR feature in an isomorphic application. After the feature was introduced, server memory and CPU usage went through the roof (see the memory usage graph below), and page response times increased dramatically. I realized that a memory leak had been introduced. I mitigated the issue by rebooting the server and reverting the SSR changes. Rebooting the server is only a short-term solution, since it just resets the machine to its initial state. I wanted to find the root cause of the leak and fix it.

Graph: memory usage spiking up

If you search for memory leaks in Node, you will find plenty of posts about tools for finding them. This post focuses on a single use case: debugging a memory leak step by step, in detail, and then fixing it.

Before jumping into how to find a leak, it is important to understand how memory management works in NodeJS. Every application needs memory to work properly. Memory management provides ways to dynamically allocate chunks of memory to programs when they request them, and to release that memory when it is no longer needed so that it can be reused. Application-level memory management can be manual or automatic. In a NodeJS application, memory management is automatic and involves a garbage collector. The purpose of a garbage collector is to monitor memory allocation, determine when a block of allocated memory is no longer needed, and reclaim it. This automatic process is an approximation, since the general problem of deciding whether some memory “is not needed anymore” is undecidable.

Garbage collection algorithms rely on the concept of a reference. Within the context of memory management, an object is said to reference another object if the former has access to the latter (either implicitly or explicitly). An object is garbage collectible once nothing that is still in use references it, that is, once it can no longer be reached from the program's roots (globals, the call stack, pending timers, and so on). In other words, the memory allocated for an object is released once no live reference to it remains.

Usually, a memory leak happens because the garbage collector cannot reclaim an object that is no longer needed: something else still references it. With this understanding of garbage collection, what we are looking for is an object that is referenced for longer than it is needed.
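To make "referenced for longer than it is needed" concrete, here is a minimal sketch. The names sessions, handleRequestLeaky and handleRequestFixed are hypothetical and are not taken from the application discussed in this post; the module-level Map simply stands in for any long-lived object that holds on to per-request data.

```javascript
// Hypothetical example: a module-level Map outlives every request,
// so anything stored in it stays reachable until it is deleted.
const sessions = new Map();

function handleRequestLeaky(requestId) {
  const context = { requestId, payload: new Array(1000).fill(requestId) };
  sessions.set(requestId, context); // this reference is never removed
  return context.payload.length;
}

function handleRequestFixed(requestId) {
  const context = { requestId, payload: new Array(1000).fill(requestId) };
  sessions.set(requestId, context);
  try {
    return context.payload.length;
  } finally {
    sessions.delete(requestId); // drop the reference once the request is done
  }
}

for (let i = 0; i < 10; i++) handleRequestLeaky(`leaky-${i}`);
const retainedAfterLeaky = sessions.size; // one retained context per leaky request

for (let i = 0; i < 10; i++) handleRequestFixed(`fixed-${i}`);
const retainedAfterFixed = sessions.size; // unchanged: the fixed handler retains nothing

console.log({ retainedAfterLeaky, retainedAfterFixed });
```

The leaky handler is not doing anything exotic; it just forgets to remove a reference, which is exactly the kind of thing the heap snapshot comparison is meant to surface.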

Debugging steps

1. Start the Node application with the inspector enabled by adding the --inspect flag (e.g. the following command). This allows us to attach a debugger to the NodeJS application.

node --inspect=0.0.0.0:9229

2. Open Chrome DevTools and click the Node icon.

Open Node DevTools

3. Click on the Memory tab and select Heap snapshot as the profiling type.

4. Set up a clean baseline heap snapshot. Before taking the first heap snapshot, make sure to clean up the garbage by clicking the garbage icon. This ensures that all collectable garbage has been collected by the garbage collector (GC) and that our baseline is clean.

5. Click the snapshot button to take the first snapshot.

6. Repeat the same action 10 times (any action you suspect of causing the memory leak, e.g. refreshing a page or clicking a submit button).

7. Take a second snapshot.

8. Change the view from Summary to Comparison.

9. Sort by # Delta in descending order.

10. Look for # Delta values that are a multiple of 10 (because we performed the same action 10 times in step 6). In our case, the deltas of MutationCache, QueryCache, QueryClient and Retryer are all 10. These four constructors are potential memory leak candidates.

11. Click on each constructor to investigate further by looking at the Retainers section. Look for objects whose retained size percentage is not 0.
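The same "baseline, repeat the action, compare" idea can also be scripted as a rough first check before opening DevTools. The sketch below is only an approximation of the steps above: measureHeapGrowth, saveSnapshot and suspectAction are hypothetical names, and heapUsed is a much coarser signal than the per-constructor # Delta column. process.memoryUsage() and v8.writeHeapSnapshot() are standard Node APIs; global.gc is only available when Node is started with --expose-gc.

```javascript
const v8 = require('v8');

// Force a collection when Node was started with --expose-gc; otherwise do nothing.
function collectGarbageIfExposed() {
  if (typeof global.gc === 'function') global.gc();
}

// Run `action` `times` times and report how much the used heap changed, in bytes.
async function measureHeapGrowth(action, times = 10) {
  collectGarbageIfExposed();
  const before = process.memoryUsage().heapUsed;
  for (let i = 0; i < times; i++) {
    await action(i);
  }
  collectGarbageIfExposed();
  const after = process.memoryUsage().heapUsed;
  return { before, after, growth: after - before };
}

// Write a .heapsnapshot file that can be loaded from the DevTools Memory tab.
// Not called in this demo because snapshot files can be large.
function saveSnapshot(label) {
  return v8.writeHeapSnapshot(`${label}-${Date.now()}.heapsnapshot`);
}

// Hypothetical suspect action: keeps every result in a long-lived array.
const retained = [];
function suspectAction(i) {
  retained.push(new Array(10000).fill(i));
}

measureHeapGrowth(suspectAction, 10).then((result) => {
  console.log(`heap changed by ${result.growth} bytes over 10 runs`);
});
```

Without --expose-gc the numbers are noisy, so a positive growth value is a hint to go take real snapshots, not proof of a leak.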

Analysis

In the example above, the memory issue is related to MutationCache, QueryCache, QueryClient and Retryer. These four objects cannot be collected because a setTimeout callback references them. The timeout expires after 6 mins, which means these four objects only become collectible after 6 mins.

On further investigation, the leak comes from how I used React-Query in an SSR setting. Basically, React-Query stores its cache in memory. Cached data is kept for a cache time, 6 mins in this case, and React-Query schedules a setTimeout to expire the cached data once that time has passed. The timeout callback holds a reference to the cache object, which means the cache will not be garbage collected until the 6 mins are up (assuming no other object still references the same cache). This is perfectly fine for client-side caching. On the server side, however, each request creates its own cache in memory (by design: I don't want to share a cache between two requests, especially when it contains user information), and each cache lives for 6 mins before the GC can collect it.

Assume each cache object is 12 bytes and the server handles 4,000 requests per second (RPS). A simplified estimate of the memory usage in 6 mins is: 6 x 60 x 12 x 4000 x 6 x 60 = 6.2208 GB. On a server with 16 GB of RAM, that is about 39% memory usage. A server could reach 100% memory usage in less than a couple of hours. Circling back to the memory usage graph, it clearly confirms my analysis: memory spikes because of in-memory caching.

To fix this issue, I set the server-side cache time to zero and clear the React-Query client after hydration (see the GitHub discussion for how to implement it). This removes every reference to the cache objects in React-Query on the server side, so they can be garbage collected.
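The actual implementation is in the linked discussion and is not reproduced here. In React-Query v3 terms, I would expect it to come down to the cacheTime query option and the QueryClient's clear() method. To show why those two changes help, here is a small self-contained sketch. RequestCache and renderRequest are hypothetical stand-ins that only mimic the relevant behavior (a timer that keeps a per-request cache alive); this is not React-Query's implementation.

```javascript
// Hypothetical stand-in for a per-request cache; NOT React-Query's implementation.
class RequestCache {
  constructor({ cacheTime }) {
    this.cacheTime = cacheTime;
    this.entries = new Map(); // key -> { value, timer }
  }

  set(key, value) {
    // The timer callback closes over `this`, so a pending timer keeps
    // the whole cache reachable until it fires or is cleared.
    const timer = setTimeout(() => this.entries.delete(key), this.cacheTime);
    this.entries.set(key, { value, timer });
  }

  // Every entry owns exactly one pending timer until it expires or is cleared.
  pendingTimers() {
    return this.entries.size;
  }

  clear() {
    for (const { timer } of this.entries.values()) clearTimeout(timer);
    this.entries.clear();
  }
}

// Simulates rendering one request; returns the cache so it can be inspected.
function renderRequest(cacheTime, clearAfterRender) {
  const cache = new RequestCache({ cacheTime });
  cache.set('user', { name: 'example' });
  const html = `<p>${cache.entries.get('user').value.name}</p>`;
  if (clearAfterRender) cache.clear();
  return { html, cache };
}

const leaky = renderRequest(6 * 60 * 1000, false); // timer pending for six minutes
const fixed = renderRequest(0, true); // zero cache time, cleared after render

const leakyPending = leaky.cache.pendingTimers();
const fixedPending = fixed.cache.pendingTimers();
console.log({ leakyPending, fixedPending });

leaky.cache.clear(); // cancel the demo timer so the process can exit
```

In the leaky case, one timer per request stays pending and pins that request's cache; in the fixed case, nothing outside the request handler references the cache once the response has been rendered.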

Conclusion

Fixing a memory leak is relatively easy. The challenging part is finding it.
