Classic Frontend Engineering Problem Series: Virtualized Lists

Disclaimer: The knowledge and techniques described in this blog come from Understand Virtual List by Building One. This blog serves as my personal notes on, and understanding of, the concepts presented in that video.

One of the common libraries that I use at work is react-virtualized. I've used it many times without fully understanding the mechanics beneath.

This blog is my attempt to understand the problem and the solution.

The problem is straightforward: browsers struggle with rendering thousands of DOM elements. Apps like Twitter and Facebook need a better approach for their endless scrolling lists.

The Problem: A Million-Item List

The naive approach, rendering every item at once, doesn't work:

function NaiveList() {
  // A million items - DO NOT TRY THIS AT HOME!
  const items = Array.from({ length: 1000000 }, (_, i) => ({
    id: i,
    name: `Item ${i + 1}`,
  }));

  return (
    <ul>
      {items.map(item => (
        <li key={item.id}>{item.name}</li>
      ))}
    </ul>
  );
}

This crashes most browsers (it crashed mine on a MacBook Pro M2 with 16GB of RAM). A million DOM nodes consume too much memory and CPU.

The Key Insight: Only Render What's Visible

The core insight: render only what's in the viewport. If the viewport only fits 20 items, we only need 20 DOM nodes.
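To put a number on that, here is a minimal sketch of the count. The helper name rowsInViewport and its parameter names are made up for illustration, and it assumes every row has the same fixed height.

```javascript
// Hypothetical helper: upper bound on how many fixed-height rows can be
// at least partly visible in the viewport at the same time.
function rowsInViewport(viewportHeight, rowHeight) {
  if (rowHeight <= 0) {
    throw new RangeError("rowHeight must be positive");
  }
  // +1 because a scroll offset that is not a multiple of rowHeight
  // can expose a partial row at both the top and the bottom.
  return Math.ceil(viewportHeight / rowHeight) + 1;
}

console.log(rowsInViewport(400, 20)); // 21
```

A 400px viewport with 20px rows fits 20 whole rows, plus one extra when the list is scrolled partway through a row. Either way, the number depends on the viewport size, not on the length of the list.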

The building blocks:

import { useState } from "react";

function randomId() {
  return (
    Math.random().toString(36).substring(2, 15) +
    Math.random().toString(36).substring(2, 15)
  );
}

function generateList(num) {
  return Array.from({ length: num }, (_, i) => ({
    id: randomId(),
    name: `Item ${i + 1}`,
    description: `Description ${i + 1}`,
    price: (Math.random() * 100).toFixed(2),
  }));
}

const list = generateList(1_000_000);
const itemHeight = 20;
const windowHeight = 200;
const overscan = 10;

function App() {
  return (
    <div>
      <TopPositioningMethod />
      <TranslateYMethod />
    </div>
  );
}

export default App;

Two main implementation approaches exist. Here's my breakdown of both.

Approach 1: Absolute Positioning with 'top'

First approach - absolute positioning:

function TopPositioningMethod() {
  const [scrollTop, setScrollTop] = useState(0);
  const _startIndex = scrollTop / itemHeight;
  const _endIndex = Math.ceil((scrollTop + windowHeight) / itemHeight);

  const startIndex = Math.max(0, Math.floor(_startIndex) - overscan);
  const endIndex = Math.min(list.length, _endIndex + overscan);

  const handleScroll = (e) => {
    setScrollTop(e.target.scrollTop);
  };

  const visibleList = list.slice(startIndex, endIndex);

  return (
    <div>
      <h1>Top Positioning Method</h1>
      <div
        style={{
          height: `${windowHeight}px`,
          overflowY: "scroll",
        }}
        onScroll={handleScroll}
      >
        {/* Sized to the full list so the scrollbar reflects every item */}
        <ul
          style={{
            height: `${list.length * itemHeight}px`,
            position: "relative",
            margin: 0,
            padding: 0,
            listStyle: "none",
          }}
        >
          {visibleList.map((item, index) => (
            <li
              key={item.id}
              style={{
                // Stripe by absolute index so colors stay stable while scrolling
                backgroundColor:
                  (index + startIndex) % 2 === 0 ? "#fff" : "#f0f0f0",
                height: `${itemHeight}px`,
                position: "absolute",
                top: `${(index + startIndex) * itemHeight}px`,
                left: 0,
                right: 0,
              }}
            >
              {item.name}
            </li>
          ))}
        </ul>
      </div>
    </div>
  );
}

The Parts That Make It Work:

The Container: A fixed-height box with scrolling.
Total Height Dummy Element: An inner element set to the full height of all items (list.length * itemHeight). It gives the scrollbar the proportions of the entire list.
Scroll Position Tracking: We track scrollTop to calculate which items should be visible.
The Math:
- Math.floor(scrollTop / itemHeight) is the index of the first visible item
- Math.ceil((scrollTop + windowHeight) / itemHeight) is the index just past the last visible item
Overscan: We render extra items above and below the visible area for smoother scrolling.
Absolute Positioning: Each item is positioned absolutely at index * itemHeight, where index is its position in the full list.

The critical point: with a 200px window, 20px items, and an overscan of 10, we create roughly 30 DOM nodes at any time instead of a million.
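The index math can be pulled out of the component and checked on its own. The following is a sketch, not part of the original component: getVisibleRange, getItemTop, and the config object are names I made up, and the values mirror the constants used in this post.

```javascript
// Hypothetical config object; values match the constants used in this post.
const config = {
  itemHeight: 20,
  windowHeight: 200,
  overscan: 10,
  itemCount: 1_000_000,
};

// Returns the half-open range [startIndex, endIndex) of items to render.
function getVisibleRange(scrollTop, { itemHeight, windowHeight, overscan, itemCount }) {
  // Index of the first row that is at least partly visible.
  const firstVisible = Math.floor(scrollTop / itemHeight);
  // Index just past the last row that is at least partly visible.
  const endVisible = Math.ceil((scrollTop + windowHeight) / itemHeight);
  return {
    startIndex: Math.max(0, firstVisible - overscan),
    endIndex: Math.min(itemCount, endVisible + overscan),
  };
}

// Absolute `top` offset of an item, measured from the start of the full list.
function getItemTop(index, { itemHeight }) {
  return index * itemHeight;
}

console.log(getVisibleRange(0, config));    // { startIndex: 0, endIndex: 20 }
console.log(getVisibleRange(1005, config)); // { startIndex: 40, endIndex: 71 }
console.log(getItemTop(40, config));        // 800
```

At the very top there is nothing above the viewport to overscan, so only 20 items render. In the middle of the list the range holds 31 items: 11 partly visible rows plus 10 on each side.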

Approach 2: TranslateY Method

Second approach - CSS transforms:

function TranslateYMethod() {
  const [scrollTop, setScrollTop] = useState(0);
  const _startIndex = scrollTop / itemHeight;

  const startIndex = Math.max(0, Math.floor(_startIndex) - overscan);
  let renderedNodesCount = Math.ceil(windowHeight / itemHeight) + 2 * overscan;
  renderedNodesCount = Math.min(renderedNodesCount, list.length - startIndex);

  const handleScroll = (e) => {
    setScrollTop(e.target.scrollTop);
  };

  const visibleList = list.slice(startIndex, startIndex + renderedNodesCount);

  return (
    <div>
      <h1>TranslateY Method</h1>
      <div
        style={{
          height: `${windowHeight}px`,
          overflowY: "scroll",
        }}
        onScroll={handleScroll}
      >
        <div style={{ height: `${list.length * itemHeight}px` }}>
          <div
            style={{ transform: `translateY(${startIndex * itemHeight}px)` }}
          >
            {visibleList.map((item, index) => (
              <div
                key={item.id}
                style={{
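                  // Stripe by absolute index so colors stay stable while scrolling
                  backgroundColor:
                    (index + startIndex) % 2 === 0 ? "#fff" : "#f0f0f0",
                  // Enforce the fixed height that the scroll math assumes
                  height: `${itemHeight}px`,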
                }}
              >
                {item.name}
              </div>
            ))}
          </div>
        </div>
      </div>
    </div>
  );
}

Key Differences:

Instead of positioning each item absolutely, this approach renders items in normal flow then shifts the entire container using a transform.

I note three advantages:

The position is set once, on the wrapper, instead of on every item.
Transforms can be handled by the browser's compositor, which may use GPU acceleration.
The code is cleaner: no absolute positioning calculations for each item.
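As a rough sketch of the transform approach's math, the function below computes the single offset applied to the wrapper and the number of nodes inside it. The name getTranslateWindow and the shape of its return value are my own, not from the video or any library.

```javascript
// Hypothetical helper: what the transform-based list needs for a scroll position.
function getTranslateWindow(scrollTop, { itemHeight, windowHeight, overscan, itemCount }) {
  const startIndex = Math.max(0, Math.floor(scrollTop / itemHeight) - overscan);
  // Rows that fit in the window, plus the overscan buffer on both sides.
  const wanted = Math.ceil(windowHeight / itemHeight) + 2 * overscan;
  // Never run past the end of the list.
  const count = Math.max(0, Math.min(wanted, itemCount - startIndex));
  return {
    startIndex,
    count,
    // Single offset for the wrapper; items inside it stay in normal flow.
    offsetY: startIndex * itemHeight,
  };
}

const settings = { itemHeight: 20, windowHeight: 200, overscan: 10, itemCount: 1_000_000 };

console.log(getTranslateWindow(0, settings));    // { startIndex: 0, count: 30, offsetY: 0 }
console.log(getTranslateWindow(1005, settings)); // { startIndex: 40, count: 30, offsetY: 800 }
```

The node count stays fixed at 30 here except near the end of the list, where fewer items remain. Only offsetY changes as the user scrolls.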

Key Takeaways

From this deep dive, I've noted several important patterns:

Minimize DOM Nodes: Fewer DOM elements = faster page.
Fixed Heights Simplify Math: Known sizes make calculations trivial.
Overscan Prevents Flashing: A buffer zone improves perceived performance.
Math Over Markup: Virtualization essentially boils down to calculating what's visible.
Transforms vs. Absolute Positioning: Each has tradeoffs between simplicity and performance.

Understanding how virtualization works makes me better at implementing and debugging complex list interfaces.

Credits: This blog was inspired by Understand Virtual List by Building One, which provides an excellent breakdown of the concept.
