15/08/25 · 7 min read
#performance #tables #react #virtualization

Virtualizing data-heavy tables (200k+ rows) without breaking UX

The first time I tried to render 200,000 rows in a React table, Chrome's memory usage shot past 4GB and the tab crashed. The naive approach—render everything, let the browser handle it—doesn't work at scale. But virtualization isn't just about swapping in a library. It's a set of tradeoffs that affect scrolling, filtering, accessibility, and how your users perceive performance.

This is the playbook I now reach for when a table crosses ~100k rows.

What virtualization actually means

Virtualization is simple in concept: only render what's visible. If your viewport shows 50 rows, render 50 rows (plus a buffer), not 200,000. As the user scrolls, swap rows in and out of the DOM.

In practice, this means:

  • Calculating which rows are visible based on scroll position
  • Maintaining a "window" of rendered rows
  • Keeping total scrollable height accurate so the scrollbar behaves correctly
  • Handling variable row heights (if you have them)

Libraries like react-window, react-virtualized, or @tanstack/react-virtual handle the mechanics. The harder decisions are architectural.
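
For intuition, the core bookkeeping those libraries do for fixed-height rows is just a few lines of arithmetic. A hand-rolled sketch (the names are illustrative, not any library's API):
// Compute which rows to render for a given scroll position (fixed-height rows).
function getVisibleRange({ scrollTop, viewportHeight, rowHeight, rowCount, overscan = 5 }) {
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const visibleCount = Math.ceil(viewportHeight / rowHeight);
  return {
    start: Math.max(0, firstVisible - overscan),
    end: Math.min(rowCount - 1, firstVisible + visibleCount + overscan),
    totalHeight: rowCount * rowHeight, // keeps the scrollbar proportional to the full dataset
  };
}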

Overscan and row height strategy

Two decisions you'll hit immediately: how much to overscan, and whether rows have fixed or variable height.

Overscan — Rendering extra rows above and below the viewport smooths scrolling (no blank flash as you scroll) but increases work. Start at 5–10 rows; measure jank and CPU (e.g. frame drops in the Performance panel) before increasing. More overscan isn't always better.

Fixed height rows — Simplest and fastest. Use a constant estimateSize (or equivalent in your library). No measurement, no layout thrash.

Variable height rows — Requires measurement and caching. The library needs to know row heights to compute scroll position; that usually means measuring after render and storing results. Watch for layout thrash if you measure too often or during scroll.
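
For reference, the fixed-height setup is only a few lines with @tanstack/react-virtual. A minimal sketch (the row height and overscan values are placeholders to tune, not recommendations):
import { useRef } from 'react';
import { useVirtualizer } from '@tanstack/react-virtual';

function VirtualRows({ rows }) {
  const parentRef = useRef(null);

  const virtualizer = useVirtualizer({
    count: rows.length,
    getScrollElement: () => parentRef.current,
    estimateSize: () => 36, // fixed row height in px, so no measurement is needed
    overscan: 8,            // start small; raise only if you see blank flashes while scrolling
    // For variable heights, attach ref={virtualizer.measureElement} and data-index={item.index}
    // to each row so the library can measure and cache real heights.
  });

  return (
    <div ref={parentRef} style={{ height: 600, overflow: 'auto' }}>
      {/* spacer div keeps the total scrollable height accurate */}
      <div style={{ height: virtualizer.getTotalSize(), position: 'relative' }}>
        {virtualizer.getVirtualItems().map((item) => (
          <div
            key={item.key}
            style={{ position: 'absolute', top: 0, width: '100%', height: item.size, transform: `translateY(${item.start}px)` }}
          >
            {rows[item.index].name /* placeholder: render your row content here */}
          </div>
        ))}
      </div>
    </div>
  );
}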

Rows, columns, or both?

Most virtualization tutorials focus on rows. But if you have 50+ columns, column virtualization matters too.

Row-only virtualization works when:

  • You have fewer than ~30 columns
  • Column widths are fixed or easily calculated
  • Horizontal scrolling is minimal

Column virtualization becomes necessary when:

  • You have wide tables with many columns (I've worked on tables with 100+ columns)
  • Users horizontally scroll frequently
  • Column content is expensive to render (formatted numbers, status badges, nested components)

Both (2D virtualization) is the most complex but necessary for truly large grids. The tradeoff is complexity—you're now managing two scroll axes, and features like frozen columns or sticky headers get harder.

My rule: start with row virtualization only. Add column virtualization when you can measure a performance problem, not before. The added complexity isn't free.
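
When you do get there, the shape with @tanstack/react-virtual is roughly two virtualizers sharing one scroll container. A sketch only (the sizes and renderCell are placeholders):
// One virtualizer per axis, both reading from the same scroll container.
const rowVirtualizer = useVirtualizer({
  count: rowCount,
  getScrollElement: () => parentRef.current,
  estimateSize: () => 36,   // assumed fixed row height
});

const columnVirtualizer = useVirtualizer({
  horizontal: true,         // virtualize along the x axis
  count: columnCount,
  getScrollElement: () => parentRef.current,
  estimateSize: () => 120,  // assumed fixed column width
});

// Render only the intersection: visible rows x visible columns.
const cells = rowVirtualizer.getVirtualItems().flatMap((row) =>
  columnVirtualizer.getVirtualItems().map((col) => renderCell(row, col))
);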

The re-render storm problem

Virtualization solves the "too many DOM nodes" problem but creates a new one: every scroll event potentially triggers re-renders. Scroll 10 rows down, and you might unmount 10 rows at the top and mount 10 at the bottom. If each row render is expensive, scrolling feels janky.

What triggers re-render storms:

  1. Inline object/array props — Every render creates new references
// Bad: new object every render
<Cell style={{ color: row.status === 'active' ? 'green' : 'red' }} />

// Better: stable reference
const cellStyle = useMemo(() => getStyleForStatus(row.status), [row.status]);

  2. Uncontrolled context updates — A context that updates on scroll will re-render every consumer

  3. Row components that depend on table-level state — If every row reads from a "selectedRows" Set that changes frequently, every row re-renders when selection changes

How to bound updates:

  • Memoize row components — React.memo with a custom comparison function that only compares row data, not callbacks
  • Isolate selection state — Don't pass the entire selection Set to rows. Pass an isSelected boolean computed per-row
  • Batch state updates — If multiple cells update, batch into one state change
  • Use CSS for visual states — Selection highlighting via CSS classes is cheaper than conditional rendering
const Row = React.memo(({ data, isSelected, onSelect }) => {
  return (
    <tr className={isSelected ? 'row-selected' : ''}>
      {/* cells */}
    </tr>
  );
}, (prev, next) => {
  return prev.data === next.data && prev.isSelected === next.isSelected;
});

The custom comparison ignores onSelect because it's a stable callback. One caveat: useCallback with empty deps is stable but can capture stale state or props. If your callback needs the current row or selection, use a ref-based "latest handler" (update a ref in useEffect, call ref.current inside the callback) or a small useEvent-style hook so the callback always sees fresh values.
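
Here's a minimal sketch of that pattern (useLatestHandler is an illustrative name, not a React API; toggleSelection stands in for whatever your selection logic is):
import { useCallback, useEffect, useRef } from 'react';

// Stable callback identity that always calls the latest closure.
function useLatestHandler(handler) {
  const ref = useRef(handler);
  useEffect(() => {
    ref.current = handler; // re-point the ref at the newest handler after every render
  });
  return useCallback((...args) => ref.current(...args), []); // identity never changes
}

// Usage: rows memoized on data/isSelected can safely ignore onSelect in their comparison.
const onSelect = useLatestHandler((rowId) => toggleSelection(rowId));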

Loading states that don't feel broken

With 200k rows, you're not loading everything upfront. You're paginating, fetching with cursors, or streaming. The UX challenge is making partial data feel complete.

Skeletons vs. spinners:

  • Skeletons work when you know the shape of incoming data (row count, column layout)
  • Spinners work for unknown quantities but feel worse for tables—users expect to see structure

What I've found works:

  1. Load the first page fast, then fetch ahead — Show 100 rows immediately. While the user reads, fetch the next 500 in the background. By the time they scroll, data is ready (see the fetch-ahead sketch below).

  2. Placeholder rows, not empty space — If the user scrolls faster than data loads, show skeleton rows at the expected height. Empty space breaks the mental model of a continuous table.

  3. Scroll position restoration — If the user navigates away and returns, restore their scroll position. This is harder than it sounds with virtualized lists because you need to restore to a row index, not just a pixel offset. Store either row index or scrollTop depending on your virtualization API. Restore via the library's scrollToOffset or scrollToIndex when the list mounts.

  4. Incremental loading indicators — A small progress bar at the top or bottom showing "Loading rows 1000-2000..." is better than nothing. Users understand the table is still loading without feeling like it's frozen.

// Scroll restoration: persist scrollTop (or a row index), depending on your virtualization API.
// Sketch assuming a @tanstack/react-virtual-style `virtualizer`; for a plain container, read/write listRef.current.scrollTop instead.
const savedScroll = Number(sessionStorage.getItem('tableScroll') ?? 0);
useEffect(() => {
  virtualizer.scrollToOffset(savedScroll); // on mount: restore (or scrollToIndex(rowIndex) if you saved an index)
  return () => sessionStorage.setItem('tableScroll', String(virtualizer.scrollOffset ?? 0)); // on unmount: save
}, []);
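
The fetch-ahead trigger can be as simple as watching the last rendered row index. A rough sketch (fetchPage, pageSize, loadedCount, and isFetching are placeholders for your own data layer):
// Prefetch the next page when the user gets within a buffer of the loaded data.
const lastVisible = virtualizer.getVirtualItems().at(-1)?.index ?? 0;

useEffect(() => {
  const nearEnd = lastVisible > loadedCount - 100; // 100-row buffer, tune to taste
  if (nearEnd && !isFetching) {
    fetchPage(loadedCount / pageSize + 1);         // kick off the next page in the background
  }
}, [lastVisible, loadedCount, isFetching]);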

Sorting and filtering: server vs. client

The instinct is to do everything client-side once data is loaded. It feels faster. But with 200k rows, "client-side" has real costs.

Push to the server when:

  • Full data isn't loaded yet (obvious)
  • Sorting/filtering criteria match indexed database columns
  • The operation is expensive (multi-column sort, fuzzy text search)
  • You need consistent behavior regardless of what's cached

Keep client-side when:

  • All data is already in memory
  • The operation is fast (boolean filter, single-column sort on small dataset)
  • You want instant feedback without network latency
  • Offline support matters

The hybrid approach I use:

  1. Server-side for initial query — Sorting and filtering are URL parameters. Page load respects them.
  2. Client-side for refinement — If the user has 1000 rows loaded and applies an additional filter, do it locally.
  3. Debounce and indicate — For text search, debounce input by 300ms. Show a subtle "filtering..." indicator so users know something is happening (sketch after the code below).
const [serverFilters, setServerFilters] = useState(initialFilters);
const [localFilters, setLocalFilters] = useState({});

const filteredData = useMemo(() => {
  // Apply local filters to server-fetched data
  return applyFilters(serverData, localFilters);
}, [serverData, localFilters]);

const handleFilterChange = (filter) => {
  if (shouldPushToServer(filter)) {
    setServerFilters(prev => ({ ...prev, ...filter }));
  } else {
    setLocalFilters(prev => ({ ...prev, ...filter }));
  }
};
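
For the debounce step, a minimal sketch that layers on top of handleFilterChange above (the 300ms window and the indicator flag are the ones mentioned in the list):
// Debounced text search with a lightweight "filtering..." indicator.
const [query, setQuery] = useState('');
const [isFiltering, setIsFiltering] = useState(false);

useEffect(() => {
  setIsFiltering(true);
  const id = setTimeout(() => {
    handleFilterChange({ text: query }); // routed to server or client by shouldPushToServer
    setIsFiltering(false);
  }, 300);
  return () => clearTimeout(id);
}, [query]);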

The perceived performance game

Raw performance matters, but perceived performance often matters more. A table that loads in 2 seconds but shows progress feels faster than one that loads in 1.5 seconds with a blank screen.

Techniques that help:

  • Prioritize above-the-fold — Render the first 20 rows before anything else. Users see content immediately.
  • Delay heavy columns — If a column requires expensive computation (parsing dates, formatting currencies), render a placeholder first and fill in asynchronously (sketch below).
  • Animate loading — A shimmer effect on skeleton rows suggests activity. Static gray boxes feel stuck.
  • Respond to intent — If the user hovers near the scroll area, start prefetching the next page. Don't wait for the scroll event.
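
For the delay-heavy-columns point, one option is to paint the raw value immediately and fill in the expensive formatting when the browser is idle. A sketch (requestIdleCallback isn't available in Safari, so you'd fall back to setTimeout there):
const currencyFormatter = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });

// Show the raw value right away; swap in the formatted version when the browser has spare time.
function DeferredCurrencyCell({ value }) {
  const [formatted, setFormatted] = useState(null);
  useEffect(() => {
    const id = requestIdleCallback(() => setFormatted(currencyFormatter.format(value)));
    return () => cancelIdleCallback(id);
  }, [value]);
  return <td>{formatted ?? value}</td>;
}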

Accessibility considerations

Virtualization can break accessibility if you're not careful.

  • Keep row semantics — Use proper <table>, <tr>, <td> elements, not just divs. Screen readers understand tables.
  • Announce dynamic content — When new rows load, announce it via an ARIA live region, as sketched below (subtly—don't spam the user).
  • Keyboard navigation — Arrow keys should move focus between cells. This is hard with virtualization because the target cell might not be rendered yet. Focus management needs to account for this.
  • Don't break find-in-page — Browser ctrl+F won't find text in unrendered rows. This is a known limitation. Consider adding an in-app search as an alternative.
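
For the announcement point, a polite live region is usually enough. A sketch (the visually-hidden class and the loading counters are assumed from your own loading state):
// Announce newly loaded rows without stealing focus or spamming assistive tech.
<div aria-live="polite" className="visually-hidden">
  {isLoadingMore ? `Loaded ${loadedCount} of ${totalCount} rows` : ''}
</div>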

Gotchas that bite in production

  • Sticky header + virtualization — The virtual list usually only manages the body. Use a separate header layer pinned outside the scroll container so the header stays fixed while the body scrolls.
  • Dynamic column widths — Measure once (e.g. on first render or via a resize observer); avoid recalculating on every scroll or you get layout thrash.
  • Selection across unloaded rows — Store selected row IDs, not indices. When data is paginated or filtered, indices shift; IDs stay correct.
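
The ID-based selection from that last point is only a few lines. A sketch:
// Selection keyed by stable row IDs survives re-sorting, filtering, and pagination.
const [selectedIds, setSelectedIds] = useState(() => new Set());

const toggleRow = (id) =>
  setSelectedIds((prev) => {
    const next = new Set(prev);
    next.has(id) ? next.delete(id) : next.add(id);
    return next;
  });

// Per-row: pass a boolean, not the whole Set (see the memoization section above).
const isSelected = selectedIds.has(row.id);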

Wrapping up

Virtualizing large tables is less about picking the right library and more about navigating the tradeoffs: complexity vs. performance, server vs. client, raw speed vs. perceived speed. The specifics depend on your data shape, user workflows, and what "fast enough" means for your app.

Start simple—row virtualization with a proven library. Measure where users experience slowness. Add complexity (column virtualization, aggressive prefetching, hybrid filtering) only when you can point to a real problem it solves.

The goal isn't a table that handles 200k rows. It's a table that feels fast at 200k rows, loads what users need before they ask for it, and never leaves them staring at a spinner wondering if something broke.
