The Problem With Data Visualization
Most data visualization tools make the same tradeoff: legibility over feeling. You get accurate charts, clean axes, correctly labelled data points. You understand the data intellectually. But you don't feel it.
There's a category of insight that comes not from reading a chart but from navigating a space — from having scale, distance, and topology work on you over time. Cartographers understood this long before computers: a well-made map doesn't just communicate information, it creates a relationship between the reader and the territory.
Cartograph is an attempt to apply that philosophy to arbitrary datasets.
The Core Idea: Terrain as Data
The central interaction is this: drop in a CSV, and the data becomes topography. Numeric columns map to elevation. Categorical columns map to biome — color, material, and texture variations that give different data regions distinct visual identities. Time-series data animates as geological change, the landscape shifting and reforming as you scrub through.
The interface is a first-person camera navigating this terrain. You don't click on a data point to see its value; you walk toward it. Its apparent size in your field of view tells you something about its relative magnitude. Its material tells you which category it belongs to. Its neighborhood tells you which other points it correlates with.
Rendering Architecture
The Geometry Problem: 500K Points at 60fps
The naive approach to rendering a large dataset in WebGL is to create one mesh object per data point. At 1,000 points that works fine. At 100,000 points, you're making 100,000 draw calls per frame, and the CPU-to-GPU communication overhead destroys frame rate.
The solution is InstancedMesh, a Three.js primitive that submits all instances of a geometry in a single draw call, passing per-instance data (position, scale, color) through InstancedBufferAttribute buffers, one per attribute. The GPU then processes the instances in parallel.
For Cartograph, each data point becomes an instance. The vertex shader receives position (X=feature 1, Z=feature 2, Y=numeric magnitude), color (categorical encoding), and a scale factor. The fragment shader applies the biome material using a custom color map that interpolates between six predefined biome palettes.
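As a rough sketch of the per-instance layout described above, the following packs positions, scales, and colors into flat typed arrays, which is the shape an InstancedBufferAttribute expects. The `DataPoint` type, the `packInstances` helper, and the field names are hypothetical and chosen for illustration; they are not Cartograph's actual code, and the Three.js wiring appears only in a comment.

```typescript
// Hypothetical shape of one already-normalized data point.
interface DataPoint {
  x: number;                       // feature 1 -> world X
  z: number;                       // feature 2 -> world Z
  magnitude: number;               // numeric magnitude -> world Y
  color: [number, number, number]; // RGB in [0, 1] from the categorical encoding
  scale: number;                   // per-instance scale factor
}

interface InstanceBuffers {
  positions: Float32Array; // 3 floats per instance
  scales: Float32Array;    // 1 float per instance
  colors: Float32Array;    // 3 floats per instance
}

// Pack all points into contiguous arrays so they can be handed to the GPU in bulk.
function packInstances(points: DataPoint[]): InstanceBuffers {
  const n = points.length;
  const positions = new Float32Array(n * 3);
  const scales = new Float32Array(n);
  const colors = new Float32Array(n * 3);
  points.forEach((p, i) => {
    positions.set([p.x, p.magnitude, p.z], i * 3);
    scales[i] = p.scale;
    colors.set(p.color, i * 3);
  });
  return { positions, scales, colors };
}

// With Three.js, each array would back one per-instance attribute, for example:
//   geometry.setAttribute("instancePosition",
//     new THREE.InstancedBufferAttribute(buffers.positions, 3));
```

Keeping each attribute in its own flat array means the whole dataset can be rewritten in place when a new CSV is loaded, without allocating one object per point.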
// Vertex shader (simplified)
attribute vec3 instancePosition;
attribute float instanceScale;
attribute vec3 instanceColor;

varying vec3 vColor;
varying float vFogFactor;

void main() {
  vColor = instanceColor;

  // Scale the base geometry, then offset it to this instance's position
  vec4 localPos = vec4(position * instanceScale + instancePosition, 1.0);
  // cameraPosition is in world space, so fog distance is measured there too
  vec4 worldPos = modelMatrix * localPos;

  // Distance-based fog for depth perception: starts at 80 units, full at 200
  float fogDist = length(worldPos.xyz - cameraPosition);
  vFogFactor = clamp((fogDist - 80.0) / 120.0, 0.0, 1.0);

  gl_Position = projectionMatrix * viewMatrix * worldPos;
}
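To make the fog constants in the shader concrete, here is the same ramp written as a small TypeScript function. The names `FOG_START`, `FOG_RANGE`, and `fogFactor` are illustrative only; how the fragment shader consumes `vFogFactor` is not shown in the snippet above, so that part is left out.

```typescript
// Mirrors clamp((fogDist - 80.0) / 120.0, 0.0, 1.0) from the vertex shader.
const FOG_START = 80;  // distance at which fog begins
const FOG_RANGE = 120; // distance over which fog ramps from 0 to 1

function fogFactor(distance: number): number {
  const t = (distance - FOG_START) / FOG_RANGE;
  return Math.min(1, Math.max(0, t));
}

// Sample the ramp at a few camera distances.
for (const d of [40, 80, 140, 200, 300]) {
  console.log(d, fogFactor(d));
}
```

Points closer than 80 units are fog-free, a point 140 units away is half fogged, and everything at 200 units or beyond is fully fogged.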
Normal computation for the height-map terrain mesh (the continuous surface underneath the data points) runs in a WebAssembly module, which keeps recalculation short enough that it does not stall the main thread when datasets change.
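The source does not say how the normals are derived, so the following is only one common approach: central differences over a height grid. It is written in plain TypeScript for readability rather than as the WebAssembly module itself, and `computeHeightmapNormals` and its parameters are hypothetical names.

```typescript
// heights: row-major grid of width * depth samples.
// cellSize: world-space spacing between neighbouring samples.
// Returns 3 floats (nx, ny, nz) per grid sample, normalized.
function computeHeightmapNormals(
  heights: Float32Array,
  width: number,
  depth: number,
  cellSize: number
): Float32Array {
  const normals = new Float32Array(width * depth * 3);

  // Sample with clamped indices so border cells reuse their own height.
  // This slightly underestimates slope along the border of the grid.
  const h = (x: number, z: number): number => {
    const cx = Math.min(width - 1, Math.max(0, x));
    const cz = Math.min(depth - 1, Math.max(0, z));
    return heights[cz * width + cx];
  };

  for (let z = 0; z < depth; z++) {
    for (let x = 0; x < width; x++) {
      // Height change across two cells in each horizontal direction.
      const dx = h(x + 1, z) - h(x - 1, z);
      const dz = h(x, z + 1) - h(x, z - 1);

      // For a surface y = h(x, z) the normal is proportional to
      // (-dh/dx, 1, -dh/dz); multiplying through by 2 * cellSize
      // avoids dividing the differences.
      const nx = -dx;
      const ny = 2 * cellSize;
      const nz = -dz;
      const len = Math.hypot(nx, ny, nz);

      const i = (z * width + x) * 3;
      normals[i] = nx / len;
      normals[i + 1] = ny / len;
      normals[i + 2] = nz / len;
    }
  }
  return normals;
}
```

A flat grid yields normals pointing straight up, and a grid rising one unit per cell along X tilts the interior normals 45 degrees back toward negative X.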
D3 as Data Infrastructure, Not Renderer
Most D3 tutorials start by treating D3 as a charting library: calling d3.select, appending SVG elements, binding data. That's not how Cartograph uses D3.
D3's actual value is in its mathematical and statistical utilities: scales, projections, Voronoi diagrams, force simulations, color interpolation, contour generation. Cartograph uses D3 purely as a data transformation and coordinate mapping layer, with Three.js handling all rendering.
The pipeline for a new dataset:
- Schema inference: D3's csvParse plus type coercion detection classifies each column as numeric, categorical, temporal, or geographic
- Scale construction: d3.scaleLinear, d3.scaleOrdinal, and d3.scaleSequential with domain/range computed from the data
- Normalization: All numeric features normalized to [0, 1] for consistent spatial mapping
- Outlier handling: Values beyond 3σ are clamped and visually flagged with an emission shader override
- Coordinate computation: (x, y, z) world positions computed for all instances, written into Float32Array buffers
- Buffer upload: A single BufferAttribute.set() call copies all instance data into the attribute's array, and setting needsUpdate flags it for upload to GPU memory on the next render
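The normalization, outlier, and coordinate steps above can be sketched without any library. This is an illustration under assumptions, not Cartograph's pipeline: the helper names are hypothetical, min-max normalization stands in for a d3.scaleLinear with a [0, 1] range, 3σ is measured from the column mean using the population standard deviation, and the steps run in the order the list gives them.

```typescript
// Min-max normalize a column to [0, 1]. A constant column maps to 0.5 (arbitrary choice).
function normalize(values: number[]): number[] {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  const span = max - min;
  return values.map((v) => (span === 0 ? 0.5 : (v - min) / span));
}

// Clamp values to mean ± 3σ and report which rows were clamped,
// so the renderer can flag them (for example with an emissive override).
function clampOutliers(values: number[]): { clamped: number[]; flagged: boolean[] } {
  const n = values.length;
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const variance = values.reduce((a, b) => a + (b - mean) ** 2, 0) / n;
  const std = Math.sqrt(variance);
  const lo = mean - 3 * std;
  const hi = mean + 3 * std;
  return {
    clamped: values.map((v) => Math.min(hi, Math.max(lo, v))),
    flagged: values.map((v) => v < lo || v > hi),
  };
}

// Build interleaved (x, y, z) positions: X = feature 1, Y = magnitude, Z = feature 2.
function buildPositions(
  feature1: number[],
  feature2: number[],
  magnitude: number[]
): { positions: Float32Array; flagged: boolean[] } {
  const xs = normalize(feature1);
  const zs = normalize(feature2);
  const { clamped: ys, flagged } = clampOutliers(normalize(magnitude));
  const positions = new Float32Array(magnitude.length * 3);
  for (let i = 0; i < magnitude.length; i++) {
    positions[i * 3] = xs[i];
    positions[i * 3 + 1] = ys[i];
    positions[i * 3 + 2] = zs[i];
  }
  return { positions, flagged };
}
```

One consequence of normalizing before clamping is that a single extreme value still compresses the rest of the column toward zero; clamping first would trade that for a different distortion, and the source does not say which tradeoff it prefers beyond the order listed.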
The full pipeline for a 100K-row dataset runs in ~180ms on modern hardware.
The Camera: Navigation as Analysis
A significant design investment went into the camera system, because navigation is the analysis in Cartograph. What you see as you approach a cluster, what you notice when you turn around, what emerges as you gain altitude — these are not incidental to the experience, they are the insight mechanism.
Three camera modes:
Exploration mode is a first-person WASD+mouse controller. You navigate the terrain like you'd explore a game environment. Momentum and inertia make movement feel physical. Raycasting detects when you're looking at a data point and surfaces its raw values in a minimal HUD overlay.
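One simple way to get the momentum and inertia described here is to accelerate while a key is held and let velocity decay exponentially otherwise. The sketch below assumes that model; `ACCELERATION`, `DAMPING`, and `stepCamera` are hypothetical names and values, not taken from Cartograph.

```typescript
interface Vec3 { x: number; y: number; z: number; }

interface CameraState { position: Vec3; velocity: Vec3; }

// Hypothetical tuning constants.
const ACCELERATION = 40; // world units per second squared while a key is held
const DAMPING = 4;       // per second; higher values stop the camera sooner

// Advance the camera by dt seconds. `input` is the desired movement direction
// from WASD (components in [-1, 1]); a zero vector means no key is held.
function stepCamera(state: CameraState, input: Vec3, dt: number): CameraState {
  // Exponential decay keeps the damping roughly consistent across frame rates.
  const decay = Math.exp(-DAMPING * dt);
  const velocity: Vec3 = {
    x: (state.velocity.x + input.x * ACCELERATION * dt) * decay,
    y: (state.velocity.y + input.y * ACCELERATION * dt) * decay,
    z: (state.velocity.z + input.z * ACCELERATION * dt) * decay,
  };
  const position: Vec3 = {
    x: state.position.x + velocity.x * dt,
    y: state.position.y + velocity.y * dt,
    z: state.position.z + velocity.z * dt,
  };
  return { position, velocity };
}
```

Releasing the keys does not stop the camera immediately: it keeps gliding while the velocity decays, which is what makes the movement read as physical.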
Cinematic mode is triggered when you select a data point or cluster. The camera executes a smooth Bezier-path fly-to animation, approaching from a calculated "best angle" that maximizes the visual separation of the selected point from its neighbors. This is the equivalent of a zoom-to in a traditional chart, but it preserves spatial context throughout the transition.
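The fly-to path can be illustrated with a cubic Bezier curve. The degree of the curve, the easing function, and how the control points are chosen from the "best angle" are all assumptions here; the source only says the path is a smooth Bezier.

```typescript
interface Vec3 { x: number; y: number; z: number; }

// Evaluate a cubic Bezier curve at t in [0, 1].
// p0 is the camera's start, p3 the destination, p1 and p2 shape the arc.
function cubicBezier(p0: Vec3, p1: Vec3, p2: Vec3, p3: Vec3, t: number): Vec3 {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
    z: a * p0.z + b * p1.z + c * p2.z + d * p3.z,
  };
}

// Smoothstep easing so the camera starts and ends the move gently.
function easeInOut(t: number): number {
  return t * t * (3 - 2 * t);
}

// Camera position at a given fraction of the animation's duration.
function flyToPosition(p0: Vec3, p1: Vec3, p2: Vec3, p3: Vec3, progress: number): Vec3 {
  const clamped = Math.min(1, Math.max(0, progress));
  return cubicBezier(p0, p1, p2, p3, easeInOut(clamped));
}
```

Raising the two middle control points above the terrain makes the camera arc up and over intervening data rather than cutting through it, which is one way to preserve spatial context during the transition.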
Orbit mode locks the camera to a user-defined focal point and lets you rotate around it freely. This is the primary mode for examining dense clusters — you can circle a group of related data points to understand their 3D distribution.
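Orbiting reduces to placing the camera on a sphere around the focal point. The sketch below uses spherical coordinates with the polar angle measured from the +Y axis and the azimuth measured around Y from +Z; `orbitPosition` and its parameter names are hypothetical.

```typescript
interface Vec3 { x: number; y: number; z: number; }

// Camera position on a sphere of the given radius around `focus`.
// polar: angle down from the +Y axis, in radians (0 = directly above).
// azimuth: angle around the Y axis from +Z, in radians.
function orbitPosition(focus: Vec3, radius: number, azimuth: number, polar: number): Vec3 {
  const sinPolar = Math.sin(polar);
  return {
    x: focus.x + radius * sinPolar * Math.sin(azimuth),
    y: focus.y + radius * Math.cos(polar),
    z: focus.z + radius * sinPolar * Math.cos(azimuth),
  };
}

// Straight-line distance between two points, handy for checking the orbit radius.
function distance(a: Vec3, b: Vec3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}
```

Dragging the mouse would change `azimuth` and `polar` while the radius and focal point stay fixed, so the cluster stays centered as the camera circles it.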
Post-Processing Pipeline
Raw WebGL rendering of the terrain looks correct but not cinematic. The post-processing stack is where the visual identity comes from:
- SSAO (Screen-Space Ambient Occlusion): Adds subtle contact shadows where data points sit close to the terrain surface, grounding them visually
- Bloom: Bright data points (high-magnitude outliers) emit a subtle luminescence, making them pop from the landscape
- Depth of Field: Background elements blur when the camera focuses on a foreground cluster, using cinematic focus cueing to guide attention
- Film grain: A subtle noise layer on the final output prevents the "too clean" look of raw GPU rendering
All post-processing runs through @react-three/postprocessing (EffectComposer), which merges compatible effects into shared passes instead of running one full-screen pass per effect, to minimize GPU overhead.
Results
50,000+ datasets have been processed through Cartograph since public beta launch. The use cases that emerged were more diverse than anticipated: academic researchers mapping citation networks, urban planners visualizing geospatial population data, a product team using it to navigate user behavior clusters that previously lived in flat CSV exports.
The feature that generated the most feedback was the temporal animation mode — watching a dataset's topology change over time as you scrub through a time series. Several users described it as "the first time I actually understood the trend" — because seeing the landscape physically shift is a different cognitive experience from reading a trend line.
Stack
- Rendering: Three.js r160 + React-Three-Fiber 8 + Drei
- Data: D3.js 7 (pipeline only — no SVG rendering)
- Shaders: GLSL (custom vertex + fragment per material)
- Post-processing: @react-three/postprocessing (SSAO, Bloom, DOF, film grain)
- Performance: InstancedMesh + WebAssembly normal computation
- Framework: React 18 + Next.js 14 (App Router)
- State: Zustand + Jotai (camera/selection state split)