The Problem With t-SNE and UMAP
t-SNE and UMAP are the workhorses of high-dimensional visualisation. They produce beautiful, cluster-separated scatter plots that have become ubiquitous in machine learning papers, bioinformatics, and data exploration. But they share a fundamental limitation: O(N²) pairwise comparisons.
For a dataset of 10,000 points in 50 dimensions, that means 100 million distance calculations — before the iterative optimisation even begins. In a Jupyter notebook, this is a minor annoyance. In a browser-based tool where users expect immediate feedback, it is a hard wall.
Both methods are also stochastic: run them twice, get different layouts. They distort global geometry to preserve local neighbourhoods. And they provide no mechanism for embedding new points without re-running the entire computation.
The Core Idea: Landmarks, Not Pairwise
Instead of comparing every point to every other point, SLR compares each point to a small, fixed set of landmarks. For N points and k landmarks (typically ~50), this reduces complexity from O(N²) to O(N × k) — effectively linear when k is small.
The key insight is that you don't need all pairwise relationships to reconstruct a useful low-dimensional layout. If you know how far each point is from a well-placed set of reference points, you can triangulate its position analytically.
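To make the cost difference concrete, here is a minimal NumPy sketch (variable names are illustrative) that computes point-to-landmark squared distances with the standard ‖x‖² − 2x·ℓ + ‖ℓ‖² expansion. The largest array it ever materialises is N × k, not N × N:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 50))                     # N points, 50 dimensions
L = X[rng.choice(len(X), size=50, replace=False)]   # k = 50 landmarks

# ||x - l||^2 = ||x||^2 - 2 x.l + ||l||^2  -- one (N, k) matrix, no N x N table
D2 = (X**2).sum(axis=1)[:, None] - 2.0 * X @ L.T + (L**2).sum(axis=1)[None, :]
D2 = np.maximum(D2, 0.0)   # clip tiny negatives from floating-point error
```

An all-pairs distance table for the same data would need N² entries; here both the work and the memory scale with N × k.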
Two Landmark Strategies
SLR supports two approaches to choosing landmarks:
Synthetic Sine Skeleton. Generates deterministic landmarks using independent sine waves per dimension. Each landmark coordinate is defined by:

γ(t) = a·sin(ω·t + φ)

where the amplitude a, frequency ω, and phase φ are drawn uniformly per dimension, and t parameterises position along the skeleton. This produces smooth, well-distributed coverage through high-dimensional space, completely independent of the data. The same landmarks work for any dataset.
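A minimal sketch of the sine skeleton; the uniform sampling ranges here are illustrative assumptions, not the reference implementation:

```python
import numpy as np

def sine_landmarks(n_landmarks, n_features, rng):
    """Deterministic sine-skeleton landmarks, shape (k, D) (sketch;
    the parameter ranges are assumptions)."""
    a = rng.uniform(0.5, 1.5, size=n_features)          # amplitude per dimension
    omega = rng.uniform(0.5, 4.0, size=n_features)      # frequency per dimension
    phi = rng.uniform(0.0, 2 * np.pi, size=n_features)  # phase per dimension
    t = np.linspace(0.0, 2 * np.pi, n_landmarks)        # position along the curve
    # gamma(t) = a * sin(omega * t + phi), one independent sine per dimension
    return a[None, :] * np.sin(omega[None, :] * t[:, None] + phi[None, :])

L = sine_landmarks(50, 20, np.random.RandomState(42))
```

Because the generator is seeded, the same landmark set is reproduced on every run, which is what makes the method deterministic.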
Data-Derived Landmarks. Selects k random points from the dataset itself. Uses hybrid normalisation: raw selection preserves cluster structure, normalised computation ensures fair feature weighting.
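The hybrid normalisation might be sketched like this: landmarks are picked from the raw rows, and both data and landmarks are then standardised with the same statistics (the details here are assumptions):

```python
import numpy as np

def data_landmarks(X, k, rng):
    """Pick k raw rows as landmarks (raw selection preserves cluster
    structure), then standardise data and landmarks with the same
    statistics so every feature is weighted fairly (sketch)."""
    idx = rng.choice(len(X), size=k, replace=False)
    landmarks = X[idx]                                # raw selection
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
    return (X - mu) / sigma, (landmarks - mu) / sigma

Xn, Ln = data_landmarks(np.random.RandomState(0).randn(1000, 20), 50,
                        np.random.RandomState(42))
```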
The Algorithm
SLR runs in four phases. Each is non-iterative and has a closed-form solution.
Phase 1: Landmark Projection
Project the k high-dimensional landmarks into the target dimensionality (2D or 3D) using PCA. Then scale the projected coordinates so that the geometric spread in low-D matches the spread in high-D:

P ← P · (spread_highD / spread_lowD)

where the spread is, e.g., the mean pairwise distance between landmarks. This preserves the relative scale of the landmark constellation across dimensions.
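Phase 1 might be sketched as follows; using the mean pairwise landmark distance as the "spread" measure is an assumption:

```python
import numpy as np

def project_landmarks(L, m=2):
    """PCA-project landmarks L of shape (k, D) to m dimensions, then
    rescale so the low-D spread matches the high-D spread (sketch)."""
    Lc = L - L.mean(axis=0)
    _, _, Vt = np.linalg.svd(Lc, full_matrices=False)  # PCA via SVD
    P = Lc @ Vt[:m].T                                  # (k, m) projection

    def spread(A):  # mean pairwise distance
        return np.linalg.norm(A[:, None] - A[None, :], axis=2).mean()

    return P * (spread(Lc) / spread(P))

P = project_landmarks(np.random.RandomState(42).randn(50, 20))
```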
Phase 2: Linearised Trilateration

For each data point x, compute the squared distance δ²_j to each landmark j. The ideal embedding y satisfies:

‖y − ℓ_j‖² = δ²_j,  j = 0, …, k−1

This is non-linear in y. But subtracting the equation for landmark 0 from all the others eliminates the non-linear yᵀy term:

2(ℓ_j − ℓ_0)ᵀy = δ²_0 − δ²_j + ‖ℓ_j‖² − ‖ℓ_0‖²,  j = 1, …, k−1

This yields a linear system Ay = b, with row j of A equal to 2(ℓ_j − ℓ_0)ᵀ, solved via the Moore-Penrose pseudoinverse. The matrix A⁺ is precomputed once for all points, making the per-point cost a simple matrix multiply.
For a batch of N points:
- Compute k squared distances per point: O(N × k)
- Matrix multiply by precomputed A⁺: O(N × k × m)
- Total: linear in N
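Phase 2 as a sketch, with shapes assumed as follows: D2 is the (N, k) matrix of squared high-D distances, P the (k, m) projected landmarks:

```python
import numpy as np

def trilaterate(D2, P):
    """Linearised trilateration for a batch of points (sketch).
    Row j of A is 2*(l_j - l_0); b_j = d0^2 - dj^2 + ||l_j||^2 - ||l_0||^2."""
    A = 2.0 * (P[1:] - P[0])              # (k-1, m)
    A_pinv = np.linalg.pinv(A)            # precomputed once for all points
    n2 = (P**2).sum(axis=1)               # ||l_j||^2 for each landmark
    B = (D2[:, :1] - D2[:, 1:]) + (n2[1:] - n2[0])[None, :]   # (N, k-1)
    return B @ A_pinv.T                   # (N, m): one matrix multiply per batch
```

When the distances are exactly realisable in m dimensions this recovers the coordinates exactly; with distances measured in high-D it returns the least-squares solution.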
Phase 3: Alpha Refinement

A global scalar α corrects residual scale mismatches between the high-D distances and the low-D embedding. A least-squares fit of the embedded point-landmark distances d_j against the high-D distances δ_j gives:

α = (Σ δ_j·d_j) / (Σ δ²_j)

The trilateration is then recomputed with corrected distances:

δ²_corrected = α² × δ²

This is a single pass, no iteration.
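A sketch of the refinement step; treating α as the least-squares scale between the high-D distances and the current embedded distances is an assumption:

```python
import numpy as np

def alpha_refine(D2, P, Y):
    """One-pass scale correction (sketch): fit a single alpha matching
    high-D distances to the current low-D layout, then return
    delta^2_corrected = alpha^2 * delta^2 for re-trilateration."""
    d_low = np.linalg.norm(Y[:, None, :] - P[None, :, :], axis=2)  # (N, k)
    d_high = np.sqrt(np.maximum(D2, 0.0))
    alpha = (d_high * d_low).sum() / (d_high**2).sum()  # least-squares scale
    return alpha**2 * D2
```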
Phase 4: Distance Warping

A tunable parameter p interpolates between global geometry and local structure by raising distances to a power:

δ_warped = δ^p

At p = 1.0, the embedding preserves global distances faithfully. At p ≈ 0.33, it produces t-SNE-like cluster separation, compressing large distances and expanding small ones. The user controls the trade-off with a single slider.
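The warp as a sketch; normalising by the mean distance so the power law pivots around the typical scale is an assumption:

```python
import numpy as np

def warp(D2, p=1.0):
    """Power-law distance warp (sketch): delta_warped = delta**p,
    applied around the mean distance so p < 1 compresses large
    distances and expands small ones. Returns squared distances."""
    d = np.sqrt(np.maximum(D2, 0.0))
    scale = d.mean()
    return (scale * (d / scale) ** p) ** 2
```

At p = 1 the distances pass through unchanged; lowering p shrinks the dynamic range while preserving the ordering of distances.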
Implementation
The core class in Python:
```python
import numpy as np

class SineLandmarkReduction:
    def __init__(self, n_components=2, n_landmarks=50,
                 random_state=42, synthetic_landmarks=False):
        self.n_components = n_components
        self.k = n_landmarks
        self.synthetic_landmarks = synthetic_landmarks
        self.rng = np.random.RandomState(random_state)

    def _gamma(self, t, a, omega, phi):
        """Synthetic sine path: γ(t) = a·sin(ω·t + φ)"""
        return a[:, None] * np.sin(
            omega[:, None] * t + phi[:, None]
        )
```
The embedding pipeline:
- Standardise input features (zero mean, unit variance)
- Generate synthetic landmarks or select k random data points
- PCA-project landmarks to target dimensions (2D or 3D)
- Precompute A and its pseudoinverse A⁺
- For each batch: compute distances → build b → solve Ay = b
- Apply alpha refinement
- Return low-dimensional coordinates
Usage:
```python
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=5000, n_features=20,
                  centers=5, cluster_std=0.80,
                  random_state=42)

slr = SineLandmarkReduction(
    n_components=2, n_landmarks=50,
    random_state=42, synthetic_landmarks=False
)
Y = slr.fit_transform(X)
```
How It Compares
| Property | t-SNE | UMAP | SLR |
|---|---|---|---|
| Complexity | O(N²) or O(N log N) | O(N¹·⁷) | O(N × k) |
| Deterministic | No | No | Yes |
| Global geometry | Distorted | Partially preserved | Preserved (tunable) |
| Iterative | Yes (gradient descent) | Yes (SGD) | No (analytic) |
| Out-of-sample | No (requires re-run) | Partial (parametric UMAP) | Yes (precomputed A⁺) |
| Local clusters | Excellent | Excellent | Good (tunable via p) |
| Browser-native | Impractical | Impractical | Yes |
t-SNE produces stronger local cluster separation — that is its core strength. SLR trades some
local resolution for determinism, speed, and global geometry preservation. The warping parameter
p lets you slide between these trade-offs: lower values produce more t-SNE-like
layouts, higher values preserve faithful distances.
Why This Matters for the Browser
The entire algorithm requires only basic linear algebra and a pseudoinverse — operations that translate directly to JavaScript or WebAssembly. There are no gradient calculations, no learning rate schedules, no convergence criteria to tune.
This makes SLR uniquely suited for interactive data exploration in the browser. Users upload a CSV, and the visualisation appears in under two seconds — not after a loading spinner and a background computation.
SLR powers the 3D scatter plot in CSV X-Ray, Thingbook's drag-and-drop anomaly detection tool. When you upload a multi-feature dataset, the 3D embedding you see is computed client-side using this algorithm.
Limitations and Trade-offs
- Local resolution. For datasets where local neighbourhood structure is the primary concern (e.g., single-cell RNA-seq clustering), t-SNE or UMAP will produce tighter, more informative clusters.
- Landmark placement. The quality of the embedding depends on landmark coverage. For extremely non-uniform data distributions, data-derived landmarks may leave sparse regions under-represented.
- No learned representation. SLR is a geometric method, not a learned one. It does not capture non-linear manifold structure the way neural approaches (e.g., autoencoders) can.