
An overview of metric embeddings and metric repair (and why they matter)

Speaker: Anna Gilbert
Date of Talk: May 13, 2025
Upstream link: Distinguished Lecture Series

In the words of the speaker, this first talk is a high-level advertisement more than anything else.

“word2vec is the death of modern knowledge. Nobody knows anything except machine learning.”

Applications of Metric Embedding

A metric embedding is a function \(f : X\to Y\) between metric spaces such that for all \(x_1, x_2\in X\), \[d_X \left( x_1, x_2 \right) \sim d_Y \left( f \left( x_1 \right), f \left( x_2 \right) \right),\] where \(\sim\) means equality up to a bounded distortion factor; when the distortion is \(1\), this is an isometric embedding in the usual sense.

In many cases, \(X\) is a discrete set of objects that we are collecting data on, and \(d_X \left( x_1, x_2 \right)\) is some measurement of similarity between the experimental objects \(x_1\) and \(x_2\). Thus, there is interest in finding a metric embedding \(f : X\to \mathbb{R}^N\) with \(N\) very small.
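
To make the \(\sim\) concrete, here is a small sketch (my own illustration, not something from the talk) of how one might measure the multiplicative distortion of a candidate embedding of a finite metric space into \(\mathbb{R}^N\):

```python
import numpy as np

def distortion(D, Y):
    """Multiplicative distortion of a candidate embedding.

    D : (n, n) matrix of original distances d_X(x_i, x_j), assumed
        positive off the diagonal.
    Y : (n, N) matrix whose rows are the embedded points f(x_i),
        assumed pairwise distinct.
    Returns (max expansion) * (max contraction) over all pairs i < j;
    a value of 1 means the embedding is isometric.
    """
    n = D.shape[0]
    expansion, contraction = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_emb = np.linalg.norm(Y[i] - Y[j])
            expansion = max(expansion, d_emb / D[i, j])
            contraction = max(contraction, D[i, j] / d_emb)
    return expansion * contraction
```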

Many algorithms and problems (e.g. the travelling salesman problem) are much easier when there's Euclidean structure attached to the data, so running these algorithms on the \(\mathbb{R}^N\) side of the metric embedding often shakes out better.

Broken Metrics

What happens if your data can’t actually arise from a metric \(d_X\)? This can happen if your scientists are morons and transform their data in dumb ways, or if your data is noisy (or perhaps something else altogether). Rectifying this is the problem of metric repair (it seems). Lots of algorithms that use the principle of metric embedding (e.g. MDS, multidimensional scaling) are highly sensitive to this kind of noise.

In fact, classical implementations of MDS don’t quite solve the optimisation problem one might expect. As the dimension \(N\) of the target space \(\mathbb{R}^N\) increases, one would expect the implicit constants in the \(\sim\) to get better and better. However, the distortion attains a global minimum at a moderate value of \(N\); as \(N\to\infty\), the distortion factor gets wrecked.
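
For reference, “classical” MDS does not minimise the distortion directly: it double-centres the squared distance matrix and keeps the top eigenvectors of the resulting Gram matrix. Here’s a rough sketch of that procedure (my own, glossing over details that vary between implementations):

```python
import numpy as np

def classical_mds(D, N):
    """Embed points with pairwise distances D into R^N via classical MDS.

    Double-centre the squared distances to get a would-be Gram matrix,
    then keep the top-N eigenpairs.  This is a spectral heuristic, not a
    direct minimisation of the embedding distortion; negative eigenvalues
    (a symptom of non-Euclidean or broken data) simply get clipped.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # Gram matrix, if D were Euclidean
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:N]      # indices of the N largest eigenvalues
    lam = np.maximum(eigvals[top], 0.0)      # clip negative eigenvalues
    return eigvecs[:, top] * np.sqrt(lam)
```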

The problem of metric repair is thus to correct the faulty data so that they satisfy the triangle inequality (or, God forbid, nonnegativity) at all triplets of points.
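
As a concrete (if naive) illustration of “at all triplets of points”, here is a brute-force \(O\left(n^3\right)\) scan for triangle inequality violations; this is my own sketch, not an algorithm from the talk:

```python
import numpy as np
from itertools import combinations

def broken_triangles(D, tol=1e-12):
    """List every triplet whose distances violate the triangle inequality.

    D is a symmetric (n, n) matrix of candidate distances.  For each
    triangle {i, j, k}, check each of its three sides against the sum of
    the other two, and record the labelling (a, b, c) whenever the side
    ab is longer than the path through c.
    """
    n = D.shape[0]
    violations = []
    for i, j, k in combinations(range(n), 3):
        for a, b, c in ((i, j, k), (j, k, i), (k, i, j)):
            if D[a, b] > D[a, c] + D[c, b] + tol:
                violations.append((a, b, c))
    return violations
```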

Graph Algorithms

Many algorithms for metric repair have the unfortunate side-effect of perturbing a significant portion of the input data. This is undesirable, and one seeks constraints on what kinds of perturbations are allowed.

One idea is to minimise the number of perturbations altogether, and a plausible first step is to (efficiently) locate all failures of the triangle inequality, which are hopefully sparse to begin with. However, in 2010, Williams and Williams showed that, of the following three problems, either all three have subcubic algorithms, or none of them do:

  1. The all-pairs shortest paths problem,
  2. determining if a dataset really comes from a metric, and
  3. some other janky thing I didn’t catch.

Thus, it’s unlikely that a subcubic solution exists. But, this does indicate that the combinatorics of all-pairs shortest paths plays a role; indeed, the Floyd-Warshall algorithm gives a way to repair a metric by only shrinking broken triangles. In fact, it’s the solution that minimises the number of changes among those that only decrease distance data.
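
Here’s a minimal sketch of that decrease-only repair, assuming the broken data come as a symmetric matrix of candidate distances: treat the entries as edge weights of a complete graph and replace each one with the corresponding shortest-path distance (Floyd–Warshall), which can only shrink entries and always restores the triangle inequality.

```python
import numpy as np

def decrease_only_repair(D):
    """Decrease-only metric repair via Floyd-Warshall.

    Replace each entry of the symmetric matrix D with the shortest-path
    distance in the complete graph whose edge weights are given by D.
    Entries can only decrease, and the output satisfies the triangle
    inequality.
    """
    M = D.astype(float).copy()
    n = M.shape[0]
    for k in range(n):
        # relax every pair (i, j) through the intermediate vertex k
        M = np.minimum(M, M[:, [k]] + M[[k], :])
    return M
```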

This becomes interesting when we impose that distances may only increase: how does one pick which “broken leg” to lengthen? Whereas the choice is canonical for decrease-only solutions, there is no obvious analogue in the increase-only setting. Moreover, relaxing to convex constraints improves efficiency, but results in perturbations that are far from sparse.

As of 2022, there is an algorithm by Fan et al. that gets a (rather poor) approximation of the optimum in \(O\left( n^6 \right)\) time. How can this be improved? Stay tuned for talk 2….