Confidence Sets for Multivariate Gaussians

August 2024

For a standard Gaussian $\Normal(\mu, \sigma^2)$ , we have the confidence intervals

\begin{array}{rcl} p(\mu-1\sigma < x < \mu + 1\sigma) &\approx& 68.3\% \\ p(\mu-2\sigma < x < \mu + 2\sigma) &\approx& 95.5\% \\ p(\mu-3\sigma < x < \mu + 3\sigma) &\approx& 99.7\% \end{array}

What would those look like for a multivariate Gaussian, say to plot interpretable contours?

Gaussian Mixture Models, plotted using 3-sigma Figure 1. A Gaussian mixture model fit to the Old faithful dataset. The 3 contours indicate areas that are the 2d equivalent of the $1\sigma$ , $2\sigma$ and $3\sigma$ intervals in that they contain 68.3%, 95.5% and 99.7% of the probability mass.¹

We will build a similar result for multivariate Gaussians, in $d=2$ dimensions;

\begin{array}{rcl} p(\norm{x-\mu}_{\Sigma^{-1}} \leq 1.516...) \approx 68.3\%, \\ p(\norm{x-\mu}_{\Sigma^{-1}} \leq 2.490...) \approx 95.5\%, \\ p(\norm{x-\mu}_{\Sigma^{-1}} \leq 3.409...) \approx 99.7\%. \\ \end{array}

The general form is that $p(\norm{x-\mu}_{\Sigma^{-1}} \leq R) = c$ for $R = \sqrt{-2\log(1-c)}$ .

Confidence sets

The equivalent of the “confidence interval” in $d$ dimensions for a distribution $\Normal(\mu, \Sigma)$ would be a set $\mathcal{S}$ that is centered at $\mu$ and has the same contours as the sublevel sets of the density

\Normal(x; \mu, \Sigma) = \sqrt{\frac{1}{(2\pi)^d \det(\Sigma)}} \exp\left(-\frac{1}{2}\Vert x-\mu\Vert_{\Sigma^{-1}}^2\right).

We want this set to contain a given proportion $c$ of the probability mass, $p(x \in \mathcal{S}) = c$ .

The easy case

Starting with the easy case, let’s look at a standard multivariate Gaussian $x \sim \Normal(0, I)$ . Because symmetry, the confidence set will be a ball $\Vert x\Vert \leq R$ . The question is how to set the radius $R$ as a function of $c$ . We are looking for the solution of

\int_{\Vert x \Vert < R} \sqrt{\frac{1}{(2\pi)^d}} \exp\left(-\frac{1}{2}\Vert x\Vert^2\right) \mathrm{d}x = c.

To simplify the equation, we can integrate over the radius and the area separately,

\begin{aligned} c &= \int_{r=0}^R \int_{\Vert x \Vert = r} \sqrt{\frac{1}{(2\pi)^d}} \exp\left(-\frac{1}{2}r^2\right) \mathrm{d}x \mathrm{d}r, \\ &= \sqrt{\frac{1}{(2\pi)^d}} \int_{r=0}^R \exp\left(-\frac{1}{2} r^2\right) \left( \int_{\Vert x \Vert = r} 1 \mathrm{d}x \right) \mathrm{d}r. \end{aligned}

The integral in parentheses is the volume of the sphere of radius $r$ in $d$ dimensions,

\int_{\Vert x \Vert = r} 1 \mathrm{d}x = r^{d-1} \frac{2\sqrt{\pi^d}}{\Gamma(d/2)},

where $\Gamma$ is the Gamma function. Plugging this relationship in, we get

\begin{aligned} c &= \sqrt{\frac{1}{2^d}} \frac{2}{\Gamma(d/2)} \int_{r=0}^R r^{d-1} \exp\left(-\frac{1}{2} r^2\right) \mathrm{d}r. \end{aligned}

This equation has a surprisingly simple solution in $d=2$ dimensions, as

c = \underbrace{\sqrt{\frac{1}{4}} \frac{2}{\Gamma(1)}}_{=1} \int_{r=0}^R r \exp\left(-\frac{1}{2} r^2\right) \mathrm{d}r = 1 - \exp\left(-\frac{1}{2}R^2\right).

This gives the inverse relation $R = \sqrt{-2\log(1-c)}$ , and the thresholds

\begin{aligned} p(\norm{x} < 1.516) \approx 68.3\%, \\ p(\norm{x} < 2.490) \approx 95.5\%, \\ p(\norm{x} < 3.409) \approx 99.7\%. \\ \end{aligned}

Arbitrary Gaussians

For the general case of $x \sim \Normal(\mu, \Sigma)$ , the sets will be of the form $\Vert x-\mu\Vert_{\Sigma^{-1/2}} \leq R$ . Perhaps surprisingly, the formula for the radius $R$ is the same. To see this, we can use the reparametrization trick; if $x \sim \Normal(0, I)$ , then $z = Ax + \mu$ follows the distribution $z \sim \Normal(\mu, AA^\top)$ , meaning that

\mathbb{E}_{z \sim \Normal(\mu, \Sigma)}[f(z)] = \mathbb{E}_{x \sim \Normal(0, I)}[f(\Sigma^{-1/2}x + \mu)].

Writing the proportion in terms of the expectation of the indicator function,

c = \int_{\frac{1}{2}\norm{z-\mu}^2_{\Sigma^{-1}} < \frac{1}{2}R^2} \Normal(z; \mu, \Sigma) \mathrm{d}z = \mathbb{E}_{z \sim \Normal(\mu, \Sigma)}\brackets{ \ind{ \frac{1}{2}\norm{z-\mu}^2_{\Sigma^{-1}} < \frac{1}{2}R^2 } },

we can see the equivalence with the standard Gaussian case as

\begin{aligned} c &= \mathbb{E}_{z \sim \Normal(\mu, \Sigma)}\brackets{ \ind{ \frac{1}{2}\norm{z-\mu}^2_{\Sigma^{-1}} < \frac{1}{2}R^2 } }, \\ &= \mathbb{E}_{x \sim \Normal(0, I)}\brackets{ \ind{ \frac{1}{2}\norm{\Sigma^{1/2}x + \mu - \mu}^2_{\Sigma^{-1}} < \frac{1}{2}R^2 } }, \\ &= \mathbb{E}_{x \sim \Normal(0, I)}\brackets{ \ind{ \frac{1}{2}\norm{x}^2 < \frac{1}{2}R^2 } } = 1 - \exp\left(-\frac{1}{2}R^2\right). \end{aligned}

Higher dimensions

The simplicity of the formula for the relationship between $c$ and $R$ in $d=2$ dimensions is special. With $d=1$ , the probability depends on the $\mathrm{erf}$ function, and the situation gets worse with higher $d$ ;

\begin{aligned} \text{for } d = 1, & \quad c = \mathrm{erf}\paren{\frac{R}{\sqrt{2}}} = \int_{0}^{\frac{R}{\sqrt{2}}} \exp(-t^2) \mathrm{d}t, \\ \text{for } d = 2, & \quad c = 1 - \exp\paren{-\frac{1}{2}R^2}, \\ \text{for any } d, & \quad c = 1 - \frac{ \Gamma\paren{\frac{d}{2}, \frac{R^2}{2}} }{ \Gamma\paren{\frac{d}{2}} }. \end{aligned}

For the general case, we have to solve

\begin{aligned} c &= \sqrt{\frac{1}{2^d}} \frac{2}{\Gamma(d/2)} \int_{r=0}^R r^{d-1} \exp\left(-\frac{1}{2} r^2\right) \mathrm{d}r. \end{aligned}

The integral part has the following expression

\int_{r=0}^R r^{d-1} \exp\left(-\frac{1}{2} r^2\right) \mathrm{d}r = \frac{\sqrt{2^{d}}}{2} \paren{ \Gamma\paren{\frac{d}{2}} - \Gamma\paren{\frac{d}{2}, \frac{R^2}{2}} },

where $\Gamma(a, b)$ is the incomplete Gamma function. The constants simplify to give

\begin{aligned} c &= \sqrt{\frac{1}{2^d}} \frac{2}{\Gamma(d/2)} \frac{\sqrt{2^{d}}}{2} \paren{ \Gamma\paren{\frac{d}{2}} - \Gamma\paren{\frac{d}{2}, \frac{R^2}{2}} } = 1 - \frac{ \Gamma\paren{\frac{d}{2}, \frac{R^2}{2}} }{ \Gamma\paren{\frac{d}{2}} } \end{aligned}

Plotting code

As we started with a plotting question,

To finish, suppose we want to plot this 2d confidence region. One option would be to generate points on the boundary $\norm{x-\mu}_{\Sigma^{-1}} = R$ and ax.fill the resulting polygon. To generate those points, we can again use the reparametrization trick. For a given point $z = \Sigma^{1/2}x + \mu$ , $\norm{z - \mu}_{\Sigma^{-1}}^2 = R^2$ and $\norm{x}^2 = R^2$ are equivalent. We can thus generate points along $\norm{x} = R$ and by sampling an angle $\alpha$ between $0$ and $2\pi$ , compute $x = [\sin(\alpha),\cos(\alpha)]$ and rescale appropriately.

import numpy as np 
import scipy as sp

def contour(mu, Sigma, r=1.0, density=100):
    """
    Coordinates of the confidence region ‖x−𝜇‖_{𝛴⁻¹} = R
    """
    angles = np.linspace(0, 2 * np.pi, density)
    sin, cos = np.sin(angles), np.cos(angles)
    A = sp.linalg.sqrtm(Sigma) * r
    xs = A[0, 0] * sin + A[0, 1] * cos + mu[0]
    ys = A[1, 1] * cos + A[1, 0] * sin + mu[1]
    return xs, ys

We can now plot Gaussian densities

import matplotlib.pyplot as plt

fig = plt.figure()
fig.set_size_inches(6, 6)
ax = fig.add_subplot(111)


mu, Sigma = np.array([2, 2]), np.diag([1, 2])
for r in [1.515, 2.490, 3.409]:
    ax.fill(*contour(mu, Sigma, r=r), alpha=0.2, color="red")

mu, Sigma = np.array([-2, -2]), np.diag([2, 1])
for r in [1.515, 2.490, 3.409]:
    ax.fill(*contour(mu, Sigma, r=r), alpha=0.2, color="blue")

ax.set_xlim([-8, 8])
ax.set_ylim([-8, 8])

plt.show()

Script to reproduce Figure 1.

An alternative definition for the $1\sigma$ , $2\sigma$ and $3\sigma$ confidence sets could be the sets $\norm{x-\mu}_{\Sigma^{-1}} \leq 1, 2, 3$ . Those have the nice property that any one-dimensional slice passing through the center of the set recovers the one-dimensional confidence intervals. But those sets are smaller, and their probability mass is only 39.3%, 86.5% and 98.8%. ↩