Gaussian Distribution Integral

14 May, 2020

Introduction

The Gaussian Distribution is often represented as N(x;μ,σ)\large{\mathcal{N}(x; \mu, \sigma)} where μ\mu is the mean and σ\sigma is the standard deviation. The Probability Density Function (PDF) for a Gaussian distribution is defined as p(x)=12πσe12(xμσ)2(1) \Large{p(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}} \tag{1} Since it’s a PDF, p(x)dx=1(2) \Large{\int_{-\infty}^{\infty}{p(x)dx} = 1} \tag{2}

This post is about showing how the integral equals 1.

Laplace’s method

A convenient method for solving integrals of the form eMf(x)dx\large\int{e^{Mf(x)}dx} was first shown by Pierre-Simon Laplace in his Memoir on the Probability of the Cause of Events.

Using Laplace’s method, we’re going to find the value of I=ex2dx(3) \large I = \int_{-\infty}^\infty{e^{-x^2}dx} \tag{3}

First, let’s see what the plot for f(x)=ex2\large f(x) = e^{-x^2} looks like and come up with a rough estimate for the value.

 

Figure 1: Plot for ex2e^{-x^2}

The integral for f(x)f(x) is just the orange shaded area for xx from -\infty to \infty (Note that this plot only shows f(x)f(x) for xx from 2.5-2.5 to 2.52.5, but f(x)0 for x2.5 and x2.5f(x) \rightarrow 0 \text{ for } x \ge 2.5 \text{ and } x \le -2.5).

We can get a rough estimate for the area by drawing a triangle over the curve as shown with the Blue triangle in the above figure. The area of the triangle is 12×4×1=2\frac{1}{2}\times{4}\times{1} = 2. From the plot, it’s fair to assume that the integral is less than 2. So, our first guess for a rough upper bound of eq (3)eq\ (3) is 2\sim 2.

It’s hard to calculate the integral directly, so we’ll go through the steps of Laplace’s method. The first step of Laplace’s method is to multiply eq (3)eq\ (3) by itself.

I2=(ex2dx)2=(ex2dx)(ex2dx)=(ex2dx)(ey2dy)=(ex2)(ey2)dxdy=e(x2+y2)dxdy(4) \large \begin{aligned} I^2 &= \Bigg(\int_{-\infty}^\infty{e^{-x^2}dx}\Bigg)^2 \\ \\ &= \Bigg(\int_{-\infty}^\infty{e^{-x^2}dx}\Bigg)\Bigg(\int_{-\infty}^\infty{e^{-x^2}dx}\Bigg) \\ \\ &= \Bigg(\int_{-\infty}^\infty{e^{-x^2}dx}\Bigg)\Bigg(\int_{-\infty}^\infty{e^{-y^2}dy}\Bigg) \\ \\ &= \int_{-\infty}^\infty{\int_{-\infty}^\infty{(e^{-x^2})(e^{-y^2})dxdy}} \\ \\ &= \int_{-\infty}^\infty{\int_{-\infty}^\infty{e^{-(x^2+y^2)}dxdy}} \tag{4} \end{aligned}

 

Basically, I2I^2 is the volume under the surface of f(x,y)=e(x2+y2)f(x,y) = e^{-(x^2+y^2)}. Let’s take a look at the surface plot f(x,y)f(x, y) below:

 

From the plot, we can see that I2I^2 is the volume under the cone like surface. The surface is pretty much encompassed by a cone with the base centered at 0, radius r=2r = 2 and height h=1h = 1 (You can hover over the 3D plot to see circles emulating the base of the cone). The volume of the cone is then given by πr2h3=43π\pi r^2\frac{h}{3} = \frac{4}{3}\pi. So, our second guess for the upper bound of eq (3)eq\ (3) is 4π32\sqrt{\frac{4\pi}{3}} \sim 2, which is pretty close to the first guess.

It’s hard to calculate this double integral as well, so let’s move on to the second step of the method. Every point in the xyx-y plane can also be expressed in polar co-ordinates r,θr,\theta as x=rcosθy=rsinθ(5) x = rcos\theta\qquad y = rsin\theta \tag{5} r2=x2+y2(6) r^2 = x^2 + y^2 \tag{6}

The dxdydxdy in eq (4)eq\ (4) represents an infinitesimal area in the xyx-y plane. The same area in the polar plane is given by rdrdθrdrd\theta. Converting eq (4)eq\ (4) to polar co-ordinates, we get I2=002πer2rdrdθ=0er2rdr02πdθ=2π0er2rdrUsing the substitution  u=x2, we get du=2xdx=π0eudu=π(eu0)=π(7) \large \begin{aligned} I^2 &= \int_0^\infty{\int_0^{2\pi}{e^{-r^2}rdrd\theta}} \\ \\ &= \int_0^\infty{e^{-r^2}rdr\int_0^{2\pi}{d\theta}} \\ \\ &= 2\pi\int_0^\infty{e^{-r^2}rdr} \\ \\ &\text{Using the substitution }\ u = x^2 \text{, we get } du = 2xdx \\ \\ &= \pi\int_0^\infty{e^{-u}du} \\ \\ &= \pi(-e^{-u}{\Large|}_0^\infty) \\ \\ &= \pi\tag{7} \end{aligned}

Taking the square root of eq(7)eq (7), we get the value for eq(3)eq (3): I=π(8) \large I = \sqrt{\pi} \tag{8}

So the area under ex2e^{-x^2} is π1.772\sqrt{\pi}\approx1.772. Our upper bound guesses of 2 were a little far off, but gave a good geometrical intuition. Eq (8)Eq\ (8) can be generalized further in the following way: emx2dx=πm(9) \large \int_{-\infty}^\infty{e^{-mx^2}dx} = \sqrt{\frac{\pi}{m}} \tag{9}

Gaussian PDF Integral

We can now apply eq(9)eq(9) to eq(2)eq(2) as follows: I=12πσe12(xμσ)2dx=12πσ2e12σ2(xμ)2dxSubstituting xμ=u and using eq(9)=12πσ2π12σ2=1(10) \large \begin{aligned} I &= \int_{-\infty}^{\infty}{\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}dx} \\ \\ &= \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty}{e^{-\frac{1}{2\sigma^2}(x-\mu)^2}dx} \\ \\ &\text{Substituting } x - \mu = u \text{ and using } eq(9) \\ \\ &= \frac{1}{\sqrt{2\pi\sigma^2}}{\sqrt{\frac{\pi}{\frac{1}{2\sigma^2}}}} \\ \\ &= 1\tag{10} \end{aligned}

I hope the constant 12πσ2\frac{1}{\sqrt{2\pi\sigma^2}} in the PDF for the Gaussian Distribution now makes sense. Laplace’s method can also be used to show that E[x]=μE[x] = \mu and Var(x)=σ2Var(x) = \sigma^2. However, those integrals are a little bit more complicated than eq(10)eq(10).

Further Learning

In this post, we showed that the Gaussian Distribution function is indeed a probability density function. The Gaussian Distribution function also has many other amazing properties which make it a popular choice for many Machine Learning modeling tasks. To learn more about these properties, I recommend watching Probabilistic ML - Lecture 6 - Gaussian Distributions.

Remarks

I first learned about Laplace’s method from Gaussian Integrals. I’ve been trying to sharpen the mathematical skills needed for Machine Learning, and this was good practice for integration. This post also helped me come up with an easy process to embed interactive plotly figures. See the Hugo shortcode for plotly here - https://github.com/hemildesai/dev/blob/master/layouts/shortcodes/plotly.html.