Categories: Mathematics, Statistics, Stochastic analysis.

Kolmogorov equations

Consider the following general Itō diffusion \(X_t \in \mathbb{R}\), which is assumed to satisfy the conditions for unique existence on the entire time axis:

\[\begin{aligned} \dd{X}_t = f(X_t, t) \dd{t} + g(X_t, t) \dd{B_t} \end{aligned}\]
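
To get a feel for such a process, sample paths can be approximated numerically. Below is a minimal Python sketch of the Euler-Maruyama scheme, which is a standard discretization and not something taken from the derivation that follows. The function name, its signature, and the example drift and noise are all assumptions made for illustration:

```python
import math
import random


def euler_maruyama(f, g, x0, t0, t1, n_steps, rng):
    """Simulate one path of dX = f(X, t) dt + g(X, t) dB on [t0, t1].

    Returns the list of states at the n_steps + 1 grid times.
    The name and signature are illustrative, not from the text.
    """
    dt = (t1 - t0) / n_steps
    x, t = x0, t0
    path = [x]
    for _ in range(n_steps):
        # Brownian increment over one step: normal with mean 0, variance dt
        dB = rng.gauss(0.0, math.sqrt(dt))
        x = x + f(x, t) * dt + g(x, t) * dB
        t += dt
        path.append(x)
    return path


# Hypothetical example: mean-reverting drift with constant noise intensity
rng = random.Random(0)
path = euler_maruyama(lambda x, t: -x, lambda x, t: 0.5, 1.0, 0.0, 1.0, 1000, rng)
```

Setting the noise to zero turns the scheme into the ordinary forward Euler method for \(\dd{X}_t = f \dd{t}\), which is a convenient way to check an implementation.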

Let \(\mathcal{F}_t\) be the filtration to which \(X_t\) is adapted. We then define \(Y_s\) as shown below, namely as the conditional expectation of \(h(X_t)\), for an arbitrary bounded function \(h(x)\), given the information \(\mathcal{F}_s\) available at time \(s \le t\). Because \(X_t\) is a Markov process, \(Y_s\) must be \(X_s\)-measurable, so it is a function \(k\) of \(X_s\) and \(s\):

\[\begin{aligned} Y_s \equiv \mathbf{E}[h(X_t) | \mathcal{F}_s] = \mathbf{E}[h(X_t) | X_s] = k(X_s, s) \end{aligned}\]
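
To make the definition of \(k\) concrete, the following Python sketch estimates \(k(x, s)\) by Monte Carlo in a special case chosen purely for illustration: standard Brownian motion (\(f = 0\), \(g = 1\)) with the bounded function \(h(y) = 1\) for \(y > 0\) and \(h(y) = 0\) otherwise. All function names are hypothetical, and the closed form used for comparison holds only for this particular choice:

```python
import math
import random


def estimate_k(h, x, s, t, n_samples, rng):
    """Monte Carlo estimate of k(x, s) = E[h(X_t) | X_s = x] for Brownian motion."""
    tau = t - s
    total = 0.0
    for _ in range(n_samples):
        # For Brownian motion, X_t - X_s is normal with mean 0 and variance t - s
        total += h(x + rng.gauss(0.0, math.sqrt(tau)))
    return total / n_samples


def indicator_positive(y):
    """Bounded test function h(y) = 1 if y > 0, else 0."""
    return 1.0 if y > 0.0 else 0.0


def exact_k(x, s, t):
    """Closed form for this h: P(x + N(0, t - s) > 0) = Phi(x / sqrt(t - s))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * (t - s))))


rng = random.Random(42)
estimate = estimate_k(indicator_positive, 0.3, 0.5, 1.5, 20000, rng)
exact = exact_k(0.3, 0.5, 1.5)
```

Here `estimate` and `exact` should agree up to sampling error. The estimate depends only on the starting point \(x\) and the time \(s\), as expected for a function \(k(X_s, s)\).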

Consequently, we can apply Itō’s lemma to find \(\dd{Y_s}\) in terms of \(k\), \(f\) and \(g\):

\[\begin{aligned} \dd{Y_s} &= \bigg( \pdv{k}{s} + \pdv{k}{x} f + \frac{1}{2} \pdv[2]{k}{x} g^2 \bigg) \dd{s} + \pdv{k}{x} g \dd{B_s} \\ &= \bigg( \pdv{k}{s} + \hat{L} k \bigg) \dd{s} + \pdv{k}{x} g \dd{B_s} \end{aligned}\]

Where we have defined the linear operator \(\hat{L}\) to have the following action on \(k\):

\[\begin{aligned} \hat{L} k \equiv \pdv{k}{x} f + \frac{1}{2} \pdv[2]{k}{x} g^2 \end{aligned}\]

At this point, we need to realize that \(Y_s\) is a martingale with respect to \(\mathcal{F}_s\), since \(Y_s\) is \(\mathcal{F}_s\)-adapted, has a finite expectation because \(h\) is bounded, and satisfies the martingale property, for \(r \le s \le t\):

\[\begin{aligned} \mathbf{E}[Y_s | \mathcal{F}_r] = \mathbf{E}\Big[ \mathbf{E}[h(X_t) | \mathcal{F}_s] \Big| \mathcal{F}_r \Big] = \mathbf{E}\big[ h(X_t) \big| \mathcal{F}_r \big] = Y_r \end{aligned}\]

Where we used the tower property of conditional expectations, because \(\mathcal{F}_r \subset \mathcal{F}_s\). However, an Itō diffusion can only be a martingale if its drift term (the one containing \(\dd{s}\)) vanishes, so, looking at \(\dd{Y_s}\), we must demand that:

\[\begin{aligned} \pdv{k}{s} + \hat{L} k = 0 \end{aligned}\]
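
As a quick sanity check (an illustrative special case, not part of the derivation), take standard Brownian motion, i.e. \(f = 0\) and \(g = 1\), with the bounded function \(h(x) = \cos(x)\). Since \(X_t - X_s\) is then Gaussian with mean zero and variance \(t - s\), we find:

\[\begin{aligned} k(x, s) = \mathbf{E}[\cos(X_t) | X_s = x] = \cos(x) \exp\!\Big( \!-\! \frac{t - s}{2} \Big) \end{aligned}\]

Which indeed satisfies the above equation, because the two terms cancel:

\[\begin{aligned} \pdv{k}{s} = \frac{1}{2} \cos(x) \exp\!\Big( \!-\! \frac{t - s}{2} \Big) \qquad \quad \hat{L} k = \frac{1}{2} \pdv[2]{k}{x} = - \frac{1}{2} \cos(x) \exp\!\Big( \!-\! \frac{t - s}{2} \Big) \end{aligned}\]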

Because \(X_t\) is a Markov process, we can write \(k\) in terms of its transition density \(p(s, X_s; t, X_t)\), where in this case \(s\) and \(X_s\) are given initial conditions, \(t\) is a parameter, and the terminal state \(X_t\) is a random variable. We thus have:

\[\begin{aligned} k(x, s) = \int_{-\infty}^\infty p(s, x; t, y) \: h(y) \dd{y} \end{aligned}\]

We insert this into the equation that we just derived for \(k\), yielding:

\[\begin{aligned} 0 = \int_{-\infty}^\infty \!\! \Big( \pdv{s} p(s, x; t, y) + \hat{L} p(s, x; t, y) \Big) h(y) \dd{y} \end{aligned}\]

Because \(h\) is arbitrary, and this must be satisfied for all \(h\), the transition density \(p\) fulfills:

\[\begin{aligned} 0 = \pdv{s} p(s, x; t, y) + \hat{L} p(s, x; t, y) \end{aligned}\]

Here, \(t\) is a known parameter and \(y\) is a “known” integration variable, leaving only \(s\) and \(x\) as free variables for us to choose. We therefore define the likelihood function \(\psi(s, x)\), which gives the likelihood of an initial condition \((s, x)\) given that the terminal condition is \((t, y)\):

\[\begin{aligned} \boxed{ \psi(s, x) \equiv p(s, x; t, y) } \end{aligned}\]

And from the above derivation, we conclude that \(\psi\) satisfies the following PDE, known as the backward Kolmogorov equation:

\[\begin{aligned} \boxed{ - \pdv{\psi}{s} = \hat{L} \psi = f \pdv{\psi}{x} + \frac{1}{2} g^2 \pdv[2]{\psi}{x} } \end{aligned}\]
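
The backward equation can be checked numerically in a case where the transition density is known in closed form. The Python sketch below assumes constant coefficients \(f = \mu\) and \(g = \sigma\) (an assumption made here for illustration), for which \(p\) is Gaussian in \(y\) with mean \(x + \mu (t - s)\) and variance \(\sigma^2 (t - s)\). It evaluates \(\pdv*{\psi}{s} + \hat{L} \psi\) with central finite differences. The function names and the step size `eps` are hypothetical choices:

```python
import math


def transition_density(s, x, t, y, mu, sigma):
    """p(s, x; t, y) for dX = mu dt + sigma dB with constant mu and sigma."""
    tau = t - s
    var = sigma**2 * tau
    return math.exp(-(y - x - mu * tau)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def backward_residual(s, x, t, y, mu, sigma, eps=1e-3):
    """Central-difference estimate of dpsi/ds + f dpsi/dx + (g^2 / 2) d2psi/dx2."""
    def psi(s_, x_):
        # The terminal condition (t, y) is held fixed; only (s, x) vary
        return transition_density(s_, x_, t, y, mu, sigma)

    dpsi_ds = (psi(s + eps, x) - psi(s - eps, x)) / (2.0 * eps)
    dpsi_dx = (psi(s, x + eps) - psi(s, x - eps)) / (2.0 * eps)
    d2psi_dx2 = (psi(s, x + eps) - 2.0 * psi(s, x) + psi(s, x - eps)) / eps**2
    return dpsi_ds + mu * dpsi_dx + 0.5 * sigma**2 * d2psi_dx2


residual = backward_residual(s=0.2, x=0.1, t=1.2, y=0.7, mu=0.5, sigma=0.8)
```

The value of `residual` should be zero up to finite-difference error, while the individual terms are not small.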

Moving on, we can define the traditional probability density function \(\phi(t, y)\) from the transition density \(p\), by fixing the initial \((s, x)\) and leaving the terminal \((t, y)\) free:

\[\begin{aligned} \boxed{ \phi(t, y) \equiv p(s, x; t, y) } \end{aligned}\]

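With this in mind, for \((s, x) = (0, X_0)\), the unconditional expectation \(\mathbf{E}[Y_t]\) (i.e. the conditional expectation without information) will be constant in time, because \(Y_t\) is a martingale. Here \(t\) denotes the running time at which \(k\) is evaluated, not the fixed terminal time at which \(h\) is evaluated: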

\[\begin{aligned} \mathbf{E}[Y_t] = \mathbf{E}[k(X_t, t)] = \int_{-\infty}^\infty k(y, t) \: \phi(t, y) \dd{y} = \braket{k}{\phi} = \mathrm{const} \end{aligned}\]

This integral has the form of an inner product, so we switch to Dirac notation. We differentiate with respect to \(t\), and use the backward equation \(\pdv*{k}{t} + \hat{L} k = 0\):

\[\begin{aligned} 0 = \pdv{t} \braket{k}{\phi} = \braket{k}{\pdv{\phi}{t}} + \braket{\pdv{k}{t}}{\phi} = \braket{k}{\pdv{\phi}{t}} - \braket{\hat{L} k}{\phi} = \braket{k}{\pdv{\phi}{t} - \hat{L}{}^\dagger \phi} \end{aligned}\]

Where \(\hat{L}{}^\dagger\) is by definition the adjoint operator of \(\hat{L}\), which we calculate using integration by parts. All boundary terms vanish thanks to the existence of \(X_t\): since \(X_t\) cannot reach infinity at any finite \(t\), the integrand must decay to zero for \(|y| \to \infty\):

\[\begin{aligned} \braket{\hat{L} k}{\phi} &= \int_{-\infty}^\infty \bigg( \pdv{k}{y} f \phi + \frac{1}{2} \pdv[2]{k}{y} g^2 \phi \bigg) \dd{y} \\ &= \bigg[ k f \phi + \frac{1}{2} \pdv{k}{y} g^2 \phi \bigg]_{-\infty}^\infty - \int_{-\infty}^\infty \bigg( k \pdv{y}(f \phi) + \frac{1}{2} \pdv{k}{y} \pdv{y}(g^2 \phi) \bigg) \dd{y} \\ &= \bigg[ -\frac{1}{2} k \pdv{y}(g^2 \phi) \bigg]_{-\infty}^\infty + \int_{-\infty}^\infty \bigg( - k \pdv{y}(f \phi) + \frac{1}{2} k \pdv[2]{y}(g^2 \phi) \bigg) \dd{y} \\ &= \int_{-\infty}^\infty k \: \big( \hat{L}{}^\dagger \phi \big) \dd{y} = \braket{k}{\hat{L}{}^\dagger \phi} \end{aligned}\]

Since \(k\) is arbitrary, and \(\pdv*{\braket{k}{\phi}}{t} = 0\) for all \(k\), we thus arrive at the forward Kolmogorov equation, describing the evolution of the probability density \(\phi(t, y)\):

\[\begin{aligned} \boxed{ \pdv{\phi}{t} = \hat{L}{}^\dagger \phi = - \pdv{y}(f \phi) + \frac{1}{2} \pdv[2]{y}(g^2 \phi) } \end{aligned}\]
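
Unlike in \(\hat{L}\), the coefficients \(f\) and \(g^2\) now sit inside the derivatives, which matters as soon as they depend on the state. The Python sketch below checks this numerically for a mean-reverting process with \(f = -\theta y\) and \(g = \sigma\) (an Ornstein-Uhlenbeck process, chosen here as an assumption because its density is a known Gaussian). The function names and the step size `eps` are hypothetical:

```python
import math


def ou_density(t, y, x0, theta, sigma):
    """phi(t, y) for dX = -theta X dt + sigma dB, started at X_0 = x0."""
    mean = x0 * math.exp(-theta * t)
    var = sigma**2 * (1.0 - math.exp(-2.0 * theta * t)) / (2.0 * theta)
    return math.exp(-(y - mean)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_residual(t, y, x0, theta, sigma, eps=1e-3):
    """Central-difference estimate of dphi/dt + d(f phi)/dy - (1/2) d2(g^2 phi)/dy2."""
    def phi(t_, y_):
        return ou_density(t_, y_, x0, theta, sigma)

    def drift_times_phi(y_):
        # f(y) * phi(t, y) with f(y) = -theta * y; the drift sits inside the derivative
        return -theta * y_ * phi(t, y_)

    dphi_dt = (phi(t + eps, y) - phi(t - eps, y)) / (2.0 * eps)
    d_fphi_dy = (drift_times_phi(y + eps) - drift_times_phi(y - eps)) / (2.0 * eps)
    # g = sigma is constant here, so g^2 can be pulled out of the second derivative
    d2phi_dy2 = (phi(t, y + eps) - 2.0 * phi(t, y) + phi(t, y - eps)) / eps**2
    return dphi_dt + d_fphi_dy - 0.5 * sigma**2 * d2phi_dy2


residual = forward_residual(t=0.8, y=0.4, x0=1.0, theta=1.5, sigma=0.7)
```

The value of `residual` should vanish up to finite-difference error, as the forward Kolmogorov equation requires.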

This can be rewritten in a way that highlights the connection between Itō diffusions and physical diffusion, if we define the diffusivity \(D\), advection \(u\), and probability flux \(J\):

\[\begin{aligned} D \equiv \frac{1}{2} g^2 \qquad \quad u \equiv f - \pdv{D}{x} \qquad \quad J \equiv u \phi - D \pdv{\phi}{x} \end{aligned}\]

Such that the forward Kolmogorov equation takes the following conservative form, so called because it looks like a physical continuity equation:

\[\begin{aligned} \boxed{ \pdv{\phi}{t} = - \pdv{J}{x} = - \pdv{x} \Big( u \phi - D \pdv{\phi}{x} \Big) } \end{aligned}\]

Note that if \(u = 0\), then this reduces to Fick’s second law. The backward Kolmogorov equation can also be rewritten analogously, although it is less noteworthy:

\[\begin{aligned} \boxed{ - \pdv{\psi}{s} = u \pdv{\psi}{x} + \pdv{x} \Big( D \pdv{\psi}{x} \Big) } \end{aligned}\]
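
As an illustration, consider geometric Brownian motion, \(f = \mu x\) and \(g = \sigma x\), where the constants \(\mu\) and \(\sigma\) are introduced here purely as an example. The definitions of \(D\) and \(u\) then give:

\[\begin{aligned} D = \frac{1}{2} \sigma^2 x^2 \qquad \quad u = \mu x - \sigma^2 x = (\mu - \sigma^2) x \end{aligned}\]

So the advection \(u\) differs from the drift \(f\) whenever \(g\) depends on \(x\). For this example, the forward and backward equations respectively become:

\[\begin{aligned} \pdv{\phi}{t} &= - \pdv{x} \Big( (\mu - \sigma^2) x \phi - \frac{1}{2} \sigma^2 x^2 \pdv{\phi}{x} \Big) \\ - \pdv{\psi}{s} &= (\mu - \sigma^2) x \pdv{\psi}{x} + \pdv{x} \Big( \frac{1}{2} \sigma^2 x^2 \pdv{\psi}{x} \Big) \end{aligned}\]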

Notice that the diffusivity term looks the same in both the forward and backward equations; we say that diffusion is self-adjoint.
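
The practical benefit of the conservative form is that a numerical scheme built on the flux \(J\) preserves total probability by construction. The Python sketch below is one possible explicit finite-volume discretization, offered as an assumption-laden illustration rather than a recommended solver: the grid, the time step, the zero-flux boundaries, and the example \(f = -x\), \(g = 1\) are all hypothetical choices.

```python
def step_conservative(phi, centers, dx, dt, u, D):
    """One explicit step of dphi/dt = -dJ/dx with J = u phi - D dphi/dx.

    phi holds cell averages; the flux J is evaluated at the cell faces and
    set to zero at both ends of the grid (no probability enters or leaves).
    """
    n = len(phi)
    flux = [0.0] * (n + 1)  # flux[i] sits on the face between cells i - 1 and i
    for i in range(1, n):
        x_face = 0.5 * (centers[i - 1] + centers[i])
        phi_face = 0.5 * (phi[i - 1] + phi[i])
        dphi_dx = (phi[i] - phi[i - 1]) / dx
        flux[i] = u(x_face) * phi_face - D(x_face) * dphi_dx
    return [phi[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]


def simulate(n_cells=80, x_min=-4.0, x_max=4.0, dt=0.004, n_steps=1000):
    """Hypothetical setup: f = -x and g = 1, so D = 1/2 and u = f - dD/dx = -x."""
    dx = (x_max - x_min) / n_cells
    centers = [x_min + (i + 0.5) * dx for i in range(n_cells)]
    # All probability starts in a single cell whose center is closest to x = 1
    start = min(range(n_cells), key=lambda i: abs(centers[i] - 1.0))
    phi = [0.0] * n_cells
    phi[start] = 1.0 / dx
    for _ in range(n_steps):
        phi = step_conservative(phi, centers, dx, dt, lambda x: -x, lambda x: 0.5)
    return centers, phi, dx


centers, phi, dx = simulate()
total_mass = sum(phi) * dx
mean = sum(x * p for x, p in zip(centers, phi)) * dx
variance = sum((x - mean)**2 * p for x, p in zip(centers, phi)) * dx
```

Because each interior flux is subtracted from one cell and added to its neighbour, `total_mass` stays at one up to rounding. For this example, `mean` decays towards zero and `variance` approaches the stationary value \(g^2 / 2 = 1/2\) of the mean-reverting process.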


© Marcus R.A. Newman, a.k.a. "Prefetch". Available under CC BY-SA 4.0.