Categories: Mathematics, Statistics, Stochastic analysis.

Kolmogorov equations

Consider the following general Itō diffusion $X_t \in \mathbb{R}$, which is assumed to satisfy the conditions for unique existence on the entire time axis:

$$\begin{aligned}
    \dd{X}_t = f(X_t, t) \dd{t} + g(X_t, t) \dd{B_t}
\end{aligned}$$
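As a concrete illustration (my own, not from the text), such an SDE can be simulated with the Euler–Maruyama scheme; here we make the hypothetical choices $f(x, t) = -x$ and $g(x, t) = 1$, i.e. an Ornstein–Uhlenbeck process, whose stationary variance $g^2/2 = 0.5$ gives us something to check against:

```python
import numpy as np

# Euler-Maruyama discretization of  dX_t = f(X_t,t) dt + g(X_t,t) dB_t.
# The drift/noise functions below are example choices (OU process), not
# anything prescribed by the derivation.
def euler_maruyama(f, g, x0, t_end, n_steps, n_paths, rng):
    dt = t_end / n_steps
    x = np.full(n_paths, x0, dtype=float)
    for i in range(n_steps):
        t = i * dt
        dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increment
        x += f(x, t) * dt + g(x, t) * dB
    return x

rng = np.random.default_rng(0)
samples = euler_maruyama(lambda x, t: -x, lambda x, t: np.ones_like(x),
                         x0=1.0, t_end=5.0, n_steps=500, n_paths=20000, rng=rng)
# By t = 5 the process is nearly stationary: mean -> 0, variance -> 0.5
print(samples.mean(), samples.var())
```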

Let $\mathcal{F}_t$ be the filtration to which $X_t$ is adapted. We then define $Y_s$ as the conditional expectation of $h(X_t)$, for an arbitrary bounded function $h(x)$, given the information $\mathcal{F}_s$ available at time $s \le t$. Because $X_t$ is a Markov process, $Y_s$ must be $X_s$-measurable, so it is a function $k$ of $X_s$ and $s$:

$$\begin{aligned}
    Y_s \equiv \mathbf{E}[h(X_t) | \mathcal{F}_s] = \mathbf{E}[h(X_t) | X_s] = k(X_s, s)
\end{aligned}$$

Consequently, we can apply Itō's lemma to find $\dd{Y_s}$ in terms of $k$, $f$ and $g$:

$$\begin{aligned}
    \dd{Y_s}
    &= \bigg( \pdv{k}{s} + \pdv{k}{x} f + \frac{1}{2} \pdvn{2}{k}{x} g^2 \bigg) \dd{s} + \pdv{k}{x} g \dd{B_s}
    \\
    &= \bigg( \pdv{k}{s} + \hat{L} k \bigg) \dd{s} + \pdv{k}{x} g \dd{B_s}
\end{aligned}$$

Where we have defined the linear operator $\hat{L}$ to have the following action on $k$:

$$\begin{aligned}
    \hat{L} k \equiv \pdv{k}{x} f + \frac{1}{2} \pdvn{2}{k}{x} g^2
\end{aligned}$$

At this point, we need to realize that $Y_s$ is a martingale with respect to $\mathcal{F}_s$, since $Y_s$ is $\mathcal{F}_s$-adapted and finite, and it satisfies the martingale property, for $r \le s \le t$:

$$\begin{aligned}
    \mathbf{E}[Y_s | \mathcal{F}_r]
    = \mathbf{E}\Big[ \mathbf{E}[h(X_t) | \mathcal{F}_s] \Big| \mathcal{F}_r \Big]
    = \mathbf{E}\big[ h(X_t) \big| \mathcal{F}_r \big]
    = Y_r
\end{aligned}$$

Where we used the tower property of conditional expectations, because $\mathcal{F}_r \subset \mathcal{F}_s$. However, an Itō diffusion can only be a martingale if its drift term (the one containing $\dd{s}$) vanishes, so, looking at $\dd{Y_s}$, we must demand that:

$$\begin{aligned}
    \pdv{k}{s} + \hat{L} k = 0
\end{aligned}$$
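As a sanity check (my example, not from the text): for standard Brownian motion ($f = 0$, $g = 1$) and $h(x) = x^2$, the conditional expectation is known to be $k(x, s) = \mathbf{E}[X_t^2 | X_s = x] = x^2 + (t - s)$, and the drift condition indeed holds:

```python
import sympy as sp

x, s, t = sp.symbols('x s t')
# k(x, s) = E[X_t^2 | X_s = x] for Brownian motion (f = 0, g = 1)
k = x**2 + (t - s)
# Drift of dY_s: dk/ds + L k, where here L k = (1/2) d^2k/dx^2
drift = sp.diff(k, s) + sp.Rational(1, 2) * sp.diff(k, x, 2)
print(sp.simplify(drift))  # 0, so Y_s is driftless, i.e. a martingale
```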

Because $X_t$ is a Markov process, we can write $k$ in terms of a transition density $p(s, X_s; t, X_t)$, where in this case $s$ and $X_s$ are given initial conditions, $t$ is a parameter, and the terminal state $X_t$ is a random variable. We thus have:

$$\begin{aligned}
    k(x, s) = \int_{-\infty}^\infty p(s, x; t, y) \: h(y) \dd{y}
\end{aligned}$$

We insert this into the equation that we just derived for $k$, yielding:

$$\begin{aligned}
    0 = \int_{-\infty}^\infty \!\! \Big( \pdv{}{s}p(s, x; t, y) + \hat{L} p(s, x; t, y) \Big) h(y) \dd{y}
\end{aligned}$$

Because $h$ is arbitrary, and this must be satisfied for all $h$, the transition density $p$ fulfills:

$$\begin{aligned}
    0 = \pdv{}{s}p(s, x; t, y) + \hat{L} p(s, x; t, y)
\end{aligned}$$

Here, $t$ is a known parameter and $y$ is a "known" integration variable, leaving only $s$ and $x$ as free variables for us to choose. We therefore define the likelihood function $\psi(s, x)$, which gives the likelihood of an initial condition $(s, x)$ given that the terminal condition is $(t, y)$:

$$\begin{aligned}
    \boxed{
        \psi(s, x) \equiv p(s, x; t, y)
    }
\end{aligned}$$

And from the above derivation, we conclude that $\psi$ satisfies the following PDE, known as the backward Kolmogorov equation:

$$\begin{aligned}
    \boxed{
        - \pdv{\psi}{s} = \hat{L} \psi = f \pdv{\psi}{x} + \frac{1}{2} g^2 \pdvn{2}{\psi}{x}
    }
\end{aligned}$$
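For instance (an illustrative check of mine, not part of the text): for Brownian motion ($f = 0$, $g = 1$), the transition density is the Gaussian heat kernel, and it satisfies the backward equation in the initial variables $(s, x)$:

```python
import sympy as sp

s, x, t, y = sp.symbols('s x t y')
# Transition density of Brownian motion: a heat kernel in (y - x), (t - s)
psi = sp.exp(-(y - x)**2 / (2*(t - s))) / sp.sqrt(2*sp.pi*(t - s))
# Backward Kolmogorov equation: -dpsi/ds = (1/2) d^2psi/dx^2
residual = sp.diff(psi, s) + sp.Rational(1, 2) * sp.diff(psi, x, 2)
print(sp.simplify(residual))  # 0
```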

Moving on, we can define the traditional probability density function $\phi(t, y)$ from the transition density $p$, by fixing the initial $(s, x)$ and leaving the terminal $(t, y)$ free:

$$\begin{aligned}
    \boxed{
        \phi(t, y) \equiv p(s, x; t, y)
    }
\end{aligned}$$

With this in mind, for $(s, x) = (0, X_0)$, the unconditional expectation $\mathbf{E}[Y_t]$ (i.e. the conditional expectation without information) will be constant in time, because $Y_t$ is a martingale:

$$\begin{aligned}
    \mathbf{E}[Y_t]
    = \mathbf{E}[k(X_t, t)]
    = \int_{-\infty}^\infty k(y, t) \: \phi(t, y) \dd{y}
    = \Inprod{k}{\phi}
    = \mathrm{const}
\end{aligned}$$

This integral has the form of an inner product, so we switch to Dirac notation. We differentiate with respect to $t$, and use the backward equation $\ipdv{k}{t} + \hat{L} k = 0$:

$$\begin{aligned}
    0
    = \pdv{}{t}\Inprod{k}{\phi}
    = \Inprod{k}{\pdv{\phi}{t}} + \Inprod{\pdv{k}{t}}{\phi}
    = \Inprod{k}{\pdv{\phi}{t}} - \Inprod{\hat{L} k}{\phi}
    = \Inprod{k}{\pdv{\phi}{t} - \hat{L}{}^\dagger \phi}
\end{aligned}$$

Where $\hat{L}{}^\dagger$ is by definition the adjoint operator of $\hat{L}$, which we calculate using partial integration, where all boundary terms vanish thanks to the existence of $X_t$; in other words, $X_t$ cannot reach infinity at any finite $t$, so the integrand must decay to zero for $|y| \to \infty$:

$$\begin{aligned}
    \Inprod{\hat{L} k}{\phi}
    &= \int_{-\infty}^\infty \pdv{k}{y} f \phi + \frac{1}{2} \pdvn{2}{k}{y} g^2 \phi \dd{y}
    \\
    &= \bigg[ k f \phi + \frac{1}{2} \pdv{k}{y} g^2 \phi \bigg]_{-\infty}^\infty
    - \int_{-\infty}^\infty k \pdv{}{y}(f \phi) + \frac{1}{2} \pdv{k}{y} \pdv{}{y}(g^2 \phi) \dd{y}
    \\
    &= \bigg[ -\frac{1}{2} k g^2 \phi \bigg]_{-\infty}^\infty
    + \int_{-\infty}^\infty - k \pdv{}{y}(f \phi) + \frac{1}{2} k \pdvn{2}{}{y}(g^2 \phi) \dd{y}
    \\
    &= \int_{-\infty}^\infty k \: \big( \hat{L}{}^\dagger \phi \big) \dd{y}
    = \Inprod{k}{\hat{L}{}^\dagger \phi}
\end{aligned}$$
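To make the adjoint relation concrete, here is a symbolic spot-check (my own, with the arbitrary Gaussian-decaying choices $k = y^2$, $\phi = e^{-y^2}$, $f = -y$, $g^2 = 2$) that $\Inprod{\hat{L} k}{\phi} = \Inprod{k}{\hat{L}{}^\dagger \phi}$:

```python
import sympy as sp

y = sp.symbols('y')
f, g2 = -y, 2                 # example drift and squared noise intensity
k = y**2                      # arbitrary smooth test function
phi = sp.exp(-y**2)           # arbitrary rapidly-decaying density
# L k = f dk/dy + (1/2) g^2 d^2k/dy^2
Lk = f * sp.diff(k, y) + sp.Rational(1, 2) * g2 * sp.diff(k, y, 2)
# L^dagger phi = -d/dy (f phi) + (1/2) d^2/dy^2 (g^2 phi)
Ladj_phi = -sp.diff(f * phi, y) + sp.Rational(1, 2) * sp.diff(g2 * phi, y, 2)
lhs = sp.integrate(Lk * phi, (y, -sp.oo, sp.oo))
rhs = sp.integrate(k * Ladj_phi, (y, -sp.oo, sp.oo))
print(lhs, rhs)  # equal, as the partial integration predicts
```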

Since $k$ is arbitrary, and $\ipdv{\Inprod{k}{\phi}}{t} = 0$ for all $k$, we thus arrive at the forward Kolmogorov equation, describing the evolution of the probability density $\phi(t, y)$:

$$\begin{aligned}
    \boxed{
        \pdv{\phi}{t} = \hat{L}{}^\dagger \phi = - \pdv{}{y}(f \phi) + \frac{1}{2} \pdvn{2}{}{y}(g^2 \phi)
    }
\end{aligned}$$
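As a quick illustration (again my own example): for an Ornstein–Uhlenbeck process with $f = -y$ and $g^2 = 2$, the Gaussian $\phi \propto e^{-y^2/2}$ is a stationary solution of this equation, since $\hat{L}{}^\dagger \phi = 0$:

```python
import sympy as sp

y = sp.symbols('y')
f = -y                        # OU drift (example choice)
g2 = 2                        # g^2, giving diffusivity D = 1
phi = sp.exp(-y**2 / 2)       # candidate stationary density (unnormalized)
# Right-hand side of the forward Kolmogorov equation
rhs = -sp.diff(f * phi, y) + sp.Rational(1, 2) * sp.diff(g2 * phi, y, 2)
print(sp.simplify(rhs))  # 0, so dphi/dt = 0: phi is stationary
```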

This can be rewritten in a way that highlights the connection between Itō diffusions and physical diffusion, if we define the diffusivity $D$, advection $u$, and probability flux $J$:

$$\begin{aligned}
    D \equiv \frac{1}{2} g^2
    \qquad \quad
    u = f - \pdv{D}{x}
    \qquad \quad
    J \equiv u \phi - D \pdv{\phi}{x}
\end{aligned}$$

Such that the forward Kolmogorov equation takes the following conservative form, so called because it looks like a physical continuity equation:

$$\begin{aligned}
    \boxed{
        \pdv{\phi}{t} = - \pdv{J}{x} = - \pdv{}{x}\Big( u \phi - D \pdv{\phi}{x} \Big)
    }
\end{aligned}$$
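In the same hypothetical OU example ($f = -y$, so $D = 1$ and $u = -y$), stationarity can be seen directly from the flux: $J$ vanishes identically, not just its divergence:

```python
import sympy as sp

y = sp.symbols('y')
D = sp.Integer(1)             # D = g^2/2 with g = sqrt(2)
u = -y                        # u = f - dD/dy = f, since D is constant
phi = sp.exp(-y**2 / 2)       # stationary OU density (unnormalized)
J = u * phi - D * sp.diff(phi, y)   # probability flux
print(sp.simplify(J))  # 0: the advective and diffusive fluxes cancel
```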

Note that if $u = 0$, then this reduces to Fick's second law. The backward Kolmogorov equation can also be rewritten analogously, although it is less noteworthy:

$$\begin{aligned}
    \boxed{
        - \pdv{\psi}{t} = u \pdv{\psi}{x} + \pdv{}{x}\Big( D \pdv{\psi}{x} \Big)
    }
\end{aligned}$$

Notice that the diffusivity term looks the same in both the forward and backward equations; we say that diffusion is self-adjoint.


  1. U.H. Thygesen, Lecture notes on diffusions and stochastic differential equations, 2021, Polyteknisk Kompendie.