--- title: "Kolmogorov equations" sort_title: "Kolmogorov equations" date: 2021-11-14 categories: - Mathematics - Statistics - Stochastic analysis layout: "concept" --- Consider the following general [Itō diffusion](/know/concept/ito-calculus/) $$X_t \in \mathbb{R}$$, which is assumed to satisfy the conditions for unique existence on the entire time axis: $$\begin{aligned} \dd{X}_t = f(X_t, t) \dd{t} + g(X_t, t) \dd{B_t} \end{aligned}$$ Let $$\mathcal{F}_t$$ be the filtration to which $$X_t$$ is adapted, then we define $$Y_s$$ as shown below, namely as the [conditional expectation](/know/concept/conditional-expectation/) of $$h(X_t)$$, for an arbitrary bounded function $$h(x)$$, given the information $$\mathcal{F}_s$$ available at time $$s \le t$$. Because $$X_t$$ is a [Markov process](/know/concept/markov-process/), $$Y_s$$ must be $$X_s$$-measurable, so it is a function $$k$$ of $$X_s$$ and $$s$$: $$\begin{aligned} Y_s \equiv \mathbf{E}[h(X_t) | \mathcal{F}_s] = \mathbf{E}[h(X_t) | X_s] = k(X_s, s) \end{aligned}$$ Consequently, we can apply Itō's lemma to find $$\dd{Y_s}$$ in terms of $$k$$, $$f$$ and $$g$$: $$\begin{aligned} \dd{Y_s} &= \bigg( \pdv{k}{s} + \pdv{k}{x} f + \frac{1}{2} \pdvn{2}{k}{x} g^2 \bigg) \dd{s} + \pdv{k}{x} g \dd{B_s} \\ &= \bigg( \pdv{k}{s} + \hat{L} k \bigg) \dd{s} + \pdv{k}{x} g \dd{B_s} \end{aligned}$$ Where we have defined the linear operator $$\hat{L}$$ to have the following action on $$k$$: $$\begin{aligned} \hat{L} k \equiv \pdv{k}{x} f + \frac{1}{2} \pdvn{2}{k}{x} g^2 \end{aligned}$$ At this point, we need to realize that $$Y_s$$ is a [martingale](/know/concept/martingale/) with respect to $$\mathcal{F}_s$$, since $$Y_s$$ is $$\mathcal{F}_s$$-adapted and finite, and it satisfies the martingale property, for $$r \le s \le t$$: $$\begin{aligned} \mathbf{E}[Y_s | \mathcal{F}_r] = \mathbf{E}\Big[ \mathbf{E}[h(X_t) | \mathcal{F}_s] \Big| \mathcal{F}_r \Big] = \mathbf{E}\big[ h(X_t) \big| \mathcal{F}_r \big] = Y_r \end{aligned}$$ Where we used the tower property of conditional expectations, because $$\mathcal{F}_r \subset \mathcal{F}_s$$. However, an Itō diffusion can only be a martingale if its drift term (the one containing $$\dd{s}$$) vanishes, so, looking at $$\dd{Y_s}$$, we must demand that: $$\begin{aligned} \pdv{k}{s} + \hat{L} k = 0 \end{aligned}$$ Because $$k(X_s, s)$$ is a Markov process, we can write it with a transition density $$p(s, X_s; t, X_t)$$, where in this case $$s$$ and $$X_s$$ are given initial conditions, $$t$$ is a parameter, and the terminal state $$X_t$$ is a random variable. We thus have: $$\begin{aligned} k(x, s) = \int_{-\infty}^\infty p(s, x; t, y) \: h(y) \dd{y} \end{aligned}$$ We insert this into the equation that we just derived for $$k$$, yielding: $$\begin{aligned} 0 = \int_{-\infty}^\infty \!\! \Big( \pdv{}{s}p(s, x; t, y) + \hat{L} p(s, x; t, y) \Big) h(y) \dd{y} \end{aligned}$$ Because $$h$$ is arbitrary, and this must be satisfied for all $$h$$, the transition density $$p$$ fulfills: $$\begin{aligned} 0 = \pdv{}{s}p(s, x; t, y) + \hat{L} p(s, x; t, y) \end{aligned}$$ Here, $$t$$ is a known parameter and $$y$$ is a "known" integration variable, leaving only $$s$$ and $$x$$ as free variables for us to choose. 
Since only $$s$$ and $$x$$ are free to choose,
we define the **likelihood function** $$\psi(s, x)$$,
which gives the likelihood of an initial condition $$(s, x)$$
given that the terminal condition is $$(t, y)$$:

$$\begin{aligned}
    \boxed{
        \psi(s, x) \equiv p(s, x; t, y)
    }
\end{aligned}$$

And from the above derivation, we conclude that $$\psi$$ satisfies the following PDE,
known as the **backward Kolmogorov equation**:

$$\begin{aligned}
    \boxed{
        - \pdv{\psi}{s}
        = \hat{L} \psi
        = f \pdv{\psi}{x} + \frac{1}{2} g^2 \pdvn{2}{\psi}{x}
    }
\end{aligned}$$

Moving on, we can define the traditional **probability density function** $$\phi(t, y)$$
from the transition density $$p$$,
by fixing the initial $$(s, x)$$ and leaving the terminal $$(t, y)$$ free:

$$\begin{aligned}
    \boxed{
        \phi(t, y) \equiv p(s, x; t, y)
    }
\end{aligned}$$

With this in mind, for $$(s, x) = (0, X_0)$$,
the unconditional expectation $$\mathbf{E}[Y_t]$$
(i.e. the conditional expectation without information)
will be constant in time, because $$Y_t$$ is a martingale:

$$\begin{aligned}
    \mathbf{E}[Y_t]
    = \mathbf{E}[k(X_t, t)]
    = \int_{-\infty}^\infty k(y, t) \: \phi(t, y) \dd{y}
    = \Inprod{k}{\phi}
    = \mathrm{const}
\end{aligned}$$

This integral has the form of an inner product,
so we switch to [Dirac notation](/know/concept/dirac-notation/).
We differentiate with respect to $$t$$,
and use the backward equation $$\ipdv{k}{t} + \hat{L} k = 0$$:

$$\begin{aligned}
    0
    = \pdv{}{t}\Inprod{k}{\phi}
    = \Inprod{k}{\pdv{\phi}{t}} + \Inprod{\pdv{k}{t}}{\phi}
    = \Inprod{k}{\pdv{\phi}{t}} - \Inprod{\hat{L} k}{\phi}
    = \Inprod{k}{\pdv{\phi}{t} - \hat{L}{}^\dagger \phi}
\end{aligned}$$

Where $$\hat{L}{}^\dagger$$ is by definition the adjoint operator of $$\hat{L}$$,
which we calculate using partial integration,
where all boundary terms vanish thanks to the *existence* of $$X_t$$;
in other words, $$X_t$$ cannot reach infinity at any finite $$t$$,
so the integrand must decay to zero for $$|y| \to \infty$$:

$$\begin{aligned}
    \Inprod{\hat{L} k}{\phi}
    &= \int_{-\infty}^\infty \pdv{k}{y} f \phi + \frac{1}{2} \pdvn{2}{k}{y} g^2 \phi \dd{y}
    \\
    &= \bigg[ k f \phi + \frac{1}{2} \pdv{k}{y} g^2 \phi \bigg]_{-\infty}^\infty
    - \int_{-\infty}^\infty k \pdv{}{y}(f \phi) + \frac{1}{2} \pdv{k}{y} \pdv{}{y}(g^2 \phi) \dd{y}
    \\
    &= \bigg[ -\frac{1}{2} k g^2 \phi \bigg]_{-\infty}^\infty
    + \int_{-\infty}^\infty - k \pdv{}{y}(f \phi) + \frac{1}{2} k \pdvn{2}{}{y}(g^2 \phi) \dd{y}
    \\
    &= \int_{-\infty}^\infty k \: \big( \hat{L}{}^\dagger \phi \big) \dd{y}
    = \Inprod{k}{\hat{L}{}^\dagger \phi}
\end{aligned}$$

Since $$k$$ is arbitrary, and $$\ipdv{\Inprod{k}{\phi}}{t} = 0$$ for all $$k$$,
we thus arrive at the **forward Kolmogorov equation**,
describing the evolution of the probability density $$\phi(t, y)$$:

$$\begin{aligned}
    \boxed{
        \pdv{\phi}{t}
        = \hat{L}{}^\dagger \phi
        = - \pdv{}{y}(f \phi) + \frac{1}{2} \pdvn{2}{}{y}(g^2 \phi)
    }
\end{aligned}$$

This can be rewritten in a way that highlights the connection
between Itō diffusions and physical diffusion,
if we define the **diffusivity** $$D$$, **advection** $$u$$, and **probability flux** $$J$$:

$$\begin{aligned}
    D \equiv \frac{1}{2} g^2
    \qquad \quad
    u = f - \pdv{D}{x}
    \qquad \quad
    J \equiv u \phi - D \pdv{\phi}{x}
\end{aligned}$$

Such that the forward Kolmogorov equation takes the following **conservative form**,
so called because it looks like a physical continuity equation:

$$\begin{aligned}
    \boxed{
        \pdv{\phi}{t}
        = - \pdv{J}{x}
        = - \pdv{}{x}\Big( u \phi - D \pdv{\phi}{x} \Big)
    }
\end{aligned}$$

Note that if $$u = 0$$, then this reduces to [Fick's second law](/know/concept/ficks-laws/).
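The conservative form is also convenient numerically:
a flux-based discretization conserves total probability by construction.
As a rough illustration (again my own sketch, not from the referenced notes),
the Python snippet below integrates the forward equation
for a hypothetical Ornstein-Uhlenbeck process with $$f(x) = -\kappa x$$ and constant $$g = \sigma$$,
using an explicit finite-volume scheme with zero-flux boundaries;
the grid, time step, and variable names are arbitrary choices.
It checks that $$\phi$$ stays normalized
and relaxes to the known stationary Gaussian with variance $$D / \kappa$$:

```python
import numpy as np

# Hypothetical Ornstein-Uhlenbeck process dX = -kappa*X dt + sigma dB,
# i.e. f(x) = -kappa*x and g = sigma, so D = sigma^2/2 is constant and u = f.
kappa, sigma = 1.0, 1.0
D = 0.5 * sigma**2

# Grid of cell centers on [-5, 5], with zero-flux (reflecting) boundaries.
N, L = 200, 5.0
dy = 2 * L / N
y = -L + dy * (np.arange(N) + 0.5)
y_faces = -L + dy * np.arange(1, N)          # interior cell faces
u_faces = -kappa * y_faces                   # advection u = f (since D is constant)

# Initial density: a narrow Gaussian, normalized on the grid.
phi = np.exp(-(y - 2.0)**2 / (2 * 0.2**2))
phi /= np.sum(phi) * dy

dt, T = 1e-3, 5.0
for _ in range(int(T / dt)):
    # Flux J = u*phi - D*dphi/dy at interior faces; J = 0 at the domain edges.
    phi_face = 0.5 * (phi[:-1] + phi[1:])
    J = u_faces * phi_face - D * (phi[1:] - phi[:-1]) / dy
    J = np.concatenate(([0.0], J, [0.0]))
    phi = phi - dt * (J[1:] - J[:-1]) / dy   # conservative update: dphi/dt = -dJ/dy

# Total probability is conserved by the flux form ...
print(np.sum(phi) * dy)                      # 1.0 (up to floating-point error)
# ... and phi relaxes to the stationary Gaussian with variance D/kappa.
phi_stat = np.exp(-y**2 / (2 * D / kappa)) / np.sqrt(2 * np.pi * D / kappa)
print(np.max(np.abs(phi - phi_stat)))        # small; shrinks with larger T and finer grids
```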
The backward Kolmogorov equation can also be rewritten analogously,
although it is less noteworthy:

$$\begin{aligned}
    \boxed{
        - \pdv{\psi}{s}
        = u \pdv{\psi}{x} + \pdv{}{x}\Big( D \pdv{\psi}{x} \Big)
    }
\end{aligned}$$

Notice that the diffusivity term looks the same
in both the forward and backward equations;
we say that diffusion is self-adjoint.

## References
1.  U.H. Thygesen,
    *Lecture notes on diffusions and stochastic differential equations*,
    2021, Polyteknisk Kompendie.