Consider the following general Itō diffusion $X_t \in \mathbb{R}$, which is assumed to satisfy
the conditions for unique existence on the entire time axis:
$$ dX_t = f(X_t, t) \, dt + g(X_t, t) \, dB_t $$
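As a concrete illustration (not part of the derivation), such a diffusion can be simulated with the Euler–Maruyama scheme. Here is a minimal sketch, assuming the Ornstein–Uhlenbeck choice $f(x, t) = -x$, $g(x, t) = 1$ as an example:

```python
import numpy as np

def euler_maruyama(f, g, x0, t_end, n_steps, n_paths, rng):
    """Simulate dX_t = f(X_t, t) dt + g(X_t, t) dB_t with Euler-Maruyama."""
    dt = t_end / n_steps
    x = np.full(n_paths, x0, dtype=float)
    t = 0.0
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increments
        x = x + f(x, t) * dt + g(x, t) * dB
        t += dt
    return x

rng = np.random.default_rng(0)
# Ornstein-Uhlenbeck example: f = -x, g = 1, started at X_0 = 1
X = euler_maruyama(lambda x, t: -x, lambda x, t: 1.0, x0=1.0, t_end=1.0,
                   n_steps=1000, n_paths=20000, rng=rng)
# Exact moments at t = 1: mean = e^{-1}, variance = (1 - e^{-2}) / 2
print(X.mean(), X.var())
```

The sample moments should agree with the exact Ornstein–Uhlenbeck moments up to Monte Carlo and discretization error.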
Let $\mathcal{F}_t$ be the filtration to which $X_t$ is adapted.
We then define $Y_s$ as the conditional expectation
of $h(X_t)$, for an arbitrary bounded function $h(x)$,
given the information $\mathcal{F}_s$ available at time $s \le t$.
Because $X_t$ is a Markov process,
$Y_s$ must be $X_s$-measurable,
so it is a function $k$ of $X_s$ and $s$:
$$ Y_s \equiv \mathbf{E}[h(X_t) \mid \mathcal{F}_s] = \mathbf{E}[h(X_t) \mid X_s] = k(X_s, s) $$
Consequently, we can apply Itō's lemma to find $dY_s$
in terms of $k$, $f$ and $g$:
$$ dY_s = \Big( \frac{\partial k}{\partial s} + f \frac{\partial k}{\partial x} + \frac{1}{2} g^2 \frac{\partial^2 k}{\partial x^2} \Big) ds + g \frac{\partial k}{\partial x} \, dB_s = \Big( \frac{\partial k}{\partial s} + \hat{L} k \Big) ds + g \frac{\partial k}{\partial x} \, dB_s $$
Where we have defined the linear operator $\hat{L}$
to have the following action on $k$:
$$ \hat{L} k \equiv f \frac{\partial k}{\partial x} + \frac{1}{2} g^2 \frac{\partial^2 k}{\partial x^2} $$
At this point, we need to realize that $Y_s$ is
a martingale with respect to $\mathcal{F}_s$,
since $Y_s$ is $\mathcal{F}_s$-adapted and finite,
and it satisfies the martingale property,
for $r \le s \le t$:
$$ \mathbf{E}[Y_s \mid \mathcal{F}_r] = \mathbf{E}\big[ \mathbf{E}[h(X_t) \mid \mathcal{F}_s] \,\big|\, \mathcal{F}_r \big] = \mathbf{E}[h(X_t) \mid \mathcal{F}_r] = Y_r $$
Where we used the tower property of conditional expectations,
because $\mathcal{F}_r \subset \mathcal{F}_s$.
However, an Itō diffusion can only be a martingale
if its drift term (the one containing $ds$) vanishes,
so, looking at $dY_s$, we must demand that:
$$ \frac{\partial k}{\partial s} + \hat{L} k = 0 $$
Because $X_s$ is a Markov process,
we can write $k$ with a transition density $p(s, X_s; t, X_t)$,
where in this case $s$ and $X_s$ are given initial conditions,
$t$ is a parameter, and the terminal state $X_t$ is a random variable.
We thus have:
$$ k(x, s) = \int_{-\infty}^{\infty} p(s, x; t, y) \, h(y) \, dy $$
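As a quick sanity check (my own example, not from the text): for standard Brownian motion ($f = 0$, $g = 1$) and $h(y) = y^2$, the transition density is Gaussian with mean $x$ and variance $t - s$, so this integral evaluates to $k(x, s) = x^2 + (t - s)$, which a Monte Carlo estimate reproduces:

```python
import numpy as np

rng = np.random.default_rng(1)
s, t, x = 0.0, 1.0, 0.7  # arbitrary example values
# For f = 0, g = 1, X_t ~ N(x, t - s) given X_s = x
y = rng.normal(x, np.sqrt(t - s), size=200_000)
k_mc = np.mean(y**2)        # Monte Carlo estimate of E[h(X_t) | X_s = x]
k_exact = x**2 + (t - s)    # closed form of the integral for h(y) = y^2
print(k_mc, k_exact)
```

The two values should agree up to sampling error.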
We insert this into the equation that we just derived for k, yielding:
$$ 0 = \int_{-\infty}^{\infty} \Big( \frac{\partial}{\partial s} p(s, x; t, y) + \hat{L} p(s, x; t, y) \Big) h(y) \, dy $$
Because h is arbitrary, and this must be satisfied for all h,
the transition density p fulfills:
$$ 0 = \frac{\partial}{\partial s} p(s, x; t, y) + \hat{L} p(s, x; t, y) $$
Here, t is a known parameter and y is a “known” integration variable,
leaving only s and x as free variables for us to choose.
We therefore define the likelihood function $\psi(s, x)$,
which gives the likelihood of an initial condition $(s, x)$
given that the terminal condition is $(t, y)$:
$$ \psi(s, x) \equiv p(s, x; t, y) $$
And from the above derivation,
we conclude that ψ satisfies the following PDE,
known as the backward Kolmogorov equation:
$$ - \frac{\partial \psi}{\partial s} = \hat{L} \psi = f \frac{\partial \psi}{\partial x} + \frac{1}{2} g^2 \frac{\partial^2 \psi}{\partial x^2} $$
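This can be verified symbolically for a known case (my addition, assuming standard Brownian motion $f = 0$, $g = 1$), where the transition density is the Gaussian heat kernel; writing $\tau = t - s$ so that $\partial / \partial s = -\partial / \partial \tau$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
tau = sp.symbols('tau', positive=True)  # tau = t - s, so d/ds = -d/dtau

# Heat kernel: psi(s, x) = p(s, x; t, y) for f = 0, g = 1
psi = sp.exp(-(y - x)**2 / (2 * tau)) / sp.sqrt(2 * sp.pi * tau)

# Backward equation -d(psi)/ds = (1/2) d^2(psi)/dx^2
# becomes d(psi)/dtau = (1/2) d^2(psi)/dx^2 in terms of tau
residual = sp.diff(psi, tau) - sp.Rational(1, 2) * sp.diff(psi, x, 2)
print(sp.simplify(residual))
```

The residual simplifies to zero, confirming that the heat kernel solves the backward equation.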
Moving on, we can define the traditional
probability density function $\phi(t, y)$ from the transition density $p$,
by fixing the initial $(s, x)$
and leaving the terminal $(t, y)$ free:
$$ \phi(t, y) \equiv p(s, x; t, y) $$
With this in mind, for $(s, x) = (0, X_0)$,
the unconditional expectation $\mathbf{E}[Y_t]$
(i.e. the conditional expectation without information)
will be constant in time, because $Y_t$ is a martingale:
$$ \mathbf{E}[Y_t] = \int_{-\infty}^{\infty} k(y, t) \, \phi(t, y) \, dy = \mathrm{const} $$
This integral has the form of an inner product,
so we switch to Dirac notation:
$$ \mathbf{E}[Y_t] = \langle k | \phi \rangle = \mathrm{const} $$
We differentiate with respect to $t$,
and use the backward equation $\partial k / \partial t + \hat{L} k = 0$:
$$ 0 = \frac{\partial}{\partial t} \langle k | \phi \rangle = \Big\langle \frac{\partial k}{\partial t} \Big| \phi \Big\rangle + \Big\langle k \Big| \frac{\partial \phi}{\partial t} \Big\rangle = \Big\langle k \Big| \frac{\partial \phi}{\partial t} \Big\rangle - \langle \hat{L} k | \phi \rangle = \Big\langle k \Big| \frac{\partial \phi}{\partial t} \Big\rangle - \langle k | \hat{L}^\dagger \phi \rangle $$
Where $\hat{L}^\dagger$ is by definition the adjoint operator of $\hat{L}$,
i.e. $\langle \hat{L} k | \phi \rangle = \langle k | \hat{L}^\dagger \phi \rangle$,
which we calculate using partial integration,
where all boundary terms vanish thanks to the existence of $X_t$;
in other words, $X_t$ cannot reach infinity at any finite $t$,
so the integrand must decay to zero for $|y| \to \infty$:
$$ \langle \hat{L} k | \phi \rangle = \int_{-\infty}^{\infty} \Big( f \frac{\partial k}{\partial y} + \frac{1}{2} g^2 \frac{\partial^2 k}{\partial y^2} \Big) \phi \, dy = \int_{-\infty}^{\infty} k \Big( - \frac{\partial (f \phi)}{\partial y} + \frac{1}{2} \frac{\partial^2 (g^2 \phi)}{\partial y^2} \Big) dy = \langle k | \hat{L}^\dagger \phi \rangle $$
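The adjoint relation can be checked symbolically for a concrete example (my own choices: drift $f = y$, constant $g^2 = 1$, and Gaussian test functions so that the boundary terms indeed vanish):

```python
import sympy as sp

y = sp.symbols('y', real=True)
f = y                    # example drift
g2 = sp.Integer(1)       # example g^2 (constant)
k = sp.exp(-y**2)        # smooth, decaying test function
phi = sp.exp(-y**2 / 2)  # decaying "density" (normalization is irrelevant here)

# L acting on k, and its adjoint L^dagger acting on phi
Lk = f * sp.diff(k, y) + sp.Rational(1, 2) * g2 * sp.diff(k, y, 2)
Ldag_phi = -sp.diff(f * phi, y) + sp.Rational(1, 2) * sp.diff(g2 * phi, y, 2)

# <L k | phi> should equal <k | L^dagger phi>
lhs = sp.integrate(Lk * phi, (y, -sp.oo, sp.oo))
rhs = sp.integrate(k * Ldag_phi, (y, -sp.oo, sp.oo))
print(sp.simplify(lhs - rhs))
```

Both integrals evaluate to the same value, as the partial integration predicts.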
Since $k$ is arbitrary, and $\partial \langle k | \phi \rangle / \partial t = 0$ for all $k$,
we thus arrive at the forward Kolmogorov equation,
describing the evolution of the probability density $\phi(t, y)$:
$$ \frac{\partial \phi}{\partial t} = \hat{L}^\dagger \phi = - \frac{\partial (f \phi)}{\partial y} + \frac{1}{2} \frac{\partial^2 (g^2 \phi)}{\partial y^2} $$
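For example (an illustration of mine, not from the text): for the Ornstein–Uhlenbeck process $f = -\theta y$, $g = \sigma$, the stationary Gaussian density with variance $\sigma^2 / (2\theta)$ makes the right-hand side vanish, which can be checked symbolically:

```python
import sympy as sp

y = sp.symbols('y', real=True)
theta, sigma = sp.symbols('theta sigma', positive=True)

f = -theta * y           # OU drift
g2 = sigma**2            # g^2
# Stationary density: Gaussian with variance sigma^2 / (2 theta)
var = sigma**2 / (2 * theta)
phi = sp.exp(-y**2 / (2 * var)) / sp.sqrt(2 * sp.pi * var)

# Right-hand side of the forward Kolmogorov equation
forward_rhs = -sp.diff(f * phi, y) + sp.Rational(1, 2) * sp.diff(g2 * phi, y, 2)
print(sp.simplify(forward_rhs))
```

The result is zero, so $\partial \phi / \partial t = 0$: the density is indeed stationary.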
This can be rewritten in a way
that highlights the connection between Itō diffusions and physical diffusion,
if we define the diffusivity $D$, advection $u$, and probability flux $J$:
$$ D \equiv \frac{1}{2} g^2 \qquad u \equiv f - \frac{\partial D}{\partial x} \qquad J \equiv u \phi - D \frac{\partial \phi}{\partial x} $$
Such that the forward Kolmogorov equation takes the following conservative form,
so called because it looks like a physical continuity equation:
$$ \frac{\partial \phi}{\partial t} = - \frac{\partial J}{\partial x} = - \frac{\partial}{\partial x} \Big( u \phi - D \frac{\partial \phi}{\partial x} \Big) $$
Note that if $u = 0$, then this reduces to
Fick's second law.
The backward Kolmogorov equation can also be rewritten analogously,
although it is less noteworthy:
$$ - \frac{\partial \psi}{\partial t} = u \frac{\partial \psi}{\partial x} + \frac{\partial}{\partial x} \Big( D \frac{\partial \psi}{\partial x} \Big) $$
Notice that the diffusivity term looks the same
in both the forward and backward equations;
we say that diffusion is self-adjoint.
References
U.H. Thygesen,
Lecture notes on diffusions and stochastic differential equations,
2021, Polyteknisk Kompendie.