Given two stochastic processes $F_t$ and $G_t$, consider the following random variable $X_t$, where $B_t$ is the Wiener process, i.e. Brownian motion:

$$X_t = X_0 + \int_0^t F_s \, ds + \int_0^t G_s \, dB_s$$
Where the latter is an Itō integral, assuming $G_t$ is Itō-integrable. We call $X_t$ an Itō process if $F_t$ is locally integrable, and the initial condition $X_0$ is known, i.e. $X_0$ is $\mathcal{F}_0$-measurable, where $\mathcal{F}_t$ is the filtration to which $F_t$, $G_t$ and $B_t$ are adapted.
The above definition of $X_t$ is often abbreviated as follows, where $X_0$ is implicit:

$$dX_t = F_t \, dt + G_t \, dB_t$$
Typically, $F_t$ is referred to as the drift of $X_t$, and $G_t$ as its intensity. Because the Itō integral of $G_t$ is a martingale, it does not contribute to the mean of $X_t$:

$$\mathbb{E}[X_t] = X_0 + \int_0^t \mathbb{E}[F_s] \, ds$$
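As a quick numerical sanity check of this mean property (not part of the text), we can discretize the Itō process with the Euler-Maruyama scheme, using the hypothetical constant choices $F_t = a$ and $G_t = \sigma$:

```python
import numpy as np

# Euler-Maruyama discretization of dX_t = F_t dt + G_t dB_t, with the
# hypothetical constant choices F_t = a and G_t = sigma.
rng = np.random.default_rng(0)
a, sigma, X0 = 0.5, 0.3, 1.0
T, N, paths = 1.0, 1000, 20_000
dt = T / N

X = np.full(paths, X0)
for _ in range(N):
    dB = rng.normal(0.0, np.sqrt(dt), paths)  # Wiener increments ~ N(0, dt)
    X += a * dt + sigma * dB

# The Ito-integral term is a martingale, so only the drift moves the mean:
# here E[X_T] = X_0 + a*T.
mean_XT = X.mean()
expected = X0 + a * T
```

Averaged over many paths, the $G_t \, dB_t$ contributions cancel, and the sample mean approaches $X_0 + a T$.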
Now, consider the following Itō stochastic differential equation (SDE), where $\xi_t = dB_t / dt$ is white noise, informally treated as the $t$-derivative of $B_t$:

$$\frac{dX_t}{dt} = f(X_t, t) + g(X_t, t) \, \xi_t$$
An Itō process $X_t$ is said to satisfy this equation if $f(X_t, t) = F_t$ and $g(X_t, t) = G_t$, in which case $X_t$ is also called an Itō diffusion. All Itō diffusions are Markov processes, since only the current value of $X_t$ determines the future, and $B_t$ is itself a Markov process.
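To make the diffusion concrete, here is a hedged numerical sketch using geometric Brownian motion, a hypothetical example with state-dependent drift and intensity and a known exact mean:

```python
import numpy as np

# Euler-Maruyama simulation of the Ito diffusion dX_t = f(X_t) dt + g(X_t) dB_t
# for the hypothetical example of geometric Brownian motion:
# f(x) = mu*x, g(x) = sigma*x, whose exact mean is E[X_t] = X0 * exp(mu*t).
rng = np.random.default_rng(1)
mu, sigma, X0 = 0.1, 0.2, 1.0
T, N, paths = 1.0, 1000, 50_000
dt = T / N

X = np.full(paths, X0)
for _ in range(N):
    dB = rng.normal(0.0, np.sqrt(dt), paths)
    X += mu * X * dt + sigma * X * dB  # drift and intensity depend only on current X_t

gbm_mean = X.mean()
expected_mean = X0 * np.exp(mu * T)
```

The update at each step uses only the current state, which is exactly the Markov property discussed above.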
Itō’s lemma
Classically, given $y \equiv h(x(t), t)$, the chain rule of differentiation states that:

$$dy = \frac{\partial h}{\partial t} \, dt + \frac{\partial h}{\partial x} \, dx$$
However, for a stochastic process $Y_t \equiv h(X_t, t)$, where $X_t$ is an Itō process, the chain rule is modified to the following, known as Itō's lemma:

$$dY_t = \left( \frac{\partial h}{\partial t} + F_t \frac{\partial h}{\partial x} + \frac{G_t^2}{2} \frac{\partial^2 h}{\partial x^2} \right) dt + G_t \frac{\partial h}{\partial x} \, dB_t$$
We start by applying the classical chain rule, but we go to second order in $x$. This is also valid classically, but there we would neglect all higher-order infinitesimals:

$$dY_t = \frac{\partial h}{\partial t} \, dt + \frac{\partial h}{\partial x} \, dX_t + \frac{1}{2} \frac{\partial^2 h}{\partial x^2} \, dX_t^2$$

But here we cannot neglect $dX_t^2$. We insert the definition of an Itō process:

$$dY_t = \frac{\partial h}{\partial t} \, dt + \frac{\partial h}{\partial x} \big( F_t \, dt + G_t \, dB_t \big) + \frac{1}{2} \frac{\partial^2 h}{\partial x^2} \big( F_t^2 \, dt^2 + 2 F_t G_t \, dt \, dB_t + G_t^2 \, dB_t^2 \big)$$
In the limit of small $dt$, we can neglect $dt^2$, and, as it turns out, $dt \, dB_t$ too:

$$dt \, dB_t = (B_{t + dt} - B_t) \, dt \sim dt \, \mathcal{N}(0, dt) \sim \mathcal{N}(0, dt^3) \longrightarrow 0$$
However, due to the scaling property of $B_t$, we cannot ignore $dB_t^2$, which is of order $dt$:

$$dB_t^2 = (B_{t + dt} - B_t)^2 \sim \big( \mathcal{N}(0, dt) \big)^2 \sim \chi_1^2(dt) \longrightarrow dt$$

Where $\chi_1^2(dt)$ is the generalized chi-squared distribution with one term of variance $dt$; its mean is $dt$, so replacing $dB_t^2$ by $dt$ yields Itō's lemma.
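These infinitesimal scalings can be checked numerically (a minimal sketch, not from the text): the summed squares of the Wiener increments over $[0, T]$ (the quadratic variation) approach $T$, i.e. $dB_t^2 \to dt$, while the $dt \, dB_t$ cross terms vanish:

```python
import numpy as np

# Quadratic-variation check: sum of dB^2 behaves like sum of dt = T,
# while the sum of dt*dB terms (each ~ N(0, dt^3)) is negligible.
rng = np.random.default_rng(2)
T, N = 1.0, 100_000
dt = T / N
dB = rng.normal(0.0, np.sqrt(dt), N)

quad_var = np.sum(dB**2)   # converges to T as dt -> 0
cross = np.sum(dt * dB)    # vanishes much faster
```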
The most important application of Itō’s lemma
is to perform coordinate transformations,
to make the solution of a given Itō SDE easier.
Coordinate transformations
The simplest coordinate transformation is a scaling of the time axis. Defining $s \equiv \alpha t$, the goal is to keep the form of an Itō process. We know how to scale $B_t$: by setting $W_s \equiv \sqrt{\alpha} \, B_{s / \alpha}$. Let $Y_s \equiv X_t$ be the new variable on the rescaled axis, then:

$$dY_s = \frac{F_{s / \alpha}}{\alpha} \, ds + \frac{G_{s / \alpha}}{\sqrt{\alpha}} \, dW_s$$
$W_s$ is a valid Wiener process, and the drift and intensity have merely been rescaled, so $Y_s$ is still an Itō process.
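A small numerical check of the Wiener rescaling (an illustration, not from the text): a Wiener process must satisfy $\mathrm{Var}[W_s] = s$, which $W_s \equiv \sqrt{\alpha} \, B_{s/\alpha}$ indeed does:

```python
import numpy as np

# Check that W_s = sqrt(alpha) * B_{s/alpha} has Var[W_s] = s,
# as required of a Wiener process on the rescaled time axis.
rng = np.random.default_rng(3)
alpha, s, paths = 4.0, 2.0, 100_000
B = rng.normal(0.0, np.sqrt(s / alpha), paths)  # samples of B_{s/alpha}
W = np.sqrt(alpha) * B

var_W = W.var()  # should be close to s
```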
To solve SDEs analytically, it is usually best to have additive noise, i.e. $g = 1$. This can be achieved using the Lamperti transform: define $Y_t \equiv h(X_t)$, where $h$ is given by:

$$h(x) = \int_{x_0}^{x} \frac{1}{g(y)} \, dy$$
Then, using Itō's lemma, it is straightforward to show that the intensity becomes $1$. Note that the lower integration limit $x_0$ does not enter:

$$dY_t = \left( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g'(X_t) \right) dt + dB_t$$
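As a hedged numerical illustration, take the hypothetical case $g(x) = \sigma x$ (geometric Brownian motion), for which $h(x) = \int 1/(\sigma y) \, dy = \ln(x)/\sigma$; the transformed process should have unit intensity, i.e. increments with standard deviation $\sqrt{dt}$ regardless of $X_t$:

```python
import numpy as np

# Lamperti transform for g(x) = sigma*x: Y = ln(X)/sigma should have
# additive noise, so its increments have std sqrt(dt), independent of X.
rng = np.random.default_rng(4)
mu, sigma, X0 = 0.1, 0.5, 1.0
T, N = 1.0, 100_000
dt = T / N

dB = rng.normal(0.0, np.sqrt(dt), N)
# Exact GBM path (avoids discretization bias in the intensity):
X = X0 * np.exp(np.cumsum((mu - 0.5 * sigma**2) * dt + sigma * dB))
Y = np.log(X) / sigma

incr_std = np.diff(Y).std()  # should be ~ sqrt(dt)
target = np.sqrt(dt)
```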
Similarly, we can eliminate the drift, i.e. obtain $f = 0$, thereby making the Itō process a martingale. This is done by defining $Y_t \equiv h(X_t)$, with $h(x)$ given by:

$$h(x) = \int_{x_0}^{x} \exp\!\left( -\int_{x_1}^{y} \frac{2 f(z)}{g^2(z)} \, dz \right) dy$$
The goal is to make the parenthesized $dt$-term of Itō's lemma disappear, which this $h(x)$ does indeed do: since $h''(x) = -\big( 2 f(x) / g^2(x) \big) h'(x)$, we find that $x_0$ and $x_1$ do not enter:

$$f h' + \frac{g^2}{2} h'' = f h' + \frac{g^2}{2} \left( -\frac{2 f}{g^2} \right) h' = 0$$
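This cancellation can be verified numerically (a sketch under hypothetical choices $f(x) = -x$, $g(x) = 1$, $x_1 = 0$, so that $h'(x) = e^{x^2}$):

```python
import numpy as np

# Drift-removal check for f(x) = -x, g(x) = 1, x1 = 0: then
# h'(x) = exp(x^2), and the dt-coefficient of Ito's lemma,
# f*h' + (g^2/2)*h'', should vanish identically.
x = np.linspace(-1.0, 1.0, 2001)
hp = np.exp(x**2)            # h'(x) = exp(-integral of 2f/g^2) = exp(x^2)
hpp = np.gradient(hp, x)     # numerical h''(x)
drift = -x * hp + 0.5 * hpp  # f*h' + (g^2/2)*h''

max_drift = np.abs(drift[1:-1]).max()  # interior points (central differences)
```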
It is worth knowing under what condition a solution to a given SDE exists, in the sense that it is finite on the entire time axis. Suppose the drift $f$ and intensity $g$ satisfy these inequalities, for some known constant $K$ and for all $x$:

$$x f(x) \le K (1 + x^2) \qquad \quad g^2(x) \le K (1 + x^2)$$
When this is satisfied, we can find the following upper bound on an Itō process $X_t$, which clearly implies that $X_t$ is finite for all $t$:

$$\mathbb{E}[X_t^2] \le \big( X_0^2 + 3 K t \big) \exp(3 K t)$$
If we define $Y_t \equiv X_t^2$, then Itō's lemma tells us that the following holds:

$$dY_t = \big( 2 X_t f(X_t) + g^2(X_t) \big) \, dt + 2 X_t g(X_t) \, dB_t$$

Integrating and taking the expectation value removes the Wiener term, leaving:

$$\mathbb{E}[Y_t] = Y_0 + \mathbb{E} \int_0^t 2 X_s f(X_s) + g^2(X_s) \, ds$$
Given that $K(1 + x^2)$ is an upper bound of $x f(x)$ and $g^2(x)$, we get an inequality:

$$\mathbb{E}[Y_t] \le Y_0 + \int_0^t 3 K \big( 1 + \mathbb{E}[Y_s] \big) \, ds = Y_0 + 3 K t + \int_0^t 3 K \, \mathbb{E}[Y_s] \, ds$$
We then apply the Grönwall-Bellman inequality, noting that $(Y_0 + 3 K t)$ does not decrease with time, leading us to:

$$\mathbb{E}[Y_t] \le (Y_0 + 3 K t) \exp\!\left( \int_0^t 3 K \, ds \right) = (Y_0 + 3 K t) \exp(3 K t)$$
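The bound can be observed numerically (a sketch for the hypothetical SDE $dX_t = -X_t \, dt + dB_t$, for which both conditions hold with $K = 1$):

```python
import numpy as np

# Existence-bound check for dX = -X dt + dB: f(x) = -x and g(x) = 1
# satisfy x*f(x) <= K(1+x^2) and g^2(x) <= K(1+x^2) with K = 1, so
# E[X_t^2] <= (X0^2 + 3Kt) exp(3Kt) must hold along the whole path.
rng = np.random.default_rng(5)
X0, K = 1.0, 1.0
T, N, paths = 1.0, 500, 20_000
dt = T / N

X = np.full(paths, X0)
bound_ok = True
for i in range(1, N + 1):
    dB = rng.normal(0.0, np.sqrt(dt), paths)
    X += -X * dt + dB
    t = i * dt
    bound = (X0**2 + 3 * K * t) * np.exp(3 * K * t)
    bound_ok = bound_ok and (X**2).mean() <= bound

second_moment = (X**2).mean()  # stays far below the final bound
```

The bound is loose for this mean-reverting example, but it guarantees finiteness for any $f$ and $g$ satisfying the conditions.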
If a solution exists, it is also worth knowing whether it is unique. Suppose that $f$ and $g$ satisfy the following inequalities, for some constant $K$ and for all $x$ and $y$:

$$|f(x) - f(y)| \le K |x - y| \qquad \quad |g(x) - g(y)| \le K |x - y|$$
Let $X_t$ and $Y_t$ both be solutions to a given SDE, but with initial conditions that need not be the same, such that the difference is initially $X_0 - Y_0$. Then the difference $X_t - Y_t$ is bounded by:

$$\mathbb{E}\big[ (X_t - Y_t)^2 \big] \le (X_0 - Y_0)^2 \exp\!\big( (2 K + K^2) t \big)$$
We define $D_t \equiv X_t - Y_t$ and $Z_t \equiv D_t^2 \ge 0$, together with $F_t \equiv f(X_t) - f(Y_t)$ and $G_t \equiv g(X_t) - g(Y_t)$, such that Itō's lemma states:

$$dZ_t = \big( 2 D_t F_t + G_t^2 \big) \, dt + 2 D_t G_t \, dB_t$$

Integrating and taking the expectation value removes the Wiener term, leaving:

$$\mathbb{E}[Z_t] = Z_0 + \mathbb{E} \int_0^t 2 D_s F_s + G_s^2 \, ds$$
The Cauchy-Schwarz inequality states that $|D_s F_s| \le |D_s| |F_s|$, and the given Lipschitz conditions mean that $|F_s| \le K |D_s|$ and $|G_s| \le K |D_s|$, giving:

$$\mathbb{E}[Z_t] \le Z_0 + \int_0^t \big( 2 K + K^2 \big) \, \mathbb{E}[Z_s] \, ds$$

Where we have implicitly used that $D_s F_s \le |D_s F_s|$, and that $|D_s|^2 = D_s^2$ because $D_s$ is real.
We then apply the Grönwall-Bellman inequality, recognizing that $Z_0$ does not decrease with time (since it is constant):

$$\mathbb{E}[Z_t] \le Z_0 \exp\!\left( \int_0^t 2 K + K^2 \, ds \right) = Z_0 \exp\!\big( (2 K + K^2) t \big)$$
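This bound can also be observed numerically (a sketch for the hypothetical SDE $dX_t = -X_t \, dt + dB_t$, whose $f$ and $g$ are Lipschitz with $K = 1$), by driving two solutions with the same noise but different initial conditions:

```python
import numpy as np

# Uniqueness-bound check: two solutions of dX = -X dt + dB driven by
# the SAME Wiener increments, started at X0 = 1 and Y0 = 0. With K = 1,
# E[(X_t - Y_t)^2] <= (X0 - Y0)^2 * exp((2K + K^2) t).
rng = np.random.default_rng(6)
K, X0, Y0 = 1.0, 1.0, 0.0
T, N, paths = 1.0, 500, 10_000
dt = T / N

X = np.full(paths, X0)
Y = np.full(paths, Y0)
for _ in range(N):
    dB = rng.normal(0.0, np.sqrt(dt), paths)  # shared noise
    X += -X * dt + dB
    Y += -Y * dt + dB

msd = ((X - Y) ** 2).mean()
bound = (X0 - Y0) ** 2 * np.exp((2 * K + K**2) * T)
# For this linear drift the noise cancels exactly in the difference,
# which therefore decays like exp(-t):
exact = (X0 - Y0) ** 2 * np.exp(-2 * T)
```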
Using these properties, it can then be shown that if all of the above conditions are satisfied, then the SDE has a unique solution, which is $\mathcal{F}_t$-adapted, continuous, and exists for all times.
References
U.H. Thygesen,
Lecture notes on diffusions and stochastic differential equations,
2021, Polyteknisk Kompendie.