Itō process

Given two stochastic processes $F_t$ and $G_t$, consider the following random variable $X_t$, where $B_t$ is the Wiener process, i.e. Brownian motion:

$$\begin{aligned} X_t = X_0 + \int_0^t F_s \dd{s} + \int_0^t G_s \dd{B_s} \end{aligned}$$

Where the latter is an Itō integral, assuming $G_t$ is Itō-integrable. We call $X_t$ an Itō process if $F_t$ is locally integrable, and the initial condition $X_0$ is known, i.e. $X_0$ is $\mathcal{F}_0$-measurable, where $\mathcal{F}_t$ is the filtration to which $F_t$, $G_t$ and $B_t$ are adapted. The above definition of $X_t$ is often abbreviated as follows, where $X_0$ is implicit:

$$\begin{aligned} \dd{X_t} = F_t \dd{t} + G_t \dd{B_t} \end{aligned}$$

Typically, $F_t$ is referred to as the drift of $X_t$, and $G_t$ as its intensity. Because the Itō integral of $G_t$ is a martingale that starts at zero, it does not contribute to the mean of $X_t$, which is therefore set by the initial condition and the drift alone:

$$\begin{aligned} \mathbf{E}[X_t] = \mathbf{E}[X_0] + \int_0^t \mathbf{E}[F_s] \dd{s} \end{aligned}$$
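The claim that the Itō integral does not shift the mean can be checked numerically. The following is a minimal sketch, not a definitive implementation: it approximates both integrals by left-point sums on a uniform grid, and the drift $F_s = 2s$, the intensity $G_s = 1 + s$, the starting value $X_0 = 0$, and the helper name `simulate_ito_process` are all assumptions chosen for illustration.

```python
import math
import random


def simulate_ito_process(drift, intensity, x0, t_end, n_steps, rng):
    """Approximate X_t = x0 + int F ds + int G dB with left-point sums."""
    dt = t_end / n_steps
    x = x0
    for i in range(n_steps):
        s = i * dt
        db = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x += drift(s) * dt + intensity(s) * db
    return x


def example_drift(s):
    return 2.0 * s  # assumed deterministic drift F_s = 2s


def example_intensity(s):
    return 1.0 + s  # assumed deterministic intensity G_s = 1 + s


rng = random.Random(0)  # fixed seed so the run is reproducible
t_end = 1.0
samples = [
    simulate_ito_process(example_drift, example_intensity, 0.0, t_end, 100, rng)
    for _ in range(4000)
]
sample_mean = sum(samples) / len(samples)
exact_mean = t_end**2  # integral of 2s over [0, t_end], with X_0 = 0
print(sample_mean, exact_mean)
```

Up to discretization and sampling error, the sample mean should land near the integral of the drift, regardless of how large the intensity is.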

Now, consider the following Itō stochastic differential equation (SDE), where $\xi_t = \idv{B_t}{t}$ is white noise, informally treated as the $t$-derivative of $B_t$:

$$\begin{aligned} \dv{X_t}{t} = f(X_t, t) + g(X_t, t) \: \xi_t \end{aligned}$$

An Itō process $X_t$ is said to satisfy this equation if $f(X_t, t) = F_t$ and $g(X_t, t) = G_t$, in which case $X_t$ is also called an Itō diffusion. All Itō diffusions are Markov processes, since only the current value of $X_t$ determines the future, and $B_t$ is also a Markov process.
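A common way to approximate an Itō diffusion numerically is the Euler-Maruyama scheme, which is not discussed in the text above and is used here only as an illustrative sketch. Each step uses only the current value of the path and a fresh Brownian increment, which mirrors the Markov property. The function name `euler_maruyama` and the example coefficients $f(x, t) = -x$ and $g(x, t) = 0.5$ are assumptions.

```python
import math
import random


def euler_maruyama(f, g, x0, t_end, n_steps, rng):
    """Return an approximate path of dX = f(X, t) dt + g(X, t) dB."""
    dt = t_end / n_steps
    path = [x0]
    for i in range(n_steps):
        t = i * dt
        x = path[-1]  # only the current value is needed for the next step
        db = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        path.append(x + f(x, t) * dt + g(x, t) * db)
    return path


def example_f(x, t):
    return -x  # assumed drift, pulling the process towards zero


def example_g(x, t):
    return 0.5  # assumed constant intensity


path = euler_maruyama(example_f, example_g, 1.0, 2.0, 200, random.Random(1))
print(path[0], path[-1])
```

Setting the intensity to zero reduces the scheme to the ordinary explicit Euler method for the deterministic equation, which is a convenient sanity check.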

Itō’s lemma

Classically, given $y \equiv h(x(t), t)$, the chain rule of differentiation states that:

$$\begin{aligned} \dd{y} = \pdv{h}{t} \dd{t} + \pdv{h}{x} \dd{x} \end{aligned}$$

However, for a stochastic process $Y_t \equiv h(X_t, t)$, where $X_t$ is an Itō process, the chain rule is modified to the following, known as Itō’s lemma:

$$\begin{aligned} \boxed{ \dd{Y_t} = \bigg( \pdv{h}{t} + \pdv{h}{x} F_t + \frac{1}{2} \pdvn{2}{h}{x} G_t^2 \bigg) \dd{t} + \pdv{h}{x} G_t \dd{B_t} } \end{aligned}$$

To prove this, we start by applying the classical chain rule, but we go to second order in $x$. This is also valid classically, but there we would neglect all higher-order infinitesimals:

$$\begin{aligned} \dd{Y_t} = \pdv{h}{t} \dd{t} + \pdv{h}{x} \dd{X_t} + \frac{1}{2} \pdvn{2}{h}{x} \dd{X_t}^2 \end{aligned}$$

But here we cannot neglect $\dd{X_t}^2$. We insert the definition of an Itō process:

$$\begin{aligned} \dd{Y_t} &= \pdv{h}{t} \dd{t} + \pdv{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big) + \frac{1}{2} \pdvn{2}{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big)^2 \\ &= \pdv{h}{t} \dd{t} + \pdv{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big) + \frac{1}{2} \pdvn{2}{h}{x} \Big( F_t^2 \dd{t}^2 + 2 F_t G_t \dd{t} \dd{B_t} + G_t^2 \dd{B_t}^2 \Big) \end{aligned}$$

In the limit of small $\dd{t}$, we can neglect $\dd{t}^2$, and as it turns out, $\dd{t} \dd{B_t}$ too:

$$\begin{aligned} \dd{t} \dd{B_t} &= (B_{t + \dd{t}} - B_t) \dd{t} \sim \dd{t} \mathcal{N}(0, \dd{t}) \sim \mathcal{N}(0, \dd{t}^3) \longrightarrow 0 \end{aligned}$$

However, due to the scaling property of $B_t$, we cannot ignore $\dd{B_t}^2$, which has order $\dd{t}$:

$$\begin{aligned} \dd{B_t}^2 &= (B_{t + \dd{t}} - B_t)^2 \sim \big( \mathcal{N}(0, \dd{t}) \big)^2 \sim \chi^2_1(\dd{t}) \longrightarrow \dd{t} \end{aligned}$$

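Where $\chi_1^2(\dd{t})$ is the generalized chi-squared distribution with one term of variance $\dd{t}$. Dropping the $\dd{t}^2$ and $\dd{t} \dd{B_t}$ terms, replacing $\dd{B_t}^2$ by $\dd{t}$, and collecting the terms proportional to $\dd{t}$ and $\dd{B_t}$ then yields Itō’s lemma.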
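The two scaling statements used in this derivation can be illustrated numerically by summing the increments over a fine grid: the sum of squared Brownian increments should approach the elapsed time, while the sum of the mixed terms should vanish. This is a rough sketch under assumed parameters (unit time interval, $10^5$ steps, a fixed seed), and the helper name `increment_sums` is hypothetical.

```python
import math
import random


def increment_sums(t_end, n_steps, rng):
    """Sum dB^2 and dt*dB over a uniform grid on [0, t_end]."""
    dt = t_end / n_steps
    sum_db_sq = 0.0
    sum_dt_db = 0.0
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        sum_db_sq += db * db   # each term has mean dt
        sum_dt_db += dt * db   # each term has mean 0 and variance dt^3
    return sum_db_sq, sum_dt_db


quad_var, cross_var = increment_sums(1.0, 100_000, random.Random(2))
print(quad_var)   # expected to be close to t_end = 1
print(cross_var)  # expected to be close to 0
```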

The most important application of Itō’s lemma is to perform coordinate transformations, to make the solution of a given Itō SDE easier.
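As a small worked illustration of Itō’s lemma itself, consider the simplest Itō process $X_t = B_t$, meaning $F_t = 0$ and $G_t = 1$, together with the example function $h(x) = x^2$ (both chosen here purely for illustration). Then $\partial h / \partial t = 0$, $\partial h / \partial x = 2 x$ and $\partial^2 h / \partial x^2 = 2$, so:

```latex
\begin{aligned}
    \mathrm{d}(B_t^2)
    &= \Big( 0 + 2 B_t \cdot 0 + \frac{1}{2} \cdot 2 \cdot 1^2 \Big) \mathrm{d}t
    + 2 B_t \cdot 1 \, \mathrm{d}B_t
    \\
    &= \mathrm{d}t + 2 B_t \, \mathrm{d}B_t
\end{aligned}
```

Integrating from $0$ to $t$ with $B_0 = 0$ gives $\int_0^t B_s \, \mathrm{d}B_s = (B_t^2 - t) / 2$. The extra term $-t/2$ has no counterpart in the classical chain rule.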

Coordinate transformations

The simplest coordinate transformation is a scaling of the time axis. Defining $s \equiv \alpha t$, the goal is to end up with another Itō process on the new axis. We know how to scale $B_t$, by setting $W_s \equiv \sqrt{\alpha} B_{s / \alpha}$. Let $Y_s \equiv X_t$ be the new variable on the rescaled axis, then:

$$\begin{aligned} \dd{Y_s} = \dd{X_t} &= f(X_t) \dd{t} + g(X_t) \dd{B_t} \\ &= \frac{1}{\alpha} f(Y_s) \dd{s} + \frac{1}{\sqrt{\alpha}} g(Y_s) \dd{W_s} \end{aligned}$$

$W_s$ is a valid Wiener process, and the drift and intensity are merely rescaled by the constant factors $1 / \alpha$ and $1 / \sqrt{\alpha}$, so this is still an Itō process.

To solve SDEs analytically, it is usually best to have additive noise, i.e. $g = 1$. This can be achieved using the Lamperti transform: define $Y_t \equiv h(X_t)$, where $h$ is given by:

$$\begin{aligned} \boxed{ h(x) = \int_{x_0}^x \frac{1}{g(y)} \dd{y} } \end{aligned}$$

Then, using Itō’s lemma, it is straightforward to show that the intensity becomes $1$. Note that the lower integration limit $x_0$ does not enter:

$$\begin{aligned} \dd{Y_t} &= \bigg( f(X_t) \: h'(X_t) + \frac{1}{2} g^2(X_t) \: h''(X_t) \bigg) \dd{t} + g(X_t) \: h'(X_t) \dd{B_t} \\ &= \bigg( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g^2(X_t) \frac{g'(X_t)}{g^2(X_t)} \bigg) \dd{t} + \frac{g(X_t)}{g(X_t)} \dd{B_t} \\ &= \bigg( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g'(X_t) \bigg) \dd{t} + \dd{B_t} \end{aligned}$$
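To make the Lamperti transform concrete, here is a worked sketch for geometric Brownian motion, which is an example chosen for illustration and not taken from the text above: assume $f(x) = \mu x$ and $g(x) = \sigma x$ with constants $\mu$ and $\sigma > 0$, and restrict to $x > 0$.

```latex
\begin{aligned}
    h(x)
    &= \int_{x_0}^x \frac{1}{\sigma y} \, \mathrm{d}y
    = \frac{1}{\sigma} \ln\!\Big( \frac{x}{x_0} \Big)
    \\
    \mathrm{d}Y_t
    &= \Big( \frac{\mu X_t}{\sigma X_t} - \frac{1}{2} \sigma \Big) \mathrm{d}t + \mathrm{d}B_t
    = \Big( \frac{\mu}{\sigma} - \frac{\sigma}{2} \Big) \mathrm{d}t + \mathrm{d}B_t
\end{aligned}
```

The transformed equation has constant coefficients, so $Y_t = Y_0 + (\mu / \sigma - \sigma / 2) \, t + B_t$. Inverting $h$ then gives $X_t = X_0 \exp\!\big( (\mu - \sigma^2 / 2) \, t + \sigma B_t \big)$, in which $x_0$ has indeed dropped out.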

Similarly, we can eliminate the drift, i.e. make $f = 0$, thereby making the Itō process a martingale. This is done by defining $Y_t \equiv h(X_t)$, with $h(x)$ given by:

$$\begin{aligned} \boxed{ h(x) = \int_{x_0}^x \exp\!\bigg( \!-\!\! \int_{x_1}^y \frac{2 f(z)}{g^2(z)} \dd{z} \bigg) \dd{y} } \end{aligned}$$

The goal is to make the parenthesized first term (see above) of Itō’s lemma disappear, which this $h(x)$ does indeed do. Note that $x_0$ and $x_1$ do not enter:

$$\begin{aligned} 0 &= f(x) \: h'(x) + \frac{1}{2} g^2(x) \: h''(x) \\ &= \Big( f(x) - \frac{1}{2} g^2(x) \frac{2 f(x)}{g^2(x)} \Big) \exp\!\bigg( \!-\!\! \int_{x_1}^x \frac{2 f(y)}{g^2(y)} \dd{y} \bigg) \end{aligned}$$
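As an illustration of this drift-removing transform, consider Brownian motion with constant drift, an example assumed here rather than taken from the text: $f(x) = \mu \neq 0$ and $g(x) = \sigma > 0$, with the convenient choice $x_0 = x_1 = 0$.

```latex
\begin{aligned}
    h(x)
    &= \int_0^x \exp\!\Big( -\frac{2 \mu y}{\sigma^2} \Big) \mathrm{d}y
    = \frac{\sigma^2}{2 \mu} \bigg( 1 - \exp\!\Big( -\frac{2 \mu x}{\sigma^2} \Big) \bigg)
    \\
    \mathrm{d}Y_t
    &= g(X_t) \, h'(X_t) \, \mathrm{d}B_t
    = \sigma \exp\!\Big( -\frac{2 \mu X_t}{\sigma^2} \Big) \mathrm{d}B_t
    = \sigma \Big( 1 - \frac{2 \mu}{\sigma^2} Y_t \Big) \mathrm{d}B_t
\end{aligned}
```

The transformed process $Y_t$ has no $\mathrm{d}t$ term, at the price of an intensity that now depends on $Y_t$.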

Existence and uniqueness

It is worth knowing under what condition a solution to a given SDE exists, in the sense that it is finite on the entire time axis. Suppose the drift $f$ and intensity $g$ satisfy these inequalities, for some known constant $K$ and for all $x$:

$$\begin{aligned} x f(x) \le K (1 + x^2) \qquad \quad g^2(x) \le K (1 + x^2) \end{aligned}$$

When this is satisfied, we can find the following upper bound on an Itō process $X_t$, which clearly implies that $X_t$ is finite for all $t$:

$$\begin{aligned} \boxed{ \mathbf{E}[X_t^2] \le \big(X_0^2 + 3 K t\big) \exp\!\big(3 K t\big) } \end{aligned}$$

If we define $Y_t \equiv X_t^2$, then Itō’s lemma tells us that the following holds:

$$\begin{aligned} \dd{Y_t} = \big( 2 X_t \: f(X_t) + g^2(X_t) \big) \dd{t} + 2 X_t \: g(X_t) \dd{B_t} \end{aligned}$$

Integrating and taking the expectation value removes the Wiener term, leaving:

$$\begin{aligned} \mathbf{E}[Y_t] = Y_0 + \mathbf{E}\! \int_0^t 2 X_s f(X_s) + g^2(X_s) \dd{s} \end{aligned}$$

Given that $K (1 \!+\! x^2)$ is an upper bound of $x f(x)$ and $g^2(x)$, we get an inequality:

$$\begin{aligned} \mathbf{E}[Y_t] &\le Y_0 + \mathbf{E}\! \int_0^t 2 K (1 \!+\! X_s^2) + K (1 \!+\! X_s^2) \dd{s} \\ &\le Y_0 + \int_0^t 3 K (1 + \mathbf{E}[Y_s]) \dd{s} \\ &\le Y_0 + 3 K t + \int_0^t 3 K \big( \mathbf{E}[Y_s] \big) \dd{s} \end{aligned}$$

We then apply the Grönwall-Bellman inequality, noting that $(Y_0 \!+\! 3 K t)$ does not decrease with time, leading us to:

$$\begin{aligned} \mathbf{E}[Y_t] &\le (Y_0 + 3 K t) \exp\!\bigg( \int_0^t 3 K \dd{s} \bigg) \\ &\le (Y_0 + 3 K t) \exp\!\big(3 K t\big) \end{aligned}$$
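The second-moment bound can be sanity-checked with a Monte Carlo estimate. The sketch below is only illustrative: it assumes the example coefficients $f(x) = x$ and $g(x) = 1$, which satisfy both growth conditions with $K = 1$, and it uses the Euler-Maruyama discretization, so the estimate carries both sampling and discretization error. The helper name `second_moment` is hypothetical.

```python
import math
import random


def second_moment(f, g, x0, t_end, n_steps, n_paths, rng):
    """Monte Carlo estimate of E[X_t^2] from Euler-Maruyama paths."""
    dt = t_end / n_steps
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            db = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
            x += f(x) * dt + g(x) * db
        total += x * x
    return total / n_paths


def drift(x):
    return x  # assumed example: x * f(x) = x^2 <= K (1 + x^2) with K = 1


def intensity(x):
    return 1.0  # assumed example: g^2(x) = 1 <= K (1 + x^2) with K = 1


K = 1.0
x0 = 1.0
t_end = 1.0
estimate = second_moment(drift, intensity, x0, t_end, 100, 2000, random.Random(3))
bound = (x0**2 + 3.0 * K * t_end) * math.exp(3.0 * K * t_end)
print(estimate, bound)
```

For this example the estimate should sit well below the bound, which is expected: the bound is designed to hold for every admissible $f$ and $g$, not to be tight for any particular one.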

If a solution exists, it is also worth knowing whether it is unique. Suppose that $f$ and $g$ satisfy the following inequalities, for some constant $K$ and for all $x$ and $y$:

$$\begin{aligned} \big| f(x) - f(y) \big| \le K \big| x - y \big| \qquad \quad \big| g(x) - g(y) \big| \le K \big| x - y \big| \end{aligned}$$

Let $X_t$ and $Y_t$ both be solutions to a given SDE, but the initial conditions need not be the same, such that the difference is initially $X_0 \!-\! Y_0$. Then the difference $X_t \!-\! Y_t$ is bounded by:

$$\begin{aligned} \boxed{ \mathbf{E}\big[ (X_t - Y_t)^2 \big] \le (X_0 - Y_0)^2 \exp\!\Big( \big(2 K \!+\! K^2 \big) t \Big) } \end{aligned}$$

We define $D_t \equiv X_t \!-\! Y_t$ and $Z_t \equiv D_t^2 \ge 0$, together with $F_t \equiv f(X_t) \!-\! f(Y_t)$ and $G_t \equiv g(X_t) \!-\! g(Y_t)$, such that Itō’s lemma states:

$$\begin{aligned} \dd{Z_t} = \big( 2 D_t F_t + G_t^2 \big) \dd{t} + 2 D_t G_t \dd{B_t} \end{aligned}$$

Integrating and taking the expectation value removes the Wiener term, leaving:

$$\begin{aligned} \mathbf{E}[Z_t] = Z_0 + \mathbf{E}\! \int_0^t 2 D_s F_s + G_s^2 \dd{s} \end{aligned}$$

The Cauchy-Schwarz inequality states that $|D_s F_s| \le |D_s| |F_s|$, and then the given fact that $F_s$ and $G_s$ satisfy $|F_s| \le K |D_s|$ and $|G_s| \le K |D_s|$ gives:

$$\begin{aligned} \mathbf{E}[Z_t] &\le Z_0 + \mathbf{E}\! \int_0^t 2 K D_s^2 + K^2 D_s^2 \dd{s} \\ &\le Z_0 + \int_0^t (2 K \!+\! K^2) \: \mathbf{E}[Z_s] \dd{s} \end{aligned}$$

Where we have implicitly used that $D_s F_s \le |D_s F_s|$, which holds for any real numbers, and that $|D_s|^2 = D_s^2$ because $D_s$ is real. We then apply the Grönwall-Bellman inequality, recognizing that $Z_0$ does not decrease with time (since it is constant):

$$\begin{aligned} \mathbf{E}[Z_t] &\le Z_0 \exp\!\bigg( \int_0^t 2 K \!+\! K^2 \dd{s} \bigg) \\ &\le Z_0 \exp\!\Big( \big( 2 K \!+\! K^2 \big) t \Big) \end{aligned}$$
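This bound can also be illustrated numerically by driving two solutions with the same Brownian increments and tracking their squared difference. The sketch below rests on several assumptions: the example coefficients $f(x) = -x$ and $g(x) = \sin(x)$, which are both Lipschitz with $K = 1$, the Euler-Maruyama discretization, and the hypothetical helper name `mean_squared_difference`.

```python
import math
import random


def mean_squared_difference(f, g, x0, y0, t_end, n_steps, n_paths, rng):
    """Estimate E[(X_t - Y_t)^2] for two solutions sharing one Brownian path."""
    dt = t_end / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, y = x0, y0
        for _ in range(n_steps):
            db = rng.gauss(0.0, math.sqrt(dt))  # same increment for both solutions
            x, y = x + f(x) * dt + g(x) * db, y + f(y) * dt + g(y) * db
        total += (x - y) ** 2
    return total / n_paths


def drift(x):
    return -x  # assumed example, Lipschitz with K = 1


def intensity(x):
    return math.sin(x)  # assumed example, Lipschitz with K = 1


K = 1.0
x0, y0 = 1.0, 0.5
t_end = 1.0
msd = mean_squared_difference(drift, intensity, x0, y0, t_end, 100, 1000, random.Random(4))
bound = (x0 - y0) ** 2 * math.exp((2.0 * K + K**2) * t_end)
print(msd, bound)
```

With identical initial conditions the two simulated paths coincide exactly, which is the discrete counterpart of uniqueness.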

Using these properties, it can then be shown that if all of the above conditions are satisfied, then the SDE has a unique solution, which is $\mathcal{F}_t$-adapted, continuous, and exists for all times.


  1. U.H. Thygesen, Lecture notes on diffusions and stochastic differential equations, 2021, Polyteknisk Kompendie.