---
title: "Itō calculus"
firstLetter: "I"
publishDate: 2021-11-06
categories:
- Mathematics
date: 2021-11-06T14:34:00+01:00
draft: false
markup: pandoc
---
# Itō calculus
Given two time-indexed [random variables](/know/concept/random-variable/)
(i.e. stochastic processes) $F_t$ and $G_t$,
consider the following random variable $X_t$,
where $B_t$ is the [Wiener process](/know/concept/wiener-process/):
$$\begin{aligned}
X_t
= X_0 + \int_0^t F_s \dd{s} + \int_0^t G_s \dd{B_s}
\end{aligned}$$
Where the latter is an [Itō integral](/know/concept/ito-integral/),
assuming $G_t$ is Itō-integrable.
We call $X_t$ an **Itō process** if $F_t$ is locally integrable,
and the initial condition $X_0$ is known,
i.e. $X_0$ is $\mathcal{F}_0$-measurable,
where $\mathcal{F}_t$ is the [filtration](/know/concept/sigma-algebra/)
to which $F_t$, $G_t$ and $B_t$ are adapted.
The above definition of $X_t$ is often abbreviated as follows,
where $X_0$ is implicit:
$$\begin{aligned}
\dd{X_t}
= F_t \dd{t} + G_t \dd{B_t}
\end{aligned}$$
Typically, $F_t$ is referred to as the **drift** of $X_t$,
and $G_t$ as its **intensity**.
Now, consider the following **Itō stochastic differential equation** (SDE),
where $\xi_t = \dv*{B_t}{t}$ is white noise:
$$\begin{aligned}
\dv{X_t}{t}
= f(X_t, t) + g(X_t, t) \: \xi_t
\end{aligned}$$
An Itō process $X_t$ is said to satisfy this equation
if $f(X_t, t) = F_t$ and $g(X_t, t) = G_t$,
in which case $X_t$ is also called an **Itō diffusion**.
Because the Itō integral of $G_t$ is a
[martingale](/know/concept/martingale/),
it does not contribute to the mean of $X_t$:
$$\begin{aligned}
\mathbf{E}[X_t]
= \mathbf{E}[X_0] + \int_0^t \mathbf{E}[F_s] \dd{s}
\end{aligned}$$
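The claim about the mean can be checked numerically. The following is a minimal sketch, assuming an Euler-Maruyama discretization (a standard scheme, not taken from the source) and a hypothetical test case with deterministic drift $\cos(t)$, constant intensity $0.5$ and fixed $X_0 = 2$, for which the mean at $t = 1$ should be close to $X_0 + \sin(1)$. All function names are illustrative.

```python
import math
import random


def euler_maruyama(f, g, x0, t_end, n_steps, rng):
    """Simulate one path of dX = f(X,t) dt + g(X,t) dB and return X at t_end."""
    dt = t_end / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        dB = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment ~ N(0, dt)
        x += f(x, t) * dt + g(x, t) * dB
        t += dt
    return x


def sample_mean(f, g, x0, t_end, n_steps=100, n_paths=4000, seed=1):
    """Monte Carlo estimate of E[X_t] at t = t_end."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += euler_maruyama(f, g, x0, t_end, n_steps, rng)
    return total / n_paths


def drift(x, t):
    return math.cos(t)  # hypothetical drift F_t, chosen for illustration


def intensity(x, t):
    return 0.5  # hypothetical constant intensity G_t


mean_estimate = sample_mean(drift, intensity, x0=2.0, t_end=1.0)
expected_mean = 2.0 + math.sin(1.0)  # X_0 plus the integral of cos(s) over [0, 1]
print(mean_estimate, expected_mean)
```

The two printed numbers should agree up to Monte Carlo and discretization error; the noise term changes the spread of the paths, but not their average.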
## Itō's lemma
Classically, given $y \equiv h(x(t), t)$,
the chain rule of differentiation states that:
$$\begin{aligned}
\dd{y}
= \pdv{h}{t} \dd{t} + \pdv{h}{x} \dd{x}
\end{aligned}$$
However, for a stochastic process $Y_t \equiv h(X_t, t)$,
where $X_t$ is an Itō process,
the chain rule is modified to the following,
known as **Itō's lemma**:
$$\begin{aligned}
\boxed{
\dd{Y_t}
= \pdv{h}{t} \dd{t} + \bigg( \pdv{h}{x} F_t + \frac{1}{2} G_t^2 \pdv[2]{h}{x} \bigg) \dd{t} + \pdv{h}{x} G_t \dd{B_t}
}
\end{aligned}$$
To derive this, we start by applying the classical chain rule,
but we go to second order in $x$.
This is also valid classically,
but there we would neglect all higher-order infinitesimals:
$$\begin{aligned}
\dd{Y_t}
= \pdv{h}{t} \dd{t} + \pdv{h}{x} \dd{X_t} + \frac{1}{2} \pdv[2]{h}{x} \dd{X_t}^2
\end{aligned}$$
But here we cannot neglect $\dd{X_t}^2$.
We insert the definition of an Itō process:
$$\begin{aligned}
\dd{Y_t}
&= \pdv{h}{t} \dd{t} + \pdv{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big) + \frac{1}{2} \pdv[2]{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big)^2
\\
&= \pdv{h}{t} \dd{t} + \pdv{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big)
+ \frac{1}{2} \pdv[2]{h}{x} \Big( F_t^2 \dd{t}^2 + 2 F_t G_t \dd{t} \dd{B_t} + G_t^2 \dd{B_t}^2 \Big)
\end{aligned}$$
In the limit of small $\dd{t}$, we can neglect $\dd{t}^2$,
and as it turns out, $\dd{t} \dd{B_t}$ too:
$$\begin{aligned}
\dd{t} \dd{B_t}
&= (B_{t + \dd{t}} - B_t) \dd{t}
\sim \dd{t} \mathcal{N}(0, \dd{t})
\sim \mathcal{N}(0, \dd{t}^3)
\longrightarrow 0
\end{aligned}$$
However, due to the scaling property of $B_t$,
we cannot ignore $\dd{B_t}^2$, which has order $\dd{t}$:
$$\begin{aligned}
\dd{B_t}^2
&= (B_{t + \dd{t}} - B_t)^2
\sim \big( \mathcal{N}(0, \dd{t}) \big)^2
\sim \chi^2_1(\dd{t})
\longrightarrow \dd{t}
\end{aligned}$$
Where $\chi_1^2(\dd{t})$ is the generalized chi-squared distribution
with one term of variance $\dd{t}$.
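These two limits can be illustrated numerically. The sketch below (illustrative names, arbitrary seed and step count) sums $\dd{B_t}^2$ and $\dd{t} \dd{B_t}$ over a fine partition of $[0, t]$: the first sum should be close to $t$, and the second close to $0$.

```python
import math
import random


def increment_sums(t_end, n_steps, seed=0):
    """Return (sum of dB^2, sum of dt*dB) over a uniform partition of [0, t_end]."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    sum_dB2 = 0.0
    sum_dtdB = 0.0
    for _ in range(n_steps):
        dB = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment ~ N(0, dt)
        sum_dB2 += dB * dB
        sum_dtdB += dt * dB
    return sum_dB2, sum_dtdB


quad_var, cross_var = increment_sums(t_end=2.0, n_steps=100_000)
print(quad_var, cross_var)
```

Increasing `n_steps` should make `quad_var` concentrate more tightly around `t_end`, which is the sense in which $\dd{B_t}^2$ behaves like $\dd{t}$.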
The most important application of Itō's lemma
is to perform coordinate transformations,
to make the solution of a given Itō SDE easier.
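As a quick illustration of Itō's lemma (an example chosen here, not taken from the source), take $X_t = B_t$, so that $F_t = 0$ and $G_t = 1$, and let $h(x) = x^2$:
$$\begin{aligned}
\dd{(B_t^2)}
= \bigg( 2 B_t \cdot 0 + \frac{1}{2} \cdot 1^2 \cdot 2 \bigg) \dd{t} + 2 B_t \dd{B_t}
= \dd{t} + 2 B_t \dd{B_t}
\end{aligned}$$
Integrating from $0$ to $t$ with $B_0 = 0$ then gives $\int_0^t B_s \dd{B_s} = (B_t^2 - t) / 2$,
where the term $-t/2$, absent from the classical result, comes from $\dd{B_t}^2$.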
## Coordinate transformations
The simplest coordinate transformation is a scaling of the time axis.
Defining $s \equiv \alpha t$ for a constant $\alpha > 0$,
the goal is to obtain an Itō process on the rescaled time axis.
We know how to scale $B_t$: by setting $W_s \equiv \sqrt{\alpha} B_{s / \alpha}$.
Let $Y_s \equiv X_t$ be the new variable on the rescaled axis, then:
$$\begin{aligned}
\dd{Y_s}
= \dd{X_t}
&= f(X_t) \dd{t} + g(X_t) \dd{B_t}
\\
&= \frac{1}{\alpha} f(Y_s) \dd{s} + \frac{1}{\sqrt{\alpha}} g(Y_s) \dd{W_s}
\end{aligned}$$
$W_s$ is a valid Wiener process,
and the other changes are only the constant factors $1/\alpha$ and $1/\sqrt{\alpha}$,
so this is still an Itō process.
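As a small sanity check of the rescaling, the sketch below samples $W_s = \sqrt{\alpha} B_{s / \alpha}$ at a single time $s$, using only that $B_t \sim \mathcal{N}(0, t)$, and estimates its mean and variance, which should be close to $0$ and $s$. The values of $\alpha$ and $s$ are arbitrary, and this only tests the one-time marginal distribution, not the full process.

```python
import math
import random


def rescaled_wiener_samples(alpha, s, n_samples, seed=0):
    """Draw samples of W_s = sqrt(alpha) * B_{s/alpha}, with B_t ~ N(0, t)."""
    rng = random.Random(seed)
    t = s / alpha  # original time corresponding to rescaled time s
    return [
        math.sqrt(alpha) * rng.gauss(0.0, math.sqrt(t))
        for _ in range(n_samples)
    ]


samples = rescaled_wiener_samples(alpha=4.0, s=3.0, n_samples=50_000)
mean_w = sum(samples) / len(samples)
var_w = sum((w - mean_w) ** 2 for w in samples) / len(samples)
print(mean_w, var_w)
```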
To solve SDEs analytically, it is usually best
to have additive noise, i.e. an intensity that does not depend on $X_t$, ideally $g = 1$.
This can be achieved using the **Lamperti transform**:
define $Y_t \equiv h(X_t)$, where $h$ is given by:
$$\begin{aligned}
\boxed{
h(x)
= \int_{x_0}^x \frac{1}{g(y)} \dd{y}
}
\end{aligned}$$
Then, using Itō's lemma, it is straightforward
to show that the intensity becomes $1$.
Note that the lower integration limit $x_0$ does not enter:
$$\begin{aligned}
\dd{Y_t}
&= \bigg( f(X_t) \: h'(X_t) + \frac{1}{2} g^2(X_t) \: h''(X_t) \bigg) \dd{t} + g(X_t) \: h'(X_t) \dd{B_t}
\\
&= \bigg( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g^2(X_t) \frac{g'(X_t)}{g^2(X_t)} \bigg) \dd{t} + \frac{g(X_t)}{g(X_t)} \dd{B_t}
\\
&= \bigg( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g'(X_t) \bigg) \dd{t} + \dd{B_t}
\end{aligned}$$
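As a worked example (chosen here for illustration, not taken from the source), consider geometric Brownian motion with assumed constants $r$ and $\sigma > 0$, i.e. $f(x) = r x$ and $g(x) = \sigma x$ for $x > 0$. Then:
$$\begin{aligned}
h(x)
&= \int_{x_0}^x \frac{1}{\sigma y} \dd{y}
= \frac{1}{\sigma} \ln\!\Big( \frac{x}{x_0} \Big)
\\
\dd{Y_t}
&= \bigg( \frac{r X_t}{\sigma X_t} - \frac{1}{2} \sigma \bigg) \dd{t} + \dd{B_t}
= \bigg( \frac{r}{\sigma} - \frac{\sigma}{2} \bigg) \dd{t} + \dd{B_t}
\end{aligned}$$
The transformed process is a Wiener process with constant drift,
so $Y_t = Y_0 + (r / \sigma - \sigma / 2) \: t + B_t$,
and inverting $h$ gives $X_t = X_0 \exp\!\big( (r - \sigma^2 / 2) \: t + \sigma B_t \big)$.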
Similarly, we can eliminate the drift, i.e. obtain $f = 0$,
thereby making the Itō process a martingale.
This is done by defining $Y_t \equiv h(X_t)$, with $h(x)$ given by:
$$\begin{aligned}
\boxed{
h(x)
= \int_{x_0}^x \exp\!\bigg( \!-\!\! \int_{x_1}^y \frac{2 f(z)}{g^2(z)} \dd{z} \bigg) \dd{y}
}
\end{aligned}$$
The goal is to make the parenthesized first term (see above)
of Itō's lemma disappear, which this $h(x)$ does indeed do.
Note that $x_0$ and $x_1$ do not enter:
$$\begin{aligned}
0
&= f(x) \: h'(x) + \frac{1}{2} g^2(x) \: h''(x)
\\
&= \Big( f(x) - \frac{1}{2} g^2(x) \frac{2 f(x)}{g^2(x)} \Big) \exp\!\bigg( \!-\!\! \int_{x_1}^x \frac{2 f(y)}{g^2(y)} \dd{y} \bigg)
\end{aligned}$$
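To make this concrete, here is a worked example (chosen for illustration, not taken from the source) with $f(x) = r x$ and $g(x) = \sigma x$ for $x > 0$, where $r$ and $\sigma > 0$ are assumed constants. The derivative of $h$ and the resulting drift are:
$$\begin{aligned}
h'(x)
&= \exp\!\bigg( \!-\!\! \int_{x_1}^x \frac{2 r}{\sigma^2 y} \dd{y} \bigg)
= \Big( \frac{x}{x_1} \Big)^{-2 r / \sigma^2}
\\
f(x) \: h'(x) + \frac{1}{2} g^2(x) \: h''(x)
&= \bigg( r x - \frac{1}{2} \sigma^2 x^2 \frac{2 r}{\sigma^2 x} \bigg) \Big( \frac{x}{x_1} \Big)^{-2 r / \sigma^2}
= 0
\end{aligned}$$
So $Y_t = h(X_t)$ has no drift, and only the noise term
$\dd{Y_t} = \sigma X_t \: (X_t / x_1)^{-2 r / \sigma^2} \dd{B_t}$ remains.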
## References
1. U.H. Thygesen,
*Lecture notes on diffusions and stochastic differential equations*,
2021, Polyteknisk Kompendie.