author    | Prefetch | 2021-11-06 21:47:08 +0100
committer | Prefetch | 2021-11-06 21:47:08 +0100
commit    | f091bf0922c26238d16bf175a8ea916a16d11fba (patch)
tree      | 307ace9fde0b408f45fdc55bc8926fc15d8df7c6
parent    | a17363fa734518ada98fc3e79c9fd20f70e42f1b (diff)
Expand knowledge base
-rw-r--r-- | content/know/concept/conditional-expectation/index.pdc | 69
-rw-r--r-- | content/know/concept/ito-calculus/index.pdc | 215
-rw-r--r-- | content/know/concept/ito-integral/index.pdc | 274
-rw-r--r-- | content/know/concept/martingale/index.pdc | 2
-rw-r--r-- | content/know/concept/random-variable/index.pdc | 22
-rw-r--r-- | content/know/concept/sigma-algebra/index.pdc | 2
-rw-r--r-- | content/know/concept/wiener-process/index.pdc | 2
7 files changed, 539 insertions, 47 deletions
diff --git a/content/know/concept/conditional-expectation/index.pdc b/content/know/concept/conditional-expectation/index.pdc index 7da7660..5bcc152 100644 --- a/content/know/concept/conditional-expectation/index.pdc +++ b/content/know/concept/conditional-expectation/index.pdc @@ -13,17 +13,17 @@ markup: pandoc # Conditional expectation -Recall that the expectation value $\mathbf{E}(X)$ +Recall that the expectation value $\mathbf{E}[X]$ of a [random variable](/know/concept/random-variable/) $X$ is a function of the probability space $(\Omega, \mathcal{F}, P)$ on which $X$ is defined, and the definition of $X$ itself. -The **conditional expectation** $\mathbf{E}(X|A)$ +The **conditional expectation** $\mathbf{E}[X|A]$ is the expectation value of $X$ given that an event $A$ has occurred, i.e. only the outcomes $\omega \in \Omega$ satisfying $\omega \in A$ should be considered. -If $A$ is obtained by observing another variable, -then $\mathbf{E}(X|A)$ is a random variable in its own right. +If $A$ is obtained by observing a variable, +then $\mathbf{E}[X|A]$ is a random variable in its own right. Consider two random variables $X$ and $Y$ on the same probability space $(\Omega, \mathcal{F}, P)$, @@ -33,7 +33,7 @@ then the conditional expectation of $X$ given the event $Y = y$ is as follows: $$\begin{aligned} - \mathbf{E}(X | Y \!=\! y) + \mathbf{E}[X | Y \!=\! y] = \sum_{x} x \: Q(X \!=\! x) \qquad \quad Q(X \!=\! x) @@ -43,12 +43,12 @@ $$\begin{aligned} Where $Q$ is a renormalized probability function, which assigns zero to all events incompatible with $Y = y$. If we allow $\Omega$ to be continuous, -then from the definition $\mathbf{E}(X)$, +then from the definition $\mathbf{E}[X]$, we know that the following Lebesgue integral can be used, which we call $f(y)$: $$\begin{aligned} - \mathbf{E}(X | Y \!=\! y) + \mathbf{E}[X | Y \!=\! y] = f(y) = \int_\Omega X(\omega) \dd{Q(\omega)} \end{aligned}$$ @@ -60,7 +60,7 @@ Sticking with the assumption $P(Y \!=\! 
y) > 0$, notice that: $$\begin{aligned} f(y) = \frac{1}{P(Y \!=\! y)} \int_\Omega X(\omega) \dd{P(\omega \cap Y \!=\! y)} - = \frac{\mathbf{E}(X \cdot I(Y \!=\! y))}{P(Y \!=\! y)} + = \frac{\mathbf{E}[X \cdot I(Y \!=\! y)]}{P(Y \!=\! y)} \end{aligned}$$ Where $I$ is the indicator function, @@ -68,33 +68,33 @@ equal to $1$ if its argument is true, and $0$ if not. Multiplying the definition of $f(y)$ by $P(Y \!=\! y)$ then leads us to: $$\begin{aligned} - \mathbf{E}(X \cdot I(Y \!=\! y)) + \mathbf{E}[X \cdot I(Y \!=\! y)] &= f(y) \cdot P(Y \!=\! y) \\ - &= \mathbf{E}(f(Y) \cdot I(Y \!=\! y)) + &= \mathbf{E}[f(Y) \cdot I(Y \!=\! y)] \end{aligned}$$ Recall that because $Y$ is a random variable, -$\mathbf{E}(X|Y) = f(Y)$ is too. +$\mathbf{E}[X|Y] = f(Y)$ is too. In other words, $f$ maps $Y$ to another random variable, which, due to the *Doob-Dynkin lemma* (see [$\sigma$-algebra](/know/concept/sigma-algebra/)), -must mean that $\mathbf{E}(X|Y)$ is measurable with respect to $\sigma(Y)$. +must mean that $\mathbf{E}[X|Y]$ is measurable with respect to $\sigma(Y)$. Intuitively, this makes some sense: -$\mathbf{E}(X|Y)$ cannot contain more information about events +$\mathbf{E}[X|Y]$ cannot contain more information about events than the $Y$ it was calculated from. This suggests a straightforward generalization of the above: instead of a specific value $Y = y$, we can condition on *any* information from $Y$. If $\mathcal{H} = \sigma(Y)$ is the information generated by $Y$, -then the conditional expectation $\mathbf{E}(X|\mathcal{H}) = Z$ +then the conditional expectation $\mathbf{E}[X|\mathcal{H}] = Z$ is $\mathcal{H}$-measurable, and given by a $Z$ satisfying: $$\begin{aligned} \boxed{ - \mathbf{E}\big(X \cdot I(H)\big) - = \mathbf{E}\big(Z \cdot I(H)\big) + \mathbf{E}\big[X \cdot I(H)\big] + = \mathbf{E}\big[Z \cdot I(H)\big] } \end{aligned}$$ @@ -102,52 +102,55 @@ For any $H \in \mathcal{H}$. 
Note that $Z$ is almost surely unique: *almost* because it could take any value for an event $A$ with zero probability $P(A) = 0$. Fortunately, if there exists a continuous $f$ -such that $\mathbf{E}(X | \sigma(Y)) = f(Y)$, -then $Z = \mathbf{E}(X | \sigma(Y))$ is unique. +such that $\mathbf{E}[X | \sigma(Y)] = f(Y)$, +then $Z = \mathbf{E}[X | \sigma(Y)]$ is unique. + + +## Properties A conditional expectation defined in this way has many useful properties, most notably linearity: -$\mathbf{E}(aX \!+\! bY | \mathcal{H}) = a \mathbf{E}(X|\mathcal{H}) + b \mathbf{E}(Y|\mathcal{H})$ +$\mathbf{E}[aX \!+\! bY | \mathcal{H}] = a \mathbf{E}[X|\mathcal{H}] + b \mathbf{E}[Y|\mathcal{H}]$ for any $a, b \in \mathbb{R}$. The **tower property** states that if $\mathcal{F} \supset \mathcal{G} \supset \mathcal{H}$, -then $\mathbf{E}(\mathbf{E}(X|\mathcal{G})|\mathcal{H}) = \mathbf{E}(X|\mathcal{H})$. +then $\mathbf{E}[\mathbf{E}[X|\mathcal{G}]|\mathcal{H}] = \mathbf{E}[X|\mathcal{H}]$. Intuitively, this works as follows: suppose person $G$ knows more about $X$ than person $H$, -then $\mathbf{E}(X | \mathcal{H})$ is $H$'s expectation, -$\mathbf{E}(X | \mathcal{G})$ is $G$'s "better" expectation, -and then $\mathbf{E}(\mathbf{E}(X|\mathcal{G})|\mathcal{H})$ +then $\mathbf{E}[X | \mathcal{H}]$ is $H$'s expectation, +$\mathbf{E}[X | \mathcal{G}]$ is $G$'s "better" expectation, +and then $\mathbf{E}[\mathbf{E}[X|\mathcal{G}]|\mathcal{H}]$ is $H$'s prediction about what $G$'s expectation will be. However, $H$ does not have access to $G$'s extra information, -so $H$'s best prediction is simply $\mathbf{E}(X | \mathcal{H})$. +so $H$'s best prediction is simply $\mathbf{E}[X | \mathcal{H}]$. The **law of total expectation** says that -$\mathbf{E}(\mathbf{E}(X | \mathcal{G})) = \mathbf{E}(X)$, +$\mathbf{E}[\mathbf{E}[X | \mathcal{G}]] = \mathbf{E}[X]$, and follows from the above tower property by choosing $\mathcal{H}$ to contain no information: $\mathcal{H} = \{ \varnothing, \Omega \}$. 
-Another useful property is that $\mathbf{E}(X | \mathcal{H}) = X$ +Another useful property is that $\mathbf{E}[X | \mathcal{H}] = X$ if $X$ is $\mathcal{H}$-measurable. In other words, if $\mathcal{H}$ already contains all the information extractable from $X$, then we know $X$'s exact value. Conveniently, this can easily be generalized to products: -$\mathbf{E}(XY | \mathcal{H}) = X \mathbf{E}(Y | \mathcal{H})$ +$\mathbf{E}[XY | \mathcal{H}] = X \mathbf{E}[Y | \mathcal{H}]$ if $X$ is $\mathcal{H}$-measurable: since $X$'s value is known, it can simply be factored out. Armed with this definition of conditional expectation, we can define other conditional quantities, -such as the **conditional variance** $\mathbf{V}(X | \mathcal{H})$: +such as the **conditional variance** $\mathbf{V}[X | \mathcal{H}]$: $$\begin{aligned} - \mathbf{V}(X | \mathcal{H}) - = \mathbf{E}(X^2 | \mathcal{H}) - \big(\mathbf{E}(X | \mathcal{H})\big)^2 + \mathbf{V}[X | \mathcal{H}] + = \mathbf{E}[X^2 | \mathcal{H}] - \big[\mathbf{E}[X | \mathcal{H}]\big]^2 \end{aligned}$$ The **law of total variance** then states that -$\mathbf{V}(X) = \mathbf{E}(\mathbf{V}(X | \mathcal{H})) + \mathbf{V}(\mathbf{E}(X | \mathcal{H}))$. +$\mathbf{V}[X] = \mathbf{E}[\mathbf{V}[X | \mathcal{H}]] + \mathbf{V}[\mathbf{E}[X | \mathcal{H}]]$. Likewise, we can define the **conditional probability** $P$, **conditional distribution function** $F_{X|\mathcal{H}}$, @@ -156,7 +159,7 @@ like their non-conditional counterparts: $$\begin{aligned} P(A | \mathcal{H}) - = \mathbf{E}(I(A) | \mathcal{H}) + = \mathbf{E}[I(A) | \mathcal{H}] \qquad F_{X|\mathcal{H}}(x) = P(X \le x | \mathcal{H}) @@ -168,6 +171,6 @@ $$\begin{aligned} ## References -1. U.F. Thygesen, +1. U.H. Thygesen, *Lecture notes on diffusions and stochastic differential equations*, 2021, Polyteknisk Kompendie. 
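The identities added in this commit (the laws of total expectation and total variance) are easy to sanity-check numerically. Below is a hypothetical sketch, not part of the commit, using NumPy with a discrete $Y$ so that $\mathbf{E}[X|Y]$ and $\mathbf{V}[X|Y]$ can be estimated by averaging within each event $Y = y$ (the variables and parameters are illustrative choices):

```python
import numpy as np

# Hedged sanity check of the laws of total expectation and total variance.
# Y is discrete, so E[X|Y] can be estimated by averaging within each {Y = y}.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=100_000)             # Y uniform on {0, 1, 2}
x = 2.0 * y + rng.normal(0.0, 1.0, size=y.size)  # X depends on Y, plus noise

# Per-event conditional mean and variance estimates:
f = {v: x[y == v].mean() for v in (0, 1, 2)}
g = {v: x[y == v].var() for v in (0, 1, 2)}
e_cond = np.array([f[v] for v in y])  # the random variable E[X|Y] = f(Y)
v_cond = np.array([g[v] for v in y])  # the random variable V[X|Y]

# Law of total expectation: E[E[X|Y]] = E[X]
print(np.isclose(e_cond.mean(), x.mean()))
# Law of total variance: V[X] = E[V[X|Y]] + V[E[X|Y]]
print(np.isclose(x.var(), v_cond.mean() + e_cond.var()))
```

Both checks hold up to floating-point error, since the within-group/between-group decomposition of the empirical variance is the finite-sample analogue of the law of total variance.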
diff --git a/content/know/concept/ito-calculus/index.pdc b/content/know/concept/ito-calculus/index.pdc new file mode 100644 index 0000000..576e09a --- /dev/null +++ b/content/know/concept/ito-calculus/index.pdc @@ -0,0 +1,215 @@ +--- +title: "Itō calculus" +firstLetter: "I" +publishDate: 2021-11-06 +categories: +- Mathematics + +date: 2021-11-06T14:34:00+01:00 +draft: false +markup: pandoc +--- + +# Itō calculus + +Given two time-indexed [random variables](/know/concept/random-variable/) +(i.e. stochastic processes) $F_t$ and $G_t$, +then consider the following random variable $X_t$, +where $B_t$ is the [Wiener process](/know/concept/wiener-process/): + +$$\begin{aligned} + X_t + = X_0 + \int_0^t F_s \dd{s} + \int_0^t G_s \dd{B_s} +\end{aligned}$$ + +Where the latter is an [Itō integral](/know/concept/ito-integral/), +assuming $G_t$ is Itō-integrable. +We call $X_t$ an **Itō process** if $F_t$ is locally integrable, +and the initial condition $X_0$ is known, +i.e. $X_0$ is $\mathcal{F}_0$-measurable, +where $\mathcal{F}_t$ is the [filtration](/know/concept/sigma-algebra/) +to which $F_t$, $G_t$ and $B_t$ are adapted. +The above definition of $X_t$ is often abbreviated as follows, +where $X_0$ is implicit: + +$$\begin{aligned} + \dd{X_t} + = F_t \dd{t} + G_t \dd{B_t} +\end{aligned}$$ + +Typically, $F_t$ is referred to as the **drift** of $X_t$, +and $G_t$ as its **intensity**. +Now, consider the following **Itō stochastic differential equation** (SDE), +where $\xi_t = \dv*{B_t}{t}$ is white noise: + +$$\begin{aligned} + \dv{X_t}{t} + = f(X_t, t) + g(X_t, t) \: \xi_t +\end{aligned}$$ + +An Itō process $X_t$ is said to satisfy this equation +if $f(X_t, t) = F_t$ and $g(X_t, t) = G_t$, +in which case $X_t$ is also called an **Itō diffusion**. 
+ +Because the Itō integral of $G_t$ is a +[martingale](/know/concept/martingale/), +it does not contribute to the mean of $X_t$: + +$$\begin{aligned} + \mathbf{E}[X_t] + = \mathbf{E}[X_0] + \int_0^t \mathbf{E}[F_s] \dd{s} +\end{aligned}$$ + + +## Itō's lemma + +Classically, given $y \equiv h(x(t), t)$, +the chain rule of differentiation states that: + +$$\begin{aligned} + \dd{y} + = \pdv{h}{t} \dd{t} + \pdv{h}{x} \dd{x} +\end{aligned}$$ + +However, for a stochastic process $Y_t \equiv h(X_t, t)$, +where $X_t$ is an Itō process, +the chain rule is modified to the following, +known as **Itō's lemma**: + +$$\begin{aligned} + \boxed{ + \dd{Y_t} + = \pdv{h}{t} \dd{t} + \bigg( \pdv{h}{x} F_t + \frac{1}{2} G_t^2 \pdv[2]{h}{x} \bigg) \dd{t} + \pdv{h}{x} G_t \dd{B_t} + } +\end{aligned}$$ + +<div class="accordion"> +<input type="checkbox" id="proof-lemma"/> +<label for="proof-lemma">Proof</label> +<div class="hidden"> +<label for="proof-lemma">Proof.</label> +We start by applying the classical chain rule, +but we go to second order in $x$. +This is also valid classically, +but there we would neglect all higher-order infinitesimals: + +$$\begin{aligned} + \dd{Y_t} + = \pdv{h}{t} \dd{t} + \pdv{h}{x} \dd{X_t} + \frac{1}{2} \pdv[2]{h}{x} \dd{X_t}^2 +\end{aligned}$$ + +But here we cannot neglect $\dd{X_t}^2$. 
+We insert the definition of an Itō process: + +$$\begin{aligned} + \dd{Y_t} + &= \pdv{h}{t} \dd{t} + \pdv{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big) + \frac{1}{2} \pdv[2]{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big)^2 + \\ + &= \pdv{h}{t} \dd{t} + \pdv{h}{x} \Big( F_t \dd{t} + G_t \dd{B_t} \Big) + + \frac{1}{2} \pdv[2]{h}{x} \Big( F_t^2 \dd{t}^2 + 2 F_t G_t \dd{t} \dd{B_t} + G_t^2 \dd{B_t}^2 \Big) +\end{aligned}$$ + +In the limit of small $\dd{t}$, we can neglect $\dd{t}^2$, +and as it turns out, $\dd{t} \dd{B_t}$ too: + +$$\begin{aligned} + \dd{t} \dd{B_t} + &= (B_{t + \dd{t}} - B_t) \dd{t} + \sim \dd{t} \mathcal{N}(0, \dd{t}) + \sim \mathcal{N}(0, \dd{t}^3) + \longrightarrow 0 +\end{aligned}$$ + +However, due to the scaling property of $B_t$, +we cannot ignore $\dd{B_t}^2$, which has order $\dd{t}$: + +$$\begin{aligned} + \dd{B_t}^2 + &= (B_{t + \dd{t}} - B_t)^2 + \sim \big( \mathcal{N}(0, \dd{t}) \big)^2 + \sim \chi^2_1(\dd{t}) + \longrightarrow \dd{t} +\end{aligned}$$ + +Where $\chi_1^2(\dd{t})$ is the generalized chi-squared distribution +with one term of variance $\dd{t}$. +</div> +</div> + +The most important application of Itō's lemma +is to perform coordinate transformations, +to make the solution of a given Itō SDE easier. + + +## Coordinate transformations + +The simplest coordinate transformation is a scaling of the time axis. +Defining $s \equiv \alpha t$, the goal is to preserve the Itō process structure. +We know how to scale $B_t$, by setting $W_s \equiv \sqrt{\alpha} B_{s / \alpha}$. +Let $Y_s \equiv X_t$ be the new variable on the rescaled axis, then: + +$$\begin{aligned} + \dd{Y_s} + = \dd{X_t} + &= f(X_t) \dd{t} + g(X_t) \dd{B_t} + \\ + &= \frac{1}{\alpha} f(Y_s) \dd{s} + \frac{1}{\sqrt{\alpha}} g(Y_s) \dd{W_s} +\end{aligned}$$ + +$W_s$ is a valid Wiener process, +and the other changes are small, +so this is still an Itō process. + +To solve SDEs analytically, it is usually best +to have additive noise, i.e. $g = 1$. 
+This can be achieved using the **Lamperti transform**: +define $Y_t \equiv h(X_t)$, where $h$ is given by: + +$$\begin{aligned} + \boxed{ + h(x) + = \int_{x_0}^x \frac{1}{g(y)} \dd{y} + } +\end{aligned}$$ + +Then, using Itō's lemma, it is straightforward +to show that the intensity becomes $1$. +Note that the lower integration limit $x_0$ does not enter: + +$$\begin{aligned} + \dd{Y_t} + &= \bigg( f(X_t) \: h'(X_t) + \frac{1}{2} g^2(X_t) \: h''(X_t) \bigg) \dd{t} + g(X_t) \: h'(X_t) \dd{B_t} + \\ + &= \bigg( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g^2(X_t) \frac{g'(X_t)}{g^2(X_t)} \bigg) \dd{t} + \frac{g(X_t)}{g(X_t)} \dd{B_t} + \\ + &= \bigg( \frac{f(X_t)}{g(X_t)} - \frac{1}{2} g'(X_t) \bigg) \dd{t} + \dd{B_t} +\end{aligned}$$ + +Similarly, we can eliminate the drift, setting $f = 0$, +thereby making the Itō process a martingale. +This is done by defining $Y_t \equiv h(X_t)$, with $h(x)$ given by: + +$$\begin{aligned} + \boxed{ + h(x) + = \int_{x_0}^x \exp\!\bigg( \!-\!\! \int_{x_1}^z \frac{2 f(y)}{g^2(y)} \dd{y} \bigg) \dd{z} + } +\end{aligned}$$ + +The goal is to make the parenthesized first term (see above) +of Itō's lemma disappear, which this $h(x)$ does indeed do. +Note that $x_0$ and $x_1$ do not enter: + +$$\begin{aligned} + 0 + &= f(x) \: h'(x) + \frac{1}{2} g^2(x) \: h''(x) + \\ + &= \Big( f(x) - \frac{1}{2} g^2(x) \frac{2 f(x)}{g^2(x)} \Big) \exp\!\bigg( \!-\!\! \int_{x_1}^x \frac{2 f(y)}{g^2(y)} \dd{y} \bigg) +\end{aligned}$$ + + + +## References +1. U.H. Thygesen, + *Lecture notes on diffusions and stochastic differential equations*, + 2021, Polyteknisk Kompendie. 
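The mean behaviour of an Itō diffusion can be checked by simulation: since the martingale part leaves no trace in the mean, the SDE $\dd{X_t} = \mu X_t \dd{t} + \sigma X_t \dd{B_t}$ (geometric Brownian motion) gives $\dv*{\mathbf{E}[X_t]}{t} = \mu \mathbf{E}[X_t]$, hence $\mathbf{E}[X_t] = X_0 e^{\mu t}$. A minimal sketch, not part of the commit, with illustrative parameter choices, using the Euler-Maruyama discretization:

```python
import numpy as np

# Hedged sketch: Euler-Maruyama for dX = mu*X dt + sigma*X dB (GBM).
# mu, sigma, x0, T and the step/path counts are illustrative choices.
rng = np.random.default_rng(1)
mu, sigma, x0 = 0.5, 0.3, 1.0
T, n_steps, n_paths = 1.0, 1000, 20_000
h = T / n_steps

x = np.full(n_paths, x0)
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(h), size=n_paths)  # increments ~ N(0, h)
    x += mu * x * h + sigma * x * dB

# The noise term does not shift the mean: E[X_T] = x0 * exp(mu*T).
print(np.isclose(x.mean(), x0 * np.exp(mu * T), rtol=0.02))
```

The tolerance is generous relative to the Monte-Carlo standard error and the $O(h)$ weak bias of the scheme.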
diff --git a/content/know/concept/ito-integral/index.pdc b/content/know/concept/ito-integral/index.pdc new file mode 100644 index 0000000..ec49189 --- /dev/null +++ b/content/know/concept/ito-integral/index.pdc @@ -0,0 +1,274 @@ +--- +title: "Itō integral" +firstLetter: "I" +publishDate: 2021-11-06 +categories: +- Mathematics + +date: 2021-10-21T19:41:58+02:00 +draft: false +markup: pandoc +--- + +# Itō integral + +The **Itō integral** offers a way to integrate +a time-indexed [random variable](/know/concept/random-variable/) +$G_t$ (i.e. a stochastic process) with respect +to a [Wiener process](/know/concept/wiener-process/) $B_t$, +which is also a stochastic process. +The Itō integral $I_t$ of $G_t$ is defined as follows: + +$$\begin{aligned} + \boxed{ + I_t + \equiv \int_a^b G_t \dd{B_t} + \equiv \lim_{h \to 0} \sum_{t = a}^{t = b} G_t \big(B_{t + h} - B_t\big) + } +\end{aligned}$$ + +Where we have partitioned the time interval $[a, b]$ into steps of size $h$. +The above integral exists if $G_t$ and $B_t$ are adapted +to a common [filtration](/know/concept/sigma-algebra/) $\mathcal{F}_t$, +and $\mathbf{E}[G_t^2]$ is integrable for $t \in [a, b]$. +If $I_t$ exists, $G_t$ is said to be **Itō-integrable** with respect to $B_t$. 
+ + +## Motivation + +Consider the following simple first-order differential equation for $X_t$, +for some function $f$: + +$$\begin{aligned} + \dv{X_t}{t} + = f(X_t) +\end{aligned}$$ + +This can be solved numerically using the explicit Euler scheme +by discretizing it with step size $h$, +which can be applied recursively, leading to: + +$$\begin{aligned} + X_{t+h} + \approx X_{t} + f(X_t) \: h + \quad \implies \quad + X_t + \approx X_0 + \sum_{s = 0}^{s = t} f(X_s) \: h +\end{aligned}$$ + +In the limit $h \to 0$, this leads to the following unsurprising integral for $X_t$: + +$$\begin{aligned} + \int_0^t f(X_s) \dd{s} + = \lim_{h \to 0} \sum_{s = 0}^{s = t} f(X_s) \: h +\end{aligned}$$ + +In contrast, consider the *stochastic differential equation* below, +where $\xi_t$ represents white noise, +which is informally the $t$-derivative +of the Wiener process $\xi_t = \dv*{B_t}{t}$: + +$$\begin{aligned} + \dv{X_t}{t} + = g(X_t) \: \xi_t +\end{aligned}$$ + +Now $X_t$ is not deterministic, +since $\xi_t$ is derived from a random variable $B_t$. +If $g = 1$, we expect $X_t = X_0 + B_t$. +With this in mind, we introduce the **Euler-Maruyama scheme**: + +$$\begin{aligned} + X_{t+h} + &= X_t + g(X_t) \: \xi_t \: h + \\ + &= X_t + g(X_t) \: (B_{t+h} - B_t) +\end{aligned}$$ + +We would like to turn this into an integral for $X_t$, as we did above. +Therefore, we state: + +$$\begin{aligned} + X_t + = X_0 + \int_0^t g(X_s) \dd{B_s} +\end{aligned}$$ + +This integral is *defined* as below, +analogously to the first, but with $h$ replaced by +the increment $B_{t+h} \!-\! B_t$ of a Wiener process. +This is an Itō integral: + +$$\begin{aligned} + \int_0^t g(X_s) \dd{B_s} + \equiv \lim_{h \to 0} \sum_{s = 0}^{s = t} g(X_s) \big(B_{s + h} - B_s\big) +\end{aligned}$$ + +For more information about applying the Itō integral in this way, +see [Itō calculus](/know/concept/ito-calculus/). + + +## Properties + +Since $G_t$ and $B_t$ must be known (i.e. 
$\mathcal{F}_t$-adapted) +in order to evaluate the Itō integral $I_t$ at any given $t$, +it logically follows that $I_t$ is also $\mathcal{F}_t$-adapted. + +Because the Itō integral is defined as the limit of a sum of linear terms, +it inherits this linearity. +Consider two Itō-integrable processes $G_t$ and $H_t$, +and two constants $v, w \in \mathbb{R}$: + +$$\begin{aligned} + \int_a^b \big( v G_t + w H_t \big) \dd{B_t} + = v\! \int_a^b G_t \dd{B_t} +\: w\! \int_a^b H_t \dd{B_t} +\end{aligned}$$ + +By adding multiple summations, +the Itō integral clearly satisfies, for $a < b < c$: + +$$\begin{aligned} + \int_a^c G_t \dd{B_t} + = \int_a^b G_t \dd{B_t} + \int_b^c G_t \dd{B_t} +\end{aligned}$$ + +A more interesting property is the **Itō isometry**, +which expresses the expectation of the square of an Itō integral of $G_t$ +as a simpler "ordinary" integral of the expectation of $G_t^2$ +(which exists by the definition of Itō-integrability): + +$$\begin{aligned} + \boxed{ + \mathbf{E} \bigg[ \Big( \int_a^b G_t \dd{B_t} \Big)^2 \bigg] + = \int_a^b \mathbf{E} \big[ G_t^2 \big] \dd{t} + } +\end{aligned}$$ + +<div class="accordion"> +<input type="checkbox" id="proof-isometry"/> +<label for="proof-isometry">Proof</label> +<div class="hidden"> +<label for="proof-isometry">Proof.</label> +We write out the left-hand side of the Itō isometry, +where eventually $h \to 0$: + +$$\begin{aligned} + \mathbf{E} \bigg[ \sum_{t = a}^{t = b} G_t (B_{t + h} \!-\! B_t) \bigg]^2 + &= \sum_{t = a}^{t = b} \sum_{s = a}^{s = b} \mathbf{E} \bigg[ G_t G_s (B_{t + h} \!-\! B_t) (B_{s + h} \!-\! B_s) \bigg] +\end{aligned}$$ + +In the particular case $t \ge s \!+\! h$, +a given term of this summation can be rewritten +as follows using the *law of total expectation* +(see [conditional expectation](/know/concept/conditional-expectation/)): + +$$\begin{aligned} + \mathbf{E} \Big[ G_t G_s (B_{t + h} \!-\! B_t) (B_{s + h} \!-\! B_s) \Big] + = \mathbf{E} \bigg[ \mathbf{E} \Big[ G_t G_s (B_{t + h} \!-\! B_t) (B_{s + h} \!-\! 
B_s) \Big| \mathcal{F}_t \Big] \bigg] +\end{aligned}$$ + +Recall that $G_t$ and $B_t$ are adapted to $\mathcal{F}_t$: +at time $t$, we have information $\mathcal{F}_t$, +which includes knowledge of the realized values $G_t$ and $B_t$. +Since $t \ge s \!+\! h$ by assumption, we can simply factor out the known quantities: + +$$\begin{aligned} + \mathbf{E} \Big[ G_t G_s (B_{t + h} \!-\! B_t) (B_{s + h} \!-\! B_s) \Big] + = \mathbf{E} \bigg[ G_t G_s (B_{s + h} \!-\! B_s) \: \mathbf{E} \Big[ (B_{t + h} \!-\! B_t) \Big| \mathcal{F}_t \Big] \bigg] +\end{aligned}$$ + +However, $\mathcal{F}_t$ says nothing about +the increment $(B_{t + h} \!-\! B_t) \sim \mathcal{N}(0, h)$, +meaning that the conditional expectation is zero: + +$$\begin{aligned} + \mathbf{E} \Big[ G_t G_s (B_{t + h} \!-\! B_t) (B_{s + h} \!-\! B_s) \Big] + = 0 + \qquad \mathrm{for}\; t \ge s + h +\end{aligned}$$ + +By swapping $s$ and $t$, the exact same result can be obtained for $s \ge t \!+\! h$: + +$$\begin{aligned} + \mathbf{E} \Big[ G_t G_s (B_{t + h} \!-\! B_t) (B_{s + h} \!-\! B_s) \Big] + = 0 + \qquad \mathrm{for}\; s \ge t + h +\end{aligned}$$ + +This leaves only one case which can be nonzero: $[t, t\!+\!h] = [s, s\!+\!h]$. +Applying the law of total expectation again yields: + +$$\begin{aligned} + \mathbf{E} \bigg[ \sum_{t = a}^{t = b} G_t (B_{t + h} \!-\! B_t) \bigg]^2 + &= \sum_{t = a}^{t = b} \mathbf{E} \Big[ G_t^2 (B_{t + h} \!-\! B_t)^2 \Big] + \\ + &= \sum_{t = a}^{t = b} \mathbf{E} \bigg[ \mathbf{E} \Big[ G_t^2 (B_{t + h} \!-\! B_t)^2 \Big| \mathcal{F}_t \Big] \bigg] +\end{aligned}$$ + +We know $G_t$, and the expectation value of $(B_{t+h} \!-\! B_t)^2$, +since the increment is normally distributed, is simply the variance $h$: + +$$\begin{aligned} + \mathbf{E} \bigg[ \sum_{t = a}^{t = b} G_t (B_{t + h} \!-\! 
B_t) \bigg]^2 + &= \sum_{t = a}^{t = b} \mathbf{E} \big[ G_t^2 \big] h + \longrightarrow + \int_a^b \mathbf{E} \big[ G_t^2 \big] \dd{t} +\end{aligned}$$ +</div> +</div> + +Furthermore, Itō integrals are [martingales](/know/concept/martingale/), +meaning that the average noise contribution is zero, +which makes intuitive sense, +since true white noise cannot be biased. + +<div class="accordion"> +<input type="checkbox" id="proof-martingale"/> +<label for="proof-martingale">Proof</label> +<div class="hidden"> +<label for="proof-martingale">Proof.</label> +We will prove that an arbitrary Itō integral $I_t$ is a martingale. +Using additivity, we know that the increment $I_t \!-\! I_s$ +is as follows, given information $\mathcal{F}_s$: + +$$\begin{aligned} + \mathbf{E} \big[ I_t \!-\! I_s | \mathcal{F}_s \big] + = \mathbf{E} \bigg[ \int_s^t G_u \dd{B_u} \bigg| \mathcal{F}_s \bigg] + = \lim_{h \to 0} \sum_{u = s}^{u = t} \mathbf{E} \Big[ G_u (B_{u + h} \!-\! B_u) \Big| \mathcal{F}_s \Big] +\end{aligned}$$ + +We rewrite this [conditional expectation](/know/concept/conditional-expectation/) +using the *tower property* for some $\mathcal{F}_u \supset \mathcal{F}_s$, +such that $G_u$ and $B_u$ are known, but $B_{u+h} \!-\! B_u$ is not: + +$$\begin{aligned} + \mathbf{E} \big[ I_t \!-\! I_s | \mathcal{F}_s \big] + &= \lim_{h \to 0} \sum_{u = s}^{u = t} + \mathbf{E} \bigg[ \mathbf{E} \Big[ G_u (B_{u + h} \!-\! B_u) \Big| \mathcal{F}_u \Big] \bigg| \mathcal{F}_s \bigg] + = 0 +\end{aligned}$$ + +We now have everything we need to calculate $\mathbf{E} [ I_t | \mathcal{F_s} ]$, +giving the martingale property: + +$$\begin{aligned} + \mathbf{E} \big[ I_t | \mathcal{F}_s \big] + = \mathbf{E} \big[ I_s | \mathcal{F}_s \big] + \mathbf{E} \big[ I_t \!-\! I_s | \mathcal{F}_s \big] + = I_s + \mathbf{E} \big[ I_t \!-\! 
I_s | \mathcal{F}_s \big] + = I_s +\end{aligned}$$ + +For the existence of $I_t$, +we need $\mathbf{E}[G_t^2]$ to be integrable over the target interval, +so from the Itō isometry we have $\mathbf{E}[I_t^2] < \infty$, +and therefore $\mathbf{E}[|I_t|] < \infty$, +so $I_t$ has all the properties of a martingale, +since it is trivially $\mathcal{F}_t$-adapted. +</div> +</div> + + + +## References +1. U.H. Thygesen, + *Lecture notes on diffusions and stochastic differential equations*, + 2021, Polyteknisk Kompendie. diff --git a/content/know/concept/martingale/index.pdc b/content/know/concept/martingale/index.pdc index ffc286b..07ed1a4 100644 --- a/content/know/concept/martingale/index.pdc +++ b/content/know/concept/martingale/index.pdc @@ -56,6 +56,6 @@ since they will tend to increase and decrease with time, respectively. ## References -1. U.F. Thygesen, +1. U.H. Thygesen, *Lecture notes on diffusions and stochastic differential equations*, 2021, Polyteknisk Kompendie. diff --git a/content/know/concept/random-variable/index.pdc b/content/know/concept/random-variable/index.pdc index fe50b60..2a8643e 100644 --- a/content/know/concept/random-variable/index.pdc +++ b/content/know/concept/random-variable/index.pdc @@ -119,27 +119,27 @@ $$\begin{aligned} ## Expectation value -The **expectation value** $\mathbf{E}(X)$ of a random variable $X$ +The **expectation value** $\mathbf{E}[X]$ of a random variable $X$ can be defined in the familiar way, as the sum/integral of every possible value of $X$ multiplied by the corresponding probability (density). For continuous and discrete sample spaces $\Omega$, respectively: $$\begin{aligned} - \mathbf{E}(X) + \mathbf{E}[X] = \int_{-\infty}^\infty x \: f_X(x) \dd{x} \qquad \mathrm{or} \qquad - \mathbf{E}(X) + \mathbf{E}[X] = \sum_{i = 1}^N x_i \: P(X \!=\! x_i) \end{aligned}$$ However, $f_X(x)$ is not guaranteed to exist, and the distinction between continuous and discrete is cumbersome. 
-A more general definition of $\mathbf{E}(X)$ +A more general definition of $\mathbf{E}[X]$ is the following Lebesgue-Stieltjes integral, since $F_X(x)$ always exists: $$\begin{aligned} - \mathbf{E}(X) + \mathbf{E}[X] = \int_{-\infty}^\infty x \dd{F_X(x)} \end{aligned}$$ @@ -147,25 +147,25 @@ This is valid for any sample space $\Omega$. Or, equivalently, a Lebesgue integral can be used: $$\begin{aligned} - \mathbf{E}(X) + \mathbf{E}[X] = \int_\Omega X(\omega) \dd{P(\omega)} \end{aligned}$$ An expectation value defined in this way has many useful properties, most notably linearity. -We can also define the familiar **variance** $\mathbf{V}(X)$ +We can also define the familiar **variance** $\mathbf{V}[X]$ of a random variable $X$ as follows: $$\begin{aligned} - \mathbf{V}(X) - = \mathbf{E}\big( (X - \mathbf{E}(X))^2 \big) - = \mathbf{E}(X^2) - \big(\mathbf{E}(X)\big)^2 + \mathbf{V}[X] + = \mathbf{E}\big[ (X - \mathbf{E}[X])^2 \big] + = \mathbf{E}[X^2] - \big(\mathbf{E}[X]\big)^2 \end{aligned}$$ ## References -1. U.F. Thygesen, +1. U.H. Thygesen, *Lecture notes on diffusions and stochastic differential equations*, 2021, Polyteknisk Kompendie. diff --git a/content/know/concept/sigma-algebra/index.pdc b/content/know/concept/sigma-algebra/index.pdc index 1a459ea..96240ff 100644 --- a/content/know/concept/sigma-algebra/index.pdc +++ b/content/know/concept/sigma-algebra/index.pdc @@ -115,6 +115,6 @@ Clearly, $X_t$ is always adapted to its own filtration. ## References -1. U.F. Thygesen, +1. U.H. Thygesen, *Lecture notes on diffusions and stochastic differential equations*, 2021, Polyteknisk Kompendie. diff --git a/content/know/concept/wiener-process/index.pdc b/content/know/concept/wiener-process/index.pdc index 49aebfb..3602b44 100644 --- a/content/know/concept/wiener-process/index.pdc +++ b/content/know/concept/wiener-process/index.pdc @@ -85,6 +85,6 @@ $$\begin{aligned} ## References -1. U.F. Thygesen, +1. U.H. 
Thygesen, *Lecture notes on diffusions and stochastic differential equations*, 2021, Polyteknisk Kompendie.
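As a closing aside, the Itō isometry and martingale property introduced in the new ito-integral page can be spot-checked by Monte-Carlo simulation. For $G_t = B_t$ on $[0, 1]$, the isometry predicts $\mathbf{E}\big[\big(\int_0^1 B_t \dd{B_t}\big)^2\big] = \int_0^1 \mathbf{E}[B_t^2] \dd{t} = \int_0^1 t \dd{t} = \tfrac{1}{2}$, and the martingale property predicts a mean of zero. A hypothetical sketch, not part of the commit:

```python
import numpy as np

# Hedged Monte-Carlo check of the Itō isometry for G_t = B_t on [0, 1]:
# E[(∫ B_t dB_t)^2] should approach ∫ E[B_t^2] dt = ∫ t dt = 1/2.
rng = np.random.default_rng(2)
n_steps, n_paths = 2000, 50_000
h = 1.0 / n_steps

B = np.zeros(n_paths)  # Wiener process paths
I = np.zeros(n_paths)  # running Itō integrals
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(h), size=n_paths)  # increments ~ N(0, h)
    I += B * dB   # left-endpoint (non-anticipating) evaluation of G_t = B_t
    B += dB

print(np.isclose((I**2).mean(), 0.5, rtol=0.1))  # Itō isometry
print(abs(I.mean()) < 0.02)                      # martingale: E[I] = 0
```

Evaluating the integrand at the left endpoint of each step is essential here: it is what makes the sum non-anticipating, and hence a martingale.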