---
title: "Random variable"
sort_title: "Random variable"
date: 2021-10-22
categories:
- Mathematics
- Statistics
- Measure theory
layout: "concept"
---

**Random variables** are the bread and butter of probability theory and statistics:
they are simply variables whose value depends on the outcome of a random experiment.
Here, we will describe the formal mathematical definition of a random variable.


## Probability space

A **probability space** or **probability triple** $$(\Omega, \mathcal{F}, P)$$
is the formal mathematical model of a given **stochastic experiment**,
i.e. a process with a random outcome.

The **sample space** $$\Omega$$ is the set
of all possible outcomes $$\omega$$ of the experiment.
Those $$\omega$$ are selected randomly according to certain criteria.
A subset $$A \subset \Omega$$ is called an **event**,
and can be regarded as a true statement about all $$\omega$$ in that $$A$$.

The **event space** $$\mathcal{F}$$ is a set of events $$A$$
that are interesting to us,
i.e. we have subjectively chosen $$\mathcal{F}$$ based on the problem at hand.
Since events $$A$$ represent statements about outcomes $$\omega$$,
and we would like to use logic on those statements,
we demand that $$\mathcal{F}$$ is a
[$$\sigma$$-algebra](/know/concept/sigma-algebra/).

Finally, the **probability measure** or **probability function** $$P$$
is a function that maps events $$A$$ to probabilities $$P(A)$$.
Formally, $$P : \mathcal{F} \to \mathbb{R}$$ is defined to satisfy:

1.  If $$A \in \mathcal{F}$$, then $$P(A) \in [0, 1]$$.
2.  If $$A, B \in \mathcal{F}$$ do not overlap, i.e. $$A \cap B = \varnothing$$,
    then $$P(A \cup B) = P(A) + P(B)$$.
3.  The total probability $$P(\Omega) = 1$$.

The reason we only assign probabilities to events $$A$$,
rather than to individual outcomes $$\omega$$,
is that if $$\Omega$$ is continuous,
all individual $$\omega$$ have zero probability,
while intervals $$A$$ can have nonzero probability.


## Random variable

Once we have a probability space $$(\Omega, \mathcal{F}, P)$$,
we can define a **random variable** $$X$$ as a function
that maps outcomes $$\omega$$ to another set, usually the real numbers.

To be a valid real-valued random variable,
a function $$X : \Omega \to \mathbb{R}^n$$ must satisfy the following condition,
in which case $$X$$ is said to be **measurable**
from $$(\Omega, \mathcal{F})$$ to $$(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$$:

$$\begin{aligned}
    \{ \omega \in \Omega : X(\omega) \in B \} \in \mathcal{F}
    \quad \mathrm{for\:any\:} B \in \mathcal{B}(\mathbb{R}^n)
\end{aligned}$$

In other words, for a given Borel set
(see [$$\sigma$$-algebra](/know/concept/sigma-algebra/)) $$B \in \mathcal{B}(\mathbb{R}^n)$$,
the set of all outcomes $$\omega \in \Omega$$ that satisfy $$X(\omega) \in B$$
must form a valid event, i.e. this set must be in $$\mathcal{F}$$.
The point is that we need to be able to assign probabilities
to statements of the form $$X \in [a, b]$$ for all $$a < b$$,
which is only possible if such a statement corresponds
to an event in $$\mathcal{F}$$, since $$P$$'s domain is $$\mathcal{F}$$.

Given such an $$X$$, and a set $$B \subseteq \mathbb{R}$$,
the **preimage** or **inverse image** $$X^{-1}$$ is defined as:

$$\begin{aligned}
    X^{-1}(B)
    = \{ \omega \in \Omega : X(\omega) \in B \}
\end{aligned}$$

As suggested by the notation, $$X^{-1}$$ can be regarded as the inverse of $$X$$:
it maps $$B$$ to the event for which $$X \in B$$.
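To make these definitions concrete,
below is a minimal Python sketch (purely illustrative, not part of the formal theory)
of a finite probability space for a fair die roll,
together with a hypothetical random variable and its preimage;
all names (`omega`, `P`, `X`, `preimage`) are our own choices here.
On a finite $$\Omega$$ we can simply take $$\mathcal{F}$$ to be all subsets:

```python
from fractions import Fraction

# Sample space of a fair six-sided die roll
omega = {1, 2, 3, 4, 5, 6}

# Probability measure P on events A; for a finite space
# we may take the event space F to be all subsets of omega
def P(A):
    assert A <= omega  # A must be an event, i.e. a subset of omega
    return Fraction(len(A), len(omega))

# An illustrative random variable X : omega -> R,
# e.g. a "payout" that is 10 for a roll of 5 or 6, else 0
def X(w):
    return 10 if w >= 5 else 0

# Preimage X^{-1}(B) = { w in omega : X(w) in B }
def preimage(B):
    return {w for w in omega if X(w) in B}

A = preimage({10})  # the event "X = 10", namely {5, 6}
print(A, P(A))      # {5, 6} 1/3
```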
With this, our earlier requirement that $$X$$ be measurable
can be written as: $$X^{-1}(B) \in \mathcal{F}$$ for any $$B \in \mathcal{B}(\mathbb{R}^n)$$.
This is also often stated as "$$X$$ is *$$\mathcal{F}$$-measurable*".

Related to $$\mathcal{F}$$ is the **information**
obtained by observing a random variable $$X$$.
Let $$\sigma(X)$$ be the information generated by observing $$X$$,
i.e. the events whose occurrence can be deduced from the value of $$X$$,
or, more formally:

$$\begin{aligned}
    \sigma(X)
    = X^{-1}(\mathcal{B}(\mathbb{R}^n))
    = \{ A \in \mathcal{F} : A = X^{-1}(B) \mathrm{\:for\:some\:} B \in \mathcal{B}(\mathbb{R}^n) \}
\end{aligned}$$

In other words, if the realized value of $$X$$
is found to be in a certain Borel set $$B \in \mathcal{B}(\mathbb{R}^n)$$,
then the preimage $$X^{-1}(B)$$ (i.e. the event yielding this $$B$$)
is known to have occurred.

In general, given any $$\sigma$$-algebra $$\mathcal{H}$$,
a variable $$Y$$ is said to be *"$$\mathcal{H}$$-measurable"*
if $$\sigma(Y) \subseteq \mathcal{H}$$,
so that $$\mathcal{H}$$ contains at least
all information extractable from $$Y$$.

Note that $$\mathcal{H}$$ can be generated by another random variable $$X$$,
i.e. $$\mathcal{H} = \sigma(X)$$.
In that case, the **Doob-Dynkin lemma** states
that $$Y$$ is $$\sigma(X)$$-measurable if and only if
$$Y$$ can always be computed from $$X$$,
i.e. there exists a function $$f$$ such that
$$Y(\omega) = f(X(\omega))$$ for all $$\omega \in \Omega$$.

Now we are ready to define some familiar concepts from probability theory.
The **cumulative distribution function** $$F_X(x)$$
is the probability of the event where the realized value of $$X$$
is smaller than some given $$x \in \mathbb{R}$$:

$$\begin{aligned}
    F_X(x)
    = P(X \le x)
    = P(\{ \omega \in \Omega : X(\omega) \le x \})
    = P(X^{-1}(]\!-\!\infty, x]))
\end{aligned}$$

If $$F_X(x)$$ is differentiable,
then the **probability density function** $$f_X(x)$$ is defined as:

$$\begin{aligned}
    f_X(x)
    = \dv{F_X}{x}
\end{aligned}$$


## Expectation value

The **expectation value** $$\mathbf{E}[X]$$ of a random variable $$X$$
can be defined in the familiar way,
as the sum/integral of every possible value of $$X$$
multiplied by the corresponding probability (density).
For continuous and discrete sample spaces $$\Omega$$, respectively:

$$\begin{aligned}
    \mathbf{E}[X]
    = \int_{-\infty}^\infty x \: f_X(x) \dd{x}
    \qquad \mathrm{or} \qquad
    \mathbf{E}[X]
    = \sum_{i = 1}^N x_i \: P(X \!=\! x_i)
\end{aligned}$$

However, $$f_X(x)$$ is not guaranteed to exist,
and the distinction between continuous and discrete is cumbersome.
A more general definition of $$\mathbf{E}[X]$$
is the following Lebesgue-Stieltjes integral,
which is valid for any sample space $$\Omega$$,
since $$F_X(x)$$ always exists:

$$\begin{aligned}
    \mathbf{E}[X]
    = \int_{-\infty}^\infty x \dd{F_X(x)}
\end{aligned}$$

Equivalently, a Lebesgue integral over $$\Omega$$ itself can be used:

$$\begin{aligned}
    \mathbf{E}[X]
    = \int_\Omega X(\omega) \dd{P(\omega)}
\end{aligned}$$

An expectation value defined in this way has many useful properties,
most notably linearity.

We can also define the familiar **variance** $$\mathbf{V}[X]$$
of a random variable $$X$$ as follows:

$$\begin{aligned}
    \mathbf{V}[X]
    = \mathbf{E}\big[ (X - \mathbf{E}[X])^2 \big]
    = \mathbf{E}[X^2] - \big(\mathbf{E}[X]\big)^2
\end{aligned}$$

It is also possible to calculate expectation values and variances
adjusted to some given event information:
see [conditional expectation](/know/concept/conditional-expectation/).
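As a quick numerical check of these formulas,
the following Python sketch (again purely illustrative,
reusing the hypothetical die-roll space and payout variable from before)
computes $$\mathbf{E}[X]$$ both as a Lebesgue-style sum over $$\Omega$$
and as a sum over the values of $$X$$, and then the variance:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}            # fair die: P({w}) = 1/6 for each w
p = {w: Fraction(1, 6) for w in omega}

def X(w):                             # illustrative random variable
    return 10 if w >= 5 else 0

# Lebesgue-style integral over omega: E[X] = sum of X(w) * P({w})
E_lebesgue = sum(X(w) * p[w] for w in omega)

# Discrete formula: E[X] = sum over values x of x * P(X = x)
values = {X(w) for w in omega}
E_discrete = sum(x * sum(p[w] for w in omega if X(w) == x)
                 for x in values)

assert E_lebesgue == E_discrete == Fraction(10, 3)

# Variance: V[X] = E[X^2] - (E[X])^2
E_X2 = sum(X(w)**2 * p[w] for w in omega)
V = E_X2 - E_lebesgue**2
print(E_lebesgue, V)                  # 10/3 200/9
```

Both routes agree, as they must:
the Lebesgue integral simply groups the outcomes $$\omega$$
differently from the sum over the values of $$X$$.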
## References
1.  U.H. Thygesen,
    *Lecture notes on diffusions and stochastic differential equations*,
    2021, Polyteknisk Kompendie.