To motivate this note, I’ll pose the following problem:
Consider two independent random variables $X, Y \sim \text{Uniform}(0, 1)$. What is $P(X \le 1/2 \mid X = Y)$?
At first glance, the answer seems simple: it's 1/2! Closer inspection reveals that the event $\{X = Y\}$ has probability zero, so the elementary definition $P(A \mid B) = P(A \cap B)/P(B)$ does not even apply.
Redefining Conditional Probability
To account for events of measure zero, we must first redefine conditional probability, which requires a few preliminary definitions. Throughout, $(\Omega, \mathcal{F}, P)$ is a probability space.
Definition: A sub-$\sigma$-algebra of $\mathcal{F}$ is a $\sigma$-algebra $\mathcal{G}$ such that $\mathcal{G} \subseteq \mathcal{F}$.
Definition: A measurable function $f: (\Omega_1, \mathcal{F}_1) \to (\Omega_2, \mathcal{F}_2)$ between two measurable spaces is a function such that for every $A \in \mathcal{F}_2$, $f^{-1}(A) \in \mathcal{F}_1$. If $f^{-1}(A) \in \mathcal{G}$ for every $A \in \mathcal{F}_2$, where $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}_1$, then $f$ is said to be $\mathcal{G}$-measurable.
Definition: The $\sigma$-algebra generated by a function $f: (\Omega_1, \mathcal{F}_1) \to (\Omega_2, \mathcal{F}_2)$, denoted $\sigma(f)$, is the collection of all inverse images $\{f^{-1}(A) : A \in \mathcal{F}_2\}$.
With these three definitions, we can define the conditional probability of an event $A \in \mathcal{F}$ given a sub-$\sigma$-algebra $\mathcal{G} \subseteq \mathcal{F}$:
Definition: The conditional probability $P(A \mid \mathcal{G})$ is a $\mathcal{G}$-measurable random variable such that $E[P(A \mid \mathcal{G}) \, \mathbf{1}_S] = P(A \cap S)$ for every $S \in \mathcal{G}$.
In terms of integrals, we can rewrite this as
$$\int_S P(A \mid \mathcal{G}) \, dP = P(A \cap S) \quad \text{for all } S \in \mathcal{G}.$$
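The defining property is easy to check by hand on a finite sub-$\sigma$-algebra. The following sketch (my own illustration, not from the text) takes $\Omega = [0,1]$ with Lebesgue measure, $A = [0, 1/3]$, and $\mathcal{G}$ generated by the partition $\{[0, 1/2), [1/2, 1]\}$; a $\mathcal{G}$-measurable random variable is constant on each cell, so $P(A \mid \mathcal{G})$ must equal $P(A \cap \text{cell})/P(\text{cell})$ there.

```python
# Sanity check of the defining property on a two-cell sub-sigma-algebra.
from fractions import Fraction as F

cells = [(F(0), F(1, 2)), (F(1, 2), F(1))]   # partition generating G
A = (F(0), F(1, 3))                          # the event A = [0, 1/3]

def length_of_intersection(i1, i2):
    """Lebesgue measure of the intersection of two intervals."""
    lo, hi = max(i1[0], i2[0]), min(i1[1], i2[1])
    return max(hi - lo, F(0))

# Candidate version of P(A | G): its constant value on each cell.
cond_prob = {cell: length_of_intersection(A, cell) / (cell[1] - cell[0])
             for cell in cells}

# Verify  int_S P(A|G) dP = P(A n S)  for every S in G.  Every S in G is a
# union of cells, so it suffices to check each cell.
for cell in cells:
    lhs = cond_prob[cell] * (cell[1] - cell[0])   # integral over the cell
    rhs = length_of_intersection(A, cell)         # P(A n cell)
    assert lhs == rhs

print(cond_prob[cells[0]], cond_prob[cells[1]])   # prints: 2/3 0
```

Note how the value on $[0, 1/2)$ is $2/3$, exactly the elementary conditional probability, while on $[1/2, 1]$ it is $0$.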
Defining the conditioning on a sub-$\sigma$-algebra lets us recover the familiar notions:
- If we want to condition on a random variable $X$, then $P(A \mid X) := P(A \mid \sigma(X))$.
- If we want to condition on multiple random variables $X_1, \ldots, X_n$, then $P(A \mid X_1, \ldots, X_n) := P(A \mid \sigma(X_1, \ldots, X_n))$, where $\sigma(X_1, \ldots, X_n)$ is the smallest $\sigma$-algebra containing every $\sigma(X_i)$ (this can be extended to a countable sequence of random variables).

Therefore, this definition of conditional probability is the most general one: conditioning on events, on random variables, and on collections of random variables are all special cases of conditioning on a sub-$\sigma$-algebra.
Existence and Uniqueness of Conditional Probability
From the integral definition, it may be obvious that conditional probability is unique up to a set of measure zero: if $Y_1$ and $Y_2$ are both $\mathcal{G}$-measurable and satisfy $\int_S Y_i \, dP = P(A \cap S)$ for all $S \in \mathcal{G}$, then $\int_S (Y_1 - Y_2) \, dP = 0$ for all $S \in \mathcal{G}$, which forces $Y_1 = Y_2$ almost surely.
Existence can be proved in many ways: here, we use the Radon-Nikodym theorem. I won’t be proving the theorem, but I’ll quote it, along with some exposition, which should be sufficient.
Definition: A measure $\nu$ is said to be absolutely continuous with respect to another measure $\mu$ (both defined on the same $\sigma$-algebra) if for every measurable set $A$ with $\mu(A) = 0$, we have $\nu(A) = 0$. We denote this as $\nu \ll \mu$, and we say that $\mu$ dominates $\nu$.
Theorem (Radon-Nikodym): if $\mu$ and $\nu$ are $\sigma$-finite measures defined on $(\Omega, \mathcal{F})$ such that $\nu \ll \mu$, then there exists a nonnegative $\mathcal{F}$-measurable function $f$ such that for any $A \in \mathcal{F}$, $\nu(A) = \int_A f \, d\mu$.
The function $f$ is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$, written $f = \frac{d\nu}{d\mu}$; it is unique up to sets of $\mu$-measure zero.
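On a finite space the theorem is transparent: the Radon-Nikodym derivative is just the pointwise ratio of the two measures wherever the dominating one is positive. A small sketch with made-up numbers (my own example):

```python
# Discrete Radon-Nikodym: f = d(nu)/d(mu) is nu({w})/mu({w}) where mu > 0.
from itertools import chain, combinations

omega = ["a", "b", "c", "d"]
mu = {"a": 0.1, "b": 0.4, "c": 0.5, "d": 0.0}
nu = {"a": 0.2, "b": 0.2, "c": 0.6, "d": 0.0}   # nu << mu: nu vanishes where mu does

# Set f to 0 on mu-null points; any value would do there, which is exactly
# the "unique up to mu-measure zero" caveat.
f = {w: (nu[w] / mu[w] if mu[w] > 0 else 0.0) for w in omega}

# Check nu(A) = int_A f dmu = sum_{w in A} f(w) mu({w}) for every subset A.
subsets = chain.from_iterable(combinations(omega, k) for k in range(len(omega) + 1))
for A in subsets:
    assert abs(sum(nu[w] for w in A) - sum(f[w] * mu[w] for w in A)) < 1e-12
```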
We now prove our claim.
Claim: $P(A \mid \mathcal{G})$ exists and is unique up to a set of probability 0.
Proof: Define $\nu(S) = P(A \cap S)$ for $S \in \mathcal{G}$. Then $\nu$ is a finite measure on $(\Omega, \mathcal{G})$, and $\nu \ll P$ (restricted to $\mathcal{G}$), since $P(S) = 0$ implies $P(A \cap S) = 0$. By the Radon-Nikodym theorem there exists a $\mathcal{G}$-measurable function $f = \frac{d\nu}{dP}$ with $P(A \cap S) = \nu(S) = \int_S f \, dP$ for every $S \in \mathcal{G}$, so $f$ is a version of $P(A \mid \mathcal{G})$. Uniqueness up to a set of probability 0 was argued above. $\blacksquare$
Note that $P(A \mid \mathcal{G})$ is a random variable, not a number, and that it is defined only up to almost-sure equality: any random variable satisfying the defining property is called a version of the conditional probability.
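The proof is constructive on a finite space. The sketch below (my own illustration) follows it step by step: take a fair die, $\mathcal{G}$ generated by the parity partition, and $A = \{1, 2, 3\}$; then $\nu(S) = P(A \cap S)$, and the Radon-Nikodym derivative of $\nu$ with respect to $P$ on $\mathcal{G}$ is constant on each atom and equals $\nu(\text{atom})/P(\text{atom})$.

```python
# Existence proof made concrete: P(A | G) as the ratio nu(atom)/P(atom).
from fractions import Fraction as F

P = {w: F(1, 6) for w in range(1, 7)}   # fair die
atoms = [{1, 3, 5}, {2, 4, 6}]          # atoms of G (parity partition)
A = {1, 2, 3}

def prob(S):
    return sum(P[w] for w in S)

# P(A | G) as a random variable: its value at each outcome w is the value
# of the Radon-Nikodym derivative d(nu)/dP on the atom containing w.
cond = {w: prob(A & atom) / prob(atom)
        for atom in atoms for w in atom}

print(cond[1], cond[2])   # prints: 2/3 1/3
```

On the odd atom the conditional probability is $P(\{1,3\})/P(\{1,3,5\}) = 2/3$; on the even atom it is $1/3$.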
Conditional Expectation, and deriving Conditional Probability from Conditional Expectation
Conditional expectation is defined in much the same way: $E[X \mid \mathcal{G}]$ is a $\mathcal{G}$-measurable random variable satisfying the property $\int_S E[X \mid \mathcal{G}] \, dP = \int_S X \, dP$ for every $S \in \mathcal{G}$.
Most of the same properties of conditional probability apply to conditional expectation as well; the only wrinkle in the existence proof is that $X$ can be negative, and we need a nonnegative integrand for the set function to be a measure, so we write $X = X^+ - X^-$, define $\nu^{\pm}(S) = \int_S X^{\pm} \, dP$, and take $E[X \mid \mathcal{G}] = \frac{d\nu^+}{dP} - \frac{d\nu^-}{dP}$.
In several textbooks, conditional expectation is derived first, and conditional probability follows. This is because $P(A \mid \mathcal{G}) = E[\mathbf{1}_A \mid \mathcal{G}]$: conditional probability is just conditional expectation applied to indicator functions.
The Problem
Now that we have this definition, how do we apply it? Sadly, we can't read the conditional probability off directly (unless we evaluate the Radon-Nikodym derivative); the best we can do is guess a reasonable $\mathcal{G}$-measurable random variable and then verify that it satisfies the definition.
For the given problem, let $Z = X - Y$, so that conditioning on the event $\{X = Y\}$ becomes conditioning on $\sigma(Z)$ and evaluating at $Z = 0$. Given $Z = z$, the pair $(X, Y)$ lies on the segment of the line $x - y = z$ inside the unit square, along which $X$ is uniform, so a reasonable guess is $P(X \le 1/2 \mid \sigma(Z)) = g(Z)$ with
$$g(z) = \frac{\max\left(0, \min(1/2, 1+z) - \max(0, z)\right)}{\min(1, 1+z) - \max(0, z)}.$$
One then checks that $\int_S g(Z) \, dP = P(\{X \le 1/2\} \cap S)$ for every $S \in \sigma(Z)$. Since both these integrals are equal for all $S \in \sigma(Z)$, $g(Z)$ is a version of the conditional probability, and $g(0) = 1/2$ recovers the intuitive answer.
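A Monte Carlo sanity check is also reassuring. The sketch below assumes the motivating problem is $P(X \le 1/2 \mid X = Y)$ for independent $X, Y \sim \text{Uniform}(0,1)$ (as posed above); since $\{X = Y\}$ has probability zero, it approximates the event by $\{|X - Y| < \varepsilon\}$ and watches the conditional relative frequency approach $1/2$.

```python
# Monte Carlo approximation of conditioning on a measure-zero event.
import random

random.seed(0)
N, eps = 500_000, 0.01

hits = total = 0
for _ in range(N):
    x, y = random.random(), random.random()
    if abs(x - y) < eps:          # thickened version of the null event {X = Y}
        total += 1
        hits += (x <= 0.5)

print(hits / total)               # should be close to 0.5
```

This is only heuristic, of course: it shows that the limiting relative frequency is plausible, not that the guessed random variable satisfies the defining property.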
References:
- Rosenthal, Jeffrey S. A First Look at Rigorous Probability Theory. 2nd ed., World Scientific, 2006.
- https://math.stackexchange.com/questions/2306986/uniqueness-of-conditional-expectation
- https://stats.stackexchange.com/questions/395310/what-does-conditioning-on-a-random-variable-mean
- https://www.stat.berkeley.edu/users/pitman/s205f02/lecture15.pdf (this contains the projection proof)
- http://www.stat.cmu.edu/~arinaldo/Teaching/36752/S18/Notes/lec_notes_6.pdf (this too has the projection proof)
- Billingsley, Patrick. Probability and Measure. 3rd ed., Wiley, 1995.