Conditional Distribution
In section 2.3 we talked about conditional probability, and in section 3.2 we talked about the general idea of conditional expectation based on the idea of conditional probability. Here is a review:
Definition Let $X, Y$ be two random variables. When we speak of the conditional random variable of $X$ given $Y = y$, denoted by $X \mid Y = y$, we mean the random variable that has the following c.d.f.:
$$F_{X \mid Y}(x \mid y) = P(X \le x \mid Y = y) = \frac{P(X \le x,\ Y = y)}{P(Y = y)}.$$
Note the conditioning value can be a number or a set of numbers: for a set $A$ with $P(Y \in A) > 0$, replace $Y = y$ by $Y \in A$ above.
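Before turning to the problematic case, here is a minimal sketch of the defining formula at work when $P(Y = y) > 0$. The two-dice setup is my own illustration, not from the text: $Y$ is the first die, $X$ is the sum, and conditioning on $Y = 2$ is an ordinary event of probability $1/6$.

```python
from itertools import product

# Two fair dice: Y is the first die, X is the sum. P(Y = 2) = 6/36 > 0,
# so F(x | Y = 2) = P(X <= x, Y = 2) / P(Y = 2) applies directly.
outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs

def cond_cdf(x, y=2):
    joint = sum(1 for a, b in outcomes if a + b <= x and a == y) / 36
    marginal = sum(1 for a, b in outcomes if a == y) / 36
    return joint / marginal

print([cond_cdf(x) for x in range(3, 9)])  # climbs from 1/6 up to 1 as x runs over 3..8
```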
Now if $P(Y = y) = 0$, we have a divide-by-zero issue. This can happen if $y$ is a single number and $Y$ is continuous, or if the conditioning set $A$ is measurable but has measure zero (I will not discuss what "measure zero" means; refer to the Lebesgue theory of integration), etc. If this definition does not work, how do we define $F_{X \mid Y}(x \mid y)$ or $X \mid Y = y$?
If $X, Y$ are two continuous random variables, then $P(Y = y) = 0$ for every $y$. Let $f_{X,Y}$ be their joint density. Intuitively, $F_{X \mid Y}(x \mid y)$ should be the mass of the joint density along the slice $Y = y$, subject to a scaling factor, i.e.,
$$F_{X \mid Y}(x \mid y) = c \int_{-\infty}^{x} f_{X,Y}(u, y)\,du.$$
But what should $c$ be?
By focusing on the consequence of a conditional probability, the answer is immediate. If $F_{X \mid Y}(\cdot \mid y)$ is a conditional c.d.f., it should be the case that $\lim_{x \to \infty} F_{X \mid Y}(x \mid y) = 1$, because it is the probability that $X$ takes on any value at all when $Y$ is $y$. So we wish
$$c \int_{-\infty}^{\infty} f_{X,Y}(u, y)\,du = c\, f_Y(y) = 1.$$
Therefore $c = 1/f_Y(y)$. So we have arrived at our reasonable definition
$$F_{X \mid Y}(x \mid y) = \frac{\int_{-\infty}^{x} f_{X,Y}(u, y)\,du}{f_Y(y)}.$$
As usual, we can define the p.d.f. in terms of the c.d.f.:
$$f_{X \mid Y}(x \mid y) = \frac{\partial}{\partial x} F_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$
To further illustrate that these definitions make sense, if $dx$ and $dy$ are very small, multiplying both sides of $f_{X \mid Y}(x \mid y) = f_{X,Y}(x, y)/f_Y(y)$ by $dx$ gives
$$f_{X \mid Y}(x \mid y)\,dx = \frac{f_{X,Y}(x, y)\,dx\,dy}{f_Y(y)\,dy} \approx \frac{P(x \le X \le x + dx,\ y \le Y \le y + dy)}{P(y \le Y \le y + dy)},$$
which is the conditional probability that $X$ is very close to $x$, given that $Y$ is very close to $y$. So we have arrived at our definition for continuous random variables when their joint density exists.
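As a sanity check on this approximation, here is a simulation sketch of my own; the density $f(x, y) = x + y$ on the unit square is just a convenient assumption. It rejection-samples the joint density and compares the empirical probability of the small box with $f_{X \mid Y}(x \mid y)\,dx$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rejection-sample from the assumed joint density f(x, y) = x + y on [0,1]^2;
# its maximum there is 2, so accept a uniform candidate with probability (x+y)/2.
n = 2_000_000
cand = rng.uniform(size=(n, 2))
keep = rng.uniform(size=n) < (cand[:, 0] + cand[:, 1]) / 2
x, y = cand[keep, 0], cand[keep, 1]

x0, y0, dx, dy = 0.3, 0.6, 0.02, 0.02
in_y = (y >= y0) & (y <= y0 + dy)
in_xy = in_y & (x >= x0) & (x <= x0 + dx)

# Empirical P(x0 <= X <= x0+dx | y0 <= Y <= y0+dy) vs. f_{X|Y}(x0|y0) dx.
# Here f_Y(y) = y + 1/2, so f_{X|Y}(x|y) = (x + y)/(y + 1/2).
print(in_xy.sum() / in_y.sum())     # Monte Carlo estimate
print((x0 + y0) / (y0 + 0.5) * dx)  # closed form, ≈ 0.01636
```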
Definition If $X, Y$ are two continuous random variables whose joint density function is $f_{X,Y}(x, y)$, then
$$f_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} \quad \text{wherever } f_Y(y) > 0.$$
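A quick symbolic illustration of this definition, using the same assumed density $f(x, y) = x + y$ on $[0,1]^2$ as in the sketch above; sympy does the marginalization and the division.

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

# Assumed illustrative joint density: f(x, y) = x + y on the unit square.
f_joint = x + y
f_Y = sp.integrate(f_joint, (x, 0, 1))   # marginal of Y: y + 1/2
f_cond = sp.simplify(f_joint / f_Y)      # f_{X|Y}(x|y) = (x + y)/(y + 1/2)

print(f_Y)
print(f_cond)
print(sp.simplify(sp.integrate(f_cond, (x, 0, 1))))  # 1: a genuine density in x for each y
```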
Similarly, for discrete random variables: Definition If $X, Y$ are two discrete random variables whose joint probability mass function is $p_{X,Y}(x, y)$, then
$$p_{X \mid Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)} \quad \text{wherever } p_Y(y) > 0.$$
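In the discrete case the definition is just columnwise normalization of the joint table. A minimal sketch with a made-up joint p.m.f.:

```python
import numpy as np

# A small hypothetical joint p.m.f.: rows are X in {0, 1, 2}, columns are Y in {0, 1}.
p_joint = np.array([[0.10, 0.20],
                    [0.05, 0.25],
                    [0.15, 0.25]])

p_Y = p_joint.sum(axis=0)   # marginal p.m.f. of Y: [0.30, 0.70]
p_cond = p_joint / p_Y      # column j is p_{X|Y}( . | Y = j)

print(p_cond)
print(p_cond.sum(axis=0))   # each column sums to 1
```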
Example We say normal random variables $X, Y$ have a bivariate normal distribution if their joint p.d.f. is
$$f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1 - \rho^2}} \exp\!\left( -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_x)^2}{\sigma_x^2} - \frac{2\rho (x - \mu_x)(y - \mu_y)}{\sigma_x \sigma_y} + \frac{(y - \mu_y)^2}{\sigma_y^2} \right] \right)$$
for some $\mu_x, \mu_y \in \mathbb{R}$, $\sigma_x, \sigma_y > 0$, and $-1 < \rho < 1$. Now since $f_{X \mid Y}(\cdot \mid y)$ is a p.d.f. (a conditional p.d.f. is still a p.d.f.), it must be the case that $f_{X \mid Y}(x \mid y) \ge 0$, and $\int_{-\infty}^{\infty} f_{X \mid Y}(x \mid y)\,dx = 1$.
Knowing $f_{X,Y}$ enables us to find $f_{X \mid Y}$, because by the definition of $f_{X \mid Y}$ and the marginal $f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx$, it is easy to show that $X \mid Y = y$ is again normal.
For the following derivation, I will omit what the normalizing constants are, because those terms are irrelevant to our discussion: completing the square in $x$ gives
$$f_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = c \exp\!\left( -\frac{\left( x - \mu_x - \rho \frac{\sigma_x}{\sigma_y}(y - \mu_y) \right)^2}{2 \sigma_x^2 (1 - \rho^2)} \right),$$
so
$$X \mid Y = y \sim N\!\left( \mu_x + \rho \frac{\sigma_x}{\sigma_y}(y - \mu_y),\ \sigma_x^2 (1 - \rho^2) \right).$$
Similarly,
$$Y \mid X = x \sim N\!\left( \mu_y + \rho \frac{\sigma_y}{\sigma_x}(x - \mu_x),\ \sigma_y^2 (1 - \rho^2) \right).$$
Note this interesting point: even if the joint distribution of $X, Y$ exists and $X, Y$ are both normal, it doesn't mean they are independent (of course, you might think the question is why they should be). But under what condition are they independent? When $\rho = 0$, we see $f_{X,Y}(x, y) = f_X(x) f_Y(y)$, so $X$ and $Y$ are independent. This $\rho$ is in fact the correlation of $X$ and $Y$.
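The closed form for $X \mid Y = y$ can be checked by simulation: sample from the bivariate normal, keep the draws whose $Y$ lands in a thin slab around $y_0$, and compare the empirical mean and variance of $X$ with the formulas above. The parameter values are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary parameters for this sketch.
mu_x, mu_y, s_x, s_y, rho = 1.0, -2.0, 2.0, 0.5, 0.8
cov = [[s_x**2, rho * s_x * s_y], [rho * s_x * s_y, s_y**2]]
xs, ys = rng.multivariate_normal([mu_x, mu_y], cov, size=2_000_000).T

# Keep draws whose Y lands in a thin slab around y0, then compare the empirical
# mean/variance of X with N(mu_x + rho*(s_x/s_y)*(y0 - mu_y), s_x^2*(1 - rho^2)).
y0, eps = -1.5, 0.01
sel = xs[np.abs(ys - y0) < eps]
print(sel.mean(), mu_x + rho * s_x / s_y * (y0 - mu_y))  # both ≈ 2.6
print(sel.var(), s_x**2 * (1 - rho**2))                  # both ≈ 1.44
```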
Another example illustrates that you cannot always apply the formula $f_{X \mid Y}(x \mid y) = f_{X,Y}(x, y)/f_Y(y)$, because a joint density might not exist for the two random variables.
Example Suppose the probability of an experiment being successful is $p$; $p$ exists but is unknown. We also know that $p$ follows a beta distribution with parameters $(a, b)$. So we decide to do the experiment $n + m$ times, and we find out that $n$ of the trials turned out successful. Now what do we know about the distribution of $p$?
Solution Let $X_1, \dots, X_{n+m}$ be i.i.d. Bernoulli($p$), where $X_i$ is $1$ if the $i$th experiment turns out successful and $0$ otherwise, and let $N = \sum_{i=1}^{n+m} X_i$ be the number of successes. ($N$ is discrete while $p$ is continuous, so they have no joint density, but conditioning still makes sense.) Given $N = n$, the conditional p.d.f. of $p$ is
$$f_{p \mid N}(x \mid n) = \frac{P(N = n \mid p = x)\, f_p(x)}{P(N = n)} = \frac{\binom{n+m}{n} x^n (1 - x)^m \cdot \frac{x^{a-1}(1 - x)^{b-1}}{B(a, b)}}{P(N = n)} = C\, x^{n+a-1} (1 - x)^{m+b-1}$$
for a constant $C$ not depending on $x$. And since this must be the p.d.f. for $p \mid N = n$, it can only be the $\mathrm{Beta}(n + a,\ m + b)$ density.
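This Bayesian update is easy to check by simulation: draw $p$ from the $\mathrm{Beta}(a, b)$ prior, run $n + m$ Bernoulli trials for each draw, and keep the draws that produced exactly $n$ successes; the survivors should be distributed as $\mathrm{Beta}(n + a, m + b)$. The parameter values below are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a, b, n, m = 2, 3, 7, 5  # hypothetical prior parameters and trial counts

# Draw p from the Beta(a, b) prior, run n+m Bernoulli(p) trials per draw,
# and keep the p's whose experiments produced exactly n successes.
p = rng.beta(a, b, size=2_000_000)
successes = rng.binomial(n + m, p)
posterior_draws = p[successes == n]

# The survivors should follow Beta(n + a, m + b).
print(posterior_draws.mean(), stats.beta(n + a, m + b).mean())  # both ≈ 9/17 ≈ 0.529
```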