Statistics Review
Bayes Theorem
Bayes' theorem relates the two conditional probabilities of a pair of events:
\[ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \]
Notice that if \(A\) and \(B\) are independent, then \(P(B \mid A) = P(B)\) and the formula reduces to \(P(A \mid B) = P(A)\).
Joint Probability
Joint probability refers to the analysis of the interaction between two or more random variables. Take the example from the Lecture-2 slides, a table of counts indexed by \(w\) (rows) and \(d\) (columns):
| | d1 | d2 | d3 | d4 |
|---|---|---|---|---|
| A | 10 | 10 | 10 | 10 |
| B | 10 | 10 | 10 | 0 |
| C | 10 | 10 | 0 | 0 |
| D | 0 | 0 | 0 | 1 |
Each cell in the table below represents the joint probability \(P(w,d)\), obtained by dividing each count by the total count (91).
| | d1 | d2 | d3 | d4 | P(w) |
|---|---|---|---|---|---|
| A | 0.11 | 0.11 | 0.11 | 0.11 | 0.44 |
| B | 0.11 | 0.11 | 0.11 | 0.00 | 0.33 |
| C | 0.11 | 0.11 | 0.00 | 0.00 | 0.22 |
| D | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 |
| P(d) | 0.33 | 0.33 | 0.22 | 0.12 | 1.00 |
If \(w\) and \(d\) were independent, every cell would satisfy \(P(w,d) = P(w)P(d)\). Checking a cell shows this is not the case here: for instance, \(P(A,d1) = 0.11\), while \(P(A)P(d1) = 0.44 \times 0.33 \approx 0.15\).
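The whole computation (normalize the counts into a joint distribution, marginalize, test independence) can be sketched in a few lines of Python; the counts are the ones from the table above.

```python
# Build the joint distribution P(w, d) from the raw count table,
# then compute the marginals P(w) and P(d) by summing rows/columns.
counts = {
    ("A", "d1"): 10, ("A", "d2"): 10, ("A", "d3"): 10, ("A", "d4"): 10,
    ("B", "d1"): 10, ("B", "d2"): 10, ("B", "d3"): 10, ("B", "d4"): 0,
    ("C", "d1"): 10, ("C", "d2"): 10, ("C", "d3"): 0,  ("C", "d4"): 0,
    ("D", "d1"): 0,  ("D", "d2"): 0,  ("D", "d3"): 0,  ("D", "d4"): 1,
}
total = sum(counts.values())  # 91

# Joint probability of each (w, d) cell.
joint = {wd: c / total for wd, c in counts.items()}

# Marginals: sum the joint over the other variable.
p_w = {w: sum(p for (wi, _), p in joint.items() if wi == w) for w in "ABCD"}
p_d = {d: sum(p for (_, di), p in joint.items() if di == d)
       for d in ("d1", "d2", "d3", "d4")}

print(round(p_w["A"], 2))  # 0.44
print(round(p_d["d1"], 2))  # 0.33
# Independence would require P(w, d) = P(w) * P(d) in every cell:
print(all(abs(joint[(w, d)] - p_w[w] * p_d[d]) < 1e-9 for (w, d) in joint))
```

Running the last line prints `False`, confirming that \(w\) and \(d\) are not independent in this table.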
Conditional probability example
Consider the distribution of the four binary random variables below.

From the distribution diagram, we immediately derive the table of individual (marginal) probabilities:
| | A | B | C | D |
|---|---|---|---|---|
| =0 | 0.7 | 0.9 | 0.84 | 0.9 |
| =1 | 0.3 | 0.1 | 0.16 | 0.1 |
With a little bit of work, we can also derive the conditional probability tables. Let's do it step by step for A and B.
| A | B | \(P(A \mid B)\) |
|---|---|---|
| 0 | 0 | ? |
| 0 | 1 | ? |
| 1 | 0 | ? |
| 1 | 1 | ? |
Notice that \(A=1\) whenever \(B=1\). That means that \(P(A=1 \mid B=1)=1\). That also means that it is impossible to have \(A=0\) whenever \(B=1\). The latter is translated as \(P(A=0 \mid B=1) = 0\).
| A | B | \(P(A \mid B)\) |
|---|---|---|
| 0 | 0 | ? |
| 0 | 1 | 0 |
| 1 | 0 | ? |
| 1 | 1 | 1 |
We can also compute \(P(A=0 \mid B=0)\). Notice that we know the value \(P(A=0,B=0)\) from the distribution diagram; that is, \(P(A=0,B=0)=0.7\). Using this together with the definition of conditional probability, we obtain
\[ P(A=0 \mid B=0) = \frac{P(A=0,B=0)}{P(B=0)} = \frac{0.7}{0.9} = \frac{7}{9} \]
Updating our probability table:
| A | B | \(P(A \mid B)\) |
|---|---|---|
| 0 | 0 | 7/9 |
| 0 | 1 | 0 |
| 1 | 0 | ? |
| 1 | 1 | 1 |
We can follow a similar path to compute \(P(A=1 \mid B=0)\), but we can also use the following identity:
The conditional probabilities over all values of the variable on the left side of the \(\mid\) sum to \(1\).
That is, \(P(A=0 \mid B=0) + P(A=1 \mid B=0) = 1\). Therefore, \(P(A=1 \mid B=0) = 1 - 7/9 = 2/9\).
| A | B | \(P(A \mid B)\) |
|---|---|---|
| 0 | 0 | 7/9 |
| 0 | 1 | 0 |
| 1 | 0 | 2/9 |
| 1 | 1 | 1 |
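The whole \(P(A \mid B)\) table can also be computed in one shot from the joint distribution. Below is a minimal Python sketch; only \(P(A=0,B=0)=0.7\) is read directly off the diagram, while the other joint values are the ones implied by the marginals and the fact that \(A=1\) whenever \(B=1\).

```python
from fractions import Fraction

# Joint distribution of (A, B). Fractions keep the conditionals exact.
joint = {
    (0, 0): Fraction(7, 10),  # P(A=0, B=0), read off the diagram
    (0, 1): Fraction(0),      # impossible: A=1 whenever B=1
    (1, 0): Fraction(2, 10),  # P(A=1) - P(A=1, B=1) = 0.3 - 0.1
    (1, 1): Fraction(1, 10),  # P(A=1, B=1) = P(B=1) = 0.1
}

# Marginal: P(B=b) = sum over a of the joint.
p_b = {b: joint[(0, b)] + joint[(1, b)] for b in (0, 1)}

# Definition of conditional probability: P(A=a | B=b) = P(A=a, B=b) / P(B=b).
cond = {(a, b): joint[(a, b)] / p_b[b] for a in (0, 1) for b in (0, 1)}

print(cond[(0, 0)])  # 7/9
print(cond[(1, 0)])  # 2/9
print(cond[(1, 1)])  # 1
```

The printed values match the table above.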
We can obtain all the probability tables by repeating the previous steps for the other pairs of random variables:
| A | C | \(P(A \mid C)\) |
|---|---|---|
| 0 | 0 | 11/14 |
| 0 | 1 | 1/4 |
| 1 | 0 | 3/14 |
| 1 | 1 | 3/4 |
| A | D | \(P(A \mid D)\) |
|---|---|---|
| 0 | 0 | 0.7 |
| 0 | 1 | 0.7 |
| 1 | 0 | 0.3 |
| 1 | 1 | 0.3 |
| B | A | \(P(B \mid A)\) |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 2/3 |
| 1 | 0 | 0 |
| 1 | 1 | 1/3 |
| B | C | \(P(B \mid C)\) |
|---|---|---|
| 0 | 0 | 0.9 |
| 0 | 1 | 0.9 |
| 1 | 0 | 0.1 |
| 1 | 1 | 0.1 |
| B | D | \(P(B \mid D)\) |
|---|---|---|
| 0 | 0 | 0.9 |
| 0 | 1 | 0.9 |
| 1 | 0 | 0.1 |
| 1 | 1 | 0.1 |
| C | A | \(P(C \mid A)\) |
|---|---|---|
| 0 | 0 | 33/35 |
| 0 | 1 | 3/5 |
| 1 | 0 | 2/35 |
| 1 | 1 | 2/5 |
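As a consistency check, the \(P(C \mid A)\) table can be recomputed from the \(P(A \mid C)\) table by Bayes' theorem, \(P(C=c \mid A=a) = P(A=a \mid C=c)\,P(C=c)/P(A=a)\). A short Python sketch with exact fractions:

```python
from fractions import Fraction

# (a, c) -> P(A=a | C=c), from the P(A | C) table above.
p_a_given_c = {
    (0, 0): Fraction(11, 14), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(3, 14),  (1, 1): Fraction(3, 4),
}
# Marginals from the individual-probabilities table.
p_c = {0: Fraction(84, 100), 1: Fraction(16, 100)}
p_a = {0: Fraction(7, 10), 1: Fraction(3, 10)}

# Bayes' theorem: P(C=c | A=a) = P(A=a | C=c) * P(C=c) / P(A=a).
p_c_given_a = {(c, a): p_a_given_c[(a, c)] * p_c[c] / p_a[a]
               for a in (0, 1) for c in (0, 1)}

for (c, a), p in sorted(p_c_given_a.items()):
    print(f"P(C={c} | A={a}) = {p}")
```

This yields \(P(C=0 \mid A=0)=33/35\), \(P(C=0 \mid A=1)=3/5\), \(P(C=1 \mid A=0)=2/35\), and \(P(C=1 \mid A=1)=2/5\).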
I am not going to list the remaining tables because the relations are very simple: those variable pairs are independent, so each conditional probability equals the corresponding marginal.
Conditional probability identities
- \(\sum_{i}{P(A=a_i \mid B=b)} = 1\)
- \(\sum_{i}{P(A=a \mid B=b_i)P(B=b_i)} = P(A=a)\)
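Both identities are easy to check numerically. The sketch below uses the \(P(A \mid B)\) table and the marginal \(P(B)\) from the example above.

```python
from fractions import Fraction

# P(A=a | B=b) table and marginal P(B) from the worked example.
p_a_given_b = {(0, 0): Fraction(7, 9), (0, 1): Fraction(0),
               (1, 0): Fraction(2, 9), (1, 1): Fraction(1)}
p_b = {0: Fraction(9, 10), 1: Fraction(1, 10)}

# Identity 1: for each fixed b, the conditionals over a sum to 1.
for b in (0, 1):
    assert sum(p_a_given_b[(a, b)] for a in (0, 1)) == 1

# Identity 2 (law of total probability):
# sum over b of P(A=a | B=b) * P(B=b) equals P(A=a).
p_a = {a: sum(p_a_given_b[(a, b)] * p_b[b] for b in (0, 1)) for a in (0, 1)}
print(p_a[0], p_a[1])  # 7/10 3/10
```

The recovered marginals \(P(A=0)=0.7\) and \(P(A=1)=0.3\) agree with the individual-probabilities table.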
Remarks
- We cannot define a distribution by assigning arbitrary values to the conditional probabilities. To be consistent, the conditional probabilities must respect the identities above.
- On the other hand, we can define a probability distribution by giving the joint probabilities of all possible outcomes. That is, if my universe has random variables \(A,B,C\) and I have the value of \(P(A,B,C)\) for every possible combination of values of \(A,B,C\), then I have a probability distribution.