# Bayes' theorem

## Basics

### Conditional probability

Conditional probability is the probability of an event given that another event has already happened. It is the joint probability divided by a marginal probability.

$$P(A|B) = \frac{P(AB)}{P(B)}$$

The "|" means "given".[^1]
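
As a quick numeric check, here is a minimal sketch with a hypothetical die-rolling example (the events A and B below are made up for illustration):

```python
# Hypothetical example: rolling a fair six-sided die.
# A = "roll is even" = {2, 4, 6}, B = "roll > 3" = {4, 5, 6}.
A = {2, 4, 6}
B = {4, 5, 6}

p_B = len(B) / 6           # marginal probability P(B) = 3/6
p_AB = len(A & B) / 6      # joint probability P(AB) = |{4, 6}| / 6 = 2/6
p_A_given_B = p_AB / p_B   # P(A|B) = (2/6) / (3/6) = 2/3

print(p_A_given_B)  # P(A|B) = 2/3
```

Counting outcomes inside B that also belong to A is exactly the "circle of B" intuition from the footnote: the joint region, rescaled by P(B).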

### Law of total probability

$$P(B) = P(B|A)P(A) + P(B|!A)P(!A) = P(AB) + P(!AB)$$
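
The law of total probability says that P(B) is the prior-weighted sum of the conditional probabilities, P(B|A)P(A) + P(B|!A)P(!A). A minimal numeric sketch with hypothetical urn numbers:

```python
# Hypothetical example: one of two urns is picked at random.
# A = "urn 1 chosen", P(A) = 0.3; !A = "urn 2 chosen", P(!A) = 0.7.
# B = "draw a red ball": P(B|A) = 0.5, P(B|!A) = 0.2.
p_A = 0.3
p_B_given_A = 0.5
p_B_given_notA = 0.2

# Law of total probability: P(B) = P(B|A)P(A) + P(B|!A)P(!A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

print(round(p_B, 2))  # 0.5*0.3 + 0.2*0.7 = 0.29
```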

### Bayes' theorem

Clearly $P(A|B)$ has some relationship with $P(B|A)$, because they share the common part $P(AB)$.

$$P(A|B) = \frac{P(AB)}{P(B)}$$

$$P(B|A) = \frac{P(BA)}{P(A)}$$

$$P(BA) = P(AB) = P(B|A)P(A) = P(A|B)P(B)$$

Therefore,

$$P(A|B) = \frac{P(B|A)P(A)}{P(B)} \propto P(B|A)P(A)$$

If we expand the denominator with the law of total probability, it becomes

$$P(A|B) = \frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|!A)P(!A)}$$

This equation is called Bayes' theorem, and it means that if you are given something like $P(B|A)$, you can find its reverse $P(A|B)$ (i.e., Bayesian inference, see the next block).
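
The classic illustration is a diagnostic test (the numbers below are hypothetical, chosen only to show the mechanics):

```python
# Hypothetical diagnostic test:
# A = "has the disease", B = "test is positive".
p_A = 0.01             # prior P(A): 1% prevalence
p_B_given_A = 0.95     # sensitivity P(B|A)
p_B_given_notA = 0.05  # false-positive rate P(B|!A)

# Bayes' theorem with the law of total probability in the denominator
numerator = p_B_given_A * p_A
denominator = numerator + p_B_given_notA * (1 - p_A)
p_A_given_B = numerator / denominator

print(round(p_A_given_B, 4))  # ≈ 0.161
```

Even with a 95%-sensitive test, a positive result raises the 1% prior to only about 16%, because false positives from the large healthy population dominate the denominator.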

## Bayesian inference

### Hypothesis test

Based on the theorem, if we take the given event B to be our collected data and A to be the thing we want to test (i.e., a hypothesis), the equation becomes:

$$\color{blue}{Pr(H_i|data)} = \frac{\color{pink}{Pr(data|H_i)}\color{steelblue}{Pr(H_i)}}{\sum_{j=1}^n \color{pink}{Pr(data|H_j)}\color{steelblue}{Pr(H_j)}}$$

$\color{pink}{Pr(data|H_i)}$ is called the $\color{pink}{likelihood}$, $\color{steelblue}{Pr(H_i)}$ the $\color{steelblue}{prior}$ probability, and $\color{blue}{Pr(H_i|data)}$ the $\color{blue}{posterior}$ probability.
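
A minimal sketch of this update over a discrete set of hypotheses (the coin-bias hypotheses, uniform prior, and data below are all hypothetical):

```python
# Hypothetical setup: three hypotheses about a coin's heads probability,
# a uniform prior, and observed data of 8 heads in 10 flips.
from math import comb

hypotheses = [0.3, 0.5, 0.8]   # H_i: P(heads) under each hypothesis
prior = [1/3, 1/3, 1/3]        # prior Pr(H_i)
heads, flips = 8, 10

# Likelihood Pr(data|H_i) from the binomial formula
likelihood = [comb(flips, heads) * p**heads * (1 - p)**(flips - heads)
              for p in hypotheses]

# Posterior Pr(H_i|data): normalize by the sum over all hypotheses j
evidence = sum(l * p for l, p in zip(likelihood, prior))
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]

print([round(p, 3) for p in posterior])  # → [0.004, 0.127, 0.869]
```

After seeing 8 heads in 10 flips, nearly all the probability mass moves to the hypothesis that the coin lands heads 80% of the time.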

### Parameter estimation

Changing the hypotheses to parameters to estimate, the only difference is that a parameter is continuous. If the probability is discrete, we use Pr(); otherwise, we use P() to represent the PDF (probability density function) and replace the sum with an integral.

$$\color{blue}{P(\theta|data)} = \frac{\color{pink}{P(data|\theta)}\color{steelblue}P(\theta)}{\int \color{pink}{P(data|\theta)}\color{steelblue}P(\theta)d\theta}$$

The $\theta$ here refers to the single parameter of our PDF. If there are two or more parameters, $\theta$ becomes a vector and the integral runs over all of them.
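
In practice the integral in the denominator is often approximated numerically. A minimal grid-approximation sketch (hypothetical data: 8 heads in 10 flips, flat prior on $\theta$):

```python
# Grid approximation of the posterior P(theta|data) for a coin's
# unknown heads probability theta, with a flat prior on [0, 1].
from math import comb

n_grid = 1001
grid = [i / (n_grid - 1) for i in range(n_grid)]  # theta values in [0, 1]
d_theta = 1 / (n_grid - 1)

heads, flips = 8, 10
prior = [1.0] * n_grid                            # flat prior P(theta) = 1

# Likelihood P(data|theta) at each grid point
likelihood = [comb(flips, heads) * t**heads * (1 - t)**(flips - heads)
              for t in grid]

# Approximate the integral in the denominator by a Riemann sum
evidence = sum(l * p for l, p in zip(likelihood, prior)) * d_theta
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]

# Posterior mean; with a flat prior this is the Beta(9, 3) mean = 9/12
post_mean = sum(t * p for t, p in zip(grid, posterior)) * d_theta
print(round(post_mean, 3))  # ≈ 0.75
```

Replacing the integral with a sum over a fine grid is the simplest way to make the continuous version computable; for higher-dimensional $\theta$ one would typically switch to sampling methods such as MCMC.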

### What is science?

In conclusion, Bayesian inference works like this: initial belief + new data = updated belief. And this is exactly the philosophy of science. Science refers to a system of acquiring knowledge and updating our cognition, and the scientific method consists of induction and deduction (see the figure below).

[^1]: Graphical intuition: within the circle of B, the probability of the part that overlaps with A.