Bayesian probability is named after Thomas Bayes. It interprets probability as a degree of personal conviction and thus differs from objectivist interpretations of probability, such as the frequentist concept (which interprets probability as a relative frequency). This concept of probability is often used to re-assess the plausibility of a statement in light of new findings. Laplace later discovered the theorem independently of Bayes and used it to solve problems in celestial mechanics, medical statistics, and elsewhere.
For example, Laplace estimated Saturn’s mass from the existing astronomical observations of its orbit. He stated the result along with a measure of his uncertainty: “I bet 11,000 to 1 that the error of this result is no greater than 1/100 of its value.” Laplace would have won the bet: 150 years later, his result had to be corrected by only 0.37% on the basis of new data.
The Bayesian concept of probability was first systematically elaborated at the beginning of the 20th century, especially by Harold Jeffreys and Frank Plumpton Ramsey. Ramsey developed an approach that he could not pursue further due to his early death, but which was independently taken up by Bruno de Finetti. The basic idea is to regard rational assessments as a generalization of betting strategies: given a body of information, measurements, or data points, one asks how much one would bet on the correctness of one’s assessment, or what odds one would offer. Several arguments against frequentist statistical methods rest on this idea, which has been debated between Bayesians and frequentists since the 1950s.
Formalization of the probability concept
If one is willing to interpret probability as a degree of certainty in the personal assessment of a fact, the question arises which logical properties such probabilities must have in order not to be contradictory. Richard Cox made a significant contribution here. He demands the validity of the following principles:
- Transitivity: If probability A is greater than probability B, and probability B is greater than probability C, then probability A must also be greater than probability C. Without this property, probabilities could not be expressed by real numbers, since the real numbers are ordered transitively. Moreover, paradoxes such as the following would occur: a man who does not grasp the transitivity of probabilities has bet on horse A in a race. Now he believes horse B is better and exchanges his ticket, paying a fee, which he accepts because he now holds what he considers a better ticket. Then he decides horse C is better than horse B, exchanges again, and pays again. Finally he comes to believe horse A is better than horse C, exchanges once more, and pays once more. At each step he believes he is getting a better ticket, yet he ends up exactly where he started, only poorer.
- Negation: If we have an expectation about the truth of something, then we implicitly also have an expectation about its untruth.
- Conditioning: If we have an expectation about the truth of H, and also an expectation about the truth of D given that H is true, then we implicitly also have an expectation about the simultaneous truth of H and D.
- Consistency (soundness): If the same conclusion can be reached from certain information in multiple ways, then the result must always be the same.
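In standard probability notation, the negation and conditioning principles correspond to the rules P(¬H) = 1 − P(H) and P(H ∧ D) = P(H)·P(D|H), from which Bayes’ theorem follows. A minimal numeric sketch (all numbers are invented purely for illustration):

```python
# Cox's principles rendered as the familiar rules of probability.
# The numeric values below are invented solely for illustration.

p_h = 0.3          # expectation that hypothesis H is true
p_d_given_h = 0.8  # expectation that data D occurs, given H is true
p_d = 0.5          # overall expectation that D occurs

# Negation: an expectation about H fixes the expectation about not-H.
p_not_h = 1.0 - p_h

# Conditioning (product rule): expectations about H and about D-given-H
# fix the expectation that H and D are true simultaneously.
p_h_and_d = p_h * p_d_given_h

# Bayes' theorem follows from applying the product rule in both orders:
# P(H and D) = P(H) * P(D|H) = P(D) * P(H|D)
p_h_given_d = p_h_and_d / p_d

print(p_not_h)      # 0.7
print(p_h_and_d)    # 0.24
print(p_h_given_d)  # 0.48
```

The last line shows how an initial expectation P(H) is updated into P(H|D) once the data D are observed, which is exactly the “re-assessment in light of new findings” described above.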
Generally speaking, there are two interpretations of Bayesian probability. Objectivists interpret probability as an extension of logic: probability quantifies the reasonable expectation of everyone (even a robot) who shares the same knowledge and follows the rules of Bayesian statistics, which can be justified by Cox’s theorem. For subjectivists, probability corresponds to a personal degree of belief; rationality and coherence allow substantial variation within the constraints they impose, and these constraints are justified by de Finetti’s theorem and arguments from decision theory. The objective and subjective variants of Bayesian probability differ mainly in their interpretation and construction of the prior probability.
Practical importance in statistics
Statistics frequently deals with unknown quantities, such as the parameters of a model. To tackle such problems within the frequentist interpretation, the uncertainty is described there using a specially introduced auxiliary random variable. Bayesian probability theory does not require such an auxiliary quantity. Instead, it introduces the concept of the a priori probability, which summarizes the observer’s prior knowledge and basic assumptions in a probability distribution. Proponents of the Bayesian approach see it as a great advantage that prior knowledge and a priori assumptions are expressed explicitly in the model.
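How prior knowledge enters as a probability distribution can be sketched with a standard conjugate example: a Beta prior over a coin’s unknown heads probability, updated by observed tosses. The prior parameters and data below are invented for illustration only:

```python
# Sketch: prior knowledge encoded as a probability distribution.
# A Beta(a, b) prior over a coin's unknown heads probability is
# updated with observed data. All numbers are invented.

a, b = 2.0, 2.0        # prior: mild belief that the coin is roughly fair
heads, tails = 7, 3    # observed tosses

# Conjugate update: the posterior is Beta(a + heads, b + tails),
# so the prior assumptions and the data combine in one distribution.
a_post, b_post = a + heads, b + tails

prior_mean = a / (a + b)
posterior_mean = a_post / (a_post + b_post)

print(prior_mean)      # 0.5
print(posterior_mean)  # 9/14 ≈ 0.643
```

The prior mean of 0.5 encodes the initial assumption of a fair coin; after seeing 7 heads in 10 tosses, the posterior mean shifts toward the data, illustrating how explicit prior assumptions and observations are combined without any auxiliary random variable for the unknown parameter.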