Bayesian vs. Frequentist Statistics

Abstract:

Bayesian and frequentist approaches represent two fundamental paradigms in statistical inference. The Bayesian framework updates prior beliefs using observed data through Bayes’ Rule, yielding a full probability distribution over parameters. In contrast, the frequentist view treats parameters as fixed and bases inference on sampling distributions. Each approach offers distinct advantages in handling uncertainty, with Bayesian methods excelling in incorporating prior knowledge and frequentist methods focusing on long-run error control.

Bayes’ Rule and the Foundations of Bayesian Statistics

Bayesian statistics is a probabilistic framework for inference that models unknown parameters as random variables. It centers on Bayes’ Rule, which provides a mathematical way to update beliefs in light of new evidence.


Bayes’ Rule

Bayes’ Rule describes how to compute the posterior probability of a hypothesis or parameter $\theta$ after observing data $D$:

\[ P(\theta | D) = \frac{P(D|\theta) \cdot P(\theta)}{P(D)} \]

Where:
- $P(\theta|D)$: Posterior, the updated belief about $\theta$ after observing $D$
- $P(D|\theta)$: Likelihood, the probability of the data given $\theta$
- $P(\theta)$: Prior, the initial belief about $\theta$ before seeing data
- $P(D)$: Marginal likelihood, the total probability of the data under all possible $\theta$

This rule is derived from conditional probability and underpins the Bayesian approach to reasoning under uncertainty.
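
The derivation is short enough to write out. As a sketch, using the same symbols as above and assuming $P(D) > 0$, the joint probability of $\theta$ and $D$ can be factored in two ways:

\[ P(\theta, D) = P(\theta | D) \cdot P(D) = P(D | \theta) \cdot P(\theta) \]

Dividing both sides by $P(D)$ gives Bayes’ Rule. The denominator follows from the law of total probability (an integral for continuous $\theta$, a sum for discrete $\theta$):

\[ P(D) = \int P(D | \theta) \cdot P(\theta) \, d\theta \]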

Bayesian Inference Workflow

Bayesian inference proceeds in three conceptual steps:
1. Define a prior: incorporate previous knowledge or assumptions about the parameter.
2. Specify a likelihood: model how the data are generated given the parameter.
3. Compute the posterior: update the prior using observed data via Bayes’ Rule.
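
The three steps can be sketched with a simple grid approximation. This is only an illustration: the data (7 successes in 10 trials), the flat prior, and the grid resolution are all assumptions made for the example, not values from the text.

```python
import numpy as np

# Hypothetical data for illustration: 7 successes in 10 Bernoulli trials
successes, trials = 7, 10

# Grid of candidate values for theta (a success probability)
theta = np.linspace(0.001, 0.999, 999)

# Step 1: prior. A flat prior over the grid (an assumption, not a recommendation)
prior = np.ones_like(theta)
prior /= prior.sum()

# Step 2: likelihood of the data at each theta (binomial, constant factor dropped)
likelihood = theta**successes * (1 - theta) ** (trials - successes)

# Step 3: posterior is proportional to prior * likelihood; normalize over the grid
unnormalized = prior * likelihood
posterior = unnormalized / unnormalized.sum()

posterior_mean = float(np.sum(theta * posterior))
print(f"Posterior mean of theta: {posterior_mean:.3f}")
```

With a flat prior the exact posterior is Beta(8, 4), whose mean is 8/12, so the grid result should land very close to that value.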

Unlike frequentist methods, which produce point estimates or confidence intervals, Bayesian methods yield a full probability distribution over parameters. Any summary of interest (a mean, an interval, the probability that a parameter exceeds a threshold) can then be read directly from that distribution.

Example

Medical diagnosis: A doctor updates the probability of disease after a test result by factoring in test accuracy (likelihood) and disease prevalence (prior). For rare diseases, even highly accurate tests can yield many false positives because of the low base rate.

# Disease diagnosis using Bayes' Rule
# P(Disease) = 0.01 (1% prevalence)
# P(Positive | Disease) = 0.99 (true positive rate)
# P(Positive | No Disease) = 0.05 (false positive rate)

# Given a positive test, what is the probability the person actually has the disease?

P_D = 0.01
P_not_D = 1 - P_D
P_pos_given_D = 0.99
P_pos_given_not_D = 0.05

# Total probability of a positive test
P_pos = P_pos_given_D * P_D + P_pos_given_not_D * P_not_D

# Bayes' Rule
P_D_given_pos = (P_pos_given_D * P_D) / P_pos

print(f"Probability of having the disease given a positive test: {P_D_given_pos:.2%}")
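
The arithmetic behind that result can be checked by hand. Using the same numbers as the code above (the symbols $\text{Pos}$ and $\text{Dis}$ are shorthand introduced here for a positive test and for having the disease):

\[ P(\text{Pos}) = 0.99 \cdot 0.01 + 0.05 \cdot 0.99 = 0.0099 + 0.0495 = 0.0594 \]

\[ P(\text{Dis} | \text{Pos}) = \frac{0.0099}{0.0594} = \frac{1}{6} \approx 16.67\% \]

So even with a 99% true positive rate, only about one in six positive results corresponds to an actual case, because healthy people vastly outnumber sick ones.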

Use Case

A/B Testing in Marketing:
Bayesian A/B testing provides a flexible alternative to traditional hypothesis testing by estimating the probability that one variant outperforms another. Instead of relying on p-values and fixed sample sizes, it continuously updates the probability of each variant being the best as new data arrives.

For example, in an email campaign, you might compare two subject lines (A and B) based on their click-through rates. With a Bayesian approach, you start with a prior belief about each variant’s performance and use the observed clicks to update your beliefs. The result is a posterior distribution for each variant’s click-through rate, from which you can directly obtain:

  • The probability that A is better than B
  • The expected lift from choosing one over the other
  • When to stop the test based on decision thresholds (e.g., 95% probability A is better)

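This approach supports real-time decision-making and aligns more closely with business goals than a binary significant/not-significant outcome. It can also reduce the risk of false discoveries, provided the prior and the stopping threshold are chosen with care.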
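
The three quantities in the list above can be sketched with posterior samples. The click counts, the uniform Beta(1, 1) priors, and the 0.95 threshold below are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical email campaign data (made up for illustration)
clicks_A, sends_A = 120, 1000
clicks_B, sends_B = 100, 1000

# Uniform Beta(1, 1) priors updated with the observed clicks give Beta posteriors
samples_A = rng.beta(1 + clicks_A, 1 + sends_A - clicks_A, size=100_000)
samples_B = rng.beta(1 + clicks_B, 1 + sends_B - clicks_B, size=100_000)

# Probability that A has the higher click-through rate
prob_A_better = float(np.mean(samples_A > samples_B))

# Expected lift, here defined as the mean absolute difference in click-through rate
expected_lift = float(np.mean(samples_A - samples_B))

# Simple stopping rule: stop once the probability crosses a chosen threshold
threshold = 0.95
stop_test = prob_A_better >= threshold

print(f"P(A > B): {prob_A_better:.3f}")
print(f"Expected lift (A - B): {expected_lift:.4f}")
print(f"Stop the test: {stop_test}")
```

With these made-up counts the probability comes out at roughly 0.92, short of the 0.95 threshold, so this rule would keep the test running.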

Example

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

# Prior belief: Beta(2, 2) — a balanced prior
alpha_prior, beta_prior = 2, 2

# Observed data: 30 successes out of 50 trials
successes = 30
failures = 50 - successes

# Posterior parameters
alpha_post = alpha_prior + successes
beta_post = beta_prior + failures

# Plotting the prior and posterior
x = np.linspace(0, 1, 1000)
prior_dist = beta(alpha_prior, beta_prior).pdf(x)
posterior_dist = beta(alpha_post, beta_post).pdf(x)

plt.plot(x, prior_dist, label='Prior', linestyle='--')
plt.plot(x, posterior_dist, label='Posterior')
plt.title('Bayesian Update: Prior to Posterior')
plt.xlabel('Conversion Rate')
plt.ylabel('Density')
plt.legend()
plt.show()

Features of Bayesian and Frequentist Inference

Bayesian and frequentist inference represent two distinct approaches to statistical reasoning:

Bayesian Inference

  • Parameters are treated as random variables with probability distributions.
  • Inference is based on the posterior distribution, which combines the prior and likelihood using Bayes’ Rule.
  • Allows probabilistic statements about parameters (e.g., “there is a 95% chance that θ is between a and b”).
  • Naturally incorporates prior knowledge or beliefs.
  • Supports sequential updating as new data arrives.
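
Two of these features, sequential updating and direct probability statements about a parameter, can be sketched with a Beta-Binomial model. The batch counts below are hypothetical, and the Beta(2, 2) prior is just one possible choice.

```python
from scipy.stats import beta

# Beta(2, 2) prior on a success probability (an illustrative choice)
a, b = 2, 2

# Hypothetical data arriving in two batches: (successes, failures)
batches = [(12, 8), (18, 12)]

# Sequential updating: each posterior becomes the prior for the next batch
for s, f in batches:
    a, b = a + s, b + f

# Updating once with all the data gives the same posterior
total_s = sum(s for s, _ in batches)
total_f = sum(f for _, f in batches)
assert (a, b) == (2 + total_s, 2 + total_f)

# 95% equal-tailed credible interval for the success probability
low, high = beta(a, b).ppf([0.025, 0.975])
print(f"Posterior: Beta({a}, {b}); 95% credible interval: [{low:.3f}, {high:.3f}]")
```

The interval supports the kind of statement quoted above: given the model and prior, there is a 95% posterior probability that the parameter lies between `low` and `high`.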

Frequentist Inference

  • Parameters are fixed but unknown quantities.
  • Inference is based on the sampling distribution of estimators.
  • Relies on concepts like confidence intervals and p-values.
  • Does not use a prior distribution; conclusions depend on the data at hand and the assumed sampling model.
  • Emphasizes long-run frequency properties (e.g., “95% of intervals from repeated samples will contain θ”).
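
The long-run frequency claim can be checked by simulation. In this sketch the true mean, the standard deviation (treated as known), the sample size, and the number of repetitions are all assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "true" values, known only because this is a simulation
true_mean, sigma, n = 1.0, 2.0, 30
n_repeats = 10_000
z = 1.96  # normal critical value for a 95% interval (sigma treated as known)

# Draw many independent samples and build one interval per sample
samples = rng.normal(true_mean, sigma, size=(n_repeats, n))
means = samples.mean(axis=1)
half_width = z * sigma / np.sqrt(n)

# Fraction of intervals that contain the fixed true mean
covered = (means - half_width <= true_mean) & (true_mean <= means + half_width)
coverage = float(covered.mean())
print(f"Fraction of intervals containing the true mean: {coverage:.3f}")
```

The coverage should be close to 0.95. The probability statement is about the interval-building procedure across repeated samples, not about any single interval.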

Example

Drug Effectiveness Trials:
A frequentist approach might test whether a new drug’s effect is significantly different from a placebo using p-values. A Bayesian approach, however, would model the uncertainty in the drug's effect and give a probability distribution over likely effect sizes, which supports risk-benefit analysis and more personalized decision-making.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Assumed summary statistics for the treatment group (no random draws are made)
sample_mean = 1.2
sample_std = 0.5
n = 30

# Frequentist: 95% confidence interval for the mean (normal approximation, z = 1.96)
conf_int_low = sample_mean - 1.96 * (sample_std / np.sqrt(n))
conf_int_high = sample_mean + 1.96 * (sample_std / np.sqrt(n))

# Bayesian: Posterior assuming Normal prior and Normal likelihood
prior_mu, prior_sigma = 0, 1  # weak prior
likelihood_sigma = sample_std / np.sqrt(n)

posterior_mu = (prior_mu / prior_sigma**2 + sample_mean / likelihood_sigma**2) / \
               (1 / prior_sigma**2 + 1 / likelihood_sigma**2)
posterior_sigma = np.sqrt(1 / (1 / prior_sigma**2 + 1 / likelihood_sigma**2))

# Plotting
x = np.linspace(-1, 3, 500)
frequentist_dist = norm(sample_mean, sample_std / np.sqrt(n)).pdf(x)
bayesian_dist = norm(posterior_mu, posterior_sigma).pdf(x)

plt.plot(x, frequentist_dist, label='Frequentist (Sampling Dist)', linestyle='--')
plt.plot(x, bayesian_dist, label='Bayesian Posterior')
plt.axvline(conf_int_low, color='gray', linestyle=':', label='95% CI')
plt.axvline(conf_int_high, color='gray', linestyle=':')
plt.title('Frequentist vs Bayesian Inference')
plt.xlabel('Estimated Effect')
plt.ylabel('Density')
plt.legend()
plt.show()

Comparing Frequentist and Bayesian Approaches

Frequentist and Bayesian inference are two foundational paradigms in statistics, each with a distinct interpretation of probability and approach to data analysis:

Frequentist Approach

  • Views probability as the long-run frequency of events under repeated sampling.
  • Parameters are fixed but unknown; only the data vary.
  • Inference is based on sampling distributions, confidence intervals, and p-values.
  • No prior beliefs are incorporated.

Bayesian Approach

  • Interprets probability as a degree of belief or certainty, updated with evidence.
  • Parameters are random variables with probability distributions.
  • Inference is based on posterior distributions, combining prior knowledge and observed data via Bayes’ Rule.
  • Naturally supports sequential learning and uncertainty quantification.

Key Differences

  • Interpretation of probability: Long-run frequency, often described as objective (frequentist) vs. degree of belief, often described as subjective (Bayesian).
  • Parameter treatment: Fixed (frequentist) vs. probabilistic (Bayesian).
  • Use of prior information: Not formally included (frequentist) vs. explicitly included (Bayesian).
  • Inference output: Confidence intervals and p-values (frequentist) vs. full posterior distributions (Bayesian).

Use Case

Estimating Conversion Rates in A/B Testing:
- A frequentist might run a test, compute a p-value, and decide based on a significance threshold (e.g., 0.05).
- A Bayesian would estimate the probability that version A is better than B, incorporating prior campaign data, and make a decision based on the posterior probability.

Example

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
from statsmodels.stats.proportion import proportions_ztest

# A/B test observed data
conversions_A, trials_A = 30, 100
conversions_B, trials_B = 45, 100

# Frequentist Approach: Two-proportion z-test
count = np.array([conversions_A, conversions_B])
nobs = np.array([trials_A, trials_B])

stat, pval = proportions_ztest(count, nobs, alternative='two-sided')
print("Frequentist Approach")
print(f"Observed conversion rates: A = {conversions_A/trials_A:.2f}, B = {conversions_B/trials_B:.2f}")
print(f"Z-test statistic: {stat:.3f}, p-value: {pval:.4f}")
if pval < 0.05:
    print("Result: Statistically significant difference (p < 0.05)")
else:
    print("Result: No statistically significant difference")

# Bayesian Approach: Posterior distributions with uniform Beta(1,1) priors
posterior_A = beta(1 + conversions_A, 1 + trials_A - conversions_A)
posterior_B = beta(1 + conversions_B, 1 + trials_B - conversions_B)

# Probability that B > A via simulation
samples_A = posterior_A.rvs(100_000)
samples_B = posterior_B.rvs(100_000)
prob_B_superior = np.mean(samples_B > samples_A)

# Plot posteriors
x = np.linspace(0, 1, 1000)
plt.plot(x, posterior_A.pdf(x), label='Posterior A')
plt.plot(x, posterior_B.pdf(x), label='Posterior B')
plt.title('Bayesian Posteriors for Conversion Rates')
plt.xlabel('Conversion Rate')
plt.ylabel('Density')
plt.legend()
plt.show()

print("\nBayesian Approach")
print(f"Estimated P(B > A): {prob_B_superior:.3f}")
if prob_B_superior > 0.95:
    print("Result: High confidence that B performs better than A (Bayesian)")
else:
    print("Result: Insufficient evidence to confidently prefer B over A")
