Arthur White
Instructor: Arthur White
Email: arwhite@tcd.ie
Office: Room 144, Lloyd Building
Office hours: 10am-12pm Fridays
Email me to schedule a meeting; I can also meet you remotely using Teams
All material will appear on Blackboard
Lectures
Monday 1pm LB 1.07
Friday 9am LB 1.07
Case studies will accompany lecture material
Email: arwhite@tcd.ie
Discussion board
Your interaction and feedback are crucial
Input from class reps always useful
This module is assessed 100% by coursework, i.e., no exam
2 x small assignments: 15% each. These will be problem sets
Main assignment: 70%. This will be a report describing a detailed analysis of a complex data set
All assignments will be submitted through Turnitin
These will be scheduled with the goal of giving you plenty of time to complete them, especially the main assignment. More details to follow.
rstan and brms
There is no compulsory textbook for this course, but the following cover different aspects of the material:
P.D. Hoff, A first course in Bayesian statistical methods. Springer, 2009. Library e-link
S.N. Wood, Core Statistics. Cambridge University Press, 2015. Library link and free pdf
C.M. Bishop, Pattern recognition and machine learning. Springer, 2006. Library link and free pdf
B. Efron & T. Hastie. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Cambridge University Press, 2016. Free pdf
A. A. Johnson, M. Q. Ott, M. Dogucu. Bayes Rules! An introduction to applied Bayesian modeling. Chapman & Hall, 2022. Full text online
bayesrules.
This module will provide an overview of statistical models and how to apply them to analyse data.
We will focus on theory and application:
Our models will be motivated by different research problems
“Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine,” Polack et al. (2020)
Sample of 43,548 participants randomized to receive the mRNA Covid-19 vaccine or a placebo (i.e., control group).
The authors report that 8 cases of Covid-19 were recorded among vaccinated participants, while 162 cases were recorded in the control arm.
Vaccine efficacy is estimated by \(VE = 100\times(1-RR),\) where \(RR\) is the estimated relative risk: the ratio of the rate of confirmed Covid-19 cases in the vaccine group to the rate in the placebo group.
How effective is the vaccine?
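As a rough worked version of the formula for \(VE,\) the following Python sketch computes the efficacy from the reported case counts. It assumes the two arms were of equal size, so that the ratio of case counts stands in for \(RR\); only the total sample size is quoted here, so the arm sizes are an assumption rather than figures taken from the paper. The function name `vaccine_efficacy` is illustrative.

```python
def vaccine_efficacy(cases_vaccine, n_vaccine, cases_placebo, n_placebo):
    """Return VE = 100 * (1 - RR), with RR the ratio of case rates."""
    risk_vaccine = cases_vaccine / n_vaccine
    risk_placebo = cases_placebo / n_placebo
    rr = risk_vaccine / risk_placebo
    return 100 * (1 - rr)


# ASSUMPTION: equal arms, each half of the 43,548 participants.
n_per_arm = 43548 // 2

ve = vaccine_efficacy(8, n_per_arm, 162, n_per_arm)
print(f"Estimated VE: {ve:.1f}%")
```

Under the equal-arms assumption this gives an efficacy of roughly 95%. A single number like this says nothing about how uncertain the estimate is, which is the gap the rest of the module addresses.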
Students from 100 different schools take a standardised test.
Can we quantify which schools are best? By how much?
##      school mathscore
## 763      40     46.98
## 1584     81     32.85
## 1624     84     27.88
## 489      26     54.48
## 985      51     69.42
## 569      30     46.95
## 1831     94     51.76
## 469      25     41.68
## 1315     66     30.41
## 444      24     49.08
## 1367     69     38.47
## 716      37     52.13
Special aerobics vs standard running programme, \(n = 12\)
Can we quantify the aerobics programme’s effect on the increase in oxygen uptake, accounting for age?
##    uptake aerobic age
## 1   -0.87       0  23
## 2  -10.74       0  22
## 3   -3.27       0  22
## 4   -1.97       0  25
## 5    7.50       0  27
## 6   -7.25       0  20
## 7   17.05       1  31
## 8    4.96       1  23
## 9   10.40       1  27
## 10  11.05       1  28
## 11   0.26       1  22
## 12   2.51       1  24
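One way to see what "accounting for age" might involve is to compare the raw difference in group means with the programme coefficient from a simple regression. The Python sketch below uses the twelve observations printed above and fits, by ordinary least squares, an additive model uptake = b0 + b1 × aerobic + b2 × age. Both the additive model and the least-squares fit are assumptions made for illustration; they are not the Bayesian analysis this module will develop. The helper `solve` is a hypothetical name.

```python
# Data as printed above: uptake, programme indicator (1 = aerobics), age.
uptake = [-0.87, -10.74, -3.27, -1.97, 7.50, -7.25,
          17.05, 4.96, 10.40, 11.05, 0.26, 2.51]
aerobic = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
age = [23, 22, 22, 25, 27, 20, 31, 23, 27, 28, 22, 24]


def mean(values):
    return sum(values) / len(values)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Raw comparison: difference in mean uptake between the two programmes.
raw_diff = (mean([u for u, a in zip(uptake, aerobic) if a == 1])
            - mean([u for u, a in zip(uptake, aerobic) if a == 0]))

# ASSUMED model: uptake = b0 + b1 * aerobic + b2 * age + error.
# Least squares solves the normal equations (X'X) b = X'y.
design = [[1.0, a, g] for a, g in zip(aerobic, age)]
xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
       for i in range(3)]
xty = [sum(row[i] * u for row, u in zip(design, uptake)) for i in range(3)]
b0, b1, b2 = solve(xtx, xty)

print(f"Raw difference in means: {raw_diff:.2f}")
print(f"Age-adjusted programme effect (b1): {b1:.2f}")
print(f"Estimated change per year of age (b2): {b2:.2f}")
```

In this sample the aerobics group is older on average, so part of the raw difference is absorbed by the age term and the adjusted effect comes out smaller than the raw one. With only twelve participants, either figure is subject to considerable uncertainty.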
These examples have some aspects in common
Using specific individuals (data) to describe a population (model parameters)
Limited, finite sample sizes (at least within each group)
Variation in the data must be accounted for
Generically, a statistical model will have parameter(s) \(\theta.\)
A typical statistical analysis will be interested in answering some or all of the following questions:
Point estimation: What value(s) of \(\theta\) is (are) most consistent with the observed data \(y?\)
Hypothesis testing: Are the estimated value(s) of \(\theta\) consistent with some pre-specified value(s) \(\theta_0?\)
Interval estimation: What range(s) of value(s) of \(\theta\) are most plausibly consistent with \(y?\)
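To make these three questions concrete, here is a minimal Python sketch that answers each of them for the vaccine trial counts (8 cases in the vaccine arm, 162 in the placebo arm) using draws from a posterior distribution. Several choices here are assumptions made purely for illustration: the arms are treated as equal in size, so that \(\theta\) is the probability that a case falls in the vaccine arm and \(RR = \theta/(1-\theta)\); the prior on \(\theta\) is uniform; and the threshold `ve_0` is a hypothetical pre-specified value, not one taken from the trial.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

cases_vaccine, cases_placebo = 8, 162

# ASSUMPTION: equal-sized arms, so theta = P(a case is in the vaccine arm)
# and the relative risk is theta / (1 - theta).
# ASSUMPTION: uniform Beta(1, 1) prior on theta, which gives a
# Beta(1 + 8, 1 + 162) posterior.
a_post = 1 + cases_vaccine
b_post = 1 + cases_placebo

draws = 20000
theta = (random.betavariate(a_post, b_post) for _ in range(draws))
ve = sorted(100 * (1 - t / (1 - t)) for t in theta)

# Point estimation: posterior mean of VE.
ve_mean = sum(ve) / draws

# Interval estimation: central 95% credible interval from sample quantiles.
lower = ve[int(0.025 * draws)]
upper = ve[int(0.975 * draws) - 1]

# Hypothesis testing: posterior probability that VE exceeds a threshold.
ve_0 = 50  # hypothetical pre-specified value, in percent
prob_above = sum(v > ve_0 for v in ve) / draws

print(f"Posterior mean VE: {ve_mean:.1f}%")
print(f"95% credible interval: ({lower:.1f}%, {upper:.1f}%)")
print(f"P(VE > {ve_0}%): {prob_above:.3f}")
```

The posterior mean is only one possible point estimate; the posterior median or mode would be equally reasonable summaries, and a different prior would shift all three answers slightly.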
We may also have a choice of different, related models that we can apply to the data at hand. In such cases, we will have to decide:
Model choice: Which model is the most appropriate for these data? (i.e., relative performance)
Goodness of fit: Is our preferred model a suitable fit for the data?
We are going to study different models:
We will fit these models to data using a Bayesian computational framework
We will communicate our findings in terms of the original context of the research question