MAS61006 Bayesian Statistics and Computational Methods

Both semesters, 2021/22. 30 Credits. Lecturer: Dr Miguel Juarez

This module first introduces the Bayesian approach to statistical inference, presenting the foundations, applications in simple settings (using “conjugate priors”), and computational methods using Markov chain Monte Carlo. In parallel, students will also learn R programming, with a focus on implementing various Monte Carlo methods. In semester 2, students will study further computational methods which can be used for implementing various statistical inferential procedures. More complex Bayesian models will also be presented, which will provide motivation for the computational methods taught. Students will learn how to implement Markov chain Monte Carlo methods using the probabilistic programming language “Stan”.

Not with: MAS364 (Bayesian Statistics)
No other modules have this module as a prerequisite.

Outline syllabus

• Bayesian inference; analysis with conjugate priors;
• Markov chain Monte Carlo methods, including Hamiltonian Monte Carlo implemented with Stan;
• Other Monte Carlo computational methods, e.g. bootstrapping;
• R programming.

Office hours

Monday 14:30-16:00

Aims

• introduce the Bayesian approach to statistical inference;
• present a range of computational tools for implementing Bayesian and frequentist inference in otherwise intractable problems;
• demonstrate how various computational methods can be implemented using R and Stan;
• enhance students’ broader understanding of statistical methodology and develop their professional skills as applied statisticians.

Learning outcomes

• explain the difference between Bayesian and frequentist statistical inference;
• carry out Bayesian inference for a range of standard statistical problems;
• implement Markov chain Monte Carlo sampling using Stan;
• write functions in R to implement simple computational methods, in particular, Monte Carlo methods for frequentist inference;
• identify situations where Bayesian multi-level models can be used for data analysis, fit these models and interpret the results;
• use appropriate statistical methods for analysing partially observed data;
• conduct a Bayesian analysis in a substantial case study, and communicate the key issues to a non-expert.

Teaching methods

There will be formal lectures, which will involve the explanation of theoretical concepts and their application to worked examples. The motivation, rationale, advantages and disadvantages of the various methods taught will be discussed as appropriate, with examples given of communicating issues to a lay audience. Students will also have computer lab sessions, for learning about R programming. Detailed lecture notes will be provided, which students will be expected to study in their own time to assimilate the material.

Lectures will include practical demonstrations of analysis using R and Stan. Students will work through set exercises in both theory and R implementation, and submit homework for marking, although this will not be part of the formal assessment. Students will undertake a project which will involve investigating the application of methods and concepts covered in the module in a substantial case study, and will be required to communicate their findings in a written report, at a level so that the key findings/issues can be understood by a non-expert reader.

40 lectures, no tutorials, 8 practicals

Assessment

One formal 3 hour written examination (60%). All questions compulsory. One project (40%).

Full syllabus

Semester 1
Bayesian Statistics

• Subjective probability;
• prior distributions;
• Bayesian learning;
• predictive distributions;
• non-conjugate priors;
• introducing MCMC: full conditionals, Gibbs sampling, Metropolis-Hastings, convergence;
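To give a flavour of the samplers listed above, here is a minimal random-walk Metropolis-Hastings loop in R. This is a sketch, not taken from the module notes; the standard normal target and unit proposal scale are purely illustrative choices.

```r
# Random-walk Metropolis-Hastings targeting a standard normal
# distribution (illustrative target, not from the module notes).
set.seed(1)
n_iter <- 5000
chain <- numeric(n_iter)
current <- 0
for (i in 1:n_iter) {
  # propose a move by perturbing the current state
  proposal <- current + rnorm(1, sd = 1)
  # accept with probability min(1, target(proposal) / target(current)),
  # computed on the log scale for numerical stability
  if (log(runif(1)) < dnorm(proposal, log = TRUE) - dnorm(current, log = TRUE)) {
    current <- proposal
  }
  chain[i] <- current
}
```

After a burn-in period, the draws in `chain` behave (approximately, and with autocorrelation) like samples from the target distribution; the same accept/reject logic underlies the more sophisticated samplers met later in the module.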
R Programming
• vectors, lists, matrices;
• writing functions;
• Monte Carlo simulation in R: Monte Carlo integration, importance sampling, Monte Carlo tests;
• loops, branching;
• the apply family.
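As an indication of how these topics combine, a sketch (not from the module materials) of Monte Carlo integration written as a small R function:

```r
# Monte Carlo integration of f over [0, 1]: draw uniform samples
# and average the integrand (hypothetical illustration).
mc_integrate <- function(f, n = 1e5) {
  x <- runif(n)   # uniform draws on [0, 1]
  mean(f(x))      # the sample mean estimates the integral
}

set.seed(1)
est <- mc_integrate(function(x) x^2)  # true value of the integral is 1/3
```

The estimate converges to the true integral at rate O(1/sqrt(n)), a fact which motivates the variance-reduction methods (such as importance sampling) listed above.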
Semester 2
• recap of MCMC; problems with poor performance; introduction to Hamiltonian Monte Carlo;
• introduction to Stan;
• Bayesian multilevel models in Stan: linear and generalised linear models;
• missing data in Stan;
• bootstrapping;
• cross-validation;
• variational inference.
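For example, the nonparametric bootstrap listed above can be sketched in a few lines of R (the exponential toy data set and the choice of the median as the statistic are illustrative, not from the module):

```r
# Nonparametric bootstrap estimate of the standard error of the
# sample median (toy example with hypothetical data).
set.seed(1)
data <- rexp(50)  # illustrative data set

# resample with replacement and recompute the statistic many times
boot_medians <- replicate(2000, median(sample(data, replace = TRUE)))
se_hat <- sd(boot_medians)  # bootstrap standard error estimate
```

The spread of the resampled statistics approximates the sampling variability of the median, with no distributional assumptions beyond the observed data.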