MAS223 Statistical Inference and Modelling
Both semesters, 2019/20 | 20 Credits
Lecturer: Dr Jonathan Potts | uses MOLE
This unit develops methods for analysing data, and provides a foundation for further study of probability and statistics at Level 3. It introduces some standard distributions beyond those met in MAS113, and proceeds with study of continuous multivariate distributions, with particular emphasis on the multivariate normal distribution. Transformations of univariate and multivariate continuous distributions are studied, with the derivation of sampling distributions of important summary statistics as applications. The concepts of likelihood and maximum likelihood estimation are developed. Data analysis is studied within the framework of linear models. There will be substantial use of the software package R.
Prerequisites: MAS113 (Introduction to Probability and Statistics)
Corequisites: MAS211 (Advanced Calculus and Linear Algebra)
The following modules have this module as a prerequisite:
MAS352 | Stochastic Processes and Finance
MAS360 | Practical and Applied Statistics
MAS361 | Medical Statistics
MAS362 | Financial Mathematics
MAS364 | Bayesian Statistics
MAS367 | Linear and Generalised Linear Models
MAS369 | Machine Learning
MAS370 | Sampling Theory and Design of Experiments
MAS371 | Applied Probability
MAS372 | Time Series
MAS452 | Stochastic Processes and Finance
MAS461 | Medical Statistics
MAS462 | Financial Mathematics
MAS464 | Bayesian Statistics
MAS467 | Linear and Generalised Linear Models
MAS468 | Statistical Computing in R
MAS469 | Machine Learning
Outline syllabus
- Univariate distribution theory
- Multivariate distribution theory
- Likelihood
- Likelihood case studies
- Linear models
Office hours
Office hours are Wednesdays 1-2pm in G27C, Hicks Building, when students can ask me questions about the module. Please try to email in advance if you would like to meet during this time. You can also just drop by, but I will give preference to those who have a pre-arranged slot.
If you cannot make Wednesday 1-2pm and would like to meet, please email me to arrange another time.
Please do not come to my office outside office hours without prior arrangement, as I will be busy with other work and will have to turn you away.
Aims
- extend students' familiarity with standard probability distributions;
- give practice in handling discrete and continuous distributions, especially continuous multivariate ones;
- instil an understanding of the rationale and techniques of likelihood exploration and maximisation;
- consider linear regression models in detail;
- extend the comparison of means from two to several groups through ANOVA models;
- give students experience in the use of R for fitting linear models.
Learning outcomes
- handle a wide range of standard distributions, including the multivariate normal;
- calculate joint, marginal and conditional continuous distributions;
- manipulate multivariate means, variances and covariances;
- transform univariate and multivariate continuous random variables;
- derive, manipulate and interpret likelihood functions, and find maximum likelihood estimators;
- understand regression and ANOVA models as examples of linear models;
- estimate parameters in a linear model;
- make inferences about model parameters through appropriate model comparisons;
- develop a 'best-fitting' model in a systematic and pragmatic way;
- undertake model checking procedures through the use of residuals;
- use R to implement methods covered in the course;
- prepare a structured word processed report of the statistical analysis of an open-ended problem.
44 lectures, 8 tutorials, 3 practicals
Assessment
One 2 hour 30 minute exam (90%) and a practical project (10%).
Full syllabus
Univariate distribution theory (6 lectures)
Revision of sample spaces, events and random variables; distribution functions, probability functions, probability density functions; moments; random variables without a mean (Cauchy as example); discrete standard distributions: hypergeometric, negative binomial; revision of normal; gamma and beta functions; gamma (χ² as special case) and beta distributions; visualising distributions in R; transformations of univariate random variables including monotonic case and non-monotonic examples.
Multivariate distribution theory
Random vectors; multivariate p.d.f.s for continuous random vectors; p.d.f.s of marginal and conditional distributions; covariance and correlation; independence; conditional expectation and variance; transformations of multivariate p.d.f.s using Jacobians; applications of transformations including the t distribution and Box-Muller simulation of normal r.v.s; covariance matrices; linear transformations and their effect on covariance matrices; multivariate normal including matrix form of p.d.f.; linear transformations of the multivariate normal; conditional distributions of multivariate normal components.
Likelihood (5 lectures)
Data and random samples; models and parameters; definition of likelihood and examples; introduction to maximum likelihood estimation; log likelihood; one parameter MLE; two parameter MLE using Hessian, including MLE for Normal with unknown mean and variance; interval estimation using likelihood; hypothesis tests using likelihood.
Likelihood case studies (2 lectures)
Two or three applications of likelihood inference to case studies.
Linear models (22 lectures)
- Matrix representation of a linear model: linear regression, polynomial regression, multiple regression and ANOVA models as examples of linear models.
- Least squares estimation: least squares estimators in matrix notation; distributional properties of least squares estimators and the residual sum of squares.
- Hypothesis testing: the F-test for comparing nested linear models; t-tests.
- Prediction: confidence intervals and prediction intervals.
- Model checking: diagnostics using standardized residuals; transformations; R².
- Factor independent variables: ANCOVA and one-way and two-way ANOVA.
- Fitting and analysing linear models using R.
Reading list
Type | Author(s) | Title | Library
---|---|---|---
B | Draper and Smith | Applied Regression Analysis | 519.536 (D)
B | Faraway | Linear Models with R | 519.538 (F)
B | Freund, Miller and Miller | John E. Freund's Mathematical Statistics with Applications | 519.5 (F)
B | Kleinbaum, Kupper, Muller and Nizam | Applied Regression Analysis and Other Multivariable Methods | 519.536 (A)
B | Mood, Graybill and Boes | Introduction to the Theory of Statistics | 519.5 (M)
(A = essential, B = recommended, C = background.)
Most books on reading lists should also be available from the Blackwells shop at Jessop West.
Timetable (semester 2)
Day | Time | Type | Group | Weeks | Location
---|---|---|---|---|---
Wed | 12:00 - 12:50 | tutorial | | | Hicks Lecture Theatre D
Mon | 15:00 - 15:50 | lecture | | | Alfred Denny Building Lecture Theatre 1
Tue | 16:00 - 16:50 | tutorial | group 1 | odd weeks | Hicks Seminar Room F28
Tue | 16:00 - 16:50 | tutorial | group 2 | odd weeks | Hicks Seminar Room F38
Tue | 16:00 - 16:50 | lab session | group A | odd weeks | Portobello Centre Lab 28
Tue | 16:00 - 16:50 | lab session | group B | odd weeks | Arts Tower Computer Room 1012
Wed | 12:00 - 12:50 | tutorial | group 3 | odd weeks | K14 Hicks Building
Wed | 12:00 - 12:50 | tutorial | group 4 | odd weeks | Arts Tower Lecture Theatre 7
Wed | 12:00 - 12:50 | tutorial | group 5 | odd weeks | Hicks Lecture Theatre D
Wed | 12:00 - 12:50 | lab session | group C | odd weeks | FC-B56 - Firth Court
Wed | 12:00 - 12:50 | lab session | group D | odd weeks | Geography Building Computer Room B4
Wed | 12:00 - 12:50 | lab session | group E | odd weeks | Hicks Room G25
Fri | 12:00 - 12:50 | lecture | | | Hicks Lecture Theatre 1