 The current and official versions of the course specifications are available on the web at https://www.usq.edu.au/course/specification/current.
Please consult the web for updates that may occur during the year.

# STA3301 Statistical Models

Semester 2, 2022 Online
Units: 1
Faculty or Section: Faculty of Health, Engineering and Sciences
School or Department: School of Mathematics, Physics & Computing
Student contribution band: Band 1
Grading basis: Graded
Version produced: 18 May 2022

## Staffing

Examiner: Enamul Kabir

## Requisites

Pre-requisite: STA3300, or approval of the examiner, or completion of STA8170 together with enrolment in one of the following programs: GCSC, GDSI, MSCN, MADS, MSCR or DPHD.

## Overview

Linear models and generalised linear models are very widely used statistical tools. Linear models allow us to model data with normally distributed errors, and generalised linear models extend these methods to a wider family of distributions. While students are expected to have obtained some understanding of linear regression techniques in previous courses, this course offers a more complete introduction to linear models and their application and then, building on this, extends into generalised linear models. The key functions of linear models are describing the relationships between variables and predicting outcomes, so inference methods will be addressed in some detail. Finally, as models only give useful information when they provide an accurate reflection of the 'real world', various diagnostic tests of the appropriateness and goodness of fit of models will be introduced. This course has relevance to all students seeking to pursue a career involving applied statistics.
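
As a brief sketch of the distinction drawn above, the two model classes can be written as follows. The notation (response $y_i$, regressors $x_{ij}$, coefficients $\beta_j$, link function $g$) is an assumption made for illustration and is not taken from the course materials.

```latex
% Linear model: normally distributed errors with constant variance
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2) \ \text{independently}

% Generalised linear model: y_i follows an exponential-family distribution
% with mean \mu_i, tied to the same linear predictor through a link function g
g(\mu_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip},
\qquad \mu_i = \mathrm{E}(y_i)

% Two commonly used links
\text{logistic regression: } g(\mu_i) = \log\frac{\mu_i}{1 - \mu_i}
\qquad
\text{Poisson regression: } g(\mu_i) = \log \mu_i
```

With the identity link $g(\mu_i) = \mu_i$ and a normal response, the generalised linear model reduces to the linear model.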

This course introduces and extends the student's knowledge of linear models. The mathematical development of these models will be considered; however, the focus will be on practical applications. The statistical program R will be introduced and used throughout the course. The topics include developing multiple regression models, testing hypotheses for these models, selecting the 'best' model, diagnosing problems in model fit, shrinkage methods, developing generalised linear models, and a range of applications of generalised linear models including logistic, Poisson and log-linear models. Analysis of different statistical models is practised using R and RStudio.

## Course learning outcomes

On completion of this course students should be able to:

1. Recognise appropriate general and generalised linear models for analysis of different types of data sets.
2. Apply a range of models and diagnostic techniques to test hypotheses and interpret the output correctly and in context.
3. Explore the capabilities of and implement R software (RStudio) as a statistical package in analysing different statistical models.
4. Interpret and communicate the results of analyses to a diverse audience.

## Topics

| Description | Weighting (%) |
| --- | --- |
| 1. Review of multiple regression: specifying the model, least squares estimators of regression parameters and variance, maximum likelihood estimators of the regression parameters and variance, multiple and partial correlation, regression through the origin. | 10.00 |
| 2. Inference on the normal model: interval estimation of the regression parameters and variance, prediction of future responses, analysis of variance, coefficient of determination, tests on single regression coefficients, confidence regions, tests on a subset of the regression coefficients, procedures for model selection, tests on the general linear model, test of goodness of fit. | 20.00 |
| 3. Model selection and checking: criteria for selecting regressors, residual analysis, data transformations, weighted least squares, detecting outliers and influential observations, multicollinearity, detecting multicollinearity, Ridge, LASSO and Elastic Net regression. | 20.00 |
| 4. Generalised linear models: the exponential family of distributions, the mean and variance of the exponential family, specifying the generalised linear model, the link function, estimation of the regression parameters, adequacy of the model, the deviance, analysis of deviance and model selection. | 10.00 |
| 5. Binary variables and logistic regression: probability distributions, generalised linear models, logistic regression model, deviance, Pearson's Chi-Square test, residuals and other diagnostics. | 20.00 |
| 6. Count data, Poisson regression and log-linear models: Poisson regression, probability models for contingency tables, log-linear models, inference for log-linear models. | 20.00 |

## Text and materials required to be purchased or accessed

Introductory material (current year), Course STA3301 Statistical Models.
(Accessible from the course StudyDesk.)
Study Book (current year), Course STA3301 Statistical Models.
(Accessible from the course StudyDesk.)
All additional study material will be provided on the course StudyDesk.