STAT6020
10 units
6000 level
Course handbook
Description
The world is awash in data, and there is a huge demand for people with the skills and knowledge to turn data into actionable insights. STAT6020 covers the basics of predictive data analytics, statistical computing and visualisation. Students develop an understanding of data science, from the basic skills of data processing and visualisation to building sophisticated descriptive and predictive models. STAT6020 focuses on developing models for prediction, as well as methods to learn or train these models from data, in tasks that include classification and regression. The core of the course is the use of a set of predictor variables to model the outcome of a target variable using supervised statistical learning techniques, such as least squares and logistic regression, Bayes-type classifiers (e.g. Naïve Bayes and Linear/Quadratic Discriminant Analysis, LDA/QDA), k-nearest neighbours, and decision trees for classification and regression. Shrinkage-based methods for model selection and cross-validation for model evaluation are also introduced. Although the main focus is on predictive tasks, fundamental unsupervised learning techniques for both descriptive and predictive analytics tasks are also introduced, most notably foundational partitioning and hierarchical methods for data clustering, as well as elements of principal component analysis (PCA). Advanced content may include, for instance, model ensembles (e.g. bagging, random forests), support vector machines, outlier/anomaly detection, or other related topics. Students develop coding and reproducible reporting skills with open-source software. STAT6020 equips students with the data skills and acumen to excel in their chosen career through their ability to analyse a variety of data sources and make data-driven decisions.
Availability
Online
- Semester 2 - 2024
Learning outcomes
On successful completion of the course students will be able to:
1. Apply prediction modelling techniques such as classification, regression and clustering to turn data into actionable insights.
2. Perform model selection to identify the most important predictors out of a potentially very large set of predictor variables.
3. Use cross-validation to assess the performance of selected models.
4. Undertake reproducible reporting and report on the results of a statistical analysis of a data set.
5. Discuss limitations of the statistical methods chosen for particular application scenarios.
Content
The course will include the following topics:
- Variable and feature selection techniques
- Statistical computing and reproducible reporting
- Predictive, supervised modelling techniques, such as least squares and logistic regression, k-nearest neighbours, and decision trees for classification and regression, as well as unsupervised modelling techniques, such as hierarchical clustering and PCA
- Cross-validation and other resampling methods for model evaluation and selection
- Advanced topics in statistical learning
Assumed knowledge
It is assumed that students have completed Year 12 HSC Mathematics Advanced or have an equivalent background in basic calculus and probability/statistics, as well as notions of elementary matrix algebra. In addition, students will be better prepared to undertake STAT6020 if they also: (a) have had some previous exposure to computer programming or statistical software; and (b) have completed an introductory postgraduate or undergraduate statistics course, such as STAT6160, STAT6170, STAT1070, STAT1300, STAT2010 or STAT2110.
Assessment items
Written Assignment: Written Assignment
Project: Project
Online Open Book Formal Examination: Formal Examination
Quiz: Online quiz
Contact hours
Semester 2 - 2024 - Online
Self-Directed Learning-1
- Online, 10 hours per week for 13 weeks, starting in week 1
- There are opportunities for students to get individualised help from teaching staff in the course. A time commitment of 8-12 hours per week is suggested (guide only).
Course outline
Course outline not yet available.
The University of Newcastle acknowledges the traditional custodians of the lands within our footprint areas: Awabakal, Darkinjung, Biripai, Worimi, Wonnarua, and Eora Nations. We also pay respect to the wisdom of our Elders past and present.