The University of Newcastle, Australia
Available in 2019

Course handbook

Description

The world is awash in data and there is a huge demand for people with the skills and knowledge to turn data into actionable insights. STAT2020 covers the basics of predictive data analytics, statistical computing and visualisation. Students develop an understanding of data science, from the basic skills of data processing and visualisation to building sophisticated descriptive and predictive models.

STAT2020 focuses on developing models for classification and prediction. The aim is to use a set of predictor variables to model the outcome of a target variable using techniques such as least squares and logistic regression, k-nearest neighbours, classification and regression trees (CART), and hierarchical classifiers. Shrinkage-based methods for model selection and cross-validation for model evaluation are also introduced. Students develop coding and reproducible reporting skills with open-source software.

STAT2020 equips students with the data skills and acumen to excel in their chosen career through their ability to analyse a variety of data sources and make data-driven decisions.


Availability

Callaghan

  • Semester 2 - 2019

Learning outcomes

On successful completion of the course, students will be able to:

1. Apply classification and prediction modelling techniques to turn data into actionable insights

2. Perform model selection to identify the most important predictors out of a potentially very large set of predictor variables

3. Use cross-validation to assess the performance of selected models

4. Understand the importance of reproducible reporting and report on the results of a statistical analysis of a data set

5. Discuss the limitations of the statistical analyses considered


Content

The course will include the following topics:

  1. Classification and prediction modelling techniques such as least squares and logistic regression, k-nearest neighbours, classification and regression trees (CART), and hierarchical classifiers
  2. Variable and feature selection techniques
  3. Cross-validation for model evaluation
  4. Statistical computing and reproducible reporting
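
As a rough illustration of how topics 1 and 3 fit together, the following is a minimal sketch of a k-nearest-neighbours classifier evaluated with k-fold cross-validation. It is not taken from the course materials: the choice of Python, the standard-library-only implementation, the toy two-cluster data set, and all function names (knn_predict, cross_val_accuracy, make_toy_data) are assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # `train` is a list of (features, label) pairs; distance is squared Euclidean.
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k_neighbours, n_folds=5, seed=0):
    """Estimate classification accuracy with n-fold cross-validation."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    # Deal the shuffled rows into n_folds roughly equal folds.
    folds = [rows[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        correct += sum(knn_predict(train, x, k_neighbours) == y for x, y in test_fold)
    return correct / len(rows)


def make_toy_data(n_per_class=50, seed=1):
    """Hypothetical data: two Gaussian clusters in the plane, one per class."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (4.0, 4.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    # Compare several neighbourhood sizes by their cross-validated accuracy.
    for k in (1, 5, 15):
        print(k, round(cross_val_accuracy(data, k), 3))
```

Each observation is held out exactly once, so the reported accuracy is measured only on data the model did not see during fitting. The same loop can be reused to compare other classifiers or tuning values.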

Assumed knowledge

STAT1070 or STAT2010

Course details

STAT2020 Predictive Analytics
Faculty of Science
School of Mathematical and Physical Sciences


Assessment items

Written Assignment

Project

Formal Examination

Participation: Class discussion participation


Contact hours

Callaghan

Workshop

Face to face on campus, 3 hours per week for the full term, starting in week 1
