The University of Newcastle, Australia

Exploring Multifaceted Clustering of Complex Electricity Time-Series Data to Support Data-Driven Decision-Making in the Energy Sector

Closing Date: 31 January 2021

PhD Scholarship

The primary goal of this project is to explore methodologies and processes available for clustering complex, long, low-frequency time series that have spatial, temporal, or categorical factors associated with them, such as those found in the energy sector. The focus will be on mapping out which tools are most appropriate, the considerations that should be made when using those tools, the metrics that should be used to validate their success and, where applicable, the amendments and improvements that may be required for the application scenario at hand. Though the focus need not be strictly limited to energy-related time series, this project will receive support from CSIRO researchers in the energy domain.


Long time series data is one of the largest sources of residential energy use information in the electricity sector. These time series are obtained from interval meters which typically record electricity for an individual household every half an hour. Since interval meter databases end up capturing data for hundreds of thousands of households, across many years, this time series data becomes large and difficult to understand. Yet there is a real industry need to better understand this data to improve decisions around tariffs, consumer equality, and supply-demand operations. This is where clustering and forming clustered representations of time series become key for the energy sector. There are different approaches to achieve this so-called ‘load clustering’ (more broadly ‘time-series clustering’), where ‘load’ refers to the electrical time series load of a household. However, the considerations that need to be made when employing such clustering are not well understood – especially when these time series have linked characteristics.

There is a complicated relationship between humans and electricity usage, driven especially by weather and work schedules. The electricity usage of a household is very closely linked to weather: depending on the season, usage can differ widely because of space conditioning, that is, heating or cooling large areas. Usage is also closely linked to work schedules and can vary greatly between workdays and weekends or holidays. Businesses and organisations in charge of operating the electricity network want to know the ‘when’ and ‘why’ of high simultaneous usage across residential and commercial sites. This typically happens in the evening of a very hot workday, and when it occurs it can mean outages for anyone on the network. A clustering approach should therefore be robust enough to identify these anomalies within groups of households. Currently, a common way of assessing clusters is through the use of a motif. Motifs are useful because they allow industry to apply rules, conditions and considerations to a single representative load for many households, rather than having to consider each household individually. Knowing these motifs can help improve the planning of electricity supply and policies. However, motifs on their own lack much of the complexity of location, weather, year shifts, demographics and the like. Motifs also often lose anomalies: anomalies are important but make up only a small part of each time series, so they may not be properly captured or represented and therefore do not bias the clustering. Some work in this area has already been completed by CSIRO. However, how to incorporate intelligence from linked data, and which levers are available to tailor clustering to particular purposes, is less well understood.
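To make the point about motifs and anomalies concrete, the following is a minimal sketch rather than the project's method. It assumes a motif is taken as the pointwise mean of the half-hourly daily profiles in a cluster, which the text does not specify. The household shapes, the scale factors, the seeding choice and the helper names (`evening_peak`, `daytime_peak`, `motif`, `kmeans`) are all hypothetical.

```python
from statistics import mean

INTERVALS_PER_DAY = 48  # half-hourly interval-meter readings


def evening_peak(scale):
    """Hypothetical household with higher use from 17:00 to 21:00."""
    return [scale * (2.0 if 34 <= t < 42 else 0.5) for t in range(INTERVALS_PER_DAY)]


def daytime_peak(scale):
    """Hypothetical household with higher use from 10:00 to 15:00."""
    return [scale * (1.5 if 20 <= t < 30 else 0.4) for t in range(INTERVALS_PER_DAY)]


def sq_dist(a, b):
    """Squared Euclidean distance between two daily profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def motif(profiles):
    """Pointwise mean load across the households of one cluster (assumed motif)."""
    return [mean(values) for values in zip(*profiles)]


def kmeans(profiles, centres, iterations=10):
    """Plain Lloyd iterations starting from the given centres."""
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centres)), key=lambda k: sq_dist(p, centres[k]))
                  for p in profiles]
        for k in range(len(centres)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the previous centre if a cluster empties
                centres[k] = motif(members)
    return labels, centres


# Ten synthetic households per archetype, differing only in overall scale.
scales = [0.8 + 0.05 * i for i in range(10)]
households = [evening_peak(s) for s in scales] + [daytime_peak(s) for s in scales]

# Seed with the first and last household (a simplification for this sketch).
labels, centres = kmeans(households, [households[0][:], households[-1][:]])
clean_motif = centres[labels[0]]

# Inject a 5.0-unit spike into one household at 15:00 (interval 30).
spiked = [p[:] for p in households]
spiked[0][30] += 5.0
members = [p for p, lab in zip(spiked, labels) if lab == labels[0]]
spiked_motif = motif(members)

print(labels)
# The spike is diluted by the number of households sharing the motif.
print(round(spiked_motif[30] - clean_motif[30], 3))
```

In this toy setting the spike in one household raises the motif at that interval by only one tenth of its size, because ten households share the motif. A mean-based representative can hide exactly the events that network operators care about in this way.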


The primary goal of this project is to best map out the considerations which should be made when selecting and/or employing/implementing a clustering process when complex data are involved. It should explore many variations that can be made to the modelling, clustering, and data influences (that is, the incorporation of linked datasets) and how these affect the resultant clustering metrics. It should also explore what metrics might be best at ensuring particular needs are met.

Considerations include:

  • How well do clustering approaches support stakeholder needs?
  • What should be considered when representing clusters?
  • What should be considered when assessing clustering success?
  • What should be considered when seasonality is important?
  • What should be considered if individual time series can move between clusters depending on the point in time?
  • What should be considered if spatial information is available?
  • What should be considered if anomalies are important?
  • What should be considered when other static data, such as demographics, are available?
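As one illustration of the question about assessing clustering success, the sketch below computes the mean silhouette coefficient, a common internal validity index. The source does not name any particular metric, so this choice is an assumption, as are the toy four-interval load vectors and the function name `silhouette`.

```python
from math import dist
from statistics import mean


def silhouette(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself contributes zero to the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(mean(dist(p, q) for q in other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Toy load vectors (four intervals each): two low-use and two high-use households.
loads = [(0.4, 0.5, 0.4, 0.5), (0.5, 0.4, 0.5, 0.4),
         (2.0, 2.1, 2.0, 2.1), (2.1, 2.0, 2.1, 2.0)]

sensible = silhouette(loads, [0, 0, 1, 1])  # groups similar households
mixed = silhouette(loads, [0, 1, 0, 1])     # mixes low-use and high-use households

print(sensible > mixed)
```

A score near 1 indicates compact, well-separated clusters, while a negative score suggests households sit closer to another cluster than to their own. Like any internal index, it says nothing about whether the clusters meet a particular stakeholder need, which is one reason the choice of metric is itself a research question here.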

PhD Scholarship details


Living allowance of $28,092 per annum (2020 rate) indexed annually. For a PhD candidate, the living allowance scholarship is for 3.5 years and the tuition fee scholarship is for 4 years. Scholarships also include up to $1,500 relocation allowance (if applicable).

Additional Funding and Support:

  • $10,000 per annum as CSIRO scholarship TOP-UP (for 3 years and upon agreement to the CSIRO student terms) to further support the recipient of this scholarship;
  • Up to $5,000 per annum (for 3 years, upon application and upon agreement to the CSIRO student terms) in research support from CSIRO towards eligible expenses such as approved conference travel;
  • The selected candidate may also be eligible to apply for annual HDR research support from the faculty/school, if available

Supervisor: Prof. Ricardo J. G. B. Campello

Available to: Domestic students


Eligibility Criteria


Essential:

  1. Domestic student in Australia
  2. Honours or Master's degree (Statistics, Computational Statistics, Data Science, Computer Science, Mathematics, Engineering, or a related field)
  3. Verbal and written communication skills

Desirable:

  1. Programming skills
  2. Solid statistics/mathematics and/or computational background

The successful applicant must be able to commence by 1 March 2021. Please also refer to the admission eligibility criteria.

Application Procedure

Interested applicants should send an email addressing the essential/desirable criteria and expressing their interest along with scanned copies of their academic transcripts, CV, a brief statement of their research interests and a proposal that specifically links them to the research project.

Please send the email expressing interest to by 5pm on 31 January 2021.


Contact Prof. Ricardo J. G. B. Campello
Phone +61 2 4921 6762
