Using machine learning to accelerate climate extremes research

To better understand and predict extreme weather events, Dr Emily Gordon develops machine learning models trained on large climate datasets.

Research background

Dr Emily Gordon is a researcher at the University of Auckland investigating how climate change affects the predictability of extreme weather events, particularly extreme hot summers.

As global temperatures continue to rise due to anthropogenic greenhouse gas emissions, heat waves are expected to become more frequent, longer lasting, and more intense. These events can have major societal and economic consequences, including reduced agricultural productivity, increased pressure on electricity networks, infrastructure disruption, and elevated risks to human health and mortality.

To better understand and predict these events, Emily develops machine learning models trained on large climate datasets. Machine learning training can produce different outcomes depending on the initial pseudo-random seed values, so each experiment must be repeated to explore different optimisation pathways and identify the most accurate and robust models.
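The seed dependence described above can be illustrated with a minimal Python sketch. It is a toy stand-in, not Emily's models: the function `train_once`, the loss function, and the choice of ten seeds are all assumptions made for illustration. Gradient descent on a loss with two local minima ends up in a different minimum depending on the seeded starting point, which is why repeating the run across seeds matters.

```python
import random

def train_once(seed: int, steps: int = 200, lr: float = 0.01) -> tuple[float, float]:
    """Toy stand-in for one training run (hypothetical, for illustration only).

    The loss w**4 - 3*w**2 + w has two local minima, so the random
    starting point chosen by the seed decides which one is reached.
    """
    rng = random.Random(seed)        # seed-specific generator, no global state
    w = rng.uniform(-2.0, 2.0)       # random initial weight
    for _ in range(steps):
        grad = 4 * w**3 - 6 * w + 1  # derivative of the loss
        w -= lr * grad               # plain gradient descent step
    loss = w**4 - 3 * w**2 + w
    return w, loss

# Repeat the same experiment under several seeds and keep the best run.
results = {seed: train_once(seed) for seed in range(10)}
best_seed = min(results, key=lambda s: results[s][1])
```

Because each call builds its own generator from the seed, rerunning a given seed reproduces the same result, while different seeds can settle in different minima.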

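The computational demands of this work are substantial, because every model configuration must be trained repeatedly, once per seed, on large climate datasets.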

Fortunately, the independent nature of the training runs makes the problem highly parallelisable. By leveraging the high-performance computing infrastructure provided by REANNZ, thousands of model training jobs can execute concurrently, dramatically reducing overall turnaround time and enabling much larger experimental campaigns than would otherwise be possible.
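As a rough sketch of why independent runs parallelise so easily, the Python example below dispatches several runs at once and collects their results. The names `run_training` and `run_campaign` are hypothetical, and the thread pool is used only to keep the example self-contained; on a cluster, each run would more plausibly be a separate scheduler job.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def run_training(seed: int) -> dict:
    """Hypothetical stand-in for one independent training job."""
    rng = random.Random(seed)
    score = sum(rng.random() for _ in range(1000)) / 1000
    return {"seed": seed, "score": score}

def run_campaign(seeds, max_workers: int = 4) -> list:
    # No run reads another run's output, so they can execute in any order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_training, seeds))

campaign = run_campaign(range(20))
```

Since no run depends on another, changing `max_workers` alters only how long the campaign takes, not the results it produces.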

Project challenges

Although the machine learning training jobs were individually straightforward, managing thousands of submissions manually quickly became impractical. Preparing inputs, launching jobs, tracking progress, handling failed runs, and collecting outputs introduced significant overhead and increased the risk of human error.

Emily approached REANNZ for support in developing a robust and scalable workflow. She needed something capable of automating the end-to-end execution of these experiments while efficiently utilising the available parallel computing resources on the REANNZ research computing platform.

Her challenges included:

  • Automating the submission and monitoring of thousands of independent training runs
  • Managing dependencies between preprocessing, training, and postprocessing steps
  • Efficiently distributing jobs across cluster resources
  • Improving reproducibility and reducing manual intervention
  • Simplifying the collection and analysis of model outputs and statistics

What was done

The project began with adapting and testing Emily’s existing scripts to ensure they could be integrated reliably into an automated workflow environment. Particular attention was given to standardising file inputs and outputs so that workflow dependencies could be managed systematically.

Following a review of the workflow requirements, REANNZ Research Software Engineer Alex Pletzer and Research Support Specialist Jennifer Reeve proposed several orchestration approaches.

Emily and the REANNZ team ultimately selected the Nextflow workflow engine because of its strong support for scalable scientific computing, reproducibility, and integration with high-performance computing schedulers.

The implemented workflow uses file-based input and output channels to define dependencies between tasks. As soon as prerequisite files become available, downstream jobs are automatically submitted to the cluster for execution. This event-driven approach enables large numbers of machine learning training runs to proceed concurrently without requiring manual coordination.
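The idea of file-based dependencies can be sketched in a few lines of Python. This is not Nextflow code and not the workflow used in the project. It is a simplified illustration in which the task names, file names, and the `run_ready_tasks` helper are all assumptions. Each task lists the files it needs and the file it produces, and a task runs as soon as all of its inputs exist.

```python
from pathlib import Path
import tempfile

def run_ready_tasks(tasks, workdir: Path) -> list[str]:
    """Run each task once all of its input files exist; return execution order."""
    done, order = set(), []
    progressed = True
    while progressed:
        progressed = False
        for name, (inputs, output, action) in tasks.items():
            ready = all((workdir / f).exists() for f in inputs)
            if name not in done and ready:
                (workdir / output).write_text(action(workdir))
                done.add(name)
                order.append(name)
                progressed = True
    return order

# Hypothetical pipeline, deliberately listed out of order:
# preprocess -> two training runs -> summary.
tasks = {
    "summarise": (["model_1.txt", "model_2.txt"], "summary.txt",
                  lambda d: (d / "model_1.txt").read_text() + "+"
                            + (d / "model_2.txt").read_text()),
    "train_1": (["clean.txt"], "model_1.txt", lambda d: "m1"),
    "train_2": (["clean.txt"], "model_2.txt", lambda d: "m2"),
    "preprocess": ([], "clean.txt", lambda d: "data"),
}

with tempfile.TemporaryDirectory() as tmp:
    order = run_ready_tasks(tasks, Path(tmp))
    summary = (Path(tmp) / "summary.txt").read_text()
```

Although the tasks are declared in arbitrary order, the availability of files alone determines the execution order, so nobody has to sequence the steps by hand.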

With workflow automation in place, the overall wall-clock time of the research campaign became limited primarily by the available computing resources rather than by the number of individual training sessions. The workflow also improved fault tolerance, reproducibility, and monitoring capabilities, making it easier to track progress and rerun failed tasks when necessary.
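Automatic reruns of failed tasks can be pictured with the small Python sketch below. It is illustrative only: `run_with_retries` and `flaky_training` are hypothetical names, and the failure is simulated. Nextflow offers comparable behaviour through process directives such as `errorStrategy` and `maxRetries`, although the case study does not say which settings this workflow used.

```python
def run_with_retries(task, max_retries: int = 2):
    """Call task(attempt) until it succeeds or the retries are exhausted."""
    errors = []
    for attempt in range(1, max_retries + 2):
        try:
            return task(attempt), errors
        except RuntimeError as exc:
            # Record the failure so it can be inspected later.
            errors.append(f"attempt {attempt}: {exc}")
    raise RuntimeError("; ".join(errors))

def flaky_training(attempt: int) -> str:
    # Hypothetical job that fails on its first attempt (e.g. a node fault).
    if attempt == 1:
        raise RuntimeError("simulated node failure")
    return "model.ckpt"

output, log = run_with_retries(flaky_training)
```

The failed attempt is recorded rather than discarded, so the run recovers without intervention while the failure remains available for later inspection.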

Figure 1: Schematic of the workflow implemented using Nextflow on the REANNZ platform.

Main outcomes

The workflow enabled Emily to scale up her machine learning experiments while substantially reducing the time and effort required to manage computations.

Using the new workflow, Emily demonstrated that neural network models consistently outperformed traditional logistic regression approaches in predicting the onset of extreme heat events across multiple geographic regions.

The increased computational throughput also enabled more comprehensive exploration of machine learning model uncertainty and sensitivity to model initialisation.

Additional benefits included:

  • Faster turnaround times for large experiment campaigns
  • Improved reproducibility of results
  • Reduced manual workload and operational errors
  • More efficient utilisation of REANNZ computing resources
  • Easier monitoring and recovery of failed jobs

Figure 2: Improvement in heat event onset prediction provided by machine learning.

Researcher feedback

This research project has the potential to unlock new insights about how we can predict future extreme hot seasons; however, it was proving logistically challenging for me to implement due to the data wrangling, large number of tasks, and computational resources required. Alexander and Jen at REANNZ jumped on this problem and implemented a workflow that is optimised for the REANNZ platform and enabled me to perform this research with much less computational burden and enhanced reproducibility. The new workflow allows me to set up an experiment and leave it to run without worrying about launching new tasks or troubleshooting failures, so that I can spend more time on results and analysis.

- Dr Emily Gordon, Faculty of Science, Physics, University of Auckland

This case study shares some of the technical details and outcomes provided through our Consultancy Service. This service supports projects across a range of domains, with an aim to lift researchers’ productivity, efficiency, and skills in research computing. Get in touch to discuss how our Research Software Engineers and specialist support could help advance your project.
