To better understand and predict extreme weather events, Dr Emily Gordon develops machine learning models trained on large climate datasets.
Dr Emily Gordon is a researcher at the University of Auckland investigating how climate change affects the predictability of extreme weather events, particularly extreme hot summers.
As global temperatures continue to rise due to anthropogenic greenhouse gas emissions, heat waves are expected to become more frequent, longer lasting, and more intense. These events can have major societal and economic consequences, including reduced agricultural productivity, increased pressure on electricity networks, infrastructure disruption, and elevated risks to human health and mortality.
To better understand and predict these events, Emily develops machine learning models trained on large climate datasets. Machine learning training can produce different outcomes depending on the initial pseudo-random seed values, so each experiment must be repeated to explore different optimisation pathways and identify the most accurate and robust models.
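The seed dependence described above can be illustrated with a minimal Python sketch. The `train_once` function is a hypothetical stand-in for a real training run (it is not Emily's model): it simply returns a pseudo-random "validation score" determined by the seed. What matters is the pattern of repeating the same experiment across many seeds, then summarising the spread and selecting the best run.

```python
import random
import statistics


def train_once(seed: int) -> float:
    """Hypothetical stand-in for one training run.

    A real run would initialise and fit a model; here the seed only
    drives a pseudo-random score between 0.70 and 0.90.
    """
    rng = random.Random(seed)  # independent generator per run
    return 0.70 + 0.20 * rng.random()


def run_seed_ensemble(seeds):
    """Repeat the experiment once per seed and summarise the outcomes."""
    scores = {seed: train_once(seed) for seed in seeds}
    values = list(scores.values())
    best_seed = max(scores, key=scores.get)
    return {
        "scores": scores,
        "best_seed": best_seed,
        "best_score": scores[best_seed],
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


summary = run_seed_ensemble(range(20))
print(f"best seed {summary['best_seed']}: {summary['best_score']:.3f}")
print(f"mean {summary['mean']:.3f} +/- {summary['stdev']:.3f}")
```

The standard deviation across seeds gives a rough measure of how sensitive a result is to initialisation, which is why a single run is rarely enough to compare models.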
The computational demands of this work are substantial.
Fortunately, the independent nature of the training runs makes the problem highly parallelisable. By leveraging the high-performance computing infrastructure provided by REANNZ, thousands of model training jobs can execute concurrently, dramatically reducing overall turnaround time and enabling much larger experimental campaigns than would otherwise be possible.
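Because each run depends only on its own seed, the runs can be dispatched concurrently. The sketch below uses Python's standard `concurrent.futures` module on a single machine purely to show the pattern; on a cluster, each call would instead be a separate scheduler job. `train_once` is again a hypothetical placeholder, not the project's training code.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def train_once(seed: int) -> float:
    """Hypothetical placeholder for one independent training run."""
    rng = random.Random(seed)
    return 0.70 + 0.20 * rng.random()


def run_concurrently(seeds, max_workers: int = 8):
    """Run every seed independently; pool.map returns results in input order."""
    seeds = list(seeds)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(train_once, seeds))
    return dict(zip(seeds, scores))


results = run_concurrently(range(100))
print(len(results), "runs completed")
```

Threads keep the example short. Genuinely CPU-bound training would use separate processes or, as in this project, separate jobs on the HPC platform.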
Although the machine learning training jobs were individually straightforward, managing thousands of submissions manually quickly became impractical. Preparing inputs, launching jobs, tracking progress, handling failed runs, and collecting outputs introduced significant overhead and increased the risk of human error.
Emily approached REANNZ for support in developing a robust and scalable workflow. She needed a way to automate the end-to-end execution of these experiments while making efficient use of the parallel computing resources available on the REANNZ research computing platform.
Her challenges included:
The project began with adapting and testing Emily’s existing scripts to ensure they could be integrated reliably into an automated workflow environment. Particular attention was given to standardising file inputs and outputs so that workflow dependencies could be managed systematically.
Following a review of the workflow requirements, REANNZ Research Software Engineer Alex Pletzer and Research Support Specialist Jennifer Reeve proposed several orchestration approaches.
Emily and the REANNZ team ultimately selected the Nextflow workflow engine because of its strong support for scalable scientific computing, reproducibility, and integration with high-performance computing schedulers.
The implemented workflow uses file-based input and output channels to define dependencies between tasks. As soon as prerequisite files become available, downstream jobs are automatically submitted to the cluster for execution. This event-driven approach enables large numbers of machine learning training runs to proceed concurrently without requiring manual coordination.
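The idea of file-driven dependencies can be sketched in a few lines of Python. This is not Nextflow code and not the project's workflow; it is a simplified illustration in which every task name and file name is hypothetical. Each task declares the files it needs and the files it produces, and a task runs as soon as all of its inputs exist.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile
from typing import Callable


@dataclass
class Task:
    name: str
    inputs: list    # files that must exist before the task can run
    outputs: list   # files the task creates
    action: Callable[["Task"], None]


def write_outputs(task: Task) -> None:
    """Hypothetical task body: record which inputs produced each output."""
    for out in task.outputs:
        out.write_text(f"{task.name} <- {[p.name for p in task.inputs]}\n")


def run_workflow(tasks):
    """Run each task as soon as all of its input files exist."""
    pending, order = list(tasks), []
    while pending:
        ready = [t for t in pending if all(p.exists() for p in t.inputs)]
        if not ready:
            raise RuntimeError("no runnable task: missing inputs or a cycle")
        for task in ready:  # a real engine would submit these in parallel
            task.action(task)
            order.append(task.name)
            pending.remove(task)
    return order


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    data = work / "prepared.txt"
    models = [work / f"model_seed{s}.txt" for s in range(3)]
    # Tasks are deliberately listed out of order; file availability decides.
    tasks = [Task("collect", models, [work / "summary.txt"], write_outputs)]
    tasks += [Task(f"train_seed{s}", [data], [m], write_outputs)
              for s, m in enumerate(models)]
    tasks.append(Task("prepare", [], [data], write_outputs))
    order = run_workflow(tasks)
    print(order)
    # ['prepare', 'train_seed0', 'train_seed1', 'train_seed2', 'collect']
```

The three training tasks all become ready at the same moment, once the prepared data file exists. A workflow engine exploits exactly this to submit them to the cluster together.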
By introducing workflow automation, the overall wall-clock time of the research campaign became primarily limited by the available computing resources rather than by the number of individual training sessions. The workflow also improved fault tolerance, reproducibility, and monitoring capabilities, making it easier to track progress and rerun failed tasks when necessary.
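A back-of-the-envelope estimate shows why the campaign becomes resource-limited. Assuming N independent jobs of roughly equal duration and P jobs running at once, the jobs complete in ceil(N / P) waves. Both assumptions are idealised, and the figures below are illustrative only, not measurements from this project.

```python
import math


def campaign_wall_clock(n_jobs: int, minutes_per_job: float, slots: int) -> float:
    """Idealised wall-clock minutes for independent, equal-length jobs.

    Ignores queue wait, I/O and scheduling overhead.
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    waves = math.ceil(n_jobs / slots)
    return waves * minutes_per_job


# Illustrative numbers only (not taken from the project).
serial = campaign_wall_clock(n_jobs=2000, minutes_per_job=10, slots=1)
parallel = campaign_wall_clock(n_jobs=2000, minutes_per_job=10, slots=250)
print(serial, parallel)  # 20000 80
```

Under these assumptions, doubling the number of concurrent slots roughly halves the turnaround, while adding more training runs only adds further waves.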
The workflow enabled Emily to scale up her machine learning experiments while substantially reducing the time and effort required to manage computations.
Using the new workflow, Emily demonstrated that neural network models consistently outperformed traditional logistic regression approaches in predicting the onset of extreme heat events across multiple geographic regions.
The increased computational throughput also enabled more comprehensive exploration of machine learning model uncertainty and sensitivity to model initialisation.
Additional benefits included:
This research project has the potential to unlock new insights about how we can predict future extreme hot seasons; however, it was proving logistically challenging for me to implement due to the data wrangling, large number of tasks, and computational resources required. Alexander and Jen at REANNZ jumped on this problem and implemented a workflow that is optimised for the REANNZ platform and enabled me to perform this research with much less computational burden and enhanced reproducibility. The new workflow allows me to set up an experiment and leave it to run without worrying about launching new tasks or troubleshooting failures, so that I can spend more time on results and analysis.
- Dr Emily Gordon, Faculty of Science, Physics, University of Auckland