Google DeepMind’s new AI model is the best yet at weather forecasting

Google DeepMind has unveiled an AI model that’s better at predicting the weather than the current best systems. The new model, dubbed GenCast, is published in Nature today.

This is the second AI weather model that Google has launched in just the past few months. In July, it published details of NeuralGCM, a model that combined AI with physics-based methods like those used in existing forecasting tools. That model performed similarly to conventional methods but used less computing power.

GenCast is different, as it relies on AI methods alone. It works sort of like ChatGPT, but instead of predicting the next most likely word in a sentence, it produces the next most likely weather condition. In training, it starts with random parameters, or weights, makes a prediction, and compares that prediction with real weather data. Over the course of training, GenCast’s parameters are adjusted so that its predictions begin to align with the actual weather.
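
To make that training loop concrete, here is a toy sketch in Python. It is not GenCast's architecture or code: it fits a two-parameter model to a made-up series, and the names make_series and train, the learning rate, and the data are all assumptions chosen for illustration. It only shows the idea described above: start from random weights, predict the next value, compare with what actually happened, and nudge the weights to shrink the gap.

```python
import numpy as np

def make_series(n=500, seed=1):
    """Synthetic stand-in for weather data: each value is 0.8 * previous + noise."""
    rng = np.random.default_rng(seed)
    values = [0.0]
    for _ in range(n - 1):
        values.append(0.8 * values[-1] + rng.normal())
    return np.array(values)

def train(series, lr=0.05, steps=500, seed=0):
    """Fit next ~ w * current + b by gradient descent, starting from random weights."""
    rng = np.random.default_rng(seed)
    w, b = rng.normal(size=2)            # random starting parameters
    x, y = series[:-1], series[1:]       # (current value, next value) pairs
    losses = []
    for _ in range(steps):
        err = (w * x + b) - y            # prediction minus what really happened
        losses.append(float(np.mean(err ** 2)))
        w -= lr * 2.0 * np.mean(err * x) # step both parameters downhill on squared error
        b -= lr * 2.0 * np.mean(err)
    return w, b, losses

series = make_series()
w, b, losses = train(series)
print(f"learned w={w:.2f}, b={b:.2f}; loss {losses[0]:.2f} -> {losses[-1]:.2f}")
```

The learned w should land near the 0.8 used to generate the data, and the loss should fall as training proceeds. A real weather model has vastly more parameters and inputs, but the compare-and-adjust cycle is the same in spirit.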

The model was trained on 40 years of weather data (1979 to 2018) and then generated forecasts for 2019. Its predictions were more accurate than the current best forecast, the ensemble forecast (ENS) from the European Centre for Medium-Range Weather Forecasts, 97% of the time, and it was better at predicting wind conditions and extreme weather like the path of tropical cyclones. Better wind prediction capability increases the viability of wind power, because it helps operators calculate when they should turn their turbines on and off. And better estimates for extreme weather can help in planning for natural disasters.
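
A figure like "97% of the time" is a win rate over many head-to-head comparisons. The sketch below shows the arithmetic, assuming each comparison is a pair of error scores for the same forecast target. The function name win_rate and the numbers are invented for illustration and are not taken from the study.

```python
def win_rate(model_errors, baseline_errors):
    """Share of paired comparisons where the model's error is strictly lower."""
    if len(model_errors) != len(baseline_errors):
        raise ValueError("error lists must be the same length")
    wins = sum(m < b for m, b in zip(model_errors, baseline_errors))
    return wins / len(model_errors)

# Hypothetical error scores for four forecast targets (lower is better)
model_errors = [1.1, 0.9, 2.0, 1.5]
baseline_errors = [1.3, 1.0, 1.8, 1.9]

rate = win_rate(model_errors, baseline_errors)
print(f"model wins {rate:.0%} of comparisons")
```

Here the model has the lower error on three of the four targets, so its win rate is 75%.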

Google DeepMind isn’t the only big tech firm that is applying AI to weather forecasting. Nvidia released FourCastNet in 2022. And in 2023 Huawei developed its Pangu-Weather model, which trained on 39 years of data. It produces deterministic forecasts—those providing a single number rather than a range, like a prediction that tomorrow will have a temperature of 30 °F or 0.7 inches of rainfall.

GenCast differs from Pangu-Weather in that it produces probabilistic forecasts—likelihoods for various weather outcomes rather than precise predictions. For example, the forecast might be “There is a 40% chance of the temperature hitting a low of 30 °F” or “There is a 60% chance of 0.7 inches of rainfall tomorrow.” This type of analysis helps officials understand the likelihood of different weather events and plan accordingly.
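
One common way to arrive at such probabilities, assumed here for illustration, is to produce many plausible forecasts and count how many of them contain the event. The Python sketch below uses ten made-up overnight lows. The function name event_probability and all values are hypothetical.

```python
def event_probability(members, threshold):
    """Fraction of forecast members at or below the threshold."""
    hits = sum(1 for m in members if m <= threshold)
    return hits / len(members)

# Ten hypothetical forecasts of tomorrow's low temperature, in degrees Fahrenheit
lows_f = [28.0, 29.5, 31.0, 33.0, 30.0, 34.5, 35.0, 32.0, 36.0, 29.0]

# Probabilistic view: chance the low reaches 30 °F or colder
p_low = event_probability(lows_f, 30.0)

# Deterministic view: collapse everything into a single number
single_value = sum(lows_f) / len(lows_f)

print(f"{p_low:.0%} chance of a low of 30 °F or below")
print(f"single-number forecast: {single_value:.1f} °F")
```

Four of the ten members reach 30 °F or below, which gives a 40% chance. The single-number forecast of 31.8 °F, taken alone, would hide that risk of freezing temperatures.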

These results don’t mean the end of conventional meteorology as a field. The model is trained on past weather conditions, and applying it to the far future may lead to inaccurate predictions for a changing and increasingly erratic climate.

GenCast is still reliant on a data set like ERA5, which is an hourly estimate of various atmospheric variables going back to 1940, says Aaron Hill, an assistant professor at the School of Meteorology at the University of Oklahoma, who was not involved in this research. “The backbone of ERA5 is a physics-based model,” he says.

In addition, there are many variables in our atmosphere that we don’t directly observe, so meteorologists use physics equations to estimate them. These estimates are combined with accessible observational data to feed into a model like GenCast, and new data will always be required. “A model that was trained up to 2018 will do worse in 2024 than a model trained up to 2023 will do in 2024,” says Ilan Price, a researcher at DeepMind and one of the creators of GenCast.

In the future, DeepMind plans to test models that directly use observational data, such as wind or humidity readings, to see how feasible it is to make predictions from observations alone.

There are many parts of forecasting that AI models still struggle with, like estimating conditions in the upper troposphere. And while the model may be good at predicting where a tropical cyclone may go, it underpredicts the intensity of cyclones, because there’s not enough intensity data in the model’s training data.

The current hope is to have meteorologists working in tandem with GenCast. “There’s actual meteorological experts that are looking at the forecast, making judgment calls, and looking at additional data if they don’t trust a particular forecast,” says Price.

Hill agrees. “It’s the value of a human being able to put these pieces together that is significantly undervalued when we talk about AI prediction systems,” he says. “Human forecasters look at way more information, and they can distill that information to make really good forecasts.”