CSIRO CAFE-60 submissions to the World Meteorological Organization operational decadal forecasts and the international multi-model data exchange
Mark A. Collier A * , Terence J. O’Kane B , Vassili Kitsios A C and Paul A. Sandery B
A CSIRO Oceans and Atmosphere, Aspendale, Melbourne, Vic. 3195, Australia.
B CSIRO Oceans and Atmosphere, Battery Point, Hobart, Tas. 7004, Australia.
C Laboratory for Turbulence Research in Aerospace and Combustion, Department of Mechanical and Aerospace Engineering, Monash University, Clayton, Vic. 3800, Australia.
Journal of Southern Hemisphere Earth Systems Science 72(1) 52-57 https://doi.org/10.1071/ES21024
Submitted: 16 September 2021 Accepted: 11 January 2022 Published: 22 February 2022
© 2022 The Author(s) (or their employer(s)). Published by CSIRO Publishing on behalf of BoM. This is an open access article distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND)
Abstract
The Climate Analysis Forecast Ensemble (CAFE) system was developed by the Decadal Climate and Forecasting Project (DCFP) within the CSIRO Climate Science Centre (CSC) with the express purposes of providing operational ensemble forecasts of the near-term climate (1–10 years) and meeting the requirements of becoming a World Meteorological Organization (WMO) Global Data Producing Centre (GDPC), from which output has been provided to the UK Met Office (UKMO) international, multi-model data exchange. This CAFE-60 submission satisfied, at least, the minimum requirements for the CSC to become a Global Producing Centre for Annual-to-Decadal Climate Prediction (GPC-ADCP) of the WMO Lead Centre for Annual-to-Decadal Climate Prediction. In practical terms, this has meant running the CAFE-60 version 1 modelling system for years 1960–2020, or a 61-year set, with a modest ensemble size of 10. Each decade-long forecast was initialised at the beginning of November using a state-of-the-art assimilation methodology, and the minimum requested data were delivered to the coordinating centre within the UKMO. The number 60 in CAFE-60 refers to the initial year of the assimilation, 1960, from which we believe there is adequate observational data to produce a meaningful reanalysis output product for wide scientific application. This document describes the key features of the CAFE-60 assimilation and forecasting system, the climate and forecasting conforming post-processed datasets along with how to access the datasets via the CSIRO Data Access Portal, and the quality control measures implemented to achieve the highest possible data quality and delivery. Part of becoming a Data Producing Centre is the ongoing need to supply CAFE-60 forecast data to the UKMO at least annually; therefore, maintaining CAFE-60 operational status requires significant ongoing resources and effort.
In addition to the annual dataset submission by the DCFP as a minimum requirement of being declared a Data Contributing Centre, a separate application from the Australian Permanent Representative with WMO (Dr Andrew Johnson) was required to obtain the GPC-ADCP status; this was completed in early September 2020, making CSIRO one of only five centres in the world that currently hold this special status.
Keywords: decadal prediction, near term climate variability, operational climate predictions.
1. Introduction
In this section, we describe a uniquely formed multi-variable dataset (see footnote 1) produced using the Climate Analysis Forecast Ensemble (CAFE-60) system and submitted to, and accepted by, the World Meteorological Organization (WMO) via representatives from the UK Met Office (UKMO) to obtain accreditation as a Data Producing Centre of the WMO Lead Centre for Annual-to-Decadal Climate Prediction. There have been eight exchanges of decadal predictions between the UKMO and numerous institutions located in the northern hemisphere (NH); CSIRO has participated in the latest two: the first in July 2020 and the second in May 2021. This participation will hereafter be referred to as the ‘Exchange’. The CAFE-60 modelling system is described by O’Kane et al. (2021a, 2021b). CAFE-60 produces a large ensemble retrospective analysis of the past 61 years and provides initial conditions for up to 96 forecasts plus an additional mean or best fit mean state initial condition. An earlier paper (O’Kane et al. 2019) introduced the mathematical and modelling framework of the ensemble Kalman filtering system and examined early results of the system by way of multi-year El Niño Southern Oscillation (ENSO) forecasts and ENSO predictability. The datasets referred to in this paper are decade-long forecasts of monthly average conditions (i.e. forecasts with a 10-year lead time), each initialised on 1 November, for each of the years 1960–2020 inclusive. This dataset is required by the WMO in compliance with CSIRO's submission as a Global Data Producing Centre (GDPC) for near-term climate prediction. To avoid confusion, we will not make the distinction between hindcasts and forecasts; rather, we will refer to all model predictions as 'forecasts' whether they extend beyond the current date or not.
As this study is focused on the WMO GDPC submission, we limit the study to 10-member, 10-year lead-time forecasts initialised each 1 November starting in 1960. The recent success of an Australian Large Computing Grant (see https://nci.org.au/news-events/news/largest-compute-grants-australian-history-announced) from the National Computational Infrastructure based in Canberra has enabled an effort to initialise forecasts in other months and with the full 96-member ensemble, at least for a limited number of years. Description of these additional forecasts is beyond the current scope of the Exchange.
This paper is structured in the following way. The multi-model Decadal Exchange and the subsequent Update it produced are described in Section 2. In Section 3, the basic motivation and features of the CAFE-60 model are explained. Finally, Sections 4 and 5 provide advice on obtaining the dataset as well as a set of figures summarising features of the CAFE-60 forecast temporal behaviour.
2. Decadal Exchange
A major recent outcome of the CSIRO CAFE-60 participation in decadal predictions has been that the WMO’s Global Annual-to-Decadal Climate Update was made possible after the most recent two exchanges: (1) UKMO (2020) and (2) UKMO (2021) targeting 2020/2021 and 2020–2024/2021–2024 annually averaged data, and officially released 9 July 2020/27 May 2021, respectively. Future updates are planned for release as close to the beginning of the year as possible to generate the most societal utility. The latest Update document shows the results for the next 5 years using the WMO Lead Centre multi-model decadal prediction ensemble. This Update is a new initiative and the first official WMO forecast of the near-term climate, which will inform the decisions of policy makers as well as a range of stakeholders that are affected by annual-to-decadal climate variability. Although in some ways the Update is similar to the Coupled Model Intercomparison Project (CMIP) decadal forecast activity, the WMO forecasts focus on a shorter timescale, with forecasts produced by operational systems with sophisticated initialisation and data assimilation systems; therefore, they potentially have more initial impact and more relevance for activities with shorter cycle lengths. An interactive web page presenting basic geographically based results can be found at https://hadleyserver.metoffice.gov.uk/wmolc/.
Participation of CAFE-60 in the Exchange, and hence the contribution of the forecast outputs to the Update, are significant achievements of the CSIRO Decadal Climate Forecast Project. Further, the CAFE-60 participation is of value to both international and Australian scientists and policy makers, as it is the only contribution from the southern hemisphere (SH), offering a scientific perspective that would otherwise be absent. See Fig. 1 for the location of the 17 participating centres. Note that only BSC (Barcelona Supercomputing Center, Spain), CCMA (Canadian Centre for Climate Modelling and Analysis, Canada), DWD (Deutscher Wetterdienst, Germany), MOHC (Met. Office Hadley Centre, UK) and CSIRO (Australia) are GDPCs; the remainder are Contributing Centres. A quality-controlled, multi-model dataset is expected to produce a better set of forecasts than any individual model. CAFE-60 is developed for global applications; however, much of its focus is placed on its design and behaviour in the Australian region; therefore, it will feasibly contribute features to the multi-model Update that may be less well modelled by other institutions, which also have a local focus in their model development and testing.
3. CAFE-60 model features
The CAFE-60 modelling system is based on the Geophysical Fluid Dynamics Laboratory’s Climate Model 2.1 (Delworth et al. 2006), with upgrades to Modular Ocean Model version 5.1, whose specific configuration has been described fully in O’Kane et al. (2019), including the ocean, land and land-surface components. No flux correction has been applied to the model exchanges of heat, freshwater or mass, thereby helping to maintain a physically based and realistic model response compared to some historical systems that often require artificial corrections to suppress drift in coupled models. In addition, full-field initialisation is employed in preference to anomaly updates in order for model biases to be diagnosed.
3.1. CAFE-60 reanalysis system
Although the focus of this paper is the forecast datasets submitted to the WMO, it is worthwhile to describe aspects of the CAFE-60 reanalysis system, as its outputs are used to create the initial conditions for the forecasts: an accurate and model-compatible initial condition together with an accurate modelling system are required to form reliable near-term climate predictions. In addition, to meaningfully illustrate aspects of the model forecast temporal behaviour, as presented in Section 5, it is necessary to plot the forecasts and the reanalysis together, as the reanalysis forms a useful quasi-observational state given that the assimilation system does not, in general, allow the model to deviate too far from the real world. Also, at forecast time zero there is by definition no difference between the forecast model state and the reanalysis, so the mean model error can be assigned to be zero with regard to the mean analysed state, unlike a difference with alternative observational or quasi-observational sets. That said, detailed innovation errors, i.e. differences between particular observations and model state variables, are produced at each analysis time and are available on request.
The motivation of the coupled data assimilation (CDA) method employed is to produce better-balanced forecast initial conditions within the available computing constraints, to enable near-term climate studies or, in the context of the WMO activity, prediction on multi-year to decadal time scales. Model systems that use random or unbalanced initial perturbations will suffer from initialisation shock, diverging rapidly from a trajectory near the observed climate to one some (Euclidean) distance away, depending on the degree of persistent model bias present in the model configuration. The Japanese 55-year Reanalysis is used to pre-fill many of the state variables required by the reanalysis system (Kobayashi et al. 2015).
The reanalysis system contains 96 members, which is a predetermined minimum number of background model states required to accurately capture the observed cross-domain covariances (Sandery et al. 2020). The assimilation cycle length is one calendar month.
3.2. Forecast system
Although 96 states are available to initialise and integrate the forecast model forward, here we only focus on the 10 members that comprise the WMO submission. Complete model restarts are available for all months; however, again for the purposes of the WMO activity, only forecasts beginning 1 November were conducted, although experiments are underway with additional restart months. Ten years of data for every forecast have been provided, where the minimum requested was five. There is considerable interest in forecast value beyond the 5-year time frame, and about half of the centres are providing decadal output (pers. comm. UKMO). Additionally, we were informed by the UKMO that ocean meridional overturning mass streamfunction (msftmz, see Table 1) was exchanged by approximately half of the contributing centres, including CSIRO, whereas most other variables (surface air temperature, sea level pressure, precipitation, sea ice concentration, see Table 1) had a near-complete set from each institution. Although the Atlantic Meridional Overturning Circulation at 26°N (AMOC26°N), calculated here from the model variable msftmz, is, like overturning in most of the ocean, a poorly observed quantity (see Section 5 for more details), it has been of great interest in the context of numerical models, and a multi-model perspective will enable a better benchmark to be formed for when the observational base improves. Global circulation is expected to respond to human activities as well as internal climate variability on multi-year timescales, and therefore, it is an important driver/responder of climate change by way of its impacts at the ocean surface. The institutional submissions of msftmz provide a useful benchmark for future studies even though the current set appears to vary a lot between the contributing models, as can be seen for the Atlantic basin: https://hadleyserver.metoffice.gov.uk/wmolc.
4. Quality control and the dataset
Raw model data were modified to have units compatible with those specified in the CMIP Phase 6 (CMIP6) Climate Model Output Rewriter (CMOR) tables: https://github.com/PCMDI/cmip6-cmor-tables. These conversions are typically made with scale and offset factors. Rainfall and sea ice were forced to have values greater than or equal to zero, as the model occasionally generated small negative numbers due to round-off.
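The conversion described above can be illustrated with a minimal sketch. It assumes a simple linear conversion (scale, then offset) followed by an optional floor at zero; the function name `to_cmor_units` and the example values are hypothetical and are not taken from the CAFE-60 processing code.

```python
import numpy as np

def to_cmor_units(raw, scale=1.0, offset=0.0, clip_negative=False):
    """Apply a linear unit conversion, optionally flooring values at zero."""
    converted = np.asarray(raw, dtype=np.float64) * scale + offset
    if clip_negative:
        # Remove small negative values of the kind produced by round-off.
        converted = np.maximum(converted, 0.0)
    return converted

# Hypothetical example: surface temperature in degrees Celsius to kelvin.
ts_kelvin = to_cmor_units([15.0, -2.5], offset=273.15)

# Hypothetical example: rainfall containing a tiny negative round-off value.
pr_clean = to_cmor_units([2.0e-5, -1.0e-12, 0.0], clip_negative=True)
```

Flooring is applied only to quantities that cannot physically be negative, which is why it is an explicit option rather than the default.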
These tables are read by the CMOR software: https://github.com/PCMDI/cmor. We utilised the Python interface. By using this software, we are able to ensure data and metadata conventions are useful to analysts for downstream processing and analysis. The files follow the Network Common Data Form (NetCDF) Climate and Forecast (CF) Metadata Conventions: https://cfconventions.org, meaning that definitions of file data and coordinates follow standard conventions. The files are transferable to and readable on all common machine architectures and, with Digital Object Identifiers (DOIs), are persistent on the Internet.
The only variable requiring significant processing to form was msftmz, as the raw model files contained mass transports as a function of time, level, latitude and longitude for both the large-scale and eddy components, which are used to compute the total transport as a function of time, level and latitude. See Section 5 for details on how this was done.
Various automatic and human interpretive quality control measures were used to ensure raw model data was converted accurately and that the model exhibited values that were within expected ranges. For example, rainfall, temperature and pressure are continuous in space and time and available for all surface types. For these, it was sufficient to check that the files had consistent temporal and spatial counts of data, both non-missing and within physically chosen valid ranges. Sea ice was similarly examined; however, the number of valid points can vary with time, so this is taken into account as well as hemispheric context rather than global. Mass transports are set to missing value over longitude extents where there are no ocean points; however, counts of these were checked to be constant over time and also within expected ranges. Lastly, we looked through spatial and temporal plots of all variables, both means and standard deviations, to ensure that they did not exhibit any odd behaviour. The culmination of the quality control approaches also adds important validation measures to the original modelling and raw data writing component of the CAFE-60 project.
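The automatic count and range checks described above might look like the following minimal sketch. It assumes fields are held as arrays with dimensions (time, latitude, longitude) and that missing values are stored as NaN; the function name `qc_summary` and the valid range used in the example are hypothetical.

```python
import numpy as np

def qc_summary(field, valid_min, valid_max):
    """Count non-missing and in-range values at each time step.

    field: array with shape (time, lat, lon); missing values are NaN.
    """
    field = np.asarray(field, dtype=np.float64)
    n_time = field.shape[0]
    present = ~np.isnan(field)
    # Replace NaNs before comparing so the comparison itself is well defined.
    filled = np.where(present, field, valid_min)
    in_range = present & (filled >= valid_min) & (filled <= valid_max)
    n_present = present.reshape(n_time, -1).sum(axis=1)
    n_in_range = in_range.reshape(n_time, -1).sum(axis=1)
    return {
        "n_present": n_present,
        "n_in_range": n_in_range,
        # True when every time step has the same number of valid points.
        "constant_count": bool(np.all(n_present == n_present[0])),
        # True when no non-missing value falls outside the valid range.
        "all_in_range": bool(np.array_equal(n_present, n_in_range)),
    }

# Synthetic sea level pressure field (Pa) used purely for illustration.
rng = np.random.default_rng(1)
psl = 101325.0 + rng.normal(0.0, 500.0, size=(3, 4, 5))
summary = qc_summary(psl, valid_min=85000.0, valid_max=110000.0)
```

For sea ice, where the number of valid points legitimately varies with time, the `constant_count` flag would be expected to be false and the per-step counts would instead be inspected by hemisphere.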
The citable and downloadable versions of the published datasets described in this paper are available from the CSIRO Data Access Portal (https://data.csiro.au), and can be found using the portal’s search facility. The first dataset is titled ‘CSIRO CAFE60 10-ensemble 10-year forecast monthly output submitted to the World Meteorological Organisation, initialised at 1 November for each year 1960–2019’ (1960–2019 forecasts, 10 ensemble members, 18.8 Gb, 2635 files). The second dataset is titled ‘CSIRO CAFE60 96-ensemble 10-year forecast monthly output submitted to the World Meteorological Organisation, initialised at 1 November year 2020’ (2020 forecasts, 96 ensemble members, 6.6 Gb, 672 files). The CAFE60 outputs described in this paper, and more, are additionally available from the Amazon Web Service: https://registry.opendata.aws/csiro-cafe60.
5. CAFE-60 model behaviour
To give an indication of the basic behaviour of the CAFE-60 modelling system as well as to give potential end-users a better understanding of the temporal relationship between the reanalysis and forecast products available, in Supplementary Figs S1–S12 we present time-varying, area-averaged results derived directly from the raw CAFE-60 data described in this paper. Part (a) of each respective supplementary figure is monthly and area-averaged in the case of input atmospheric variables (pr, ts, psl and siconc) or an ocean transport derived from msftmz. Each of the following ocean mass transports provides the strength of overturning for a particular common index, all with units of kg s⁻¹ (× 10⁹):
Antarctic Bottom Water (AABW)
North Atlantic Deep Water (NADW)
Atlantic Meridional Overturning Streamfunction at 26°N (AMOC26°N)
Southern Ocean Abyssal Cell (SOAC)
These ocean mass transports are obtained by seeking local minima/maxima in the global or Atlantic Meridional Overturning Circulation (MOC), which is msftmz in CMIP6 nomenclature. MOC is the only variable in the WMO submission requiring post-processing of moderate complexity.
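MOC is defined using the streamfunction ψ in units of Sverdrups (1 Sv = 10⁶ m³ s⁻¹ or approximately 10⁹ kg s⁻¹). It is the zonally integrated (east–west) and vertically accumulated (from depth z to the surface) meridional volume transport in depth coordinates, as defined by:
$$\psi(y, z, t) = \int_{z}^{0} \int_{x_w}^{x_e} v(x, y, z', t)\, \mathrm{d}x\, \mathrm{d}z' \qquad (1)$$
where v is the meridional velocity, x_w and x_e are the western and eastern boundaries of the basin at latitude y, and z = 0 is the ocean surface.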
CAFE-60 generates mass transport (kg s⁻¹) as a function of horizontal location and depth, and so these can be used in place of volume in Eqn 1 to generate a MOC in the preferred mass units. To obtain transports for a basin as a function of depth and latitude, the full depth model mass transports are used and integrated between east and west boundaries (coast-to-coast). It is therefore possible to define them globally, for the Atlantic (north of the southern tip of Africa) and Arctic combined, and for the Indian and Pacific oceans combined; the latter cannot normally be separated, as there is mass communication via the Indonesian Passage in both the Australian Community Climate and Earth System Simulator-Coupled Model 2 and CAFE-60 models.
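A minimal sketch of the basin integration and index extraction described above is given below. It assumes the large-scale and eddy components have already been combined into a single meridional mass transport per grid cell on a (level, latitude, longitude) grid with level 0 at the surface and NaN over land; the function names, the optional basin mask and the synthetic transports are hypothetical and do not reproduce the CAFE-60 post-processing code.

```python
import numpy as np

def overturning_streamfunction(ty_trans, basin_mask=None):
    """Zonally sum mass transport, then accumulate from each depth to the surface.

    ty_trans: array (level, lat, lon) of meridional mass transport in kg s-1
        per grid cell; level 0 is the surface and land cells are NaN.
    basin_mask: optional boolean array (lat, lon), True inside the basin.
    Returns an array (level, lat) in kg s-1.
    """
    transport = np.nan_to_num(np.asarray(ty_trans, dtype=np.float64), nan=0.0)
    if basin_mask is not None:
        transport = np.where(basin_mask[np.newaxis, :, :], transport, 0.0)
    zonal_sum = transport.sum(axis=2)  # coast-to-coast sum, shape (level, lat)
    # Summing levels 0..k gives the transport between level k and the surface.
    return np.cumsum(zonal_sum, axis=0)

def index_at_latitude(psi, lat, target_lat):
    """Maximum overturning over depth at the grid latitude nearest target_lat."""
    j = int(np.argmin(np.abs(np.asarray(lat) - target_lat)))
    return float(psi[:, j].max())

# Synthetic two-layer flow: northward in the upper levels, southward below.
lat = np.array([20.0, 26.0, 32.0])
ty_trans = np.empty((4, 3, 2))
ty_trans[:2] = 1.0e9
ty_trans[2:] = -1.0e9
psi = overturning_streamfunction(ty_trans)
amoc26 = index_at_latitude(psi, lat, 26.0)
```

Because the synthetic flow conserves mass in each column, the accumulated transport returns to zero at the deepest level, which is a useful sanity check on real data as well.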
As the forecasts described here begin in November and end in October (over an exact 10-year period), they do not cover an entire calendar year at the beginning and end of the time series. Only four example forecasts are shown in the figures, for the years 1960, 1980, 2000 and 2019, to avoid too much detail and to provide the latest forecast submitted to the WMO. The reanalysis spans the period from January 1960 to December 2020 inclusive. Reanalysis atmospheric variables are compared to the European Centre for Medium-Range Weather Forecasts Reanalysis version 5 (ERA5) (Hersbach et al. 2020) over the period January 1979 to March 2020 inclusive. Only one of the ocean transports, AMOC26°N, has an observational estimate included in the figure, from the RAPID dataset (Smeed et al. 2015). The annual cycle of the reanalysis over the period 1981–2020 for each quantity is shown in part (b) of the respective supplementary figures and, where available, with an observational equivalent. These monthly climatologies are used in forming the monthly anomalies shown in part (c) of the respective supplementary figures for the reanalysis. Note that the anomalies shown for the forecasts are relative to a lead-dependent climatology formed over the entire 1960–2019 forecast set and are generally more appropriate than using the reanalysis annual climatology, as they take into account the typical model response over a large sample size. Parts (a) and (c) in each of the respective supplementary figures provide the ensemble range and 1σ standard deviation (96 members for the reanalysis and 10 for the forecasts).
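The lead-dependent climatology and the ensemble range and standard deviation mentioned above can be sketched as follows. This is an illustration under stated assumptions: forecasts are stored as an array with dimensions (start year, member, lead month), the climatology at each lead month is the mean over all start years and members, and the sample standard deviation (ddof=1) is used. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def lead_dependent_anomalies(forecasts):
    """Remove a climatology that depends only on forecast lead time.

    forecasts: array with shape (start_year, member, lead_month).
    Returns (climatology, anomalies).
    """
    forecasts = np.asarray(forecasts, dtype=np.float64)
    # Average over start years and members, keeping the lead-month axis.
    climatology = forecasts.mean(axis=(0, 1))
    anomalies = forecasts - climatology  # broadcasts over the first two axes
    return climatology, anomalies

def ensemble_envelope(values):
    """Ensemble mean, range and sample standard deviation across members."""
    return {
        "mean": values.mean(axis=1),
        "min": values.min(axis=1),
        "max": values.max(axis=1),
        "std": values.std(axis=1, ddof=1),
    }

# Synthetic set: 60 start years, 10 members, 120 lead months, with a drift
# that grows with lead time to mimic a lead-dependent model response.
rng = np.random.default_rng(0)
drift = 0.01 * np.arange(120)
forecasts = 15.0 + drift + rng.normal(0.0, 0.1, size=(60, 10, 120))
climatology, anomalies = lead_dependent_anomalies(forecasts)
envelope = ensemble_envelope(anomalies)
```

Subtracting a lead-dependent climatology removes the mean drift at each lead month, which a single calendar-month climatology from the reanalysis would leave in the anomalies.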
Supplementary Figs S1–S6 show hemispheric averages of the three atmospheric quantities included in the Exchange. Hemispheric averages are more informative for any particular year than global averages because the annual cycles of the two hemispheres are out of phase. All quantities have a reasonably good match in the annual cycle. Only psl has a consistent 1 hPa systematic offset when compared to the ERA5, where the CAFE reanalysis is higher for both hemispheres; otherwise, the shape of the annual cycle is realistic.
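A hemispheric average of the kind shown in these figures can be sketched as below. The sketch assumes a regular latitude-longitude grid, for which weighting each row by the cosine of latitude approximates the cell area; the actual averaging used for the supplementary figures may differ, and the function name and test field are hypothetical.

```python
import numpy as np

def hemispheric_means(field, lat):
    """Area-weighted means of a (lat, lon) field for each hemisphere.

    lat: latitudes in degrees; weights are proportional to cos(latitude).
    Returns (northern_mean, southern_mean).
    """
    field = np.asarray(field, dtype=np.float64)
    lat = np.asarray(lat, dtype=np.float64)
    weights = np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones_like(field)
    north = lat > 0.0
    south = lat < 0.0
    nh_mean = np.average(field[north], weights=weights[north])
    sh_mean = np.average(field[south], weights=weights[south])
    return float(nh_mean), float(sh_mean)

# Synthetic surface temperature field (K): warmer in the north than the south.
lat = np.arange(-87.5, 90.0, 5.0)
field = np.where(lat[:, np.newaxis] > 0.0, 290.0, 280.0) * np.ones((lat.size, 72))
nh_mean, sh_mean = hemispheric_means(field, lat)
```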
Supplementary Figs S7–S8 use area averages calculated poleward of 60° in each hemisphere for the CAFE-60 reanalysis. Reanalysis results are compared to the ERA5 (Hersbach et al. 2020) over the common period from January 1979 to March 2020 inclusive. A cell is considered to have sea ice if the concentration is greater than 0.1%. The annual cycle in both hemispheres is moderately realistic, although there is a substantial deficit of sea ice in each hemisphere during winter (>40% and 25% for the NH and SH, respectively). Interestingly, the forecasts have monthly variability closer to the ERA5 in the NH, whereas for the SH, the minimum monthly averages are about half of that observed (20% cf. 40%).
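One possible reading of the sea ice diagnostic is sketched below: the percentage of ocean area poleward of 60° in which the concentration exceeds the 0.1% threshold. This interpretation, the cosine-latitude area weights, the use of ocean-only area in the denominator and the function name are all assumptions made for illustration; the exact calculation used for the figures is not specified here.

```python
import numpy as np

def ice_covered_fraction(siconc, lat, hemisphere="NH",
                         threshold=0.1, polar_limit=60.0):
    """Percentage of polar ocean area where sea ice concentration > threshold.

    siconc: array (lat, lon) in percent, NaN over land.
    lat: latitudes in degrees on a regular grid.
    """
    siconc = np.asarray(siconc, dtype=np.float64)
    lat = np.asarray(lat, dtype=np.float64)
    rows = lat >= polar_limit if hemisphere == "NH" else lat <= -polar_limit
    region = siconc[rows]
    weights = np.cos(np.deg2rad(lat[rows]))[:, np.newaxis] * np.ones_like(region)
    ocean = ~np.isnan(region)
    # Fill land cells with zero so the threshold comparison is well defined.
    icy = ocean & (np.where(ocean, region, 0.0) > threshold)
    return 100.0 * weights[icy].sum() / weights[ocean].sum()

# Tiny synthetic grid: ice in part of the north, none in the south.
lat = np.array([-75.0, -65.0, 65.0, 75.0])
siconc = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 50.0, 50.0],
    [90.0, 90.0, 90.0, np.nan],
])
nh_cover = ice_covered_fraction(siconc, lat, hemisphere="NH")
sh_cover = ice_covered_fraction(siconc, lat, hemisphere="SH")
```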
Supplementary Figs S9–S12 show commonly discussed diagnostics of ocean transports, which are among the most important for understanding large-scale biases in the ocean component of a coupled modelling system. There is much interest, for example, in the variability of, and potential for trends in, NADW formation caused by natural or human-induced internal and external forcing; however, these have limited observational values due to the need to extensively sample the ocean in space and time to obtain meaningful monthly averages. In Supplementary Fig. S11b, the CAFE reanalysis captures the annual cycle of AMOC26°N reasonably well, although the model appears to underestimate the observations by 2 × 10⁹ kg s⁻¹ throughout the whole year.
The Atlantic transport indices suggest significant multi-year variability and trends in their behaviour. For example, for NADW in Supplementary Fig. S9c, the trend in the year 2000 forecast looks comparable to that in the reanalysis. In the 1980 forecast, the trend for the first half of the experiment looks comparable; however, the forecast and reanalysis diverge significantly in the second half of the forecast experiment. In the 2019 forecast, the trend appears to follow the trend registered in the reanalysis, but by the end of the forecast it is reduced in a mostly linear fashion by approximately 7.5 × 10⁹ kg s⁻¹.
The ‘spikey behaviour’ between 1960 and 1965 in the CAFE-60 mean is simply the spin-up phase where the model is progressively being constrained to the observations (as discussed in O’Kane et al. 2021b). Note that starting the reanalysis in 1970, for example, would not stop this from happening, as it is a function of having sparse subsurface ocean observations (prior to Argo) and strongly constrained CDA where there are cross-domain, ocean-atmosphere covariances.
In summary, the CAFE-60 datasets have added to international efforts to make available a comprehensive data resource for studying internal climate variability and predictability, including the climate response to anthropogenic forcing on multi-year to decadal time scales (O’Kane et al. 2021b). Moreover, the inclusion of CAFE-60 hindcasts/forecasts in the Decadal Exchange provides a unique and tangible opportunity for SH-based climate modellers to imprint at least some of our significant Australian region modelling focus and capability on the international multi-model Update.
Data availability
General access to all datasets used to generate the model results in the paper is described in this paper; however, for CSIRO researchers, the CAFE-60 reanalysis and forecast data are readily available on the CSIRO High Performance Computing cluster Petrichor. All other datasets used to compare with model results are publicly available and are cited in the references. Please contact the authors if you require guidance finding any of these data.
Conflicts of interest
The authors declare no conflict of interest.
Declaration of funding
This work was supported by the Australian Commonwealth Scientific and Industrial Research Organisation (CSIRO) Decadal Climate Forecasting Project (https://research.csiro.au/dfp).
Supplementary material
Supplementary material is available online.
Acknowledgements
Thanks to Didier Monselesan for preparation of the JRA55 data required to perform the CAFE-60 reanalysis. Richard Matear corresponded with Australia’s WMO Permanent Representative, enabling the formal GPC-ADCP status of the DCFP. Data from the RAPID MOC monitoring project are funded by the Natural Environment Research Council and are freely available (https://www.rapid.ac.uk/rapidmoc). The CMOR3 Python interface was created by the Program for Climate Model Diagnosis and Intercomparison and is available at https://github.com/PCMDI/cmor, and the CMIP6 tables can be accessed by the CMOR from https://github.com/PCMDI/cmip6-cmor-tables. Section 4 describes the data that support this study along with information on how to publicly access them.
References
Delworth TL, Broccoli AJ, Rosati A, et al. (2006) GFDL’s CM2 global coupled climate models. Part I: Formulation and simulation characteristics. Journal of Climate 19, 643–674.
Hersbach H, Bell B, Berrisford P, et al. (2020) The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society 146, 1999–2049.
Kobayashi S, Ota Y, Harada Y, et al. (2015) The JRA-55 reanalysis: General specifications and basic characteristics. Journal of the Meteorological Society of Japan. Ser. II 93, 5–48.
O’Kane TJ, Sandery PA, Monselesan DP, et al. (2019) Coupled data assimilation and ensemble initialization with application to multiyear ENSO prediction. Journal of Climate 32, 997–1024.
O’Kane TJ, Sandery PA, Kitsios V (2021a) CAFE60v1: a 60-year large ensemble climate reanalysis. Part I: system design, model configuration and data assimilation. Journal of Climate 34, 5153–5169.
O’Kane TJ, Sandery PA, Kitsios V (2021b) CAFE60v1: a 60-year large ensemble climate reanalysis. Part II: evaluation. Journal of Climate 34, 5171–5194.
Sandery P, O’Kane T, Kitsios V, Sakov P (2020) Climate model state estimation using variants of enKF coupled data assimilation. Monthly Weather Review 148, 2411–2431.
Smeed D, McCarthy G, Rayner D, Moat B, Johns W, Baringer M, Meinen C (2015). Atlantic meridional overturning circulation observed by the RAPID-MOCHA-WBTS (RAPID-meridional overturning circulation and heatflux array-western boundary time series) array at 26N from 2004 to 2014. British Oceanographic Data Centre – Natural Environment Research Council. https://doi.org/
UKMO (2020) Global annual to decadal climate update. UK Meteorological Office: World Meteorological Organization. Available at https://hadleyserver.metoffice.gov.uk/wmolc/WMO_GADCU_2019.pdf
UKMO (2021) Global annual to decadal climate update. UK Meteorological Office: World Meteorological Organization. Available at https://hadleyserver.metoffice.gov.uk/wmolc/WMO_GADCU_2020.pdf
1 Additional variables of surface 2 m air temperature and sea surface temperature were supplied for their potential utility but were not used in the final Update and, therefore, were not formally required to become a data provider.