APS2-ACCESS-C2: the first Australian operational NWP convection-permitting model

Greg Roff; Ilia Bermous; Gary Dietachmayer; Joan Fernon; Jim Fraser; Wenming Lu; Susan Rennie; Peter Steinle; Yi Xiao

doi:10.1071/ES21013

RESEARCH ARTICLE (Open Access)

Greg Roff ^A ^*, Ilia Bermous ^A, Gary Dietachmayer ^A, Joan Fernon ^A, Jim Fraser ^A, Wenming Lu ^A, Susan Rennie ^A, Peter Steinle ^A and Yi Xiao ^A

^A Australian Bureau of Meteorology, GPO Box 1289, Vic. 3001, Australia.

^* Correspondence to: greg.roff@bom.gov.au

Journal of Southern Hemisphere Earth Systems Science 72(1) 1-18 https://doi.org/10.1071/ES21013
Submitted: 31 May 2021 Accepted: 8 December 2021 Published: 14 February 2022

© 2022 The Author(s) (or their employer(s)). Published by CSIRO Publishing on behalf of BoM. This is an open access article distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND)

Abstract

The Australian Bureau of Meteorology’s ‘Australian Parallel Suite’ (APS) operational numerical weather prediction regional Australian Community Climate and Earth-System Simulator (ACCESS) city-based system (APS1 ACCESS-C1) was updated in August 2017 with the commissioning of the APS2 ACCESS-C2. ACCESS-C2 runs over six regional domains. Significant upgrade changes included implementation of Unified Model 8.2 code; nesting in the 12 km resolution APS2 ACCESS-R2 regional model; and, importantly, an increased horizontal resolution from 4 to 1.5 km, enabling C2 to become the first Australian operational convection-permitting model (CPM). Traditional rainfall verification metrics and the Fractions Skill Score show that C2 forecast skill over ACCESS-C domains in summer and winter was generally, and in many cases significantly, better than C1. Case studies showed that C2 forecasts had more detailed wind and precipitation fields, particularly at longer forecast ranges and higher rain rates. The improvements in C2 forecasts were principally due to its CPM ability to simulate high temporal and spatial resolution features, which continue to be of great interest to forecasters. C2 also laid the groundwork for the present-day APS3 ACCESS-C forecast C3 and ensemble CE3 models and further development of higher resolution (down to 300 m) fire weather and urban models.

Keywords: ACCESS model, convection permitting model, Fractional Skill Score, high resolution NWP, verification metrics.

1. Introduction

The Bureau of Meteorology runs a suite of numerical weather prediction (NWP) Australian Community Climate and Earth-System Simulator (ACCESS) models (Puri et al. 2013) to support its global (ACCESS-G), regional (ACCESS-R) and city (ACCESS-C) operational forecast services. ACCESS-G and ACCESS-R became operational in 2009, and ACCESS-C in 2010. ACCESS models evolved via three Australian Parallel Suite (APS) updates: APS1, 2 and 3. Successive APS updates for ACCESS-C are discussed in Bureau of Meteorology (2013, 2018) and Rennie et al. ‘ACCESS-C: Australian Convective-Scale NWP with Hourly 4D-Var Data Assimilation’ (work in progress), respectively.

All ACCESS models had >4 km resolution and so needed to use convective parameterisation to account for sub-grid scale convective activity to reduce atmospheric instability. The first real-time test of an ACCESS convection-permitting model (CPM), also known as a convection-allowing model, was in the ‘Forecasting Demonstration Project’ (Seed et al. 2019) in 2014, which used a ~1.5 km resolution model with 3D-Var data assimilation. This demonstrated that CPM forecasts could capture high-resolution detail without using convective parameterisation, and improved forecasters’ understanding of extreme convective events, as the CPM could simulate their evolution and detail much closer to reality than previous NWP models (Hartfield 2017).

Such studies led to the 2017 APS2 ACCESS-C upgrade (hereafter shortened to ‘C2’) being the first operational CPM. This paper explains the importance of this CPM step in ACCESS-C development and the need for different verification methods, such as the Fractions Skill Score (FSS) for such high-resolution forecasts. C2 was eventually superseded in 2020, when APS3 ACCESS-C was operationalised, along with a companion city ensemble version (see Conclusions for detail).

Significant C2 changes from APS1 ACCESS-C (hereafter shortened to ‘C1’) included: (i) updated model code and physics from Unified Model (UM) UM 7.5 to UM 8.2, also used in its global counterpart APS2 ACCESS-G (Bureau of Meteorology 2016a) (hereafter shortened to ‘G2’); (ii) nesting inside the updated regional APS2 ACCESS-R (Bureau of Meteorology 2016b) (hereafter shortened to ‘R2’); and (iii) a decrease in horizontal-grid spacing from 0.036° (~4 km) to 0.0135° (~1.5 km).

G2 had N512 (~25 km) resolution and 70 vertical levels with ~80 km top, while the nested limited area model, R2, had the same vertical levels and covered the Australasian domain [65.0°S to 16.95°N, 65.0°E to 184.57°E] with 0.11° (~12 km) horizontal resolution.
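The degree-to-kilometre equivalences quoted above (0.036° ≈ 4 km, 0.0135° ≈ 1.5 km, 0.11° ≈ 12 km) can be checked with a short calculation. The sketch below is illustrative only: it assumes a spherical Earth with a mean radius of 6371 km and gives the north-south length of a grid cell (the east-west length on a latitude-longitude grid shrinks with the cosine of latitude). The function name `grid_spacing_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def grid_spacing_km(spacing_deg: float) -> float:
    """Approximate north-south length (km) of a grid cell of the given size in degrees."""
    return math.radians(spacing_deg) * EARTH_RADIUS_KM


# Grid spacings in degrees as quoted in the text.
for name, spacing_deg in [("C1", 0.036), ("C2", 0.0135), ("R2", 0.11)]:
    print(f"{name}: {spacing_deg} deg ~ {grid_spacing_km(spacing_deg):.1f} km")
```

Under these assumptions the three spacings come out at about 4.0, 1.5 and 12.2 km, consistent with the rounded values used in the text.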

R2 nesting provided C2 with initial and lateral boundary conditions that enabled better simulation of rapidly changing small-scale, low-level features than in C1. This is partly due to R2’s upgrade and C2 having more vertical levels in the boundary layer than R2, but primarily due to C2’s increased resolution. C2 continued to be a ‘forecast only’ and ‘downscaler’ system like C1, as it did not include an assimilation cycle (Bureau of Meteorology 2013). C2 continued to run over six city domains, as seen in Fig. 1: Darwin (DN), Perth (PE), Adelaide (AD), VicTas (VT), Sydney (SY) and Brisbane (BN) with no change in the vertical resolution (Bureau of Meteorology 2018).

**Fig. 1.** Australian R2 domain with black C2 domains overlayed – most northerly is Darwin (DN) while the rest are (west to east): Perth (PE), Adelaide (AD), VicTas (VT), Sydney (SY) and Brisbane (BN).

Section 2 discusses how the adoption of C2’s CPM approach impacted the accuracy of precipitation forecasts. This is then demonstrated via various verification statistics in Section 3 and case studies in Section 4, with conclusions in Section 5.

2. Results

C2’s increased resolution enabled running without a convective parameterisation scheme, which is a significant change from C1 where convective activity was entirely column-based, and so had no memory of the previous state or horizontal transport. Fig. 2 shows C2’s increased resolution also better represented topography and coastlines. More realistic near-surface fields, particularly in rainfall, wind and temperature, could also evolve in C2.

**Fig. 2.** VicTas domain C1 (left) and C2 (right) orography fields around Hobart with overlayed coastline.

G2, R2 and C2 (with 25, 12, and 1.5 km resolutions) 24-h accumulated precipitation 0000UTC 36-h forecasts for Adelaide on 30 September 2016 (Fig. 3) are compared with daily rainfall from the Australian Water Availability Project (AWAP) (Evans et al. 2020) 0.05° gridded data, an analysis of the Automatic Weather Station (AWS) gauge data. The grid boxes are obvious in G2 and R2, as are the correspondingly smooth rainfall fields. C2 is continuous across coastal and inland regions, with more physically credible features. R2 looks like the observations, but AWAP can smooth out some of the higher observed gauge accumulations.

**Fig. 3.** Adelaide 24-h accumulated precipitation observations and G2, R2 and C2 36-h forecasts for 0000UTC 30 September 2016 (approx. 9 am local).

C2 and C1 Tasmanian 24-h accumulated rainfall 0000UTC 36-h forecasts and AWAP for 25 July 2015 (Fig. 4) show an example of C1’s parameterised convection forecast problem ‘coastal locking’ (Clark et al. 2016) – smoothed precipitation over the sea terminated on the west coast with little rain falling beyond. Although C2 cannot resolve convective clouds (requires <100 m resolution), it can explicitly capture processes with ‘convection like’ characteristics, which may subsequently drive scales that it can resolve and allow the more realistic rain fields seen in C2.

**Fig. 4.** (Left) C1 and (right) C2 Tasmanian 24-h accumulated rainfall 0000UTC 36-h forecasts and AWAP (bottom) for 25 July 2015.

Being a CPM, C2 needed to ‘spin-up’ convection in the first few hours due to nesting in R2’s coarser, and thus smoother, initial conditions. It had to generate convection ‘from scratch’, not instantaneously via parameterisation, as in C1, and could not always capture details of the onset of convection (Terry Davies 2014; Prein et al. 2015; Tennant 2015; Matte et al. 2017). This constraint did not prevent CPMs from successfully modelling any subsequent larger-scale organisation, such as into mesoscale convective systems (Clark et al. 2014, 2016).

CPM spin-up issues could result in ‘near boundary rainfall’ effects where, within the first few hours, rainfall was initially not simulated near the inflow advection boundary. This problem was mitigated by designing lateral boundaries to be away from the area of forecaster interest, which required larger domains and the introduction of variable grids (Tang et al. 2013).

C2 spin-up is seen in Fig. 5 for VicTas 24-h (top) and 36-h (bottom) forecast mean sea level pressure (MSLP) and 24-h precipitation from C2 and R2 initialised at 1200UTC on 6 May 2015. The 24-h C2 forecast had no precipitation forming near the western inflow boundary, unlike R2, but further into the domain, C2 had time to spin up convection from the inflow fields and create more realistic values. The 36-h forecast then showed well-developed precipitation.

**Fig. 5.** Spin-up near boundary effect seen in VicTas (VT) MSLP and 24-h precipitation forecasts (left) C2 and (right) R2 valid at (top) 1300UTC 7 May 2015 and (bottom) 01UTC 8 May 2015 for initialisation at 1200UTC 6 May 2015.

Verification of CPMs was also an area that needed close attention when grid point observation networks were used, since CPMs suffered from a ‘double penalty’ issue due to their high resolution. This is discussed further in the next section.

Despite these drawbacks, the benefits provided by the CPM approach were such that ‘Convection-permitting models (CPMs) have provided operational weather forecasting centres with a step-change in their capabilities to forecast rainfall’ (Clark et al. 2016).

3. Verification

C2 and C1 forecasts were compared across all domains using radar, satellite and point-based AWS and rain gauge network observations. Verification against each of these datasets has problems, particularly for CPMs.

Radar observations have high temporal and spatial resolution, but their range is often limited to a small portion of CPM domains, and they also suffer from effects of clutter, orography and virga (http://www.bom.gov.au/australia/radar/about/what_is_radar.shtml#re).

Satellite observations can cover the entire model domain but often do not have the resolution of radar. Satellite-derived precipitation has inherent limitations, such as limited space/time coverage over rapidly evolving convection and retrieving rainfall over land without using microwave emissions, only the scattering (ice crystal) signal, which isn’t effective for rain from low-level clouds (Dos Reis et al. 2017).

Single point observations penalise CPM verification, relative to parameterised convection, due to ‘double penalty’ issues, where both missed and false-alarms are counted in standard score contingency tables – unless they are exactly matched with any single point observation – resulting in significantly worse evaluation scores. This is compounded by highly variable (spatially and temporally) rainfall fields, and often denser point locations, in coastal areas, which give model performance in such areas relatively greater statistical weight.
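The double penalty can be made concrete with a toy example. The sketch below is illustrative only: the six-point series and the helper `contingency_counts` are hypothetical and are not part of any Bureau verification system. A forecast that places rain one point away from where it was observed collects both a miss and a false alarm, whereas a forecast of no rain at all collects only the miss.

```python
def contingency_counts(forecast, observed):
    """Return (hits, misses, false_alarms) for paired yes/no rain events at points."""
    pairs = list(zip(forecast, observed))
    hits = sum(1 for f, o in pairs if f and o)
    misses = sum(1 for f, o in pairs if not f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    return hits, misses, false_alarms


# Hypothetical yes/no rain events at six observation points.
observed = [0, 0, 1, 0, 0, 0]   # rain observed at one point
displaced = [0, 0, 0, 1, 0, 0]  # rain forecast, but one point away
dry = [0, 0, 0, 0, 0, 0]        # no rain forecast anywhere

print(contingency_counts(displaced, observed))  # (0, 1, 1): one miss and one false alarm
print(contingency_counts(dry, observed))        # (0, 1, 0): one miss only
```

On point-matched scores the displaced forecast is therefore rated worse than the dry forecast, even though it contains the event.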

Furthermore, all precipitation verification has its limitations. For example, the same 24-h accumulated rainfall could be due to 5 h of 2 mm h⁻¹ light rain, which is not the same as 1 h of 10 mm h⁻¹.

All verification periods discussed below used 0000UTC model forecasts for austral winter (1 June 2016–31 July 2016) and austral summer (1 December 2016–13 February 2017), unless otherwise stated, and are referred to as winter and summer, respectively.

3.1. Surface weather – AWS and rain gauge observations

Fig. 6a displays AWAP AWS hourly temperature, dewpoint and wind speed scorecards for C1 and C2 at 6-hourly lead times from 6 to 36 h from 0000UTC for winter. Fig. 6b displays the same for summer. Fields are interpolated to observation points and verified against observations within 30 min of the model valid time, with blue/clear/red indicating that C2 was better/comparable/worse than C1 using 95% significance. The root mean square error (RMSE) (R) represents the ‘average’ error, weighted according to the square of the error, while the Bias (B) is the correspondence between the mean forecast and the mean observation. Sub-daily AWAP data was available but not used, as it can be strongly influenced by local orography and rainfall-producing mechanisms (Jakob et al. 2011). Such scorecards suffer from ‘double penalties’ but still provide a very useful measure of large-scale systematic performance of C2 compared with C1.
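The two scorecard statistics can be written out directly. The sketch below is a minimal illustration using hypothetical temperature values; it takes the Bias to be the mean forecast minus the mean observation (an assumed convention) and does not attempt the 95% significance test used for the scorecard colours.

```python
import math


def rmse(forecasts, observations):
    """Root mean square error: large errors are weighted by their square."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mean_bias(forecasts, observations):
    """Mean forecast minus mean observation (assumed sign convention)."""
    return sum(forecasts) / len(forecasts) - sum(observations) / len(observations)


# Hypothetical screen temperatures (deg C) at four stations.
forecast_temp = [14.0, 16.5, 19.0, 21.0]
observed_temp = [15.0, 16.0, 18.0, 23.0]

print(f"RMSE = {rmse(forecast_temp, observed_temp):.3f}")       # 1.250
print(f"Bias = {mean_bias(forecast_temp, observed_temp):.3f}")  # -0.375
```

In this example the positive and negative errors partly cancel in the Bias but not in the RMSE, which is one reason the two statistics can tell different stories.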

**Fig. 6.** C1 and C2 AWS RMSE (R) and Bias (B) scorecards for surface screen temperature, dew point and 10-m wind speed, every 6–36 h, with blue/clear/red indicating that C2 is better/comparable/worse than C1 using 95% significance, (a) for winter and (b) for summer.

The winter RMSE is mostly blue/clear, indicating that C2 was generally better than or comparable to C1 for all domains and fields. In winter, C2 was worse only for Sydney dewpoint; in summer, it was worse more often, particularly for the 10-m wind.

The Bias scores are more mixed, reflecting the problems found when averaging scores ranging around zero. Bias again varied with domain and season with mostly blue/clear, indicating that C2 was better/comparable to C1. When C2 was worse (red), it occurred mainly in summer when convection is more active. This Bias was corrected with post-processing (Sweeney et al. 2013).

The more difficult summer period seen here is next examined in more detail for two domains using rain gauge observations.

Fig. 7 shows C2 and C1 summer RAINVAL (McBride and Ebert 2000) precipitation verification for Sydney and Darwin, using the actual rain gauge point station observations in RAINVAL instead of the gridded daily analysis as C2’s resolution was very high and gridded data too coarse.

**Fig. 7.** (Left) C1 and (right) C2 RAINVAL thresholds metrics ACC, BIAS, POD, FAR, CSI, ETS, HSS and HK on (top) Sydney (SY) and (bottom) Darwin (DN) domains. See text for detail.

Metric scores are plotted against maximum rainfall thresholds from 0.1 to 50 mm day⁻¹. The metrics and their ranges are: Average Correlation (ACC; 0 to 1; perfect score 1), frequency Bias Score (BIAS; 0 to ∞; perfect score 1), Probability of Detection (POD; 0 to 1; perfect score 1), False Alarm Ratio (FAR; 0 to 1; perfect score 0), Critical Success Index (CSI; 0 to 1; 0 indicates no skill; perfect score 1), Equitable Threat Score (ETS; −1/3 to 1; 0 indicates no skill; perfect score 1), Heidke Skill Score (HSS; −1 to 1; 0 indicates no skill; perfect score 1), and Hanssen and Kuipers Score (HK; −1 to 1; 0 indicates no skill; perfect score 1). Note that ETS adjusts for forecast bias, as over-forecasting can artificially improve this metric (Mesinger 2008). More metric details can be found at http://www.cawcr.gov.au/projects/verification.
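Most of the scores listed above are computed from a 2 × 2 contingency table of hits, false alarms, misses and correct negatives at a given rainfall threshold. The sketch below uses the standard textbook definitions and is not the RAINVAL code itself; the function names, the example counts and the use of "at or above threshold" to define an event are assumptions made for illustration. ACC is omitted, and no guard against empty categories (zero denominators) is included.

```python
def contingency_table(forecast_mm, observed_mm, threshold):
    """Count (hits, false_alarms, misses, correct_negatives) at a rainfall threshold.

    An event is taken to be rainfall at or above the threshold (an assumption).
    """
    hits = false_alarms = misses = correct_negatives = 0
    for f, o in zip(forecast_mm, observed_mm):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif f_yes:
            false_alarms += 1
        elif o_yes:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def categorical_scores(hits, false_alarms, misses, correct_negatives):
    """Standard categorical scores from a 2 x 2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    hits_random = (a + b) * (a + c) / n  # hits expected by chance
    return {
        "BIAS": (a + b) / (a + c),
        "POD": a / (a + c),
        "FAR": b / (a + b),
        "CSI": a / (a + b + c),
        "ETS": (a - hits_random) / (a + b + c - hits_random),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "HK": a / (a + c) - b / (b + d),
    }


# Hypothetical counts: 30 hits, 20 false alarms, 10 misses, 140 correct negatives.
scores = categorical_scores(30, 20, 10, 140)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For these counts BIAS is 1.25 (over-forecasting), POD 0.75, FAR 0.40, CSI 0.50, ETS 0.40 and HK 0.625. Repeating the calculation across thresholds gives curves of the kind plotted in Fig. 7.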

Most metrics acted in a similar manner for both domains and models as threshold size increased: POD and CSI decreased, FAR increased, ETS and HK rose then fell, while ACC fell then rose. This similarity was not unexpected due to the smoothing done by 24-h accumulations.

The exception is the blue BIAS line, where C1 had a strong positive/negative bias in Sydney for low/high rainfall rates, whereas C2 was always positive but generally closer to a perfect 1, except at high rainfall rates. Darwin C2 had little bias, except at thresholds above 20 mm day⁻¹, and C1 had large bias below 10 mm day⁻¹. This suggests that C2 performed better at forecasting heavy rainfall events but over-estimated them, whereas C1 often missed these and over-estimated low rainfall events. Similar results (not shown here) were also seen in the other domains.

3.2. FSS – satellite and radar observations

The RAINVAL verification showed C2 and C1 precipitation had systemic large-scale characteristic differences, such as high bias at higher rainfall thresholds.

Standard objective precipitation verification scores, such as those in RAINVAL, penalised C2 forecasts, since they presumed precise space and time matching of model and observational data. Thus, an otherwise ‘good’ forecast with the correct scale and intensity but a small displacement error will suffer from the ‘double penalty’ discussed previously. Mittermaier et al. (2013) demonstrated that, over a 6-month trial, standard ETS verification of four United Kingdom (UK) Met Office NWP systems (global at 25 km resolution, regional at 12 km, and UK at 4 and 1.5 km) produced higher (better) scores for the coarser-resolution models than for the higher-resolution models. This suggests that standard verification metrics are challenging for precipitation, particularly heavy precipitation, and that the ETS alone could suggest that lower-resolution models may be more useful than higher-resolution models, which was inconsistent with forecaster experience. The advantage of lower-resolution models was that the forecast rain filled the grid box, covering a greater overall area than rain forecast by the higher-resolution models. A lower-resolution model was therefore more likely to predict rain where it occurred, even if its intensity was too low. Such a conservative forecast also tends to have lower RMSE.

To address the limitations of traditional scores when applied to high-resolution forecasts, several classes of approaches have been suggested (neighbourhood, scale separation, feature-based, and field deformation methods), and the neighbourhood FSS (Roberts and Lean 2008) was used here. This considers not just grid boxes, but a sequence of increasing sized grid box ‘neighbourhoods’ (1 × 1, 3 × 3, 5 × 5, etc.) at each grid box. For the 5 × 5 neighbourhood, for example, if both the model and the observations have, say, 8 of the 25 grid boxes with precipitation greater than some target precipitation threshold (e.g. 1 mm h⁻¹) the forecast is considered a good one at that threshold for that length-scale of five-times-grid-spacing, even if the eight grid boxes from the model and the eight grid points from the observations are at different locations in the 5 × 5 neighbourhood. The verification question then moves from ‘is this forecast accurate?’ to ‘at what length-scales (i.e. neighbourhood size) does the model achieve some prescribed level of accuracy?’.
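For reference, the FSS of Roberts and Lean (2008) for a neighbourhood of width $n$ grid boxes can be written as below. The symbols are introduced here for illustration: $P_{f,i}^{(n)}$ and $P_{o,i}^{(n)}$ are the fractions of grid boxes exceeding the threshold within the $n \times n$ neighbourhood centred on grid box $i$, in the forecast and the observations respectively, and $N$ is the number of grid boxes in the verification domain.

$$
\mathrm{FSS}(n) = 1 - \frac{\dfrac{1}{N}\sum_{i=1}^{N}\left(P_{f,i}^{(n)} - P_{o,i}^{(n)}\right)^{2}}{\dfrac{1}{N}\left[\sum_{i=1}^{N}\left(P_{f,i}^{(n)}\right)^{2} + \sum_{i=1}^{N}\left(P_{o,i}^{(n)}\right)^{2}\right]}
$$

In the 5 × 5 example above, both fractions equal $8/25 = 0.32$ at that grid box, so it contributes nothing to the numerator regardless of where the eight boxes sit within the neighbourhood.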

Using the FSS approach, Mittermaier et al. (2013) were able to demonstrate that for the top 10% precipitation threshold, the UK 4 km model forecast was as accurate as the 12 km model, but at spatial length-scales 10 km smaller, i.e. the 4 km model added a lot more (correct) detail.

In both seasons, forecast accumulated rainfall over 3-h periods was regridded to the 0.1° × 0.1° grid of the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (GPM IMERG) satellite rainfall product (Huffman et al. 2017) and verified over the same 3-h periods. These data were then used for the FSS calculations. The 3-h timescale is important for hydrological, precipitation and thunderstorm forecasting. More generally, at high resolution, we would like to be able to forecast the evolution of precipitation, rather than just its time-average in the form of daily gauge values.

The FSS has values between 0 and 1: 0 for a complete mismatch and 1 for a perfect forecast, with forecasts regarded as skilful at scales where FSS is ~0.5 and above. Results for several 3-hourly thresholds for up to 36 h are presented (Figs 8–10). If no rain events are forecast and some occur, or some are forecast and none occur, the score is always 0. Note that as the rainfall threshold increases, the number of events included decreases, and thus FSS results become less reliable.
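The behaviour described above can be reproduced on a small grid. The following is a minimal pure-Python sketch of the neighbourhood calculation, not the code used for the results in this paper: the function names, the 7 × 7 test fields and the treatment of boxes outside the domain as dry are all assumptions made for illustration, and only odd neighbourhood widths are handled.

```python
def neighbourhood_fractions(binary, n):
    """Fraction of boxes exceeding the threshold in the n x n window around each box.

    Boxes outside the domain are counted as zero (an assumption); n must be odd.
    """
    rows, cols = len(binary), len(binary[0])
    half = n // 2
    fractions = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for ii in range(max(0, i - half), min(rows, i + half + 1)):
                for jj in range(max(0, j - half), min(cols, j + half + 1)):
                    total += binary[ii][jj]
            fractions[i][j] = total / (n * n)
    return fractions


def fss(forecast, observed, threshold, n):
    """Fractions Skill Score for neighbourhood width n (in grid boxes)."""
    f_bin = [[1 if v > threshold else 0 for v in row] for row in forecast]
    o_bin = [[1 if v > threshold else 0 for v in row] for row in observed]
    pf = neighbourhood_fractions(f_bin, n)
    po = neighbourhood_fractions(o_bin, n)
    pairs = [(a, b) for rf, ro in zip(pf, po) for a, b in zip(rf, ro)]
    mse = sum((a - b) ** 2 for a, b in pairs)
    reference = sum(a * a + b * b for a, b in pairs)
    # FSS is undefined when neither field has any event.
    return 1.0 - mse / reference if reference > 0 else float("nan")


def single_cell_field(size, row, col, rate):
    """Hypothetical rain field (mm/h): dry everywhere except one grid box."""
    field = [[0.0] * size for _ in range(size)]
    field[row][col] = rate
    return field


observed = single_cell_field(7, 3, 3, 5.0)
displaced = single_cell_field(7, 3, 4, 5.0)  # same rain, one box to the east
dry = [[0.0] * 7 for _ in range(7)]

for n in (1, 3, 5):
    print(n, round(fss(displaced, observed, 1.0, n), 3))
print("dry forecast:", fss(dry, observed, 1.0, 3))
```

With these fields the displaced forecast scores 0 at the grid scale (n = 1), about 0.67 for 3 × 3 neighbourhoods and 0.8 for 5 × 5, while the dry forecast scores 0 at every scale. This is the sense in which the FSS rewards a forecast with the right rain in nearly the right place.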

**Fig. 8.** C1/C2 (square/circle) domain median FSS as a function of spatial scale for 3.0 mm h⁻¹ threshold and forecast hours 21–24 h for winter (blue) and summer (green) (reproduced from Bureau of Meteorology (2018), fig. 3).

**Fig. 9.** C1/C2 (square/circle) Sydney, Brisbane and VicTas 167 km spatial scale median FSS as a function of lead times (3-hourly from 6 to 36 h) at threshold 1.5 mm h⁻¹ (green) and 6.0 mm h⁻¹ (blue) for (row 1) summer and (row 2) winter. Similar plots for Adelaide, Darwin and Perth are seen in rows 3 and 4 (reproduced from Bureau of Meteorology (2018), fig. 4).

**Fig. 10.** FSS C2/C1 (blue/red) comparison panels for (top) Sydney (SY), (middle) Brisbane (BN), and (bottom) Adelaide (AD) at base times (left) 0000UTC t + 12 h and (right) 1200UTC t + 12 h, using satellite precipitation as ‘truth’. Each panel shows six plots for 3-hourly thresholds of 1.5, 3, 6, 12, 18 and 24 mm h⁻¹.

Fig. 8 shows FSS as a function of spatial scale for the 3.0 mm h⁻¹ threshold and forecast rainfall accumulation over lead times between 21 and 24 h over each of the domains for winter and summer verification periods, and a few trends are immediately apparent. Firstly, FSS increased with the spatial distance scale, as expected, since we are effectively allowing for a greater tolerance of location errors in the forecast. Secondly, the FSS values for winter (green) were generally better than for summer (blue), except for Darwin during winter (its dry season) when few grid boxes reached the 3.0 mm threshold (hence the results are unreliable); and there was little difference between C1 and C2 values.

Both these points are consistent with the rainfall being much less convective during winter and hence rainfall fields being smoother in space and time and therefore more inherently predictable.

It is during the summer period (blue), when convection is more active, that the objectively better skill of C2 (blue circle) compared with C1 (blue square) is obvious. For every domain, apart from VicTas, the C2 FSS scores are clearly better than C1.

Another notable point is that for both the Brisbane and Perth domains the summer C2 FSS scores were the same or better than for the corresponding C1 winter scores. This was encouraging since, as discussed previously, summertime convective precipitation is inherently more difficult to predict than winter rainfall.

Fig. 9 shows FSS at the single spatial scale of 167 km as a function of 3-hourly forecast lead times from 6 to 36 h for both seasons and for the 1.5 mm h⁻¹ and 6.0 mm h⁻¹ thresholds. For most plots in both summer and winter, the FSS for the lower threshold is higher than the corresponding FSS at 6.0 mm h⁻¹, consistent with higher rainfall events being inherently less predictable than lower threshold events. While C2 scores were consistently higher than C1, the differences in FSS are much larger for the 6.0 mm h⁻¹ threshold than for the lower threshold. Another interesting trend was that for the higher rainfall threshold for Darwin in winter, and for some lead times for Brisbane, and Perth and Adelaide (winter only), C1 missed the high rainfall events entirely, scoring near zero. These very low scores are consistent with the large negative biases for C1 for high rainfall thresholds noted in the previous RAINVAL section.

The analysis of the FSS results shown suggests the C2 CPM was objectively much better at forecasting rainfall events in the 3-h time period than C1, with skill being more marked for the higher rainfall thresholds. C1 also could completely miss higher threshold rainfall events.

Comparison between RAINVAL and FSS verification is not straightforward, in that FSS 3-hourly verification results and point-based daily verification for a similar period can give somewhat different messages. For example, in the daily point verifications above, C2 was not clearly better according to most/all metrics in Sydney or Darwin, but it was in VicTas and Perth. This contrasts with the 3-hourly FSS spatial verification results, which show C2 performed significantly better in Sydney and Darwin but only somewhat better in VicTas.

There are representativeness errors associated with comparing gridded model output to rain gauge observations, but at high resolution, maybe this is not so concerning. The larger grid size in C1 gave it a slight advantage over C2, as it was less likely to miss the gauge, and the areal average rain amounts would tend to be more conservative than C2 for the same domain rainfall, which tends to result in lower RMSEs.

But if having realistic rain structures becomes important, then the FSS verification is appropriate. The GPM IMERG rainfall estimates contain errors, but as the C1 and C2 models both had the same scale-dependent verification treatment, then errors in the reference data are unlikely to change the conclusions about relative performance.

Fig. 10 (left) shows median FSS C2 (blue) and C1 (red) comparison panels for Sydney, Brisbane and Adelaide for summer base times 0000UTC t + 12 h using 3-hourly satellite precipitation observations. Each panel shows six 3-hourly plots from 6 to 36 h and thresholds of 1.5, 3, 6, 12, 18 and 24 mm h⁻¹. These plots show that, in the Adelaide domain, C2 was better than C1, whereas Sydney C2 was markedly better than C1, and Brisbane C2 was very clearly better than C1. Fig. 10 (right) has the corresponding 1200UTC plots to show the diurnal cycle, and the domains have similar relative improvements but all with much lower FSS scores.

Fig. 11 shows how well C2 and C1 captured convective initiation for both light and heavy rain (greater than 1 and 4 mm h⁻¹, respectively) when compared with radar observations from Rainfields (Seed et al. 2007), which provided grids of geo-referenced rainfall data or averages over user-defined catchments. The box-and-whisker FSS statistics are for 0000UTC precipitation forecasts from December 2014 to April 2015 over the Darwin, VicTas and Brisbane domains. The FSS scores are plotted against the length scale of the window-of-interest, from 11 to 582 km, and are calculated using data within the radar range. The small black line in each blue (C2) and red (C1) box marks the median (50th percentile).

**Fig. 11.** C2/C1 (blue/red) box-and-whisker FSS radar precipitation verification statistics for panels (left) Darwin, (right) VicTas and (bottom) Brisbane at 0000UTC for December 2014 to April 2015. The top and bottom rows in each panel show rain rates greater than 1 mm h⁻¹ (light rain), and greater than 4 mm h⁻¹ (heavy rain), respectively.

Tropical Darwin shows C2 was much better than C1, and more so at a higher threshold 4 mm h⁻¹ (heavy rain), where at least 50% of these cases were not even forecast by C1. This suggested C1’s parameterised convection is particularly problematic for tropical convection. Mid-latitude VicTas C1 managed to capture the heavy rain events but was still outperformed by C2, whereas Brisbane shows C2 was much better at forecasting precipitation than C1. These trends observed in C2 vs C1 precipitation forecast quality are consistent with those from the satellite-based verification in the previous section.

4. Case studies

Case studies are used here to examine in more detail the impact of the C2 upgrade on actual extreme rainfall forecast simulations for three domains: Darwin, Brisbane and VicTas.

4.1. Extreme rain event in Darwin on 4 May 2017

C1, C2 and radar 24-h total precipitation for Darwin on 4 May 2017 (Fig. 12) show that C1 had the entire domain covered with relatively light rain and little structure – a typical problem with parameterised convective models (discussed above) – while C2 captured the heavy rain around Darwin and the outer areas. C2 definitely ‘overdoes it’ relative to the radar (although this may be explained by radar errors), but there was also more structure to the C2 precipitation and no spurious land-locking.

**Fig. 12.** Plots of 24-h accumulations of rain from C1 (left), C2 (right) and radar (bottom) for the Darwin domain on 4 May 2017.

As in the previous verification section, C2 was better than C1 with this forecast; but were convection-permitting models useful for forecasters? This question is addressed in the next case study.

4.2. Supercell in Brisbane on 22 September 2017

This study indicates that C2 did well compared to C1, but even CPMs, such as C2, could not directly simulate the details of thunderstorm structure: the formal grid spacing of C2 was 0.0135° (~1.5 km), which is still much smaller (by a factor of >5–7) than the ‘effective resolution’ (i.e. the length-scale on which a model can simulate phenomena in a physically realistic and consistent manner) (Bryan et al. 2003; Skamarock 2004; Bierdel et al. 2012). Updraft strength is related to updraft size, and forecasters can infer details from the amount of vertical updraft and vertical vorticity rotation (updraft helicity) seen in a storm’s lower to middle troposphere. If this is significant, the storm is more likely to become a supercell.
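As a rough worked figure for the effective-resolution factor quoted above, and writing $\Delta x$ for the grid spacing and $\lambda_{\mathrm{eff}}$ for the effective resolution (symbols introduced here for illustration):

$$
\lambda_{\mathrm{eff}} \gtrsim (5\text{–}7)\,\Delta x = (5\text{–}7) \times 1.5\ \mathrm{km} \approx 7.5\text{–}10.5\ \mathrm{km}
$$

On this estimate, features narrower than roughly 8–10 km, which would include many individual thunderstorm updrafts, are represented on the C2 grid but not simulated with full physical realism.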

Fig. 13 shows the Brisbane storm at 0800UTC on 22 September 2017 as a radar composite and 1-hourly precipitation, with a red circle around the storm, as well as C2 3-hourly precipitation and 400, 700 and 800 mb vertical velocity fields zoomed into this storm (where the model grid spacing is now evident).

**Fig. 13.** Brisbane storm at 0800UTC on 22 September 2017: (top panel) Brisbane domain (left) Radar-composite and (right) 1-hourly precipitation plots; (middle panel) 1-h precipitation and vertical velocity at 400 mb C2 fields zoomed into the red circle region shown in the precipitation plot; and (bottom panel) zoomed in 700 and 800 mb vertical velocity. Note different contouring for each panel.

These plots show that the model had simulated the circular nature of updrafts extending vertically in supercell storms – it was not just a model single-point, ‘grid-point storm’. This was well simulated vertically but with local peak up/downdrafts located a few horizontal grid boxes from where the observed storm evolved. The updraft maxima increased with height with the 800, 700 and 400 mb plots, having maxima of 9–10, 12–15 and 24–25 m s⁻¹. This growth of updraft strength with height is typical of supercell formation, although upper-level maxima can get above 40 m s⁻¹.

4.3. Storm in Geelong on 27 January 2016

Geelong storm high space/time resolution radar images for a 6-h period starting 0054UTC 27 January 2016 (Fig. 14) show its convective temporal occurrence and evolution. All was quiet at 0054UTC, but by 0254UTC, convection had developed along the coastline south-west of Geelong. By 0331UTC, this consolidated as it approached Geelong, and by 0606UTC, it had subsequently propagated off to the north-east.

**Fig. 14.** Radar images of the Geelong storm on 27 January 2016 for (top) 0054UTC (left) and 0254UTC (right), (bottom) 0331UTC (left) and 0606UTC (right), [local time is plus 11 h i.e. corresponding to 11:54 am, 1:54 pm, 2:31 pm and 5:06 pm, respectively].

Compared to the radar, C2 captured the temporal and spatial growth and decay of the Geelong storm (Fig. 15), although with a time delay of some 3 h, with a simulated storm size and location much better than C1 at 0600UTC and 0700UTC forecast times. C1 shows coastally locked rain over the Geelong area and much weaker and smoother rainfall fields than indicated by radar observations.

**Fig. 15.** VicTas operational simulations at (left) 6-h and (right) 7-h forecast times of (top) C2 1.5 km convection-permitting and (bottom) C1 4 km parameterised convection.

5. Conclusions

With ~1.5 km resolution, C2 was a CPM and a marked improvement on C1, as it could simulate high temporal and spatial resolution convective features. The explicit convection of C2 avoided C1’s convective parameterisation problems, including coastal locking, weaker and unrealistically smooth simulated rain fields, and an inability to capture convective temporal occurrence and evolution. C2 improved the simulation of convective lifecycle, organisation, motion and strength, and of associated wind changes, although the timing of convective initiation was still a problem.

Verification of C1 and C2 forecasts against point-based AWAP AWS and rain-gauge network temperatures, dewpoints and wind speeds was presented, and RAINVAL verifications of rain gauge 24-h precipitation accumulation using traditional metrics were examined. The AWS scores varied with domain and season, but C2 was generally better than or comparable to C1, and when worse, this occurred mainly in the Bias, and then mostly in summer when convection is more active. RAINVAL showed that most metrics behaved similarly for both domains and models as threshold size increased, except for the Bias, where C2 performed better at forecasting heavy rainfall events but over-estimated them, whereas C1 often missed these while over-estimating low rainfall events.
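The threshold-dependent behaviour of the Bias described above can be illustrated with a minimal sketch of contingency-table scores. It assumes that ‘Bias’ means the frequency bias (forecast ‘yes’ count divided by observed ‘yes’ count); the gauge values and the two forecast series (`wet_model`, `dry_model`) are invented for illustration and are not C1, C2 or RAINVAL data.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    hits: int = 0
    misses: int = 0
    false_alarms: int = 0
    correct_negatives: int = 0


def contingency(forecast, observed, threshold):
    """Build the 2x2 table for paired forecast/gauge totals at one threshold."""
    table = Contingency()
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            table.hits += 1
        elif o_yes:
            table.misses += 1
        elif f_yes:
            table.false_alarms += 1
        else:
            table.correct_negatives += 1
    return table


def _ratio(num, den):
    """Return num/den, or NaN when the score is undefined (zero denominator)."""
    return num / den if den else float("nan")


def frequency_bias(t):
    """Forecast 'yes' count over observed 'yes' count; >1 means over-forecasting."""
    return _ratio(t.hits + t.false_alarms, t.hits + t.misses)


def pod(t):
    """Probability of detection: fraction of observed events that were forecast."""
    return _ratio(t.hits, t.hits + t.misses)


def far(t):
    """False alarm ratio: fraction of forecast events that did not occur."""
    return _ratio(t.false_alarms, t.hits + t.false_alarms)


def csi(t):
    """Critical success index (threat score)."""
    return _ratio(t.hits, t.hits + t.misses + t.false_alarms)


# Hypothetical 24-h totals (mm) at eight gauges; illustration only.
observed = [0.2, 1.5, 12.0, 30.0, 0.0, 55.0, 8.0, 26.0]
wet_model = [0.0, 3.0, 28.0, 41.0, 0.5, 60.0, 27.0, 35.0]  # overdoes heavy rain
dry_model = [1.2, 2.5, 9.0, 14.0, 1.1, 20.0, 6.0, 10.0]    # misses heavy rain

for name, fc in (("wet_model", wet_model), ("dry_model", dry_model)):
    for thr in (1.0, 25.0):
        t = contingency(fc, observed, thr)
        print(name, thr, round(frequency_bias(t), 2), round(pod(t), 2), round(csi(t), 2))
```

In this toy data set, the series that overdoes heavy rain has a frequency bias above 1 at the 25 mm threshold, whereas the series that never reaches 25 mm has a bias of 0 there and a bias above 1 at the 1 mm threshold. This is the same qualitative pattern as that described for C2 and C1, respectively.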

Standard point-based RAINVAL metrics require space-time matching of model and observational data, and the incurred miss and false-alarm penalties (‘double penalties’) were shown to be inappropriate for C2. This led to the use of the FSS metric to provide high-resolution, in time and space, verification against radar and satellite observations, while retaining RAINVAL for establishing large-scale grid box statistics – and for historical continuity.

Analysis of the FSS results showed that C2 was objectively better than C1 at forecasting rainfall events. This improvement was even more marked for high rainfall thresholds, where C1 completely missed some higher threshold rainfall events, consistent with the large negative C1 RAINVAL biases for high rainfall thresholds.
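To make the ‘double penalty’ argument concrete, the following is a minimal sketch of the FSS as formulated by Roberts and Lean (2008): FSS = 1 − Σ(P_f − P_o)² / (ΣP_f² + ΣP_o²), where P_f and P_o are the fractions of grid points exceeding a threshold within an n × n neighbourhood. The 9 × 9 toy fields, the 10 mm threshold and the treatment of points outside the domain as dry are assumptions made for illustration; this is not the verification code used for C2.

```python
def neighbourhood_fractions(field, threshold, n):
    """Fraction of points >= threshold in the n x n window centred on each point.

    n must be odd. Points outside the domain are counted as dry (an assumption).
    """
    rows, cols = len(field), len(field[0])
    half = n // 2
    binary = [[1 if v >= threshold else 0 for v in row] for row in field]
    fractions = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            total = 0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        total += binary[ii][jj]
            out_row.append(total / (n * n))
        fractions.append(out_row)
    return fractions


def fss(forecast, observed, threshold, n):
    """Fractions Skill Score for one threshold and one neighbourhood width n."""
    pf = neighbourhood_fractions(forecast, threshold, n)
    po = neighbourhood_fractions(observed, threshold, n)
    pairs = [(f, o) for rf, ro in zip(pf, po) for f, o in zip(rf, ro)]
    mse = sum((f - o) ** 2 for f, o in pairs)
    mse_ref = sum(f * f + o * o for f, o in pairs)
    return 1.0 - mse / mse_ref if mse_ref > 0 else float("nan")


def blank(size):
    """Return a size x size field of zero rainfall."""
    return [[0.0] * size for _ in range(size)]


# Hypothetical fields (mm): one observed storm cell, one forecast that places
# the cell two grid points to the east, and one forecast with no storm at all.
observed = blank(9)
observed[4][4] = 20.0
displaced = blank(9)
displaced[4][6] = 20.0
no_storm = blank(9)

for n in (1, 5):
    print(n, round(fss(displaced, observed, 10.0, n), 3),
          round(fss(no_storm, observed, 10.0, n), 3))
```

At grid scale (n = 1), the displaced storm scores 0, no better than the forecast with no storm, and a point-based table would additionally count it as both a miss and a false alarm. With a five-point neighbourhood, the displaced forecast scores 0.6 while the no-storm forecast remains at 0, so the neighbourhood approach rewards a forecast that is nearly right in location.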

High rainfall case studies supported the verification results and demonstrated that C1 often had relatively light rain regions with little structure, due mainly to its parameterised convection, whereas C2 better captured the heavy rain regions. Although C2 often ‘overdid’ the rainfall relative to observations, it had no spurious land-locking and more realistic precipitation structure, which enabled a better simulation of convective events.

The present APS3 ACCESS-C forecast (C3) and ensemble (CE3) suites became operational in 2020. Significant C3 changes from C2 include the implementation of a variable grid, improved vertical resolution (80 levels), hourly 4D-Var data assimilation, an update to UM version 10.6 and nesting in APS3 ACCESS-G. More details on C3 and its comparison to C2, as well as procedures used for the assimilation of surface, radar and satellite data, can be found in Rennie et al. (2020).

The CE3 forecast suite is based on the short-range, convective-scale ensemble prediction system over the UK, known as the Met Office Global and Regional Ensemble Prediction System (Hagelin et al. 2017). It has 2.2 km resolution and 12 members, runs 4 cycles per day to 42 h, has the same vertical resolution and UM version as C3, and runs over the Brisbane, Sydney and VicTas domains. CE3 uses C3 analysis as a basis for initial conditions and lateral boundary conditions and large-scale perturbations from APS3 ACCESS-G. More details on CE3 can be found in Cooper et al. (2020).

Thus, the C2 CPM improved Australian rainfall guidance and addressed several systemic issues in C1. C2 also laid the groundwork for the present-day C3 and CE3 models and further development of higher-resolution (down to 300 m) urban models.

Data availability

Data sharing is not applicable to this article as no new data were created.

Conflicts of interest

The authors declare no conflicts of interest.

Declaration of funding

This research did not receive any specific funding.

Acknowledgements

The authors would like to thank the many Australian Bureau of Meteorology individuals who contributed to the development of ACCESS-C2, without whom this operational CPM suite would never have eventuated. Although this is by no means a comprehensive list, we thank Robin Bowen, Richard Dare, Imtiaz Dharssi, Yimin Ma, Michael Naughton, Rod Potts, Lawrie Rikus, Belinda Roux, Xiaoxi Wu and Hongyan Zhu.

References

Bierdel L, Friederichs P, Bentzien S (2012) Spatial kinetic energy spectra in the convection-permitting limited-area NWP model COSMO-DE. Meteorologische Zeitschrift 21, 245–258.

Bryan GH, Wyngaard JC, Fritsch JM (2003) Resolution Requirements for the Simulation of Deep Moist Convection. Monthly Weather Review 131, 2394–2416.

Bureau of Meteorology (2013) NMOC APS1 ACCESS-C Operational Bulletin No. 99. APS1 upgrade of the ACCESS-C Numerical Weather Prediction system. Available at http://www.bom.gov.au/australia/charts/bulletins/apob99.pdf

Bureau of Meteorology (2016a) BNOC Operations Bulletin No. 105. APS2 Upgrade to the ACCESS-G Numerical Weather Prediction System. Available at http://www.bom.gov.au/australia/charts/bulletins/APOB105.pdf

Bureau of Meteorology (2016b) BNOC Operations Bulletin No. 107. APS2 Upgrade to the ACCESS-R Numerical Weather Prediction System. Available at http://www.bom.gov.au/australia/charts/bulletins/apob107-external.pdf

Bureau of Meteorology (2018) NMOC Operations Bulletin Number 114 APS2 upgrade of the ACCESS-C Numerical Weather Prediction system. Available at http://www.bom.gov.au/australia/charts/bulletins/BNOC_Operations_Bulletin_114.pdf

Clark PA, Browning KA, Forbes RM, Morcrette CJ, Blyth AM, Lean HW (2014) The evolution of an MCS over southern England. Part 2: model simulations and sensitivity to microphysics. Quarterly Journal of the Royal Meteorological Society 140, 458–479.

Clark PA, Roberts NM, Lean HW, Ballard SP, Charlton‐Perez C (2016) Convection‐permitting models: a step‐change in rainfall forecasting. Meteorological Applications 23, 165–181.

Cooper S, Rennie S, Dietachmayer G, Steinle P, Xiao Y, Finch J, Marshall M (2020) ACCESS City Ensemble: Uncertainty in High Resolution NWP. Bureau of Meteorology Annual Research and Development Workshop. Available at http://www.bom.gov.au/research/workshop/2020/Talks/Shaun-Cooper.pdf

Davies T (2014) Lateral boundary conditions for limited area models. Quarterly Journal of the Royal Meteorological Society 140, 185–196.

Dos Reis JBC, Rennó CD, Lopes ESS (2017) Validation of Satellite Rainfall Products over a Mountainous Watershed in a Humid Subtropical Climate Region of Brazil. Remote Sensing 9, 1240.

Evans A, Jones D, Smalley R, Lellyett S (2020) An enhanced gridded rainfall analysis scheme for Australia. Bureau of Meteorology Bureau Research Report 41. Available at http://www.bom.gov.au/research/publications/researchreports/BRR-041.pdf

Hagelin S, Son J, Swinbank R, McCabe A, Roberts N, Tennant W (2017) The Met Office convective-scale ensemble, MOGREPS-UK. Quarterly Journal of the Royal Meteorological Society 143, 2846–2861.

Hartfield G (2017) How Convection-Allowing Models Have Changed Our World. Available at https://nwas.org/convection-allowing-models-changed-world/

Huffman GJ, Bolvin DT, Nelkin EJ (2017) Integrated Multi-satellite Retrievals for GPM (IMERG) Technical Documentation. Available at https://pmm.nasa.gov/sites/default/files/document_files/IMERG_technical_doc_3_22_17.pdf

Jakob D, Karoly DJ, Seed A (2011) Non-stationarity in daily and sub-daily intense rainfall – Part 2: Regional assessment for sites in south-east Australia. Natural Hazards and Earth System Sciences 11, 2273–2284.

Matte D, Laprise R, Thériault JM, et al. (2017) Spatial spin-up of fine scales in a regional climate model simulation driven by low-resolution boundary conditions. Climate Dynamics 49, 563–574.

McBride J, Ebert E (2000) Verification of Quantitative Precipitation Forecasts from Operational Numerical Weather Prediction Models over Australia. Weather and Forecasting 15, 103–121.

Mesinger F (2008) Bias Adjusted Precipitation Threat Scores. Advances in Geosciences 16, 137–142.

Mittermaier M, Roberts NM, Thompson SA (2013) A long term assessment of precipitation forecast skill using the fractions skill score. Meteorological Applications 20, 176–186.

Prein AF, Langhans W, Fosser G, et al. (2015) A review on regional convection-permitting climate modeling: Demonstrations, prospects, and challenges. Reviews of Geophysics 53, 323–361.

Puri K, Dietachmayer G, Steinle P, Dix M, Rikus L, Logan L, Naughton M, Tingwell C, Xiao Y, Barras V, Bermous I, Bowen R, Deschamps L, Franklin C, Fraser J, Glowacki T, Harris B, Lee J, Le T, Roff G, Sulaiman A, Sims H, Sun X, Sun Z, Zhu H, Chattopadhyay M, Engel C (2013) Implementation of the initial ACCESS numerical weather prediction system. Australian Meteorological and Oceanographic Journal 63, 265–284.

Rennie S, Rikus L, Eizenberg N, Steinle P, Krysta M (2020) Impact of Doppler Radar Wind Observations on Australian High-Resolution Numerical Weather Prediction. Weather and Forecasting 35, 309–324.

Roberts NM, Lean HW (2008) Scale-selective verification of rainfall accumulations from high resolution forecasts of convective events. Monthly Weather Review 136, 78–96.

Seed A, Duthie E, Chumchean S (2007) Rainfields: the Australian Bureau of Meteorology system for quantitative precipitation estimation. Proc. 33rd Conf. on Radar Meteorology, Cairns, Australia. Available at https://ams.confex.com/ams/33Radar/techprogram/paper_123340.htm

Seed A, Bell A, Steinle P, Rennie S (Eds) (2019) Forecasting Demonstration Project – Sydney 2014. Bureau of Meteorology Bureau Research Report 46. Available at http://www.bom.gov.au/research/publications/researchreports/BRR-046.pdf

Skamarock WC (2004) Evaluating Mesoscale NWP Models Using Kinetic Energy Spectra. Monthly Weather Review 132, 3019–3032.

Sweeney CP, Lynch P, Nolan P (2013) Reducing errors of wind speed forecasts by an optimal combination of post-processing methods. Meteorological Applications 20, 32–40.

Tang Y, Lean HW, Bornemann J (2013) The benefits of the Met Office variable resolution NWP model for forecasting convection. Meteorological Applications 20, 417–426.

Tennant W (2015) Improving initial condition perturbations for MOGREPS-UK. Quarterly Journal of the Royal Meteorological Society 141, 2324–2336.