GA7 Performance

This page needs updating.

The following charts show the performance of the UM running GA7 at various processor decompositions, to help you choose the best trade-off between walltime and SU cost.

Increasing the number of CPUs in a run will speed it up, but the model becomes less efficient as you add more CPUs: more time is spent by the CPUs communicating instead of computing.

This data was calculated from a number of 10-model-day runs. Real run times will also be affected by writing output, and time spent in the queue can considerably increase the total time spent waiting for the model to complete (changing the restart length and walltime may improve this).

Changing the science settings can also improve model performance. For example, for longer runs you might consider using the `EasyAerosol` scheme instead of the default aerosols.

Recommended settings

For general GA7 runs on Raijin's Broadwell nodes, we recommend the following settings:

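| Description     | Config Setting | Value      | Comment       |
|-----------------|----------------|------------|---------------|
| Restart length  | RESUB          | P4Y        | 4 model years |
| Walltime        | CLOCK          | '24:00:00' | 24 hours      |
| X Decomposition | PE_ATM_NPROCX  | 34         |               |
| Y Decomposition | PE_ATM_NPROCY  | 28         |               |
| OpenMP Threads  | OMPTHR_ATM     | 1          |               |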

This will run 4 model years in approximately 18 wall hours, at a cost of about 21 kSU.

To conserve SU, a decomposition of 18 x 28 will run 4 model years in 24 wall hours, at a cost of about 15 kSU.
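The SU figures above can be roughly reproduced from the CPU count and walltime. The sketch below assumes a charge rate of 1.25 SU per CPU-hour; that rate is not stated on this page, but it is consistent with the quoted costs. The function name `estimate_ksu` is illustrative only.

```python
# Assumed charge rate (SU per CPU-hour). Not stated on this page;
# chosen because it is consistent with the quoted 21 kSU and 15 kSU.
SU_PER_CPU_HOUR = 1.25


def estimate_ksu(nprocx, nprocy, wall_hours, omp_threads=1, rate=SU_PER_CPU_HOUR):
    """Estimate the cost of a run in kSU from its layout and walltime."""
    cpus = nprocx * nprocy * omp_threads
    return cpus * wall_hours * rate / 1000.0


recommended = estimate_ksu(34, 28, wall_hours=18)  # 952 CPUs for 18 hours
economy = estimate_ksu(18, 28, wall_hours=24)      # 504 CPUs for 24 hours

print(f"34 x 28: {recommended:.2f} kSU")
print(f"18 x 28: {economy:.2f} kSU")
```

Under this assumption the estimates come out at roughly 21 kSU and 15 kSU, matching the figures quoted above.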

To speed up the model further (at the expense of SU cost), it is important to enable OpenMP threads by setting OMPTHR_ATM to 2; otherwise the model becomes very inefficient at CPU counts above 1000. The total CPU count is PE_ATM_NPROCX * PE_ATM_NPROCY * OMPTHR_ATM.
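As a quick check of a proposed layout, the CPU count formula can be written as a small helper. This is a minimal sketch: the function names are hypothetical, the threshold of 1000 is the approximate figure quoted above, and the 48 x 28 layout is an invented example rather than a tested configuration.

```python
# Approximate CPU count above which single-threaded runs become
# very inefficient, as quoted on this page.
CPU_THRESHOLD = 1000


def total_cpus(nprocx, nprocy, omp_threads=1):
    """Total CPU count: NPROCX * NPROCY * OMPTHR."""
    return nprocx * nprocy * omp_threads


def layout_warning(nprocx, nprocy, omp_threads=1):
    """Return a warning string for large single-threaded layouts, else None."""
    cpus = total_cpus(nprocx, nprocy, omp_threads)
    if cpus > CPU_THRESHOLD and omp_threads < 2:
        return f"{cpus} CPUs with 1 thread: consider setting OMPTHR_ATM to 2"
    return None


print(total_cpus(34, 28, 1))      # recommended layout
print(layout_warning(34, 28, 1))  # below the threshold, so no warning
print(total_cpus(34, 28, 2))      # same decomposition with 2 OpenMP threads
print(layout_warning(48, 28, 1))  # hypothetical large single-threaded layout
```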

Performance by CPU count

NCI recommend choosing an initial decomposition by minimising the product of walltime and SU cost (walltime x SU cost), which can be seen below for UM versions 10.6 and 11.0.
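The metric itself is simple to compute. The sketch below applies it to the only two configurations quoted in the text of this page (rounded walltime in hours and cost in kSU); the names `configs` and `tradeoff_score` are illustrative, and the charts cover many more decompositions than this.

```python
# (wall hours, kSU) for 4 model years, as quoted on this page.
configs = {
    "34 x 28": (18, 21),
    "18 x 28": (24, 15),
}


def tradeoff_score(wall_hours, ksu):
    """Walltime x SU cost; lower is better under this metric."""
    return wall_hours * ksu


scores = {name: tradeoff_score(*values) for name, values in configs.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score}")
print(f"Lowest walltime x SU cost: {best}")
```

With these rounded figures the two layouts score 378 and 360, which is close enough that rounding and queue time could easily change the ordering.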
