The following charts show the performance of the UM running GA7 at various processor decompositions, to help you decide the best tradeoff between walltime and SU cost.
Increasing the number of CPUs in a run will speed it up, but as you add more CPUs the model becomes less efficient: more time is spent by the CPUs communicating instead of computing.
This data was calculated from a number of 10-model-day runs. Real run times will also be affected by writing output, and time spent in the queue can considerably increase the total time spent waiting for the model to complete (changing the restart length and walltime to different values may improve this).
Changing the science settings can also improve model performance. For example, for longer runs you might consider using the `EasyAerosol` scheme instead of the default aerosols.
For general GA7 runs on Raijin's Broadwell nodes, we recommend the following settings:
|Restart length||RESUB||P4Y||4 model years|
This will run 4 model years in approximately 18 wall hours, and cost 21 kSU.
To conserve SU, a decomposition of 18 x 28 will run 4 model years in 24 hours and cost 15 kSU.
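To compare the two options on a per-model-year basis, the figures quoted above can be normalised as in the following minimal sketch. It uses only the walltime and kSU values stated on this page; the configuration labels and the helper name `per_model_year` are illustrative and not part of the UM or any NCI tool.

```python
# Figures quoted in the text for 4 model years on Raijin's Broadwell nodes.
# The dictionary keys are illustrative labels only.
MODEL_YEARS = 4
configs = {
    "recommended settings": {"wall_hours": 18, "ksu": 21},
    "18 x 28 decomposition": {"wall_hours": 24, "ksu": 15},
}


def per_model_year(wall_hours, ksu, model_years=MODEL_YEARS):
    """Return (wall hours per model year, kSU per model year)."""
    return wall_hours / model_years, ksu / model_years


for name, cfg in configs.items():
    hours, ksu = per_model_year(cfg["wall_hours"], cfg["ksu"])
    print(f"{name}: {hours:.2f} wall hours and {ksu:.2f} kSU per model year")
```

On these figures the recommended settings take about 4.5 wall hours and 5.25 kSU per model year, against 6 wall hours and 3.75 kSU for the 18 x 28 decomposition. Queue time and output writing are not included.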
To speed up the model (at the expense of SU cost), it is important to enable OpenMP threads by setting `OMPTHR_ATM` to 2. Otherwise the model becomes very inefficient at CPU counts above 1000. The total CPU count is `NPROCX * NPROCY * OMPTHR_ATM`.
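The CPU count formula can be checked with a short sketch like the one below. The function names are hypothetical, and the default of 28 cores per Broadwell node is an assumption about Raijin's hardware rather than something stated on this page; adjust it if your nodes differ. The thread counts in the example are illustrative.

```python
def total_cpus(nprocx, nprocy, omp_threads=1):
    """Total CPUs requested: NPROCX * NPROCY * OpenMP threads per process."""
    return nprocx * nprocy * omp_threads


def nodes_needed(cpus, cores_per_node=28):
    """Whole nodes needed for `cpus` cores.

    cores_per_node=28 is an assumed value for Raijin's Broadwell nodes.
    """
    return -(-cpus // cores_per_node)  # ceiling division


# An 18 x 28 decomposition, first without and then with 2 OpenMP threads.
for threads in (1, 2):
    cpus = total_cpus(18, 28, threads)
    print(f"18 x 28, {threads} thread(s): {cpus} CPUs, {nodes_needed(cpus)} nodes")
```

With two threads the same 18 x 28 decomposition already requests just over 1000 CPUs, which is the regime where the text says threading matters.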
Performance by CPU count
NCI recommend choosing an initial decomposition by minimising the value (walltime x SU cost), which is shown below for UM versions 10.6 and 11.0.
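As a rough illustration of this metric, the sketch below ranks candidate configurations by walltime x SU cost. The only data used are the two 4-model-year figures quoted earlier on this page (18 hours for 21 kSU, and 24 hours for 15 kSU); the labels and the `tradeoff` helper are illustrative names, and in practice you would substitute measurements from your own runs.

```python
# (label, wall hours, kSU) for a 4-model-year run, as quoted on this page.
candidates = [
    ("recommended settings", 18, 21),
    ("18 x 28 decomposition", 24, 15),
]


def tradeoff(wall_hours, ksu):
    """The value to minimise: walltime multiplied by SU cost."""
    return wall_hours * ksu


# Sort so the configuration with the smallest product comes first.
ranked = sorted(candidates, key=lambda c: tradeoff(c[1], c[2]))

for label, wall_hours, ksu in ranked:
    print(f"{label}: {wall_hours} h x {ksu} kSU = {tradeoff(wall_hours, ksu)}")
```

On these rounded figures the two products are close (378 and 360), so the choice between them comes down mainly to whether walltime or SU budget matters more for your project.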