WRF performance

 
Latest revision as of 22:24, 22 January 2020

Configuration

We tested the performance of WRF using the NARCliM project setup: the CORDEX domain with a nest over NSW.

WRF performance can also be assessed for other domains, provided they are used by a group of people and, ideally, over a significant period of time. Please contact cws_help@nci.org.au to discuss the possibility.

The configuration tested can be found at /g/data/sx70/wrf-scaling.

Results

The charts below show the walltime (WT) and the SU cost per model day for different numbers of processors, threads and I/O quilting options.

They show data for Gadi, with some Raijin results for comparison.

In the legend, "T" indicates the number of threads (OpenMP).
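
As a rough illustration of how the two plotted quantities relate, the sketch below derives walltime and SU cost per model day from a single run. It assumes SUs are charged as cores × walltime hours × a queue-dependent rate; the function name, its parameters and the example numbers are hypothetical and are not taken from the tests on this page, so check the charge rate for your queue before relying on it.

```python
def per_model_day(cores, walltime_s, model_days, su_per_core_hour):
    """Return (walltime hours, SU cost) per simulated model day for one run.

    Assumes the job is charged cores * walltime_hours * su_per_core_hour.
    The rate depends on the queue and must be supplied by the caller.
    """
    if model_days <= 0:
        raise ValueError("model_days must be positive")
    walltime_h = walltime_s / 3600.0
    su_total = cores * walltime_h * su_per_core_hour
    return walltime_h / model_days, su_total / model_days


# Hypothetical numbers for illustration only, not measured results.
wt, su = per_model_day(cores=96, walltime_s=1800, model_days=1,
                       su_per_core_hour=2.0)
print(f"{wt:.2f} h and {su:.1f} SU per model day")
```

Comparing configurations then amounts to computing both values for each run and looking for the shortest walltime that does not come with a large increase in SU cost.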

For this configuration, the best option is to run with 624 processors and 2 OpenMP threads. This gives the shortest walltime at one of the lowest SU costs. I/O quilting does not seem to be worth using in this case.

WRF v4.1.3 SU per model day.png
WRF v4.1.3 Walltime per model day.png