UM version4.5 benchmarks

From SourceWiki

Revision as of 15:24, 17 December 2012 by GethinWilliams (talk | contribs) (→‎Intel SandyBridge)



Benchmarking UM Version 4.5 on Different Architectures

Preamble

Cluster/parallel file systems are often a bottleneck.
If the model is not filesystem-bound, it is often bound by MPI message latency.
Only the master process writes output. This can lead to load-balance issues, which hinder scaling.

AMD Bulldozer

Intel Westmere

Intel SandyBridge

Test system: quad-socket, 8-core Xeon E5-4650L (2.60 GHz; the L suffix denotes low power)
20 MB L3 cache

MPI message latency
Path	0 bytes	128 bytes	1024 bytes
Between sockets	~0.70 µs	~1.15 µs	~2.0 µs
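
The latency figures above can be read as a fixed start-up cost plus a per-byte cost. The following is a minimal sketch of that arithmetic, not part of the original benchmark. The helper name `marginal_cost_ns_per_byte` is hypothetical, and the input values are the approximate between-socket timings from the table.

```python
# Approximate between-socket latencies from the table: (message size in bytes, time in microseconds).
LATENCY_US = [(0, 0.70), (128, 1.15), (1024, 2.0)]


def marginal_cost_ns_per_byte(points):
    """Return (size_from, size_to, ns per extra byte) for each adjacent pair of measurements."""
    costs = []
    for (n0, t0), (n1, t1) in zip(points, points[1:]):
        # Extra time divided by extra bytes, converted from microseconds to nanoseconds.
        costs.append((n0, n1, (t1 - t0) / (n1 - n0) * 1000.0))
    return costs


if __name__ == "__main__":
    for n0, n1, cost in marginal_cost_ns_per_byte(LATENCY_US):
        print(f"{n0:5d} -> {n1:5d} bytes: ~{cost:.2f} ns per extra byte")
    # Share of the 1024-byte time that is already paid by an empty message.
    startup_share = LATENCY_US[0][1] / LATENCY_US[-1][1]
    print(f"start-up share of a 1024-byte message: ~{startup_share:.0%}")
```

On these numbers the zero-byte start-up cost accounts for roughly a third of the 1024-byte time, so small messages are dominated by latency rather than by payload size.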

FAMOUS

Domain Decomposition	Model-years/day
4x2	~327
8x2	~450
8x4	~480

HadCM3

Domain Decomposition	Model-years/day
8x2	~48
8x4	~65
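
The FAMOUS and HadCM3 tables above can be turned into speedup and parallel efficiency figures. This is a rough sketch rather than part of the original benchmark. It assumes that a decomposition written AxB runs on A*B MPI processes, and it measures each model against its own smallest listed decomposition. The function names are hypothetical.

```python
# Approximate throughput (model-years/day) from the tables, keyed by domain decomposition.
FAMOUS = [("4x2", 327), ("8x2", 450), ("8x4", 480)]
HADCM3 = [("8x2", 48), ("8x4", 65)]


def process_count(decomposition):
    """Assumption: an 'AxB' decomposition uses A*B MPI processes."""
    a, b = decomposition.lower().split("x")
    return int(a) * int(b)


def scaling(results):
    """Return (decomposition, processes, speedup, efficiency) relative to the first entry."""
    base_label, base_rate = results[0]
    base_procs = process_count(base_label)
    rows = []
    for label, rate in results:
        procs = process_count(label)
        speedup = rate / base_rate
        # Efficiency: achieved speedup divided by the increase in process count.
        efficiency = speedup / (procs / base_procs)
        rows.append((label, procs, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for name, results in (("FAMOUS", FAMOUS), ("HadCM3", HADCM3)):
        for label, procs, speedup, efficiency in scaling(results):
            print(f"{name:7s} {label:4s} {procs:3d} procs  "
                  f"speedup {speedup:.2f}  efficiency {efficiency:.2f}")
```

Under that assumption, FAMOUS gains about 1.38x from doubling 8 to 16 processes (efficiency about 0.69) but only about 1.47x at 32 processes (efficiency about 0.37). HadCM3 gains about 1.35x from 16 to 32 processes (efficiency about 0.68). The baselines differ between the two models, so the efficiencies are not directly comparable.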


JASMIN