IBM Systems Magazine, Mainframe - January/February 2017 - SE27
((( SPONSORED ADVERTISING CONTENT )))
STOP THROTTLING THAT
FERRARI: GET BETTER GAS
z Performance Specialist
John has been analyzing and
tuning MVS systems for over
20 years. He never trusts a
computer he can lift.
MVS Solutions Inc.
#400 - 8300 Woodbine Ave.
Markham, ON, L3R 9Y7, Canada
Today, we seem to be reaching the upper limits of CPU power in terms of raw clock speed.
The latest IBM z Systems* machine (z13*) has a lower clock speed than its predecessor,
the EC12*, yet the z13 achieves a 12 percent average improvement in total capacity
according to the Large System Performance Reference (LSPR). How is this possible?
The primary capacity bottleneck today is not in the capability of the CPU to execute
instructions; it's in the capability of the supporting infrastructure to keep the CPU
supplied with a steady stream of data and instructions to process. This critical task
falls to the CPU caches.
All modern CPU designs make use of multilevel caches in order to achieve this important
goal. In the z13, the small but fast L1 and L2 caches are dedicated to each CPU core.
Further out, there are larger L3 and L4 caches, plus main memory, that are shared among
CPUs. In IBM z Systems architecture, these shared areas are referred to as the nest.
The z13 CPUs "spin" at a rate of 5 GHz, or 5 billion cycles per second. In traditional
capacity planning terms, this could be referred to as 5,000 MIPS. However, your actual
achieved MIPS rate largely depends on how many cycles are wasted waiting for data
or instructions to be fetched from the processor caches. You likely only need a few
cycles to fetch from the local L1 or L2 caches. The deeper and more often you need
to go into the nest, the greater the cost in lost cycles. The measure of how heavily a
workload uses the nest is known as Relative Nest Intensity (RNI).
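
The link between clock rate, lost cycles, and achieved MIPS comes down to simple arithmetic. The Python sketch below is only an illustration, not a capacity-planning model. The function name, the baseline of one instruction per cycle (implied by equating 5 GHz with 5,000 MIPS), and the stall figures are all assumptions chosen for the example.

```python
def effective_mips(clock_ghz, base_cpi=1.0, stall_cycles_per_instr=0.0):
    """Estimate the achieved MIPS rate of a single core.

    clock_ghz              -- clock rate in GHz (5.0 for the z13, per the text)
    base_cpi               -- cycles per instruction with no cache waits
                              (assumed 1.0, matching the 5,000 MIPS figure)
    stall_cycles_per_instr -- average cycles lost per instruction while waiting
                              on the caches (hypothetical input)
    """
    cycles_per_instr = base_cpi + stall_cycles_per_instr
    if cycles_per_instr <= 0:
        raise ValueError("cycles per instruction must be positive")
    # 1 GHz = 1,000 million cycles per second
    return clock_ghz * 1000.0 / cycles_per_instr


if __name__ == "__main__":
    # Hypothetical stall levels, from a cache-friendly to a nest-heavy workload
    for stalls in (0.0, 1.0, 3.0):
        print(f"{stalls:.1f} stall cycles/instr: {effective_mips(5.0, 1.0, stalls):,.0f} MIPS")
```

Under these assumptions, a workload that loses an average of one cycle per instruction to cache waits achieves half the nominal rate, and one that loses three cycles achieves a quarter of it.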
How do you optimize access to processor caches? Many factors are simply beyond
practical control; however, some actions can be taken:
* Study your SMF 113 records to understand your processor cache efficiency
* Limit processor sharing among LPARs by setting PR/SM weights carefully and
balancing logical and physical CPUs; in a HiperDispatch context, try to avoid
"Vertical Low" polarized CPs
* Reuse data in caches by minimizing the number of diverse applications and limiting
concurrency in LPARs (automate batch initiators and CICS*/IMS* regions)
* Automate input queues to control CPU utilization (to avoid processor overload)
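
One way to turn the first of these actions into a number is to weight where cache misses were resolved, so that deeper trips into the nest count for more. The Python sketch below assumes the sourcing percentages have already been derived from SMF 113 data by some other tool. The level names, the weights, and the function name are hypothetical placeholders, not IBM's published RNI formula, whose coefficients differ by machine generation.

```python
# Hypothetical weights: the deeper the source, the higher the cost.
# Substitute the coefficients published for your machine generation.
ILLUSTRATIVE_WEIGHTS = {
    "L3": 1.0,
    "L4_local": 2.0,
    "L4_remote": 4.0,
    "memory": 8.0,
}


def nest_score(sourcing_pct, weights=None):
    """Weighted sum of nest sourcing percentages (higher means more nest-intensive).

    sourcing_pct -- dict mapping a nest level to the percentage of L1 misses
                    resolved from that level (hypothetical input layout)
    weights      -- dict mapping the same level names to a relative cost
    """
    if weights is None:
        weights = ILLUSTRATIVE_WEIGHTS
    unknown = set(sourcing_pct) - set(weights)
    if unknown:
        raise ValueError(f"no weight defined for: {sorted(unknown)}")
    if any(pct < 0 for pct in sourcing_pct.values()) or sum(sourcing_pct.values()) > 100:
        raise ValueError("percentages must be non-negative and total at most 100")
    return sum(weights[level] * pct for level, pct in sourcing_pct.items()) / 100.0


if __name__ == "__main__":
    # Made-up sample intervals, for illustration only
    light = {"L3": 4.0, "L4_local": 1.0, "L4_remote": 0.0, "memory": 0.5}
    heavy = {"L3": 10.0, "L4_local": 4.0, "L4_remote": 2.0, "memory": 3.0}
    print("light:", nest_score(light))
    print("heavy:", nest_score(heavy))
```

Tracking a score like this per interval, before and after a change to weights or initiator counts, gives a rough indication of whether the change kept more work in the near caches.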
It's not about the engine; it's about the fuel. If you're not keeping the engine well
fed, you're just spinning your wheels.