IBM Systems Magazine, Mainframe - July/August 2013 - (Page 40)

TECH CORNER

Architected for Availability

In addition to high performance, System z processors are designed to be reliable

C. Kevin Shum is a Distinguished Engineer in IBM Poughkeepsie’s Systems and Technology Group, working in the development of System z microprocessors.

Scott B. Swaney is a senior technical staff member working on hardware and system design and diagnostics for IBM servers in Poughkeepsie, NY.

The IBM System z* design focuses on reliability, availability and serviceability (RAS). To achieve an extremely dependable commercial system, each component in its hierarchy must have good error detection and recoverability. The microprocessors within each System z machine provide significant performance improvements over their predecessors while retaining the dependability expected from IBM mainframes.

Although fault-tolerant design techniques are known to many, their application is important. System z processors are developed with diligent incorporation of checking logic designed to detect faults in the underlying circuits, which can be transient (due to charged particle strikes) or permanent (due to circuit failures).

In addition to thorough error-detection coverage, the designs strive for highly transparent error handling. Their capability to seamlessly detect and correct faults while applications run is essential to maintaining near-zero downtime. Meanwhile, system availability data is collected and monitored so unforeseen problems can be identified and timely updates provided.

Error Detection

The logic in typical processors comprises arrays, dataflow and control. Arrays are typically used to hold large, structured sets of data, such as caches. Error detection is implemented by including check bits with the data written to the array. They’re used to validate the data when it’s later read from the array.
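The write-then-validate idea described above can be illustrated with a small software model. This is only a sketch: the article does not say at this point which check-bit scheme the hardware uses, so a single even-parity bit per byte is assumed here, and the names `CheckedArray`, `ArrayCheckError`, `parity_bit` and `inject_bit_flip` are hypothetical and chosen for illustration.

```python
class ArrayCheckError(Exception):
    """Raised when stored data no longer matches its check bit."""


def parity_bit(byte: int) -> int:
    """Return the even-parity check bit for an 8-bit value."""
    return bin(byte & 0xFF).count("1") % 2


class CheckedArray:
    """Toy array that stores one parity bit alongside every byte (assumed scheme)."""

    def __init__(self, size: int) -> None:
        self._data = [0] * size
        self._check = [0] * size

    def write(self, index: int, value: int) -> None:
        # The check bit is generated from the data at write time.
        self._data[index] = value & 0xFF
        self._check[index] = parity_bit(value)

    def read(self, index: int) -> int:
        # The check bit is recomputed and compared at read time.
        value = self._data[index]
        if parity_bit(value) != self._check[index]:
            raise ArrayCheckError(f"parity mismatch at index {index}")
        return value

    def inject_bit_flip(self, index: int, bit: int) -> None:
        # Simulates a transient fault, such as a particle strike, in the data bits.
        self._data[index] ^= 1 << bit
```

In this model, a value written with `write` is returned unchanged by `read`, while a single bit flipped through `inject_bit_flip` causes the next `read` of that entry to raise `ArrayCheckError`. A single parity bit only detects an odd number of flipped bits and cannot correct anything; correcting codes are a separate technique that this sketch does not attempt to model.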
The two categories of check

[Figure 1: Instruction Retry. Flowchart labels: Any Error Detected; Block Checkpoint; Determine recoverability (branches: sparing, retry); Initiate Sparing; Write through to L3 any checkpointed storage updates; Notify L3 this core is now temporarily fenced off; Array structures re-initialized; Hardware states corrected / refreshed; Refresh starts; Notify L3 core is online; Start processing]
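The checkpoint-and-retry idea named in Figure 1 can be sketched in software as follows. This is a loose illustration under stated assumptions, not the processor's actual mechanism: `ToyCore`, `DetectedFault`, `add` and `flaky_add` are hypothetical names, the state is a single accumulator, and the sparing, L3 write-through and core-fencing steps from the figure are not modeled.

```python
import copy


class DetectedFault(Exception):
    """Raised when the (simulated) checking logic flags an error."""


class ToyCore:
    """Toy model: commit a checkpoint after each clean instruction, roll back on error."""

    def __init__(self) -> None:
        self.state = {"acc": 0}                      # working state
        self.checkpoint = copy.deepcopy(self.state)  # last known-good state
        self.retries = 0

    def run(self, instructions, max_retries: int = 1) -> None:
        for instr in instructions:
            attempts = 0
            while True:
                try:
                    instr(self.state)
                    # No error detected: advance the checkpoint.
                    self.checkpoint = copy.deepcopy(self.state)
                    break
                except DetectedFault:
                    # Error detected: the checkpoint is not advanced, and the
                    # possibly corrupted working state is discarded.
                    self.state = copy.deepcopy(self.checkpoint)
                    attempts += 1
                    self.retries += 1
                    if attempts > max_retries:
                        raise  # treated as not recoverable by retry


def add(n: int):
    """Return an instruction that adds n to the accumulator."""
    def instr(state) -> None:
        state["acc"] += n
    return instr


def flaky_add(n: int):
    """Like add(n), but the first execution corrupts state and reports a fault."""
    calls = 0

    def instr(state) -> None:
        nonlocal calls
        calls += 1
        if calls == 1:
            state["acc"] = -1  # simulated corrupted partial update
            raise DetectedFault("transient fault")
        state["acc"] += n
    return instr
```

Running `ToyCore().run([add(2), flaky_add(3), add(4)])` in this model leaves the accumulator at 9 with one retry recorded: the transient fault is absorbed because the corrupted update never reaches the checkpoint. A fault that persists across the retry is re-raised, which is where a real design would have to take a different recovery path.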
