IBM Systems Magazine, Mainframe - November/December 2014 - (Page 33)

TECH CORNER

Bringing Value With HADOOP

No matter where your data resides, get the most out of it with System z

Vic Leith is a senior software engineer at IBM's Competitive Project Office. John Thomas is a senior technical staff member at IBM's Competitive Project Office.

Most enterprise operational data around the world is stored on System z* servers. Enterprises rely on this structured data for their day-to-day operations. But with the exponential growth of both semi-structured and unstructured forms of data from mobile, social media and various systems of engagement, deriving new business insights from all of the available data is becoming increasingly difficult.

Companies can build a 360-degree view of their customers, giving them a competitive edge when identifying opportunities and allowing them to better serve those customers. One way to do so is with Apache Hadoop, an open-source framework that has become popular for processing various data types, particularly semi-structured and unstructured data. IBM provides an implementation of Hadoop called IBM InfoSphere* BigInsights* that is available on multiple platforms.

A spectrum of use cases ties Hadoop processing to the System z platform. This article focuses on two categories of use cases: data that originates on the mainframe and data that originates outside the mainframe (see Figures 1 and 2, page 34).

Hadoop With System z

The first category deals with structured and semi-structured data that originates on the mainframe. Operational data from System z in the form of log files, XML, VSAM, DB2* data, IMS* data, etc., can be brought into a Hadoop environment and processed using Hadoop tools. The result set may be consumed by System z or used downstream.

The second category deals with large amounts of semi-structured or unstructured data that originates outside System z but must be integrated with operational data on the mainframe.
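To make the second category concrete, here is a minimal sketch of a MapReduce-style join that merges operational records with semi-structured events from outside the mainframe. It is written in plain Python and does not use any Hadoop or BigInsights API. The record layouts, field names (customer_id, channel) and function names are all hypothetical and chosen only for illustration.

```python
import json
from collections import defaultdict

# Hypothetical operational records, standing in for structured data
# exported from the mainframe (for example, rows from a DB2 table).
OPERATIONAL_ROWS = [
    {"customer_id": "C001", "name": "Ada", "balance": 1200.50},
    {"customer_id": "C002", "name": "Grace", "balance": 87.25},
]

# Hypothetical semi-structured events that originate outside the
# mainframe, one JSON document per line. The last line is malformed
# on purpose to show that bad input is skipped.
EXTERNAL_EVENTS = [
    '{"customer_id": "C001", "channel": "mobile", "text": "login failed"}',
    '{"customer_id": "C001", "channel": "social", "text": "great service"}',
    '{"customer_id": "C002", "channel": "social", "text": "slow transfer"}',
    'malformed line without JSON structure',
]


def map_operational(row):
    """Map step: emit (customer_id, tagged profile) for an operational row."""
    yield row["customer_id"], ("profile", row)


def map_external(line):
    """Map step: parse one external line and emit (customer_id, tagged event)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return  # skip lines that are not valid JSON
    if isinstance(event, dict) and "customer_id" in event:
        yield event["customer_id"], ("event", event)


def shuffle(pairs):
    """Group mapped values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_customer(customer_id, values):
    """Reduce step: merge one customer's profile with a summary of its events."""
    profile = None
    events = []
    for tag, payload in values:
        if tag == "profile":
            profile = payload
        else:
            events.append(payload)
    return {
        "customer_id": customer_id,
        "profile": profile,
        "event_count": len(events),
        "channels": sorted({e.get("channel", "unknown") for e in events}),
    }


def build_customer_view(rows, lines):
    """Run map, shuffle and reduce in-process and return one record per customer."""
    pairs = []
    for row in rows:
        pairs.extend(map_operational(row))
    for line in lines:
        pairs.extend(map_external(line))
    grouped = shuffle(pairs)
    return [reduce_customer(cid, vals) for cid, vals in sorted(grouped.items())]
```

Calling build_customer_view(OPERATIONAL_ROWS, EXTERNAL_EVENTS) returns one merged record per customer ID. In a real cluster the map and reduce steps would run as distributed tasks over far larger inputs; the sketch only shows the shape of the join.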
Again, the processing is done in the Hadoop environment and the result set can be consumed by System z or downstream. A combination of these categories is also possible.

Distributed Hadoop implementations are widespread; however, customers face challenges when they use a hybrid environment. Two common challenges are:

1. Data governance. Data is considered secure as long as it's on the mainframe. The System z platform must be in control of the data. System z administrators have well-established standards that govern processing policies, security standards, and regulatory and auditing requirements.

2. Ingestion of data from System z into the Hadoop environment. We need high-speed or optimized connectors between traditional z/OS* LPARs and a Hadoop environment under System z control.

Solutions can address the challenges of Hadoop processing in conjunction with System z, such as:

• An on-premises Hadoop appliance under System z control
• An on-premises Hadoop cluster built on discrete hardware, under System z control
• An on-premises Hadoop cluster built with Linux* on System z nodes
• Off-premises Hadoop environments under System z control. These clusters can be appliances, discrete nodes or cloud-based nodes
• On-premises distributed hybrid. This environment consists of a z/OS LPAR with an on-premises

In each of these examples, two key requirements are the capability to efficiently extract and load data, and end-to-end processing, which includes data loading, job submission and results retrieval using System z credentials and under System z governance.

The IBM Competitive Project Office conducted a series of tests that studied several of the aforementioned hybrid Hadoop options under System z governance (see Figure 3, page 34), including:

