IBM Systems Magazine, Mainframe - May/June 2017 - 16

R&D: Cognitive Conditioning: How IBM researchers use machine learning to make storage smarter

"We're looking at adding this new notion
of data value because, although some
correlation might exist between the value
and the popularity of the same piece of
data, not all data that's popular is
valuable and vice versa."
-Giovanni Cherubini, data storage scientist, IBM

metadata, and what their importance is; it's essentially teaching
itself. Whenever a new file comes
in, the system looks at its metadata and learns to detect similar
patterns identified as being
important or not important. This
would potentially create an estimation of the data value. That's
then used by the selector, which
decides on the type of storage
policy to implement on a piece of
new data with a certain estimated
data value, what medium to store
it on and the type of redundancy
that should be involved.
Another important part of this
system is the feedback loop, which
aims to periodically reassess data
value to determine whether it has been
misclassified as important. It can also track changes in
data value over time. Different
types of data exist. Some, called
must-keep data, are critical to a business and must be reliably stored for extended periods
of time. Other types of data have
an expiration date, such as data collected
for regulatory reasons for a certain
number of years.
Business or experimental data
is another type where we think
cognitive storage might be most
appropriate. Businesses collect
this all the time, and it usually has
immediate value. For example,
data collected by an online store
about what's in your shopping cart
can be used for billing, checking
out, shipping and so on. But it
might also have value later, in the
sense that data analytics can be
performed on shopping cart data
from multiple consumers over long
periods of time to gain insight into
which customers buy what, to identify
markets and so on. This is how data
value can change over time, which is where
reassessment comes into play. This
is a key aspect of our cognitive
storage concept.
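
A minimal sketch, in Python, of how the selector and feedback loop described here might fit together. Everything in it is an illustrative assumption rather than IBM's implementation: the `StoragePolicy` fields, the tier names, the replica counts and the 0.4 and 0.8 thresholds are hypothetical, and the estimated data value is assumed to be a score between 0 and 1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    medium: str    # hypothetical storage tier name
    replicas: int  # hypothetical redundancy level


def select_policy(value: float) -> StoragePolicy:
    """Map an estimated data value in [0, 1] to a storage policy.

    The thresholds and tiers are made up for illustration.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0 and 1")
    if value >= 0.8:
        return StoragePolicy(medium="flash", replicas=3)
    if value >= 0.4:
        return StoragePolicy(medium="disk", replicas=2)
    return StoragePolicy(medium="tape", replicas=1)


def reassess(current: StoragePolicy, new_value: float) -> tuple[StoragePolicy, bool]:
    """Feedback step: re-run the selector and report whether the policy changed."""
    updated = select_policy(new_value)
    return updated, updated != current


# Shopping-cart data with made-up scores: one estimate now, a higher one later.
policy = select_policy(0.5)
policy, moved = reassess(policy, 0.9)
print(policy, moved)
```

Keeping the selector a pure function of the estimated value means a periodic reassessment only has to produce a new estimate; the same rule then decides whether the data should move.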
ISM: As part of your paper,
"Cognitive Storage for Big
Data" in IEEE Computer magazine, you did a cognitive
storage demonstration that
was nearly 100 percent accurate. What was involved?
VV: Our test server had about 1.7
million files on it, and we asked
our colleagues to label the files as
belonging to one project or another. The users classified the files
into three projects. Then metadata
was collected from the file system:
file size, extension and name, the
path to the file in the directory
structure, the user who owns the
file, the group to which the user
belongs, access permission for the
file, and so on.
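
The file-system metadata listed above can be read with the Python standard library. The sketch below only illustrates collecting such fields for a single file; the function name `file_metadata` and the dictionary keys are assumptions, not the format used in the demonstration.

```python
import stat
import tempfile
from pathlib import Path


def file_metadata(path: Path) -> dict:
    """Collect size, extension, name, path, owner, group and permissions for one file."""
    info = path.stat()
    return {
        "name": path.name,
        "extension": path.suffix.lower(),
        "directory": str(path.parent),
        "size_bytes": info.st_size,
        "owner_uid": info.st_uid,   # numeric user ID of the owner
        "group_gid": info.st_gid,   # numeric group ID
        "permissions": stat.filemode(info.st_mode),  # e.g. "-rw-r--r--"
    }


# Demonstrate on a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "results.CSV"
    sample.write_bytes(b"a,b\n1,2\n")
    record = file_metadata(sample)
    print(record)
```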
We fed this information to a
supervised machine-learning
algorithm called information
bottleneck. For each training file, it is
given the metadata, such as the file
size, the file's owner and the
extension, along with the project the file
belongs to. Of the 1.7 million files
we had, approximately 170,000
files were used to train the system.
All 170,000 files were labeled
with the project they belonged
to, and the learning algorithm
searched for patterns in the metadata that would indicate which
project a file belonged to. Keep in
mind that one of the projects had
only 157 files out of the total of
1.7 million files, while others had
hundreds of thousands of files.
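
One way to draw a training set of roughly 10 percent when project sizes differ this widely is to sample within each project. The following Python sketch does that, rounding up so that even a 157-file project contributes examples. The function name, the toy project sizes other than 157 and the round-up rule are assumptions for illustration; the sketch covers only the sampling step and does not implement information bottleneck.

```python
import random
from collections import defaultdict


def stratified_sample(labels, percent=10, seed=0):
    """Return (train_ids, rest_ids), drawing `percent` of the files from each project.

    `labels` maps a file ID to its project label. Integer ceiling division
    keeps at least one training file for every non-empty project.
    """
    by_project = defaultdict(list)
    for file_id, project in labels.items():
        by_project[project].append(file_id)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train = set()
    for project in sorted(by_project):
        ids = sorted(by_project[project])
        k = (len(ids) * percent + 99) // 100  # ceil(len * percent / 100)
        train.update(rng.sample(ids, k))
    rest = set(labels) - train
    return train, rest


# Toy data: project sizes are made up, except for the 157-file project.
labels = {}
for project, count in [("A", 157), ("B", 5000), ("C", 3000)]:
    for i in range(count):
        labels[f"{project}/{i}"] = project

train, rest = stratified_sample(labels)
small_in_train = sum(1 for f in train if labels[f] == "A")
print(len(train), len(rest), small_in_train)
```

Sampling uniformly across all files instead could, by chance, leave the smallest project with very few training examples, which is why the sketch samples project by project.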
We chose 10 percent of the
files from each of these projects
to train the system using information bottleneck. Then we
took the remainder of the files
and asked the system to predict, based on what it learned,
which project it thought a new
file (that it hadn't seen before)
belonged to. Because we had the
project information for all 1.7
million files, we could compare

