Tuesday 25 June 2024

Speed, scale and trustworthy AI on IBM Z with Machine Learning for IBM z/OS v3.2

Recent years have seen a remarkable surge in AI adoption, with businesses doubling down. According to the IBM® Global AI Adoption Index, about 42% of enterprise-scale companies surveyed (more than 1,000 employees) report having actively deployed AI in their business. Of the surveyed companies that are already exploring or deploying AI, 59% say they have accelerated their rollout or investments in the technology. Yet amid this surge, companies still face significant challenges: the complexity of AI implementation, scalability issues and validating the trustworthiness of AI.

A robust and scalable environment is crucial to accelerating client adoption of AI. It must be capable of converting ambitious AI use cases into reality while enabling real-time AI insights to be generated with trust and transparency.  

What is Machine Learning for IBM z/OS? 


Machine Learning for IBM® z/OS® is an AI platform tailor-made for IBM z/OS environments. It combines data and transaction gravity with AI infusion for accelerated insights at scale with trust and transparency. It helps clients manage their full AI model lifecycles, enabling quick deployment co-located with their mission-critical applications on IBM Z, without data movement and with minimal application changes. Features include explainability, drift detection, train-anywhere capabilities and developer-friendly APIs.

Machine Learning for IBM z/OS use cases


Machine Learning for IBM z/OS can serve various transactional use cases on IBM z/OS. Top use cases include:

1. Real-time fraud detection in credit cards and payments: Large financial institutions are experiencing growing losses due to fraud. With off-platform solutions, they were able to screen only a small subset of their transactions. In support of this use case, the IBM z16™ system can process up to 228 thousand z/OS CICS credit card transactions per second with 6 ms response time, each with an in-transaction fraud detection inference operation using a deep learning model.

Performance result is extrapolated from IBM internal tests running a CICS credit card transaction workload with inference operations on IBM z16. A z/OS V2R4 logical partition (LPAR) configured with 6 CPs and 256 GB of memory was used. Inferencing was done with Machine Learning for IBM z/OS running on WebSphere Application Server Liberty 21.0.0.12, using a synthetic credit card fraud detection model and the IBM Integrated Accelerator for AI. Server-side batching was enabled on Machine Learning for IBM z/OS with a size of 8 inference operations. The benchmark was run with 48 threads performing inference operations. Results represent a fully configured IBM z16 with 200 CPs and 40 TB storage. Results might vary.

2. Clearing and settlement: A card processor explored using AI to assist in determining which trades and transactions have a high-risk exposure before settlement to reduce liability, chargebacks and costly investigation. In support of this use case, IBM has validated that the IBM z16 with Machine Learning for IBM z/OS is designed to score business transactions at scale delivering the capacity to process up to 300 billion deep inferencing requests per day with 1 ms of latency.

Performance result is extrapolated from IBM internal tests running local inference operations in an IBM z16 LPAR with 48 IFLs and 128 GB memory on Ubuntu 20.04 (SMT mode) using a synthetic credit card fraud detection model exploiting the Integrated Accelerator for AI. The benchmark was running with 8 parallel threads, each pinned to the first core of a different chip. The lscpu command was used to identify the core-chip topology. A batch size of 128 inference operations was used. Results were also reproduced using a z/OS V2R4 LPAR with 24 CPs and 256 GB memory on IBM z16. The same credit card fraud detection model was used. The benchmark was run with a single thread performing inference operations. A batch size of 128 inference operations was used. Results might vary. 
 
3. Anti-money laundering: A bank was exploring how to introduce AML screening into its instant payments operational flow. Its existing end-of-day AML screening was no longer sufficient due to stricter regulations. In support of this use case, IBM has demonstrated that the IBM z16 with z/OS delivers up to 20x lower response time and up to 19x higher throughput when colocating applications and inferencing requests versus sending the same inferencing requests to a compared x86 server in the same data center with 60 ms average network latency.

Performance results are based on IBM internal tests using a CICS OLTP credit card workload with in-transaction fraud detection. A synthetic credit card fraud detection model was used. On IBM z16, inferencing was done with MLz on zCX. TensorFlow Serving was used on the compared x86 server. A Linux on IBM Z LPAR, located on the same IBM z16, was used to bridge the network connection between the measured z/OS LPAR and the x86 server. Additional network latency was introduced with the Linux “tc-netem” command to simulate a network environment with 60 ms average latency. Measured improvements are due to network latency. Results might vary. IBM z16 configuration: Measurements were run using a z/OS (V2R4) LPAR with MLz (OSCE) and zCX with APAR OA61559 and APAR OA62310 applied, 8 CPs, 16 zIIPs and 8 GB of memory. x86 configuration: TensorFlow Serving 2.4 ran on Ubuntu 20.04.3 LTS on 8 Skylake Intel® Xeon® Gold CPUs @ 2.30 GHz with Hyperthreading turned on, 1.5 TB memory, RAID5 local SSD storage.
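To put the throughput figures quoted in the use cases above on a common footing, the short sketch below converts between per-second and per-day rates. It is only a unit conversion of the numbers stated in this article (228 thousand transactions per second and 300 billion inference requests per day). It assumes a rate sustained over a full 24-hour day, which the benchmark notes do not claim, and the function names are illustrative rather than part of any IBM tool.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_day(rate_per_second: int) -> int:
    """Scale a per-second rate to a full day (assumes a constant rate)."""
    return rate_per_second * SECONDS_PER_DAY


def per_second(rate_per_day: int) -> float:
    """Average per-second rate implied by a per-day total."""
    return rate_per_day / SECONDS_PER_DAY


# Figures quoted in the article; the two come from different test setups.
fraud_transactions_per_second = 228_000
settlement_requests_per_day = 300_000_000_000

print(f"{per_day(fraud_transactions_per_second):,} transactions per day")
print(f"{per_second(settlement_requests_per_day):,.0f} inference requests per second")
```

Under that assumption, the fraud detection figure corresponds to roughly 19.7 billion transactions per day, and the settlement figure to roughly 3.5 million inference requests per second. The two results come from different workloads and configurations, so they are not directly comparable.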

Machine Learning for IBM z/OS with IBM Z can also be used as a security-focused on-prem AI platform for other use cases where clients want to promote data integrity, privacy and application availability. The IBM z16 systems, with GDPS®, IBM DS8000® series storage with HyperSwap® and running a Red Hat® OpenShift® Container Platform environment, are designed to deliver 99.99999% availability.

Necessary components include IBM z16; IBM z/VM V7.2 systems or above collected in a Single System Image, each running RHOCP 4.10 or above; IBM Operations Manager; GDPS 4.5 for management of data recovery and virtual machine recovery across metro distance systems and storage, including Metro Multisite workload and GDPS Global; and IBM DS8000 series storage with IBM HyperSwap. A MongoDB v4.2 workload was used. Necessary resiliency technology must be enabled, including z/VM Single System Image clustering, GDPS xDR Proxy for z/VM and Red Hat OpenShift Data Foundation (ODF) 4.10 for management of local storage devices. Application-induced outages are not included in the preceding measurements. Results might vary. Other configurations (hardware or software) might provide different availability characteristics. 
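For a sense of scale, the 99.99999% availability design point can be translated into a yearly downtime budget. The calculation below is plain arithmetic on the quoted percentage, assuming a 365-day year; the function name is illustrative and it does not model how IBM measures availability.

```python
from decimal import Decimal

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, assuming a non-leap year


def downtime_seconds_per_year(availability_percent: str) -> Decimal:
    """Downtime per year implied by an availability percentage.

    The percentage is passed as a string so Decimal keeps it exact
    (binary floats cannot represent 99.99999 precisely).
    """
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable_fraction * SECONDS_PER_YEAR


print(downtime_seconds_per_year("99.99999"), "seconds per year")
```

Seven nines works out to a little over 3 seconds of downtime per year, compared with about 5 minutes for the more commonly quoted "five nines" (99.999%).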

Source: ibm.com
