Thursday 20 June 2024

The recipe for RAG: How cloud services enable generative AI outcomes across industries


According to research from IBM, about 42 percent of enterprises surveyed have AI in use in their businesses. Of all the use cases, many of us are now familiar with natural language processing AI chatbots that can answer our questions and assist with tasks such as composing emails or essays. Yet even with widespread adoption of these chatbots, enterprises still face challenges. For example, these chatbots can produce inconsistent results because they pull from large data stores that might not be relevant to the query at hand.

Thankfully, retrieval-augmented generation (RAG) has emerged as a promising solution to ground large language models (LLMs) on the most accurate, up-to-date information. As an AI framework, RAG works to improve the quality of LLM-generated responses by grounding the model on sources of knowledge to supplement the LLM’s internal representation of information. IBM unveiled its new AI and data platform, watsonx, which offers RAG, back in May 2023.

In simple terms, leveraging RAG is like having the model take an open-book exam: you are asking the chatbot to respond to a question with all the relevant information readily available. But how does RAG operate at an infrastructure level? With a mixture of platform-as-a-service (PaaS) services, RAG can run successfully and with ease, enabling generative AI outcomes for organizations across industries using LLMs.
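The "open book exam" idea can be sketched in a few lines: retrieve the document most relevant to the question, then ground the prompt on it. This toy uses simple keyword overlap for retrieval; a real system would use vector embeddings and pass the prompt to an actual LLM (for example, via watsonx) rather than just printing it.

```python
# Toy RAG sketch: retrieve the most relevant document for a query,
# then ground the model's prompt on it.

def retrieve(query: str, documents: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda d: len(query_words & set(d.lower().split())))

def build_prompt(query: str, context: str) -> str:
    """Ground the model: the retrieved context rides along with the question."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm, Monday through Friday.",
]
query = "What is the refund policy?"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)
```

Because the retrieved passage is injected at query time, the answer reflects the organization's own up-to-date sources rather than only what the model memorized during training.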

How PaaS services are critical to RAG


Enterprise-grade AI, including generative AI, requires a highly sustainable, compute- and data-intensive distributed infrastructure. While the AI model is the key component of the RAG framework, other "ingredients" such as PaaS solutions are integral to the mix. These offerings, specifically serverless and storage offerings, operate diligently behind the scenes, enabling data to be processed and stored more easily, which in turn produces increasingly accurate outputs from chatbots.

Serverless technology supports compute-intensive workloads, such as those brought forth by RAG, by managing and securing the infrastructure around them. This gives time back to developers, so they can concentrate on coding. Serverless enables developers to build and run application code without provisioning or managing servers or backend infrastructure.

If a developer is uploading data into an LLM or chatbot but is unsure of how to preprocess the data so it’s in the right format or filtered for specific data points, IBM Cloud Code Engine can do all this for them—easing the overall process of getting correct outputs from AI models. As a fully managed serverless platform, IBM Cloud Code Engine can scale the application with ease through automation capabilities that manage and secure the underlying infrastructure.
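The kind of preprocessing described above can be illustrated with a short sketch: filter out records too short to be useful, then split each document into overlapping passage-sized chunks for retrieval. This is a generic example of the sort of job a serverless platform such as Code Engine might run; the function names, chunk size, and length filter are illustrative values, not part of any IBM API.

```python
# Illustrative preprocessing before documents are indexed for RAG:
# drop very short records, then split each document into overlapping
# word chunks so retrieval can work at passage level.

def chunk_text(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words, overlapping by `overlap`."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

def preprocess(records: list[str], min_words: int = 5) -> list[str]:
    """Drop records too short to be useful, then chunk the rest."""
    chunks = []
    for record in records:
        if len(record.split()) >= min_words:
            chunks.extend(chunk_text(record))
    return chunks

chunks = preprocess(["too short", "word " * 120])
print(len(chunks))  # the 120-word record becomes several overlapping chunks
```

The overlap between chunks helps a retriever find passages whose relevant sentence straddles a chunk boundary.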

Additionally, if a developer is uploading the sources for LLMs, it’s important to have highly secure, resilient and durable storage. This is especially critical in highly regulated industries such as financial services, healthcare and telecommunications.

IBM Cloud Object Storage, for example, provides security and data durability to store large volumes of data. With immutable data retention and audit control capabilities, IBM Cloud Object Storage supports RAG by helping to safeguard your data from tampering or manipulation by ransomware attacks and helps ensure it meets compliance and business requirements.
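The value of immutable retention can be seen in a toy write-once store: once an object is written, overwrites are refused. This is not the IBM Cloud Object Storage API; real immutability is enforced server-side by the service's retention policies, and this pure-Python class only illustrates the behavior.

```python
# Toy illustration of write-once ("immutable") retention, the property
# that helps protect RAG source data from tampering or ransomware.

class WriteOnceStore:
    def __init__(self):
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        """Store an object; refuse to overwrite one under retention."""
        if key in self._objects:
            raise PermissionError(f"object '{key}' is under retention; overwrite denied")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

store = WriteOnceStore()
store.put("rulings/2024-06.pdf", b"original contents")
try:
    store.put("rulings/2024-06.pdf", b"tampered contents")
except PermissionError as err:
    print(err)
print(store.get("rulings/2024-06.pdf"))  # the original data survives intact
```

For a RAG pipeline, the practical effect is that the grounding documents an LLM retrieves cannot be silently altered after ingestion.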

With IBM’s vast technology stack including IBM Cloud Code Engine and IBM Cloud Object Storage, organizations across industries can seamlessly tap into RAG and focus on leveraging AI more effectively for their businesses.

The power of cloud and AI in practice


We’ve established that RAG is extremely valuable for enabling generative AI outcomes, but what does this look like in practice?

Blendow Group, a leading provider of legal services in Sweden, handles a diverse array of legal documents—dissecting, summarizing and evaluating documents that range from court rulings to legislation and case law. With a relatively small team, Blendow Group needed a scalable solution to aid their legal analysis. Working with IBM Client Engineering and NEXER, Blendow Group created an innovative AI-driven tool, leveraging the comprehensive capabilities of IBM watsonx to enhance research and analysis and streamline the process of creating legal content, all while maintaining the utmost confidentiality of sensitive data.

Utilizing IBM’s technology stack, including IBM Cloud Object Storage and IBM Cloud Code Engine, the AI solution was tailored to increase the efficiency and breadth of Blendow’s legal document analysis.

The Mawson’s Huts Foundation is also an excellent example of leveraging RAG to enable greater AI outcomes. The foundation is on a mission to preserve the Mawson legacy, which includes Australia’s 42 percent territorial claim to the Antarctic, and to educate schoolchildren and others about Antarctica itself and the importance of sustaining its pristine environment.

With The Antarctic Explorer, an AI-powered learning platform running on IBM Cloud, the foundation is giving children and others access to Antarctica from a browser, wherever they are. Users submit questions via a browser-based interface, and the learning platform uses AI-powered natural language processing capabilities provided by IBM watsonx Assistant to interpret the questions and deliver appropriate answers with associated media—videos, images and documents—that are stored in and retrieved from IBM Cloud Object Storage.
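The flow just described—interpret a question, match it to a known topic, and return the answer along with keys of associated media to fetch from object storage—can be sketched as follows. The topics, answers and media keys below are made up for illustration; a production system would use watsonx Assistant for intent recognition rather than keyword matching.

```python
# Hypothetical sketch of a question-answering flow that returns an
# answer plus the storage keys of associated media to display.

TOPICS = {
    "penguins": ("Emperor penguins breed on Antarctic sea ice during winter.",
                 ["media/penguins.mp4", "media/penguin-colony.jpg"]),
    "mawson": ("Douglas Mawson led the 1911-14 Australasian Antarctic Expedition.",
               ["media/mawson-hut.jpg"]),
}

def answer(question: str) -> tuple[str, list[str]]:
    """Match the question to the first topic keyword it mentions."""
    q = question.lower()
    for keyword, (text, media_keys) in TOPICS.items():
        if keyword in q:
            # In a real deployment, media_keys would be fetched from
            # object storage and streamed back with the answer.
            return text, media_keys
    return "Sorry, I don't know that one yet.", []

text, media = answer("Tell me about penguins!")
print(text, media)
```

Keeping the media in object storage and only the keys in the answering logic is what lets the same platform serve videos, images and documents without bundling them into the application.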

By leveraging platform-as-a-service offerings in tandem with watsonx, both the Mawson’s Huts Foundation and Blendow Group are able to gain greater insights from their AI models by easing the process of managing and storing the data that feeds them.

Enabling generative AI outcomes with the cloud


Generative AI and LLMs have already proven to have great potential for transforming organizations across industries. Whether it’s educating the wider population or analyzing legal documents, PaaS solutions within the cloud are critical for the success of RAG and running AI models.

At IBM, we believe that AI workloads will likely form the backbone of mission-critical operations and ultimately house and manage the most-trusted data, so the infrastructure around them must be trustworthy and resilient by design. With IBM Cloud, enterprises across industries using AI can tap into higher levels of resiliency, performance, security, compliance and total cost of ownership.

Source: ibm.com
