Posts

Showing posts from March, 2024

Why RAG Answers Fall Short and How Query Enrichment Can Help

❓Why do users of RAG describe its answers as ‘correct but not useful’? How can the RAG pipeline be improved to avoid this? 😒 RAG user: ‘This answer is good, but it is not useful to me. It should have also mentioned a few more things.’ 🤔 RAG developer: ‘This looks perfectly good to me. I am amazed myself at how my program is able to produce such a beautiful answer.’ One will encounter many such interactions while putting RAG into production. During prototyping, when RAG starts producing correct answers, there is a feeling of awe and surprise. But when actual users start using it day to day, they find the answers very basic. They say that they can find these answers reasonably quickly without the tool. Why does this happen? 🤨 In my previous posts, I described how RAG works. It basically has three parts: 1️⃣ Retrieve the material relevant to the user's query. 2️⃣ Augment the prompt with the material. 3️⃣ Generate the answer using an LLM. Each of the above parts can be im...
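
The three parts named in this excerpt (retrieve, augment, generate) can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the function names, the word-overlap scoring and the `fake_llm` stand-in are all hypothetical and are not taken from the post. A real pipeline would call an actual retriever and an actual LLM.

```python
from typing import Callable, List


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Step 1: rank documents by how many query words they share (toy scoring)."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep the best top_k documents, dropping any with no overlap at all.
    return [doc for score, doc in scored[:top_k] if score > 0]


def augment(query: str, material: List[str]) -> str:
    """Step 2: put the retrieved material into the prompt, ahead of the question."""
    context = "\n".join(f"- {item}" for item in material)
    return f"Use only this material:\n{context}\n\nQuestion: {query}"


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Step 3: hand the augmented prompt to an LLM (any callable taking a string)."""
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, so the sketch runs offline."""
    return f"[answer based on a prompt of {len(prompt)} characters]"


def answer(query: str, documents: List[str], llm: Callable[[str], str] = fake_llm) -> str:
    """Run the three steps in order."""
    material = retrieve(query, documents)
    return generate(augment(query, material), llm)
```

Keeping each step as its own function makes it easy to swap in a better retriever, a richer prompt or a different model without touching the other two.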

AI learns language with the help of a child

Today I want to talk to you about an experiment conducted by some AI researchers to see if AI can learn language the way human babies do. With this in mind, they put a head-mounted camera on an infant to understand how human babies comprehend language with the help of the visual and auditory cues they get from their environment, and whether AI can learn along the same lines. We’ll discuss that experiment in a bit. If you come to think of it, learning a language is such a complex task, and yet human babies do it pretty easily! Contrary to the belief that parents or caretakers teach a child a language, children acquire language through interaction, not only with their parents and other adults but also with other children. All normal children who grow up in normal households, surrounded by conversation, will acquire the language that is being used around them. And it is just as easy for a child to acquire two or more languages at the same time, as long as they are regularly int...

From Demo to Deployment: Why Scaling RAG Is No Easy Task

❓RAG demos are easy. RAG in production is much harder. Why is that? Recently I wrote about Retrieval Augmented Generation (RAG), the technique that enables enterprises to leverage the power of large language models (link in comment). Many companies have found that their technology teams are able to demonstrate the use of RAG quite rapidly. However, when the time comes to put the solution into production, they face difficulties. The users do not accept the solution easily and many don’t think it adds a lot of value. We will first see why the demos are easy and then reflect on the challenge of productionising RAG. Why are RAG demos easy? 👉 Simple architecture: The principle of RAG is quite simple to understand. In the basic RAG pipeline, there are very few components: vector database, source chunking, vector matching and LLM interface. It is easy for engineers to understand the architecture. 👉 Framework availability: To make matters simpler, helper frameworks such as Lang...
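
To make the components listed in this excerpt concrete, here is a minimal sketch of source chunking and vector matching. It rests on stated assumptions: the "vectors" are plain word counts instead of learned embeddings, the "vector database" is an in-memory list, and every name (`chunk_text`, `embed`, `build_index`, `best_chunk`) is hypothetical and not taken from any framework.

```python
import math
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, chunk_size: int = 6) -> List[str]:
    """Source chunking: split the text into pieces of chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Vector matching: cosine similarity between two count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(text: str, chunk_size: int = 6) -> List[Tuple[str, Counter]]:
    """Stand-in for a vector database: a list of (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, chunk_size)]


def best_chunk(query: str, index: List[Tuple[str, Counter]]) -> str:
    """Return the stored chunk whose vector is closest to the query vector."""
    query_vector = embed(query)
    return max(index, key=lambda item: cosine(query_vector, item[1]))[0]
```

The chunk returned by `best_chunk` is what would then be passed to the LLM interface as context. Fixed-size word chunks are the simplest possible choice here; they can split a sentence in the middle, which is one reason a demo-grade pipeline may behave differently on real documents.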

What is RAG, the technique that helps companies to leverage Generative AI?

❓What is RAG, the technique that helps companies to leverage Generative AI? If you have used ChatGPT or Bard (now Gemini), you know how useful GenAI can be. It can answer your questions, summarize books 📘 and write articles. There are a host of other things that you can do with LLMs like ChatGPT and Gemini once you learn a bit about how to write prompts. All this is good for personal use. But what about the use of LLMs inside organizations 🏢? Not many applications of ChatGPT for companies come to mind. The idea 💡 that helped companies to make use of LLMs is ‘augmentation’. In simple words, it means including additional information in the prompt. This additional information comes from a source within the company. It can be a contract 📃, records from a database or even a PowerPoint presentation. We can also combine information from many sources and include it in the prompt. Of course, the additional information must first be fetched from the right source. A CXO wants to know the impli...