Redis Iris: Beyond RAG with Agentic Context Engines | Tom Karels



This video explores how Retrieval-Augmented Generation (RAG) is evolving into more complex agentic retrieval systems, focusing on the newly announced Redis Iris. Although many claim that traditional RAG is dead, enterprises are in fact shifting toward a knowledge or context layer that sits between AI agents and their underlying data sources. The shift is driven by the need to solve common production issues such as stale data, slow retrieval, and fragmented memory, which often plague basic RAG implementations. The presenter discusses how pioneers in high-speed data retrieval are converging on solutions that give agents a navigable path through business entities rather than simple vector search alone. Redis Iris is introduced as an end-to-end context engine designed to function at scale by meeting four requirements: navigability, speed, freshness, and self-improvement over time. The video breaks down the components of the Iris stack: Redis Data Integration (RDI) for real-time syncing, the Redis Context Retriever for entity mapping, Redis Agent Memory, and LangCache for semantic caching. By mirroring operational data into a high-speed Redis environment, developers can give agents a flattened, denormalized view of their business without overloading transactional systems. Finally, the video contrasts the runtime-focused architecture of Redis Iris with build-time solutions like Pinecone Nexus, helping developers choose the right tool for their data environment.

AI Agents, Agentic RAG, Redis Iris

The video provides a deep dive into Redis Iris, a newly announced end-to-end context engine designed to power the next generation of AI agents. It covers how Redis Iris addresses the shortcomings of traditional RAG (Retrieval-Augmented Generation) by providing a high-speed, real-time knowledge layer that mirrors operational data. By moving away from simple vector search and toward a structured, navigable context layer, Redis Iris allows AI agents to reason over business entities like customers, orders, and tickets with much higher reliability and lower latency. This architecture is particularly suited for production environments where data changes rapidly and agents need a fresh, consistent view of the world to function effectively.
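To make the idea of a navigable context layer more concrete, here is a minimal sketch in Python. It does not use any Redis Iris API: the `Entity` and `ContextLayer` classes, the key format, and the `ticket_subjects_for_customer` helper are all hypothetical names chosen for illustration, and a plain dictionary stands in for Redis. It only shows how an agent tool could follow named relationships from a customer to orders to tickets instead of running a similarity search.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A business entity with attributes and named links to other entities."""
    kind: str
    id: str
    attrs: dict
    links: dict = field(default_factory=dict)  # relation name -> list of entity keys


class ContextLayer:
    """Hypothetical in-memory stand-in for a context layer (not a Redis API)."""

    def __init__(self):
        self._entities = {}

    def put(self, entity):
        self._entities[f"{entity.kind}:{entity.id}"] = entity

    def get(self, key):
        return self._entities.get(key)

    def follow(self, key, relation):
        """Return the entities reachable from `key` through `relation`."""
        entity = self._entities.get(key)
        if entity is None:
            return []
        return [
            self._entities[k]
            for k in entity.links.get(relation, [])
            if k in self._entities
        ]


def ticket_subjects_for_customer(layer, customer_key):
    """Example tool: walk customer -> orders -> tickets and collect subjects."""
    subjects = []
    for order in layer.follow(customer_key, "orders"):
        for ticket in layer.follow(f"{order.kind}:{order.id}", "tickets"):
            subjects.append(ticket.attrs["subject"])
    return subjects


layer = ContextLayer()
layer.put(Entity("customer", "42", {"name": "Ada"}, {"orders": ["order:1001"]}))
layer.put(Entity("order", "1001", {"status": "shipped"}, {"tickets": ["ticket:7"]}))
layer.put(Entity("ticket", "7", {"subject": "Late delivery"}))

print(ticket_subjects_for_customer(layer, "customer:42"))
```

The design point is that the traversal path is explicit and deterministic, so the agent asks for "the tickets on this customer's orders" rather than hoping a vector search returns the right chunks.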

Key Takeaways

Traditional RAG is evolving into agentic retrieval, which utilizes a dedicated context layer between the agent and data sources.
Redis Iris is an end-to-end context engine that focuses on four pillars: navigability, fast retrieval, data freshness, and self-improvement.
Redis Data Integration (RDI) uses change data capture (CDC) to mirror operational databases such as Postgres or Oracle into Redis in real time.
The Context Retriever gives agents predefined entities and tools (exposed via MCP or a CLI) for traversing complex data relationships.
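The change data capture mirroring mentioned in the takeaways can be sketched in a few lines of Python. This is an illustration of the general CDC pattern only, not RDI's actual event format or API: the event shape (`op`, `table`, `pk`, `row`), the `apply_change` function, and the `table:pk` key scheme are assumptions, and a dictionary stands in for Redis.

```python
def apply_change(mirror, event):
    """Apply one change event to the mirror (a dict standing in for Redis).

    The event layout here is hypothetical; real CDC tools define their own.
    """
    key = f"{event['table']}:{event['pk']}"
    op = event["op"]
    if op in ("insert", "update"):
        # Merge changed columns into the mirrored row so partial updates work.
        row = dict(mirror.get(key, {}))
        row.update(event["row"])
        mirror[key] = row
    elif op == "delete":
        mirror.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op}")


# A short stream of changes as they might arrive from the source database log.
events = [
    {"op": "insert", "table": "orders", "pk": 1001, "row": {"status": "pending", "total": 30}},
    {"op": "update", "table": "orders", "pk": 1001, "row": {"status": "shipped"}},
    {"op": "insert", "table": "orders", "pk": 1002, "row": {"status": "pending", "total": 12}},
    {"op": "delete", "table": "orders", "pk": 1002, "row": {}},
]

mirror = {}
for event in events:
    apply_change(mirror, event)

print(mirror)
```

Because the mirror is updated from the change stream rather than by querying the source, agent reads hit the mirror and add no query load to the transactional database.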

Timestamps

00:00 Introduction: Discussing the "RAG is dead" sentiment and the shift to agentic retrieval.
00:56 The Need for Context Engines: Explaining why production AI agents often underdeliver and the problems they face.
02:12 Requirements for Scale: The four requirements for agents to function at scale: navigability, speed, freshness, and improvement.
03:06 Redis Iris Stack Overview: Breaking down RDI, Context Retriever, Agent Memory, and LangCache.
04:24 Redis Data Integration (RDI): How RDI mirrors operational data into Redis in real time.
05:34 Context Retriever & Tools: Defining business entities and providing agents with tools to query them.
06:29 Memory & Caching: Deep dive into Redis Agent Memory and LangCache semantic caching.
09:03 Comparison with Pinecone Nexus: Contrasting build-time vs. runtime knowledge layer architectures.

Target Audience

AI engineers, software architects, and data scientists building production-grade LLM applications who are struggling with data freshness and retrieval latency.

Use Cases

- Building real-time customer support bots that need access to live order data
- Developing internal enterprise search tools that require complex entity relationships
- Optimizing LLM costs and latency through semantic caching
- Implementing long-term memory for personalized AI assistants
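As a rough illustration of the semantic caching use case, the sketch below returns a stored answer when a new prompt is close enough to one seen before, and only calls the model on a miss. It is not LangCache's API: `SemanticCache`, `embed`, `fake_llm`, and the 0.8 threshold are hypothetical, and the bag-of-words "embedding" is a deliberately crude stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A real system would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical cache keyed by prompt similarity rather than exact text."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response)

    def get(self, prompt):
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        self.entries.append((embed(prompt), response))


calls = []  # records every prompt that actually reached the "model"


def fake_llm(prompt):
    """Placeholder for an expensive model call."""
    calls.append(prompt)
    return f"answer to: {prompt}"


def answer(cache, prompt):
    cached = cache.get(prompt)
    if cached is not None:
        return cached
    response = fake_llm(prompt)
    cache.put(prompt, response)
    return response


cache = SemanticCache()
first = answer(cache, "What is the status of order 1001?")
second = answer(cache, "what is the status of order 1001")  # near-duplicate prompt
```

The threshold is the main design choice: set too low, the cache returns answers to questions that only look similar; set too high, it rarely hits and saves little cost or latency.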

Key Topics

Agentic Retrieval Architecture