2024-04-17: RAG against the Machine

The Truth Serum for Structured Outputs

Here’s today at a glance:

🖲️ RAG Against the Machine

Interesting applied LLM product work from ServiceNow, which recently reported large revenue increases from its AI products. The reduction in hallucinations they achieve is significant, but still a work in progress.

Research Paper Title:

An LLM that could do this perfectly would be a significant result.

What products does this enable?

Near Term:

  • Natural Language to Workflow Tools: Develop applications that allow users to create workflows using natural language instructions, simplifying and democratizing process automation.

  • Low-Resource LLM Applications: Implement RAG with smaller LLMs and retrievers to enable deployment in resource-constrained environments.

  • Customizable Enterprise Workflows: Build systems that allow enterprises to easily add their own workflow steps and adapt the system to their specific needs.

Long Term:

  • Generalized Structured Output Generation: Expand the RAG approach to other structured output tasks, such as generating code, SQL queries, or other domain-specific formats.

  • AI-powered Business Process Automation: Design systems that can automatically analyze and optimize business processes using natural language descriptions and generate corresponding workflows.

Who:

The research was conducted by Patrice Béchard and Orlando Marquez Ayala, both affiliated with ServiceNow, a leading provider of cloud-based workflow automation solutions.

Why:

The researchers aimed to address the challenge of hallucination in LLMs when applied to structured output tasks, such as generating workflows from natural language requirements. This is crucial for real-world GenAI systems to gain user trust and adoption.

How:

The team implemented a Retrieval-Augmented Generation (RAG) approach, where a retriever suggests relevant steps and table names based on the user's natural language input. These suggestions are then incorporated into the LLM prompt to guide the generation of the structured JSON output representing the workflow. They experimented with various LLM and retriever models, comparing their performance on in-domain and out-of-domain datasets.

The RAG system architecture: User queries are processed by both the retriever and the LLM to generate structured workflow outputs.
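The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the step catalog, the table list, the bag-of-words retriever, and the prompt wording are all assumptions made up for the example (the paper's retriever is a trained model), and the final LLM call is left out.

```python
import json
import math
import re
from collections import Counter

# Hypothetical catalog of workflow steps (names and descriptions invented for illustration).
STEP_CATALOG = {
    "create_record": "create a new record in a table",
    "send_email": "send an email notification to a user",
    "look_up_record": "look up an existing record in a table",
    "update_record": "update fields on an existing record",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, catalog, k=2):
    """Toy stand-in for the retriever: rank catalog entries by word overlap."""
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(name + " " + desc))), name)
        for name, desc in catalog.items()
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def build_prompt(query, steps, tables):
    """Put the retrieved suggestions into the prompt that would go to the LLM."""
    return (
        "Generate a workflow as JSON using only the steps and tables listed.\n"
        f"Allowed steps: {json.dumps(steps)}\n"
        f"Allowed tables: {json.dumps(tables)}\n"
        f"Requirement: {query}\n"
        "JSON:"
    )


if __name__ == "__main__":
    query = "Send an email when a new record is created"
    suggested = retrieve(query, STEP_CATALOG, k=2)
    print(build_prompt(query, suggested, ["incident"]))
```

The point of the design is that the LLM only has to choose among names the retriever has already confirmed exist, rather than recalling them from its weights.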

What did they find:

  • RAG significantly reduces hallucination in generated workflows, improving the trustworthiness of the output.

  • Using a small, well-trained retriever allows deploying a smaller LLM without sacrificing performance, making the system more resource-efficient.

  • Their RAG approach generalizes well to out-of-domain scenarios, demonstrating its adaptability to different use cases.
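This post does not spell out how hallucination is scored, so the following is only one plausible way to check for it: count the generated steps whose names do not exist in the known catalog. The JSON shape (a "steps" list of objects with a "name" field) and the step names are assumptions for the sake of the example.

```python
import json


def hallucinated_step_rate(workflow_json, known_steps):
    """Fraction of generated steps whose name is not in the known catalog."""
    workflow = json.loads(workflow_json)
    names = [step["name"] for step in workflow.get("steps", [])]
    if not names:
        return 0.0
    unknown = [name for name in names if name not in known_steps]
    return len(unknown) / len(names)


if __name__ == "__main__":
    # Hypothetical catalog and model output; "notify_slack" is not a real step here.
    known = {"create_record", "send_email", "look_up_record", "update_record"}
    generated = json.dumps(
        {
            "steps": [
                {"name": "look_up_record"},
                {"name": "notify_slack"},
                {"name": "send_email"},
                {"name": "update_record"},
            ]
        }
    )
    print(hallucinated_step_rate(generated, known))
```

The same check could be run on table names, and a system could use it at inference time to reject or repair an output before showing it to a user.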

What are the limitations and what's next:

  • The retriever's recall could be improved, potentially by decomposing complex queries into smaller parts for more precise retrieval.

  • The LLM could benefit from further training to better understand the semantics of the task, particularly for steps involving logic and control flow.

  • Future work includes exploring joint training of the retriever and LLM for better synergy and investigating alternative structured output formats and decoding methods to improve efficiency.
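The query decomposition idea in the first limitation could look roughly like this. It is a sketch under stated assumptions, not something the paper describes in this form: the splitting rule (commas and the words "and" / "then") is a naive heuristic, and the keyword retriever is a hypothetical stand-in for a real one.

```python
import re


def decompose(query):
    """Naively split a requirement into sub-queries on commas, 'and', 'then'."""
    parts = re.split(r"\b(?:and then|then|and)\b|[,;.]", query, flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]


def retrieve_decomposed(query, retrieve_fn, k=1):
    """Retrieve per sub-query and merge the results, keeping first-seen order."""
    merged = []
    for part in decompose(query):
        for step in retrieve_fn(part, k):
            if step not in merged:
                merged.append(step)
    return merged


# Hypothetical keyword retriever, used only to make the example runnable.
KEYWORDS = {
    "look": "look_up_record",
    "email": "send_email",
    "close": "update_record",
}


def keyword_retriever(text, k):
    words = text.lower().split()
    hits = [step for word, step in KEYWORDS.items() if word in words]
    return hits[:k]


if __name__ == "__main__":
    query = "Look up the incident, then email the manager and close the ticket"
    print(decompose(query))
    print(retrieve_decomposed(query, keyword_retriever))
```

Retrieving once per sub-query gives each step in a long requirement its own chance to surface, instead of competing for a few top-k slots against the whole sentence.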

2 classes of errors encountered

Why it matters:

This research demonstrates the effectiveness of RAG in mitigating hallucination and improving the quality and trustworthiness of structured outputs generated by LLMs. This has significant implications for the development of reliable and practical GenAI systems, particularly in enterprise settings where accuracy and customization are crucial.

🖼️ AI Artwork Of The Day

LinkedIn profile shots for women working in Finance - u/grpswshrs from r/MidJourney
