Mastering RAG with Spring AI: Transform Your Data into Smart Applications

In today’s AI-driven world, businesses are striving to extract meaningful insights from vast amounts of data. The concept of Retrieval-Augmented Generation (RAG) has gained popularity for blending the power of information retrieval and generative models. When combined with Spring AI, RAG becomes an invaluable tool for building intelligent, data-driven applications. In this blog tutorial, we’ll explore how to implement RAG with Spring AI, with practical examples and step-by-step instructions. We’ll also dive into integrating Ollama into your Spring AI RAG applications.

What is RAG?

Retrieval-Augmented Generation (RAG) is an architecture that pairs a retrieval step with a generative model. In simple terms, it retrieves relevant data from external sources or a database and passes it to a generative AI model, which produces a response grounded in the retrieved information. Because the model answers from supplied text rather than from its training data alone, this approach can work over large or private datasets and tends to improve the accuracy and relevance of AI-generated responses.
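The retrieve, augment, generate sequence described above can be sketched in plain Java before any framework is involved. This is only an illustration: the class RagSketch and its method names are hypothetical, the corpus is an in-memory list, and the model is passed in as a function so that a stub can stand in for a real LLM call.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative only: RagSketch and its method names are hypothetical, not part of Spring AI.
final class RagSketch {

    // Retrieval step: keep the documents that mention the search term (case-insensitive).
    static List<String> retrieve(List<String> corpus, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        return corpus.stream()
                .filter(doc -> doc.toLowerCase(Locale.ROOT).contains(needle))
                .collect(Collectors.toList());
    }

    // Augmentation step: place the retrieved text in front of the user's question.
    static String augment(List<String> context, String question) {
        return "Context:\n" + String.join("\n", context) + "\n\nQuestion: " + question;
    }

    // Generation step: the model is a function, so a stub can replace a real LLM call.
    static String answer(List<String> corpus, String term, String question,
                         Function<String, String> model) {
        return model.apply(augment(retrieve(corpus, term), question));
    }
}
```

The rest of this tutorial replaces each piece with a real component: a JPA repository for retrieve, a prompt built in a service for augment, and an OpenAI call for the model function.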

Why Use RAG with Spring AI?

By using RAG with Spring AI, developers can build smarter applications that leverage both stored data and generative capabilities, offering more dynamic and context-aware responses. The Spring AI framework helps integrate RAG seamlessly into your applications, allowing for easier implementation, scalability, and maintainability.

Combining Spring AI RAG with Ollama, a tool for running open large language models on your own machine, allows developers to build AI applications that are cost-efficient and keep data on local hardware.

Setting Up Your Spring AI RAG Application

Step 1: Set Up Spring Boot Project

To start implementing RAG with Spring AI, first, we need to create a Spring Boot project.

Go to Spring Initializr and create a new Spring Boot project with the following dependencies:
- Spring Web
- Spring Data JPA
- Spring AI (for integrating AI models; Spring Initializr lists it as model-specific starters such as OpenAI or Ollama)
- H2 Database (for data storage)
Download the project and import it into your favorite IDE (such as IntelliJ IDEA or Eclipse).
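Once the project is imported, update your pom.xml file to include an OpenAI client dependency, as we will use OpenAI’s API for generative capabilities in our RAG model. The examples below use the community openai-java library, which is separate from Spring AI. Its OpenAiService class ships in the service artifact in the versions we are aware of, so check the artifact layout for the version you choose.

<dependency>
    <groupId>com.theokanning.openai-gpt3-java</groupId>
    <artifactId>service</artifactId>
    <version>0.10.0</version>
</dependency>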

Step 2: Creating the Retrieval Layer

In RAG with Spring AI, the retrieval layer fetches relevant information from a database or external source. We’ll create a repository using Spring Data JPA that allows us to query relevant data.

Create an Entity for Your Data

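// jakarta.persistence applies to Spring Boot 3.x; Spring Boot 2.x uses javax.persistence instead.
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import jakarta.persistence.Lob;

@Entity
public class Document {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // @Lob maps the field to a large-object column so long text is not limited to the default VARCHAR(255).
    @Lob
    private String content;

    private String title;

    // Getters and Setters

}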

This code defines a Document entity that can be persisted in a database using Hibernate or JPA. The @Entity annotation tells the ORM framework to map this class to a corresponding table in the database. The @Id and @GeneratedValue annotations ensure that each document has a unique identifier. The content and title fields store the actual content and title of the document.

Create a Repository

@Repository
public interface DocumentRepository extends JpaRepository<Document, Long> {
    List<Document> findByContentContaining(String keyword);
}

This code defines a repository interface that can be used to perform various operations on Document entities. You can use this interface to:

Retrieve all documents: List<Document> allDocuments = documentRepository.findAll();
Find a document by its ID: Optional<Document> document = documentRepository.findById(1L);
Save a new document: documentRepository.save(newDocument);
Delete a document: documentRepository.deleteById(1L);
Find documents based on content: List<Document> documentsWithKeyword = documentRepository.findByContentContaining("keyword");

This repository allows you to search for documents based on a keyword, which will be used in the retrieval phase of the RAG process.
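One caveat: findByContentContaining performs a literal substring match, so passing a whole question such as "What is the refund policy?" will rarely match any stored content. A minimal sketch of one workaround is to reduce the question to keywords first and search for each keyword. The KeywordExtractor class and its small stop-word list are hypothetical and shown only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spring Data: turns a question into search keywords.
final class KeywordExtractor {

    // A tiny illustrative stop-word list; a real application would need a fuller one.
    private static final Set<String> STOP_WORDS = Set.of(
            "a", "an", "the", "is", "are", "what", "how", "does", "do", "of", "in", "to", "for");

    static List<String> extract(String question) {
        return Arrays.stream(question.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(token -> token.length() > 1)           // drop empty and one-letter tokens
                .filter(token -> !STOP_WORDS.contains(token))  // drop common filler words
                .distinct()
                .collect(Collectors.toList());
    }
}
```

Each keyword returned by extract could then be passed to documentRepository.findByContentContaining, with the results merged and de-duplicated before building the prompt.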

Step 3: Integrating OpenAI for Generation

Next, we’ll integrate the generative model by calling one of OpenAI’s GPT-3 completion models through the OpenAI Java client added in Step 1. You’ll need an API key from OpenAI.

Service for GPT-3 Generation

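import com.theokanning.openai.completion.CompletionRequest;
import com.theokanning.openai.completion.CompletionResult;
import com.theokanning.openai.service.OpenAiService;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

@Service
public class OpenAIService {

    private final OpenAiService client;

    // The key is read from configuration rather than hardcoded; the property name openai.api-key is an assumption.
    public OpenAIService(@Value("${openai.api-key}") String apiKey) {
        this.client = new OpenAiService(apiKey);
    }

    public String generateResponse(String prompt) {
        CompletionRequest completionRequest = CompletionRequest.builder()
            .prompt(prompt)
            // OpenAI has retired text-davinci-003; substitute a currently available
            // completion model (for example gpt-3.5-turbo-instruct).
            .model("text-davinci-003")
            .maxTokens(150)
            .build();
        CompletionResult result = client.createCompletion(completionRequest);
        // Return the text of the first completion choice.
        return result.getChoices().get(0).getText().trim();
    }
}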

This service wraps the OpenAI client: it sends the prompt it is given to the completion endpoint, caps the reply at 150 tokens, and returns the text of the first choice.

Step 4: Creating the RAG Workflow

Now that we have both the retrieval and generation layers set up, we can combine them to create a RAG workflow in our Spring AI RAG application.

RAG Service

import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class RAGService {

    private final DocumentRepository documentRepository;
    private final OpenAIService openAIService;

    @Autowired
    public RAGService(DocumentRepository documentRepository, OpenAIService openAIService) {
        this.documentRepository = documentRepository;
        this.openAIService = openAIService;
    }

    public String getAugmentedResponse(String query) {
        // Retrieve documents whose content contains the query text
        List<Document> relevantDocs = documentRepository.findByContentContaining(query);
        // Build the prompt: retrieved context first, then the user's question
        StringBuilder prompt = new StringBuilder("Based on the following information:\n");
        for (Document doc : relevantDocs) {
            prompt.append(doc.getContent()).append("\n");
        }
        // Without the question, the model would have context but nothing to answer
        prompt.append("\nAnswer this question: ").append(query);
        // Use OpenAI to generate the response
        return openAIService.generateResponse(prompt.toString());
    }
}

This RAGService is the core of RAG with Spring AI: it fetches relevant documents through the retrieval layer, uses them to augment the prompt, and sends that prompt to OpenAI to generate a context-aware response. The same pattern suits tasks such as question answering and summarization, where existing documents can improve the quality of the generated text. Note that retrieval here is a literal substring match, so documents are found only when the query text appears verbatim in their content.
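Retrieved documents can easily exceed what a model accepts in one request, so it can help to cap how much context goes into the prompt. The following is a minimal sketch under stated assumptions: PromptBuilder is a hypothetical helper, the budget is counted in characters as a rough stand-in for tokens, and the prompt wording is only an example.

```java
import java.util.List;

// Hypothetical helper: PromptBuilder is an illustrative name, not a Spring AI class.
final class PromptBuilder {

    // Builds a prompt from the passages that fit within maxContextChars, followed by the question.
    static String build(List<String> passages, String question, int maxContextChars) {
        StringBuilder context = new StringBuilder();
        for (String passage : passages) {
            // Stop before the context would exceed the budget (+1 for the newline).
            if (context.length() + passage.length() + 1 > maxContextChars) {
                break;
            }
            context.append(passage).append('\n');
        }
        return "Answer the question using only the information below.\n\n"
                + "Information:\n" + context
                + "\nQuestion: " + question + "\nAnswer:";
    }
}
```

Passages are taken in the order given, so this works best when the retrieval layer returns the most relevant documents first. A character budget is a simplification; a production version would more likely count tokens for the specific model.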

Step 5: Adding Ollama for Local Processing

Integrating Ollama into your Spring AI RAG application provides a cost-effective way to run models without relying on external cloud services. Ollama serves open models on your own machine, where they can be used to generate responses or to produce embeddings that support retrieval.

Ollama Integration

To integrate Ollama, install it by following the official Ollama installation guide and pull a model. Once Ollama is running, you can configure your application to send queries to this local model, either alongside the Spring Data JPA retrieval layer or in place of a hosted API.

OllamaService

@Service
public class OllamaService {

    public String processQueryLocally(String query) {
        // Simulate local processing using Ollama
        // Here, we return a mock response for simplicity
        return "Ollama processed response for: " + query;
    }
}

This service is a placeholder: it does not call Ollama and simply returns a mock string. A complete implementation would call the HTTP API of a local Ollama instance. Once that is in place, the service can be integrated into the RAG workflow so that queries are processed locally instead of, or before, being sent to a hosted generative model.
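As a rough idea of what a real call might look like, the sketch below posts a prompt to Ollama's /api/generate endpoint using the JDK's built-in HTTP client. It assumes Ollama is listening on its default address http://localhost:11434 and that a model has already been pulled. The class name OllamaClient is hypothetical, and the method returns the raw JSON reply rather than parsing it. Spring AI also provides its own Ollama integration, which may be a better fit in a real project.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: OllamaClient is a hypothetical class; the address is Ollama's default.
final class OllamaClient {

    private static final URI GENERATE_URI = URI.create("http://localhost:11434/api/generate");
    private final HttpClient http = HttpClient.newHttpClient();

    // Escapes the characters that would otherwise break a JSON string literal.
    static String jsonEscape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // "stream": false asks Ollama for one JSON object instead of a stream of chunks.
    static String requestBody(String model, String prompt) {
        return "{\"model\":\"" + jsonEscape(model) + "\",\"prompt\":\""
                + jsonEscape(prompt) + "\",\"stream\":false}";
    }

    // Returns the raw JSON reply; the generated text is in its "response" field.
    String generateRaw(String model, String prompt) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(GENERATE_URI)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, prompt)))
                .build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

In a Spring Boot application, the JSON reply could be parsed with Jackson, which is already on the classpath through Spring Web, and the result returned from OllamaService.processQueryLocally.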

Step 6: Creating the Controller

Finally, we’ll create a REST controller to expose the RAG functionality in your Spring AI RAG application.

RAGController

@RestController
@RequestMapping("/api/rag")
public class RAGController {

    private final RAGService ragService;

    @Autowired
    public RAGController(RAGService ragService) {
        this.ragService = ragService;
    }

    @GetMapping("/generate")
    public ResponseEntity<String> generateResponse(@RequestParam String query) {
        String response = ragService.getAugmentedResponse(query);
        return ResponseEntity.ok(response);
    }
}

This controller exposes the RAG service as a REST endpoint. Clients send an HTTP GET request to /api/rag/generate with a query parameter; the controller passes the query to the RAG service, which retrieves the relevant data and generates a response, and returns the result in the response body.
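When calling this endpoint from code, the query text has to be URL-encoded, since questions usually contain spaces and punctuation. Here is a small sketch of building the request URI. RagRequestUri is a hypothetical helper, and the base URL in the usage note assumes Spring Boot's default port 8080.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building a request URI for the /api/rag/generate endpoint.
final class RagRequestUri {

    static URI forQuery(String baseUrl, String query) {
        // URLEncoder applies form encoding: spaces become '+', reserved characters become %XX.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/rag/generate?query=" + encoded);
    }
}
```

For example, RagRequestUri.forQuery("http://localhost:8080", "refund policy?") yields http://localhost:8080/api/rag/generate?query=refund+policy%3F, which can be passed to any HTTP client.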

Conclusion

RAG with Spring AI offers a powerful solution for creating smart applications that can retrieve and generate relevant, context-aware information. By integrating Ollama for local model inference and using OpenAI for hosted generative responses, your Spring AI RAG application can provide answers grounded in your own data. This tutorial gave you the foundation to start building your own intelligent applications using RAG with Spring AI.
