Spring AI in Action: Integrating LLM Capabilities into Your Java Applications
Spring AI is a framework that simplifies the integration of artificial intelligence capabilities into Spring applications. In this introductory article to the 'Spring AI in Action' series, we explore the challenges of LLM integration in Java, the benefits of Spring AI, and the architecture of the demo project based on Spring Boot 4 and Spring AI 2.0.0-M2.
Generative artificial intelligence has transformed the way we design software applications. Large Language Models (LLMs) offer remarkable capabilities: text generation, translation, data analysis, and much more. But integrating these capabilities into an enterprise Java application is not trivial.
This is precisely the problem that Spring AI solves. In this article series "Spring AI in Action", we will explore each framework feature in depth through concrete, working examples.
In this introductory article, we will cover:
- The constraints of direct LLM integration
- What Spring AI is and its advantages
- The framework's key concepts
- The demo project architecture
A- The Constraints of LLM Integration
Integrating an LLM directly into a Java application presents several challenges:
Request and Response Management
LLM calls require crafting JSON request payloads, managing authentication, setting HTTP headers, and parsing responses. Each provider (OpenAI, Anthropic, Google, etc.) has its own API with its own specificities.
Cost and Latency
LLM calls can be expensive and response times can vary depending on call volume, query complexity, or the model used. You need to implement token management, caching, and rate limiting strategies.
Vendor Lock-in
Without an abstraction layer, application code is tightly coupled to a specific provider. Migrating from OpenAI to Anthropic or a local model would require rewriting a significant portion of the code.
Cross-cutting Features
Needs like conversational memory, RAG (Retrieval-Augmented Generation), tool calling, or agent orchestration require complex implementations that would be inefficient to rebuild for each project.
B- Spring AI: Simplifying AI Integration
Spring AI is a framework that simplifies the integration of AI capabilities into Spring applications, providing a complete and consistent foundation.
Key Benefits of Spring AI
- Simplified AI integration: provides a unified interface for different AI services
- Consistency: uses familiar Spring patterns for AI development
- Boilerplate reduction: handles authentication, rate limiting, and network calls
- Portable API: unified API across different AI model providers and vector databases
C- Key Concepts of Spring AI
Spring AI is built around several fundamental concepts:
ChatClient API
A fluent API for communicating with AI models, similar in style to Spring's WebClient and RestClient APIs.
ChatClient chatClient = chatClientBuilder.build();
String response = chatClient.prompt("Explain LLMs to me")
        .call()
        .content();
Portable API
Unified API across different vector database providers and AI models. Switching providers often comes down to changing configuration, without touching business code.
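As an illustration, switching the chat module from a local Ollama model to a hosted OpenAI model can come down to a configuration change. The property names below follow the Spring AI starter conventions from the 1.x documentation and may differ slightly in 2.0.0-M2; the model names and the API key variable are placeholders:

```yaml
# Before: local model served by Ollama
spring:
  ai:
    ollama:
      chat:
        model: qwen3:0.6b

# After: hosted model via OpenAI -- business code using ChatClient is unchanged
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o-mini
```

Only the starter dependency and these properties change; the ChatClient-based business code stays the same.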
Document Ingestion
An ETL (Extract, Transform, Load) framework for data engineering tasks: reading documents, splitting into chunks, generating embeddings, and storing in a vector database.
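A minimal ingestion pipeline might look like the following sketch. The class names (TikaDocumentReader, TokenTextSplitter) come from Spring AI 1.x and may differ slightly in 2.0.0-M2:

```java
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.core.io.Resource;

class IngestionPipeline {

    private final VectorStore vectorStore;

    IngestionPipeline(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    void ingest(Resource document) {
        // Extract: read the raw document (PDF, Word, HTML, ...)
        var reader = new TikaDocumentReader(document);
        // Transform: split the text into token-bounded chunks
        var splitter = new TokenTextSplitter();
        // Load: embed the chunks and store them in the vector database
        vectorStore.add(splitter.apply(reader.get()));
    }
}
```

The VectorStore bean handles embedding generation through the configured embedding model, so the pipeline itself never calls the model directly.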
Advisors
The Advisor pattern encapsulates recurring cross-cutting concerns in AI applications: conversational memory, RAG, logging, etc.
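For example, a logging advisor can be attached once when the client is built and then applies to every call. SimpleLoggerAdvisor ships with Spring AI 1.x; the exact package may differ in 2.0.0-M2:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;

// Advisors registered here wrap every subsequent prompt/response exchange
ChatClient chatClient = chatClientBuilder
        .defaultAdvisors(new SimpleLoggerAdvisor()) // logs requests and responses
        .build();

String response = chatClient.prompt("Explain LLMs to me")
        .call()
        .content();
```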
Tools / Function Calling
Allows the AI model to request execution of client-side tools and functions, thereby accessing real-time information or triggering actions.
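A tool is typically a plain Java method annotated with @Tool, which the model can decide to invoke while answering. The API is shown as in Spring AI 1.x, and DateTimeTools is a hypothetical example class:

```java
import java.time.LocalDateTime;
import org.springframework.ai.tool.annotation.Tool;

class DateTimeTools {

    @Tool(description = "Returns the current date and time")
    String currentDateTime() {
        // Gives the model access to real-time information it cannot know on its own
        return LocalDateTime.now().toString();
    }
}

// The model may request this tool while answering the prompt
String answer = chatClient.prompt("What day will it be in three days?")
        .tools(new DateTimeTools())
        .call()
        .content();
```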
Memory Support
Native support for conversational memory to maintain context between exchanges with the LLM.
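A sketch of this, assuming the Spring AI 1.x memory API (MessageWindowChatMemory, MessageChatMemoryAdvisor, and the conversation-id parameter may change in 2.0.0-M2):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;

// Keep the last 20 messages per conversation
ChatMemory chatMemory = MessageWindowChatMemory.builder()
        .maxMessages(20)
        .build();

ChatClient chatClient = chatClientBuilder
        .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
        .build();

// Scope the memory to a given user or conversation
String reply = chatClient.prompt("What did I just ask you?")
        .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, "user-42"))
        .call()
        .content();
```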
D- Supported Providers
Spring AI supports all major AI model providers:
| Type | Providers |
|---|---|
| Chat | OpenAI, Anthropic, Google, Amazon Bedrock, Azure OpenAI, Ollama, Mistral, etc. |
| Embeddings | OpenAI, Ollama, Azure, Amazon, etc. |
| Vector Databases | PostgreSQL/PGVector, Chroma, Pinecone, Milvus, Redis, Elasticsearch, Neo4j, etc. |
| Image | OpenAI DALL-E, Stability AI |
| Audio | OpenAI (Transcription, Speech) |
E- Demo Project Architecture
The "Spring AI in Action" project is a multi-module Maven project designed to illustrate each Spring AI feature progressively:
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>4.0.2</version>
</parent>
<groupId>com.rickenbazolo</groupId>
<artifactId>spring-ai-en-action</artifactId>
<version>0.1.0-SNAPSHOT</version>
<packaging>pom</packaging>
<properties>
<java.version>25</java.version>
<spring-ai.version>2.0.0-M2</spring-ai.version>
</properties>
<modules>
<module>chat</module>
<module>rag</module>
<module>tools</module>
<module>agent</module>
<module>mcp</module>
</modules>
Tech Stack
- Spring Boot 4.0.2: latest major version of the framework
- Spring AI 2.0.0-M2: milestone version of the AI framework
- Java 25: LTS JDK version
- Ollama: local model execution (qwen3, mistral, etc.)
- PostgreSQL + PGVector: vector database for RAG
Modules
| Module | Description | Spring AI Features |
|---|---|---|
| chat | LLM interactions | ChatClient, streaming, multi-model, memory |
| rag | Retrieval-Augmented Generation | VectorStore, embeddings, ingestion, advanced queries |
| tools | Tool calling / Function Calling | @Tool, @Description, ToolContext, Spring Security |
| agent | AI agent orchestration | Multi-agent, parallel execution, evaluation |
| mcp | Model Context Protocol | MCP Server/Client, remote tools |
F- Local Execution with Ollama
One of the project's advantages is using Ollama to run models locally, with no cloud dependency. To get started:
# Install Ollama (macOS)
brew install ollama
# Start the server
ollama serve
# Download models used in the project
ollama pull qwen3:0.6b
ollama pull mistral
ollama pull mxbai-embed-large
ollama pull qwen3-vl:2b
The Spring AI configuration for Ollama is straightforward:
spring:
  ai:
    ollama:
      chat:
        model: qwen3:0.6b
G- Series Outline
This article series will cover all project modules:
- Introduction (this article) — Overview of Spring AI and the project
- ChatClient API — From simple prompts to multi-model streaming
- Chat Memory — Conversational context management
- RAG: Ingestion Pipeline — Building the data pipeline
- RAG: From Naive to Advanced — Pre-retrieval strategies
- Function Calling — When the LLM calls your Java methods
- Tools + Security — Securing AI tools with Spring Security
- Agents — Multi-agent orchestration with Virtual Threads
- MCP — Model Context Protocol with Spring AI
Each article comes with complete, working source code available on GitHub.
Conclusion
Spring AI brings to the Java ecosystem what developers needed: first-class AI integration, consistent with Spring conventions, and abstracted enough to avoid vendor lock-in.
In the next article, we will dive into the ChatClient API with concrete examples: simple prompts, token tracking, response streaming, and multi-model support.
I hope you found this article useful. Thank you for reading.
To learn more:
- Spring AI Documentation: https://docs.spring.io/spring-ai/reference/
- Project source code: spring-ai-en-action
- Find our #autourducode videos on our YouTube channel