Java Developer Builds Local AI Assistant That Keeps Your Data Off The Cloud

Sujan Kim has open-sourced Jarvis AI Platform, a local-first AI assistant built entirely in Java that runs on your own hardware, with no cloud subscription required. The project uses Spring Boot 4.0.6 with Spring AI 2.0 and Ollama as the primary model runner, keeping conversations private by design. The code is available on GitHub at github.com/sujankim/jarvis-ai-platform.

Why Local-First Matters

Most AI assistants route your conversations through third-party infrastructure. You depend on their uptime, their pricing changes, and their privacy policies, all factors outside your control. Jarvis flips this model: by default everything runs locally via Ollama, with Google Gemini as an automatic fallback if the local model goes offline. Core functionality therefore needs no external service, and a conversation leaves the machine only when the Gemini fallback is in use.

Provider Abstraction Layer

The project's most interesting design decision is its provider-agnostic architecture. Every AI backend—Ollama or Gemini—implements a single AiProvider interface with methods for streaming chat and availability checks. A router automatically selects the best available provider at runtime, meaning users never need to reconfigure anything when switching models or when Ollama becomes unavailable.
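A minimal sketch of what such an abstraction might look like. The method names, the ProviderRouter class, and the StubProvider record are assumptions made for illustration, not taken from the Jarvis source, and java.util.stream.Stream stands in for the reactive type the real project would use so the example needs only the JDK.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical interface: one contract for every backend.
interface AiProvider {
    String name();
    boolean isAvailable();
    // Stream<String> is a JDK stand-in for a reactive token stream.
    Stream<String> streamChat(String prompt);
}

// Hypothetical router: picks the first available provider in preference order.
final class ProviderRouter {
    private final List<AiProvider> providers;

    ProviderRouter(List<AiProvider> providersInPreferenceOrder) {
        this.providers = List.copyOf(providersInPreferenceOrder);
    }

    Optional<AiProvider> select() {
        return providers.stream().filter(AiProvider::isAvailable).findFirst();
    }

    Stream<String> streamChat(String prompt) {
        return select()
                .orElseThrow(() -> new IllegalStateException("No AI provider available"))
                .streamChat(prompt);
    }
}

// Stub backend used only to exercise the router; a real provider would call Ollama or Gemini.
record StubProvider(String name, boolean available) implements AiProvider {
    @Override
    public boolean isAvailable() {
        return available;
    }

    @Override
    public Stream<String> streamChat(String prompt) {
        return Stream.of("[" + name + "]", " echo:", " " + prompt);
    }
}
```

With a local stub marked unavailable and a fallback stub marked available, the router hands the request to the fallback, and the calling code never names a concrete backend.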

Reactive Streaming Pipeline

Jarvis implements end-to-end reactive streaming from model output to terminal display. Tokens flow through Spring AI into a Flux, then through an SSE endpoint before reaching the CLI client. Users see the response appear token by token rather than waiting 10-30 seconds for generation to complete, a critical UX improvement for AI applications where latency dominates the experience.
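The push-based idea can be illustrated with the JDK's own java.util.concurrent.Flow API. This is a sketch only: Jarvis uses Reactor's Flux and an SSE endpoint, whereas here a SubmissionPublisher plays the model and a hypothetical TokenPrinter subscriber plays the CLI, so that the example runs without Spring on the classpath.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.function.Consumer;

final class TokenStreamDemo {

    // Hypothetical subscriber: forwards each token to a sink the moment it arrives.
    static final class TokenPrinter implements Flow.Subscriber<String> {
        private final Consumer<String> sink;
        private final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        TokenPrinter(Consumer<String> sink) {
            this.sink = sink;
        }

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // backpressure: ask for one token at a time
        }

        @Override
        public void onNext(String token) {
            sink.accept(token);      // display immediately, no buffering of the full reply
            subscription.request(1); // then ask for the next token
        }

        @Override
        public void onError(Throwable t) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }

        void await() {
            try {
                done.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while streaming", e);
            }
        }
    }

    // Publishes the tokens one by one and returns everything the sink received.
    static String stream(List<String> tokens) {
        StringBuilder shown = new StringBuilder();
        TokenPrinter printer = new TokenPrinter(shown::append);
        try (SubmissionPublisher<String> model = new SubmissionPublisher<>()) {
            model.subscribe(printer);
            tokens.forEach(model::submit);
        } // closing the publisher signals onComplete after pending tokens are delivered
        printer.await();
        return shown.toString();
    }

    public static void main(String[] args) {
        System.out.println(stream(List.of("Hel", "lo", ",", " world")));
    }
}
```

The subscriber never waits for the whole reply: each token is handed to the sink as it is published, which is the property that makes the terminal output feel immediate.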

The Whitespace Bug

One of the stranger debugging sessions involved spaces disappearing from responses, producing output like "Hellohowareyoutoday?" instead of properly spaced sentences. The culprit turned out to be Server-Sent Events framing: a parser that follows the SSE format discards a single space immediately after the colon in a data: field, so any token that begins with a space, as word-initial tokens typically do, loses it in transit. Kim's fix wrapped each token in a JSON payload with escaped characters rather than sending raw text. It is a reminder that AI projects still contain plenty of mundane bugs unrelated to the model itself.
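The failure and the fix can be reproduced in a few lines. The sketch below assumes the server writes "data:" followed directly by the token and that the client parses as the SSE format describes; the class, the method names, and the {"token": ...} envelope are illustrative rather than Jarvis's actual wire format, and the hand-rolled JSON handling stands in for a real library such as Jackson.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SseWhitespaceDemo {

    // Server side: frame a payload as one SSE event, with no space after the colon.
    static String frame(String payload) {
        return "data:" + payload + "\n\n";
    }

    // Client side: read the data line and drop ONE leading space, as the SSE format specifies.
    static String parse(String event) {
        String line = event.substring(0, event.indexOf('\n'));
        String value = line.substring("data:".length());
        return value.startsWith(" ") ? value.substring(1) : value;
    }

    // Minimal JSON string escaping; a real project would use a JSON library.
    static String toJson(String token) {
        StringBuilder sb = new StringBuilder("{\"token\":\"");
        for (char c : token.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
                }
            }
        }
        return sb.append("\"}").toString();
    }

    // Inverse of toJson for the single-field envelope above.
    static String fromJson(String json) {
        String body = json.substring("{\"token\":\"".length(), json.length() - 2);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\') { sb.append(c); continue; }
            char e = body.charAt(++i);
            switch (e) {
                case 'n' -> sb.append('\n');
                case 'r' -> sb.append('\r');
                case 't' -> sb.append('\t');
                case 'u' -> { sb.append((char) Integer.parseInt(body.substring(i + 1, i + 5), 16)); i += 4; }
                default -> sb.append(e); // escaped quote or backslash
            }
        }
        return sb.toString();
    }

    // Raw tokens: the leading space of each token is lost by the parser.
    static String roundTripRaw(List<String> tokens) {
        return tokens.stream().map(t -> parse(frame(t))).collect(Collectors.joining());
    }

    // JSON-wrapped tokens: the payload starts with '{', so nothing is stripped.
    static String roundTripJson(List<String> tokens) {
        return tokens.stream().map(t -> fromJson(parse(frame(toJson(t))))).collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("Hello", " how", " are", " you", " today?");
        System.out.println(roundTripRaw(tokens));  // Hellohowareyoutoday?
        System.out.println(roundTripJson(tokens)); // Hello how are you today?
    }
}
```

Escaping has a second benefit: a token containing a newline would otherwise end the SSE event early, while the escaped form stays on a single data line.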

Working Memory and Prompt Assembly

Jarvis injects contextual information into every request via a working-memory block containing current date, username, role, session ID, and model name. A PromptAssembler component constructs final prompts by concatenating system instructions, working memory, session history, and the user's current message before sending to the AI provider.
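As a rough sketch of that assembly step: the article names the PromptAssembler component, but the WorkingMemory record, the field labels, the method signature, and the section separators below are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical working-memory block carrying the fields the article lists.
record WorkingMemory(LocalDate date, String username, String role, String sessionId, String model) {
    String render() {
        return String.join("\n",
                "[working memory]",
                "date: " + date,
                "user: " + username,
                "role: " + role,
                "session: " + sessionId,
                "model: " + model);
    }
}

// Hypothetical assembler: concatenates the four parts in a fixed order.
final class PromptAssembler {
    String assemble(String systemInstructions, WorkingMemory memory,
                    List<String> history, String userMessage) {
        return Stream.of(
                        systemInstructions,
                        memory.render(),
                        String.join("\n", history),
                        "User: " + userMessage)
                .filter(section -> !section.isBlank()) // skip empty history on the first turn
                .collect(Collectors.joining("\n\n"));
    }
}
```

Keeping the order fixed (instructions, working memory, history, current message) means the model always finds the newest user input at the end of the prompt.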

Tech Stack

The full stack includes Java 21, Spring Boot 4.0.6, Spring WebFlux, Spring Security 7 with JWT authentication, Argon2id password hashing, PostgreSQL 16 accessed via R2DBC, Flyway migrations, and Spring Shell 4 for the CLI interface. MapStruct handles object mapping between layers.

Key Takeaways

Provider abstraction makes AI backends interchangeable without touching application logic
Reactive streaming is essential for AI apps where generation latency dominates user experience
Local models (llama3.1:8b) are already capable enough for many personal workflows
Spring AI integrates naturally into the Java ecosystem, reducing friction for backend developers

The Bottom Line

This project makes a strong case that the Java ecosystem is ready for serious AI development. If you've been watching the AI space from the sidelines as a Java developer, Spring Boot 4 and Spring AI have narrowed the gap with Python-first frameworks. The tools exist now, so it's time to build something.
