JetBrains Junie Exits Beta as Top-Ranked AI Coding Agent on SWE-Rebench

JetBrains took its Junie AI coding agent out of beta today, which is good news for developers who've been watching this one evolve. The tool started as an internal experiment asking "what if an AI agent actually used your tools instead of guessing at them?" and is now production-ready across all JetBrains IDEs and the Junie CLI.

A Benchmark Win That Matters

Junie just topped the latest SWE-Rebench run, an independent agent benchmark that refreshes its task set each cycle. The numbers: 61.6% resolved and 72.7% pass@5, placing it ahead of other agents and competitive with raw frontier models. Alexander Golubev, Research Lead at Nebius, noted in the announcement that "SWE-Rebench draws fresh tasks each cycle to keep the evaluation honest." Fresh tasks make the result hard to dismiss as cherry-picked.
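For readers less familiar with the two metrics, here is a minimal sketch of how a resolved rate and a pass@5 figure are commonly derived from repeated runs. It assumes five independent runs per task and reads pass@5 as "at least one of the five runs solved the task"; SWE-Rebench's exact methodology may differ, and the task names and outcomes below are invented purely for illustration.

```python
# Hypothetical outcomes: task id -> pass/fail for each of 5 independent runs.
# These are made-up values, not SWE-Rebench data.
RUNS = {
    "task-a": [True, True, False, True, True],
    "task-b": [False, False, False, False, False],
    "task-c": [False, True, False, False, False],
    "task-d": [True, True, True, True, True],
}


def resolved_rate(runs):
    """Share of all individual runs that resolved their task."""
    total = sum(len(outcomes) for outcomes in runs.values())
    solved = sum(sum(outcomes) for outcomes in runs.values())
    return solved / total


def pass_at_k(runs):
    """Share of tasks resolved by at least one of their k runs."""
    return sum(any(outcomes) for outcomes in runs.values()) / len(runs)


if __name__ == "__main__":
    print(f"resolved: {resolved_rate(RUNS):.1%}")
    print(f"pass@5:   {pass_at_k(RUNS):.1%}")
```

With this toy data, half of all runs succeed while three of the four tasks are solved at least once, which is why a pass@5 figure sits above the resolved rate.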

Plan Mode: The Agent Thinks Before It Codes

The feature I'm most interested in seeing developers actually use is Advanced Plan mode. AI coding agents often go wrong by implementing before anyone agrees on what they're building: you end up reviewing a PR that solves the wrong problem, or burning tokens on a path you'd have rejected in thirty seconds. Junie addresses this by making planning a first-class artifact. Before writing code, it produces a structured document with tabs for product requirements, technical design, delivery stages, and testing strategy. You read it, edit it directly in your editor, and approve it, and only then does Junie implement. The plan lives in .junie/plans, so you can commit it as living task documentation rather than a throwaway chat message. Activate Plan mode with Shift+Tab, open the plan with Ctrl+P, and hit Confirm when you're ready to implement.

Agentic Debugging: Real Breakpoints, Not Print Statements

Here's where Junie differentiates itself from the AI coding agent crowd. When something breaks, most agents add log statements and hope for the best. Junie opens the actual debugger. It can launch run configurations, debug tests, or take over existing sessions you already have open. Set breakpoints in project code, library code, SDK code—even decompiled .class files and sources inside JARs. Inspect real runtime state: stack frames, thread state, expression evaluation, run-to-line. The agent collects actual evidence instead of theorizing about what your code might be doing. "Continue my current debug session and tell me why this value becomes null" becomes a legitimate handoff—routine inspection work that frees you to think about the bigger picture.

No Model Lock-In: Bring Your Own Key

One thing JetBrains gets right is treating cost as a dial you control, not a constraint imposed by the vendor. Junie supports any model without lock-in. Use frontier models from Anthropic, OpenAI, or Google from day zero, or point Junie at local runtimes via LiteLLM, LM Studio, or Ollama. If the model runs locally, prompts and code never leave your machine. The philosophy here is sound: "Plan on a strong model; implement on a cheap one." Top-tier reasoning handles the thinking; smaller models handle the grunt work.
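To see why "plan on a strong model; implement on a cheap one" can pay off, here is a rough sketch of routing each phase to a different model tier and comparing the cost. It is not Junie's configuration format: the model labels, prices, token counts, and function names are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    name: str             # placeholder label, not a real model ID
    cost_per_mtok: float  # illustrative price per million tokens


# Hypothetical tiers; substitute whatever models and prices you actually use.
STRONG = ModelChoice("strong-frontier-model", 15.0)
CHEAP = ModelChoice("small-local-model", 0.5)

# Planning goes to the strong model, implementation to the cheap one.
ROUTES = {"plan": STRONG, "implement": CHEAP}


def pick_model(phase: str) -> ModelChoice:
    """Return the model assigned to a phase ('plan' or 'implement')."""
    return ROUTES[phase]


def estimate_cost(token_usage: dict) -> float:
    """Estimated spend when each phase is billed at its routed model's price."""
    return sum(
        pick_model(phase).cost_per_mtok * tokens / 1_000_000
        for phase, tokens in token_usage.items()
    )


def all_strong_cost(token_usage: dict) -> float:
    """Baseline: every token billed at the strong model's price."""
    return STRONG.cost_per_mtok * sum(token_usage.values()) / 1_000_000


if __name__ == "__main__":
    usage = {"plan": 20_000, "implement": 400_000}  # made-up token counts
    print(f"split routing: ${estimate_cost(usage):.2f}")
    print(f"all strong:    ${all_strong_cost(usage):.2f}")
```

Planning usually consumes far fewer tokens than implementation, so under these assumed numbers most of the spend moves to the cheaper tier while the strong model still does the reasoning-heavy work.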

Remote Control and Context-Aware Code Review

Some tasks don't fit in a focused 30-minute session: Spring Boot upgrades, migrations to Java records, adding test coverage to legacy services. Junie runs asynchronously so you can start from your laptop, check progress from your phone during a meeting, and review the PR over coffee. For code review specifically, Junie brings full project context: build configuration, tests, conventions, past decisions. It highlights meaningful changes, explains design rationale, and gives accept/reject controls inline. Trigger reviews from GitHub Actions or GitLab (including on-prem), or via the /review command in the CLI.

Deep IDE Integration Built on ACP

Junie has always worked inside JetBrains IDEs, but with GA the integration is rebuilt on top of ACP, the Agent Client Protocol that Junie CLI also uses. One engine powers AI chat, the dedicated Junie tool window, and Junie CLI; improvements ship once and appear everywhere. The agent uses your IDE's semantic index, build configurations, test runners, and debugger, not approximations of them. Database integration goes through DataGrip and the JetBrains Database plugin, letting Junie query real data and write, fix, and validate SQL in the same session that handles your code.

Key Takeaways

Junie hit #1 on SWE-Rebench with 61.6% resolved and 72.7% pass@5, measured against a task set that is refreshed each cycle
Advanced Plan mode makes planning a first-class artifact before any code ships
Agentic debugging uses real breakpoints and IDE debugger integration instead of log statements
Zero model lock-in: frontier APIs or local runtimes via LiteLLM, LM Studio, Ollama

The Bottom Line

JetBrains built Junie for developers who want AI that works like a senior engineer on their team, not an autocomplete wrapper with delusions of grandeur. The benchmark win is nice validation, but the real story is what you can now delegate: planning, debugging, code review, and long-running tasks—all with your IDE's actual tools under the hood rather than approximations. If you've been burned by AI agents that confidently implement the wrong thing, Plan mode alone justifies giving Junie a shot on your next backlog task.
