How to Run AI Writing Tools Locally with WindowSill and Ollama

Some things shouldn't leave your computer. Medical notes, legal drafts, journal entries, messages to your therapist, that honest performance review you're still editing. When you paste those into a cloud AI service, they travel to servers you don't control and may end up in training data you never agreed to. That's the privacy problem WindowSill's new Ollama integration aims to solve. WindowSill, the Windows app that brings AI writing features to any application you're already using, now supports local LLM inference through Ollama—a free tool that runs AI models directly on your hardware. No data leaves your machine. Zero subscription fees for the AI layer. Just grammar checking, rewriting, tone adjustment, and translation running locally in Word, Outlook, Slack, or whatever you use to write.

What You'll Need

The setup requirements are reasonable for most modern PCs: WindowSill installed from the Microsoft Store (the free tier handles setup, though AI features need WindowSill+), Ollama downloaded from ollama.com, a minimum of 8 GB RAM (16 GB recommended if you want to run other apps alongside it), and a few gigabytes of disk space for your chosen model. No GPU required—Ollama runs on CPU just fine, though an NVIDIA card with 6+ GB VRAM will give you noticeably snappier responses.

Installing Ollama and Pulling Your First Model

Download the installer from ollama.com and run it. Once installed, Ollama sits quietly in the background as a local service. You can confirm it's working by opening any terminal and typing ollama --version; if you see a version number, you're good to go. Next comes pulling a model. Three options work well for writing tasks: Deepseek R1 8B (about 8 GB total) for general writing and grammar work, Qwen 3.5 4B (around 4.5 GB) if you need something lighter but still capable, or Deepseek R1 1.5B (roughly 1 GB) when speed matters more than nuanced output. Open a terminal and run ollama pull followed by the tag of your chosen model. The download happens once, and the model stays on your machine afterward.
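If you would rather script the check than type it by hand, the two steps above can be sketched in Python roughly as follows. This is a minimal sketch, not an official installer script: the helper names are made up for illustration, and the model tag deepseek-r1:8b is an assumption about how the model is named in the Ollama library, so confirm the exact tag before pulling.

```python
import shutil
import subprocess
from typing import List, Optional


def ollama_version() -> Optional[str]:
    """Return the output of `ollama --version`, or None if the CLI is not on PATH."""
    exe = shutil.which("ollama")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False, timeout=10
    )
    return (result.stdout or result.stderr).strip()


def pull_command(model_tag: str) -> List[str]:
    """Build the `ollama pull <tag>` command without running it."""
    return ["ollama", "pull", model_tag]


version = ollama_version()
print(version if version else "ollama was not found on PATH")

# Assumed tag; check the Ollama model library for the exact name.
command = pull_command("deepseek-r1:8b")
print("To download the model, run:", " ".join(command))
```

The pull itself is left as a printed command because the download is several gigabytes; pass the list to subprocess.run when you actually want to fetch the model.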

Connecting WindowSill to Ollama

This part is straightforward. Ollama exposes a local API at http://localhost:11434, which WindowSill can tap into automatically. Open Settings from the command bar in WindowSill, navigate to AI Writing & Analysis, find the AI Providers section, and select Ollama. Point it to localhost:11434, pick your pulled model from the dropdown, and you're done. Every AI request now routes through your local hardware instead of bouncing off external servers.
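Before opening WindowSill's settings, it can help to confirm that the local API is actually answering and to see which models it reports. The sketch below queries Ollama's /api/tags endpoint, which lists locally pulled models. It only talks to Ollama directly; it does not show how WindowSill itself connects, and the helper names are illustrative.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # local address given in the text


def parse_model_names(tags_json: str) -> List[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_json)
    return [model["name"] for model in data.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> Optional[List[str]]:
    """Return the pulled model names, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


models = list_local_models()
if models is None:
    print("Ollama is not reachable at", OLLAMA_URL)
else:
    print("Pulled models:", models)
```

If the list comes back empty, the service is running but no model has been pulled yet, which would also explain an empty dropdown.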

What You Can Do With This Setup

Once connected, WindowSill's full writing toolkit runs through your local model: grammar and spell checking across any app without switching windows, paragraph rewriting for polishing drafts or simplifying dense prose, tone adjustment between professional/casual/attention-grabbing modes (or custom presets like "customer support reply"), translation to 35+ supported languages, reusable prompts with variable injection for things like auto-formatted meeting recaps, and document summarization. Test it by typing a deliberately broken sentence—"Their going to the meeting tommorrow at 3pm, can you confirmed?"—selecting it, and hitting Spell Check from WindowSill's Analyze/Rewrite bar.
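To see what the model does with that test sentence outside of WindowSill, you can send it to Ollama's /api/generate endpoint yourself. The following is a rough sketch under stated assumptions: the prompt wording and the model tag are invented for illustration and are not WindowSill's actual prompts, and spell_check needs a running Ollama service with that model pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # local address given in the text


def build_spellcheck_payload(text: str, model: str) -> dict:
    """Build a non-streaming /api/generate request body (prompt wording is illustrative)."""
    prompt = (
        "Correct the spelling and grammar of the following text. "
        "Reply with the corrected text only.\n\n" + text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def spell_check(text: str, model: str, base_url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send the text to a running Ollama server and return the model's reply."""
    body = json.dumps(build_spellcheck_payload(text, model)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


sample = "Their going to the meeting tommorrow at 3pm, can you confirmed?"
payload = build_spellcheck_payload(sample, "deepseek-r1:8b")  # assumed model tag
print(json.dumps(payload, indent=2))
```

The script only prints the request body so it runs without a server; call spell_check(sample, "deepseek-r1:8b") once Ollama is up to get an actual correction.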

Performance Tips for Local AI

Local models are slower than cloud APIs; that's the honest trade-off. Keep things comfortable by unloading models when you're done: run ollama stop followed by the model name to free up RAM. Use smaller models for quick tasks like grammar checks, where Qwen 3.5 4B handles things well and responds faster than Deepseek R1 8B. WindowSill supports per-prompt model selection, so you can assign a fast small model for spell-checking while reserving a larger one for complex rewrites.
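The idea of matching model size to task can be sketched as a simple lookup table. This is an illustration of the concept only, not WindowSill's configuration format: the task names and model tags are assumptions. The unload request relies on Ollama's keep_alive parameter, where a value of 0 asks the server to drop the model from memory right away.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # local address given in the text

# Hypothetical task-to-model table; the tags are assumed and should match what you pulled.
MODEL_FOR_TASK = {
    "spell_check": "deepseek-r1:1.5b",  # small and fast for quick checks
    "rewrite": "deepseek-r1:8b",        # larger model for complex rewrites
}
DEFAULT_MODEL = "deepseek-r1:8b"


def pick_model(task: str) -> str:
    """Return the model assigned to a task, falling back to the default."""
    return MODEL_FOR_TASK.get(task, DEFAULT_MODEL)


def unload_payload(model: str) -> dict:
    """Request body asking Ollama to unload a model immediately (keep_alive of 0)."""
    return {"model": model, "keep_alive": 0}


def unload_model(model: str, base_url: str = OLLAMA_URL, timeout: float = 10.0) -> None:
    """POST the unload request to /api/generate on a running Ollama server."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(unload_payload(model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout):
        pass


for task in ("spell_check", "rewrite", "translate"):
    print(task, "->", pick_model(task))
```

Calling unload_model on the large model after a long rewrite session has the same intent as stopping it from the terminal.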

When Local Isn't Enough

Local models are good but not magic. Cloud still wins in a few cases: long, complex rewrites requiring nuance benefit from GPT 5.5 or Claude 4.5 Sonnet, uncommon language pairs like Finnish-to-Japanese need broader training data, and if you need responses in 1-2 seconds rather than 5-15 seconds on CPU, the cloud has your back. The good news? You don't have to choose one or the other. WindowSill supports both local and cloud providers simultaneously, letting you route sensitive content through Ollama while using a cloud API for non-sensitive tasks.
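One way to think about that split is as a small routing rule: anything sensitive stays local no matter what, and only non-sensitive work is allowed to go to the cloud. The sketch below is purely conceptual. The class, the length threshold, and the provider labels are hypothetical and do not reflect WindowSill settings; in the app you would make this choice through its provider options.

```python
from dataclasses import dataclass

LONG_TEXT_CHARS = 4000  # hypothetical cutoff for a "long, complex" rewrite


@dataclass
class WritingRequest:
    text: str
    sensitive: bool                 # decided by you, not detected automatically
    needs_fast_reply: bool = False


def choose_provider(request: WritingRequest) -> str:
    """Return "ollama" or "cloud" under the rule that sensitive text never leaves the machine."""
    if request.sensitive:
        return "ollama"
    if request.needs_fast_reply or len(request.text) > LONG_TEXT_CHARS:
        return "cloud"
    return "ollama"


examples = [
    WritingRequest("Draft of my performance review.", sensitive=True, needs_fast_reply=True),
    WritingRequest("Release notes for the changelog.", sensitive=False, needs_fast_reply=True),
    WritingRequest("Quick typo check on a tweet.", sensitive=False),
]
for item in examples:
    print(choose_provider(item), "<-", item.text)
```

The sensitivity check comes first on purpose, so that a request for speed can never override the privacy decision.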

Key Takeaways

Ollama runs fully offline once it is installed and a model has been pulled; no internet connection is required after setup
Start with Qwen 3.5 4B or Deepseek R1 8B for most writing tasks; upgrade if needed
Per-prompt model selection lets you balance speed and quality per task
You can mix local (Ollama) and cloud providers in the same workflow

The Bottom Line

This setup won't match the raw capability of frontier models, but it doesn't have to. For grammar fixes, quick rewrites, and anything sensitive—therapist notes, legal drafts, honest feedback—it delivers solid results without the privacy anxiety. If you've been avoiding AI writing tools because you didn't want your data training the next model release, this is a practical off-ramp worth ten minutes of setup time.
