Financial advisors have a dirty secret: they spend almost half their working hours buried in paperwork instead of actually advising clients. The culprit? Compliance documentation. After every client meeting, advisors must document what was discussed, what was recommended, whether those recommendations were suitable under FINRA and SEC rules, and format everything for CRM ingestion. A 45-minute meeting routinely generates 2 hours of this drudgery. Developer Archit Mittal built an open-source tool that collapses that timeline to roughly 3 minutes. The full architecture is now live on GitHub at github.com/archit-akg13/advisor-meeting-notetaker, along with a detailed teardown on DEV.to explaining every architectural decision and trade-off.
The Real Pain Point Isn't What You Think
When Mittal started interviewing advisory firms, he expected complaints about slow meetings or clunky CRM software. Instead, every compliance officer said the same thing: "We're not worried about the notes. We're worried about what's NOT in the notes." If a client says "I can't afford to lose this money" and the advisor recommends an aggressive growth fund hours later from memory, that's a FINRA 2111 suitability violation with no record of the red flag. This insight fundamentally reshaped the entire system design: it's not a transcription tool with formatting, it's a compliance engine that actively hunts for mismatches between what clients say they need and what advisors recommend.
Architecture: A Four-Stage Pipeline
The stack uses Python/FastAPI on the backend, a React frontend, Whisper running locally for audio transcription, and Claude via OpenRouter for structured extraction. The pipeline flows: Audio → Transcription (Whisper) → Structured Extraction (Claude via OpenRouter) → Compliance Check (Rule engine) → CRM Note (Formatter). Two architectural decisions stand out as non-obvious wins. First, Whisper runs locally on-premise rather than sending audio to cloud APIs. Advisory meetings contain PII and legally privileged information; for most firms, transmitting that data externally isn't just undesirable, it's a regulatory non-starter. Second, the compliance engine deliberately does NOT use an LLM. You can't have a probabilistic system making deterministic compliance judgments. The LLM's job ends at extraction; actual compliance checking happens against structured data with hardcoded rules.
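The stage boundaries described above can be sketched as four swappable callables. This is a minimal illustration, not the project's actual code: the `Pipeline` class, the type aliases, and every `fake_*` stub are hypothetical names, and the stubs stand in for a local Whisper model and an OpenRouter call so the example runs without either.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is a plain callable so it can be swapped or stubbed independently.
# All names here are hypothetical, chosen for illustration.
Transcriber = Callable[[bytes], str]       # audio bytes -> transcript text
Extractor = Callable[[str], dict]          # transcript -> structured fields (LLM stage)
ComplianceCheck = Callable[[dict], dict]   # structured fields -> status and flags (rules only)
Formatter = Callable[[dict, dict], str]    # fields + compliance result -> CRM note


@dataclass(frozen=True)
class Pipeline:
    transcribe: Transcriber
    extract: Extractor
    check: ComplianceCheck
    format_note: Formatter

    def run(self, audio: bytes) -> str:
        transcript = self.transcribe(audio)
        fields = self.extract(transcript)   # the LLM's job ends here
        result = self.check(fields)         # deterministic from this point on
        return self.format_note(fields, result)


# Stub stages for illustration only; real stages would keep the same signatures.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_extract(transcript: str) -> dict:
    return {"summary": transcript, "risk_signals": []}


def fake_check(fields: dict) -> dict:
    status = "RED" if fields["risk_signals"] else "GREEN"
    return {"status": status, "flags": []}


def fake_format(fields: dict, result: dict) -> str:
    return f"[{result['status']}] {fields['summary']}"


pipeline = Pipeline(fake_transcribe, fake_extract, fake_check, fake_format)
note = pipeline.run(b"Client asked about retirement income.")
print(note)
```

Keeping the compliance stage behind a signature that only accepts structured data is one way to make the "no LLM past extraction" rule hard to violate by accident.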
Extraction: Hunting for Risk Signals
The LLM receives raw transcript text and returns structured JSON, but the critical field is risk_signals, an array of objects capturing what was said, severity level (low/medium/high), and surrounding context. The system specifically hunts phrases indicating compliance risk: "I can't afford to lose this money" signals risk tolerance mismatch, "my wife doesn't know about this account" flags documentation issues, and "just put it wherever you think is best" raises discretionary authority concerns. Mittal sets the extraction prompt temperature to 0.1: when extracting compliance-relevant data, creative interpretation is the enemy. Most meeting note-takers extract topics and action items; this system extracts the specific phrases that could trigger regulatory scrutiny.
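To make the shape of risk_signals concrete, here is a hedged sketch of parsing and validating the model's JSON reply before anything downstream trusts it. The article names only the field risk_signals and its three pieces of information; the keys `quote`, `severity`, and `context`, the `RiskSignal` class, and `parse_risk_signals` are assumptions made for illustration.

```python
import json
from dataclasses import dataclass

ALLOWED_SEVERITIES = {"low", "medium", "high"}


@dataclass(frozen=True)
class RiskSignal:
    quote: str      # what was said (key name is an assumption)
    severity: str   # "low", "medium", or "high"
    context: str    # surrounding context (key name is an assumption)


def parse_risk_signals(llm_output: str) -> list[RiskSignal]:
    """Parse the risk_signals array and reject unexpected severity values."""
    payload = json.loads(llm_output)
    signals = []
    for item in payload.get("risk_signals", []):
        severity = str(item["severity"]).lower()
        if severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unexpected severity: {item['severity']!r}")
        signals.append(RiskSignal(item["quote"], severity, item["context"]))
    return signals


# A hand-written sample reply, not real model output.
sample = json.dumps({
    "risk_signals": [
        {
            "quote": "I can't afford to lose this money",
            "severity": "high",
            "context": "Client describing retirement savings",
        }
    ]
})
signals = parse_risk_signals(sample)
print(signals[0].severity)
```

Validating at this boundary means a malformed or off-schema reply fails loudly instead of silently reaching the rule engine.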
The Compliance Engine: Rules Over Models
The compliance check code is deliberately simple. That's the point. Risk-averse keywords like "can't afford to lose," "conservative," "safe," and "preserve capital" get cross-referenced against aggressive product mentions including growth funds, equities, crypto, leveraged products, options, and futures. If both appear in the same meeting record without reconciliation, status turns RED with a flag: "SUITABILITY CONCERN: Risk-averse language detected alongside aggressive product recommendations." The Reg BI check catches another common examination finding by verifying whether alternatives were documented when recommending products; if not, status becomes YELLOW. This keyword matching isn't glamorous or sophisticated, but it's exactly what compliance officers need to prevent six-figure fines during audits.
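The two rules described above could look roughly like the following. This is a sketch under stated assumptions rather than the repository's code: the function and class names, the split into client text and recommendation text, the `reconciled` parameter, and the wording of the Reg BI flag are all hypothetical. Only the keyword lists, the statuses, and the suitability flag text come from the article.

```python
from dataclasses import dataclass, field

RISK_AVERSE = ("can't afford to lose", "conservative", "safe", "preserve capital")
AGGRESSIVE = ("growth fund", "equities", "crypto", "leveraged", "options", "futures")


@dataclass
class ComplianceResult:
    status: str = "GREEN"
    flags: list[str] = field(default_factory=list)


def contains_any(text: str, phrases: tuple[str, ...]) -> bool:
    """Case-insensitive substring match against a fixed phrase list."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in phrases)


def check_meeting(client_text: str, recommendation_text: str,
                  alternatives_documented: bool,
                  reconciled: bool = False) -> ComplianceResult:
    result = ComplianceResult()

    # Reg BI: a recommendation with no documented alternatives is YELLOW.
    # (Flag wording below is hypothetical.)
    if recommendation_text.strip() and not alternatives_documented:
        result.status = "YELLOW"
        result.flags.append("REG BI: No alternatives documented for recommendation.")

    # Suitability: risk-averse language alongside aggressive products,
    # with no reconciliation on record, is RED and overrides YELLOW.
    if (contains_any(client_text, RISK_AVERSE)
            and contains_any(recommendation_text, AGGRESSIVE)
            and not reconciled):
        result.status = "RED"
        result.flags.append(
            "SUITABILITY CONCERN: Risk-averse language detected "
            "alongside aggressive product recommendations."
        )
    return result


result = check_meeting(
    "I can't afford to lose this money",
    "Recommend an aggressive growth fund",
    alternatives_documented=True,
)
print(result.status, result.flags)
```

Plain substring matching is the simplest possible form of this check and will over-match (for example, "safe" inside "unsafe", or "options" in "other options"); a production rule set would likely need word-boundary matching and a reviewed phrase list.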
Why Open Source?
Mittal makes a compelling case for releasing the codebase publicly: the code itself isn't the competitive advantage. Any competent developer can chain Whisper → LLM → rule engine in an afternoon. Real value lives in knowing which problem to solve (built from dozens of conversations with compliance officers, not a spec sheet), production hardening for SOC 2 and on-prem deployment, CRM integrations, SSO, audit logging, and ongoing regulatory updates as FINRA changes rules. The moat is domain expertise layered onto solid engineering: understanding that compliance departments already think in traffic-light systems, knowing which phrases trigger suitability concerns, recognizing that PII constraints make local processing non-negotiable for most advisory firms.
Key Takeaways
- Whisper runs locally to protect PII and legally privileged meeting content from cloud API exposure
- Compliance checking uses deterministic rule engines, NOT LLMs; probabilistic systems can't make regulatory determinations
- Traffic-light status (GREEN/YELLOW/RED) maps directly to how compliance departments already categorize risk
- Temperature 0.1 on extraction prompts prevents creative interpretation of compliance-relevant data
- The open-source code is commodity; domain expertise and production hardening are the actual value
The Bottom Line
This project demonstrates a principle more developers should internalize: sometimes the boring solution wins. Keyword matching for compliance flags isn't sexy, but it keeps firms out of regulatory trouble. Before reaching for another LLM wrapper, ask whether your use case actually needs probabilistic reasoning, or whether you need something that does exactly what it says on the tin, every single time.