When the Neuron AI team shipped their official router package, every PHP developer who touched it asked the same question: can you route easy prompts to cheap models and hard ones to strong models? The answer was technically yes, but nobody had solved the hard part: defining "hard" in code that actually works at scale.

The existing workarounds are all embarrassing once you look closely. Routing by character count assumes longer equals harder, which falls apart when a one-line question about Italian contract law is harder than summarizing a ten-page log file. Keyword matching with lists like "legal," "code," or "calculate" survives about a week before you're maintaining an ever-growing dictionary and still missing every phrasing you didn't anticipate. The most honest attempt, asking an LLM to rate difficulty first, actually works, except now you're paying for a model call just to decide whether to make another model call. For something that runs on every single request, that's the wrong trade.

The new neuron-core/llm-classifier package takes a fundamentally different approach. It builds a small classifier that reads an incoming prompt and returns a difficulty score between 0 and 1, where 0 means your models handle it easily and 1 means they struggle. The critical word is "your." This isn't a generic guess about abstract difficulty; it's learned from the specific model lineup you're actually routing between, so the scores reflect what your particular setup finds hard.
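To make the failure modes of the character-count and keyword workarounds concrete, here is a minimal sketch of both heuristics in PHP 8. The function names, the 2,000-character cap, and the sample prompts are illustrative assumptions, not code from any Neuron AI package.

```php
<?php
declare(strict_types=1);

// Hypothetical heuristic: treat longer prompts as harder, capped at 1.0.
function lengthScore(string $prompt, int $maxChars = 2000): float
{
    return min(1.0, mb_strlen($prompt) / $maxChars);
}

// Hypothetical heuristic: a prompt is "hard" if it contains any listed keyword.
function keywordScore(string $prompt, array $keywords): float
{
    $lower = mb_strtolower($prompt);
    foreach ($keywords as $keyword) {
        if (str_contains($lower, $keyword)) {
            return 1.0;
        }
    }
    return 0.0;
}

$keywords = ['legal', 'code', 'calculate'];

// A short prompt that is plausibly hard, and a long one that is plausibly easy.
$legalQuestion = 'Is a verbal lease binding under Italian contract law?';
$logSummary = "Summarize this log:\n" . str_repeat("INFO status code 200 in 12ms\n", 200);

printf("length:  legal=%.2f log=%.2f\n", lengthScore($legalQuestion), lengthScore($logSummary));
printf("keyword: legal=%.2f log=%.2f\n", keywordScore($legalQuestion, $keywords), keywordScore($logSummary, $keywords));
```

With these sample inputs, both heuristics rank the two prompts the wrong way round: the length score treats the log as the hard one, and the keyword list misses "contract law" while firing on "status code."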

Pure PHP, No Sidecars

The package runs entirely in PHP, with ext-mbstring as its only dependency. There's no Python sidecar to deploy, no GPU required, and no inference server sitting next to your application waiting to crash at 3 AM. Training happens once, offline, and produces a single model.bin file that you commit alongside your code. Scoring runs in microseconds, in-process, before any socket opens to an external provider. On every request you get a number, and that number costs nothing extra.

Under the hood, words become numbers using fastText word vectors (300 dimensions), so semantically similar terms like "buy" and "purchase" land close together while unrelated pairs stay far apart. Each prompt is reduced to a single fingerprint, the average of its word vectors, which feeds directly into the classifier. The slice of the fastText dictionary that your training data actually uses gets baked into model.bin at training time, so no fastText file is needed in production.
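The averaging step can be sketched in a few lines of plain PHP. This is a toy illustration under loud assumptions: the vectors are made-up 3-dimensional stand-ins for the 300-dimensional fastText ones, the function names are hypothetical, and the logistic scorer with hand-picked weights is only one plausible classifier, since the text above does not say which classifier the package actually uses.

```php
<?php
declare(strict_types=1);

// Toy 3-dimensional vectors standing in for 300-dimensional fastText vectors.
// All values are invented for illustration.
$vectors = [
    'buy'      => [0.9, 0.1, 0.0],
    'purchase' => [0.8, 0.2, 0.0],
    'contract' => [0.1, 0.9, 0.3],
    'law'      => [0.0, 0.8, 0.5],
];

/** Cosine similarity between two equal-length vectors. */
function cosine(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $x) {
        $dot += $x * $b[$i];
        $normA += $x * $x;
        $normB += $b[$i] * $b[$i];
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

/** Average the vectors of all known words; null when no word is known. */
function fingerprint(string $prompt, array $vectors): ?array
{
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($prompt), -1, PREG_SPLIT_NO_EMPTY);
    $known = [];
    foreach ($tokens as $token) {
        if (isset($vectors[$token])) {
            $known[] = $vectors[$token];
        }
    }
    if ($known === []) {
        return null;
    }
    $avg = array_fill(0, count($known[0]), 0.0);
    foreach ($known as $vec) {
        foreach ($vec as $i => $x) {
            $avg[$i] += $x / count($known);
        }
    }
    return $avg;
}

/** Assumed classifier: logistic function over a weighted sum, result in (0, 1). */
function difficulty(array $fingerprint, array $weights, float $bias): float
{
    $z = $bias;
    foreach ($fingerprint as $i => $x) {
        $z += $weights[$i] * $x;
    }
    return 1.0 / (1.0 + exp(-$z));
}

// Hand-picked weights; a real model would learn these during offline training.
$weights = [-2.0, 3.0, 1.0];
$bias = -1.0;

$easy = difficulty(fingerprint('Buy or purchase?', $vectors), $weights, $bias);
$hard = difficulty(fingerprint('Contract law', $vectors), $weights, $bias);
printf("easy=%.3f hard=%.3f\n", $easy, $hard);
```

Scoring here is a handful of array lookups and multiplications, which is why this kind of model can run in-process without a network call. Unknown words are simply skipped, and a prompt with no known words yields no fingerprint at all.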

RouterBench: Ready-to-Use Training Data

You don't have to assemble your own dataset to get started. The package ships with a ready-to-use dataset derived from the public RouterBench benchmark: a stratified sample of around 1,845 prompts, each with a precomputed difficulty label. Because the labeling is already done, this path needs no model panel, no graders, and zero API calls. You only need the fastText vectors and a few seconds of CPU time.

Integration With DifficultyRule

The router now ships a DifficultyRule wrapper that makes wiring this up straightforward. You pass it the loaded classifier and your provider lineup, then set thresholds: overall() scores under 0.33 route to the cheap tier (like GPT-4o-mini), scores between 0.33 and 0.70 go to the mid-range model, and everything above 0.70 goes to your most capable option. There's also a coverage cut-off, which sets how unfamiliar a prompt must be before you stop trusting the score entirely and default to maximum capability. As far as the author knows, this is the first time a prompt difficulty classifier has been wired into a production framework in pure PHP, and that's the part worth being quietly pleased about.
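The routing decision described above boils down to a few comparisons. The sketch below is a standalone illustration, not the DifficultyRule API: the function name pickTier, the tier labels, the 0.5 coverage default, the reading of coverage as a 0-to-1 value where lower means less familiar, and the handling of scores exactly on a boundary are all assumptions. Only the 0.33 and 0.70 thresholds come from the text.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical tier selection from a difficulty score and a coverage value.
 * Coverage is assumed to be in [0, 1], with low values meaning the prompt
 * looks unfamiliar compared with the training data.
 */
function pickTier(
    float $score,
    float $coverage,
    float $low = 0.33,
    float $high = 0.70,
    float $minCoverage = 0.5
): string {
    // An unfamiliar prompt makes the score untrustworthy: use the strongest model.
    if ($coverage < $minCoverage) {
        return 'strong';
    }
    if ($score < $low) {
        return 'cheap';
    }
    if ($score <= $high) {
        return 'mid';
    }
    return 'strong';
}

// Sample (score, coverage) pairs, invented for illustration.
$samples = [[0.10, 0.9], [0.50, 0.9], [0.85, 0.9], [0.10, 0.2]];
foreach ($samples as [$score, $coverage]) {
    printf("score=%.2f coverage=%.2f -> %s\n", $score, $coverage, pickTier($score, $coverage));
}
```

Checking coverage first is a deliberate choice in this sketch: a low score on a prompt the classifier has never seen anything like is not evidence that the prompt is easy.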

Key Takeaways

  • Difficulty scores are learned from the models you actually route between, not from a generic guess about difficulty, so they reflect what your specific lineup struggles with
  • Training runs once offline; scoring adds microseconds to request latency with no external services and ext-mbstring as the only dependency
  • The package includes RouterBench data so you can have a working model before finishing your coffee
  • Two tuning knobs: difficulty thresholds and coverage cut-off—both adjustable via production logs, not intuition

The Bottom Line

For years, PHP developers routing LLM requests had two choices: blast everything to the expensive model or hand-roll keyword matching that rotted within a week. This package finally gives you a measured answer from your own data at essentially zero cost per request. Train it once, deploy model.bin, and watch your API bill drop while adding only microseconds to user-facing latency.