Open-sourced TRACER.
Many LLM classification calls in production are overkill.
For tasks like intent detection, moderation, tagging, or routing, TRACER learns which requests can be safely offloaded to a lightweight ML model trained on the LLM’s own outputs.
You keep the hard cases on the LLM, set a target quality bar, and offload the easy traffic.
On the right workloads, this can remove 90%+ of LLM calls.
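To make the idea concrete, here is a minimal sketch of confidence-gated offloading in plain Python. It is not TRACER's API or its actual method. The word-vote surrogate and the names `train_surrogate`, `predict`, `pick_threshold`, `route`, and `call_llm` are all hypothetical, chosen only for illustration. It assumes you have already logged request texts together with the labels the LLM gave them.

```python
from collections import Counter, defaultdict

def train_surrogate(texts, llm_labels):
    """Count how often each word co-occurs with each LLM-assigned label."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, llm_labels):
        for word in text.lower().split():
            counts[word][label] += 1
    return counts

def predict(counts, text):
    """Return (label, confidence); confidence is the winning label's vote share."""
    votes = Counter()
    for word in text.lower().split():
        votes.update(counts.get(word, {}))
    if not votes:
        return None, 0.0  # no known words: the surrogate abstains
    label, top = votes.most_common(1)[0]
    return label, top / sum(votes.values())

def pick_threshold(counts, texts, llm_labels, target_agreement):
    """Lowest confidence cutoff at which the offloaded slice of held-out
    traffic agrees with the LLM at least target_agreement of the time."""
    scored = [(predict(counts, t), y) for t, y in zip(texts, llm_labels)]
    for cutoff in sorted({conf for (_, conf), _ in scored}):
        kept = [(pred, y) for (pred, conf), y in scored if conf >= cutoff]
        agreement = sum(pred == y for pred, y in kept) / len(kept)
        if agreement >= target_agreement:
            return cutoff
    return float("inf")  # nothing meets the bar: keep everything on the LLM

def route(counts, cutoff, text, call_llm):
    """Answer with the surrogate when it is confident, else fall back.
    call_llm is a hypothetical callable wrapping your real LLM request."""
    label, conf = predict(counts, text)
    if label is not None and conf >= cutoff:
        return label, "surrogate"
    return call_llm(text), "llm"
```

The quality bar here is agreement with the LLM on held-out traffic, so the surrogate can only be as good as the labels it was trained on. A real setup would use a stronger model than word votes and re-check the cutoff as traffic drifts.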
GitHub:
https://github.com/adrida/tracer