Most “AI Problems” Aren’t AI Problems - Start With First Principles (Then Add AI Where It Helps)
There’s a pattern we see again and again: teams jump straight to model training when the problem can be solved faster, cheaper, and more reliably by clarifying what you actually need the system to do.
Our recent file-system demo is a perfect example. You ask in plain English, the app routes that intent to a deterministic tool (Windows Search or a SQLite cache), and an LLM simply interprets the request and summarises results. No fine-tuning. No GPU burn. Just outcomes.
The lesson scales far beyond “find my PDFs”.
First principles vs first models
Before you reach for a 70B-parameter hammer, ask yourself some simple questions.
- Is this decision deterministic? If a human would write a filter like ext = .pdf AND size > 50MB AND modified < 90d, you probably don’t need training. Use rules, indexes, and APIs (see the sketch after this list).
- Is the task ambiguous, cross-document, or insight-seeking? If a human would skim, compare, reason, and summarise – that’s where AI helps.
- Can we expose the real system of record via tools/APIs? If yes, let the LLM call tools to fetch facts, then reason over them. Don’t try to “memorise” your world inside a model.
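To make the first question concrete, here is a minimal sketch of that filter as a plain SQLite query over a metadata cache. The file_index table and its columns are assumptions for illustration, not the demo’s actual schema; the point is that no model is involved.

```python
import sqlite3
from datetime import datetime, timedelta

def find_large_recent_pdfs(db_path: str, min_size_mb: int = 50, max_age_days: int = 90):
    """Deterministic filter: PDFs over min_size_mb, modified in the last max_age_days."""
    cutoff = (datetime.now() - timedelta(days=max_age_days)).isoformat()
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical cache table: file_index(path, ext, size_bytes, modified_at as ISO-8601)
        return conn.execute(
            """
            SELECT path, size_bytes, modified_at
            FROM file_index
            WHERE ext = '.pdf'
              AND size_bytes > ?
              AND modified_at >= ?
            ORDER BY size_bytes DESC
            """,
            (min_size_mb * 1024 * 1024, cutoff),
        ).fetchall()
    finally:
        conn.close()
```

Rules like this are millisecond-fast, unit-testable, and free to run – exactly the properties you give up if you push the same logic into a model.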
Where AI actually adds value
These are the kinds of prompts that actually need AI because they require interpretation, synthesis, and judgement – not just filtering.
- Prioritised risk summary: “From last quarter’s incident reports, what are the top 3 recurring causes, and which teams own the fixes? Draft Jira tickets with titles and acceptance criteria.”
- Cross-channel customer signal: “Looking at the last 60 days of support emails, account notes, and invoice comments, which high-ARR customers are trending negative sentiment and why? Produce a 7-day outreach plan.”
- Policy compliance synthesis: “Compare our leave-policy PDF to the enterprise award and highlight conflicts by clause. Suggest redlines with rationale.”
- Vendor contract variance: “Across all active MSAs, where do liability and termination clauses differ from our standard? Table the deltas and tag ‘needs legal review’.”
- Ops forecast from free text: “Given these maintenance logs, estimate spare-parts demand next quarter and explain your assumptions.”
The file-system example: right tool, right layer
In the file-system demo, the architecture is simple and boring in the best possible way.
- Deterministic layer (tools/MCP): Windows Search or a lightweight cache gives millisecond filters on name, size, date, and type. Deterministic, testable, cheap.
- LLM layer (reasoning): A small model routes intent to the right tool and explains results in business language. No training needed.
- Optional content layer (RAG): When you must read file contents, embed and retrieve relevant chunks, then let the LLM answer grounded in citations.
This stack gets you most of the value without ever fine-tuning.
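To show what the LLM layer amounts to, here is a rough sketch of intent-to-tool routing, assuming an OpenAI-compatible chat endpoint and reusing the hypothetical file-search helper from the earlier sketch. The model name, tool registry, and prompt are illustrative assumptions, not the demo’s actual code.

```python
import json
from openai import OpenAI  # any OpenAI-compatible endpoint, including a local server

client = OpenAI()  # assumes credentials or a local base_url are already configured

# Deterministic tools the model may route to (find_large_recent_pdfs is the SQLite sketch above).
TOOLS = {
    "search_files": lambda args: find_large_recent_pdfs("file_index.db", **args),
}

ROUTER_PROMPT = (
    'You are a router. Reply with JSON only: {"tool": "<name>", "args": {...}}. '
    "Available tools: search_files(min_size_mb, max_age_days)."
)

def route(user_request: str):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: any small instruction-following model will do
        messages=[
            {"role": "system", "content": ROUTER_PROMPT},
            {"role": "user", "content": user_request},
        ],
    )
    plan = json.loads(resp.choices[0].message.content)  # validate before trusting in production
    return plan, TOOLS[plan["tool"]](plan["args"])
```

A second, equally small call can then turn the raw rows into a business-language summary; neither call requires training.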
When (and how) training makes sense
You don’t need to train a model to filter files. You might train when:
- Output style must be strict and domain-specific, such as underwriting justifications or regulatory submissions.
- You need domain reasoning beyond what prompting and RAG can deliver.
- Latency and cost constraints push you to distil a large prompt or RAG workflow into a smaller model.
Training options, from lightest to heaviest:
- Prompt engineering (zero/few-shot): Fast iteration. Structure the task, specify format, add positive/negative examples.
- MCP / tool orchestration: Don’t “teach” the model facts – teach it where to fetch facts.
- RAG (retrieval-augmented generation): Keep truth in your data, not the weights.
- LoRA / QLoRA fine-tuning: Add narrow skills or style using lightweight adapters (a minimal sketch follows this list).
- Full fine-tuning / distillation: Reserve for mature, high-throughput use cases where you’ve proven everything else.
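For the LoRA option, the shape of the work is roughly the sketch below, using Hugging Face’s peft library. The base model, adapter rank, and target modules are placeholder choices, not recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

BASE_MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Lightweight adapters: the base weights stay frozen; only the small LoRA matrices train.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # adapter rank (placeholder)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the base architecture
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a fraction of a percent of the base model
```

From there the adapter trains in your usual supervised fine-tuning loop on the narrow style or skill dataset – which is exactly the dataset most teams discover they don’t have yet.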
Practical architecture we recommend
In practice, the pattern is straightforward:
- Expose systems as tools or APIs: files, email, CRM, finance, tickets.
- Start with a small local model for intent-to-tool routing and summarisation.
- Add RAG for content-aware answers and citations.
- Instrument quality: hallucination rate, latency, and containment rate (see the sketch after this list).
- Only then consider LoRA or fine-tuning if there is a measurable gap.
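The instrumentation step is deliberately unglamorous: log one structured event per answered request so the three numbers fall out of a query. A minimal sketch, assuming “grounded” and “contained” are judged by your own review and escalation process:

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    question: str
    answer: str
    latency_ms: float
    grounded: bool   # did every claim trace back to a retrieved source?
    contained: bool  # resolved without escalating to a human?

@dataclass
class QualityLog:
    events: list = field(default_factory=list)

    def record(self, event: Interaction) -> None:
        self.events.append(event)

    def summary(self) -> dict:
        n = len(self.events)
        if n == 0:
            return {"requests": 0}
        return {
            "requests": n,
            "p50_latency_ms": sorted(e.latency_ms for e in self.events)[n // 2],
            "hallucination_rate": sum(not e.grounded for e in self.events) / n,
            "containment_rate": sum(e.contained for e in self.events) / n,
        }
```

If those numbers are already good, fine-tuning has nothing to buy you; if they aren’t, you now know exactly which gap you’re paying to close.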
What this looks like in practice
Examples of real questions this stack can handle without training:
- Sales leader: “Who are our expansion-ready customers this month and why?” Tools pull usage, support sentiment, and contract dates. The LLM ranks, explains, and drafts outreach notes.
- Compliance manager: “List policy conflicts by risk severity and suggested fix.” Tools fetch policies and regulations; RAG retrieves passages; the LLM produces a redline brief with citations (see the sketch after this list).
- COO: “Which maintenance sites are trending toward SLA breach, and what parts are gating?” Tools query logs and SLAs; the LLM synthesises and proposes actions.
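For the compliance example, the retrieval step is what keeps the brief grounded. A rough sketch of retrieval plus a citation-forcing prompt, assuming an OpenAI-compatible embeddings endpoint and policy text already split into chunks; the model name and the simple cosine ranking are illustrative assumptions:

```python
import numpy as np
from openai import OpenAI  # any OpenAI-compatible endpoint

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)  # placeholder model
    return np.array([item.embedding for item in resp.data])

def retrieve(question: str, chunks: list[str], k: int = 5):
    """Rank policy/regulation chunks by cosine similarity to the question."""
    q_vec = embed([question])[0]
    c_vecs = embed(chunks)
    scores = c_vecs @ q_vec / (np.linalg.norm(c_vecs, axis=1) * np.linalg.norm(q_vec))
    top = np.argsort(scores)[::-1][:k]
    return [(int(i), chunks[i]) for i in top]

def grounded_prompt(question: str, hits) -> str:
    sources = "\n".join(f"[{i}] {text}" for i, text in hits)
    return (
        "Answer using ONLY the numbered sources below and cite them as [n] after each claim. "
        "If the sources do not cover something, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

The citations give reviewers something to check – which is the whole point of keeping truth in your data rather than in the weights.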
Call to action
If you’re ready to act on AI, not just talk about it, we’ll help you:
- Map your problems to deterministic filters vs real AI.
- Stand up tooling over your systems (files, email, CRM, finance).
- Add prompt engineering and RAG for insight and explainability.
- Prove value fast, then decide if training is worth it.
Let’s build the thing. Reach out to AppGenie and we’ll get your first AI-enabled workflow into production quickly – without lighting money on fire.