Most “AI Problems” Aren’t AI Problems - Start With First Principles (Then Add AI Where It Helps)
There’s a pattern we see again and again: teams jump straight to model training when the problem can be solved faster, cheaper, and more reliably by clarifying what you actually need the system to do.
Our recent file-system demo is a perfect example. You ask in plain English, the app routes that intent to a deterministic tool (Windows Search or a SQLite cache), and an LLM simply interprets the request and summarises results. No fine-tuning. No GPU burn. Just outcomes.
The lesson scales far beyond “find my PDFs”.
First principles vs first models
Before you reach for a 70B-parameter hammer, ask yourself some simple questions.
- Is this decision deterministic? If a human would write a filter like ext = .pdf AND size > 50MB AND modified < 90d, you probably don’t need training. Use rules, indexes, and APIs (see the sketch after this list).
- Is the task ambiguous, cross-document, or insight-seeking? If a human would skim, compare, reason, and summarise – that’s where AI helps.
- Can we expose the real system of record via tools/APIs? If yes, let the LLM call tools to fetch facts, then reason over them. Don’t try to “memorise” your world inside a model.
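To make the first question concrete, here is a minimal sketch of that filter as a plain SQLite query over a metadata cache. The file_index table and its columns are assumptions for illustration, not the demo’s actual schema; the point is that no model is involved.

```python
import sqlite3
from datetime import datetime, timedelta

def find_large_recent_pdfs(db_path: str, min_size_mb: int = 50, max_age_days: int = 90):
    """Deterministic filter: PDFs over min_size_mb, modified in the last max_age_days."""
    cutoff = (datetime.now() - timedelta(days=max_age_days)).isoformat()
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical cache table: file_index(path, ext, size_bytes, modified_at as ISO-8601)
        return conn.execute(
            """
            SELECT path, size_bytes, modified_at
            FROM file_index
            WHERE ext = '.pdf'
              AND size_bytes > ?
              AND modified_at >= ?
            ORDER BY size_bytes DESC
            """,
            (min_size_mb * 1024 * 1024, cutoff),
        ).fetchall()
    finally:
        conn.close()
```

Rules like this are millisecond-fast, unit-testable, and free to run – exactly the properties you give up if you push the same logic into a model.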
Where AI actually adds value
These are the kinds of prompts that actually need AI because they require interpretation, synthesis, and judgement – not just filtering.
- Prioritised risk summary: “From last quarter’s incident reports, what are the top 3 recurring causes, and which teams own the fixes? Draft Jira tickets with titles and acceptance criteria.”
- Cross-channel customer signal: “Looking at the last 60 days of support emails, account notes, and invoice comments, which high-ARR customers are trending negative sentiment and why? Produce a 7-day outreach plan.”
- Policy compliance synthesis: “Compare our leave-policy PDF to the enterprise award and highlight conflicts by clause. Suggest redlines with rationale.”
- Vendor contract variance: “Across all active MSAs, where do liability and termination clauses differ from our standard? Table the deltas and tag ‘needs legal review’.”
- Ops forecast from free text: “Given these maintenance logs, estimate spare-parts demand next quarter and explain your assumptions.”
The file-system example: right tool, right layer
In the file-system demo, the architecture is simple and boring in the best possible way.
- Deterministic layer (tools/MCP): Windows Search or a lightweight cache gives millisecond filters on name, size, date, and type. Deterministic, testable, cheap.
- LLM layer (reasoning): A small model routes intent to the right tool and explains results in business language. No training needed.
- Optional content layer (RAG): When you must read file contents, embed and retrieve relevant chunks, then let the LLM answer grounded in citations.
This stack gets you most of the value without ever fine-tuning.
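To show what the LLM layer amounts to, here is a rough sketch of intent-to-tool routing, assuming an OpenAI-compatible chat endpoint and reusing the hypothetical file-search helper from the earlier sketch. The model name, tool registry, and prompt are illustrative assumptions, not the demo’s actual code.

```python
import json
from openai import OpenAI  # any OpenAI-compatible endpoint, including a local server

client = OpenAI()  # assumes credentials or a local base_url are already configured

# Deterministic tools the model may route to (find_large_recent_pdfs is the SQLite sketch above).
TOOLS = {
    "search_files": lambda args: find_large_recent_pdfs("file_index.db", **args),
}

ROUTER_PROMPT = (
    'You are a router. Reply with JSON only: {"tool": "<name>", "args": {...}}. '
    "Available tools: search_files(min_size_mb, max_age_days)."
)

def route(user_request: str):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: any small instruction-following model will do
        messages=[
            {"role": "system", "content": ROUTER_PROMPT},
            {"role": "user", "content": user_request},
        ],
    )
    plan = json.loads(resp.choices[0].message.content)  # validate before trusting in production
    return plan, TOOLS[plan["tool"]](plan["args"])
```

A second, equally small call can then turn the raw rows into a business-language summary; neither call requires training.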
When (and how) training makes sense
You don’t need to train a model to filter files. You might train when:
- Output style must be strict and domain-specific, such as underwriting justifications or regulatory submissions.
- You need domain reasoning beyond what prompting and RAG can deliver.
- Latency and cost constraints push you to distil a large prompt or RAG workflow into a smaller model.
Training options, from lightest to heaviest:
- Prompt engineering (zero/few-shot): Fast iteration. Structure the task, specify format, add positive/negative examples.
- MCP / tool orchestration: Don’t “teach” the model facts – teach it where to fetch facts.
- RAG (retrieval-augmented generation): Keep truth in your data, not the weights.
- LoRA / QLoRA fine-tuning: Add narrow skills or style using lightweight adapters (a minimal sketch follows this list).
- Full fine-tuning / distillation: Reserve for mature, high-throughput use cases where you’ve proven everything else.
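For the LoRA option, the shape of the work is roughly the sketch below, using Hugging Face’s peft library. The base model, adapter rank, and target modules are placeholder choices, not recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

BASE_MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Lightweight adapters: the base weights stay frozen; only the small LoRA matrices train.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # adapter rank (placeholder)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the base architecture
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a fraction of a percent of the base model
```

From there the adapter trains in your usual supervised fine-tuning loop on the narrow style or skill dataset – which is exactly the dataset most teams discover they don’t have yet.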
Practical architecture we recommend
In practice, the pattern is straightforward:
- Expose systems as tools or APIs: files, email, CRM, finance, tickets.
- Start with a small local model for intent-to-tool routing and summarisation.
- Add RAG for content-aware answers and citations.
- Instrument quality: hallucination rate, latency, and containment rate (see the sketch after this list).
- Only then consider LoRA or fine-tuning if there is a measurable gap.
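The instrumentation step is deliberately unglamorous: log one structured event per answered request so the three numbers fall out of a query. A minimal sketch, assuming “grounded” and “contained” are judged by your own review and escalation process:

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    question: str
    answer: str
    latency_ms: float
    grounded: bool   # did every claim trace back to a retrieved source?
    contained: bool  # resolved without escalating to a human?

@dataclass
class QualityLog:
    events: list = field(default_factory=list)

    def record(self, event: Interaction) -> None:
        self.events.append(event)

    def summary(self) -> dict:
        n = len(self.events)
        if n == 0:
            return {"requests": 0}
        return {
            "requests": n,
            "p50_latency_ms": sorted(e.latency_ms for e in self.events)[n // 2],
            "hallucination_rate": sum(not e.grounded for e in self.events) / n,
            "containment_rate": sum(e.contained for e in self.events) / n,
        }
```

If those numbers are already good, fine-tuning has nothing to buy you; if they aren’t, you now know exactly which gap you’re paying to close.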
What this looks like in practice
Examples of real questions this stack can handle without training:
- Sales leader: “Who are our expansion-ready customers this month and why?” Tools pull usage, support sentiment, and contract dates. The LLM ranks, explains, and drafts outreach notes.
- Compliance manager: “List policy conflicts by risk severity and suggested fix.” Tools fetch policies and regulations; RAG retrieves passages; the LLM produces a redline brief with citations (see the sketch after this list).
- COO: “Which maintenance sites are trending toward SLA breach, and what parts are gating?” Tools query logs and SLAs; the LLM synthesises and proposes actions.
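For the compliance example, the retrieval step is what keeps the brief grounded. A rough sketch of retrieval plus a citation-forcing prompt, assuming an OpenAI-compatible embeddings endpoint and policy text already split into chunks; the model name and the simple cosine ranking are illustrative assumptions:

```python
import numpy as np
from openai import OpenAI  # any OpenAI-compatible endpoint

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)  # placeholder model
    return np.array([item.embedding for item in resp.data])

def retrieve(question: str, chunks: list[str], k: int = 5):
    """Rank policy/regulation chunks by cosine similarity to the question."""
    q_vec = embed([question])[0]
    c_vecs = embed(chunks)
    scores = c_vecs @ q_vec / (np.linalg.norm(c_vecs, axis=1) * np.linalg.norm(q_vec))
    top = np.argsort(scores)[::-1][:k]
    return [(int(i), chunks[i]) for i in top]

def grounded_prompt(question: str, hits) -> str:
    sources = "\n".join(f"[{i}] {text}" for i, text in hits)
    return (
        "Answer using ONLY the numbered sources below and cite them as [n] after each claim. "
        "If the sources do not cover something, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

The citations give reviewers something to check – which is the whole point of keeping truth in your data rather than in the weights.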
Call to action
If you’re ready to act on AI, not just talk about it, we’ll help you:
- Map your problems to deterministic filters vs real AI.
- Stand up tooling over your systems (files, email, CRM, finance).
- Add prompt engineering and RAG for insight and explainability.
- Prove value fast, then decide if training is worth it.
Let’s build the thing. Reach out to AppGenie and we’ll get your first AI-enabled workflow into production quickly – without lighting money on fire.