Most “AI Problems” Aren’t AI Problems - Start With First Principles (Then Add AI Where It Helps)

There’s a pattern we see again and again: teams jump straight to model training when the problem can be solved faster, cheaper, and more reliably by clarifying what you actually need the system to do.

Our recent file-system demo is a perfect example. You ask in plain English, the app routes that intent to a deterministic tool (Windows Search or a SQLite cache), and an LLM simply interprets the request and summarises results. No fine-tuning. No GPU burn. Just outcomes.

The lesson scales far beyond “find my PDFs”.

First principles vs first models

Before you reach for a 70B-parameter hammer, ask a few simple questions.

  • Is this decision deterministic? If a human would write a filter like ext = .pdf AND size > 50MB AND modified < 90d, you probably don’t need training. Use rules, indexes, and APIs (see the sketch after this list).
  • Is the task ambiguous, cross-document, or insight-seeking? If a human would skim, compare, reason, and summarise – that’s where AI helps.
  • Can we expose the real system of record via tools/APIs? If yes, let the LLM call tools to fetch facts, then reason over them. Don’t try to “memorise” your world inside a model.
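To make the first point concrete, here is a minimal sketch of that deterministic path, assuming a small SQLite cache of file metadata (the files table and its columns are hypothetical):

    import sqlite3

    def find_large_recent_pdfs(db_path: str = "file_cache.db"):
        """ext = .pdf AND size > 50MB AND modified in the last 90 days – pure SQL, no model."""
        conn = sqlite3.connect(db_path)
        try:
            return conn.execute(
                """
                SELECT path, size_bytes, modified_at
                FROM files                              -- hypothetical table of indexed file metadata
                WHERE extension = '.pdf'
                  AND size_bytes > 50 * 1024 * 1024     -- > 50 MB
                  AND modified_at >= datetime('now', '-90 days')
                ORDER BY size_bytes DESC
                """
            ).fetchall()
        finally:
            conn.close()

A query like this is deterministic, testable, and answers in milliseconds, with no training run and no inference cost.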
Where AI actually adds value

These are the kinds of prompts that actually need AI because they require interpretation, synthesis, and judgement – not just filtering.

  • Prioritised risk summary: “From last quarter’s incident reports, what are the top 3 recurring causes, and which teams own the fixes? Draft Jira tickets with titles and acceptance criteria.”
  • Cross-channel customer signal: “Looking at the last 60 days of support emails, account notes, and invoice comments, which high-ARR customers are trending negative sentiment and why? Produce a 7-day outreach plan.”
  • Policy compliance synthesis: “Compare our leave-policy PDF to the enterprise award and highlight conflicts by clause. Suggest redlines with rationale.”
  • Vendor contract variance: “Across all active MSAs, where do liability and termination clauses differ from our standard? Table the deltas and tag ‘needs legal review’.”
  • Ops forecast from free text: “Given these maintenance logs, estimate spare-parts demand next quarter and explain your assumptions.”
The file-system example: right tool, right layer

In the file-system demo, the architecture is simple and boring in the best possible way.

  • Deterministic layer (tools exposed via MCP, the Model Context Protocol): Windows Search or a lightweight cache gives millisecond filters on name, size, date, and type. Deterministic, testable, cheap.
  • LLM layer (reasoning): A small model routes intent to the right tool and explains results in business language. No training needed.
  • Optional content layer (RAG): When you must read file contents, embed and retrieve relevant chunks, then let the LLM answer grounded in citations.

This stack gets you most of the value without ever fine-tuning.
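Here is a rough sketch of how the first two layers fit together. The call_llm helper, the tool function, and the JSON routing contract are placeholders invented for illustration, not any particular framework’s API:

    import json

    def call_llm(system_prompt: str, user_content: str) -> str:
        """Placeholder for your model client (e.g. a small local model). Wire a real call in here."""
        raise NotImplementedError

    def search_files(extension: str, min_size_mb: int = 0) -> list:
        """Deterministic tool over the file index/cache, stubbed with a canned result for illustration."""
        return [{"path": "C:/reports/q3-review.pdf", "size_mb": 82}]

    TOOLS = {"search_files": search_files}

    ROUTER_PROMPT = (
        "You are a router. Reply with JSON only, e.g. "
        '{"tool": "search_files", "args": {"extension": ".pdf", "min_size_mb": 50}}'
    )

    def answer(user_request: str) -> str:
        # 1. LLM layer: interpret intent and choose a tool (prompting, not training).
        route = json.loads(call_llm(ROUTER_PROMPT, user_request))
        # 2. Deterministic layer: fetch facts from the real system of record.
        facts = TOOLS[route["tool"]](**route["args"])
        # 3. LLM layer again: explain the results in business language, grounded in those facts.
        return call_llm(
            "Summarise these results for the user.",
            json.dumps({"request": user_request, "results": facts}),
        )

The deterministic tools stay unit-testable on their own, and the model only ever sees facts it just fetched, which is what keeps this stack cheap to run and easy to trust.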

When (and how) training makes sense

You don’t need to train a model to filter files. You might train when:

  • Output style must be strict and domain-specific, such as underwriting justifications or regulatory submissions.
  • You need domain reasoning beyond what prompting and RAG can deliver.
  • Latency and cost constraints push you to distil a large prompt or RAG workflow into a smaller model.

Training options, from lightest to heaviest:

  • Prompt engineering (zero/few-shot): Fast iteration. Structure the task, specify the output format, and add positive/negative examples – a sketch follows this list.
  • MCP / tool orchestration: Don’t “teach” the model facts – teach it where to fetch facts.
  • RAG (retrieval-augmented generation): Keep truth in your data, not the weights.
  • LoRA / QLoRA fine-tuning: Add narrow skills or style using lightweight adapters.
  • Full fine-tuning / distillation: Reserve for mature, high-throughput use cases where you’ve proven everything else.
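To show how light the lightest option is, here is a sketch of a structured zero/few-shot prompt; the task, the output schema, and the examples are invented for illustration:

    FEW_SHOT_PROMPT = """Task: classify each incident report by root cause.
    Output format: JSON with keys "cause" and "owning_team".

    Example (good):
    Report: "Deploy failed because the staging secret had expired."
    Answer: {"cause": "expired credentials", "owning_team": "platform"}

    Example (bad – do not answer like this):
    Answer: "Probably some config thing."

    Report: {report_text}
    Answer:"""

    def build_prompt(report_text: str) -> str:
        # Structure the task, specify the format, and show positive and negative examples.
        return FEW_SHOT_PROMPT.replace("{report_text}", report_text)

Most teams find that iterating on a prompt like this, and measuring the results, answers the training question before any GPUs get involved.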
Practical architecture we recommend

In practice, the pattern is straightforward:

  • Expose systems as tools or APIs: files, email, CRM, finance, tickets.
  • Start with a small local model for intent-to-tool routing and summarisation.
  • Add RAG for content-aware answers and citations (a sketch follows this list).
  • Instrument quality: hallucination rate, latency, and containment (how often requests are resolved without human hand-off).
  • Only then consider LoRA or fine-tuning if there is a measurable gap.
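For the RAG step, here is a minimal retrieve-then-answer sketch with citations. The embed function is a crude stand-in you would replace with a real embedding model and vector store, and the chunk format (a dict with source and text keys) is an assumption:

    import math

    def embed(text: str) -> list:
        """Stand-in embedding: a normalised bag-of-characters vector. Replace with a real embedding model."""
        vec = [0.0] * 64
        for ch in text.lower():
            vec[ord(ch) % 64] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    def retrieve(question: str, chunks: list, k: int = 3) -> list:
        """Rank stored chunks by cosine similarity to the question and keep the top k."""
        q = embed(question)
        return sorted(
            chunks,
            key=lambda c: -sum(a * b for a, b in zip(q, embed(c["text"]))),
        )[:k]

    def grounded_prompt(question: str, top_chunks: list) -> str:
        # Give the model only the retrieved passages and require citations back to their sources.
        context = "\n".join(f"[{c['source']}] {c['text']}" for c in top_chunks)
        return (
            "Answer using only the passages below. Cite sources in [brackets]. "
            "If the passages don't contain the answer, say so.\n\n"
            f"{context}\n\nQuestion: {question}\nAnswer:"
        )

The truth stays in your data: the model only assembles it, explains it, and points back to where it came from.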
What this looks like in practice

Examples of real questions this stack can handle without training:

  • Sales leader: “Who are our expansion-ready customers this month and why?” Tools pull usage, support sentiment, and contract dates. The LLM ranks, explains, and drafts outreach notes.
  • Compliance manager: “List policy conflicts by risk severity and suggested fix.” Tools fetch policies and regulations; RAG retrieves passages; the LLM produces a redline brief with citations.
  • COO: “Which maintenance sites are trending toward SLA breach, and what parts are gating?” Tools query logs and SLAs; the LLM synthesises and proposes actions.
Call to action

If you’re ready to act on AI, not just talk about it, we’ll help you:

  • Map your problems to deterministic filters vs real AI.
  • Stand up tooling over your systems (files, email, CRM, finance).
  • Add prompt engineering and RAG for insight and explainability.
  • Prove value fast, then decide if training is worth it.

Let’s build the thing. Reach out to AppGenie and we’ll get your first AI-enabled workflow into production quickly – without lighting money on fire.