Aquiles


The Evidence Problem: Why AI Legal Research Needs to Start With Your Files

AI legal research tools that rely solely on training data produce unreliable results. The alternative — retrieval-augmented generation grounded in your actual case files — changes the equation entirely.

Aquiles Team

In June 2023, a New York lawyer submitted a brief citing six cases that didn’t exist. His AI tool had generated them — plausible-sounding citations, complete with volume numbers and page references, for cases that were never decided. The court sanctioned him. The story made international headlines. And the legal profession collectively asked: Is AI actually useful for legal work, or just dangerous?

The question was wrong. The right question is: Where does the AI get its information?

The source problem

When you ask a general-purpose AI tool a legal question, it generates a response based on patterns in its training data — the enormous corpus of text it was trained on. This includes legal texts, but the AI isn’t “looking up” cases. It’s predicting what a good answer would look like based on statistical patterns.

This is why fabricated citations happen. The citation format, “Party v. Party, [volume] [reporter] [page] (year),” appears so often in the training data that the AI fills in the blanks with details that sound right but aren’t. It’s not lying. It’s doing exactly what it was designed to do: generate plausible text.

For brainstorming or general research, that’s useful. For legal work, where the source of every claim matters, it’s insufficient.

Retrieval changes the equation

There’s an alternative architecture called retrieval-augmented generation (RAG) that fundamentally changes how AI answers questions. Instead of generating answers from training data alone, a RAG system:

  1. Takes your question and converts it into a semantic representation
  2. Searches your actual documents — the files you uploaded, the case logs you wrote, the exhibits you organized
  3. Retrieves the most relevant passages ranked by meaning, not just keywords
  4. Generates an answer grounded in those passages — citing real evidence from real files
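The four steps above can be sketched in code. This is a toy illustration, not a production pipeline: the file names and contents are invented, `embed` uses filtered word counts where a real RAG system uses neural embeddings, and the final step merely cites passages where a real system would hand them to a language model.

```python
import math
import re
from collections import Counter

# Invented stand-ins for uploaded case files.
DOCUMENTS = {
    "contract.txt": "Delivery was due on March 1 under section 4 of the agreement.",
    "witness_a.txt": "The witness recalled the goods arriving in late April.",
    "case_log.txt": "Client reported the shipment never met the contractual deadline.",
}

STOPWORDS = {"the", "a", "of", "on", "in", "was", "did", "does", "under", "to"}

def embed(text):
    """Step 1: convert text into a (toy) semantic representation.
    Real systems use learned embeddings; word counts are a stand-in."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def similarity(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, k=2):
    """Steps 2-3: rank every passage against the question, keep the top k."""
    q = embed(question)
    ranked = sorted(DOCUMENTS.items(), key=lambda item: similarity(q, embed(item[1])), reverse=True)
    return ranked[:k]

def answer(question):
    """Step 4: a real system would pass the retrieved passages to a language
    model; here we just cite them, so every claim is traceable to a file."""
    cites = "; ".join(f"[{name}] {text}" for name, text in retrieve(question))
    return f"Q: {question}\nGrounded context: {cites}"

print(answer("Does the contract set a delivery deadline?"))
```

The point of the sketch is the shape of the flow, not the scoring: the answer step never sees anything except passages that were actually retrieved from the files.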

The difference is structural: the AI is no longer making things up. It’s synthesizing information from documents you provided and can verify.

Traditional document search matches exact strings. Search for “negligence” and you miss passages about “failure to exercise reasonable care.” RAG uses semantic search — converting text into mathematical representations of meaning, so you find information by concept, not just by word.
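The contrast can be made concrete with a deliberately simplified example. The hand-written concept table below is purely illustrative (real semantic search derives these relationships from learned embeddings, not a lookup table), but it shows why an exact-string match misses what a meaning-based match finds.

```python
# Illustrative only: a real system learns these equivalences from data.
CONCEPTS = {
    "negligence": "duty-of-care",
    "failure to exercise reasonable care": "duty-of-care",
    "breach of contract": "non-performance",
    "failure to perform contractual obligations": "non-performance",
}

def keyword_match(query, passage):
    """Traditional search: the literal string must appear."""
    return query.lower() in passage.lower()

def concept_match(query, passage):
    """Semantic-style search: any phrase mapped to the same concept counts."""
    target = CONCEPTS.get(query.lower())
    return any(c == target and phrase in passage.lower()
               for phrase, c in CONCEPTS.items())

passage = "The report describes a failure to exercise reasonable care."
print(keyword_match("negligence", passage))  # False: no literal match
print(concept_match("negligence", passage))  # True: same legal concept
```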

For legal research, this is transformative. You search for “breach of contract” and also find discussions of “failure to perform contractual obligations.” You search for “proximate cause” and surface passages about “foreseeable consequences.” You find evidence you didn’t know you had.

What this looks like in practice

Imagine you’re preparing for a deposition. You’ve uploaded the contract, prior correspondence, and three witness statements to your workspace. Instead of re-reading everything, you ask:

“Does the defendant’s account of the timeline in their statement contradict the dates in the contract?”

A grounded AI system searches your files, finds the relevant passages, and synthesizes an answer — with references to the specific documents and sections where it found the information. You read the cited passages, verify the analysis, and walk into the deposition with a targeted line of questioning.

No fabricated citations. No confident assertions you can’t trace. Just your evidence, intelligently connected.

The trust equation

The fundamental barrier to AI adoption in legal work isn’t capability — it’s trust. Lawyers are trained to verify everything. With ungrounded AI, verification means re-doing the research yourself, which defeats the purpose.

With grounded AI, verification is built into the workflow. The AI tells you where it found the information. You read the source. You agree or disagree. The AI accelerated your research; it didn’t replace your judgment.

This is the only architecture that respects how lawyers actually work: evidence first, conclusions second.

Going further: self-validating AI

Grounding alone significantly improves accuracy, but the best systems go one step further: multi-pass validation. After generating a response from retrieved documents, a second analysis pass reviews the output against the source material — checking that cited passages actually exist, flagging unsupported claims, and scoring overall accuracy.

Think of it as a built-in self-review. The AI drafts, then critiques its own draft before you ever see it. Fabricated references get caught. Unsupported extrapolations get flagged. What reaches the attorney has already survived a round of pressure-testing.
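One piece of that self-review, the check that cited passages actually exist, can be sketched directly. The bracketed-citation format below is an assumption made for the example, not a description of any particular product’s output:

```python
import re

def validate(answer_text, sources):
    """Second-pass check (sketch): every citation of the assumed form
    [file.txt] "quoted snippet" must name a real file, and the snippet
    must appear verbatim in it; anything else is flagged for review."""
    flags = []
    for name, quote in re.findall(r'\[([^\]]+)\]\s*"([^"]*)"', answer_text):
        if name not in sources:
            flags.append(f"unknown source: {name}")
        elif quote not in sources[name]:
            flags.append(f"unsupported quote from {name}: {quote!r}")
    return flags

sources = {"contract.txt": "Delivery was due on March 1."}
good = 'Per [contract.txt] "due on March 1", the deadline passed.'
bad = 'Per [contract.txt] "due on June 9", the deadline passed.'
print(validate(good, sources))  # []: every citation checks out
print(validate(bad, sources))   # one flag: the quote is not in the file
```

A production validator would also score paraphrased claims against the sources, but even this literal check is enough to catch the fabricated-reference failure mode described above.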

This doesn’t eliminate the need for professional judgment — nothing will — but it raises the floor meaningfully. The difference between a single-pass response and a validated response is the difference between a first-year associate’s draft and one that’s been through a round of self-editing.

Privacy as a prerequisite

For grounded AI to work in legal contexts, the privacy model has to be airtight. Your case files contain privileged communications, trade secrets, and sensitive personal information. The indexing system needs to handle this responsibly.

The right approach: generate search indexes entirely on your machine using local models. Store everything locally. When cloud processing is needed for AI analysis, send only the minimum necessary context, process it in real time with zero retention, and ensure the provider contractually commits to never training on your data.
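A minimal sketch of the “minimum necessary context” idea, with invented passages and a trivial word-overlap ranker standing in for a local semantic model: everything is ranked on the machine, and only the question plus the top few passages appear in the outbound request.

```python
def rank(question, index):
    """Stand-in for a local semantic ranker: score by shared words."""
    q = set(question.lower().split())
    return sorted(index, key=lambda p: len(q & set(p.lower().split())), reverse=True)

def build_cloud_request(question, index, k=2):
    """Only the question and the top-k passages leave the machine;
    the full files and the index itself stay local."""
    return {"question": question, "context": rank(question, index)[:k]}

index = [  # invented passages standing in for a locally built index
    "Delivery was due on March 1.",
    "The witness recalled goods arriving in April.",
    "Unrelated billing note from another matter.",
]
request = build_cloud_request("When was delivery due", index)
print(request["context"])  # the unrelated passage is never sent
```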

No cloud storage of case files. No model training on your documents. No data retention after processing. This isn’t optional for legal tech — it’s a professional obligation.

The bottom line

AI is a powerful tool for legal research, but only when it’s built on the right foundation. The question isn’t whether AI can help lawyers — it’s whether the AI’s answers are grounded in evidence or generated from patterns.

If you’re evaluating AI tools for your practice, ask one question: When the AI gives me an answer, can I trace it back to a specific document in my files? If the answer is no, you’re taking a risk every time you rely on it. If the answer is yes, you have a research assistant that’s actually read the record.


Aquiles is built on retrieval-augmented generation with multi-pass scorecard validation, grounded in your case files, with bank-grade encryption and zero-retention cloud processing. AI outputs are saved as searchable workspace references you build on over time. Request early access to see it work on your own matters.