This article answers the questions you’ll inevitably ask when talking to your IT team or a vendor: how does “regular ChatGPT” differ from a system based on company documents, how much does it cost, and why is it important to distinguish between these architectures in the first place?
Before you sign off on a budget for “AI implementation,” there’s one thing you should know: "AI" is not a single product. There are several architectures – from a simple chat to a system with a full archive and automation. Below: what each one offers, what it doesn't offer, and how much it costs – with no shortcuts like “trust us, we’ll build the AI.”
Why do you need to know this before making a decision?
Because the choice of architecture determines:
- What the system actually does – does it just provide answers, or does it also use your documents and cite sources?
- Cost scale – ranging from a few dozen euros a month to tens of thousands of euros a year.
- Data security – whether the content leaves the company in a controlled manner or remains within a specific region.
- Start time – from one day to several months.
You don't need to know anything about neural networks. You just need to be able to say: “We need architecture X because we have problem Y” – or ask the supplier a tough question. A more comprehensive guide to tools and costs will also help.
1. A simple AI chatbot (question → answer)
How it works: You enter a question. The model responds based on its training, without ongoing access to your files.
Your question → [model] → answer
Examples: ChatGPT, Claude, and Gemini in basic mode.
What it can do: write, translate, summarize, assist with code, numbers, and formatting; analyze an excerpt that you paste into the window.
What it can't do (to put it bluntly): it has no access to the company's entire archive; it does not scan your disk; session-to-session memory can be limited or nonexistent; it may hallucinate – the answer sounds wise, but it’s not always true.
Analogy: a conversation with a person who is generally considered "wise" but who has never seen your contracts.
Cost (order of magnitude): approx. 0–30 EUR per person per month.
When it's enough: writing, translation, brainstorming – without the need to search through hundreds of company documents.
2. Context-sensitive chat (file upload)
How it works: You attach a file (PDF, DOCX) to the conversation. The model answers questions about the contents of this file for the duration of the session.
Question + uploaded file → [model] → answer regarding this file
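A minimal sketch of what this flow amounts to, assuming the tool simply places the file's text next to your question before passing both to the model. The function name `build_prompt` and the `MAX_CHARS` limit are hypothetical and chosen only for illustration; real products apply their own limits and formats.

```python
MAX_CHARS = 2000  # hypothetical size limit; real tools set their own


def build_prompt(question: str, file_name: str, file_text: str) -> str:
    """Combine the question with the uploaded file's text into one prompt."""
    truncated = file_text[:MAX_CHARS]  # text past the limit never reaches the model
    return (
        f"Document '{file_name}':\n{truncated}\n\n"
        f"Answer using only this document.\nQuestion: {question}"
    )


prompt = build_prompt(
    "What is the notice period?",
    "contract.pdf",
    "Either party may terminate with 30 days' notice.",
)
print(prompt)
```

The point of the sketch: the model only sees what fits into this one prompt, which is why file limits matter and why nothing is remembered about files you did not attach.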
Examples: ChatGPT with attachments, Claude with documents, Copilot in Word.
What it can do that a simple chat can't: analysis of a specific contract, comparison of several uploaded files, summaries.
What it can't do: it does not index the entire company database, only what you upload now; there are limits on file size and count; there is no lasting "corporate memory" per document; access control is weak (who can view whose data).
Analogy: someone gets one folder – they'll read it and reply, but they don't have access to the entire archive.
Cost: approx. 20–50 EUR per person per month.
When it's enough: work on individual documents without having to search through hundreds of files.
3. RAG – access to the company's knowledge base
How it works: The documents are added to the database. When asked, the system first searches for relevant excerpts, then the model builds a response based on them, often with references to sources.
Question → search within documents → excerpts → [model] → answer + footnotes
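The flow above can be sketched in a few lines. This is a toy illustration, not a production design: the `DOCS` list, file names, and function names are hypothetical, and the search step ranks excerpts by shared words as a stand-in for whatever search method a real system uses. The model call itself is left out; the sketch stops at the prompt that would be sent.

```python
import re

# Hypothetical mini knowledge base: (file name, page, text)
DOCS = [
    ("contract_acme.pdf", 3, "The notice period for termination is 30 days."),
    ("hr_policy.pdf", 12, "Employees receive 26 days of paid leave per year."),
    ("price_list.pdf", 1, "The standard support package costs 400 EUR per month."),
]


def tokenize(text):
    """Lowercase the text and return its set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(question, top_k=1):
    """Rank excerpts by how many words they share with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(text)), name, page, text) for name, page, text in DOCS]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Drop excerpts with no overlap at all, and drop the score from the result.
    return [item[1:] for item in scored[:top_k] if item[0] > 0]


def build_prompt(question, excerpts):
    """Assemble what would be sent to the model: numbered excerpts, then the question."""
    sources = "\n".join(
        f"[{i}] {name}, p. {page}: {text}"
        for i, (name, page, text) in enumerate(excerpts, start=1)
    )
    return (
        "Answer using only these excerpts and cite them by number.\n"
        f"{sources}\n\nQuestion: {question}"
    )


question = "What is the notice period for termination?"
excerpts = search(question)
print(build_prompt(question, excerpts))
```

Because each excerpt carries its file name and page into the prompt, the answer can point back to them, which is where the footnotes come from.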
Key difference: The model doesn't just "guess" when it comes to your business—it has specific quotes from your files.
What it can do: search through a large number of documents; provide footnotes (document, page); grow the database as new files are added; combine information from multiple sources into a single response.
What RAG itself usually doesn't do: it does not automatically perform actions in external systems (email, CRM) without extensions; answer quality depends on the quality of the documents and the configuration; errors are still possible, so verification remains necessary.
Analogy: an assistant who has gone through the archives and, for each answer, indicates: "This is from file X."
Cost: infrastructure and maintenance often hundreds to thousands of euros per month plus implementation (ranging from a few thousand to tens of thousands of euros) – depending on the scale.
When does this make sense: lots of documents, lots of people, industries where what matters is sources and audit (law, finance, medicine, HR, consulting).
If you're just testing the hypothesis "is this profitable for us?", an MVP approach makes sense – start small, then scale up.
4. AI agents – performing multiple steps
How it works: A single command triggers a plan: searching the database, checking the calendar, drafting a document, preparing it for approval, etc.
What it can do that RAG alone cannot: multi-step workflows, integrations (email, calendar, CRM – within the scope of the project), automation of repetitive processes.
Risks: an agent error can have a greater impact than an error in a chat; what is and isn't allowed must be clearly defined; implementation and testing take longer.
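The multi-step flow and the "what's allowed, what isn't" rule can be sketched together. This is a simplified illustration under stated assumptions: all tool names and the `ALLOWED_WITHOUT_APPROVAL` set are hypothetical, the tools are stand-ins for real integrations, and the plan is fixed by hand, whereas in an actual agent the model would typically propose the steps.

```python
# Hypothetical tools; each stands in for a real integration (database, calendar, email).
def search_database(ctx):
    ctx["client"] = "ACME"
    return "found client record"


def check_calendar(ctx):
    ctx["slot"] = "Tuesday 10:00"
    return "found free slot"


def draft_offer(ctx):
    ctx["draft"] = f"Offer for {ctx['client']}, meeting {ctx['slot']}"
    return "draft prepared"


def send_email(ctx):
    return "email sent"


TOOLS = {
    "search_database": search_database,
    "check_calendar": check_calendar,
    "draft_offer": draft_offer,
    "send_email": send_email,
}

# Rules set by the company: steps the agent may run on its own.
# Everything else needs explicit human approval.
ALLOWED_WITHOUT_APPROVAL = {"search_database", "check_calendar", "draft_offer"}


def run_plan(plan, approved=()):
    """Run steps in order; stop at the first step that lacks permission."""
    ctx, log = {}, []
    for step in plan:
        if step not in ALLOWED_WITHOUT_APPROVAL and step not in approved:
            log.append(f"{step}: waiting for approval")
            break
        log.append(f"{step}: {TOOLS[step](ctx)}")
    return ctx, log


plan = ["search_database", "check_calendar", "draft_offer", "send_email"]
ctx, log = run_plan(plan)
for line in log:
    print(line)
```

Run as is, the first three steps complete and the email step waits for approval; calling `run_plan(plan, approved={"send_email"})` lets it through. Keeping the permission list outside the model's control is one way to limit the impact of an agent error.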
Analogy: RAG is like a librarian with a catalog; an agent is like an assistant who takes care of things – under your supervision and according to your rules.
Cost: often thousands of euros per month + implementation tens of thousands of euros depending on the complexity.
When: usually after a stable RAG or a clearly defined knowledge base is in place; it often goes together with putting sales and operations in order.
5. Fine-tuning – a model “trained” on your data
How it works: The base model is further trained on selected company data (style, common phrases, domain).
Difference from RAG: the knowledge is embedded in the model's "weights," not just in the searchable database; responses can be faster; does not replace footnotes to specific current documents.
Risks and costs: training is expensive; updating knowledge requires additional training cycles; without RAG, it's harder to show a source the way a quote from a PDF does; a large amount of high-quality training data is needed.
When: rarely as the only solution; more often RAG + optionally tailoring the model where it makes business sense.
Summary table
| Feature | Simple chat | Chat + file | RAG | Agents | Fine-tuning |
|---|---|---|---|---|---|
| Knows company documents | No | Only uploaded | Yes (knowledge base) | Yes | "Inside the model" |
| Indicates sources | No | Partially | Yes | Yes | No |
| Database memory between sessions | No | No | Yes | Yes | Yes |
| Performs actions on systems | No | No | Usually not* | Yes | No |
| Cost entry threshold | Low | Low | Medium/high | High | High |
| Complexity of implementation | Low | Low | Medium | High | High |
*Possible extensions beyond pure RAG.
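To make the cost rows concrete, here is a rough first-year estimate. The per-person ranges are the ones quoted in this article; the RAG figures are only one illustrative reading of "hundreds to thousands of euros per month" and "a few thousand to tens of thousands of euros" for implementation, and the team size is hypothetical. Treat the output as an order-of-magnitude sketch, not a quote.

```python
# Per-person monthly price ranges (EUR) quoted in this article.
PER_SEAT = {"simple chat": (0, 30), "chat + file": (20, 50)}

# ASSUMPTION: illustrative values for "hundreds to thousands per month"
# and for implementation "from a few thousand to tens of thousands".
RAG_MONTHLY = (300, 3000)
RAG_IMPLEMENTATION = (3000, 30000)


def first_year_per_seat(option, people):
    """First-year cost range for a per-person subscription."""
    low, high = PER_SEAT[option]
    return (low * people * 12, high * people * 12)


def first_year_rag():
    """First-year cost range for RAG: 12 months of running costs plus implementation."""
    return (
        RAG_MONTHLY[0] * 12 + RAG_IMPLEMENTATION[0],
        RAG_MONTHLY[1] * 12 + RAG_IMPLEMENTATION[1],
    )


people = 20  # hypothetical team size
for option in PER_SEAT:
    print(option, first_year_per_seat(option, people))
print("RAG", first_year_rag())
```

For a team of 20 under these assumptions, the first year comes to 0–7,200 EUR for simple chat, 4,800–12,000 EUR for chat with files, and 6,600–66,000 EUR for RAG. Note that per-seat tools scale with headcount, while the RAG estimate here does not.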
Which architecture to choose?
Typical path:
- Today: Simple Chat / Copilot – the team is learning how to work with models.
- When access to the archive is missing: RAG (often the biggest jump in value in document-heavy companies).
- Optional later: agents – process automation.
You don't have to start with the most expensive variant. Start with the problem: if your main pain is "searching through hundreds of files", the tools guide in the first article of the series and RAG architecture are more important than fine-tuning from day one.
When end customers ask repetitive questions before they reach a salesperson, a similar logic of "answers up front" is implemented by well-designed digital quotations and calculations – a different layer, same direction: less chaos, clearer answers.
Do you want to tailor your architecture to the scale of your documents and compliance requirements? Drop us a line—we’ll walk you through your specific situation without any technical jargon.