What Is an AI, Really? A Non-Technical Guide to Understanding Large Language Models
- Aparajita Sihag
- Mar 20
- 12 min read
Updated: Mar 21
This is the first in a three-part series for professionals who want to build with AI - without needing a computer science degree. By the end of this series, you'll be able to design, build, and deploy AI-powered tools for your organisation.
-----------------------------------------------------------------------------------------------------------------
There's a question that most people are too embarrassed to ask out loud, even as they use ChatGPT daily, sit through AI strategy presentations, and nod along in vendor meetings.
The question is: what is an AI, actually?
Not the marketing version. Not the science fiction version. What is it, physically, technically, and conceptually? What happens between the moment you type a question and the moment an answer appears?
The answer is both simpler and more profound than most people expect. And once you have it, everything else - building chatbots, integrating tools, designing workflows, having intelligent conversations with your IT team - clicks into place with surprising ease.
The Most Honest Explanation of an LLM
When people say "AI" in the context of ChatGPT, Claude, or Gemini, they're referring to a specific kind of technology called a Large Language Model, or LLM.
Here is the most accurate mental model: an LLM is like an incredibly well-read intern who has absorbed a vast portion of the internet - books, articles, code, research papers, conversations - and developed a deep intuition for how language works. Not rules. Not facts stored in a database. Intuition.
When you ask this intern a question, they don't look it up. They draw on pattern recognition built from billions of examples. When you type "The capital of France is ___", the model doesn't consult a map. It has learned, from an almost incomprehensible volume of text, that "Paris" is the most natural completion of that sentence.
This is the core mechanism: LLMs predict the most likely next piece of text, one small chunk at a time.

Those small chunks are called tokens. A token isn't exactly a word - it's more like a fragment of one. "Unbelievable" might be two tokens. "AI" is one. On average, one token is roughly three-quarters of a word. This matters because LLMs have a limit on how much text they can process at once - called the context window - which works like the model's working memory. Feed it too much, and it starts to lose track of what was said earlier in the conversation.
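The "roughly three-quarters of a word" rule of thumb can be turned into a back-of-the-envelope budget check. This is only an illustrative heuristic (about four characters of English per token), not a real tokenizer - production tools use learned subword tokenizers such as OpenAI's tiktoken - and the 8,192-token window below is just an example figure.

```python
# Rough, illustrative token estimator. Real tokenizers split text into
# learned subword units; this sketch uses the ~4-characters-per-token
# rule of thumb mentioned above.

def estimate_tokens(text: str) -> int:
    """Very rough token count: about 1 token per 4 characters of English."""
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_window: int = 8192) -> bool:
    """Check whether a prompt plausibly fits the model's working memory."""
    return estimate_tokens(text) <= context_window

prompt = "The capital of France is"
print(estimate_tokens(prompt))  # a handful of tokens, fewer than you'd guess
```

The practical point: context windows are budgets. A 100-page document can quietly blow past them, which is why "feed it too much and it loses track" is an everyday engineering concern, not an edge case.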
What Is a "Model," Really?
Here's where most explanations go wrong. People imagine an AI as some kind of intelligent server, a program with rules written by hand, or a database of answers. It is none of these things.
A model is, at its most literal, a very large file full of numbers.
These numbers are called parameters or weights, and there are billions, sometimes trillions, of them. GPT-4 is widely reported to have roughly 1.8 trillion (the figure is unconfirmed); Claude is believed to have hundreds of billions. These numbers aren't random. They have been painstakingly adjusted through a process called training until, collectively, they encode something remarkable: an intuition about language, reasoning, and knowledge that rivals - and in some narrow ways surpasses - that of any individual human expert.
When you send a message to an LLM, your text gets converted into numbers, those numbers get passed through the weights file in a series of mathematical operations, and a new set of numbers comes out the other end - which gets converted back into text. That's the response you read.
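That "numbers in, numbers out" pipeline can be cartooned in a few lines. Everything here is made up for illustration - a three-word vocabulary and a single row of weights standing in for billions of parameters - but the final step, turning raw scores into a probability for every possible next token, is the real softmax operation LLMs use.

```python
import math

# A toy "model": a tiny vocabulary and one row of made-up weights,
# standing in for billions of learned parameters.
vocab = ["Paris", "London", "banana"]
weights = [2.0, 0.5, -1.0]  # pretend these were adjusted during training

def next_token_probs(scores):
    """Softmax: turn raw scores into a probability for each token."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = next_token_probs(weights)
best = vocab[probs.index(max(probs))]
print(best)  # the highest-weighted token, "Paris", wins
```

A real model repeats this scoring step once per token it generates, each time over a vocabulary of tens of thousands of tokens rather than three.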

The "model" is physically just a large file - anywhere from ten gigabytes to five hundred gigabytes depending on its size - loaded onto powerful graphics cards (GPUs) that are exceptionally good at the kind of matrix mathematics these calculations require. When you use ChatGPT or Claude through a website, OpenAI and Anthropic are running these files on their own servers and giving you access over the internet. When a company wants the model on their own hardware - more on why that matters shortly - they can download the weights file and run it themselves.
How Does an LLM Learn? The Three Phases of Training
Training is the process that turns a blank weights file into something that can hold a conversation, write code, or analyse a legal document. It happens in roughly three phases - and understanding this will clarify many things about why LLMs behave the way they do.
Phase 1 - Pre-training. The model reads an enormous amount of text: web pages, books, Wikipedia, academic papers, code repositories. As it reads, it repeatedly tries to predict the next token in each piece of text. When it's wrong, the weights get nudged slightly in the right direction. When it's right, they stay. Repeat this billions of times across trillions of words, and the model gradually develops a rich statistical intuition about language and the world it describes. This phase is extraordinarily expensive - it requires thousands of GPUs running for months, and costs tens to hundreds of millions of dollars. It's done once, by the AI company.
Phase 2 - Fine-tuning. The pre-trained model is then exposed to curated examples of high-quality behaviour: well-formed questions with excellent answers, helpful explanations, appropriate refusals. This shapes the model's "personality" and makes it more useful in conversation.
Phase 3 - RLHF (Reinforcement Learning from Human Feedback). Human evaluators compare pairs of responses and vote on which is better. The model is trained to produce responses more like the preferred ones. This is how AI companies instil values like helpfulness, honesty, and safety into the model's behaviour.
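Phase 1's "nudge the weights slightly in the right direction" idea can be cartooned with a single parameter. This is a drastic simplification - real training computes gradients over billions of weights at once - but the shape of the loop (predict, measure error, nudge, repeat) is faithful.

```python
# A cartoon of pre-training. One number stands in for billions of
# weights; the loop predicts, measures the error, and nudges.

weight = 0.0          # "blank" parameter before training
target = 1.0          # the value that would make predictions correct
learning_rate = 0.1   # how big each nudge is

for step in range(100):
    prediction = weight             # toy model: prediction is the weight itself
    error = target - prediction    # how wrong was the guess?
    weight += learning_rate * error  # nudge slightly toward the right answer

print(round(weight, 3))  # after many small nudges, close to the target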

One critical insight: training is a one-time process. When you use the model, you are not retraining it. You're sending your text through fixed weights and getting a prediction back. The model doesn't learn from your conversation. It doesn't remember what you said last week. Any memory features you see in consumer AI products are built on top of the model - not inside it.
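Because the weights are frozen, "memory" in chat products is simulated by the application layer: every turn, the app re-sends the entire conversation so far. A minimal sketch, where `call_model` is a hypothetical placeholder for any real LLM API:

```python
# "Memory" built on top of a stateless model: the application replays
# the full conversation history on every call.

def call_model(messages):
    """Hypothetical stand-in for a real LLM API call."""
    return f"(model saw {len(messages)} messages)"

history = []

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    reply = call_model(history)  # the full history goes in every single time
    history.append({"role": "assistant", "content": reply})
    return reply

chat("What is an LLM?")
print(chat("Summarise that."))  # the second call carries the earlier turns
```

This is also why long conversations eventually degrade: the replayed history grows until it presses against the context window described earlier.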
Temperature: The Dial Nobody Explains Properly
When you start building with LLMs - through APIs or no-code tools - you'll encounter a setting called temperature. It controls one simple thing: how adventurous the model is when choosing the next token.
Technically, the model doesn't pick just one next token - it calculates probabilities for every possible continuation and samples from that distribution. Temperature adjusts the shape of that distribution.
At low temperature (close to 0), the model almost always picks the highest-probability token. The result is consistent, predictable, factual output. An HR policy chatbot answering "what is the notice period?" should run at low temperature - you want the same correct answer every time.
At high temperature (1 and above), the model occasionally picks lower-probability tokens. The result is more creative, varied, and sometimes surprising. A tool that generates training content, brainstorms ideas, or writes personalised messages might benefit from higher temperature.
For most enterprise use cases - anything where accuracy and consistency matter - you want temperature low. For generative or creative tasks, you can nudge it up.
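The temperature dial is easy to see in code. This sketch uses made-up scores for three candidate tokens: dividing the scores by the temperature before the softmax is exactly how the distribution gets sharpened (low temperature) or flattened (high temperature).

```python
import math
import random

random.seed(0)  # fixed seed so the illustration is reproducible

def sample_with_temperature(scores, temperature):
    """Divide scores by temperature, softmax, then sample one index."""
    scaled = [s / max(temperature, 1e-6) for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(scores)), weights=probs)[0]

scores = [3.0, 1.0, 0.2]  # made-up scores for three candidate tokens

# Near-zero temperature: the top-scoring token wins essentially every time.
low = [sample_with_temperature(scores, 0.01) for _ in range(100)]

# High temperature: the distribution flattens and other tokens appear.
high = [sample_with_temperature(scores, 2.0) for _ in range(100)]
```

Run it and the low-temperature samples are uniformly the top token, while the high-temperature samples are a mix - the "consistent policy bot" versus "creative brainstormer" distinction in miniature.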
Hallucination: Why Confident Can Mean Wrong
This is possibly the most important concept in this entire guide for anyone building AI tools in a professional context.
Because LLMs predict what sounds right rather than looking things up, they can - and regularly do - state things that are entirely false with complete confidence. This phenomenon is called hallucination.
An LLM doesn't have a fact-checker running in the background. It has weights that encode patterns. If the pattern fires strongly enough, the model produces the associated text - regardless of whether that text is true. Ask an LLM about a moderately obscure regulation, a specific company's internal policy, or a recent event it hasn't been trained on, and there is a meaningful chance it will fabricate a plausible-sounding answer.
This is not a bug the companies will eventually fix. It is a structural property of how these systems work. The pattern-matching mechanism that makes them so fluent is the same mechanism that makes them capable of confident error.
The implications for enterprise use are significant. An HR chatbot answering questions about leave policy, a legal assistant summarising regulations, a financial tool explaining tax rules - any of these, if built naively, will hallucinate. The fix (which we'll cover in detail in Part 2) is a technique called RAG - Retrieval Augmented Generation - which grounds the model in your actual documents rather than letting it rely on its training.
Not All Models Are the Same
The landscape of LLMs is genuinely competitive, and the right choice depends on your use case. Here's a practical overview of the major families:
GPT-4o (OpenAI) is among the most widely used models in enterprise settings. It's a strong general-purpose reasoner with broad support across tools and integrations. If you're building something that needs to work with the widest range of software, GPT-4o is often the default choice.
Claude (Anthropic) excels at processing long documents, following nuanced instructions, and maintaining appropriate behaviour in sensitive contexts. For HR use cases - policy Q&A, performance feedback, employee communications - Claude's attention to tone and its handling of long, complex documents makes it a particularly strong fit.
Gemini (Google) is deeply integrated with Google Workspace, making it a natural choice for organisations already living in Google Docs, Gmail, and Drive. It also has strong multimodal capabilities, meaning it can work with images and other media alongside text.
Llama / Mistral (open source) are models whose weights have been made publicly available. This means any organisation can download them and run them on their own hardware - without sending data to a third-party server. For organisations with strict data residency requirements (more on this shortly), open-source models are often the only viable path.
One deeply liberating insight: these models are swappable. Because they all accept text in and return text out through a standardised interface, you can build your HR copilot with Claude today and switch to GPT-4o tomorrow with relatively minor changes. You are not locked in. This matters enormously when negotiating with AI vendors.
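Swappability is easy to engineer for. A thin wrapper keeps the rest of your application vendor-neutral; the provider functions below are hypothetical placeholders (a real version would call each vendor's SDK), but the pattern is the point.

```python
# A thin abstraction layer: one call site for the whole application,
# so switching models is a configuration change, not a rewrite.
# Both provider functions are hypothetical stand-ins for real SDK calls.

def call_claude(messages):
    return "reply from Claude"   # imagine Anthropic's SDK being called here

def call_gpt4o(messages):
    return "reply from GPT-4o"   # imagine OpenAI's SDK being called here

PROVIDERS = {"claude": call_claude, "gpt-4o": call_gpt4o}

def ask(provider: str, messages):
    """Route a chat request to whichever model is currently configured."""
    return PROVIDERS[provider](messages)

messages = [{"role": "user", "content": "What is our notice period?"}]
print(ask("claude", messages))  # change one string to switch vendors
```

Building behind an interface like this from day one is what makes the "not locked in" claim true in practice rather than just in principle.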
The Data Privacy Problem Nobody Talks About Clearly
One of the most common questions in enterprise AI discussions goes something like: "Should I worry about uploading sensitive employee data to ChatGPT?" The answer is more nuanced - and more alarming - than most people realise.
There are four distinct risks when you upload sensitive data to any consumer AI platform, and only one of them is about whether your data trains the model.
Risk 1: Data in transit. Your data travels over the internet to the AI company's servers. Even with encryption, this is a different risk profile than data that never leaves your building.
Risk 2: Data at rest. Your data is typically stored on the provider's servers - sometimes for days, sometimes for thirty days, sometimes indefinitely depending on the plan and the settings. "Opting out of training" reduces one use of your data; it doesn't make the data disappear from their infrastructure.
Risk 3: Human review. Most AI companies reserve the right to have human employees review conversations - for safety, quality assurance, or abuse investigation. If an employee uploads a salary spreadsheet to ask Claude a question about it, a human at Anthropic could conceivably read it.
Risk 4: Future training use. Even with the training toggle off, your data may be used in future training cycles depending on the company's privacy policy, their legal jurisdiction, and how that policy evolves over time.
The practical answer for enterprise use: Never send compensation data, employee performance records, unreleased financial information, or anything covered by employment law through a consumer AI interface. For these use cases, the correct architecture is either an enterprise plan with contractual data protection guarantees, or a self-hosted open-source model that processes everything inside your own infrastructure.
This connects directly to the model choice discussion above. If your legal team says employee data cannot leave your servers, open-source models like Llama running on your own hardware are not optional luxuries - they are the only compliant path.
When AI Becomes Genuinely New Knowledge
A question worth pausing on: if LLMs are essentially very sophisticated pattern matchers, how do they sometimes produce things that look like genuine discovery or insight?
The answer is that pattern matching at sufficient scale across enough data produces something qualitatively different from what any individual human can do. Human scientists and analysts are also pattern matchers - we're just slow, siloed, and constrained by how much information any one person can hold simultaneously.
Consider what happened with AlphaFold, DeepMind's protein structure prediction system. For fifty years, predicting how a protein folds from its amino acid sequence was considered one of the hardest problems in biology. AlphaFold didn't simulate the folding process - it learned the statistical relationship between sequences and structures from every known protein structure ever catalogued. In 2022, it predicted structures for virtually every known protein. Drug discovery programmes that would have taken decades are now taking months.
Or consider GNoME, another DeepMind system that discovered 2.2 million new stable crystal structures - materials that could exist but haven't yet been synthesised. The entire previous database of known stable materials contained about 48,000. In one computational run, AI expanded humanity's materials knowledge by a factor of 45.
These aren't loose approximations. They're concrete, checkable outputs - predicted protein structures, candidate new materials - produced by systems whose core mechanism is still, at the computational level, pattern matching.
The implication for enterprise work is worth sitting with. The same principle that enabled AlphaFold to find protein patterns no human biologist was looking for also applies, at smaller scale, to your organisational data. An AI system processing thousands of performance reviews, learning outcomes, and attrition records can surface patterns - "employees who completed this specific course within six months of joining had 40% lower attrition at the two-year mark, but only in client-facing roles" - that no human analyst would have thought to look for because the hypothesis space is too large.
This is why building AI tools on your own company data is more valuable than using generic AI tools. The pattern matching becomes specific to your context, and what it discovers is genuinely yours.
What You Now Know - and What Comes Next
You now have something most people who use AI daily don't have: a genuine mental model of what these systems are and how they work.
You understand that an LLM is a weights file full of numbers, not a database of facts. You understand that it generates text by predicting the most likely next token, not by looking things up. You understand that training happens once and produces a fixed intuition - which is why the model can be confidently wrong, and why it knows nothing about your company. You understand that different models suit different needs and are swappable. And you understand that sending sensitive data to consumer AI platforms carries real risks that "opting out of training" doesn't fully address.
In Part 2, we'll cover the two things that transform this raw capability into something your organisation can actually trust and deploy: APIs (how AI connects to your systems) and prompt engineering (how you shape AI behaviour to match your exact needs). We'll also go deep on RAG - the technique that solves the hallucination problem for enterprise use cases and makes your AI actually know your company.
The foundation is in place. Now let's build on it.
Epilogue
When I decided I needed to learn AI, I did what most people would do.
First, I panicked a little. It felt like I had already missed the bus. Everyone seemed to be talking about AI, building with it, integrating it into their work. And I was still trying to figure out what it actually was.
Then I did what panic usually leads to. I Googled “basics of AI.”
That did not help.
The internet, as expected, gave me everything. Courses, blogs, videos, roadmaps, jargon-heavy explainers. Neural networks. Transformers. Deep learning. Python. APIs. It felt less like learning and more like standing in front of a firehose.
So I shut my laptop.
Lay down.
Stared at the ceiling for a while.
Got up again, because clearly this was not a sustainable strategy.
I tried LinkedIn Learning next. Surely a structured course would help. It did not. There were too many starting points, too many “beginner” courses that assumed you already understood something, too many paths and no clarity on which one actually mattered.
At some point in this process, I also cried. Not dramatically. Just the quiet frustration of knowing this is important and not knowing where to begin.
Then I asked my tech colleagues.
Believe it or not, that made it worse.
Ten people, ten answers. “Start with Python.” “No, understand machine learning first.” “Just build something.” “You need to understand data structures.” “Take Andrew Ng’s course.” “Play with APIs.”
All correct. None useful.
By this point, I was very close to throwing my laptop.
What finally worked was something embarrassingly simple.
I opened an AI tool and wrote one sentence: “I am a non-technical person. I want to understand AI well enough to use it in my work, build simple tools, and speak intelligently with engineers. Teach me step by step.”
That was the first moment things started making sense.
This blog series is an extension of that moment.
If you are in HR, Finance, Sales, or any non-engineering role and you want to understand what AI actually is, what you need to know to use it effectively, and how to work with it without becoming an engineer, this is for you.
This is not a technical guide. It is a clarity guide.



