The Fundamental Idea: Learning From Examples
Traditional computer programs follow instructions written by a human programmer. If you want a program to identify spam emails, you write rules: "If the email contains the words 'free money' and comes from an unknown sender, mark it as spam." This works, but it's brittle. Spammers simply change their wording.
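To make the brittleness concrete, here is a minimal sketch of the rule described above. The function name and the sample messages are made up for illustration; a real filter would have many more rules.

```python
def is_spam_rule_based(email_text: str, sender_known: bool) -> bool:
    """Hand-written rule: 'free money' from an unknown sender is spam."""
    return "free money" in email_text.lower() and not sender_known

original = "Claim your FREE MONEY today!"
reworded = "Claim your fr3e m0ney today!"   # same scam, slightly different spelling

print(is_spam_rule_based(original, sender_known=False))  # True
print(is_spam_rule_based(reworded, sender_known=False))  # False
```

The second message slips through because the rule only matches the exact wording the programmer thought of.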
AI works differently. Instead of writing rules, you collect thousands of real examples of spam emails and legitimate emails, feed them all to the AI, and let the system figure out the patterns on its own. The result is a model that can catch new forms of spam that the programmer never anticipated — because it learned what spam looks like rather than following a checklist.
This approach is called machine learning. It's the foundational idea behind virtually every AI system in use today. The core insight is simple: if you have enough examples, a computer can learn the underlying pattern without anyone explicitly telling it what the pattern is.
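As a rough illustration of learning from examples, the sketch below counts which words appear in spam and in legitimate mail ("ham") and uses those counts to label new messages. The six training messages are invented, and word counting is only one very simple stand-in for what real machine learning systems do.

```python
from collections import Counter

# Hypothetical, hand-made training examples: (text, label).
EXAMPLES = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap pills free shipping", "spam"),
    ("meeting moved to friday", "ham"),
    ("lunch on friday", "ham"),
    ("please review the report", "ham"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Label a message by which class its words were seen in more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    ham_score = sum(counts["ham"][w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

model = train(EXAMPLES)
print(classify(model, "claim your free prize"))          # spam
print(classify(model, "report for the friday meeting"))  # ham
```

Nobody wrote a rule saying "prize" is suspicious; the program picked that up from the examples. Neither test message appears in the training data.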
Training: How AI Models Learn
The process of teaching an AI is called training. Here is how it works, step by step:
Step 1 – Collect Data
You gather a large dataset of examples. For an image recognition AI, this might be millions of labeled photos: "This is a cat," "This is a dog," "This is a car." For a language AI like ChatGPT, this is a very large body of text drawn from web pages, books, and articles, typically hundreds of billions of words or more. The quality and quantity of this data largely determine how good the AI will be.
Step 2 – The Model Makes Guesses
The AI model starts with random internal settings (called parameters or weights). It looks at an example from the training data and makes a prediction. Early in training, these predictions are almost always wrong.
Step 3 – Measure the Error
The training system compares the model's prediction to the correct answer and calculates how wrong it was. This is called the loss.
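Steps 2 and 3 can be sketched with a deliberately tiny model that has just two parameters. Squared error is used here as the loss; that is one common choice, not the only one, and the example values are made up.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# A two-parameter model: prediction = weight * x + bias.
# Both parameters start out random, as described in Step 2.
weight = random.uniform(-1.0, 1.0)
bias = random.uniform(-1.0, 1.0)

def predict(x):
    return weight * x + bias

def squared_error(prediction, target):
    """Loss for one example: zero only when the guess is exactly right."""
    return (prediction - target) ** 2

x, target = 3.0, 7.0   # hypothetical example: input 3 should map to 7
guess = predict(x)
loss = squared_error(guess, target)
print(f"guess={guess:.3f} target={target} loss={loss:.3f}")
```

With random starting parameters the guess lands far from 7, so the loss is large. Training is the process of driving that number down.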
Step 4 – Adjust the Weights
The system works out how much each internal setting contributed to the error, using an algorithm called backpropagation. It then nudges each setting slightly in the direction that would have produced a better answer, a procedure called gradient descent. This step is repeated millions of times across millions of examples.
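A single adjustment can be sketched for a model with one weight. Here the slope of the loss (the gradient) is worked out by hand; in a real network, backpropagation computes it automatically for every weight. The starting weight, example, and learning rate are arbitrary illustrative values.

```python
def loss(w, x, target):
    """Squared error of the one-weight model: prediction = w * x."""
    return (w * x - target) ** 2

def gradient(w, x, target):
    """Derivative of the loss with respect to w, worked out by hand."""
    return 2 * (w * x - target) * x

w = 0.5                 # current weight (arbitrary starting value)
x, target = 2.0, 6.0    # hypothetical example: the ideal weight is 3.0
learning_rate = 0.1     # how large a nudge to take

before = loss(w, x, target)
w = w - learning_rate * gradient(w, x, target)   # step against the slope
after = loss(w, x, target)
print(f"loss before={before:.2f} after={after:.2f} new w={w:.2f}")
```

One step moves the weight from 0.5 toward 3.0 and the loss drops. The learning rate controls the size of the nudge: too small and training is slow, too large and the weight can overshoot.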
Step 5 – Repeat Until Good Enough
This loop (predict → measure error → adjust → repeat) runs over the training data many times; each full pass is called an epoch. Small models may need hundreds or thousands of passes, while very large language models often see most of their data only once or a few times. Gradually, the model's predictions get better. When the error is low enough, or stops improving, training stops. What remains is called a trained model, and it's what you interact with when you use a tool like ChatGPT.
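Putting the steps together, the sketch below trains a two-parameter model on a made-up dataset that follows the rule y = 2x + 1. The dataset, learning rate, and number of passes are illustrative assumptions; real training uses far larger models and data, but the loop has the same shape.

```python
# Hypothetical dataset following the hidden rule y = 2x + 1.
data = [(x, 2 * x + 1) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

w, b = 0.0, 0.0          # start from uninformed settings
learning_rate = 0.02

def mean_loss(w, b):
    """Average squared error over the whole dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

start_loss = mean_loss(w, b)

for epoch in range(2000):                     # each epoch is one pass over the data
    for x, y in data:
        error = (w * x + b) - y               # predict, then measure the error
        w -= learning_rate * 2 * error * x    # adjust the weight
        b -= learning_rate * 2 * error        # adjust the bias

print(f"start loss={start_loss:.3f}")
print(f"w={w:.3f} b={b:.3f} final loss={mean_loss(w, b):.6f}")
```

The model is never told the rule, yet w and b should end up very close to 2 and 1, because those are the values that make the error smallest.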
Neural Networks: The Architecture Behind Modern AI
Most modern AI systems are built on a type of model called a neural network, loosely inspired by the structure of the human brain. A neural network is made up of layers of mathematical nodes (called neurons); each layer processes information and passes its results to the next layer.
How a Simple Neural Network Works
Imagine you're training an AI to recognize whether a photo contains a cat or a dog. The input layer receives the raw pixel values of the image. Hidden layers in the middle detect increasingly complex features — the first layer might detect edges, the next layer might detect shapes like ears and eyes, and deeper layers might detect overall patterns that distinguish cats from dogs. The output layer produces a final probability: "90% cat, 10% dog."
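The flow from input layer to output layer can be sketched in a few lines. This toy network takes four "pixel" values, has one hidden layer of three neurons, and outputs two probabilities. All weights are made up for illustration; a trained network would have learned them, and a real image model would be vastly larger.

```python
import math

def relu(values):
    """Common activation: keep positive signals, zero out negative ones."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up weights: 3 hidden neurons with 4 inputs each, 2 output neurons.
HIDDEN_W = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1], [0.2, 0.1, -0.5, 0.3]]
HIDDEN_B = [0.0, 0.1, -0.1]
OUTPUT_W = [[1.0, -1.0, 0.5], [-1.0, 1.0, 0.5]]
OUTPUT_B = [0.0, 0.0]

pixels = [0.9, 0.1, 0.4, 0.7]                        # input layer
hidden = relu(dense(pixels, HIDDEN_W, HIDDEN_B))     # hidden layer
probs = softmax(dense(hidden, OUTPUT_W, OUTPUT_B))   # output layer

for label, p in zip(["cat", "dog"], probs):
    print(f"{label}: {p:.0%}")
```

Each layer is just multiplication, addition, and a simple function applied to the result. The network's behavior depends entirely on the weight values.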
Deep Learning: Many Layers = More Power
Deep learning refers to neural networks with many hidden layers (often dozens, sometimes hundreds). These deep networks can detect far more complex patterns than shallow networks, and they are behind most of the impressive AI breakthroughs of the past decade, from voice recognition to image generation to language understanding.
The models behind ChatGPT, Claude, and Gemini are called Large Language Models (LLMs): extremely large deep neural networks trained primarily on text. Publicly documented models range from a few billion to hundreds of billions of parameters, although the exact sizes of many commercial models are not published. We cover them in detail in Lesson 3.
Inference: How AI Responds to Your Questions
When you type a message to ChatGPT or generate an image with Midjourney, you're not doing training — you're doing inference. This is the process of using an already-trained model to make predictions on new inputs.
What Happens When You Send a Message to an AI
When you type "Explain climate change in simple terms," here is what happens, simplified:
- Your text is broken into small pieces called tokens (roughly one token per word or word fragment).
- Each token is converted into a numerical representation the model can process.
- The neural network processes these numbers through all its layers, drawing on patterns learned during training from billions of text examples.
- The model assigns a probability to every possible next token, given everything before it, and one of the likely candidates is chosen. Then the next. Then the next, one token at a time.
- Those tokens are reassembled into the text you read as the AI's response.
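The steps above can be sketched with a toy model that only counts which word tends to follow which. This is a heavy simplification: the corpus is a single made-up line, the tokenizer splits on spaces, and the model looks only at the last token, whereas a real LLM uses a neural network that considers everything before it.

```python
from collections import Counter, defaultdict

# Tiny stand-in for training text; real models see vastly more.
CORPUS = "the sky is blue . the sea is blue . the sun is hot ."

def tokenize(text):
    """Toy tokenizer: split on spaces. Real tokenizers use word fragments."""
    return text.lower().split()

tokens = tokenize(CORPUS)

# Map each distinct token to a number the model can work with.
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}

def encode(text):
    return [vocab[tok] for tok in tokenize(text)]

# "Training": count which token follows which.
follows = defaultdict(Counter)
for current, nxt in zip(tokens, tokens[1:]):
    follows[current][nxt] += 1

def generate(prompt, max_new_tokens=3):
    """Repeatedly append the most frequent follower of the last token."""
    out = tokenize(prompt)
    for _ in range(max_new_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break                                    # no known continuation
        out.append(candidates.most_common(1)[0][0])  # greedy pick
        if out[-1] == ".":
            break                                    # end of sentence
    return " ".join(out)                             # reassemble into text

print(encode("the sea"))    # [7, 4]
print(generate("the sea"))  # the sea is blue .
```

Always taking the single most likely token is called greedy decoding. Chat systems usually add some randomness when choosing, which is why the same question can produce different answers.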
The AI is not "thinking" in any meaningful sense. It's performing an extremely sophisticated statistical prediction — estimating, given this input, what tokens are most likely to follow. But because it was trained on so much human text, these predictions often look and feel like genuine understanding.
Why AI Makes Mistakes (And Why That Matters)
Understanding how AI works helps you understand why it sometimes fails. Because AI learns patterns from data — rather than understanding concepts — it can make errors that a human would never make.
Common AI Failure Modes
Hallucination: AI can generate fluent, confident-sounding text that is factually wrong. It's producing what statistically "looks right" based on its training data, not checking against a database of facts. This is why you should always verify AI-generated facts on important matters.
Bias: If the training data contains human biases (and human-generated data almost always does), the AI will tend to learn and reproduce those biases. This is an active area of research and a real concern for high-stakes applications.
Out-of-distribution failure: AI performs well on inputs similar to its training data but can fail badly on unusual inputs it wasn't trained to handle.
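Hallucination in particular can be reproduced with a toy next-word model. In the sketch below, every sentence in the made-up training text is true, yet the model completes a prompt with a fluent false statement because "blue" is the word that most often follows "is". Real language models are far more sophisticated, but the underlying mechanism (producing what is statistically likely rather than what has been checked) is similar in spirit.

```python
from collections import Counter, defaultdict

# Every sentence in this made-up training text is true.
CORPUS = "the sky is blue . the sea is blue . the sun is hot ."

# Count which word follows which.
follows = defaultdict(Counter)
tokens = CORPUS.split()
for current, nxt in zip(tokens, tokens[1:]):
    follows[current][nxt] += 1

def complete(prompt, max_new_tokens=3):
    """Extend the prompt with the most frequent follower of the last word."""
    out = prompt.split()
    for _ in range(max_new_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

answer = complete("the sun", max_new_tokens=2)
print(answer)             # the sun is blue
print(answer in CORPUS)   # False: this claim never appeared in the training text
```

The output is grammatical and confident, and nothing in the model flags it as wrong. That is why verification has to come from outside the model.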
The Practical Takeaway
AI is a powerful tool, not an oracle. Use it to accelerate your work, generate ideas, draft content, and analyze information — but keep your critical thinking engaged. Verify facts. Review AI outputs before using them for important decisions. The best results come from humans and AI working together, each contributing what they do best.