Concepts you can touch and feel.
Skip the 40-page docs. Each explainer turns a complex idea from AI, Claude Code, MCP, or the cloud into an animated diagram you can drag, scrub, and break, so the concept clicks in minutes, not hours.
Three steps. The idea sticks.
Read the analogy in 60 seconds
Every concept starts with a short, clear story. No jargon, no filler: just the mental model you need.
Scrub the live animation
Press play, drag the timeline, or use the arrow keys. Watch each step frame by frame until the flow makes sense.
Push the sliders to the limit
Adjust every parameter. The diagram reacts live, so you feel the trade-offs and remember the limits.
Favorite explainers
AI Cost Optimization: Cutting LLM Bills 80%
Most LLM bills can be cut by 50–90% without quality loss. Caching, model routing, prompt diet, and output caps deliver the bulk of it.
AI Observability: Tracing Every Token in Production
Without traces, every LLM bug is a guess. Capture prompts, tool calls, tokens, costs, and latencies for every request — searchable, filterable, alertable.
LLMOps: MLOps for the LLM Era
LLMOps is the operational discipline of running LLM apps in production — prompts as code, evals on every change, observability, cost, and incident response.
Choose your next concept
Backpropagation: How a Network Actually Learns
Backprop is just credit assignment — blame each parameter for the error, in proportion. Tune learning rate and batch size to see training stabilise or diverge.
Neurons, Layers, and Why Depth Matters
A neuron is a weighted sum followed by a kink. Stack a million in layers and you get a function that approximates almost anything.
Gradient Descent: Rolling Downhill to a Smarter Model
Training is a marble rolling down a wrinkled hill — the loss landscape. Tune learning rate and momentum to see it slide, oscillate, or get stuck.
Fine-Tuning vs RAG: When to Teach, When to Look Up
Fine-tuning changes what the model knows; RAG gives it a reference shelf at query time. Most "make the LLM know our docs" jobs are RAG jobs.
LoRA: Cheap Fine-Tuning Without Touching the Whole Model
LoRA freezes the giant model and trains tiny rank-r adapters next to it. On a 7B-parameter model, that means training roughly 1% of the weights for close to full fine-tuning quality.
Knowledge Distillation: Teaching a Small Model to Imitate a Big One
Distillation trains a small student model to mimic a big teacher's soft outputs. You ship the small one — much cheaper, surprisingly close in quality.
Quantization: Shrinking Models Without Killing Them
Store every weight in 4 bits instead of 16, fit a 70B model on one GPU, and lose almost no quality. Tune precision to feel the trade-off.
KV Cache: Why the Second Token Is Faster Than the First
Without a KV cache, every new token re-computes attention over the whole sequence. With it, you reuse all previous work. This is most of LLM serving.
Batching: How Inference Servers Serve a Thousand Users at Once
GPUs are starved on a single request — most of the chip is idle. Batching packs many requests into one forward pass for huge throughput wins.
Speculative Decoding: A Cheap Model Guessing for an Expensive One
A tiny draft model proposes a short run of tokens (say 5) ahead; the big model verifies them in a single forward pass. Net effect: typically 2–3× faster decoding at identical quality.
Hallucinations: Why LLMs Make Stuff Up Confidently
Hallucinations are not bugs — they are the model doing exactly what it was trained to do. Plausibility is the loss; truth is not. Understand the trap, then engineer around it.
AI Evals: How to Tell If Your Model Is Actually Better
Without evals, "the new prompt feels better" is just vibes. A good eval suite catches regressions before users do — here is how to build one.
Stop reading about it. Start scrubbing.
Stuck on a concept in AI, Claude Code, or the cloud? Tell me what isn't clicking, and I'll ship a free interactive explainer with an analogy, animation, and sliders, usually within a week.