[Interactive bracket — Round 1: Mar 19-20 · Round 2: Mar 21-22 · Sweet 16: Mar 26-27 · Elite 8: Mar 28-29 · Final 4: Apr 4-6. Legend: high confidence (70%+), medium (50-70%), low (<50%), upset pick, leverage pick; name shown = predicted winner.]
[Backtest bracket — Round 1: Mar 20-21 · Round 2: Mar 22-23 · Sweet 16: Mar 27-28 · Elite 8: Mar 29-30 · Final 4: Apr 3-5. Legend: high confidence (70%+), medium (50-70%), low (<50%), upset pick; name shown = predicted winner.]
ABOUT THIS PROJECT
The story behind the AI bracket
AI-powered March Madness bracket predictor built on March 18, 2026. Uses 6 AI models — Claude (Anthropic), GPT-4o (OpenAI), Gemini (Google), Grok (xAI), Llama (Meta), and DeepSeek — in the most diverse multi-model ensemble ever applied to bracket prediction.
6 AI MODELS. 6 COMPANIES. 1 ANSWER.
We gave the same bracket to Claude (Anthropic), GPT-4o (OpenAI), Gemini (Google), Grok (xAI), Llama (Meta), and DeepSeek — and they agreed on 80.6% of picks. This matters.
If you're paying for multiple AI services, you should know: they often produce the same answers. Not because they're right, but because they all learned from the same internet. Model diversity ≠ information diversity. When six independently-developed models from six different companies converge on the same picks, it tells us more about their shared training data than about basketball.
HOW IT WORKS
1. DATA COLLECTION
KenPom ratings, BartTorvik T-Rank, NCAA NET rankings, injury reports, beat writer intel
2. MULTI-MODEL ENSEMBLE
Each matchup is analyzed by all 6 AI models independently: Claude, GPT-4o, Gemini, Grok, Llama, and DeepSeek. Their predictions are compared using ensemble consensus rules. Agreement rate: 80.6%.
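The consensus rules themselves aren't spelled out here, so as a rough sketch: a majority vote per matchup, plus an agreement rate counted as the share of matchups where all six models were unanimous (the published 80.6% could be computed several ways; this is one plausible reading). Function and field names are illustrative, not the project's actual code.

```python
from collections import Counter

MODELS = ["claude", "gpt4o", "gemini", "grok", "llama", "deepseek"]

def consensus(picks: dict[str, str]) -> tuple[str, float]:
    """Majority vote across model picks for one matchup.

    picks maps model name -> predicted winner; returns the
    consensus winner and the fraction of models that agree.
    """
    winner, votes = Counter(picks.values()).most_common(1)[0]
    return winner, votes / len(picks)

def agreement_rate(all_picks: list[dict[str, str]]) -> float:
    """Share of matchups where every model picked the same team."""
    unanimous = sum(1 for p in all_picks if len(set(p.values())) == 1)
    return unanimous / len(all_picks)
```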
3. CONFIDENCE FORMULA
Weighted formula: 55% model strength, 20% source agreement, 15% lineup certainty, 10% data freshness
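The weighted formula is a straight linear blend. Assuming each input is normalized to a 0-1 scale (the page doesn't state the scales), it reduces to:

```python
def confidence(model_strength: float, source_agreement: float,
               lineup_certainty: float, data_freshness: float) -> float:
    """Blend the four inputs (each assumed on a 0-1 scale) using
    the stated weights: 55/20/15/10."""
    return (0.55 * model_strength
            + 0.20 * source_agreement
            + 0.15 * lineup_certainty
            + 0.10 * data_freshness)
```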
4. CROSS-AGENT DEBATE
The 10 most uncertain picks undergo a structured debate where the AI plays devil's advocate against its own predictions. 2 picks were flipped and average confidence dropped 8.4 points, reducing overconfidence.
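A minimal sketch of the selection step and the challenge prompt, assuming "most uncertain" means lowest confidence; the prompt wording and dict fields are hypothetical, not the project's actual text.

```python
def select_for_debate(picks: list[dict], n: int = 10) -> list[dict]:
    """Return the n picks with the lowest confidence — the ones
    most worth re-examining in a devil's-advocate pass."""
    return sorted(picks, key=lambda p: p["confidence"])[:n]

def devils_advocate_prompt(pick: dict) -> str:
    """Build the challenge prompt for one uncertain pick."""
    return (f"You predicted {pick['winner']} over {pick['loser']} "
            f"with {pick['confidence']:.0%} confidence. Argue the "
            f"strongest case for the opposite outcome, then state "
            f"whether you would revise the pick or its confidence.")
```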
5. MONTE CARLO SENSITIVITY
250 simulations testing different weight combinations against 2022-2025 historical results to find optimal calibration.
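A sketch of what such a sensitivity search could look like: sample 250 random weight vectors that sum to 1 and keep the one that scores best against the historical brackets. The sampling scheme and function names are assumptions; `score_fn` stands in for whatever evaluates a weight vector against 2022-2025 results.

```python
import random

def random_weights(rng: random.Random) -> tuple[float, ...]:
    """Sample a random 4-component weight vector that sums to 1
    (Dirichlet-style, via normalized exponentials)."""
    raw = [rng.expovariate(1.0) for _ in range(4)]
    total = sum(raw)
    return tuple(w / total for w in raw)

def sensitivity_search(score_fn, n_sims: int = 250, seed: int = 0):
    """Try n_sims weight combinations; return the best-scoring one.

    score_fn(weights) should evaluate a weight vector against the
    historical brackets and return accuracy in [0, 1].
    """
    rng = random.Random(seed)
    return max((random_weights(rng) for _ in range(n_sims)),
               key=score_fn)
```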
6. HISTORICAL BACKTESTING
Model accuracy scored against actual tournament results from 2022-2025, spanning ultra-chalk (2025) to extreme-chaos (2023) tournaments.
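The scoring itself is simple once brackets are keyed by game: per-year accuracy is the fraction of games where the predicted winner matched the actual winner. A minimal sketch (identifiers are illustrative):

```python
def bracket_accuracy(predicted: dict[str, str],
                     actual: dict[str, str]) -> float:
    """Fraction of games where the predicted winner matched the
    actual winner, keyed by a shared game identifier."""
    games = predicted.keys() & actual.keys()
    hits = sum(predicted[g] == actual[g] for g in games)
    return hits / len(games)

def backtest(predict_fn, tournaments: dict[int, dict[str, str]]) -> dict[int, float]:
    """Score a prediction function against each historical year
    (e.g. 2022-2025) and return per-year accuracy."""
    return {year: bracket_accuracy(predict_fn(year), results)
            for year, results in tournaments.items()}
```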
PREDICTION EVOLUTION
How the champion pick evolved through each stage of the pipeline
Initial run (Claude only): UConn
After ensemble (Claude + GPT-4o): Duke
After cross-agent debate: Duke (2 R64 picks flipped)
Final prediction: Duke over Houston, predicted total score 152
TECH STACK
Python backend with 6 AI models: Anthropic (Claude), OpenAI (GPT-4o), Google (Gemini), xAI (Grok), Meta (Llama via Groq), and DeepSeek. KenPom and BartTorvik for statistical data. GradientBoosting ML for historical calibration. Static HTML dashboard deployed on Vercel. Source code at github.com/elstonj/march-madness-2026.
THE SELDON PARALLEL
"Psychohistory dealt not with man, but with man-masses. It was the science of mobs; globules of the human race... The reaction of one man could be forecast by no known mathematics; the reaction of a billion is something else again." — Isaac Asimov, Foundation
Hari Seldon used psychohistory to predict the behavior of entire civilizations. We're using 6 AI models, 10 years of data, and Monte Carlo simulations to predict 63 basketball games. The math is the same: aggregate enough independent signals and the noise cancels out, leaving the signal. The question Seldon never answered — and neither can we — is what happens when a single individual (a player having the game of their life) overrides the statistical prediction. That's the 8% we can't capture.
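"Aggregate enough independent signals and the noise cancels out" is the standard statistical result that the spread of an average of n independent estimates shrinks like 1/sqrt(n). A toy simulation (not from the project) makes the point concrete:

```python
import random
import statistics

def mean_estimate_spread(n_signals: int, trials: int = 2000,
                         seed: int = 1) -> float:
    """Std-dev of the average of n_signals independent noisy
    estimates of the same quantity (true value 0, noise std 1).
    Theory says this shrinks like 1/sqrt(n_signals)."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(0, 1) for _ in range(n_signals))
             for _ in range(trials)]
    return statistics.pstdev(means)
```

Averaging 16 independent signals should cut the spread roughly fourfold relative to a single signal; the residual spread is the "signal" the noise can't hide.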
"Never let your sense of morals prevent you from doing what is right." — Salvor Hardin
BY THE NUMBERS
6 AI models used, from 6 different companies
500+ API calls made across all models
92.1% ML model accuracy on 10 years of historical data
80.6% model agreement, suggesting convergence rather than diversity
23.9% lineup certainty impact, the #1 prediction driver per Monte Carlo
$22M in Kentucky NIL spending, the most expensive roster and only a 7-seed
THE MONEY QUESTION
Can you buy a championship?
Kentucky spent $22M on basketball NIL, the most in the country, and is a 7-seed.
Last year's champion Florida ranked 77th in NIL spending.
Can you buy a championship? Mark Cuban bought Indiana football one. Basketball might be different — five players touch the ball, chemistry matters, and the tournament's single-elimination format means one bad night ends everything, no matter the payroll.
THE PREDICTION CEILING
Why can't we predict every game correctly?
Every prediction method hits a wall. Here's where the major approaches land:
Chalk: 83%
KenPom: 88%
Our ML: 92%
Ceiling: ~97%
The last 5–8% is genuine chaos: a player having the game of their life, a referee's whistle, a lucky bounce. Not even knowing everything at the molecular level would eliminate all variance in human athletic performance. The tournament isn't broken — it's designed to produce uncertainty. That's why they call it Madness.
FINAL SYNOPSIS
Post-tournament retrospective from all 6 AI models