If you're trying to make sense of the AI gold rush, you've probably heard the term "scaling laws" thrown around. It's the idea that bigger models, trained on more data with more compute, get predictably better. Sounds simple. But the real question isn't whether scaling works; it's how much it costs, how long it will last, and where the ceiling might be. That's where most analysis falls flat, relying on press releases and hype.
For the past few years, I've been using research from a group called Epoch AI to cut through the noise. They're not a company selling chips or models. They're researchers who track, model, and forecast the fundamental trends in AI development. Their work on large-scale models is the closest thing we have to a fact-based map of this uncharted territory. Ignoring it is like investing in oil without looking at geological surveys.
This guide isn't a rehash of their blog. It's my take, as someone who advises funds on tech bets, on how to actually use Epoch AI's data. We'll look at the concrete numbers behind the trends, the mistakes I see investors make, and what the next 18-24 months might realistically hold for anyone with capital on the line.
Who is Epoch AI and Why Should You Care?
Epoch AI is a research collective focused on forecasting the development of artificial intelligence. Their most cited work revolves around analyzing the "scaling laws" of large language models (LLMs) and other AI systems. Unlike corporate labs (OpenAI, Google DeepMind), their goal isn't to build the next GPT. It's to quantify the resources (primarily compute, data, and algorithmic efficiency) required to reach new capabilities.
Think of them as the supply chain analysts for the AI industry. While everyone else is marveling at the shiny new car (the model), Epoch is in the factory, measuring the cost of steel, the efficiency of the assembly line, and the global production rate of tires.
Their methodology is brutally empirical. They scrape papers, technical reports, and even job postings to build datasets on:
- Training Compute: The total floating-point operations (FLOPs) used to train landmark models, from AlexNet to GPT-4.
- Algorithmic Efficiency: How much more performance we can extract from the same compute over time.
- Dataset Sizes: The growth of high-quality text and multimodal data available for training.
- Hardware & Cost Trends: The price-performance curve of AI chips like NVIDIA's GPUs.
By stitching these datasets together, they build time-series models. This lets them ask and answer questions most analysts won't touch: "If current trends hold, when will we run out of high-quality language data?" or "How much would it cost to train a model 10x larger than today's frontier?"
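To make that second question concrete, here's a minimal sketch of the kind of extrapolation this enables. Every constant below is an illustrative assumption rounded from public estimates, not Epoch's published figures:

```python
import math

# Illustrative assumptions, not Epoch's published figures.
DOUBLING_TIME_YEARS = 0.5        # frontier training compute doubles ~every 6 months
FRONTIER_FLOPS_TODAY = 2.0e25    # rough GPT-4-class training run
COST_PER_FLOP_TODAY = 5e-24      # ~$100M / 2e25 FLOPs, back-of-envelope
HW_COST_DECLINE_PER_YEAR = 0.70  # assume price per FLOP falls ~30% per year

def years_until(target_flops: float) -> float:
    """Years until frontier runs reach target_flops, if the doubling trend holds."""
    doublings = math.log2(target_flops / FRONTIER_FLOPS_TODAY)
    return doublings * DOUBLING_TIME_YEARS

def projected_cost_usd(target_flops: float) -> float:
    """Dollar cost of a run of that size, net of falling hardware prices."""
    t = years_until(target_flops)
    return target_flops * COST_PER_FLOP_TODAY * HW_COST_DECLINE_PER_YEAR**t

ten_x = 10 * FRONTIER_FLOPS_TODAY
print(f"10x today's frontier: ~{years_until(ten_x):.1f} years away, "
      f"~${projected_cost_usd(ten_x) / 1e6:.0f}M per run")
```

Under these assumptions, a 10x-frontier run arrives in under two years and costs roughly half a billion dollars. Change any input and the answer moves, which is precisely why Epoch's forecasts are conditional projections rather than prophecy.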
The Non-Consensus View Most Miss
Here's a subtle point almost every summary misses. Epoch's forecasts aren't destiny. They're conditional projections. The headline from their 2023 compute projection report was that training runs could grow 100x every 2-3 years. But buried in the analysis is a crucial dependency: this assumes sustained exponential investment and no major architectural breakthroughs that radically change efficiency.
The biggest mistake I see is treating their extrapolations as guaranteed. Smart money uses them as a baseline scenario to stress-test against. What if investment slows? What if a new algorithm changes the rules? That's where real strategy starts.
The Three Core Trends Driving Large-Scale Models (According to the Data)
Let's break down the trends that actually matter, moving beyond vague statements to the specific numbers Epoch tracks.
1. The Compute Juggernaut: Costs Are Soaring, But So Is Efficiency
Training compute for frontier models has been doubling roughly every 6 months since 2010. That's the famous "Moore's Law for AI." GPT-3 (2020) used an estimated ~3.1e23 FLOPs. Estimates for models like GPT-4 run one to two orders of magnitude higher.
The raw cost is staggering. A single training run for a frontier model today can cost tens to over a hundred million dollars in cloud compute alone. But here's the counter-trend: algorithmic efficiency is also improving. Epoch's data shows we're getting about 2x more performance out of the same compute every 8-9 months. This means a model trained today for $50M might match the performance of a model that would have cost $100M a year ago.
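Here's a hedged sketch of that counter-trend. The 0.7-year doubling time is an assumption within the 8-9 month range cited above:

```python
# Algorithmic efficiency doubling time, assumed at ~8.4 months (0.7 years),
# within the 8-9 month range cited above. Illustrative, not Epoch's exact fit.
ALGO_DOUBLING_YEARS = 0.7

def fixed_budget_multiplier(years: float) -> float:
    """How much more performance a fixed compute budget buys after `years`,
    from algorithmic gains alone."""
    return 2 ** (years / ALGO_DOUBLING_YEARS)

# The $50M-today vs. $100M-a-year-ago comparison from the text:
print(f"{fixed_budget_multiplier(1.0):.1f}x")  # ~2.7x, comfortably covering the 2x claim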
For investors, this creates a brutal dynamic. You're running on a treadmill that's speeding up. Spending a lot doesn't guarantee a lasting lead because your competitors are benefiting from the same efficiency gains.
| Model (Representative) | Estimated Training Compute (FLOPs) | Implied Cloud Cost (Approx.) | Key Efficiency Driver |
|---|---|---|---|
| GPT-3 (2020) | ~3.1e23 | $4-5M | Dense Transformer Scaling |
| Chinchilla (2022) | ~5.8e23 | $7-9M | Optimal Model/Data Ratio |
| GPT-4 Class (2023) | ~2.0e25 (est.) | $50-100M+ | Mixture of Experts, Better Data |
| Projected 2025 Frontier | ~1.0e26 (est.) | $200-500M+ | Scale + Architectural Innovations |
Notice the jump? The table isn't just for show: it illustrates why the barrier to entry is now measured in hundreds of millions, not millions. It also hints that the companies controlling the hardware (NVIDIA, cloud providers) and those with the capital to fund these runs (Big Tech) have a structural advantage.
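If you want to sanity-check the cost column yourself, a rough FLOPs-to-dollars converter looks like this. Throughput, utilization, and hourly price are assumptions pegged to roughly current hardware, so the historical rows (trained on older, pricier chips) won't match exactly:

```python
def training_cost_usd(flops: float,
                      chip_flops_per_sec: float = 3.0e14,  # assumed A100/H100-class throughput
                      utilization: float = 0.40,           # assumed real-world utilization
                      gpu_hour_usd: float = 2.50) -> float: # assumed cloud price per GPU-hour
    """Back-of-envelope cloud cost for a training run of `flops` total FLOPs."""
    gpu_hours = flops / (chip_flops_per_sec * utilization * 3600)
    return gpu_hours * gpu_hour_usd

for name, flops in [("GPT-3", 3.1e23), ("Chinchilla", 5.8e23), ("GPT-4 class", 2.0e25)]:
    print(f"{name}: ~${training_cost_usd(flops) / 1e6:.0f}M at today's prices")
```

The GPT-4-class estimate lands around $100M, consistent with the table; the older models come out cheaper than their historical costs because 2020-era hardware was far less efficient per dollar.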
2. The Data Wall: We're Not Running Out, But Quality is the New Bottleneck
A few years ago, a common panic was "we'll run out of text data on the internet by 2025." Epoch's more nuanced research, like their analysis on data stocks, pushed back on this. The total stock of digital data is still growing. The real constraint is high-quality, machine-learning-ready data.
Most of the obvious, clean English text from books, websites, and code has been ingested. What's left is lower-quality, redundant, or in non-English languages. This forces a shift in strategy:
- Synthetic Data: Models generating their own training data. Promising but unproven at the scale needed to replace internet scraping.
- Multimodal Data: Using images, video, and audio as complementary training signals. This is a huge, less-tapped reservoir.
- Data Curation & Filtering (The Quiet Winner): Analysts such as SemiAnalysis have reported that OpenAI's key advantage with GPT-4 wasn't necessarily more data, but vastly better data filtering and mixing. This is a software and process moat, not a raw resource one.
The investment implication? Betting on companies with unique, hard-to-replicate data pipelines (e.g., GitHub for code, scientific publishers, proprietary video libraries) might be smarter than betting on whoever has the most generic web scrape.
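To see why the data wall bites, combine the data stock with the compute trend. The sketch below uses the Chinchilla rule of thumb (~20 training tokens per parameter, with total compute C ≈ 6·N·D); the token-stock figure is an assumption for illustration, not Epoch's published estimate:

```python
import math

def chinchilla_tokens(compute_flops: float) -> float:
    """Compute-optimal token count: with C = 6*N*D and D = 20*N, C = 120*N^2."""
    n_params = math.sqrt(compute_flops / 120)
    return 20 * n_params

HQ_TOKEN_STOCK = 3e13          # assumed stock of high-quality text (~30T tokens)
COMPUTE_DOUBLING_YEARS = 0.5   # frontier compute doubling time, as above

compute, years = 2.0e25, 0.0   # start from a rough GPT-4-class budget
while chinchilla_tokens(compute) < HQ_TOKEN_STOCK:
    compute *= 2
    years += COMPUTE_DOUBLING_YEARS

print(f"Compute-optimal demand exceeds the assumed stock in ~{years:.1f} years")
```

Because optimal token demand grows with the square root of compute, even a generous stock estimate only buys a few extra years at current compute growth, which is why curation, synthetic data, and multimodal sources matter so much.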
3. Algorithmic Efficiency: The Silent Multiplier
This is the trend that gets the least press but might matter most for mid-tier players. Raw compute scaling is expensive. Getting more out of what you have is the great equalizer.
Epoch tracks this through "compute-equivalent" benchmarks. Their data suggests innovations like the Chinchilla scaling laws (optimizing model size vs. data size), mixture-of-experts architectures (only using parts of a model at once), and better optimizers have delivered consistent, compounding efficiency gains.
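As one concrete example, the Chinchilla result can be reproduced in miniature with the parametric loss from Hoffmann et al. (2022), L(N, D) = E + A/N^α + B/D^β. The fitted constants below are from that paper; the two allocations compared are simplified stand-ins, not exact replications of either training run:

```python
# Fitted constants from Hoffmann et al. (2022); allocations below are illustrative.
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(n_params: float, n_tokens: float) -> float:
    """Parametric training loss L(N, D) = E + A/N^alpha + B/D^beta."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

C = 3.1e23  # roughly GPT-3's compute budget; C ~= 6*N*D

# GPT-3-style allocation: 175B parameters, token count set by the budget
gpt3_style = loss(1.75e11, C / (6 * 1.75e11))

# Chinchilla-style allocation: ~20 tokens per parameter at the same budget
n_opt = (C / 120) ** 0.5
chinchilla_style = loss(n_opt, 20 * n_opt)

print(f"GPT-3-style loss:      {gpt3_style:.3f}")       # ~2.00
print(f"Chinchilla-style loss: {chinchilla_style:.3f}")  # ~1.96, better at equal compute
```

Same compute, different allocation, measurably better model. That's "free" performance, which is exactly what a compute-equivalent benchmark captures.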
Why does this matter for you?
It means a well-funded startup with a brilliant research team can potentially compete with a tech giant's brute-force approach, at least for a while. It creates windows of opportunity. However, these efficiency gains eventually get baked into the baseline. The giants then apply their scale on top of the new, more efficient baseline, widening the gap again. It's a cyclical race.
A Practical Guide for Investors and Strategists
So, you've read the trends. How do you turn this into an investment or business decision? Here's a framework I use, informed by Epoch's data.
Scenario Planning for the Next 3 Years
Don't predict one future. Plan for three, based on how the core trends might interact.
- The Continued Scaling Scenario (Baseline): Investment remains fervent, efficiency gains continue at current rates. Outcome: Frontier model costs hit ~$1B/train by 2026. Only 3-5 entities globally can play at the very frontier. The market consolidates around them. Your move: Identify the likely winners in hardware and the "model-as-a-service" layer. Be wary of pure-play model startups without a clear path to this scale.
- The Efficiency Breakthrough Scenario: A new architecture (beyond transformers) or training method delivers a 10-100x efficiency leap. Outcome: The cost curve flattens or drops suddenly. Incumbents' scale advantage is temporarily nullified. A new wave of startups emerges. Your move: Maintain a broad watch on academic research (not just corporate labs). Allocate a portion of capital to venture bets in novel AI research.
- The Investment Slowdown / Diminishing Returns Scenario: The returns from pure scale start to visibly diminish for commercial applications. Investors get fatigued by the capital burn. Outcome: Growth in training compute slows to doubling every 2-3 years. The focus shifts fiercely to fine-tuning, application, and monetization of existing model families. Your move: Focus on companies with strong product-market fit, distribution, and the ability to use existing models effectively (the "picks and shovels" of AI deployment).
Most portfolios are over-weighted for Scenario 1. A robust strategy allocates across all three.
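One way to keep yourself honest about that allocation is to write the scenarios down as numbers. The growth multipliers below are stylized readings of the three scenarios above, not forecasts:

```python
# Stylized annual growth in frontier training-run cost under each scenario.
SCENARIOS = {
    "continued_scaling":       2.5,  # fervent investment; ~$1B+ runs by 2026
    "efficiency_breakthrough": 0.8,  # cost curve flattens or drops
    "investment_slowdown":     1.3,  # compute doubling slows to every 2-3 years
}

FRONTIER_COST_TODAY_USD = 1.0e8  # assumed ~$100M frontier run today

def frontier_cost(scenario: str, years: float) -> float:
    """Projected frontier run cost if the scenario's growth rate compounds smoothly."""
    return FRONTIER_COST_TODAY_USD * SCENARIOS[scenario] ** years

for name in SCENARIOS:
    print(f"{name}: ~${frontier_cost(name, 3) / 1e6:,.0f}M in 3 years")
```

Three years out, the scenarios diverge by a factor of roughly 30x in frontier run cost. A portfolio priced for only one of those worlds is a bet, not a strategy.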
Specific Due Diligence Questions
When evaluating an AI company claiming to build large-scale models, move beyond the demo. Ask their technical leadership:
- "Based on your target performance, what's your estimated training compute budget? How does that align with Epoch's scaling projections for that capability level?" (Tests their realism)
- "What's your strategy for high-quality data beyond common web crawls? Do you have exclusive partnerships or synthetic data pipelines?" (Tests their data moat)
- "What specific algorithmic efficiencies are you betting on to reduce your cost-per-unit-performance compared to the open-source baseline?" (Tests their technical edge)
If they dismiss these questions or give vague answers about "proprietary technology," that's a red flag. Teams grounded in reality know these are the fundamental constraints.
Common Questions About AI Scaling and Investment
Do I need a technical background to use Epoch's research?
You don't need to understand FLOPs. Focus on the high-level trajectories. Use their charts on compute growth and cost as a reality check. When a startup pitches you a plan to train a "GPT-5 competitor" for $10 million, you can point to the public data showing frontier runs are in the $100M+ range. Ask them to explain the discrepancy. The data gives you a baseline to separate plausible ambition from fantasy.
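To put numbers on that discrepancy check, here's a quick sketch. All figures are assumptions rounded from the public estimates discussed above:

```python
import math

CLAIMED_BUDGET_USD = 1.0e7   # the pitch: a frontier-class model for $10M
FRONTIER_RUN_USD = 1.0e8     # low end of current frontier-run estimates
ALGO_DOUBLING_YEARS = 0.7    # ~2x efficiency every 8-9 months, as above

# How many years of industry-wide efficiency gains would the claim
# implicitly require the startup to have exclusive access to?
gap = FRONTIER_RUN_USD / CLAIMED_BUDGET_USD
years_of_edge = math.log2(gap) * ALGO_DOUBLING_YEARS
print(f"A {gap:.0f}x cost gap implies ~{years_of_edge:.1f} years of private efficiency gains")
```

A 10x gap implies the team is sitting on more than two years of efficiency gains nobody else has. Possible, but that's the extraordinary claim they need to defend.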
Is synthetic data the solution to the data wall?
It might be, but it's a massive unknown. Think of synthetic data like fusion energy: potentially limitless, but perpetually 10 years away from solving our problems at scale. The issue is "model collapse" or "inbreeding": if you only train a model on data generated by previous models, errors and biases can amplify over generations. Epoch's data-driven approach rightly treats synthetic data as an unproven variable at the scale needed. Relying on it in a financial model today is highly speculative.
Does this make chipmakers like NVIDIA a safe bet?
It makes them a very strong, logical investment within the "Continued Scaling" scenario. They are the literal picks and shovels. However, it's a crowded trade, and valuations reflect that. The more interesting, non-consensus angle might be companies working on the complement to raw compute: those specializing in the software for extreme efficiency (better compilers, sparsity management), novel chip architectures designed for specific AI workloads (not just GPUs), or cooling/power solutions for massive data centers. The chipmakers win, but the ecosystem around them might offer higher growth multiples.
How do geopolitics and export controls affect these forecasts?
This is a critical limitation of purely technical forecasts. Epoch's models typically assume technological and economic trends continue. Geopolitical friction is a shock that can bend or break those trends. For example, strict export controls could segment the global AI market, creating a separate, possibly slower-moving scaling trajectory in regions cut off from leading-edge hardware. A savvy strategist must layer these geopolitical risks on top of Epoch's technical baseline. It doesn't invalidate their work; it means you have two layers of analysis: technical potential and political accessibility.