DeepSeek Breaks the AI Paradigm

I’ll address DeepSeek in a moment, but first I want to make a few announcements.

The Intellectual Investor Conference

The Intellectual Investor Conference (formerly known as VALUEx Vail) will be held June 11–13 in Vail. The only thing that has changed is the name. It’s still a not-for-profit, knowledge-sharing event capped at 40 attendees, designed for value investors to get together in wonderful Vail and exchange ideas. You can apply and look at past presentations at https://conference.investor.fm/.

IMA Is Hiring an AI Engineer

IMA is looking to add AI expertise to our team. We’re not 100% certain of the exact skill set, but ideally we want someone with a strong engineering/AI background who also has a mild obsession with value investing (no need to be an investing expert). If you know anyone who is interested, please send them here.

DeepSeek Breaks the AI Paradigm

I’ve received emails from readers asking my thoughts on DeepSeek. I need to start with two warnings. First, the usual one: I’m a generalist value investor, not a technology specialist (last week I was analyzing a bank and an oil company), so my knowledge of AI models is superficial. Second, and more unusually, we don’t have all the facts yet.

But this story could represent a major step change in both AI and geopolitics. Here’s what we know:

DeepSeek—a year-old startup in China that spun out of a hedge fund—has built a fully functioning large language model (LLM) that performs on par with the latest AI models. This part of the story has been verified by the industry: DeepSeek has been tested and compared to other top LLMs. I’ve personally been playing with DeepSeek over the last few days, and the results it spit out were very similar to those produced by ChatGPT and Perplexity—only faster.

This alone is impressive, especially considering that just six months ago, Eric Schmidt (former Google CEO, and certainly no generalist) suggested China was two to three years behind the U.S. in AI.

But here’s the truly shocking—and unverified—part: DeepSeek claims they trained their model for only $5.6 million, while U.S. counterparts have reportedly spent hundreds of millions or even billions of dollars. That’s 20 to 200 times less.
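To put that claim in perspective, here is a quick back-of-the-envelope check (a minimal sketch in Python; the U.S. training budgets used as reference points are illustrative assumptions, since exact figures have not been disclosed):

```python
# Back-of-the-envelope check of the claimed training-cost gap.
# The U.S. training budgets below are illustrative assumptions, not reported figures.
deepseek_cost = 5.6e6  # DeepSeek's claimed training cost, roughly $5.6 million

for us_cost in (1e8, 1e9):  # $100 million and $1 billion as reference points
    print(f"${us_cost:,.0f} / ${deepseek_cost:,.0f} = {us_cost / deepseek_cost:.0f}x")

# Prints roughly 18x and 179x, in the same ballpark as the
# "20 to 200 times" ratio cited above.
```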

The implications, if true, are stunning. Despite the U.S. government’s export controls on AI chips to China, DeepSeek allegedly trained its LLM on older-generation chips, using a small fraction of the computing power and electricity that its Western competitors consume. While everyone assumed that AI’s future lay in faster, better chips—where the only real choice is Nvidia or Nvidia—this previously unknown company has achieved near parity with American counterparts that are swimming in cash and running data centers full of the latest Nvidia chips. DeepSeek (allegedly) faced huge compute constraints and thus had to use different logic, becoming more efficient with subpar hardware to achieve a similar result. In other words, this scrappy startup, in its quest to create a better AI “brain,” used brains where everyone else was focusing on brawn—it literally taught AI how to reason.

Enter the Hot Dog Contest

Americans love (junk) food and sports, so let me explain with a food-sport analogy. Nathan’s Famous International Hot Dog Eating Contest claims 1916 as its origin (though this might be partly legend). By the 1970s, when official records began, winning competitors averaged around 15 hot dogs. That gradually increased to about 25—until Takeru Kobayashi arrived from Japan in 2001 and shattered the paradigm by consuming 50 hot dogs, something widely deemed impossible. His secret wasn’t a prodigious appetite but rather his unique methodology: he separated hot dogs from buns and dunked the buns in water, completely reimagining the approach.

Then a few years later came Joey Chestnut, who built on Kobayashi’s innovation to push the record well beyond 70 hot dogs and up to 83. Once Kobayashi broke the paradigm, the perceived limits vanished, forcing everyone to rethink their methods. Joey Chestnut capitalized on it.

DeepSeek may be the Kobayashi of AI, propelling the whole industry into a “Joey Chestnut” era of innovation. If the claims about using older chips and spending drastically less are accurate, we might see AI companies pivot away from single-mindedly chasing bigger compute capacity and toward improved model design.

I never thought I’d be quoting Stoics to explain future GPU chip demand, but Epictetus said, “Happiness comes not from wanting more, but from wanting what you have.” Two millennia ago, he was certainly not talking about GPUs, but he may as well have been. ChatGPT, Perplexity, and Google’s Gemini will have to rethink their hunger for more compute and see if they can achieve more with wanting (using) what they have.

If they don’t, they’ll be eaten by hundreds of new startups, corporations, and likely governments entering the space. When you start spelling billions with an “M,” you dramatically lower the barriers to entry.

Until DeepSeek, AI was supposed to be within reach of only a few extremely well-funded companies (the “Magnificent Ones”) armed with the latest Nvidia chips. DeepSeek may have broken that paradigm too.

The Nvidia Conundrum

The impact on Nvidia is unclear. On one hand, DeepSeek’s success could decrease demand for its chips and bring its margins back to earth, as companies realize that a brighter AI future might lie not in simply connecting more Nvidia processors but in making models run more efficiently. DeepSeek may have reduced the urgency to build more data centers and thus cut demand for Nvidia chips.

On the other hand (I’m being a two-armed economist here), lower barriers to entry will lead to more entrants and higher overall demand for GPUs. Also, DeepSeek claims that because its model is more efficient, the cost of inference (running the model) is a fraction of the cost of running ChatGPT and requires a lot less memory—potentially accelerating AI adoption and thus driving more demand for GPUs. So this could be good news for Nvidia, depending on how it shakes out.

My thinking on Nvidia hasn’t materially changed—it’s only a matter of time before Meta, Google, Tesla, Microsoft, and a slew of startups commoditize GPUs and drive down prices.

Likewise, more competition means LLMs themselves are likely to become commoditized—that’s what competition does—and ChatGPT’s valuation could be an obvious casualty.

Geopolitical Shockwaves

The geopolitical consequences are enormous. Export controls may have inadvertently spurred fresh innovation, and they might not be as effective going forward. The U.S. might not have the control of AI that many believed it did, and countries that don’t like us very much will have their own AI.

We’ve long comforted ourselves, after offshoring manufacturing to China, by saying that we’re the cradle of innovation—but AI could tip the scales in a direction that doesn’t favor us.

Let me give you an example. In a recent interview with the Wall Street Journal, OpenAI’s product chief revealed that various versions of ChatGPT were entered into programming competitions anonymously. Out of roughly 28 million programmers worldwide, these early models ranked in the top 2–3%. ChatGPT-o1 (the latest public release) placed among the top 1,000, and ChatGPT-o3 (due out in a few months) is in the top 175. That’s the top 0.000625%! If it were a composer, ChatGPT-o3 would be Mozart.
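For the curious, that percentage follows directly from the two numbers quoted above; here is a minimal sanity check in Python using the article’s own figures (roughly 28 million programmers and a top-175 ranking):

```python
# Sanity check of the ranking math quoted from the interview.
programmers_worldwide = 28_000_000  # roughly 28 million programmers
o3_rank = 175                       # reported ranking of the o3 model

percentile = o3_rank / programmers_worldwide * 100
print(f"Top {percentile:.6f}% of programmers worldwide")  # Top 0.000625%
```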

I’ve heard that a great developer is 10x more valuable than a good one—maybe even 100x more valuable than an average one. I’m aiming to be roughly right here. A 19-year-old in Bangalore or Iowa who discovered programming a few months ago can now code like Mozart using the latest ChatGPT. Imagine every young kid, after a few YouTube videos, coding at this level. The knowledge and experience gap is being flattened fast.

I am quite aware that I am drastically generalizing (I cannot stress this enough), but the point stands: The journey from learning to code to becoming the “Mozart of programming” has shrunk from decades to months, and the pool of Mozarts has grown exponentially. If I owned software companies, I’d become a bit more nervous—the moat for many of them has been filled with AI.

Adapting, changing your mind, and holding ideas as theses to be validated or invalidated—not as part of your identity—are incredibly important in investing (and in life in general). They become even more crucial in an age of AI, as we find ourselves stepping into a sci-fi reality faster than we ever imagined. DeepSeek may be that catalyst, forcing investors and technologists alike to question long-held assumptions and reevaluate the competitive landscape in real time.


Key Takeaways

  • DeepSeek, a Chinese startup, has achieved what seemed impossible – creating an LLM that performs on par with top US models while allegedly spending only $5.6 million (20-200x less than US counterparts) and using older-generation chips, potentially breaking the paradigm of “more compute = better AI”
  • Like Kobayashi revolutionizing hot dog eating contests by reimagining the approach rather than just eating more, DeepSeek may have cracked the code by teaching AI to reason more efficiently rather than throwing more computing power at the problem
  • The implications for Nvidia are complex – while this could reduce the urgent demand for latest-gen chips, the lowered barriers to entry could actually increase overall GPU demand as more players enter the space and AI adoption accelerates
  • Geopolitically, this suggests US export controls may have backfired by spurring innovation, and the US might not have the stranglehold on AI development that many assumed – countries that “don’t like us very much” could soon have their own capable AI
  • The AI revolution is flattening knowledge gaps at an unprecedented pace – when tools like ChatGPT can code in the top 0.000625% globally, we’re seeing the traditional moats of software companies and technical expertise being filled with AI faster than anyone imagined

Please read the following important disclosure here.


11 thoughts on “DeepSeek Breaks the AI Paradigm”

  1. Just like with all other products, it looks like China now makes a cheaper, lower-quality copy of an AI product. But some people will care about the brand and quality and will only use ChatGPT, while others will be fine with the made-in-China cheaper, lower-quality copy. Eventually, there’ll be more AI brands, just like with smartphones: an Apple AI, Android AI, Microsoft AI, Facebook AI, ChatGPT, DeepSeek, etc., each with its own cult following and fans. Some will use Nvidia, some will use AMD, some will use Intel, etc.

  2. As a computer engineer, I am really interested in reading your letter on AI. One point you did not mention is that DeepSeek does not keep its system a trade secret but makes it open source instead. That means anyone can download the computer code, run it on their own hardware, modify it, and improve upon it. As if that were not enough, they published a research paper explaining all the internal workings. Imagine all the newly inspired kids who will build upon it or use it as a stepping stone to bring AI to the next level, doing things we never thought of before. So your job as an investor is to look out for the new kid who can run with it.

  3. Well, I am an engineer, so my reference for breaking a paradigm is not hot dog eating contests. I have been expecting something like this to happen for quite a while (admittedly not from a Chinese start-up but from an American one), because what the US AI companies have been doing is a brute-force solution to the problem. Brute-force solutions in technology are typically replaced or obsoleted eventually by smart solutions. What has happened now in AI reminds me of the paradigm shift that happened when the fast Fourier transform was invented, which enabled practical real-time signal processing. Similarly, when sparse matrix inversion was discovered, it enabled much faster network analysis by speeding up matrix inversion substantially. In general, I think the problem with the brute-force AI solution is that you need these huge and expensive resources because the training algorithm looks at all available data on the web. Narrowing the data set to include only relevant and reliable data would immensely reduce the computational need and increase the speed (of course, the trick is how you decide what is relevant and reliable). I am not an AI expert, and the information from DeepSeek is obviously incomplete, but I believe this is at least part of the secret sauce that enabled DeepSeek’s quantum leap in time and their reduction of hardware requirements.

  4. My thoughts on the snide “learn to code” advice given to laid-off auto, steel, mining, etc. workers for years have been, all along: great! The profession that works on making itself obsolete is the one the elitists were touting! Now we’ve begun to see real tangible proof of this. Your statement about the kid in Bangalore or Iowa hits at the heart of many possible outcomes. The one thing I think won’t change very soon is that the plumber still charges $100-plus per hour, and small plumbing company owners who have been in the business for 10+ years are multimillionaires.
    AI isn’t going to replace your toilet anytime soon…

    • The toilet of the current day may not be the toilet of the future. There could be a lot more technology in it; it may not even use water or need a flushing mechanism, eliminating the need for pipes and drainage. Self-healing or self-repairing pipes and construction materials could be developed as well, where things are built by robots and then upgraded or fixed electronically through software.

    • Don’t be so sure. Do you know that in Japan, they have robots to take care of their elderly? So an AI-enabled robot could do plumbing with ease if the market is large enough.

  5. Rather than “AI filling the moat around software companies”, a more apt metaphor for the disruption DeepSeek represents is the scene in “World War Z” where zombies climb and overwhelm the fortress walls in Jerusalem at superhuman speed.

  6. Not surprised that the Chinese (or anyone) figured out a way around the cost and availability issues with chips re AI. If someone wants something badly enough, they will either invent something, steal it, or make a better version of what they have. When the 4-minute mile was deemed an impossibility, Roger Bannister ran the first sub-4-minute mile; seven weeks later his record was broken… Once perceived barriers are removed, progress is swift, as you know.

  7. Thanks so much for such a lucid and entertaining (as usual) explication of DeepSeek and its effects on AI stocks. I’ve owned the CEF, STK and it just took a big hit given its tech focus. I think I’ll hold on for now and it does generate regular income.

  8. I am reminded of this story from the Space Race. While the U.S. spent lots of money to have an ink pen developed that would function in zero gravity, the Soviet Union just used a pencil. The U.S. astronauts also had to have a car to tour the moon. As a child, I happily remember drinking plenty of Tang, the powdered orange drink of the astronauts, to get the toy moon buggy that was a premium from Tang so long ago.

    We in the United States have a lengthy history of substituting money and horsepower for results that might have been achieved with less of both. A Pinto or Vega got you from point A to point B just as quickly, when driving the posted speed limit, as a Mustang or Camaro. However, the Pinto and Vega were derided as cheap transportation, and ultimately dropped by their respective manufacturers, while the Mustang and Camaro have remained objects of lustful admiration for generations.

