
A research paper published by Chinese artificial intelligence firm DeepSeek has revealed that the company trained its advanced reasoning model, DeepSeek-R1, for just $294,000. This figure stands in stark contrast to the multi-million-dollar training costs typically associated with large language models developed by leading U.S. rivals. The study, which appeared in the prestigious journal Nature, details how the startup utilized slightly more than 500 of Nvidia’s H800 GPUs to achieve this breakthrough, potentially reshaping perceptions of China’s competitive stance in the global AI race.
The astonishingly low cost claim has not been without controversy. Earlier this year, several prominent figures in the technology sector, including Scale AI CEO Alexandr Wang, publicly questioned the feasibility of such an achievement.
Tesla CEO Elon Musk appeared to endorse these doubts, suggesting that DeepSeek had undisclosed access to a much larger reservoir of computing power, around 50,000 of Nvidia's H100 chips, which the company could not openly acknowledge because stringent U.S. export controls restrict the sale of such advanced chips to China.
In a significant disclosure within a supplementary document, DeepSeek acknowledged for the first time that it also employed Nvidia’s A100 GPUs during preliminary development phases. The company stated that these chips were used to run experiments on a smaller, 30-billion-parameter model, the promising results of which gave them the confidence to scale up to the full 660-billion-parameter R1 model.
The paper’s methodology centers on a novel approach to enhancing AI reasoning. DeepSeek researchers demonstrated that a model’s reasoning capabilities can be significantly advanced using pure reinforcement learning (RL), a machine learning technique where an AI learns optimal decisions through trial and error.
This process eliminates the traditional and costly dependency on vast datasets of human-labeled examples. The resulting DeepSeek-R1 model exhibited superior performance in verifiable domains like mathematics, coding competitions, and STEM fields, outperforming models trained through conventional supervised learning.
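The core idea of training on verifiable rewards can be illustrated with a toy sketch (this is an illustrative assumption, not DeepSeek's actual training code): a policy samples candidate answers, an automatic verifier checks correctness, and a REINFORCE-style update nudges the policy toward answers that earn reward, with no human-labeled examples involved.

```python
import math
import random

# Toy illustration of RL with a verifiable reward (not DeepSeek's code).
# The "policy" is a softmax over candidate answers to a fixed question;
# the verifier simply checks the sampled answer against the ground truth.
CANDIDATES = [10, 11, 12, 13]   # candidate answers to "what is 5 + 7?"
CORRECT = 12                    # the verifier's ground truth

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATES)  # start from a uniform policy
    for _ in range(steps):
        probs = softmax(logits)
        i = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
        # Verifiable reward: 1 if the sampled answer is correct, else 0.
        reward = 1.0 if CANDIDATES[i] == CORRECT else 0.0
        # REINFORCE gradient for a softmax policy:
        # d log pi(i) / d logit_j = (indicator(j == i) - probs[j])
        for j in range(len(logits)):
            grad = ((1.0 if j == i else 0.0) - probs[j]) * reward
            logits[j] += lr * grad
    return softmax(logits)

probs = train()
best = CANDIDATES[max(range(len(probs)), key=probs.__getitem__)]
print(best)  # the policy concentrates on the verifiably correct answer
```

The key property this sketch shares with the approach described in the paper is that the learning signal comes entirely from an automated correctness check (as in mathematics or coding tasks), not from curated human demonstrations.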
The announcement of the R1 model in January sent ripples through financial markets, triggering a sharp but temporary decline in the market capitalization of AI-related stocks, including industry bellwether Nvidia. DeepSeek CEO Liang Wenfeng, who is listed as an author on the paper, has positioned the company at the forefront of a more efficient and accessible path to advanced artificial intelligence.