From Sora to Seedance: How China Conquered the AI Video Revolution
*The boundary between physical production and algorithmic generation has dissolved. In July 2026, AI-generated video is no longer a demonstration; it is an industry.*
The Present Moment: A Feature Film at Cannes, Made by Algorithms
On the evening of May 24, 2026, the Grand Théâtre Lumière at the Cannes Film Festival screened something that would have been impossible two years earlier. *HELL GRIND*, a 95-minute feature film, premiered to a full house of critics, distributors, and curious filmmakers. The opening credits carried no cinematographer. No location scout. No grip team. The film was generated entirely by artificial intelligence—specifically, by ByteDance's Seedance 2.0 video model, with post-production handled by a team of eleven people working from three cities.
The screening was not a sideshow. It was a market screening, scheduled during the festival's busiest week, with acquisition executives from Netflix, Amazon, and Tencent Video in attendance. Within 48 hours, distribution rights for Southeast Asia had been sold. The film cost an estimated $180,000 to produce. A comparable live-action genre film with practical production would have cost between $3 million and $8 million.
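The gap between those two budgets is easier to judge as a ratio. The following is a minimal sketch of the arithmetic, using only the estimates quoted above; the function and constant names are illustrative and do not come from any production budget tool.

```python
def savings_fraction(ai_cost: float, traditional_cost: float) -> float:
    """Return the share of the traditional budget that is saved."""
    return 1 - ai_cost / traditional_cost


AI_COST = 180_000                            # estimated HELL GRIND budget, USD
LIVE_ACTION_RANGE = (3_000_000, 8_000_000)   # quoted live-action range, USD

for traditional in LIVE_ACTION_RANGE:
    ratio = traditional / AI_COST            # how many AI budgets fit in one shoot
    saved = savings_fraction(AI_COST, traditional)
    print(f"${traditional:,}: {ratio:.1f}x the AI budget, {saved:.1%} saved")
```

On these estimates, a live-action version would cost roughly 17 to 44 times as much, a saving of about 94% to 98%. The comparison covers production budgets only; it says nothing about marketing, distribution, or the value of the finished film.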
"We are not replacing filmmakers," said the project's director, a former commercial director from Beijing who asked to be credited only as "Studio 7." "We are replacing the parts of production that were always inefficient. Location permits, weather delays, equipment rental, union minimums for crew sizes that were historically determined by the physics of cameras, not by the needs of the story."
The Cannes premiere was a cultural milestone. But the business milestone arrived a month later, on June 23, at the Volcano Engine FORCE Conference in Beijing. ByteDance's cloud computing division announced that Seedance 2.5 would enter general availability in early July 2026. The upgrade was not incremental. Native 4K resolution. 30-second single-segment video generation. Built-in 3D pre-visualization tools. A localized editing module that maintains visual consistency across shots. And a unified reference system capable of ingesting up to 50 multimodal assets (images, video clips, audio tracks, and text descriptions) into a single coherent output.
The enterprise platform, launched less than a year earlier, had already reached $2 billion in annual recurring revenue.
This is the story of how AI video generation went from research curiosity to global production infrastructure in less than two years. And it is the story of how China, not the United States, built the factory.
Chapter 1: The Sora Moment (November 2024)
To understand where the industry stands in July 2026, you must return to November 2024, when OpenAI released Sora.
The announcement was a shock. High-definition video clips—up to one minute in length—generated from text prompts with cinematic fidelity. A woman walking through a Tokyo street at night, neon reflections accurate in every puddle. Two golden retrievers podcasting on a mountain top. A drone shot through a cathedral of coral. The physics were not perfect. The hands occasionally melted. But the implication was unmistakable: the gap between imagination and footage was collapsing.
Sora did not launch as a product. OpenAI restricted access to a small group of visual artists and red-teamers. The delay was technically justified—safety review, infrastructure scaling, copyright liability—but the practical effect was to create a vacuum. While American researchers debated the ethics of synthetic media and legislators in Washington drafted bills they would never pass, entrepreneurs in Shenzhen and Hangzhou were already building.
ByteDance had been working on video generation since early 2024. Its first-generation model, Seedance 1.0, was released to internal teams in late 2024 and to enterprise beta users in early 2025. The quality was rough. Motion was smeared. Characters changed appearance between cuts. But the architecture was designed for scale from the beginning. ByteDance had something OpenAI did not: the training data infrastructure of TikTok and Douyin, the world's largest video platforms, processing billions of clips daily through recommendation algorithms that already understood what made video content engaging, coherent, and temporally consistent.
Kuaishou, ByteDance's domestic rival, launched Kling in June 2024. It was initially a curiosity, a tool for generating short clips for social media. But Kuaishou's engineering team, operating from the same recommendation-engine DNA, rapidly improved the model. By December 2025, Kling had reached $240 million in annual recurring revenue, and it crossed $300 million in early 2026. Sixty million creators had generated over 600 million videos on the platform. A Turbo variant, launched June 17, 2026, offered faster generation at lower cost.
The Chinese AI video ecosystem had emerged as a parallel track to the American one. While OpenAI debated, China shipped.
Chapter 2: The Divergence (January–June 2025)
The first half of 2025 was defined by a divergence in strategy that would prove decisive.
OpenAI's Sora remained in limited access. The company prioritized integration with ChatGPT and enterprise workflows, betting that safety concerns and infrastructure costs justified a slow rollout. The underlying model was powerful, but the business model was uncertain. Hollywood studios, already wary of AI after the 2023 writers' strike, were not eager to experiment with tools that might generate legal exposure. Individual creators clamored for access, but OpenAI's pricing—when it eventually appeared—positioned Sora as a premium tool for professionals, not a mass-market creative utility.
In China, the opposite happened. ByteDance integrated Seedance directly into its existing consumer ecosystem. The Jimeng platform (known internationally as Dreamina) offered Seedance 1.0 Pro at roughly 69 RMB per month—approximately $9.60. The Xiaoyunque mobile app provided free credits with a 1,200-point registration bonus and 120 daily credits for continued use. The Doubao AI assistant app, already China's most popular consumer AI application with 163 million monthly active users as of late May 2026, offered 10 free video generations daily to beta users.
The pricing was not merely aggressive. It was predatory. At $9.60 per month for professional-grade AI video generation, the cost of producing a 30-second commercial dropped below the cost of a single stock footage clip from traditional libraries. The economics of video production inverted.
By February 2025, the AI-powered short drama industry in China was already a self-sustaining ecosystem. Micro-dramas—vertical-format serialized content, typically 60–90 seconds per episode, optimized for mobile viewing—had exploded in popularity during 2024. Production was traditionally fast and cheap: a crew of 10–15 people could shoot a 100-episode series in two weeks for roughly $50,000. AI generation reduced that cost by 70–80% and compressed the timeline to days.
The Spring Festival of 2026 marked a tipping point. During the holiday season, Spark Animation collaborated with Volcano Engine and AMD to produce *Journey to the West: Pasting Tiles on Five Finger Mountain*, an AI-powered animated series built with Seedance. The series topped the popularity charts for animated content on the Hongguo platform, surpassing 60 million popularity points and remaining at the top of the charts for ten consecutive days. It was not a technical experiment. It was a commercial hit, produced with AI tools, distributed through traditional channels, and consumed by audiences who did not care—or know—how it was made.
Meanwhile, Sora's development trajectory was running into headwinds. OpenAI's focus had shifted to reasoning models, agentic AI, and the enterprise API business. Video generation, always a research priority rather than a revenue driver, began to lose internal resources. By late 2025, reports emerged that Sora's team was being reassigned. On March 31, 2026, OpenAI officially discontinued Sora as a standalone product. The model's capabilities were partially absorbed into ChatGPT's multimedia features, but the dedicated video generation platform—the one that had stunned the world in November 2024—was gone.
Chapter 3: The Breakthrough (February–May 2026)
Seedance 2.0, launched on February 12, 2026, was the moment the Chinese AI video ecosystem definitively pulled ahead of its American competition.
The model introduced support for four input modalities (text, images, audio, and video), along with native audio-visual synchronization and multi-shot long narrative capabilities. The "All-Round Reference" mode allowed creators to upload up to 12 assets and use "@" tagging to precisely control character appearance, motion style, and environmental context across multiple shots. For the first time, an AI video model could generate content with director-level control, not just prompt-level suggestion.
The technical improvements were substantial. Motion modeling advanced from the smeared, unstable movement of 1.0 to physically coherent action with correct momentum and collision response. Character consistency across shots—previously the single greatest weakness of AI video—reached production-grade reliability. The model could maintain facial features, clothing, and body proportions across cuts, enabling the generation of multi-scene narratives with identifiable protagonists.
The commercial impact was immediate. Enterprise users, who had been cautious with 1.0, began migrating production workflows to Seedance 2.0. On April 2, 2026, Volcano Engine opened public beta testing for enterprise customers. The platform was adopted by advertising agencies, e-commerce sellers, micro-drama studios, and educational content producers. The pricing (free tiers for casual users, $9.60/month for professionals, enterprise contracts for studios) captured the entire market spectrum.
The cultural impact arrived in May. *HELL GRIND*, generated with Seedance 2.0 and refined through traditional post-production, premiered at Cannes. It was not the first AI-generated film to screen at a major festival—experimental shorts had appeared at smaller events throughout 2025—but it was the first feature-length narrative film with production values sufficient for commercial distribution. The screening was a signal to the global film industry: AI generation was no longer a novelty. It was a tool.
Kuaishou's Kling 3.0, released February 5, 2026, provided the competitive pressure that kept the ecosystem accelerating. Built on the Omni One unified multimodal framework, Kling 3.0 combined 3D spacetime joint attention with chain-of-thought scene reasoning, enabling more complex spatial compositions and longer coherent sequences. While Seedance dominated the enterprise market, Kling captured the creator economy—individual influencers, small studios, and the vast micro-drama production pipeline that feeds China's short-video platforms.
The two companies, fierce rivals in the short-video market, had inadvertently created a duopoly in AI video generation. Together, they controlled the world's most advanced production infrastructure for synthetic media. The American market, without a dedicated Sora product, had fragmented into smaller tools—Runway, Pika, Luma—none of which matched the scale, integration, or pricing of the Chinese platforms.
Chapter 4: The Factory (June–July 2026)
On June 23, 2026, at the Volcano Engine FORCE Original Power Conference in Beijing, ByteDance revealed the architecture of the factory.
Seedance 2.5 was not merely a model upgrade. It was a production system. The 30-second single-segment native video generation capability—up from 10 seconds in 2.0—enabled the creation of continuous shots complex enough for cinematic sequences. The built-in 3D white model preview function allowed filmmakers to pre-visualize scenes before committing to full generation, reducing iteration cycles from hours to minutes. The localized editing module maintained visual consistency when modifying specific regions of a frame, solving the "change the hand, lose the face" problem that had plagued AI video since its inception.
Most significantly, the unified reference system could ingest up to 50 multimodal assets into a single generation context. A filmmaker could upload a character design sheet, three reference videos for motion style, a location photograph, a music track for tempo reference, and a script excerpt, then generate a shot that incorporated all of them into a coherent visual output. This was not automation of existing workflows. It was the creation of entirely new workflows that had no precedent in traditional production.
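To make the idea of a capped reference set concrete, here is a minimal sketch of how such a bundle might be modeled. Everything in it is hypothetical: `Asset`, `AssetKind`, and `ReferenceBundle` are illustrative names, not part of any Seedance or Volcano Engine API. Only the 50-asset cap and the four asset types come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum

MAX_ASSETS = 50  # cap quoted above for Seedance 2.5's unified reference system


class AssetKind(Enum):
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"
    TEXT = "text"


@dataclass(frozen=True)
class Asset:
    kind: AssetKind
    role: str  # free-form label, e.g. "character design sheet"


@dataclass
class ReferenceBundle:
    """Hypothetical container for the assets attached to one generation."""
    assets: list[Asset] = field(default_factory=list)

    def add(self, asset: Asset) -> None:
        # Refuse to grow past the cap instead of silently dropping assets.
        if len(self.assets) >= MAX_ASSETS:
            raise ValueError(f"bundle is limited to {MAX_ASSETS} assets")
        self.assets.append(asset)

    def count_by_kind(self) -> dict[AssetKind, int]:
        counts = {kind: 0 for kind in AssetKind}
        for asset in self.assets:
            counts[asset.kind] += 1
        return counts


# The filmmaker's example from the paragraph above: seven assets in total.
bundle = ReferenceBundle()
bundle.add(Asset(AssetKind.IMAGE, "character design sheet"))
for i in range(3):
    bundle.add(Asset(AssetKind.VIDEO, f"motion reference {i + 1}"))
bundle.add(Asset(AssetKind.IMAGE, "location photograph"))
bundle.add(Asset(AssetKind.AUDIO, "music track for tempo"))
bundle.add(Asset(AssetKind.TEXT, "script excerpt"))

print(len(bundle.assets), "of", MAX_ASSETS, "slots used")
```

The filmmaker's example uses only seven of the fifty slots, which suggests the cap is aimed at far denser reference sets, such as full character rosters or shot-by-shot storyboards.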
The enterprise platform's $2 billion ARR was built on three customer segments. Advertising agencies used Seedance for rapid commercial prototyping and variant generation, producing 20 versions of a 15-second product spot for A/B testing at a cost that would previously have covered only one. E-commerce sellers generated product demonstration videos from still images and text descriptions, replacing the expensive studio photography and model shoots that had previously been required for marketplace listings. Micro-drama studios used the platform for full-series production, generating episodes in batches and assembling them through template-based editing pipelines.
The economics were transformative. A traditional micro-drama production, costing $50,000 for 100 episodes, could be produced with Seedance for approximately $12,000—including platform subscription costs, cloud rendering fees, and post-production labor. The production timeline compressed from two weeks to four days. The crew size dropped from 15 people to 3: a prompt engineer, an editor, and a quality control reviewer.
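Those figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above and assumes that "two weeks" means 14 days; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    cost_usd: float
    days: int
    crew: int
    episodes: int = 100  # series length used in the comparison above

    @property
    def cost_per_episode(self) -> float:
        return self.cost_usd / self.episodes


def reduction(before: float, after: float) -> float:
    """Fractional reduction when moving from `before` to `after`."""
    return (before - after) / before


traditional = Production(cost_usd=50_000, days=14, crew=15)  # 14 days is an assumption
ai_assisted = Production(cost_usd=12_000, days=4, crew=3)

print(f"cost per episode: ${traditional.cost_per_episode:.0f} -> "
      f"${ai_assisted.cost_per_episode:.0f}")
print(f"cost reduction:     {reduction(traditional.cost_usd, ai_assisted.cost_usd):.0%}")
print(f"timeline reduction: {reduction(traditional.days, ai_assisted.days):.0%}")
print(f"crew reduction:     {reduction(traditional.crew, ai_assisted.crew):.0%}")
```

Under these assumptions the per-episode cost falls from $500 to $120, a reduction of about 76%, while the timeline shrinks by roughly 71% and the crew by 80%.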
This was not theoretical. By July 2026, an estimated 40% of all micro-drama content produced in China was either fully or partially generated by AI tools, with Seedance and Kling handling the majority of the volume. The micro-drama industry, which generated approximately $5 billion in revenue in 2025, was on track to exceed $12 billion in 2026, with margin expansion driven almost entirely by AI production cost reduction.
The global expansion was already underway. Dreamina, ByteDance's international platform, had begun offering Seedance 2.0 to creators in Southeast Asia, the Middle East, and Latin America. The pricing—$9.60 per month for professional features—was accessible in markets where traditional production infrastructure was prohibitively expensive. A creator in Jakarta or São Paulo could generate content with production values previously available only to studios in Los Angeles or London, at a fraction of the cost.
The Numbers That Matter
The scale of China's AI video ecosystem in July 2026 is difficult to comprehend without aggregating the data points:
| Metric | Value | Context |
|---|---|---|
| Seedance enterprise ARR | $2 billion | Launched September 2025; reached $2B by June 2026 |
| Kling ARR | $300 million+ | Fastest-commercializing standalone AI video platform |
| Kling creators | 60 million | Generated 600+ million videos since launch |
| Doubao MAU | 163 million | China's #1 consumer AI app; 10 free daily video generations |
| Micro-drama industry (China) | $12 billion (projected 2026) | Up from $5 billion in 2025; AI-driven margin expansion |
| AI-generated content share (micro-drama) | ~40% | Partial or full AI generation |
| Seedance 2.0 production cost vs. traditional | about -76% | $12K vs. $50K for 100-episode series |
| HELL GRIND production cost | $180,000 | vs. $3–8M for comparable live-action feature |
| China AI video market (total) | $8–10 billion (2026 est.) | Includes platforms, tools, and AI-generated content revenue |
*Table: Key metrics for China's AI video generation ecosystem as of July 2026. Sources: Volcano Engine, Kuaishou financial disclosures, industry analyst estimates, Cannes Film Festival market data.*
The Implications: What Changed
The AI video revolution is not merely a technological shift. It is a restructuring of the global media production economy with implications that extend far beyond the film industry.
First, the cost structure of visual media has been permanently altered. For the first time in the history of moving images, the marginal cost of producing a frame of video is approaching zero. This does not mean all video will be AI-generated—high-end productions, documentaries, and live events will continue to use traditional methods for the foreseeable future. But the vast middle tier of content production—commercials, corporate videos, social media content, educational materials, low-budget features—has been economically transformed. The barrier to entry has fallen from "own a camera and hire a crew" to "subscribe to a software platform and learn prompt engineering."
Second, China's platform ecosystem has achieved a dominant position in a major AI application domain. In large language models, the competition remains open—OpenAI, Anthropic, Google, and Chinese labs are all producing competitive systems. In AI video generation, the practical reality is that the two most advanced platforms are Chinese, deeply integrated into Chinese consumer and enterprise ecosystems, and priced at levels that make American alternatives commercially uncompetitive. This is not a projection. It is the market structure as of July 2026.
Third, the cultural and regulatory implications are only beginning to unfold. The Cannes screening of *HELL GRIND* forced a conversation that the film industry had been avoiding: what does authorship mean when the "director" is primarily a prompt engineer and quality control reviewer? What does copyright protection mean when a model trained on millions of copyrighted works can generate original-seeming content that is not, legally, a derivative work? China's regulatory framework for synthetic media is still evolving, but it is evolving in a market where the technology is already deeply embedded in commercial production. The United States and Europe, where regulatory debate has often preceded deployment, may find themselves legislating for a technology that has already been adopted elsewhere at scale.
Fourth, the hardware implications are significant. AI video generation is computationally intensive. ByteDance's Seedance infrastructure runs on Volcano Engine's cloud platform, which relies on both domestic Huawei Ascend chips and imported NVIDIA GPUs. The massive demand for inference compute—serving 60 million Kling creators, 163 million Doubao users, and enterprise customers generating millions of videos daily—is a major driver of China's AI infrastructure investment. ByteDance's planned $23 billion AI capital expenditure in 2026, with roughly half earmarked for GPU procurement, is partly driven by the compute requirements of video generation workloads.
The Road Ahead
Seedance 2.5 will launch in early July 2026. By the time this article is published, the platform will likely be processing its first million generation requests from the general public. The launch is not an endpoint. It is a milestone in a trajectory that shows no signs of slowing.
ByteDance has already signaled its next moves. The Seed3D 2.0 model, released in April 2026, generates 3D geometry and textures with state-of-the-art performance, enabling the creation of virtual environments and digital assets that can be fed back into video generation pipelines. The Seed Full-Duplex Speech LLM, deployed on Doubao in April 2026, improves conversational fluency by 12% and enables more natural voice interaction. The Seed2.1 agent model, released June 23, 2026, is designed for real-world productivity tasks, suggesting that ByteDance's vision extends beyond media generation to full agentic workflows.
Kuaishou, meanwhile, continues to iterate on Kling. The Turbo variant launched in June 2026 offers faster generation at lower cost, capturing the high-volume, low-cost segment of the market. The company is reportedly working on Kling 4.0, targeted for release in late 2026, which would extend single-segment generation to 60 seconds and introduce real-time generation capabilities for live streaming applications.
The American response remains uncertain. OpenAI has not announced a successor to Sora. Google DeepMind's Veo 3.1, launched in October 2025 and upgraded with 4K support in January 2026, remains a capable competitor but is priced at a premium—$0.50 per second for video generation via the Vertex AI API, compared to Seedance's subscription model that offers effectively unlimited generation for $9.60 per month. The business model difference is as significant as the technical difference. Veo is a tool for enterprises with large budgets. Seedance is a utility for creators with small ones.
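A rough break-even calculation shows how quickly the two pricing models diverge. The sketch below uses only the two prices quoted above and deliberately ignores any credit limits or fair-use caps on the subscription, which would change the picture; the function names are illustrative.

```python
PER_SECOND_USD = 0.50     # Veo 3.1 per-second price quoted above
SUBSCRIPTION_USD = 9.60   # Seedance monthly subscription quoted above


def per_second_bill(seconds: float, rate: float = PER_SECOND_USD) -> float:
    """Cost of generating `seconds` of video under per-second pricing."""
    return seconds * rate


def break_even_seconds(subscription: float = SUBSCRIPTION_USD,
                       rate: float = PER_SECOND_USD) -> float:
    """Seconds of video at which per-second pricing equals the flat fee."""
    return subscription / rate


print(f"break-even: {break_even_seconds():.1f} seconds of video per month")
for seconds in (15, 30, 600):
    print(f"{seconds:>4} s at per-second pricing: ${per_second_bill(seconds):,.2f}")
```

On these numbers the flat fee pays for itself after about 19 seconds of generated video per month. A single 30-second clip costs $15 at per-second rates, and ten minutes of footage costs $300.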
Conclusion: The Factory and the Garden
In July 2026, the global AI video generation market has bifurcated into two ecosystems. The American ecosystem, led by Google and a fragmented collection of smaller tools, operates like a garden—carefully cultivated, high-quality, accessible to those with the resources to tend it. The Chinese ecosystem, led by ByteDance and Kuaishou, operates like a factory—optimized for scale, integrated into the largest content platforms on Earth, and priced for mass adoption.
Both models have their place. The garden produces the finest specimens. The factory produces the world's supply.
The film industry, which has always been a hybrid of art and industry, now faces a choice that is not really a choice at all. The economics of AI video generation are too compelling for commercial production to ignore. The question is not whether AI will be used to generate video. It is who will control the tools, who will set the prices, and who will define the creative workflows that the next generation of filmmakers will learn.
In July 2026, the answer to those questions is increasingly clear. The factory is in China. And it is already running at full capacity.
*Published July 4, 2026. This article is part of the AI in China daily briefing series. For the latest data and analysis on China's artificial intelligence ecosystem, visit ainchina.com.*