Think Twice Before Hitting 'Generate'

Siddhant Singh

I've spent the last few years watching AI transform industries at a pace that feels both exhilarating and a little unnerving. Back when we launched DALL-E at OpenAI, the sheer magic of turning words into images was mind-blowing - it was like giving everyone a personal studio artist. Fast forward to today, and video generation tools like Google's Veo 3 are pushing that boundary even further. Suddenly, anyone can conjure polished ads, explainer videos, or even full campaigns in minutes. It's the kind of progress that makes you believe in abundance: more creativity for everyone, faster, cheaper.

But here's the thing I've learned from building AI systems: the road to that abundance is littered with unintended consequences. And right now, for brand managers and creative teams, one of those consequences is staring you in the face - legal risks baked into the very foundation of these tools. It's not just a hypothetical; it's lawsuits, regulatory reports, and real-world exposures that could turn your next viral campaign into a courtroom drama. In this post, I'll walk through why this matters, drawing from the latest developments, and share some thoughts on how we can build forward without the baggage.

The Allure of AI Video and the Data Dilemma

Let's start with the good stuff. Tools like Veo 3 aren't just gimmicks; they're game-changers. Imagine briefing your team on a new product launch, feeding in some assets, and out pops a 30-second spot that's on-brand, emotionally resonant, and ready for A/B testing. No endless revisions, no ballooning budgets. At OpenAI, we've seen similar leaps with Sora, and the feedback from creators has been electric - they're iterating faster, experimenting bolder.

The catch? These models don't spring from thin air. They're trained on massive datasets, often scraped from the open web without explicit permission. Take Google's Veo 3: the company openly confirmed it's using YouTube's vast library - over 20 billion videos - to power its training. A spokesperson told CNBC, "We've always used YouTube content to make our products better," which sounds innocuous until you unpack it. Creators uploading to YouTube in good faith suddenly find their work fueling generative tools that may end up competing with them, often without compensation or even awareness.

This isn't unique to Google. Leading labs, including ours at OpenAI and others like xAI, have grappled with similar questions about data sourcing. But the scrutiny is intensifying. Just look at MiniMax, the Chinese AI firm behind the Hailuo video generator: In September 2025, Disney, Universal, and Warner Bros. Discovery slapped them with a lawsuit alleging systematic infringement through unauthorized scraping of copyrighted characters and footage. The complaint paints a picture of a company treating Hollywood IP like free-for-all fodder, using it to market "a Hollywood studio in your pocket."

For brands, the uncertainty is the killer. You don't know which clips from that 20-billion-video trove made it into the model. Even a tiny fraction - say, 1% - is still 200 million videos, billions of minutes of footage, much of it copyrighted. Tools like Vermillio, which specialize in similarity detection, are already flagging AI outputs that mirror originals with 90%+ overlap, turning what looks like a fresh ad into potential infringement bait. If your brand video subtly echoes a creator's uncompensated work, who bears the liability? You, the vendor, or both?
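To make "similarity detection" concrete, here's a minimal sketch of the underlying idea using perceptual hashing - comparing compact fingerprints of individual frames. This illustrates the general technique only, not Vermillio's proprietary pipeline; the file paths and the 0.9 threshold are placeholder assumptions.

```python
# Minimal sketch: flag a generated frame that closely mirrors a known
# reference frame, using perceptual hashes (not Vermillio's actual method).
from PIL import Image
import imagehash

def frame_similarity(frame_a_path: str, frame_b_path: str) -> float:
    """Return a rough 0-1 similarity score between two video frames."""
    hash_a = imagehash.phash(Image.open(frame_a_path))
    hash_b = imagehash.phash(Image.open(frame_b_path))
    # Subtracting two ImageHash objects yields the Hamming distance;
    # 0 means the 64-bit fingerprints are identical.
    distance = hash_a - hash_b
    return 1.0 - distance / hash_a.hash.size

# Placeholder paths; in practice you'd sample frames from both videos.
if frame_similarity("generated_frame.png", "reference_frame.png") > 0.9:
    print("Flag for human review: output closely mirrors known IP.")
```

Production systems scan whole clips against huge reference libraries, but the core question is the same: how close is too close?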

Lawsuits Are Here - and They're Targeting Outputs, Not Just Training

If the data opacity feels abstract, the lawsuits make it concrete. Earlier this year, in June 2025, Disney and Universal filed a blockbuster suit against Midjourney, the AI image (and now video) generator. It wasn't just about the training data; the 110-page complaint zeroed in on user-generated outputs - those "countless" images and clips riffing on Star Wars, The Simpsons, and more, all without a license. Warner Bros. Discovery piled on in September, accusing Midjourney of willful infringement by letting subscribers conjure Superman or Scooby-Doo on demand.

These aren't fringe cases. They're setting precedents that ripple to every AI vendor. Brands using these tools aren't insulated; vendors' indemnification clauses often come with asterisks. OpenAI's terms, for instance (and they're industry standard), exclude coverage if you "knew or should have known" the output was infringing. Without transparency into the training black box, how do you prove due diligence? It's a high bar, and one that could leave your marketing budget funding depositions instead of scaling campaigns.

This shift - from blaming the model to blaming the user - isn't theoretical. It's how courts are framing it: AI as a tool that amplifies infringement when wielded carelessly. And as video gen matures, expect more. MiniMax's case is a harbinger; it's not just U.S. studios vs. one firm - it's a signal that global IP holders are done waiting for self-regulation.

The Authorship Puzzle: Who Owns What You Create?

Layer on top of this the existential question of ownership. Can you even copyright that slick AI video you just generated? The U.S. Copyright Office tackled this head-on in its January 2025 report, "Copyright and Artificial Intelligence, Part 2: Copyrightability." The verdict: No, not without substantial human creative input. Prompts alone? That's not authorship - it's direction. You need to demonstrate editing, arrangement, or meaningful modification to claim protection.

Think about that for a brand. Your team spends hours tweaking a Veo 3 output to fit your visual guidelines, but without documentation of that human input, you may not be able to claim copyright at all - and a competitor could copy the result legally. The report underscores that purely AI-generated works fall into the public domain by default, eroding the moat around your content. It's a reminder that AI amplifies human creativity, but it doesn't replace the spark that makes work protectable.
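What does that documentation look like in practice? Here's a minimal sketch of an edit-log entry recording human creative input - the kind of paper trail that helps substantiate authorship under the report's framing. The schema and field names are illustrative assumptions on my part, not a legal standard.

```python
# Minimal sketch of an edit log documenting human creative input on an
# AI-generated asset. Schema is illustrative, not a legal standard.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class EditLogEntry:
    asset_id: str   # internal ID of the generated video
    editor: str     # the human who made the change
    action: str     # what was creatively modified, in their words
    tool: str       # where the edit happened
    timestamp: str  # UTC, ISO 8601

def log_edit(asset_id: str, editor: str, action: str, tool: str) -> str:
    """Return an append-ready JSON line recording a human modification."""
    entry = EditLogEntry(
        asset_id=asset_id,
        editor=editor,
        action=action,
        tool=tool,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(entry))

print(log_edit("veo3-spot-042", "j.alvarez",
               "re-cut opening sequence; replaced AI voiceover with scratch read",
               "Premiere Pro"))
```

Logged consistently, entries like this give your legal team a timeline of human authorship to point to if a claim is ever tested.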

What This Means for Brands - and the Regulatory Backdrop

Pull it all together, and the stakes for brand teams are clear: Impressive outputs are table stakes, but unchecked models invite liability. Larger brands, with deeper pockets and higher scrutiny, feel this acutely - your legal team isn't just reviewing contracts; they're auditing algorithms.

California's AB 316, signed into law earlier this year, sharpens the blade. It explicitly bars defendants from claiming "AI acted autonomously" as a defense for harm caused by the tech. No more passing the buck to the machine; developers, modifiers, and users alike must vet tools rigorously. In a state that's home to Hollywood and Silicon Valley, this isn't niche - it's the new normal.

I've talked to CMOs who love the speed of AI video but dread the audits. One Fortune 500 exec told me last month: "We paused our pilot because we couldn't trace the data lineage. It felt like building on quicksand." That's the feedback loop we need more of - not fear-mongering, but honest reckoning.

Building a Safer Path Forward

So, how do we fix this? Optimism is my default, and I believe we can. The key is intentional design: models trained on licensed, auditable data; workflows that bake in human oversight; and indemnification that actually holds up.

Companies like Vinci are leading the charge on enterprise-grade solutions. Their approach - exclusive use of commercially licensed datasets with full provenance, "human-in-the-loop" reviews for creative input, and RAG-based systems that enforce brand consistency - turns risks into guardrails. (Full disclosure: We're not directly involved, but it's the kind of rigor that aligns with how we think about safe deployment at OpenAI.) Tools like Vermillio can slot in for ongoing monitoring, scanning outputs against known IP.
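If "RAG-based brand consistency" sounds abstract, here's a minimal sketch of the pattern: retrieve the guideline passages most relevant to a request and prepend them to the generation prompt, so the model works from approved language. The sample guidelines, the TF-IDF retrieval, and the prompt assembly are simplifying assumptions of mine, not Vinci's actual architecture.

```python
# Minimal RAG-style sketch: ground a video-generation prompt in the most
# relevant brand guidelines. Illustrative only; guidelines are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

BRAND_GUIDELINES = [
    "Logo must appear in the final two seconds on a white background.",
    "Voice is warm and direct; avoid slang and exclamation points.",
    "Primary palette: deep navy (#1B2A4A) with coral accents.",
]

def retrieve_guidelines(prompt: str, k: int = 2) -> list[str]:
    """Return the k guideline passages most relevant to the prompt."""
    matrix = TfidfVectorizer().fit_transform(BRAND_GUIDELINES + [prompt])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [BRAND_GUIDELINES[i] for i in scores.argsort()[::-1][:k]]

user_prompt = "30-second launch spot for our new espresso machine"
grounded_prompt = "\n".join(retrieve_guidelines(user_prompt)) + "\n" + user_prompt
# grounded_prompt now carries brand constraints into the model call -
# and a human reviewer still signs off before anything ships.
print(grounded_prompt)
```

Real systems typically use embedding models rather than TF-IDF, but the design choice is the same: the model only sees context you can audit.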

For any vendor, the litmus test is simple: Ask, "Can you document that training used only licensed or public-domain materials - no scraped copyrighted content?" If the answer's fuzzy, walk. And internally, build those human guardrails: Mandate edit logs, creative briefs, and legal sign-offs. It's a small price for defensible magic.

Toward an Abundant, Accountable Future

AI video isn't going anywhere - it's accelerating. In five years, I suspect it'll be as routine as stock photos are today, democratizing storytelling in ways we can't yet imagine. But we'll only get there if we prioritize trust over haste. The lawsuits and reports aren't roadblocks; they're course corrections, forcing us to align tech with human values like fairness and ownership.

What gives me hope? The creators, brands, and builders already stepping up. If you're in this space, what's one question you're asking your AI partners about data and IP? How are you weaving humans back into the loop?

Sources:

Associated Press. (2025, September 5). Warner Bros. sues Midjourney for copyright infringement. AP News. https://apnews.com/article/warner-bros-midjourney-ai-copyright-lawsuit-dc-studios-b87d80d7b4a4dfdcf0ee149d30830551

California State Legislature. (2025). Bill text: CA AB316 | 2025-2026 | Regular Session | Amended. LegiScan. https://legiscan.com/CA/text/AB316/id/3223647

Digital Information World. (2025, June 19). Google used YouTube videos to train AI, creators left in the dark. https://www.digitalinformationworld.com/2025/06/google-used-youtube-videos-to-train-ai.html

Lee, E. (2025, June 11). Disney and Universal sue A.I. firm for copyright infringement. The New York Times. https://www.nytimes.com/2025/06/11/business/media/disney-universal-midjourney-ai.html

OpenAI. (2025, June 4). Service terms. https://openai.com/policies/service-terms/

Reuters. (2025, September 16). Disney, Universal, Warner Bros Discovery sue China's MiniMax for copyright infringement. https://www.reuters.com/legal/litigation/disney-universal-warner-bros-discovery-sue-chinas-minimax-copyright-infringement-2025-09-16/

U.S. Copyright Office. (2025, January 17). Copyright and artificial intelligence, Part 2: Copyrightability report. https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-2-Copyrightability-Report.pdf

Vallese, Z. (2025, June 19). Creators say they didn't know Google uses YouTube to train AI. CNBC. https://www.cnbc.com/2025/06/19/google-youtube-ai-training-veo-3.html

Vermillio. (n.d.). Home. Retrieved October 3, 2025, from https://vermill.io/
