AI-powered media editing app Descript lands fresh cash from OpenAI

Descript, the audio and video editing platform founded in 2017 by former Groupon CEO Andrew Mason, has raised $50 million in a Series C round led by the OpenAI Startup Fund, a tranche through which OpenAI and its partners, including Microsoft, are investing in early-stage companies. Descript is the second startup to receive a cash infusion from the fund after AI note-taking app Mem, and Mason says it reflects OpenAI’s belief in the future of Descript’s AI-powered features.

“I founded Descript with the idea of building a simple, intuitive, fully-powered editing tool for video and audio — an editing tool built for the age of AI,” Mason told TechCrunch in an email interview. “We’re on the verge of a generational change in the way we create content — fueled by AI. That includes the kind of tools like creators are already using in Descript, and emerging stuff like generative AI. The challenge for companies like ours is how to make that technology useful and accessible.”

Mason wouldn’t reveal Descript’s post-money valuation, but he noted that the funding, which also had participation from Andreessen Horowitz, Redpoint Ventures, Spark Capital and ex-Y Combinator partner Daniel Gross, brings the company’s total raised to $100 million. According to a report from The Information in October, OpenAI had agreed to lead funding valuing Descript at around $550 million, more than double the startup’s valuation as of January 2021 ($260 million).

“We started the OpenAI Startup Fund to accelerate the impact companies building on powerful AI will have on the world, and we’re particularly excited about tools that empower people creatively,” OpenAI COO Brad Lightcap, who manages the OpenAI Startup Fund, said in a press release. “It’s clear from using Descript and talking to customers that Descript is breaking down barriers between idea and creation by extending video editing capabilities to an entirely new class of creators.”

Descript was created as a spin-off of Mason’s audio-guide business Detour, which Bose acquired in 2018. The platform, geared toward podcasters and videographers unfamiliar with professional-level editing tools, lets users create instant transcriptions of audio and video that can then be cut and paired with music, photos and other content using drag-and-drop tools.

Coinciding with the new cash, Descript today unveiled a host of editing features — some powered by AI — and a redesign intended to make editing video “as easy as editing a doc or slides,” in Mason’s words. That might be overpromising a bit. But the new capabilities do indeed streamline aspects of content creation that have historically been arduous.

For example, Descript now offers a background removal feature that lets users put their videos in any setting they want. And with write mode, users can edit scripts in Descript, tapping the platform’s Overdub voice cloning tech to generate a scratch voiceover.

Descript’s redesigned editing interface, rolling out today, adds features like templates and background removal for video. Image Credits: Descript

Other highlights in the latest release of Descript, called Descript Storyboard, include multitrack screen recording (the recorder is now integrated into the editor, with separate tracks for screen and camera) and free access to stock sound effects, videos, images and music tracks. Descript also now provides new video transitions and animations and various templates, including layouts, title sequences and social clips, along with the ability to create custom project templates.

With the redesign, Mason says that the goal was to both complement and augment Descript’s transcript-based editor while leaving core functionality intact. A new experience called Scenes allows users to break scripts composed in write mode into scenes and then arrange the visuals the same way they’d work with slides in a deck. Scenes keeps voiceovers from Overdub aligned with the script, letting creators swap a scratch clip with the final recording, for example, without having to worry about the tracks falling out of alignment.

“We believe video should be in every communicator’s toolkit, as ubiquitous as docs and slides. The tools are the only things preventing that, and we intend to change it,” Mason said. “We think of our main competition as non-editors — people who aren’t making video because the tools are too complex and time consuming.”

Descript isn’t the only company competing in the audiovisual content editing space. Besides incumbents like Adobe, there are startups such as Reduct.Video, which uses AI, natural language processing and other tech to automatically create editable transcripts.

San Francisco-based Descript, which employs about 100 people, has been aggressively expanding, however, acquiring AI company Lyrebird in 2019 to power its Overdub feature. Initially focused on audio editing, Descript launched its first video editing features two years ago, chasing after a digital video market that’s estimated to be worth more than $20 billion.

The strategy appears to be working so far for Descript, which counted NPR, VICE, The Washington Post and The New York Times among its customers as of 2021. While Mason wouldn’t answer questions about revenue, he says that Descript’s client base has expanded in recent months to “major universities and nonprofits,” as well as organizations in the public sector.

“The pandemic changed the way we all create and collaborate — a lot of people cooped up at home got more curious about video, and a lot of people started exploring the creator economy,” Mason said. “Companies started using video for more of their async communications. Around the same time, individual creators stopped respecting the boundaries between media; YouTubers started podcasts, podcasters flocked to TikTok and so on. Our new funding, plus the fact that all those things I just mentioned are only gaining energy, puts us in a great position to weather any headwinds.”