Superconductive, creators of Great Expectations, nabs $40M for a commercial version of its open source data quality tool

Data quality, the practice of testing and ensuring that the data and data sets you are using are what you expect them to be, has become a key component in the world of data science. Data may be the “new oil,” but if it’s too crude, you may not be able to use it.

Today, a startup building tools to make it easier to measure and ensure the quality of the data you are using is announcing some funding, a sign of how attention has been shifting to this area.

Superconductive, a startup best known for creating and maintaining the Great Expectations open source data quality tool, has raised $40 million in a Series B round of funding. It will use the capital both to keep building out its open source product and community, and to ready its first commercial product: a less technical, more accessible version of Great Expectations that can be used by more than just engineers and data scientists. That product is set to launch later this year.

Once the commercial offering is released, it will be named Great Expectations Cloud.

As Abe Gong, the CEO and co-founder of Superconductive, describes it, data quality has long been a priority for engineering and data science teams. But as data usage and access become increasingly democratized in increasingly digitized organizations (thanks in part to low-code and no-code software), data quality becomes a point of consideration (not an “issue” or “challenge,” Gong is quick to point out) for more people. The thinking goes that data quality tools that more people can use and understand will give those people the ability to spot limitations or gaps in the data, and fix them.

“The broader question is, how does everyone in the organization get to a point where they trust what the data does and what it is trying to do,” he said. “The engineering team might trust it but it might not be aligned with other teams. It doesn’t matter if it’s correct, it’s still doubting that data is fit for the purpose I want to use it for.”

Even without a commercial product, Salt Lake City-based Superconductive is getting a lot of attention from high places. Tiger Global is leading the round, with previous backers Index, CRV and Root Ventures also participating. The company is not disclosing its valuation, but we understand that the dilution is less than 15%, which puts its valuation at over $267 million.
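The valuation figure follows from simple arithmetic: dividing the size of the round by the fraction of the company sold gives the implied valuation. The sketch below assumes the figure is post-money and that dilution comes only from this round; the function name is hypothetical. At exactly 15% the result is about $266.7 million, which rounds to the $267 million cited, and any lower dilution implies a higher number.

```python
def implied_post_money(investment_musd: float, dilution: float) -> float:
    """Valuation (in $M) implied by an investment and the fraction of the company sold.

    Assumption: the dilution comes entirely from this round, so the
    result is a post-money figure.
    """
    if not 0 < dilution <= 1:
        raise ValueError("dilution must be a fraction in (0, 1]")
    return investment_musd / dilution


# $40M raised at a 15% ceiling on dilution gives the valuation floor.
floor = implied_post_money(40.0, 0.15)
print(f"Implied valuation: at least ${floor:.1f}M")

# A smaller stake sold for the same money implies a higher valuation.
print(f"At 10% dilution: ${implied_post_money(40.0, 0.10):.1f}M")
```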

The funding comes less than a year after Superconductive raised a $21 million Series A, in May 2021. Part of the reason investors have come knocking so soon after the last round is the strong traction for its open source tools.

Great Expectations is currently seeing over 2.5 million monthly downloads (closer to 3 million, Gong told me), while membership of its community, which it maintains on Slack, has now crossed 6,000 (the downloads are based on machines running Great Expectations, while the Slack users are engineers actively working with the tools). Companies adopting it include Vimeo, Heineken, Calm and Komodo Health; and it also finds its way into use via ecosystem partners Databricks, Astronomer, Prefect and more.

Great Expectations got its start when Gong and his co-founders Ben Castleton and James Campbell — engineers with decades of experience between them — initially were building tools to address the issue of data quality for organizations working in healthcare. They eventually pivoted the business to tackle the bigger opportunity: the issues healthcare organizations faced were the same as those faced by companies in other verticals.

The crux of the matter is that when engineers are building analytics or other tooling to work with data, they may not be taking into account whether the data being ingested by those tools is in the right state to be used correctly (as one example, are dates entered in the same, consistent format and, if not, how best should they be reorganized?). Or, they may not have considered the different ways that users of the analytics might end up using them. For instance, what happens when an end-of-month analytics dashboard is suddenly looked at in the middle of the month? Will the insights still be consistent, or will they throw people off completely because of how the formulas and processes have been set up?

“By the end of month, the numbers would be correct, you might see a drop in sales in mid-month,” Gong said. “The engineering team might say that it’s correct because the system is still calculating, but from a business perspective a lot might get confused, even if the system is working correctly.”

Great Expectations sets out to “fix” these situations with tools that help set parameters on data to ensure it stays consistent, and at the same level of quality. The so-called “expectations” in its repository (some built by Superconductive, and many built by the community) are declarative statements designed to make sense both to humans and to computers, so that the computers can do the work behind the commands.
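To make the idea of a declarative expectation concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations library or its API; the `Expectation` class, the helper functions and the sample rows are all hypothetical, chosen only to show how a statement can read as a sentence to a person while still being something a machine can run. It reuses the date-format example from above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass(frozen=True)
class Expectation:
    """Hypothetical declarative rule: a readable statement plus a machine check."""
    description: str                 # what a human reads
    column: str                      # which field the rule applies to
    check: Callable[[object], bool]  # what the computer runs


def matches_date_format(fmt: str) -> Callable[[object], bool]:
    """Build a check that passes only when a value parses with the given format."""
    def check(value: object) -> bool:
        try:
            datetime.strptime(str(value), fmt)
            return True
        except ValueError:
            return False
    return check


def validate(rows, expectations):
    """Return, for each expectation, the list of values that violate it."""
    report = {}
    for exp in expectations:
        report[exp.description] = [
            row.get(exp.column) for row in rows if not exp.check(row.get(exp.column))
        ]
    return report


# Invented sample data: one inconsistent date, one missing amount.
rows = [
    {"order_date": "2022-02-01", "amount": 120.0},
    {"order_date": "02/03/2022", "amount": 75.5},
    {"order_date": "2022-02-05", "amount": None},
]

expectations = [
    Expectation("order_date is formatted as YYYY-MM-DD", "order_date",
                matches_date_format("%Y-%m-%d")),
    Expectation("amount is never missing", "amount", lambda v: v is not None),
]

report = validate(rows, expectations)
for description, failures in report.items():
    status = "OK" if not failures else f"FAILED for {failures}"
    print(f"{description}: {status}")
```

Because each rule carries its own plain-language description, the same list can serve as documentation for non-engineers and as an automated test suite for the pipeline.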

Superconductive cites figures from Gartner that support the idea of data quality being a growing issue for organizations. The analysts estimate that organizations currently see average costs of $12.9 million annually because of poor data quality, both because the data hasn’t performed as it should and because of the decisions that the poor data has led to. Gartner predicts that this year, 70% of organizations will turn to tracking data quality levels to address this.

That also means Superconductive has competition. Companies like Microsoft, SAS, Talend and others have built data quality tools as a complement to other data services that they provide. Gong also said that a lot of companies build “homegrown” solutions, although these can run into limitations as internal tools often do. Superconductive believes that it has a lot of opportunity in the space for a few different reasons.

First is the fact that it already has a large community using its open source tools, which becomes a funnel for users of the commercial product. Second is that it’s dedicated to the task of data quality.

“Others tend to slice it differently,” he said. “Sometimes you hear about data quality in the context of data observability and so it’s focused on engineers and not looking at the wider role. We see ourselves as different, a bottom-up open solution looking at the broader scope of this as our mission, not just an engineering problem.”

Investors, especially those who have experienced the pain points of debugging software themselves and know the same issues exist with data, seem to agree.

“The vision was simple, yet ambitious: to create a single place to observe, monitor, and collaborate on the quality of your data, at any level of granularity, on any system,” Bryan Offutt of Index Ventures wrote at the time of their first investment in the company in 2021. “By giving data teams an end-to-end way to monitor quality from pipeline to production, Abe wanted to bring the same ability to pinpoint and resolve issues that exists in traditional software to the world of data. Finally, data teams could catch issues before they made their way to end users. It was as if Abe had read the book on every single problem I had experienced as an Engineer working on data pipelines. It felt like the data world had its own DataDog.”

Updated with the correct names of the co-founders.
