Cloud Data Lakes - The Future of Large Scale Data Analysis

Tomasz Tunguz

Cloud Data Lakes are a trend we’ve been excited about for a long time at Redpoint. This modern architecture for data analysis, operational metrics, and machine learning enables companies to process data in new ways. A cloud data lake is a repository of data in the cloud, with the tools and infrastructure to analyze it securely. We’re all storing data at increasing rates because every team inside a company needs data to succeed.

Data Lake Engines - The Essential Layer of the Next Generation Data Architecture

Tomasz Tunguz

We shared a vision for a new way of working with data. More data is being stored in data lakes like Amazon S3 and Azure Data Lake Storage. Analysts and product managers and sales operations teams deploy Tableau, Power BI, Looker, Superset, and many other tools to parse their data. There needs to be a layer between them to make all that data accessible to these users - a data lake engine. Amazon operates its data lakes in this way.


How to Build Authentic Customer Relationships That Spark Innovation

Entrepreneurs' Organization

Along with establishing formal ways to capture customer feedback through surveys and data analysis, look for opportunities to engage in conversations. Put it into context by combining it with all of the customer insights and market data you’re capturing. Prioritizing customer conversations and data gathering is what keeps innovation flowing, so don’t miss the boat. Written for EO by Jason McCann, a lifelong entrepreneur and experienced founder.

Unsung Hero Spotlight: Steven Rodriguez

Ecosystem Builder Hub

Someone who works at the intersection of economic/community/ecosystem development, is data-driven, collaborates and connects diverse stakeholders, focuses on being the right kind of busy and has a #GiveFirst attitude. Data analysis and science in ecosystem work. Steven is data-driven, focusing on measurable outcomes and impact, and minimizing noise and resources; being the right kind of busy essentially. To you, what is an Ecosystem Builder?

The Power of Open Source to Solve the Data Fragmentation Challenge

Tomasz Tunguz

Most modern data architectures employ many different data stores and processing engines. Data analysts looking to unearth insights within these data stores must move data back and forth between different systems and different data formats. As the number of new open source projects continues to grow geometrically, this data fragmentation is likely to splinter further. Arrow promises data engineers three things.

Millennials May Not Be ‘the Entrepreneurs of Today’ Everyone Thinks They Are

Wesley Cherisien

According to a Federal Reserve data analysis by the Wall Street Journal, the number of people under the age of 30 who own businesses has dropped by 65 percent since the 1980s and is currently at a quarter-century low.

Data, Data Everywhere, Not a Second to Think

Tomasz Tunguz

More and more companies realize their proprietary data contains insights that drive tremendous competitive advantage. Enabling an organization to make data-driven decisions is a long-term process. Below is the current big data adoption process and where we are within it: Companies generate proprietary data whose volumes can’t be handled by existing tools. Companies build or buy the tools and expertise to store and process that data.

EnsoData Raises $9M Series A Financing to Empower Clinicians with Waveform AI

Dream It

EnsoData (Healthtech - Spring 2019) has a platform that transforms billions of waveform data points collected from sensors in medical devices and wearables into an easy-to-read report, so clinicians can make faster, more accurate diagnoses. Heartbeats on an EKG, eye movements through an EOG, and brain waves through an EEG all output as waveform data, with over 1.5 Their flagship product, EnsoSleep, empowers clinicians with automated AI scoring and analysis.

[Zoomcar in Express Drives] ‘Zoomcar Mobility Services’ software platform launched: Promises cost reduction for fleet operators


Greg Moran, CEO & Co-Founder Zoomcar, states that with ZMS, fleet operators stand to reduce operating costs, improve safety and vehicle monetisation through features like real-time data analysis, and also improve customer engagement.

Winning with Data

Tomasz Tunguz

There’s a new class of company that wields data to create long-term competitive advantage. TheRealReal uses this morning’s sales data to inform this afternoon’s marketing campaigns. Zendesk’s data team educates and trains its employees to use data in meetings to prioritize key product management and marketing efforts. I first saw the impact of this type of data-informed decision-making at Google. The best run companies use data to win.

My Pal Dave: A Triumph of Substance Over Style

Both Sides of the Table

He had a philosophy that the future competition for startups would be design led and based on data analysis. My pal Dave has blogger Tourette’s. He has it on stage, too, at conferences. He can’t help himself: He’s Dave. My pal Dave has problems. Not the ones you’d imagine. His biggest problems are with language, colors, fonts and spacing. Not much more. I think he could say “no” a bit more.

The Future of Human Data Interaction

Tomasz Tunguz

On the day of the IPO of Tableau, a company known for innovating in data visualization, I thought I would share the most impressive HCI concept I’ve seen in a long time. In the first two or three minutes of this video at Stanford, he demonstrates his home-built software that combines data analysis with visualization. In my view, Bret Victor is at the forefront of human computer interaction design.

The Technology that's Taking Data Science by Storm

Tomasz Tunguz

Amazon’s Redshift, an elastic data-warehousing solution launched in late 2012, is the most salient example. Redshift’s ability to process huge volumes of data is breathtaking. When running Redshift on solid state drives (SSDs), one team at FlyData queried 1 terabyte of data in less than 10 seconds. AirBnB’s data science team wrote about their experiences contrasting Redshift and Hive.

[TechSee in PR Newswire] TechSee Releases Results of Study on Visual Assistance’s Impact on Customer Service KPIs


TechSee, a global leader in Visual Customer Assistance powered by AI and augmented reality, today released the results of an extensive data analysis it conducted to explore the impact of its technology on contact center and customer service KPIs.

73.6% of all Statistics are Made Up

Both Sides of the Table

One of our core tasks was “market analysis,” which consisted of: market sizing, market forecasts, competitive analysis and then instructing customers on which direction to take. We were looking at all sorts of strategic decisions that Sony was considering, which required analysis and data on broadband networks, Internet portals and mobile handsets/networks. I was leading the analysis with a team of 14 people: 12 Japanese, 1 German and 1 Turk.

Startup Trends from YCombinator's Demo Day

Tomasz Tunguz

This increase in activity seems to be driven by advances in data analysis for drug discovery and novel sensors. If this data is any indication, we should see more commerce and consumer finance companies, and more vertical software businesses, in the next few years. I’ve been to many YC Demo Days and I always look forward to them. This year was no exception. There are so many wonderful ideas and companies founded by terrific entrepreneurs.

Series A SaaS Startup Benchmarks for 2018

Tomasz Tunguz

But the average MRR has increased substantially from the last time I analyzed the data. The usual caveats to this data analysis apply. The sample size is on the smaller side; there are companies who raise Series Bs at less ARR than the median A for other factors; this analysis ignores space, competitive dynamics, team composition and auction pressure of financings. How far along was the typical SaaS Series A in 2018? The median business was at $1.8M


The Future of Advertising will be Integrated

Both Sides of the Table

some data sources have this estimate much higher.) And naturally we have built in quality controls like: frequency capping, automated measurement so we can pull ads that people respond poorly to, A/B testing tools, data analysis to tell celebrities & brands which products will resonate, etc. We already have the data that proves it. On products where I’ve seen data the “ad free” versions have converted at 4-6% of the user base at maximum.


Against All Odds in Startupland

Tomasz Tunguz

Win probability charts like the one above have become the icons of popular predictive data analysis. I love data, but let me whisper a heresy to you. The problem with predictive analyses like this is they never capture all the variables. Predicting the future is damn hard, and no matter how much data we jam onto disk, or how sophisticated our adversarial neural networks become, we still won’t be able to predict the future accurately.

Optimize for authentic relationships, not bluster

This is going to be BIG.

Hilary has had a major impact on me personally, by helping to push the envelope on the kind of data analysis work we were doing at Path 101. When we hired her, we leveled up in terms of our technology and Big Data chops. I was just some clueless kid when I asked Kara to come speak at my school and was even more clueless about data structures when I got Hilary to come work for me.

Your Startup's Competitive Advantage

Tomasz Tunguz

A better chat experience; a data modeling layer for data analysis; near-instant transcription of expenses. Startups fail when they run out of money. Startups run out of money when they lack focus. Without a maniacal focus on serving customer needs in a unique way, startups can flounder amidst competition. Without product market fit, the business is challenged to generate strong metrics and faces fundraising challenges.

The 12 Things I Know About You

Tomasz Tunguz

But as I’ve learned writing this blog, experimentation and data analysis will lead authors to share those insights in the most generalizable way possible. I know 12 things about you. You have a great need for other people to like and admire you. You have a tendency to be critical of yourself. You have a great deal of unused capacity which you have not turned to your advantage. While you have some personality weaknesses, you are generally able to compensate for them.

Do VC Platforms Make Sense?

Both Sides of the Table

We felt we wanted “management heavy” where we’d try to put more effort into tracking systems, data analysis, event management, content creation and the like. In the VC insider baseball world a discussion has gone on about “VC platforms” over the past 5 or so years. While firms define platforms differently, let’s just say they are the services that a VC offers outside of investment capital and partner time on boards or providing intros.

Benchmarking Tableau's S-1 - How 7 Key SaaS Metrics Stack Up

Tomasz Tunguz

Today, we’ll examine Tableau, the market leader for data visualization software. The company went public in 2013 and we’ll use data from their S-1 through 2013 to benchmark the business. Tableau sells their desktop client to one or two data analysts within a company. Sometimes, teams buy a Tableau server license to collaborate internally on data analysis. And they have been investing at this rate for as long as we have data to measure it.

What to Look for When Hiring a Head of Marketing for Your Startup

Tomasz Tunguz

These teams are by nature technical, often performing significant data analysis to maximize return-on-investment of their marketing spend. This data informs the product and engineering roadmap. When a startup is confronted with the prospect of hiring a head of marketing, founders’ heads often spin. What should be the day-to-day tasks for this person? What skill sets are important?

On historical returns and venture capital flavours

Fred Destin

The reality is: no amount of historical market data analysis is going to tell you which fund to invest in next, especially given the speed of market disruption. Story of the last few years in Venture Fund Land: large institutional investors concentrate money with fewer managers and flagship brands AND / OR find emerging managers.

Startup Best Practices 21 - Your Startup's Recruiting Scorecard

Tomasz Tunguz

For example, a data engineering role may require familiarity with data analysis tools. The same data analysis job would require an ability to learn new technologies and simplify complex data into comprehensible insights for the rest of a team. Last night, SaaS Office Hours at Redpoint welcomed Maia Josebachvilli, the VP of People and Strategy of Greenhouse. Maia is a thought leader in human resources.

What to look for when hiring a data scientist

Tomasz Tunguz

But a more important driver has been the need to better understand how to qualify, evaluate and hire data scientists because data science is a massive competitive advantage. And many of the companies I work with are hiring data scientists. Finding the right person to model your data and generate insights can provide massive leverage for your company. Data processing. Data analysis. After data is processed, it must be analyzed. Data modeling.

How to Combat Inaccurate Data and Faulty Statistics When Making Decisions

Tomasz Tunguz

The conclusions were results of bad experimental design, biases in the data, and statistical tools used incorrectly. One of the major problems with data analysis is the imperfect methods we use. But p-values don’t answer the question most people care about: what are the odds the hypothesis about the data is correct? In addition to dissolving faith in the research process, bad data encourages wrong decision-making.
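To make the gap between a p-value and "the odds the hypothesis is correct" concrete, here is a minimal sketch using Bayes' rule. It is not taken from the article: the function name and the inputs (`prior`, `power`, `alpha`) are illustrative assumptions, and the numbers are made up for the example.

```python
def posterior_given_significant(prior, power, alpha):
    """Probability the hypothesis is true, given a significant test result.

    prior: assumed share of tested hypotheses that are actually true
    power: assumed chance a true effect produces a significant result
    alpha: significance threshold, i.e. the false positive rate
    All three inputs are illustrative assumptions, not values from the article.
    """
    true_positive = power * prior            # true effect and significant
    false_positive = alpha * (1.0 - prior)   # no effect but significant anyway
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    for prior in (0.5, 0.1, 0.01):
        p = posterior_given_significant(prior, power=0.8, alpha=0.05)
        print(f"prior={prior:.2f} -> P(true | significant)={p:.2f}")
```

Under these assumptions, a result that clears the 0.05 threshold when only one in ten tested hypotheses is true leaves the hypothesis with a 64% chance of being correct, not 95%.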

Why Personas Are Critical Product Development and Go To Market Tools for Startups

Tomasz Tunguz

When the data analytics team took the stage, I listened with great interest as the chief of the group described their internal struggles with data and the areas where startups might help them achieve their goals. I’ve summarized these personas below: The Three People That Matter in Data. Analyst Reveal trends in data Create and propagate data silos Visualization tools, Spreadsheets, BI. For example, I’d never heard the term data steward before.

An Exceptional Story with Exceptional Data

Tomasz Tunguz

Even if you’re not a soccer/football fan, the article is worth reading because it’s one of the finest examples of synthesizing data and a story to convey a point I’ve read in a very long time. Data is one of the most powerful ingredients to supporting a point of view. It’s one of the reasons I publish data analysis on this blog. But data alone isn’t enough - not nearly enough - to be compelling.

Cohort Analysis for Startups: Six Summary Reports to Understand Your Customer Base (With Code)

Tomasz Tunguz

Cohort analysis provides deep insight into customer bases because cohorts expose how customer accounts grow, evolve and churn. Plus, cohort analysis provides a framework to evaluate product releases, marketing pushes and advertising campaign performance. Average Revenue Per Customer Over Time - Chart monthly revenue over time to contrast with cohort data. Cohort analysis is difficult to perform in a database or in Excel so I turned to R.
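As a rough illustration of what a cohort table contains, here is a minimal sketch in Python using only the standard library. The article itself uses R, so this is not the author's code; the `payments` records and all function names are hypothetical and exist only for this example. Customers are grouped by the month of their first payment, then revenue and retention are tracked by months since signup.

```python
from collections import defaultdict
from datetime import date

# Hypothetical payment records: (customer_id, payment_date, amount)
payments = [
    ("a", date(2023, 1, 15), 100.0),
    ("a", date(2023, 2, 15), 100.0),
    ("a", date(2023, 3, 15), 150.0),
    ("b", date(2023, 1, 20), 50.0),
    ("b", date(2023, 2, 20), 50.0),
    ("c", date(2023, 2, 5), 80.0),
    ("c", date(2023, 3, 5), 80.0),
]


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month


def first_month_by_customer(records):
    """Return {customer: month index of first payment}, i.e. the cohort."""
    first_seen = {}
    for customer, day, _ in records:
        m = month_index(day)
        if customer not in first_seen or m < first_seen[customer]:
            first_seen[customer] = m
    return first_seen


def cohort_revenue(records):
    """Return {cohort_month: {months_since_signup: total revenue}}."""
    first_seen = first_month_by_customer(records)
    table = defaultdict(lambda: defaultdict(float))
    for customer, day, amount in records:
        cohort = first_seen[customer]
        table[cohort][month_index(day) - cohort] += amount
    return {cohort: dict(ages) for cohort, ages in table.items()}


def cohort_retention(records):
    """Return {cohort_month: {months_since_signup: share of cohort paying}}."""
    first_seen = first_month_by_customer(records)
    members = defaultdict(set)
    for customer, cohort in first_seen.items():
        members[cohort].add(customer)
    active = defaultdict(lambda: defaultdict(set))
    for customer, day, _ in records:
        cohort = first_seen[customer]
        active[cohort][month_index(day) - cohort].add(customer)
    return {
        cohort: {age: len(who) / len(members[cohort]) for age, who in ages.items()}
        for cohort, ages in active.items()
    }


if __name__ == "__main__":
    print(cohort_revenue(payments))
    print(cohort_retention(payments))
```

Reading each cohort's row left to right shows whether accounts expand (revenue per month rises) or churn (retention falls), which is the comparison the summary reports are built on.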

The Optimal Price to Maximize Sales Efficiency for a SaaS Startup

Tomasz Tunguz

To eliminate bias, I whittled the dataset to a subset of the companies who had data in all three periods. In fact, within any of the three periods, and across the four different ACV categories, the data today shows that there is no optimal ACV that would enable maximum sales efficiency. There is another important conclusion from the data: sales efficiency is monotonically decreasing. But I don’t have the data to prove that hypothesis.

What a Dog and a Monkey Taught Me About Management at Google

Tomasz Tunguz

It might have been a mishandled customer case, a forgotten internal data analysis or causing a car accident on the way to work. At all hands meetings on Tuesday afternoons, our 75 person AdSense Ops team reviewed the most important metrics for the business: top-two box customer satisfaction scores, revenue growth and customer churn. But unlike every other all hands meeting I attended, these meetings ended with a monkey and a dog.