Maximise Return On Data: The SCALE Principle

An easy-to-follow guide to overcoming data pains and generating analytics ROI quickly.

Published in Entrepreneurship Handbook

8 min read, Jan 20, 2021

Tell me if this sounds familiar: as a startup founder, you are constantly wearing many hats. Marketing guru, product owner, growth hacker, sales master, accountant, and the list goes on.

To keep things in check, you frequently consult reports from a range of third-party sources such as Shopify dashboards or Google Analytics. Periodically, you review cash flow and profit and loss statements, consolidating more numbers from suppliers, service providers and internal databases.

However, all these numbers tend to be operational in nature. They do not generate returns from your own data in the form of valuable insights on how to accelerate growth.

At 173Tech, we are often approached by founding teams experiencing many data pains. We created an easy-to-follow, five-step principle to help you quickly master your data and generate analytics ROI.

Common Data Pain Points

Below are the top five data pains experienced by many startup founders.

Messy and inconsistent data

To answer a simple question, you have to dig through many sources, and each source reports numbers in a different style. After a lot of copying, pasting and manipulation with Excel formulas, you realise that things do not add up.

A typical scenario: Facebook claims its campaigns generated £5,000 in sales last month and Google claims another £5,000. However, your total monthly sales were £9,000. You then check internal databases for channel attribution and find a completely different story again: 30% of sales came from organic sources.

What can you trust and how do you make decisions on next month’s marketing budget?
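To make the gap concrete, here is a minimal sketch of the arithmetic behind the scenario above. It assumes the 30% organic share applies to the £9,000 total; the variable names are ours and purely illustrative.

```python
# Figures from the scenario above (GBP).
facebook_claimed = 5_000
google_claimed = 5_000
total_sales = 9_000
organic_share = 0.30  # assumption: the 30% applies to total monthly sales

platform_claimed = facebook_claimed + google_claimed
organic_sales = total_sales * organic_share
paid_sales_internal = total_sales - organic_sales

# Sales the ad platforms claim beyond what internal data attributes to paid channels.
over_claimed = platform_claimed - paid_sales_internal

print(f"Claimed by ad platforms:     £{platform_claimed:,.0f}")
print(f"Paid sales (internal data):  £{paid_sales_internal:,.0f}")
print(f"Claimed but not accounted:   £{over_claimed:,.0f}")
```

Under these assumptions, the two platforms together claim more than your total sales, and far more than your internal attribution assigns to paid channels. A likely reason is that each platform counts a sale it touched as its own, so the same order can be claimed twice.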

Data silos

Numbers from different aspects of your business are scattered across various sources, gathered at different frequencies and, in some instances, held hostage by third-party tools.

If you are an eCommerce business, you might receive fulfilment costs from your 3PL on a monthly basis, product sales from Shopify daily, and digital marketing spend weekly from your agency, while cost of goods sold is stored in an Excel spreadsheet and updated as required.

How do you keep track of profit margins on SKU level and decide on which one(s) to scale? How would you evaluate it across demographics?
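Once the sources sit side by side, SKU-level margin is simple arithmetic. The sketch below is illustrative only: the SKU names and all figures are hypothetical, and it assumes every cost has already been summed to the same monthly period and allocated per SKU.

```python
from collections import defaultdict

# Hypothetical monthly figures per SKU, one dict per source (all in GBP).
shopify_sales = {"SKU-A": 4_000.0, "SKU-B": 2_500.0}    # daily Shopify sales, summed
cogs_sheet = {"SKU-A": 1_600.0, "SKU-B": 1_500.0}       # from the Excel spreadsheet
fulfilment_3pl = {"SKU-A": 400.0, "SKU-B": 300.0}       # monthly 3PL costs
marketing_spend = {"SKU-A": 800.0, "SKU-B": 450.0}      # weekly agency figures, summed


def sku_margins(sales, *cost_sources):
    """Return {sku: (profit, margin)} where margin = profit / sales."""
    costs = defaultdict(float)
    for source in cost_sources:
        for sku, amount in source.items():
            costs[sku] += amount

    result = {}
    for sku, revenue in sales.items():
        profit = revenue - costs[sku]
        margin = profit / revenue if revenue else 0.0
        result[sku] = (profit, margin)
    return result


margins = sku_margins(shopify_sales, cogs_sheet, fulfilment_3pl, marketing_spend)

# Rank SKUs from highest to lowest margin.
for sku, (profit, margin) in sorted(margins.items(), key=lambda kv: kv[1][1], reverse=True):
    print(f"{sku}: profit £{profit:,.0f}, margin {margin:.0%}")
```

The hard part in practice is not this calculation but getting four sources with different owners and refresh cycles into one place, which is what the rest of this guide addresses.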

Error-prone & time-consuming manual processes

To answer the profitability question above, you need to log into each source, query the data for the relevant time period, copy and paste it into a spreadsheet, then create a master sheet with a range of manually entered formulas applied to various chunks of data. This process is lengthy and difficult to debug, with human mistakes possible at every step of the way.

As you scale, this setup will soon become impossible to maintain. We have seen spreadsheets that take minutes to open and even longer to respond to a single change. Various versions of the original spreadsheet are then created to cater for slightly different scenarios, introducing more errors. Additional errors reduce the reliability of your data and result in more time spent on rechecks.

If you cannot trust your numbers, you cannot build a data-driven culture and make data-informed decisions.

Unable to respond to urgent requests

Your app has seen a steep drop in user activity since last week. This could be due to a wide range of factors: new campaigns bringing in low-quality users, bugs in the latest release, new features users do not understand, or a server error preventing push notifications. It could be specific to a country, a platform or another dimension. Finding the cause requires interrogating the data from multiple angles.

Your team starts frantically pulling data from various places to pinpoint the issue, while precious time is being lost and your customers continue to churn.
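With the data already in one place, this kind of investigation can start with a simple week-over-week comparison per dimension. The sketch below uses made-up weekly active user counts keyed by country and platform; the function name and all figures are hypothetical.

```python
# Hypothetical weekly active users, keyed by (country, platform).
last_week = {
    ("UK", "iOS"): 1_000, ("UK", "Android"): 900,
    ("FR", "iOS"): 500, ("FR", "Android"): 450,
}
this_week = {
    ("UK", "iOS"): 980, ("UK", "Android"): 450,
    ("FR", "iOS"): 490, ("FR", "Android"): 220,
}

DIMENSIONS = ("country", "platform")


def change_by_dimension(before, after, dim_index):
    """Aggregate both weeks over one dimension and return {value: relative change}."""
    totals_before, totals_after = {}, {}
    for key, users in before.items():
        totals_before[key[dim_index]] = totals_before.get(key[dim_index], 0) + users
    for key, users in after.items():
        totals_after[key[dim_index]] = totals_after.get(key[dim_index], 0) + users
    return {
        value: (totals_after.get(value, 0) - total) / total
        for value, total in totals_before.items()
        if total
    }


for index, name in enumerate(DIMENSIONS):
    changes = change_by_dimension(last_week, this_week, index)
    worst = min(changes, key=changes.get)
    print(f"{name}: worst segment is {worst} ({changes[worst]:+.0%})")
```

In this made-up data both countries fall by a similar amount, while the platform split shows one platform roughly flat and the other down by about half. That points the team at the Android release rather than at a country-specific campaign.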

Disconnection between data and business

Often when checking a dashboard from one of your service providers, you are left with the feeling that it was not very helpful. It contains a number of nice-looking charts and total figures but fails to tell you anything you did not already know.

This is expected due to a number of reasons:

These charts typically visualise simple aggregations (e.g. sum, average or count) over generic metrics (e.g. sales or number of customers), without deep reflections on the true health of your growth.
Data is not processed and summarised into meaningful insights based on your unique business model. The dashboard is designed as a one-size-fits-all solution that is easy to switch on for everyone.

The main issue with generic and out-of-the-box solutions is that sooner rather than later you will grow beyond the box.

The SCALE Principle

By applying modern analytics, we successfully help companies grow efficiently at all stages, from MVP all the way to over $100 million run rate and unicorn status.

We formulated our best practice data strategies into the five components of the SCALE principle. It ensures maximum insight generation from your data while removing all your data pains.

Standardise

The first step towards a world-class analytics stack is to standardise business metrics and apply best practices across your entire analytics process.

Data Dictionary

To standardise business metrics, start with a list of KPIs core to your success and their definitions. Take monthly revenue, for example. It is an important metric with many caveats: does it include taxes, refunds, or deferred revenue?

Create a data dictionary for all your KPIs. Ensure that everyone in your company is well-versed in its contents, and that it is kept up to date as your business and analytics stack evolve. A consistent data vocabulary allows for clearer communication and goal setting between different teams.

Add technical definitions to each item in your data dictionary. This bridges the gap between business and data. It also provides the technical translation for the automation step later.
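As an illustration, a data dictionary entry can pair the business definition with its technical translation in one record. This is a minimal sketch, not a prescribed format: the field names, the `orders` table and its columns are hypothetical, and the choice to exclude taxes and net off refunds is only an example of a decision you would need to make yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiDefinition:
    name: str
    business_definition: str
    technical_definition: str  # e.g. the SQL that the automation step will run
    owner: str


DATA_DICTIONARY = {
    "monthly_revenue": KpiDefinition(
        name="Monthly revenue",
        business_definition=(
            "Sales recognised in the calendar month, excluding taxes and net of refunds."
        ),
        # Hypothetical table and column names, for illustration only.
        technical_definition=(
            "SELECT SUM(amount_ex_tax - refunded_amount) "
            "FROM orders WHERE order_month = :month"
        ),
        owner="finance",
    ),
}


def describe(kpi_key):
    """Return a one-line, human-readable summary of a KPI."""
    kpi = DATA_DICTIONARY[kpi_key]
    return f"{kpi.name}: {kpi.business_definition}"


print(describe("monthly_revenue"))
```

Keeping both definitions in the same record means a change to the business rule and a change to the query are reviewed together.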

Analytics Processes

First, select the right tool stack tailored to your usage needs and existing tech ecosystem. Your tool stack should cover automated data delivery, data warehousing, data modelling, reporting visualisation, and version control. Next, formalise your implementation processes covering requirements gathering, prioritisation, development, peer review, and production release.

Centralise

This is where all your data sources come together. Data from all customer touchpoints (marketing, sales, customer support, CRM, fulfilment, costs etc.) is gathered in one centralised location whilst retaining all the information you care about.

The key here is to select the right tools for your data infrastructure for optimum performance and scalability. For your data warehouse, some options include Snowflake, Redshift or BigQuery, depending on your existing tech ecosystem and intended usage. If you have large volumes of data, you might also want to have a data lake to store raw data, using tools like Hadoop or Amazon S3 buckets.

Another important aspect of centralisation is your code base and reporting. Ideally, all your analytics scripts (e.g. SQL, Python tasks) should be easily accessible from one location. We also recommend a single visualisation platform for all your reporting dashboards to avoid disjointed insights.

Automate

Once you have defined the data sources to collate and their destination, you should consider automating the following aspects of your pipeline:

Data extraction
Data modelling
Data visualisation

Automated data extraction ensures that information is delivered consistently from all sources at your desired time interval. Tool options include Stitch and Fivetran. Think of it as your courier, who picks up the data package every morning and deposits it in your data warehouse.

For all the data packages arriving in your data warehouse, perform data modelling for two key purposes:

Ensure all sources are linked via unique identifiers to create your own single customer view.
Apply the business logic defined in your data dictionary and transform raw data into meaningful KPIs.
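The two purposes above can be sketched with Python's built-in `sqlite3` module standing in for a data warehouse. Everything here is hypothetical: the table names, columns and sample rows are invented, and the net revenue rule (amount excluding tax, minus refunds) is just one example of business logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, email TEXT, country TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         amount_ex_tax REAL, refunded_amount REAL);
    CREATE TABLE support_tickets (ticket_id INTEGER PRIMARY KEY, customer_id INTEGER);

    INSERT INTO crm_customers VALUES (1, 'a@example.com', 'UK'), (2, 'b@example.com', 'FR');
    INSERT INTO orders VALUES (10, 1, 100.0, 0.0), (11, 1, 50.0, 50.0), (12, 2, 80.0, 0.0);
    INSERT INTO support_tickets VALUES (100, 1), (101, 1);

    -- One row per customer. Each source is aggregated before joining so that
    -- orders and tickets do not multiply each other's rows.
    CREATE VIEW single_customer_view AS
    SELECT c.customer_id,
           c.country,
           COALESCE(o.net_revenue, 0) AS net_revenue,
           COALESCE(o.order_count, 0) AS order_count,
           COALESCE(t.ticket_count, 0) AS ticket_count
    FROM crm_customers AS c
    LEFT JOIN (SELECT customer_id,
                      SUM(amount_ex_tax - refunded_amount) AS net_revenue,
                      COUNT(*) AS order_count
               FROM orders
               GROUP BY customer_id) AS o ON o.customer_id = c.customer_id
    LEFT JOIN (SELECT customer_id, COUNT(*) AS ticket_count
               FROM support_tickets
               GROUP BY customer_id) AS t ON t.customer_id = c.customer_id;
""")

rows = conn.execute(
    "SELECT customer_id, country, net_revenue, order_count, ticket_count "
    "FROM single_customer_view ORDER BY customer_id"
).fetchall()

for row in rows:
    print(row)
```

The shared `customer_id` is what makes the join possible. If your sources identify customers differently (email in one, device ID in another), building that mapping is usually the first modelling task.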

We are the creators and maintainers of the open-source data processing and modelling framework SAYN. It covers many task types including Python and SQL transformations and helps analytics teams improve data engineering efficiency by easily orchestrating and automating data processes.

Once the data models are in place, you can use them to build dashboards that clearly visualise your KPIs. Best practice is to have a top-level dashboard that summarises key trends and quickly reveals opportunities or issues in your business. Then design a dashboard per business vertical (e.g. marketing performance dashboard, finance P&L dashboard etc.). Metabase, a free tool, is a good option as you start your data journey. As your team and data capabilities scale, consider moving to a more robust solution such as Looker.

Learn

Now with your own centralised data gold mine, you can start learning from it.

With an efficient data pipeline, you can train data science models to segment, predict and influence customer behaviours: customer lifetime value (CLTV) predictions, churn propensity scores, recommender systems, and automated consumer sentiment analysis with natural language processing (NLP). The possibilities are vast.

These models can be integrated into other parts of your products and services to create unique competitive advantages. This creates a constant and dynamic learning loop: observing user patterns, creating algorithms, feeding them into your product development, and observing new feature usage.
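Before investing in a predictive model, it can help to have a simple baseline to compare against. The sketch below uses a common back-of-the-envelope heuristic for CLTV (average order value × orders per year × expected years as a customer × gross margin). It is not a predictive model and not a method specific to this guide; the function name and all inputs are hypothetical.

```python
def simple_cltv(average_order_value, orders_per_year, expected_years, gross_margin=1.0):
    """Heuristic CLTV: order value x orders per year x expected lifespan x margin."""
    return average_order_value * orders_per_year * expected_years * gross_margin


# Hypothetical segment: £40 average order, 3 orders a year,
# 2 years as a customer, 50% gross margin.
print(simple_cltv(40.0, 3, 2, gross_margin=0.5))
```

A trained model earns its place when it predicts value per customer noticeably better than a segment-level average like this one.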

Another element of the learning loop is user testing. Say your team has come up with a number of ideas to encourage a desired behaviour in your app. How do you know which one will be the most effective? Run an experiment and test these options against a control group. Test results should be modelled, automated, and visualised in a dedicated testing dashboard. This will allow you to capitalise on the winning variant early and stop any poorly performing test promptly.
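One common way to compare a variant against a control group is a two-proportion z-test on conversion rates. The sketch below uses only the Python standard library. The choice of test, the function name and the sample figures are our assumptions for illustration; your experiment design may call for a different method.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conversions_a, users_a, conversions_b, users_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: control converts 100 of 1,000 users, variant 130 of 1,000.
z, p = two_proportion_z_test(100, 1_000, 130, 1_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone. Decide the sample size and significance threshold before the test starts, so that stopping a test early does not become a way of picking the result you prefer.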

Make sure you have a process where all learnings feed into your internal knowledge base and are shared across all teams.

Empower

In our experience, we have seen many companies succeed through leveraging their data and integrating continuous learning and testing into their agile product development. An efficient and democratised data stack empowers all teams and individuals. It is a game-changer for companies embracing it.

For your data science and analytics team, data efficiency and reliability ensure little time is wasted on digging and cleansing data. Instead, they can focus on mining deep product and behavioural insights, and building state-of-the-art algorithms.

For product and marketing teams, a well-structured data warehouse and user-friendly visualisation tool enable everyone to create ad-hoc reporting tailored to their changing needs. It provides timely feedback on current projects and efforts, so one can pivot and adapt quickly as new data insights stream in.

So make sure to train all teams on your chosen visualisation tool. One efficient way is to appoint a data champion per team, who will act as the data power user.

Whether you are looking to set up analytics from scratch or upgrade existing infrastructure, I hope this article provides a structured and easy-to-follow plan to build your own world-class data stack.

If you have any questions or feedback, reach us at hello@173tech.com. We are always happy to chat!