A few weeks ago, the Guesstimate beta came out. It's pretty cool; it’s like Excel with Crystal Ball built right in. You can input a single number or a range of values and build models with it. Guesstimate’s release and the holiday season gave me the perfect chance to explore an idea about the startup industry. I had been meaning to build a model to understand the formation and development of a startup, up to its eventual failure or exit.
This is one in a long line of attempts to quantify an often opaque industry. Two prominent examples of data-driven approaches to venture financing are Aileen Lee’s TechCrunch article that popularized the term ‘unicorn’ and a recent Cambridge Associates research report on venture returns becoming less concentrated. While both of these reports are good attempts to understand an aspect of startup formation and funding, it’s often hard to understand how a startup moves through this process if you are new to the industry.
The startup industry model in Guesstimate takes inspiration from Sam Gerstenzang's Open Source Venture Model and Bryan Johnson’s OSF Playbook. Like any model, my startup model is an attempt to make assumptions and beliefs about the world explicit so they can be tested. It allows you to change values to see how the elements push and pull on each other. It follows one cohort of companies started in a given year through their life cycle. It assumes a set amount of capital available at each stage that is always spent on financing that set of companies. You can play with the model here. Right now, you can make changes to the model on Guesstimate, but no changes are saved once you leave the page. Varying the “exit multiple” and the number of deals VCs participate in has the most dramatic effect on the model.
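To make the mechanics concrete, here is a minimal Python sketch of a cohort funnel under the assumptions described above: one cohort, a fixed pool of capital at each stage, and every pool fully spent. It is not the Guesstimate model itself. The `Stage` class, the `run_cohort` function, the cohort size, the round sizes, and the Series A and B pools are all hypothetical; only the $20 billion angel pool comes from this post. The last few lines use one simple reading of "exit multiple" (exit value per dollar invested), which may not match how the Guesstimate model defines it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stage:
    name: str
    capital: int     # total dollars available at this stage (assumed fully spent)
    round_size: int  # average dollars per financing (hypothetical)


def run_cohort(n_startups: int, stages: List[Stage]) -> List[Tuple[str, int]]:
    """Follow one cohort through each stage and count the companies funded."""
    survivors = n_startups
    funnel = []
    for stage in stages:
        # The pool pays for capital // round_size deals, capped by who is left.
        funded = min(survivors, stage.capital // stage.round_size)
        funnel.append((stage.name, funded))
        survivors = funded
    return funnel


# Hypothetical inputs; only the $20B angel pool is taken from the post.
STAGES = [
    Stage("angel", 20_000_000_000, 400_000),
    Stage("series_a", 10_000_000_000, 5_000_000),
    Stage("series_b", 12_000_000_000, 15_000_000),
]

funnel = run_cohort(100_000, STAGES)
for name, funded in funnel:
    print(f"{name}: {funded} companies funded")

# Assumed reading of "exit multiple": exit value per dollar invested.
# This total is only valid when every pool is fully spent, as it is here.
total_invested = sum(stage.capital for stage in STAGES)
for exit_multiple in (1.5, 3.0):
    print(f"exit multiple {exit_multiple}: ${exit_multiple * total_invested:,.0f}")
```

Changing `round_size` at any stage shifts how many companies survive to the next one, which is the kind of push and pull the Guesstimate model lets you explore with ranges instead of point values.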
Some key things learned and reinforced in the course of building the model:
- There’s a huge amount of disagreement about just how many startups are started every year. The Kauffman Foundation says that ~6,000,000 new businesses are created, without stating how many are high-growth startups. Marc Andreessen says that 4,000 startups are created. In addition, people still don’t agree on the definition of a startup.
- It’s really hard to build a startup. So, so many fail. The vast majority of new businesses fail to attract any angel or VC funds at all.
- Power law distributions are still not internalized by people (and not well represented by this model). The magnitude and spread of returns that one company can generate is just astounding. WhatsApp raised a total of $60 million while exiting at a total valuation of $19 billion, a 316x return on invested capital. 50% of startups will fail to return anything, and the next 40% of startups above that will hopefully return the total invested capital of investors. It is the WhatsApps of the world, the top 1%, that bring home meaningful returns.
- Angel Investors make up a huge, under-recognized pool of capital for startups. $20 billion is invested per year by Angels into startups. Their importance is hard to overstate at the earliest stages, where they enter 50k to 70k deals per year. This prominence has grown since the 2000s due to the low cost of doing a startup provided by AWS and other related services. Since the costs of starting a software startup have dropped so low, VCs aren’t able to deploy such little capital in one deal. Their model does not work like that. Angel Investors do in fact generate a nice return, in line with VC returns.
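To put numbers on the power law point in the list above, here is a small Python sketch of the arithmetic. The WhatsApp figures and the 50% / 40% / top 1% split come from this post. The portfolio of 100 equal-sized checks, the 3x multiple for the middle 9%, and giving the single top company WhatsApp's multiple are assumptions made purely for illustration.

```python
# WhatsApp figures from the post: $60M raised, $19B exit valuation.
whatsapp_multiple = 19_000_000_000 / 60_000_000

# Hypothetical portfolio of 100 equal-sized checks, as return multiples.
portfolio = (
    [0.0] * 50                  # 50% return nothing (from the post)
    + [1.0] * 40                # next 40% return invested capital (from the post)
    + [3.0] * 9                 # middle 9% at 3x (assumption, not from the post)
    + [whatsapp_multiple] * 1   # top 1% as a WhatsApp-sized outcome (assumption)
)


def portfolio_multiple(multiples):
    """Overall return multiple when every check is the same size."""
    return sum(multiples) / len(multiples)


with_outlier = portfolio_multiple(portfolio)
without_outlier = portfolio_multiple(portfolio[:-1])  # drop the single outlier
outlier_share = whatsapp_multiple / sum(portfolio)

print(f"WhatsApp multiple: {whatsapp_multiple:.1f}x")
print(f"Portfolio multiple with the outlier: {with_outlier:.2f}x")
print(f"Portfolio multiple without it: {without_outlier:.2f}x")
print(f"Share of all returns from the outlier: {outlier_share:.0%}")
```

Under these assumptions, the portfolio loses money without its one outlier and returns several times its capital with it, which is the shape a power law produces.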
As previously said, Guesstimate has only two distributions, normal and uniform, and isn’t able to capture much of the statistical reality of startups. While a normal distribution may be a good way to model the likelihood of a startup moving on to the next stage of funding, it’s not a very good way to model the return generated at exit. Right now, the model is merely descriptive (and barely so). In the future, I’d like to move towards a prescriptive model to answer the question: “how can we change the current system to create more impactful innovation in the world?” Questions such as "Do we need a more diverse group of VCs to allocate capital to different startups?" or "Is the most effective way to create innovation to pump more money towards VCs or to lower the cost of starting startups?" may be more easily answered with such a model.
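A quick way to see why a normal distribution struggles with exit returns is to compare how much of the total the top 1% of outcomes captures under a normal versus a heavy-tailed distribution. The sketch below uses only Python's standard library. The Pareto distribution, its shape parameter of 1.2, and the normal's mean of 3x with a standard deviation of 1x are assumptions chosen for illustration; none of them are fitted to real venture data.

```python
import random


def top_share(values, fraction=0.01):
    """Share of the total captured by the top `fraction` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Exit multiples drawn from a normal distribution, floored at zero
# (hypothetical mean of 3x and standard deviation of 1x).
normal_returns = [max(0.0, rng.gauss(3.0, 1.0)) for _ in range(n)]

# Exit multiples drawn from a heavy-tailed Pareto distribution
# (hypothetical shape parameter of 1.2).
pareto_returns = [rng.paretovariate(1.2) for _ in range(n)]

normal_top = top_share(normal_returns)
pareto_top = top_share(pareto_returns)

print(f"Top 1% share of returns, normal: {normal_top:.1%}")
print(f"Top 1% share of returns, Pareto: {pareto_top:.1%}")
```

With normal draws, the top 1% of companies hold only a few percent of the total, while the heavy-tailed draws concentrate a much larger share in a handful of outcomes, which is closer to the WhatsApp-style reality described earlier.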
With that said, here are a few directions I’d like to explore:
- Exploring how broader macroeconomic trends influence the startup industry. At the midpoint of 2015, China was on pace to invest $30 billion through venture capital. How will China influence funding in 2016, and how will that impact the startup ecosystem 5 to 10 years down the line? (Thanks Daniel)
- How the industry (and cost of doing a startup) affects the rate of formation. While we’ve seen a veritable boom in the formation of software startups, the same can’t be said for life science startups, where the number of initial financings by VCs has remained unchanged. As the cost of doing startups comes down, we should see a pattern of more hardware and biology startups being funded at the early stages. PCH International and Transcriptic are working to do their part to lower costs in their respective industries.
- Making this model more of a simulation to see how the ecosystem evolves over time. I would like to see how exits by the large companies are able to seed the next generation of angel investors and provide landing grounds for acquisitions. Silicon Valley wasn't built overnight. The dynamic process of companies exiting and investors passing on advice to the next generation is important to creating huge companies and innovative ecosystems.
- Add more data! I’d like to see how individual firms, investors, and entrepreneurs are able to influence the growth of a startup, rather than relying only on the aggregated statistics that reports provide.
Thanks for reading! Drop a note on Twitter if you found this interesting!
Thanks to Daniel Kao, Jonathan Zong, and Reed Rosenbluth for reading a draft.