In Defense of Young Founders

Sometime during the summer, a friend of mine questioned whether young founders (let's say younger than 26) would be able to build the biggest startups of the future. The argument was that startups of the future will trend towards hard tech. Technologies like biotech, robotics, AI, and material science each take years of domain expertise to build, not to mention being capital intensive. Both form barriers for young founders getting started. Contrast this with the recent history of companies centered on information technology and the internet. We all have the image of the genius hacker developing applications as a teenager. This was (and still is) an open industry, where the tools for development are literally on everyone's desktop. With all that said, it sounds like we have to say goodbye to the garage startup. So are there any reasons for us to be optimistic about the young founder of the future?

In the past 20 years, there have been many examples of student founders. Michael Dell, Bill Gates, Woz, and Steve Jobs all come to mind. It's hard to think of examples that stretch outside of this range, but we should not fall prey to availability bias.

A quick survey of Wikipedia shows that in each technological era, young founders have been able to make a name for themselves. This list is highly biased towards US companies and not comprehensive by any means. However, a past trend is no guarantee that young founders will keep succeeding--just ask Nassim Taleb. Startups are a uniquely creative pursuit. They sit between mathematics, a totally abstract pursuit, and history. In "Age and Outstanding Achievement", Simonton examines the age of peak creative/leadership output in different fields. Poetry, pure mathematics, and theoretical physics exhibit a peak age in one's late 20s or early 30s, while novel writing, history, philosophy, medicine, and general scholarship exhibit a peak age in one's late 40s or early 50s. I think entrepreneurship skews towards the younger side, but why? Naval Ravikant and Marc Andreessen have already written two great blog posts about this, and I'll quote liberally from them here.
"The first set comprises problems that are solved by an emotional state (poetry, painting), by loading a very difficult single framework into your head (math, physics, coding), and / or competition (driven by sex drive and time-sensitive). The latter set are more rational, are systems problems rather than point problems, and don’t have time-sensitive competition." - Naval
Compare this to internet startups:
"Modern entrepreneurship, especially web entrepreneurship, is extremely competitive / time sensitive, requires enormous amounts of iteration even within a single product life-cycle, and often requires solving many challenging technical and business problems one after the other in a public view (with the opposite sex watching). So, it favors the young and single." - Naval
While Naval says that the young founder phenomenon may be limited to the modern age, I'm generalizing from the list built above: entrepreneurship has historically had, and will for the foreseeable future maintain, this youthful skew. Another biological factor that may cause the youthful skew is the difference in peaks of fluid and crystallized intelligence. The young founder's combination of enthusiasm and peak fluid intelligence helps her identify new markets, iterate on products, and more. Yet founders alone are not sufficient to create huge startups. Networks of other talented people, financing, production infrastructure, and the right knowledge also need to be in the mix.

Although hard tech startups will always require fundamental knowledge to get started and to iterate, knowledge is now easier to acquire than ever. YouTube videos, pirated textbooks, Reddit, and StackOverflow are just a few aggregated knowledge bases. Knowing things within a domain is now easy enough, but young entrepreneurs of today also have the advantage of seeing the non-obvious connections between different fields. arXiv and scihub.org have allowed academic papers to be shared as soon as they are written. It's amazing to watch implementations of a DeepMind paper being worked on by communities around the globe simultaneously. Usually within one week you can expect to see code from that paper, and in another week that code doing something as interesting as writing episodes of Friends or analyzing the genome.

Sadly, not all fields enjoy the low startup costs of software and AI startups. The hard tech startup often needs lab space or large capital commitments to start building prototypes. Not to mention that the speed of iteration for AI is probably some factor of 10 faster than for biological experimentation or material science, because you don't have to wait for cells to reproduce (or die). Again, new innovations are on the young founder's side. Infrastructure is now almost as easy to deploy in hard tech as AWS is for a developer to use, making iteration 10x faster and 10x cheaper.
  • CRISPR -> 10x easier to gene edit anything: “With CRISPR, literally overnight what had been the biggest frustration of my career turned into an undergraduate side project,” says Reed, of Cornell University. “It was incredible.”
  • Desktop gene sequencing -> 10x cheaper and faster to analyze your genome
  • Cloud experimentation platforms -> 10x faster/cheaper way to run and scale. I compiled some other bio related advancements here.
  • AI applied to VR Content Dev -> 10x faster generation of scenery and characters
  • Open Source CS -> 10x more stable and useful software... for free
  • Physics/material science/chemistry/protein folding -> 10x faster experiments with computer simulation (just wait for quantum computers)
  • Bitcoin/cryptocurrency -> 10x better way to incentivize open protocol adoption. 

After a founder uses those basic tools of infrastructure to find an idea that looks like it could be impactful, they can leverage new funding mechanisms to scale more quickly. The funding of innovative ideas has long been concentrated in the hands of a few. Governments once reigned supreme in funding things; as we became wealthier this trickled down to wealthy individuals, then to professional risk investors, and now to individuals in the form of crowd sales, Kickstarters, and most recently app-coin sales. If you accept the idea that no one can judge innovation at the earliest of stages--that VCs and angels are using basic heuristics to cull bad startups as opposed to picking winners--then new funding mechanisms can help. The free flow of capital through crowdfunding and more diversified risk at the seed stage allow more companies to get created.

The internet and associated products should help entrepreneurship in general. If history is any guide, these types of innovation should help those out at the edge the most--today's young founders and others who are resource poor. More young founders can start the hard tech companies of the future as iteration gets faster, starting gets cheaper, and intellectual capital gets easier to access. The more abstract tools get, the more quickly we can go from insight in mind to project in hand. I, for one, am excited about this future.


TLDR

Young founders will win because:
  1. the nature of innovation has always skewed young
  2. the composition of entrepreneurship will stay the same (geared more towards fluid and less towards crystallized intelligence)
  3. the inputs of entrepreneurship, i.e. knowledge, are increasingly easy for young entrepreneurs to access
  4. the tools of development and capital are easier for anyone to acquire

Biology in the Coming Years

If I had to compare the development of the synthetic biology/biotech stack to that of the computer, I would say we’re still pretty early. In biology, we’re in the big mainframe era, before the development of the transistor and integrated circuit.


Here's my thinking:


Biology Today vs. the Mainframe Era

  • Long dev. cycle times / sharing resources
    • Biology today: Waiting for western blots and gels to run… Waiting for cultures to grow. Few hours to a few days.
    • Mainframe era: Trying to get mainframe time to run programs. Few hours to a few days.
  • Low debugging
    • Biology today: No idea if an organism works until actually produced (no in silico modeling)
    • Mainframe era: Punch cards!!! And no compiler
  • Low reusability/reliability of parts
    • Biology today: Genes often don’t work outside of their original organism
    • Mainframe era: Vacuum tubes get moths stuck in them
  • Fragmented community
    • Biology today: Limited hackers, mostly stuck within universities
    • Mainframe era: Limited hackers, mostly stuck within universities
  • Low abstraction
    • Biology today: Individual gene sequences
    • Mainframe era: Punch cards/machine code
  • Low complexity of programs
    • Biology: Today: Yeast that makes beer and a scent. Future: Designer cows??
    • Computing: Then: Computing missile trajectories. Today: Google

And moreover, right now, Ph.D. students and Undergrads are oftentimes just manual labor.

These students, while as credentialed as ever, don’t touch the interesting problems like experimental design, nor do they have much of a say in what projects they work on. I can personally attest to this. For the few short months that I worked in a cancer lab, I was bored to tears. I spent the first week excited about learning to perform different protocols. After that, day in and day out, all I did was move a small amount of liquid from point A to point B. The automation of labor will bring huge headwinds.

It’s not all bad news. Just as the mainframe era evolved into the computer revolution, the bench-work era in biology will give way to a cloud-based, automated version of biology. This is great news for the general public and a great business opportunity. Here are the startups that are bringing a CS approach to biology.


  • The “App” Layer -> Machine learning applied to discovery: These companies are using large data sets and deep learning techniques to make biological products to sell.
    • Existing drugs: Mine drug databases to find new combinations that will work as treatments for different diseases. This is a huge growth area and makes a lot of sense for a deep learning firm to enter. Since drug combinations don't have to go through Phase 3 clinical trials again, and only have to prove that the combination is safe, this can be a capital-efficient method of producing cures.
    • Molecular: Companies that are making small molecules to treat disease. Atomwise is the most successful company in this space. This also seems like a type of data that deep learning techniques are able to represent more easily than the complex biological circuits. http://arxiv.org/pdf/1510.02855.pdf
    • Genomics/Biologics: These companies are using ML/DL techniques to create useful DNA Sequences and Antibodies. 
    • Organisms: These companies create functional microbes that do different things. End users buy products that these microbes produce--fragrances for perfumes, oil, and therapeutics. Although these companies might use machine learning, this process is more about trial and error and iterative design, compared to the more automated process of small-drug discovery.
  • The “Backend” -> Biological Data Analysis Software: Companies here either sell analysis software or offer specific recommendations based on their proprietary algorithms to clinicians, end consumers, or researchers. I’m not sure who will win in this space, as I don’t think it’s clear that having large datasets is very defensible. I think this mostly because the cost of data acquisition is decaying exponentially. This may be the reverse of the situation in consumer internet companies: here data is easy to get, and the algorithms are the important things. See Craig Venter’s failed attempt at monetizing the very first full human genome sequencing. Is the timing right, now?
    • -Omics: Besides our genes, there are RNA, small molecules (like lipids), and proteins that make up our cells, each with its own “-omics”: transcriptomics, metabolomics, and proteomics, respectively (and don’t forget the microbiome). HLI and iCarbonX are the two largest companies trying to make sense of all this stuff.
    • Genomics: Genetic analysis software that goes to researchers and clinicians and helps drive better decisions.
    • Consumer: Recommendations are given to end consumers. It’s interesting to see that a large consumer player is transitioning from making money on selling tests/data to developing drugs. Will other players follow?
    • Imaging and Misc: More biological data such as image data, ultrasound, or public health. There’s a lot of interesting things that can happen here. Using MRI data to help doctors diagnose PTSD and other neurological conditions is one big thing that comes to mind.
  • Protocol Layer -> Distribution of existing datasets: These companies provide what data there is, how to share data, and how to compute on data.
    • -Omics: Public organizations provide data sets. Companies like Google Cloud Platform allow you to store large data sets and analyze them to a certain extent.
    • Genetic Variation: Companies here are mapping out the variation within genes.
    • Circuits: These companies build off the popular iGEM competition and the synthetic bio movement to provide a reusable set of genes to build with. These are usually free to the public; however, organism discovery companies usually have proprietary genes and circuits that they use.
  • The Internet -> Collaboration Software for People: These are more traditional software products—content platforms, data sharing, and design tools.
    • Literature and the Research Network: There are many attempts at making journal articles easy to find and researchers more accessible.
    • Protocols: These are attempts to make biology more reproducible through the creation of standardized languages to describe experiments in discrete, repeatable steps.
    • Gene Design Tools: The IDEs for biology. Software here is trying to make genes and organisms easy to build with WYSIWYG and visual interfaces. A lot of these products are put out by DNA synthesis companies that want to make the designs scientists produce… for a profit.
  • Creating a Functioning Lab: Funding and bench work are broken. Moving towards a fully automated lab.
    • Funding/Equity Models: Everyone knows that basic research funding is broken. Both the number and average size of grants are decreasing. There are many crowdfunding competitors here. There’s an interesting attempt at creating “equity” with the blockchain.
    • Machine Automation in the Lab: Companies here are looking at the hardware in the lab. Different approaches include an Uber for Lab Experiments, an AWS for experiments, and creating remote access for your own lab.
    • Automating Assays: Taking care of the mixing and matching of assays/reactions within a lab.
    • Lab Management Software: Traditional software that is trying to get a lab functioning better.


My initial thoughts on investment themes:
  • The AWS for lab automation as well as computation will be huge. Automation frees up more than man hours: the lower cost of science will allow scientists to conduct ever more research. Biology has historically been a pretty good adopter of computer techniques to model/simulate/discover organisms. However, historically all three things necessary for machine learning (data, computational capacity, and algorithms) haven’t been able to handle modeling of biological systems. All three areas are now changing. In the past, 1 petaflop would have cost infinite money; now it only costs $400 on AWS. By 2020, we’ll be producing more genomic data than is uploaded to YouTube. All this data will need to be stored safely and computed on. Deep learning in discovery is only going to become more interesting as those algorithms continue to develop.
  • Continuing machine learning’s march into basic research/medicine. There are lots of attempts at making sure research is read, and that people can collaborate, but is that the right approach? Even now, there's not enough time for a biologist to stay on top of current literature. Although early, there are attempts at extracting structured data from literature and pushing it through Watson to synthesize findings. After synthesis, researchers or clinicians can use the data to create new experiments/make more informed decisions. This will only quicken as adoption spreads of a high-level, machine-readable language for describing experiments.
  • How to share data is an open problem: There haven’t been many businesses that are trying to build large scale open sharing of genetic info/data sets. Although both HLI and iCarbonX endeavor to aggregate huge data sets to (in the long term) create medicines that extend human lifespan, their short term plan is to sell sequenced consumer data to drug companies through B2B licensing agreements. This places the valuable data outside the hands of smaller researchers and gives patient data to large companies. I’d be interested in seeing how bitcoin (and especially 21) plays into the development of open sharing in biology. With projects like https://github.com/joepickrell/genome-server-21 and https://github.com/joepickrell/phenopredict21 happening, bitcoin shows its flexibility. Although this was a proof of concept, I think this kind of data analysis has the potential to put personal health data sharing in the hands of the people rather than doctors and companies.
  • Developing direct relationships between patients and drug companies. Many companies are taking a very new model for finding patients. These companies are directly developing relationships with patients/users of their drugs. Instead of partnering with hospitals and large health care networks to find study candidates, they can do so with a lower cost of capital with the internet. 23andMe is a shining example.
  • Bio is becoming a lot cheaper. Look at the Perlstein Lab. They're able to do drug and mouse studies on software startup run rates.

Work being done by these companies to bring biology up to software speed is incredible. But what does it really mean for end consumers? What kind of products will we see? Here are my predictions for what we'll see by the end of 2020:

The Startup Game

A few weeks ago, the Guesstimate beta came out. It's pretty cool; it’s like Excel with Crystal Ball built right in. You can input a single number or a range of values and build models with it. Guesstimate’s release and the holiday season gave me the perfect chance to explore an idea about the startup industry. I had been meaning to build a model to understand the formation and development of a startup through to its eventual failure or exit.

This is one in a long line of attempts to quantify an oftentimes opaque industry. Two prominent examples of data-driven approaches to venture financing are Aileen Lee’s TechCrunch article that popularized the term ‘unicorn’ and a recent Cambridge Associates research report on venture returns becoming less concentrated. While both of these reports are good attempts to understand an aspect of startup formation and funding, it’s often hard to understand how a startup is moved along through this process if you are new to the industry.

 
The startup industry model in Guesstimate takes inspiration from Sam Gerstenzang's Open Source Venture Model and Bryan Johnson’s OSF Playbook. Like any model, my startup model is an attempt to make assumptions and beliefs about the world explicit so they can be tested. It allows you to change values to see how the elements push and pull on each other. It takes one cohort of companies started in a given year and follows them through their life cycle. It assumes a set amount of capital available at each stage that is always spent on financing that set of companies. You can play with the model here. Right now, you can make changes to the model on Guesstimate, but no changes are saved once you leave the page. Varying the “exit multiple” and the number of deals participated in by VCs has the most dramatic effect on the model.
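
To make the structure described above concrete, here is a minimal Python sketch of a cohort funnel. It is not the Guesstimate model itself: the stage names, capital pools, and round-size ranges are made-up placeholders, and the rule "fund companies until the stage's capital or the surviving companies run out" is my simplifying assumption about how a fixed pool of capital per stage might be spent.

```python
import random

# ASSUMPTION: every number here is an illustrative placeholder,
# not a value taken from the Guesstimate model.
STAGES = [
    # (stage name, capital available in $, (min, max) round size in $)
    ("seed", 2_000_000_000, (250_000, 2_000_000)),
    ("series_a", 6_000_000_000, (3_000_000, 12_000_000)),
    ("series_b", 10_000_000_000, (10_000_000, 40_000_000)),
]


def simulate_cohort(n_startups, stages=STAGES, rng=None):
    """Follow one cohort of startups through each funding stage.

    At every stage, companies are funded one at a time with a round size
    drawn from a uniform range until either the stage's capital or the
    pool of surviving companies is exhausted. Only funded companies
    move on to the next stage.
    """
    rng = rng or random.Random()
    survivors = n_startups
    funded_per_stage = {}
    for name, capital, (low, high) in stages:
        funded = 0
        remaining = capital
        while funded < survivors:
            round_size = rng.uniform(low, high)
            if round_size > remaining:
                break  # this stage has run out of capital
            remaining -= round_size
            funded += 1
        funded_per_stage[name] = funded
        survivors = funded
    return funded_per_stage


cohort = simulate_cohort(4_000, rng=random.Random(0))
print(cohort)
```

Changing the capital pools or round sizes shows how a fixed amount of money at each stage caps the number of companies that can advance, which is the push and pull the model is meant to expose.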

Some key things learned and reinforced in the course of building the model:
  • There’s a huge amount of disagreement in just how many startups are started every year. The Kauffman Foundation says that ~6,000,000 new businesses are created, while not stating how many are high growth startups. Marc Andreessen says there are 4,000 startups that are created. In addition, people still don’t agree on the definition of a startup.
  • It’s really hard to build a startup. So, so many fail. The vast majority of new businesses fail to attract any angel or VC funds at all.
  • Power law distributions are still not internalized by people (and not well represented by this model). The magnitude and difference of returns that one company can generate is just astounding. WhatsApp raised a total of $60 million while exiting at a total valuation of $19 billion, a 316x return on invested capital. 50% of startups will fail to return anything, and the next 40% of startups above that will hopefully return the total invested capital of investors. It is the WhatsApps of the world, the top 1%, that bring home meaningful returns.
  • Angel investors make up a huge, not-as-often-recognized pool of capital for startups. $20 billion is invested per year by angels into startups. Their importance is hard to overstate at the earliest stages, where they enter 50 to 70k deals per year. This prominence has grown since the 2000s due to the low cost of doing a startup provided by AWS and other related services. Since the costs of starting a software startup have dropped so low, VCs aren’t able to deploy such little capital in one deal. Their model does not work like that. Angel investors do in fact generate a nice return, in line with VC returns.
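
The WhatsApp arithmetic and the rough return buckets in the list above can be checked with a few lines of Python. This is a toy sketch: the 100-company portfolio with equal check sizes is hypothetical, and the 3x multiple for the 9% of companies the text does not describe is purely my assumption. Note also that the multiple below is exit valuation over total capital raised, which is how the figure in the text is computed; an individual investor's actual return would depend on ownership and dilution.

```python
def return_multiple(exit_value, total_raised):
    """Exit valuation divided by total capital raised."""
    return exit_value / total_raised


# $19 billion exit on $60 million raised.
whatsapp_multiple = return_multiple(19e9, 60e6)  # about 316.7

# Hypothetical portfolio of 100 equal-sized investments, as return multiples.
portfolio = (
    [0.0] * 50                  # 50% return nothing
    + [1.0] * 40                # next 40% return the invested capital
    + [3.0] * 9                 # ASSUMPTION: the unspecified 9% return 3x
    + [whatsapp_multiple] * 1   # top 1%: a WhatsApp-like outcome
)

portfolio_multiple = sum(portfolio) / len(portfolio)
top_company_share = max(portfolio) / sum(portfolio)

print(f"WhatsApp multiple: {whatsapp_multiple:.1f}x")
print(f"Portfolio multiple: {portfolio_multiple:.2f}x")
print(f"Share of returns from the top company: {top_company_share:.0%}")
```

Under these assumptions a single company accounts for the large majority of the portfolio's returns, which is the point about power laws made above.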
As previously said, Guesstimate has only two distributions, normal and uniform, and isn’t able to capture much of the statistical reality of startups. While a normal distribution may be a good way to model the likelihood of a startup moving on to the next stage of funding, it’s not a very good way to measure the return generated at exit. Right now, the model is merely descriptive (and barely so). In the future, I’d like to move towards a prescriptive model to answer the question: “how can we change the current system to create more impactful innovation in the world?”. Questions such as "Do we need a more diverse group of VCs to allocate capital to different startups?" or "Is the most effective way to create innovation to pump more money towards VCs or to lower the cost of starting startups?" may be more easily answered with this model.
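
To illustrate why a normal distribution is a poor fit for exit returns, the sketch below compares how much of the total return comes from the top 1% of outcomes under a normal distribution versus a heavy-tailed one. The Pareto distribution and all parameters (mean 3x, standard deviation 1x, shape 1.2) are assumptions chosen for illustration, not estimates from real venture data.

```python
import random


def top_share(returns, fraction=0.01):
    """Fraction of total returns contributed by the top `fraction` of outcomes."""
    ordered = sorted(returns, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(42)
n = 100_000

# ASSUMPTION: normal returns centered on a 3x multiple, floored at zero.
normal_returns = [max(0.0, rng.gauss(3.0, 1.0)) for _ in range(n)]

# ASSUMPTION: Pareto returns with shape 1.2 as a stand-in for a power law.
pareto_returns = [rng.paretovariate(1.2) for _ in range(n)]

normal_top = top_share(normal_returns)
pareto_top = top_share(pareto_returns)

print(f"Top 1% share, normal: {normal_top:.1%}")
print(f"Top 1% share, Pareto: {pareto_top:.1%}")
```

With normally distributed returns the top 1% of companies contribute only a few percent of the total, while under the heavy-tailed distribution they contribute a large share of it, which is much closer to the WhatsApp-style outcomes described earlier.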

With that said, here are a few directions I’d like to explore:
  • Exploring how broader macroeconomic trends influence the startup industry. At the midpoint of 2015, China was on pace to invest $30 billion through venture capital. How will 2016 China influence funding this year, and how will this impact the startup ecosystem 5 - 10 years down the line? (Thanks Daniel)
  • How the industry (and cost of doing a startup) affects the rate of formation. While we’ve seen a veritable boom in the formation of software startups, the same can’t be said for life science startups, where the number of initial financings by VCs has remained unchanged. As the cost of doing startups comes down, we should see a pattern of more hardware and biology startups being funded at the early stages. PCH International and Transcriptic are working to do their part to lower costs in their respective industries.
  • Making this model more of a simulation to see how the ecosystem evolves over time. I would like to see how exits by the large companies are able to seed the next generation of angel investors and provide landing grounds for acquisitions. Silicon Valley wasn't built overnight. The dynamic process of companies exiting and investors passing on advice to the next generation is important to creating huge companies and innovative ecosystems.
  • Add more data! I’d like to see how individual firms, investors, and entrepreneurs are able to influence the growth of a startup instead of aggregated statistics provided by reports.
Thanks for reading! Drop a note on Twitter if you found this interesting!


Thanks to Daniel Kao, Jonathan Zong, and Reed Rosenbluth for reading a draft.

Observations on Company Culture

I recently visited around fifteen companies in SF, ranging from small startups just past Series A to 20-year-old internet companies. Without dropping any names, here are some observations.

Authentic belief in a company’s mission (that one’s work is actually important) is different from the normal lip service that companies pay when talking about “changing the world”. Culture isn’t just letting dogs in your office, or nice couches, or wearing Hawaiian shirts. You can literally smell the culture. It’s in the air, written on people’s faces, in how they speak and act. It’s imbued from the top down through founding stories and values, as well as from the bottom up through the interactions between co-workers and visitors. Everyone’s attitude influenced the overall culture, positively or negatively. We all know that communication is 85% body language; culture is communicated non-verbally as well.

It seems very, very hard to keep missionary cultures as companies grow. Finding engineers is hard enough, but finding engineers is harder still when they need to believe in the mission. Finding engineers is triply hard when a company is also quadrupling in size. Everywhere we went had smart people; that was clear. However, challenging them to do great work and getting them to believe is hard. The “craziness” of the mission (not a scientific measure) seemed directly correlated with the quality of people.

A focus on metrics and product direction lent a sense of urgency to everyday activity. We visited a company where the hockey stick was prominently featured in the center of the office. It’s a visual reminder of where the company is, where the company has been, and how the company is doing. Without a view of the metrics, they could kid themselves into believing that they were doing well. There was a huge difference between the companies that talked a big game of growth and those that could actually show outsiders their growth.

With all that said, here are a couple of my suggested ingredients for what makes a great culture: founder myths — the trials and tribulations of what the founders had to do to create change in the world (i.e. hero’s journey), missionary people — people who believe they are doing something for others, heaps of trust, a focus towards continual improvement, and luck.

Getting the culture right seems really, really hard, but seems vital to getting real work done.