“Where do I find the most promising innovations at the university?”
For four years, no other question vexed me more acutely. My job at the University of Chicago was fourfold: 1) find the university’s best research, 2) turn that research into startups, 3) invest in the startups, and 4) help them grow. I expected steps 2 through 4 to be hard. But step 1? How hard can it be to find great research at a place like UChicago? I should have known better.
At a place like UofC, great research is abundant. But the type of research that can become the foundation of a great product or company? That is much harder to find.
Over the years, people have tried a number of approaches to solve this problem. The founders of Arch Venture famously created their first investments by walking the halls of UChicago and convincing the smartest professors and post-docs they could find to start companies. Nowadays, innovation searchers - including VC funds and corporate innovation scouts - rely on a combination of three strategies:
Read the Literature - Subscribe to the top journals in a field of interest, find the most interesting recent papers, and reach out to the authors directly.
Walk the Halls - Show up at conferences, lectures, department meetings, and faculty offices.
Phone a Friend - Cultivate a network of reliable contacts in your areas of interest and periodically poll them about what’s exciting.
These are tried-and-true approaches that still work, but they have significant drawbacks. Relying on the literature, you run the risk of being beaten to the punch by your competitors. Walking the halls, you risk spending too much time in the wrong places. And phoning friends may work for the extremely well connected, but these types of connections take an entire career to build. What to do in the meantime?
A 21st Century Approach
The answer, like many (all?) important innovations, lies at the intersection of several emerging fields and technologies. James Evans, the head of the Knowledge Lab at UChicago, calls this particular intersection “The Science of Science”.
The deluge of digital data on scholarly output offers unprecedented opportunities to explore patterns characterizing the structure and evolution of science. The science of science (SciSci) places the practice of science itself under the microscope, leading to a quantitative understanding of the genesis of scientific discovery, creativity, and practice and developing tools and policies aimed at accelerating scientific progress. - Fortunato et al., Science of Science
Several tools have emerged over the last decade which, when combined, make SciSci possible.
Research Graphs: Scientific innovation leaves a paper trail. A BIG one. Patents, publications, grants, clinical trial protocols, SEC filings… all provide different views into the scientific enterprise and its outputs. But how do we make this all useful? A first wave of innovation took place when Web of Science (early ‘90s) and Google Scholar (early ‘00s) used internet databases to unlock the power of citation relationships - uncovering the ligaments, so to speak, that connect science together. The next wave, driven by advances in AI, is coming into full flower today. These advances - especially language models - allow us not only to draw connections between documents, but also to extract meaning from the documents themselves in new ways.
Language Models: GPT-3 and Stable Diffusion are just the most recent offerings from the field that, arguably, has seen the most transformational progress over the last 10 years: machine learning. The same family of tools that now makes it possible to generate a fake painting of Nick Cage by Bob Ross can also be used to “read” papers and patents, “interpret” what they mean, and potentially even write new ones.
Omics: Yes, I too am weary of the increasingly ubiquitous -ome suffix (spliceosome, secretome... seriously?). But the deluge of tools being developed to, e.g., tease out relationships between gene networks and disease phenotypes, can be repurposed far beyond the realms of biology. Even, perhaps, to identifying the next generation of star researchers.
By combining these technologies, we can now answer a variety of “innovation search” questions with unprecedented specificity and resolution. We can, for example:
Map the top faculty, startups, investors, and scientific competencies of a region
Uncover the key people and trends driving any field or technology
Uncover the hidden patterns linking innovation inputs (funding, people, infrastructure) and outputs (inventions, startups)
Predict “innovation phenotypes” from “research genotypes”
Discover disruptive research or PIs before they hit Nature
Introducing Portal Stargaze™
Imagine innovation as the bubbles within a lava lamp or, alternatively, as the galaxies within the visible universe. One galaxy, perhaps, is clinical informatics. Another is nucleic acid delivery. Another is proximity-induced degradation. The contents of each galaxy are the metaknowledge artifacts of innovation: patents, papers, grants, etc.
As we zoom in on each galaxy, we can see that additional clusters begin to take shape. The nucleic acid delivery galaxy, for example, resolves into several sub-galaxies representing viral, polymer, and lipid-based methods. The opposite process happens as we zoom out - the galaxies appear to recombine into mega-disciplines like physical chemistry, molecular biology, and clinical medicine.
And as we scroll across time, we can see that these galaxies, and the stars within them, move and change shape. “Stars” from the fields of medicine and nuclear physics begin to peel off and then collide to form a new galaxy - radiotherapy - with several subclusters. A star on the fringes of oncology - Jim Allison - suddenly moves towards its centroid and shifts its center of gravity.
Each galaxy has other characteristics as well. Some galaxies seem to “emit” startups at rates higher than others. Others suck up grant dollars without emitting much of anything. And a very few seem to be breeding grounds for supernova researchers who create one VC-backed startup after another.
How we do it
To create innovation galaxies, and visualize them, we utilize a multistep approach, an approximation of which follows.
First, we must collect information on every relevant biotech patent, paper, grant, and startup from the last 20 years. We then link them together to form a Biotech Hypergraph. This is the hardest part.
Second, we create individual fingerprints, or coordinates, for each one. This is where the recent advances in natural language processing come in.
Third, we apply bioinformatics-inspired models to discover knowledge clusters - galaxies - and show how they evolve and interact over time.
Lastly, we train an algorithm to look at all of the “researcher stars” in our galaxy, and to predict the type of star that they are likely to become in their full maturity. For example: basic researcher, inventor, founder, and/or superstar.
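To make the second and third steps more concrete, here is a toy sketch in Python. It is not the actual Stargaze pipeline: the four documents are invented, the "fingerprint" is a plain bag-of-words count standing in for a language-model embedding, and the "galaxy" finder simply groups documents whose fingerprints are similar enough (connected components over a similarity threshold). The names `DOCS`, `fingerprint`, `cosine`, and `find_galaxies` are all hypothetical, and the linking and prediction steps are left out entirely.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for real patents, papers, and grants.
DOCS = {
    "paper_1": "lipid nanoparticle delivery of mrna",
    "patent_1": "lipid nanoparticle formulation for mrna delivery",
    "paper_2": "targeted protein degradation with protac molecules",
    "grant_1": "protac molecules for targeted protein degradation",
}


def fingerprint(text):
    """Bag-of-words counts; an assumed stand-in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_galaxies(docs, threshold=0.5):
    """Group documents into connected components of the similarity graph."""
    prints = {name: fingerprint(text) for name, text in docs.items()}
    unassigned = set(docs)
    galaxies = []
    for name in docs:
        if name not in unassigned:
            continue
        unassigned.remove(name)
        stack, galaxy = [name], []
        while stack:
            current = stack.pop()
            galaxy.append(current)
            # Pull in every unassigned document similar enough to this one.
            near = [o for o in docs if o in unassigned
                    and cosine(prints[current], prints[o]) >= threshold]
            for other in near:
                unassigned.remove(other)
                stack.append(other)
        galaxies.append(sorted(galaxy))
    return galaxies


if __name__ == "__main__":
    for galaxy in find_galaxies(DOCS):
        print(galaxy)
```

On this toy corpus, the two nucleic acid delivery documents end up in one group and the two degradation documents in another. Raising or lowering `threshold` is a crude analogue of zooming in and out: a stricter threshold splits groups apart, while a looser one merges them.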
Using Portal Stargaze for Innovation Ecology
In the history of exploration, each wave of discovery has been enabled by a new instrument. The compass, the astrolabe, the sextant, the microscope, the optical telescope, the space telescope - each of these allowed humans to chart and explore previously unseen elements of the universe.
With the literal unfurling of the James Webb Space Telescope, we are witnessing yet again the delight of discovery that a new instrument can unleash, as the previously unseen becomes seen.
We hope that Portal Stargaze and other SciSci tools will play this role for the next generation of innovation ecologists - allowing us to “see” innovation like never before and to create innovation ecosystems of increasing power and sophistication.
If you’re interested in joining us on this journey, or know someone who might be, please subscribe and/or share this piece below!
Next Post in this Series: Can AI spot patterns in successful innovators' careers?
Great information!
Hi Steve, When you mentioned portal stargaze the other night at portal's event in Atlanta, I really couldn't wrap my head around it. But this essay clarified a lot of stuff. Really nice essay!