Data science at Indigo helps us understand our data from “strain to grain to market,” a process that begins with our in-house microbe collection, extends through research and development, and culminates in the commercialization and sale of Indigo products. Data analyses allow us not only to characterize microbial strains but also to predict their efficacy, assess their field performance, and advise how best to use them.

Indigo's R&D Data Science team, made up of computational biologists Cory Colaneri, Max Winston, Nichole Wespe, and Jacob Oppenheim (pictured, left to right), applies cutting-edge computational analysis to catalog new microbial strains, determine their functional capabilities, and understand their broader communities. The team ultimately merges this information with laboratory and field data to drive the rapid discovery and development of new Indigo microbial products. Combined with the field performance of our strains and the characterization of our products, these insights enable Indigo to quickly design, improve, and market seed treatments and software tools.

Max: I was finishing my PhD at UChicago in evolutionary genomics, and I was interested in doing something outside of academia but still research-related. Indigo seemed like a great place—is a great place—because I can do the interesting research and data science that exists out there, while working with real biological data and using a lot of what I studied in school. For me, [Indigo] just aligned with a lot of the skills I had and the skills I wanted to develop. And the core mission here—what we’re all about—is something that’s important.

Cory: I was working on my master's at UMass Boston, studying biology with a bioinformatics focus. I was a TA for the bioinformatics undergrad class, and from there was recruited as a summer intern. I thought that sounded great, to get some real-world experience, and at the end of the summer I liked it at then-Symbiota so much that I decided to stay on full time. And that was like two-and-a-half years ago, so here I am!

Nichole: I came to Indigo directly through the Insight Data Science program, which is a seven-week training program for people who’ve completed PhDs—or are in the process of completing PhDs—and want to transition into doing data science. I kind of failed to make that full transition, ending up in a computational biologist role, but that was actually what I was looking for out of grad school, so it worked out perfectly for me.

Jacob: My background’s in physics. I did my PhD at a medical research institute though, so I always knew I wanted to work in industry—I thought biotechnology. I got out of grad school and got a job at a company that did advanced mathematical modeling of clinical trial data to understand subtypes in cancer and things like this. Not long after, I decided that I wanted to be at a place where we generated the data, so I ended up at Indigo, doing data science. I guess I was the first data scientist hired to do analysis of data from the laboratory, from the fields, etc. Over time, that required more and more people. So first Max joined, [and] Cory came back to R&D at some point—which was fantastic, because all the good press for data science comes from things that Cory has done [building bioinformatics tools for our bench scientists]! And then, of course, Nichole is here now, which is great, because I only play at being a biologist, and she actually knows what’s going on.

Max: Well, I think recently, since the team was formed, we’ve had pretty defined niches, which is really nice. I tend to do community sequencing data—so anything that has to do with the microbiome, and then also some machine learning for nominations [of strains into our pipeline].

Cory: A lot of the work that I’m doing is focused on strain IDs; we have all these microbes in the freezer and we want to know what they are, for regulatory, biosafety, and nomination purposes. And I also do general bioinformatics programming that people need.

Nichole: And I’m the genomes person! So that’s looking at all the genes in an organism—but focusing on a single organism versus Max, who looks at all the organisms in a group. And then, I compare strains in our collection, and what genes they have or don’t have.

Max: I’d say there’s a lot of overlap…one of the things we brought up at our offsite was that we need other eyes from other people, right? So this might be my project, but I’ll come and ask—I just asked Cory a programming question 15 minutes ago. Or if I make some decision, I talk to the rest of the group to get some consensus.

Jacob: All of us have different backgrounds: Cory has a background in bioinformatics and computational biology, Max in quantitative ecology, Nichole in molecular biology, and I in physics. I think this actually gives a lot of strength to the team. We're in an interesting position, because we're in the Tech team, but we support R&D. So we're always kind of in the middle of things, and having a group of people with divergent skills, who can help us fulfill our mission of technological solutions for R&D, is best.


Max: The mentor that I had was John Novembre, who was recently named a MacArthur Fellow and works in human genetics; [he] does a lot of really cool work at UChicago combining data science and biology. I think that one of the biggest benefits of the program that I went to at Chicago, the Committee on Evolutionary Biology, is that you have a lot of independence. So that allowed your interests [to] evolve and change. For me, that meant getting more and more computational and statistical as time went on.

Nichole: Sean Eddy went from being a bench biologist to computation, too. I remember hearing about his background one time when he was giving a talk at Harvard, and I was like, “Oh, it’s possible—to go from basically just doing molecular biology at the bench to writing computational programs!”

Jacob: Myself? There's this "proud tradition" (I'm going to overstate this) of physicists becoming pretty good biologists at some point in their lives. I can think of people like Francis Crick, who ended up doing really fundamental work in biology, not only discovering the fundamental structure of DNA but also trying to figure out how exactly you would encode information in DNA. There are also people like Howard Berg. I think he was a physical chemist of some sort, but he ended up figuring out how E. coli "sense" their environment. He figured out how that whole system works. Really, just absolutely beautiful work. The sense that [in] applying quantitative, principled methods and some basic theory you can really gain insight into complicated biological systems is something that I find very motivating.



Cory: OK, what keeps me up at night…I don’t know, climate change, probably. What gets me up in the morning? Usually my cat, meowing for breakfast.

Max: In the morning I’m excited to get to work. Because despite any frustrations that anyone’s gonna have at any job on the day-to-day, I like what we’re doing. That’s exciting; that’s why I like working here. And the yogurt. Let’s not forget about the yogurt.

Nichole: So, what keeps me up at night is (not in a bad way) planning the next steps in a project I’m working on. And in the morning, I’m definitely motivated by the promise of coffee first. I enjoy coming to work ’cause of the team, and [to] move forward with analyzing these awesome data sets.

Jacob: What gets me to work in the morning I think is pretty simple: I love the people that I work with, and I’m really glad that we’re trying to change the world and doing interesting science at the same time. I think that gets me out of bed in the morning, and probably makes me think too much about things on the weekend as well. What keeps me up at night? Not getting enough exercise to clear my mind.

Cory: Anyway, I feel like I should give a better answer for what gets me out of bed in the morning. You can put my silly answer, but I do really like working here, and I like the people I work with. And I feel like the work the company is doing is important.

Max: In terms of getting up in the morning: I’m a pretty curious person, and there are, like, a million questions to answer.

