Predictive Phenomics Projects
The following FY2023 projects support the three thrust areas within the Predictive Phenomics S&T Initiative.
TA1: Enhancing Multi-Scale Phenomics Measurements
PhenoProfiling
Ryan McClure & Nick Reichart
The complex microbiomes that exist within environmental and animal hosts are responsible for countless biochemical activities that are crucial to normal function. Genes and transcripts alone fail to reveal the complex subcellular arrangement of proteins and molecules that elicit a given phenotype. This project integrates the use of activity-based probes with fluorescence-activated cell sorting of microbial communities of various complexity for robust assessment of the biochemical mechanisms that support population-level phenotypes.
Structural Profiling
John Melchior
The functional pathways that define phenotype require carefully regulated interactions of proteins with other proteins, lipids, and metabolites. In essence, this protein “interactome” underpins the key biological processes essential to life. These interactions require conformational adaptations of protein structure, which are now appreciated to be essential drivers of biological function. This project will develop and integrate two complementary technologies that provide readouts on changes in protein structure: thermal proteome profiling and limited proteolysis. Grounded in bottom-up mass spectrometry, these technologies will provide a high-throughput, global readout of the structural proteome of a biological system. A deeper understanding of these protein interactions will ultimately facilitate generation of molecular tools that can predict and control biological phenotype.

Multi-PTM Profiling
Tong Zhang
Proteins are executors of biological functions. The function of a protein can be controlled by various forms of post-translational modifications (PTMs) including phosphorylation, redox modification, acetylation, and ubiquitination. By modulating protein activity, protein-protein interaction, protein turnover, and sub-cellular localization, PTMs provide a versatile mechanism for regulating cell signaling and molecular phenotypes. However, most current workflows use lengthy and labor-intensive procedures and target only a single type of PTM. This project aims to develop an analytical capability for large-scale profiling of multiple PTMs. A universal sample processing method that is compatible with multi-PTM analysis will be developed and automated for versatility and scalability. Thus, this work will integrate multiple PTM-profiling workflows and greatly simplify the sample processing procedure. Multi-PTM data will be generated to advance the understanding of complex biological systems, including the host-pathogen interaction between human lung cells and coronavirus as well as PTM-driven light/dark regulation in cyanobacteria.
Phenotype Control Using Nanobody Technology
Amy Sims & Bojana Leonard
In response to environmental stimuli, biological systems activate and deactivate functional response networks, resulting in a variety of stimulus-specific phenotypes. Understanding the functional response networks that regulate phenotypic outcomes will allow manipulation and control of the observed phenotypes within the biological system. However, identification of these functional responses is challenging because some knowledge of the response network is usually required. To solve this problem, we are testing an approach that screens synthetic libraries of nanobodies (minimal active recognition domain from single-chain antibodies derived from camelids) for ones that bind target antigens only following specific stimuli (e.g., infection). We hypothesize that these stimulus-specific nanobodies will target antigens within the functional response networks and allow us to control the observed phenotypes, without having prior knowledge of the response networks or manipulating the biological system. This new capability can be applied to a variety of biological systems under a wide range of external stimuli.

TA2: Identifying Molecular Patterns of Biological Function
Reducing C and N model
Pavlo Bohutskyi & Kyle Pomraning
All organisms exist on Earth as part of complex and interacting multi-species communities. These natural communities are very productive, adaptive to changes, robust to stresses, and can perform complicated multi-step biological processes. However, because of their tremendous complexity, it is challenging to comprehend the fundamental principles governing their establishment and function, and to harness the potential of natural communities for biomanufacturing. Therefore, we have established flexible synthetic microbial communities as metabolically linked, genetically tractable, and biomanufacturing-relevant biological platforms for advanced biomanufacturing and predictive phenomics studies. These model communities are subjected to various perturbations and multi-omics data collection studies to identify how metabolic and phenotypic responses are driven by the gene regulatory systems.
Viral Infection model
Amy Sims
To reproduce, viruses take living hosts hostage. Host cells provide the resource-rich environment necessary for replication and transmission. Viral infections activate and deactivate key cellular functions to move resources to viral replication factories while “hiding” from host defenses. We will identify proteins within the cell that have altered structure as a surrogate measure for infection-induced changes in host functions. At the same time, we will identify cell functions targeted by the virus to modify whether host genes can make proteins. Our data will improve current understanding of how viruses manipulate host environments. We predict that we will identify new host-based targets to block virus replication, overcoming limitations of current antivirals that target only viral proteins and frequently fail due to rapid viral evolution.
CarbStor
Ryan McClure
With the increase in CO2 in the atmosphere, new strategies are needed to remove atmospheric carbon and store it in a stable manner in the soil. Soil bacterial communities can drive this storage through the microbial production of CaCO3 (calcium carbonate). However, the interactions within a microbial community that drive certain species to express this carbon sequestration phenotype are unknown. Here, we are building naturally evolved soil microbial communities that produce CaCO3 and carrying out detailed molecular analysis of the constituent members, their expressed functions, and shared metabolites. We can then map the interactions between the carbon-sequestering members and the other members of the consortia that drive the carbon storage phenotypes. Identifying these interactions can enable us to augment carbon sequestration even further through modifications to the community, maximizing our ability to store carbon safely in soil.
Genome Code Expansion in diverse bacterial models
Joshua Elmore & Elise Van Fossen
Genetic code expansion (GCE) drives the incorporation of chemical properties not found in nature into proteins. With these ‘new-to-nature’ protein functionalities we gain a predictive understanding of biomolecular processes at a level previously unattainable with conventional methods. This proposal will employ cutting-edge genome engineering techniques to expand the genetic code of diverse organisms as well as generate custom chemically-modified proteins. This will unlock desirable phenotypes in engineered bacteria by expanding the range of molecules that microbes can produce or modify for applications in bioproduction, carbon storage, and bioremediation.
TA3: Computational Methods - Phenotypic Signatures
Predicting cellular states from emergent protein assemblies
Margaret Cheung
Predicting cell regulation phenotypes
Ethan King
Advances in genetics are giving us ever more detailed maps of the chemical reaction pathways available in cells. But predicting how cells will regulate the use of these reactions under different environmental conditions remains challenging. If we can predict this regulation, we can build more accurate models to accelerate our understanding and guide bioengineering. This kind of prediction is difficult in biology due to the vast diversity among organisms. However, all organisms are under selective pressure to grow and reproduce efficiently, which shapes regulation over the course of evolution. We are developing new theory and tools that combine this evolutionary perspective with knowledge of the energetics of reactions to predict cell regulation phenotypes.
Bayesian Prediction Framework
Lisa Bramer
The value of ‘omics data, such as proteomics data, is largely determined by the availability of functional interpretation in the context of phenotype. We will develop a multi-level Bayesian modeling framework, which leverages existing biological knowledge to elicit prior distributions, for the prediction of phenotypes or identification of gene targets associated with a phenotypic response of interest. The framework is built in a manner that allows automatic updating of prior distributions and the integration of data from multiple experiments. We will create an automated workflow for developing organism-specific graphical biological knowledgebases from published literature and databases. We will also develop a Bayesian graphical models pipeline to identify candidate biomolecules related to phenotypes of interest and to integrate multiple omics data types within the Bayesian modeling framework.
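The idea of automatically updating a prior as data from multiple experiments arrive can be illustrated with a deliberately simple conjugate model. The sketch below is not the project's framework: it assumes a single Beta prior on the probability that a gene is associated with a phenotype, and treats each experiment as a count of positive observations out of a number of trials. The class name `BetaPrior`, the function `update_across_experiments`, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class BetaPrior:
    """Beta(alpha, beta) belief about an association probability (illustrative only)."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Expected value of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, hits: int, trials: int) -> "BetaPrior":
        # Conjugate Beta-Binomial update: add successes to alpha, failures to beta.
        if not 0 <= hits <= trials:
            raise ValueError("hits must be between 0 and trials")
        return BetaPrior(self.alpha + hits, self.beta + (trials - hits))


def update_across_experiments(
    prior: BetaPrior, experiments: Iterable[Tuple[int, int]]
) -> BetaPrior:
    """Fold (hits, trials) pairs in one at a time; each posterior becomes the next prior."""
    posterior = prior
    for hits, trials in experiments:
        posterior = posterior.update(hits, trials)
    return posterior


# Hypothetical prior elicited from existing knowledge: weak belief in an association.
prior = BetaPrior(alpha=2.0, beta=8.0)
# Hypothetical data from two experiments, given as (hits, trials).
posterior = update_across_experiments(prior, [(3, 10), (6, 10)])
print(f"prior mean = {prior.mean:.3f}, posterior mean = {posterior.mean:.3f}")
```

Because the update is conjugate, processing the experiments sequentially gives the same posterior as pooling them, which is what makes this kind of incremental updating straightforward in the simplest case. A multi-level model of the kind described above would generally need more than this closed-form step.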
Multi-scale modeling Framework
Vlad Petyuk & Jeremy Zucker
Model-based prediction and control of desirable phenotypic traits is a grand challenge in systems and synthetic biology. The objective of this project is to build tools and a framework for effective multi-scale modeling of microbial consortia. By integrating biological knowledge with multi-omics measurements, we can identify leverage points for inducing the desired phenotype. We will pursue this objective through a range of empirical and simulation approaches that leverage the Reducing C and N project and complement the Predicting cell regulation phenotypes project. We propose to take advantage of Vivarium, a multi-physics simulation environment that can flexibly address a range of biological questions from whole-cell modeling to population dynamics. We will use Vivarium to predict and control the population dynamics of microbial consortia, extend it to solve inverse problems, and develop a multi-fidelity, closed-loop test harness for evaluating active learning algorithms that suggest genetic modifications for improving the productivity of microbial cell factories.