Technical Review

In silico Modeling

Last modified: 2019-09-18

Copyright notice, citation: Copyright
© 2018-present, Victoria A. Stuart

These Contents

[Table of Contents]


In silico Modeling

In silico modeling – the computational modeling of biochemical, metabolic, pharmacologic or physiologic processes – is a logical extension of in vitro experimentation [In silico Modelling of Physiologic Systems (Dec 2011)]. A natural result of the explosive increase in computing power available to research scientists at continually decreasing cost, in silico modeling combines the advantages of in vivo and in vitro experimentation without being subject to the ethical considerations or the lack of control associated with many in vivo experiments (e.g. human or animal experimentation). In silico models also allow researchers to include a virtually unlimited array of parameters, potentially rendering the results more applicable to the organism as a whole.

Examples of recent work in the in silico domain include In vivo and In silico Dynamics of the Development of Metabolic Syndrome (Jun 2018) [code] and Systems Modelling of the EGFR-PYK2-c-Met Interaction Network Predicts and Prioritizes Synergistic Drug Combinations for Triple-Negative Breast Cancer (Jun 2018) [media].


The University of Connecticut’s “Virtual Cell” (VCell), described in Compartmental and Spatial Rule-Based Modeling with Virtual Cell (Oct 2017) [project pages; code; community; media], provides a comprehensive platform for modeling and simulation of cell biology from biological pathways down to cell biophysics. VCell supports biochemical network and rule-based modeling and electrophysiology in compartmental modeling and within cellular geometry.


One might inquire whether knowledge graphs, which naturally embed biochemical pathways and networks, are amenable to in silico perturbation. The Cellular Potts Model (CPM) is a computational biological model of the collective behavior of cellular structures. CPM allows modeling of many phenomena such as cell migration, clustering, and growth, taking adhesive forces, environment sensing, and volume and surface area constraints into account. In silico Modeling for Tumor Growth Visualization (Aug 2016) implemented a crude graphical model via a CPM that can be visualized in Cytoscape via cpm-cytoscape [project]. Those authors also described their work in Machine Learning for In Silico Modeling of Tumor Growth.
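For orientation, a commonly used generic form of the CPM energy function is sketched below. It is not necessarily the formulation used in the cited work, and all symbols are assumptions introduced here: $\small \sigma_i$ is the cell occupying lattice site $\small i$, $\small \tau(\sigma)$ is the type of cell $\small \sigma$, $\small J$ is the adhesion energy between cell types, $\small V$ and $\small S$ are a cell's volume and surface area with targets $\small V_T$ and $\small S_T$, and $\small \lambda_V$, $\small \lambda_S$ weight the constraints.

$$H = \sum_{\langle i,j \rangle} J\big(\tau(\sigma_i), \tau(\sigma_j)\big)\,\big(1 - \delta_{\sigma_i,\sigma_j}\big) \;+\; \lambda_V \sum_{\sigma} \big(V(\sigma) - V_T(\sigma)\big)^2 \;+\; \lambda_S \sum_{\sigma} \big(S(\sigma) - S_T(\sigma)\big)^2$$

In the usual Metropolis-style update, a proposed copy of one lattice site's cell label into a neighbouring site is accepted with probability $\small \min\big(1, e^{-\Delta H / T}\big)$, where $\small T$ is a "temperature" parameter that sets the amplitude of membrane fluctuations.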


[Figure note: ECM: extracellular matrix.]

In the synthetic biology domain, Out-of-Equilibrium Microcompartments for the Bottom-Up Integration of Metabolic Functions (Jun 2018) [media] recently described the analysis of self-sustained metabolic pathways in a microfluidic platform, water-in-oil droplet microcompartments. The authors developed an assay based on nicotinamide adenine dinucleotide (NADH) fluorescence to quantify the metabolic state of the microcompartments. The minimal metabolism was constructed from a reaction converting glucose-6-phosphate into 6-phosphogluconolactone. The reaction was catalysed by glucose-6-phosphate dehydrogenase, an enzyme involved in the pentose phosphate pathway. A key feature integrated the ability to function under conditions where the reaction was sustained independently of the cofactor stoichiometry. The full conversion of the metabolic substrate required the regeneration of the cofactor NAD+: a regeneration module made of inverted membrane vesicles extracted from E. coli.


The metabolic pathways exemplified in that microfluidic approach (above) are easily modeled in a knowledge graph, suggesting the possibility of the in silico modeling of pathway reactions in parallel with “wet lab,” systems biology approaches. Other relatively easily-constructed models (e.g.: bioengineered microbes and viruses on microbiological media; transgenic rodents; etc.) could likewise be modeled in a combined systems biology, in silico modeling approach. In this regard, clustered regularly interspaced short palindromic repeats/CRISPR associated protein 9 (CRISPR/Cas9) gene editing technology is especially relevant; for example, note Section 3 in CRISPR/Cas9-Based Genome Editing for Disease Modeling and Therapy: Challenges and Opportunities for Nonviral Delivery (Jun 2017), and The Present and Future of Genome Editing in Cancer Research (Jul 2017).  CRISPR/Cas9 for Cancer Research and Therapy (Apr 2018) provides an excellent review and perspective on this technology.

Cell-Free Gene-Regulatory Network Engineering with Synthetic Transcription Factors (Mar 2019) [data & code; media]

Significance. Understanding basic mechanisms and engineering new systems in biology are hampered by the challenge of quantifying and manipulating numerous molecular components and their interactions. In this work, we present an approach to tackle these difficulties, by combining a cell-free system with a microfluidic platform for high-throughput measurements. We apply the system to comprehensively characterize a library of synthetic zinc-finger transcription factors, which are common building blocks of transcriptional regulatory networks. We subsequently use the knowledge gained to engineer highly specific, tunable, and strong cooperative repressors, which can be applied to carry out logical computation on promoters.

Abstract. Gene-regulatory networks are ubiquitous in nature and critical for bottom-up engineering of synthetic networks. Transcriptional repression is a fundamental function that can be tuned at the level of DNA, protein, and cooperative protein-protein interactions, necessitating high-throughput experimental approaches for in-depth characterization. Here, we used a cell-free system in combination with a high-throughput microfluidic device to comprehensively study the different tuning mechanisms of a synthetic zinc-finger repressor library, whose affinity and cooperativity can be rationally engineered. The device is integrated into a comprehensive workflow that includes determination of transcription-factor binding-energy landscapes and mechanistic modeling, enabling us to generate a library of well-characterized synthetic transcription factors and corresponding promoters, which we then used to build gene-regulatory networks de novo. The well-characterized synthetic parts and insights gained should be useful for rationally engineering gene-regulatory networks and for studying the biophysics of transcriptional regulation.

Reinforcement learning (RL) can be applied to optimizing chemical and biochemical reactions. Optimizing Chemical Reactions with Deep Reinforcement Learning (Dec 2017) [code] showed that an RL model outperformed a state of the art algorithm and generalized to dissimilar underlying mechanisms. With an LSTM modeling the policy function, the RL agent optimized the chemical reaction as a Markov decision process (MDP) characterized by $\small \{S, A, P, R\}$, where $\small S$ was the set of experimental conditions (such as temperature, pH, etc.), $\small A$ was the set of all possible actions that can change the experimental conditions, $\small P$ was the transition probability from the current experimental condition to the next, and $\small R$ was the reward, which is a function of the state.

  • Their Deep Reaction Optimizer model iteratively recorded the results of a chemical reaction and chose new experimental conditions to improve the reaction outcome, outperforming a state of the art black box optimization algorithm by using 71% fewer steps on both simulations and real reactions. Furthermore, they introduced an efficient exploration strategy by drawing the reaction conditions from certain probability distributions, which improved the “regret” from 0.062 to 0.039 compared with a deterministic policy (they used “regret” to evaluate the performance of the model). For the optimization of real-world reactions, the authors carried out four experiments in microdroplets (Scheme 2 in their paper: synthesis of ribose phosphate, etc.) and recorded the production yield. Combining the efficient exploration policy with accelerated microdroplet reactions, their Deep Reaction Optimizer not only served as an efficient and effective reaction optimizer (optimal reaction conditions were determined in 30 min for the four reactions considered), it also provided a better understanding of the mechanism of chemical reactions than that obtained using traditional approaches.
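To make the $\small \{S, A, P, R\}$ framing concrete, the following is a minimal sketch of the optimization loop on a toy problem. It is not the authors' Deep Reaction Optimizer: a simple stochastic hill climber stands in for the LSTM policy, the yield surface `reaction_yield` is hypothetical, and "regret" is computed here as the average gap to the known optimum of that toy surface, which may differ from the paper's definition.

```python
import math
import random

# Hypothetical toy "reaction": yield peaks at 60 degrees C and pH 7 (not from the paper).
def reaction_yield(state):
    temperature, ph = state
    return math.exp(-((temperature - 60.0) / 20.0) ** 2 - ((ph - 7.0) / 2.0) ** 2)

def step(state, action):
    """Transition P: deterministic here, the action simply shifts the conditions."""
    return (state[0] + action[0], state[1] + action[1])

def optimize(n_steps=200, seed=0):
    rng = random.Random(seed)
    state = (25.0, 3.0)                      # S: initial experimental conditions
    best_state, best_reward = state, reaction_yield(state)
    rewards = []
    for _ in range(n_steps):
        # A: draw the change in conditions from a probability distribution
        action = (rng.gauss(0.0, 5.0), rng.gauss(0.0, 0.5))
        candidate = step(best_state, action)
        reward = reaction_yield(candidate)   # R: a function of the state
        rewards.append(reward)
        if reward > best_reward:             # keep conditions that improve the outcome
            best_state, best_reward = candidate, reward
    # Average gap to the optimum of the toy surface (whose maximum yield is 1.0)
    regret = sum(1.0 - r for r in rewards) / len(rewards)
    return best_state, best_reward, regret

best_state, best_reward, regret = optimize()
print(best_state, round(best_reward, 3), round(regret, 3))
```

Replacing the Gaussian draws with a fixed step rule would give a deterministic policy for comparison, which is the kind of contrast the regret figures above describe.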


In a revolutionary computational/in silico approach, Inferring Regulatory Networks from Experimental Morphological Phenotypes: A Computational Method Reverse-Engineers Planarian Regeneration (Jun 2015) [media: Planarian Regeneration Model Discovered by Artificial Intelligence] applied machine learning to uncover pathways associated with tissue regeneration. Planarian regeneration had been studied for over a century, but despite increasing insight into the pathways that control its stem cells, no constructive, mechanistic model had been found by scientists that explained more than one or two key features of its remarkable ability to regenerate its correct anatomical pattern after drastic perturbations. Those authors presented a method that inferred the molecular products, topology, and spatial and temporal non-linear dynamics of regulatory networks – recapitulating in silico the rich dataset of morphological phenotypes resulting from genetic, surgical, and pharmacological experiments. They demonstrated their approach by inferring complete regulatory networks explaining the outcomes of the main functional regeneration experiments in the planarian literature. By analyzing all the datasets together, their system inferred the first systems-biology comprehensive dynamical model explaining patterning in planarian regeneration. This method provided an automated, highly generalizable framework for identifying the underlying control mechanisms responsible for the dynamic regulation of growth and form.

  • “An artificial intelligence system has for the first time reverse-engineered the regeneration mechanism of planaria - the small worms whose extraordinary power to regrow body parts has made them a research model in human regenerative medicine. The discovery by Tufts University biologists presents the first model of regeneration discovered by a non-human intelligence and the first comprehensive model of planarian regeneration, which had eluded human scientists for over 100 years. …”


    • See also subsequent similar work (different authors), Cell Type Atlas and Lineage Tree of a Whole Complex Animal by Single-Cell Transcriptomics (May 2018), which mapped the transcriptome for essentially all cell types of a planarian (flatworm): dozens of cell types including stem cells, progenitors, and terminally differentiated cells. They then applied a new computational algorithm, partition-based graph abstraction (PAGA), which could predict a lineage tree for the whole animal in an unbiased way. Notably, their approach was applicable to other model and non-model organisms, assuming that their differentiation processes are sampled with sufficient time resolution.


    • In turn, that data was used as an example of an approach that determined the intrinsic dimensionality of data. Although large-scale datasets are frequently high-dimensional, their data frequently possess structures that significantly decrease their intrinsic dimensionality (ID) due to the presence of clusters, points being located close to low-dimensional varieties or fine-grained lumping. Estimating the Effective Dimension of Large Biological Datasets Using Fisher Separability Analysis (Jan 2019) [code] tested a dimensionality estimator that was based on analysing the separability properties of data points, on several benchmarks and real biological datasets. They showed that the introduced measure of ID had performance competitive with state of the art measures, being efficient across a wide range of dimensions and performing better in the case of noisy samples. Moreover, it allowed estimating the intrinsic dimension in situations where the intrinsic manifold assumption was not valid. [Note their Fig. 5.]


A Hybrid Stochastic Model of the Budding Yeast Cell Cycle (Jul 2019) [Supplemental Material].  “The growth and division of eukaryotic cells are regulated by complex, multi-scale networks. In this process, the mechanism controlling cell cycle progression has to be robust against inherent noise in the system. In this paper, a hybrid stochastic model is developed to study the effects of noise on the control mechanism of the budding yeast cell cycle. The modeling approach leverages, in a single multi-scale model, the advantages of two regimes: (1) the computational efficiency of a deterministic approach, and (2) the accuracy of stochastic simulations. Our results show that this hybrid stochastic model achieves high computational efficiency while generating simulation results that match very well with published experimental measurements.”
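The two regimes that the hybrid approach combines can be illustrated on a far simpler system than the yeast cell cycle. The sketch below, which is not the paper's model, runs an exact stochastic simulation (Gillespie's algorithm) of a birth–death process and compares its time-averaged copy number with the steady state of the corresponding deterministic rate equation, $\small dx/dt = k_{prod} - k_{deg}\,x$. The rate constants are hypothetical.

```python
import random

def gillespie_birth_death(k_prod, k_deg, t_end, seed=0):
    """Exact stochastic simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x)."""
    rng = random.Random(seed)
    t, x = 0.0, 0
    times, counts = [t], [x]
    while t < t_end:
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        t += rng.expovariate(a_total)          # waiting time to the next reaction
        if rng.random() * a_total < a_prod:    # choose which reaction fires
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts

def time_average(times, counts):
    """Average copy number, weighting each count by how long it persisted."""
    total = sum(c * (t1 - t0) for c, t0, t1 in zip(counts, times, times[1:]))
    return total / (times[-1] - times[0])

K_PROD, K_DEG = 10.0, 0.5                      # hypothetical rate constants
times, counts = gillespie_birth_death(K_PROD, K_DEG, t_end=500.0, seed=1)
stochastic_mean = time_average(times, counts)
deterministic_steady_state = K_PROD / K_DEG    # from dx/dt = 0
print(stochastic_mean, deterministic_steady_state)
```

The deterministic value costs one division, whereas the stochastic trajectory needs thousands of random events but also reports the fluctuations around that value. A hybrid scheme aims to spend the stochastic effort only where the noise matters.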

The examples above hint at the potential advances afforded by machine learning in the biochemical/medical domain, which also include the prediction of biomolecular secondary structure [e.g. rawMSA: proper Deep Learning makes protein sequence profiles and feature extraction obsolete (Aug 2018)], biodesign (e.g. of new anticancer drugs) and inverse molecular design (very well-reviewed in the July 2018 Science paper Inverse Molecular Design using Machine Learning: Generative Models for Matter Engineering), and numerous other applications. These approaches offer excellent opportunities for collaboration in the advancement of our understanding of metabolism, cellular signaling, regulatory mechanisms, and disease (including cancer).

ODE: Ordinary Differential Equations

An ordinary differential equation (ODE) is a differential equation containing one or more functions of one independent variable and the derivatives of those functions. The term ordinary is used in contrast with the term partial differential equation [see my Partial Derivatives entry for a few examples] which may be with respect to more than one independent variable.
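As a small worked example, the sketch below integrates the kinetic ODEs for a single first-order conversion A → B, $\small dA/dt = -kA$ and $\small dB/dt = kA$, with a classical fourth-order Runge–Kutta scheme and compares the result with the analytic solution $\small A(t) = A_0 e^{-kt}$. The reaction and the rate constant `K` are hypothetical and chosen only for illustration.

```python
import math

def rk4(f, y0, t0, t1, n_steps):
    """Classical fourth-order Runge-Kutta for a system dy/dt = f(t, y)."""
    h = (t1 - t0) / n_steps
    t, y = t0, list(y0)
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

K = 0.3  # hypothetical first-order rate constant

def kinetics(t, y):
    a, b = y
    return [-K * a, K * a]   # A -> B: what A loses, B gains

a_end, b_end = rk4(kinetics, [1.0, 0.0], 0.0, 10.0, 1000)
print(a_end, math.exp(-K * 10.0))   # numerical vs analytic A(10)
```

There is a single independent variable (time), which is what makes the system "ordinary"; adding spatial diffusion of A would introduce derivatives with respect to position and turn it into a partial differential equation.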



Ordinary Differential Equations and Boolean Networks in Application to Modelling of 6-Mercaptopurine Metabolism (Apr 2017).  “We consider two approaches to modelling the cell metabolism of 6-mercaptopurine, one of the important chemotherapy drugs used for treating acute lymphocytic leukaemia: kinetic ordinary differential equations, and Boolean networks supplied with one controlling node, which takes continual values. We analyse their interplay with respect to taking into account ATP concentration as a key parameter of switching between different pathways. It is shown that the Boolean networks, which allow avoiding the complexity of general kinetic modelling, preserve the possibility of reproducing the principal switching mechanism.”

  • “There is no supporting material or special data since all equations, parameters used and algorithm are presented in the main text of the work and could be reproduced by anyone. Thus, paper contains complete self-sufficient information and does not need any additional data files.”

  • “In this work, we have analysed the dynamic behaviour of the metabolic pathways of 6-MP with a focus on revealing the key parameter that switches between the two principal ‘branches’, slow and fast. The results of simulations based on the system of ordinary equations indicate that ATP is the desired ‘key player’ in 6-MP metabolism. This conclusion is supported by a number of phenomenological observations presented in the modern biomedical literature and allows for quantitative clarification of the underlying processes.

    “Based on the results of ODE modelling, we have reformulated the problems in terms of the hybrid Boolean network, which can be considered as a deterministic analogue of the probabilistic Boolean networks. This approach is much simpler in realization since it does not require the knowledge of multiple kinetic parameters but, at the same time, adequately reproduces the key details of the switching principal dynamic regimes as a choice between different possible pathways. Therefore, it can be scaled to a more detailed picture of metabolite interactions in future research of the studied process.”


Computational Modeling of the Metabolic States Regulated by the Kinase Akt  (Nov 2012).  “… We compared two metabolic states generated by the specific variation of the fluxes regulated by the activity of the PI3K/Akt/mTOR  pathway. One state represented the metabolism of a growing cancer cell characterized by aerobic glycolysis and cellular biosynthesis (condition H), while the other (condition L) represented the same metabolic network with a reduced glycolytic rate, a reduced lactic acid production, but a higher MPM, as reported in literature in relation to a lower activity of PI3K/Akt/mTOR  (DeBerardinis et al., 2008). Some steps of the metabolic network that link glycolysis and PPP, namely those catalyzed by the G6PDH and TKL enzymes, revealed their importance for the cancer metabolic state. Results from our model may provide insight and assist in the selection of drug targets in anticancer treatments. …”  |  “The model is available in BioModels database (Li et al., 2010a) with identifier MODEL1210150000.”


The PI3K/Akt/mTOR pathway regulates central carbon metabolism. (A) PI3K/Akt/mTOR pathway. Signaling through the PI3K/Akt/mTOR pathway begins with the activation of RTKs in response to growth factors, leading to auto-phosphorylation on tyrosine residues and trans-phosphorylation of adaptor proteins. The PI3K is responsible for the production of 3-phosphoinositide lipid second messengers, including PIP3, which contributes to the activation of many downstream targets, such as PDK1 and mTORC2. Both PDK1 and mTORC2 activate, through phosphorylation in different sites, the serine-threonine protein kinase Akt. Akt regulates multiple functions including cellular metabolism, by promoting cell growth and proliferation through the activation of mTORC1, which also enhances the transcriptional activity of HIF-1α. Dashed lines represent the negative regulation of the PI3K/Akt/mTOR pathway by the action of mTORC1 feedback mechanism.
(B) The metabolic network with the main reactions of glucose metabolism. The main pathways involved in the glucose metabolism are considered: glycolysis, PPP, the glycogen synthesis and degradation, lactate, and MPM branches. The metabolic targets regulated by PI3K/Akt/mTOR pathway are represented on the network: the PI3K/Akt/mTOR direct regulation is presented in yellow; the PI3K/Akt/mTOR indirect regulation (via HIF-1α) is presented in pink; the PI3K/Akt/mTOR direct and indirect regulation is presented in orange. All the PI3K/Akt/mTOR direct and indirect targets considered here are positively regulated, with the only exception of the MPM. Allosteric regulators (modifiers), activators (+), or inhibitors (-), are depicted in red.  [ ... snip ... ]

Metabolic Flux Analysis and Metabolic Engineering of Microorganisms (Feb 2008).  “Recent advances in metabolic flux analysis including genome-scale constraints-based flux analysis and its applications in metabolic engineering are reviewed. Various computational aspects of constraints-based flux analysis including genome-scale stoichiometric models, additional constraints used for the improved accuracy, and several algorithms for identifying the target genes to be manipulated are described. Also, some of the successful applications of metabolic flux analysis in metabolic engineering are reviewed. Finally, we discuss the limitations that need to be overcome to make the results of genome-scale flux analysis more realistically represent the real cell metabolism.”


[Figure: Metabolic stoichiometric matrix (amenable to linear algebra calculations).]

Systems Approaches to Modelling Pathways and Networks (Sep 2011).  “It has become commonly accepted that systems approaches to biology are of outstanding importance to gain understanding from the vast amount of data which is presently being generated by advancing high-throughput technologies. The diversity of methods to model pathways and networks has significantly expanded over the past two decades. Modern and traditional approaches are equally important and recent activities aim at integrating the advantages of both. While traditional methods, based on differential equations, are useful to study the dynamics of small systems, modern constraint-based models can be applied to genome-scale systems, but are not able to capture dynamic features. Integrating different approaches is important to develop consistent theoretical descriptions encompassing various scales of biological information. The rapid progress of the field of theoretical systems biology, however, demonstrates how our fundamental theoretical understanding of biology is gaining momentum. The scientific community has apparently accepted the challenge to truly understand the principles of life.”


Neural Ordinary Differential Equations (University of Toronto; Vector Institute: Jun 2018, updated Jan 2019; Best Paper award at NeurIPS 2018) [code; slides; discussion: reddit, The Morning Paper, Hacker News]

“We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.”
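The central idea in the abstract can be written compactly. In the sketch below, a residual-network update is read as one Euler step of an ODE, and the continuous-depth model replaces the discrete stack of layers with an integral evaluated by an ODE solver. The notation is introduced here for illustration: $\small h$ is the hidden state, $\small f$ is the neural network with parameters $\small \theta$, and $\small T$ is the end of the integration interval.

$$h_{t+1} = h_t + f(h_t, \theta_t) \quad \longrightarrow \quad \frac{d h(t)}{dt} = f\big(h(t), t, \theta\big), \qquad h(T) = h(0) + \int_{0}^{T} f\big(h(t), t, \theta\big)\, dt$$

The input layer corresponds to $\small h(0)$ and the output to $\small h(T)$, so the solver's step-size control, rather than a fixed layer count, determines how many function evaluations are spent on each input.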


[Figure: Example result obtained with Continuous Normalizing Flows (two moons problem from paper).]


Metabolic Models

Time-Resolved Genome-Scale Profiling Reveals a Causal Expression Network (Google AI | Calico Life Sciences: May 2019)

  • “We present an approach for inferring genome-wide regulatory causality and demonstrate its application on a yeast dataset constructed by independently inducing hundreds of transcription factors and measuring timecourses of the resulting gene expression responses. We discuss the regulatory cascades in detail for a single transcription factor, Aft1; however, we have 201 TF induction timecourses that include >100,000 signal-containing dynamic responses. From a single TF induction timecourse we can often discriminate the direct from the indirect effects of the induced TF. Across our entire dataset, however, we find that the majority of expression changes are indirectly driven by unknown regulators. By integrating all timecourses into a single whole-cell transcriptional model, potential regulators of each gene can be predicted without incorporating prior information. In doing so, the indirect effects of a TF are understood as a series of direct regulatory predictions that capture how regulation propagates over time to create a causal regulatory network. This approach, referred to as the Perturbation Inference of Networks (PIN) framework, resulted in the prediction of multiple transcriptional regulators that were validated experimentally.”

FBA: Flux Balance Analysis

Flux balance analysis (FBA) is a mathematical method for simulating metabolism in genome-scale reconstructions of metabolic networks. In comparison to traditional methods of modeling, FBA is less intensive in terms of the input data required for constructing the model. Simulations performed using FBA are computationally inexpensive and can calculate steady-state metabolic fluxes for large models (over 2000 reactions) in a few seconds on modern personal computers.

FBA finds applications in bioprocess engineering to systematically identify modifications to the metabolic networks of microbes used in fermentation processes that improve product yields of industrially important chemicals such as ethanol and succinic acid. It has also been used for the identification of putative drug targets in cancer and pathogens, rational design of culture media, and more recently host-pathogen interactions. The results of FBA can be visualized using flux maps that illustrate the steady-state fluxes carried by reactions (for example, those of glycolysis), with the thickness of each arrow proportional to the flux through the reaction.

FBA formalizes the system of equations describing the concentration changes in a metabolic network as the dot product of a matrix of the stoichiometric coefficients (the stoichiometric matrix S) and the vector v of the unsolved fluxes. The right-hand side of the dot product is a vector of zeros representing the system at steady state. Linear programming is then used to calculate a solution of fluxes corresponding to the steady state.
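Written out, the steady-state condition and the linear program take the following standard form. The objective weights $\small c$ (for example, selecting a biomass or product flux) and the flux bounds $\small v_{\min}$, $\small v_{\max}$ are symbols assumed here for illustration; $\small x$ denotes the vector of metabolite concentrations.

$$\frac{dx}{dt} = S\,v = 0, \qquad \max_{v}\; c^{\top} v \quad \text{subject to} \quad S\,v = 0, \quad v_{\min} \le v \le v_{\max}$$

Because there are usually more reactions than metabolites, $\small S\,v = 0$ alone leaves many admissible flux vectors; the objective and the bounds are what select one of them.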


From Slide 13 in Flux Balance Analysis:

  • One of the techniques used to analyse the complete metabolic genotype of a microbial strain is flux balance analysis (FBA):

    • relies on balancing metabolic fluxes
    • is based on the fundamental law of mass conservation
    • is performed under steady-state conditions (an example of constraint …)
    • requires information only about:
      1. the stoichiometry of metabolic pathways,
      2. metabolic demands,
      3. and a few strain specific parameters
    • it does NOT require enzymatic kinetic data
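The points above can be sketched on a toy network with three metabolites and five reactions. Everything in it is hypothetical: the network, the bounds, and the choice of the "biomass" flux as objective. A brute-force scan over integer fluxes stands in for the linear-programming solver used in real FBA tools. Note that only stoichiometry, a demand, and bounds are supplied, with no kinetic data.

```python
# Columns are reactions: R1 uptake -> A, R2 A -> B, R3 A -> C,
# R4 B -> biomass (objective), R5 C -> by-product.
S = [
    [1, -1, -1,  0,  0],   # metabolite A
    [0,  1,  0, -1,  0],   # metabolite B
    [0,  0,  1,  0, -1],   # metabolite C
]
LOWER = [0, 0, 2, 0, 0]        # v3 >= 2: hypothetical obligatory by-product flux
UPPER = [10, 10, 10, 10, 10]   # uptake and all other fluxes capped at 10

def mass_balance(S, v):
    """Net production rate of each metabolite, i.e. the product S v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

def is_feasible(v):
    at_steady_state = all(abs(r) < 1e-9 for r in mass_balance(S, v))
    within_bounds = all(lo <= x <= hi for lo, x, hi in zip(LOWER, v, UPPER))
    return at_steady_state and within_bounds

def scan():
    best = None
    for v2 in range(0, 11):
        for v3 in range(0, 11):
            # Choosing v2 and v3 fixes the rest if S v = 0 is to hold.
            v = [v2 + v3, v2, v3, v2, v3]
            if is_feasible(v) and (best is None or v[3] > best[3]):
                best = v
    return best

best = scan()
print(best)   # [10, 8, 2, 8, 2]
```

The optimum uses the full uptake of 10, diverts the obligatory 2 units to the by-product, and sends the remaining 8 to biomass.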

Dynamic Modeling of Signal Transduction by mTOR Complexes in Cancer (May 2019)

  • Signal integration in the mTOR pathway plays a vital role in cell fate decision making in cancer cells. As a signal integrator, mTOR shows a complex dynamical behavior which determines the cell fate at different cellular processes levels including cell cycle progression, cell survival, cell death, metabolic reprogramming, and aging. The dynamics of the complex responses to rapamycin in cancer cells have been attributed to its differential time-dependent inhibitory effects on mTORC1 and mTORC2, the two main complexes of mTOR. Two explanations were previously provided for this phenomenon: 1- Rapamycin does not inhibit mTORC2 directly, whereas it prevents mTORC2 formation by sequestering free mTOR protein. 2- Components like Phosphatidic Acid further stabilize mTORC2 compared with mTORC1. To understand the mechanism by which rapamycin differentially inhibits the mTOR complexes, we present a mathematical model of rapamycin mode of action based on the first explanation, i.e., Le Chatelier’s principle. Translating the interactions among components of mTORC1 and mTORC2 into a mathematical model revealed the dynamics of rapamycin action in different doses and time-intervals of rapamycin treatment. The model shows that rapamycin has stronger effects on mTORC1 compared with mTORC2, simply due to its direct interaction with free mTOR and mTORC1, but not mTORC2, without the need to consider other components that might further stabilize mTORC2. Based on our results, even when mTORC2 is less stable compared with mTORC1, it can be less inhibited by rapamycin.

  • Supplementary File 1 (pdf)
  • Supplementary Figure 1 (pdf)
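The sequestration argument can be explored with a toy mass-action model. The sketch below is not the published model: the species list is reduced to free mTOR, Raptor, Rictor, the two complexes, and rapamycin; all rate constants and totals are hypothetical (arbitrary units); both complexes are deliberately given identical assembly kinetics; and a simple forward-Euler scheme is used. Rapamycin is allowed to bind free mTOR and mTORC1 but not mTORC2, as described above.

```python
def simulate(rapamycin_dose, t_end=200.0, dt=0.01):
    # Hypothetical rate constants and totals, chosen only for illustration.
    k_on, k_off = 1.0, 0.1         # complex assembly / disassembly (same for both)
    k_bind, k_unbind = 1.0, 0.1    # rapamycin binding / unbinding
    mtor, raptor, rictor = 1.0, 1.0, 1.0
    c1 = c2 = 0.0                  # rapamycin-free mTORC1 and mTORC2
    rapa = rapamycin_dose
    rapa_mtor = rapa_c1 = 0.0      # rapamycin bound to free mTOR / to mTORC1
    for _ in range(int(t_end / dt)):
        f1 = k_on * mtor * raptor - k_off * c1             # mTOR + Raptor <-> mTORC1
        f2 = k_on * mtor * rictor - k_off * c2             # mTOR + Rictor <-> mTORC2
        f3 = k_bind * rapa * mtor - k_unbind * rapa_mtor   # rapamycin + free mTOR
        f4 = k_bind * rapa * c1 - k_unbind * rapa_c1       # rapamycin + mTORC1 only
        mtor += dt * (-f1 - f2 - f3)
        raptor += dt * (-f1)
        rictor += dt * (-f2)
        c1 += dt * (f1 - f4)
        c2 += dt * f2
        rapa += dt * (-f3 - f4)
        rapa_mtor += dt * f3
        rapa_c1 += dt * f4
    # Every mTOR-containing species, so the values sum to total mTOR.
    return {"mTORC1": c1, "mTORC2": c2, "free_mTOR": mtor,
            "rapa_mTOR": rapa_mtor, "rapa_mTORC1": rapa_c1}

untreated = simulate(0.0)
treated = simulate(2.0)
for name in ("mTORC1", "mTORC2"):
    print(name, round(untreated[name], 3), "->", round(treated[name], 3))
```

With identical assembly kinetics the two complexes start out equal, so any difference after treatment comes from rapamycin's direct binding to mTORC1; mTORC2 is reduced only indirectly, through the loss of free mTOR.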


Synthetic Biology

A CRISPR/Cas9-Based Central Processing Unit to Program Complex Logic Computation in Human Cells (Mar 2019) [discussion]

  • Significance. “By enabling rational programming of mammalian cell behavior, synthetic biology is driving innovation across biomedical applications. Using Cas9-variants as core protein-based central processing units (CPUs) that control gene expression in response to single-guide RNAs as genetic software, we have programmed scalable Boolean logic computations such as the half adder in single human cells. Combining orthogonal Cas9-variants enabled the design of multicore genetic CPUs that provide parallel arithmetic computations. The Cas9-based multicore CPU design may provide opportunities in single-cell mammalian biocomputing to provide biomedical applications.”

  • Abstract. “Controlling gene expression with sophisticated logic gates has been and remains one of the central aims of synthetic biology. However, conventional implementations of biocomputers use central processing units (CPUs) assembled from multiple protein-based gene switches, limiting the programming flexibility and complexity that can be achieved within single cells. Here, we introduce a CRISPR/Cas9-based core processor that enables different sets of user-defined guide RNA inputs to program a single transcriptional regulator (dCas9-KRAB) to perform a wide range of bitwise computations, from simple Boolean logic gates to arithmetic operations such as the half adder. Furthermore, we built a dual-core CPU combining two orthogonal core processors in a single cell. In principle, human cells integrating multiple orthogonal CRISPR/Cas9-based core processors could offer enormous computational capacity.”
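For readers less familiar with the arithmetic involved, a half adder takes two input bits and returns their sum bit (XOR) and carry bit (AND). The sketch below builds one purely from NOR gates, on the assumption that a repressor such as dCas9-KRAB is naturally abstracted as a gate whose output is ON only when no input is present. The wiring is illustrative and is not taken from the paper.

```python
from itertools import product

def nor(*inputs):
    """Repressor-style gate: output is 1 only when every input is 0."""
    return int(not any(inputs))

def half_adder(a, b):
    n1 = nor(a, b)            # 1 only when both inputs are absent
    n2 = nor(a, n1)
    n3 = nor(b, n1)
    xnor = nor(n2, n3)        # 1 when a == b
    sum_bit = nor(xnor)       # NOT xnor = a XOR b
    carry = nor(sum_bit, n1)  # 1 only when a == b and not both 0, i.e. a AND b
    return sum_bit, carry

for a, b in product((0, 1), repeat=2):
    print(a, b, half_adder(a, b))
```

Six NOR gates suffice here. In a cellular implementation each gate would correspond to a guide-RNA-controlled repression step, so the gate count is a rough proxy for the number of genetic parts required.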

A Framework for the Modular and Combinatorial Assembly of Synthetic Gene Circuits (Apr 2019)

  • “Synthetic gene circuits emerge from iterative design-build-test cycles. Most commonly, the time-limiting step is the circuit construction process. Here, we present a hierarchical cloning scheme based on the widespread Gibson assembly method [see also] and make the set of constructed plasmids freely available. Our two-step modular cloning scheme allows for simple, fast, efficient and accurate assembly of gene circuits and combinatorial circuit libraries in Escherichia coli. The first step involves Gibson assembly of transcriptional units from constituent parts into individual intermediate plasmids. In the second step, these plasmids are digested with specific sets of restriction enzymes. The resulting flanking regions have overlaps that drive a second Gibson assembly into a single plasmid to yield the final circuit. This approach substantially reduces time and sequencing costs associated with gene circuit construction and allows for modular and combinatorial assembly of circuits. We demonstrate the usefulness of our framework by assembling a double-inverter circuit and a combinatorial library of 3-node networks.”


    [Image source. Click image to open in new window.]


    [Image source. Click image to open in new window.]
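The bookkeeping behind the two-step scheme described above can be sketched as a simple enumeration: intermediate plasmids are built once and then reused across many final circuits. The part names, the reduction of a transcriptional unit to a (promoter, gene) pair, and the function names are hypothetical simplifications. The sketch models only the combinatorics, not the cloning chemistry.

```python
from itertools import product

# Hypothetical part names, used only to illustrate the bookkeeping.
PROMOTERS = ["promoter_1", "promoter_2"]
GENES = ["gene_a", "gene_b", "gene_c"]


def assemble_units(promoters, genes):
    """Step 1: one intermediate plasmid per (promoter, gene) transcriptional unit."""
    return {gene: [(promoter, gene) for promoter in promoters] for gene in genes}


def assemble_circuits(units_by_gene):
    """Step 2: choose one unit per gene and join the choices into one circuit."""
    return list(product(*units_by_gene.values()))


units = assemble_units(PROMOTERS, GENES)
circuits = assemble_circuits(units)
n_plasmids = sum(len(group) for group in units.values())
print(n_plasmids, "intermediate plasmids ->", len(circuits), "three-node circuits")
```

In this toy setting the number of intermediate plasmids grows additively with the number of parts, while the number of reachable circuits grows multiplicatively, which is the practical appeal of a modular, combinatorial assembly.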

A Compact Synthetic Pathway Rewires Cancer Signaling to Therapeutic Effector Release (Science: May 2019) [media]

  • Seeking and Destroying Cancer Cells.  Rather than inhibiting aberrant signaling in cancer cells, what if that signal was put to work in detecting and destroying the cancer cells? Such an anticancer strategy could be based on the ErbB family of receptors that is activated in many cancers. Chung et al. developed a cell-killing circuit that is activated by excessive ErbB signaling and used a viral delivery system to add it to cells. The ErbB receptor proteins are tyrosine kinases that autophosphorylate. The authors designed a protease that would be recruited to such over-phosphorylated receptors. Once in place, the protease cleaves a protein anchored to the cell membrane, releasing it into the cytoplasm where it causes death of the transformed cells.

  • Introduction.  The specific identification and ablation of cancer cells is a long-standing problem in medicine that has not been fully solved. Cancer cells differ from normal cells in their ability to proliferate and survive in an uncontrolled manner, a consequence of mutations that drive the constitutive activation of intracellular signaling pathways. Signals commonly activated in cancer have been targeted for suppression by drugs, but these drugs are limited by toxic effects from inhibiting normal signaling as well. We considered a fundamentally different approach to cancer treatment in which oncogenic signals are detected and then, instead of being suppressed, are co-opted to trigger a therapeutic program. Here, we describe a method, Rewiring of Aberrant Signaling to Effector Release (RASER), in which a compact synthetic signaling pathway detects an oncogenic signal with high specificity and then rewires it to a variety of customizable responses.

  • Rationale.  As oncogenic signaling differs from normal signaling in its constitutive nature, we hypothesized that signal integration over time should provide a measurement specific for cancer states. We also hypothesized that proteolytic release from sequestration could provide a mechanism for both irreversible signal integration and activation of a variety of effector proteins. We chose ErbB proteins, which include EGFR and HER2 and are constitutively activated in a large fraction of solid tumors, as targets for detection and rewiring of oncogenic states. We then designed ErbB-specific RASER on the basis of recruitment of a sequence-specific viral protease to release effector domains from sequestration at the plasma membrane. Modular design of these synthetic proteins facilitated tuning of sensitivity over a broad range and allowed output functions to be generalized to a variety of effectors. Finally, we developed a complete mathematical model of RASER to accurately predict system behavior, enabling the rational identification of strategies for system optimization.

  • Results.  Three rounds of model-guided optimization resulted in a RASER system with three mechanisms contributing to ErbB-induced effector release: protease recruitment to active ErbB, substrate recruitment to active ErbB, and ErbB-dependent stabilization of the protease. The final ErbB RASER system successfully released a reporter protein in a variety of ErbB-driven cancer cells but not in tissue-matched ErbB-normal cells. The fold induction of reporter release in the presence of constitutive ErbB exceeded that of Akt or ERK activation, two endogenous signaling outputs of ErbB. As desired, ErbB RASER output was similar to baseline after activation of ErbB by treatment with its natural ligand, epidermal growth factor (EGF), whereas Akt and ERK were well activated by EGF, indicating that ErbB RASER was specifically responsive to oncogenic ErbB states. We then successfully programmed RASER to link ErbB signaling to a variety of biological outcomes, including apoptosis and activation of endogenous genes of choice via catalytically dead Cas9-mediated RNA-directed transcription. These responses occurred robustly in ErbB-hyperactive cancer cells in an ErbB activity-dependent manner and were again absent in ErbB-normal cells. Finally, we used nonintegrating adeno-associated virus (AAV) to deliver ErbB RASER with an apoptotic cargo to cocultures of ErbB-hyperactive pancreatic cancer cells and ErbB-normal hepatocytes. As desired, AAV-delivered ErbB RASER selectively ablated the pancreatic cancer cells but spared the ErbB-normal hepatocytes.

  • Conclusion.  RASER introduces a new concept for cancer detection and treatment, in which specific oncogenic signals are detected in cancer cells and then used to trigger a programmable therapeutic response. The performance of RASER demonstrates that synthetic signaling pathways based on first principles can detect an oncogenic protein activity with specificity matching or exceeding that of natural signaling pathways. RASER also serves as an example of the ability of mathematical models to accurately predict the behavior of compact synthetic signaling systems. Finally, the ability of AAV-delivered ErbB RASER to selectively ablate ErbB-hyperactive tumor cells suggests the possibility of using RASER-expressing viruses to treat cancer in vivo. Further generalization of RASER to other inputs and outputs could enable the development of a panel of active biological therapies targeted to specific cancerous states.

  • Abstract.  “An important goal in synthetic biology is to engineer biochemical pathways to address unsolved biomedical problems. One long-standing problem in molecular medicine is the specific identification and ablation of cancer cells. Here, we describe a method, named Rewiring of Aberrant Signaling to Effector Release (RASER), in which oncogenic ErbB receptor activity, instead of being targeted for inhibition as in existing treatments, is co-opted to trigger therapeutic programs. RASER integrates ErbB activity to specifically link oncogenic states to the execution of desired outputs. A complete mathematical model of RASER and modularity in design enable rational optimization and output programming. Using RASER, we induced apoptosis and CRISPR-Cas9-mediated transcription of endogenous genes specifically in ErbB-hyperactive cancer cells. Delivery of apoptotic RASER by adeno-associated virus selectively ablated ErbB-hyperactive cancer cells while sparing ErbB-normal cells. RASER thus provides a new strategy for oncogene-specific cancer detection and treatment.”

  • Image: RASER in cancer cells. (Left) In response to ErbB (EGFR or HER2) activity, RASER proteins (green and blue) release a programmable effector to carry out therapeutic responses. (Right) RASER transforms normal signaling, which is transient, to low and transient accumulation of effector. In contrast, with constitutive oncogenic signaling, effector accumulates until a therapeutic threshold is reached.


    [Image source. Click image to open in new window.]
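The idea in the figure caption above, that integrating signal over time separates transient from constitutive signaling, can be illustrated with a one-variable leaky integrator. This is a minimal sketch and not the authors' mathematical model of RASER: the equation, the effector turnover term, the rate constants, the pulse length, and the threshold value are all assumptions chosen for illustration.

```python
def peak_effector(signal, t_end=30.0, dt=0.01, k_release=1.0, k_decay=0.1):
    """Integrate dE/dt = k_release * signal(t) - k_decay * E; return the peak of E."""
    effector = peak = 0.0
    for step in range(round(t_end / dt)):
        t = step * dt
        # Explicit Euler step: release driven by the signal, first-order turnover.
        effector += dt * (k_release * signal(t) - k_decay * effector)
        peak = max(peak, effector)
    return peak


def transient(t):
    """Normal signaling: a short pulse of receptor activity (assumed length 1.0)."""
    return 1.0 if t < 1.0 else 0.0


def constitutive(t):
    """Oncogenic signaling: receptor activity that never switches off."""
    return 1.0


THRESHOLD = 5.0  # hypothetical therapeutic threshold, arbitrary units

for name, signal in (("transient", transient), ("constitutive", constitutive)):
    peak = peak_effector(signal)
    print(name, "peak effector:", round(peak, 2), "crosses threshold:", peak >= THRESHOLD)
```

Under these assumed parameters, the brief pulse releases too little effector to approach the threshold, while the sustained signal accumulates effector well past it.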

Rational Design of a Bifunctional AND-Gate Ligand to Modulate Cell-Cell Interactions (Jul 2019).  “Protein ‘AND-gate’ systems, in which a ligand acts only on cells with two different receptors, direct signaling activity to a particular cell type and avoid action on other cells. In a bifunctional AND-Gate protein, the molecular geometry of the protein domains is crucial. Here we constructed a tissue-targeted erythropoietin (EPO) that stimulates red blood cell (RBC) production without triggering thrombosis. EPO was directed to RBC precursors and mature RBCs by fusion to an anti-glycophorin A antibody V region. Many such constructs activated EPO receptors in vitro and stimulated RBC and not platelet production in mice but nonetheless enhanced thrombosis in mice and caused adhesion between RBCs and EPO receptor-bearing cells. Based on a protein-structural model of the RBC surface, we rationally designed an anti-glycophorin/EPO fusion that does not induce cell adhesion in vitro or enhance thrombosis in vivo. Thus, meso-scale geometry can inform design of synthetic-biological systems.”

A synthetic metabolic network for physicochemical homeostasis (Sep 2019) [media].  “One of the grand challenges in chemistry is the construction of functional out-of-equilibrium networks, which are typical of living cells. Building such a system from molecular components requires control over the formation and degradation of the interacting chemicals and homeostasis of the internal physical-chemical conditions. The provision and consumption of ATP lies at the heart of this challenge. Here we report the in vitro construction of a pathway in vesicles for sustained ATP production that is maintained away from equilibrium by control of energy dissipation. We maintain a constant level of ATP with varying load on the system. The pathway enables us to control the transmembrane fluxes of osmolytes and to demonstrate basic physicochemical homeostasis. Our work demonstrates metabolic energy conservation and cell volume regulatory mechanisms in a cell-like system at a level of complexity minimally needed for life.”