### Technical Review

# IN SILICO MODELING

In silico modeling – the computational modeling of biochemical, metabolic, pharmacologic or physiologic processes – is a logical extension of in vitro experimentation [In silico Modelling of Physiologic Systems (Dec 2011)]. A natural result of the explosive increase in computing power available to research scientists at continually decreasing cost, in silico modeling combines the advantages of in vivo and in vitro experimentation without being subject to the ethical considerations or the lack of control associated with many in vivo experiments (e.g. human or animal experimentation). In silico models also allow researchers to include a virtually unlimited array of parameters, potentially rendering the results more applicable to the organism as a whole.

Examples of recent work in the in silico domain include In vivo and In silico Dynamics of the Development of Metabolic Syndrome (Jun 2018), for which code is available.

The University of Connecticut’s “Virtual Cell” (VCell; see Compartmental and Spatial Rule-Based Modeling with Virtual Cell) provides a comprehensive platform for modeling and simulation of cell biology, from biological pathways down to cell biophysics. VCell supports biochemical network modeling, rule-based modeling and electrophysiology, both in compartmental models and within cellular geometry.

One might inquire whether knowledge graphs, which naturally embed biochemical pathways and networks, are amenable to in silico perturbation. The Cellular Potts Model (CPM) is a computational biological model of the collective behavior of cellular structures. CPM allows modeling of many phenomena such as cell migration, clustering, and growth, taking adhesive forces and environment sensing as well as volume and surface area constraints into account. In silico Modeling for Tumor Growth Visualization (Aug 2016) implemented a crude graphical model via a CPM that can be visualized in Cytoscape via the cpm-cytoscape project. Those authors also described their work in Machine Learning for In Silico Modeling of Tumor Growth.
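
As a rough illustration of how a CPM update works, the following minimal sketch runs Metropolis copy attempts on a small periodic lattice with an adhesion term and a volume constraint. It is not the implementation used by the authors cited above: the lattice size, the parameters `J`, `lam`, `target` and `T`, and all function names are assumptions chosen for illustration, and the surface area constraint and environment sensing are omitted.

```python
import math
import random


def neighbors(x, y, n):
    """Four nearest neighbours of (x, y) on an n x n lattice with periodic edges."""
    return [((x + 1) % n, y), ((x - 1) % n, y), (x, (y + 1) % n), (x, (y - 1) % n)]


def unlike_bonds(grid, x, y, cell_id):
    """Number of neighbours of (x, y) whose id differs from cell_id."""
    n = len(grid)
    return sum(1 for i, j in neighbors(x, y, n) if grid[i][j] != cell_id)


def metropolis_step(grid, volumes, rng, J=1.0, lam=0.5, target=16, T=1.0):
    """Try to copy a random neighbour's id into a random site; True if accepted."""
    n = len(grid)
    x, y = rng.randrange(n), rng.randrange(n)
    sx, sy = rng.choice(neighbors(x, y, n))
    old, new = grid[x][y], grid[sx][sy]
    if old == new:
        return False
    # Adhesion term (assumed form): energy J for every bond between unlike ids.
    dH = J * (unlike_bonds(grid, x, y, new) - unlike_bonds(grid, x, y, old))
    # Volume term (assumed form): lam * (volume - target)^2 per cell.
    # Id 0 is the surrounding medium and carries no volume constraint.
    for cid, dv in ((old, -1), (new, 1)):
        if cid != 0:
            v = volumes[cid]
            dH += lam * ((v + dv - target) ** 2 - (v - target) ** 2)
    # Metropolis rule: always accept downhill moves, uphill with Boltzmann probability.
    if dH <= 0 or rng.random() < math.exp(-dH / T):
        grid[x][y] = new
        volumes[old] -= 1
        volumes[new] += 1
        return True
    return False


def simulate(steps=5000, n=12, seed=0):
    """Two adjacent 4 x 4 cells (ids 1 and 2) in medium (id 0)."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    for x in range(2, 6):
        for y in range(2, 6):
            grid[x][y] = 1
        for y in range(6, 10):
            grid[x][y] = 2
    volumes = {cid: sum(row.count(cid) for row in grid) for cid in (0, 1, 2)}
    for _ in range(steps):
        metropolis_step(grid, volumes, rng)
    return grid, volumes
```

Calling `simulate()` returns the final lattice and per-cell volumes; raising `lam` holds the cells closer to the target volume, while raising `T` makes energetically unfavorable copies more likely.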

[Figure. ECM: extracellular matrix.]

In the synthetic biology domain, Out-of-Equilibrium Microcompartments for the Bottom-Up Integration of Metabolic Functions (Jun 2018) recently described the analysis of self-sustained metabolic pathways in a microfluidic platform based on water-in-oil droplet microcompartments. The authors developed an assay based on nicotinamide adenine dinucleotide (NADH) fluorescence to quantify the metabolic state of the microcompartments. The minimal metabolism was constructed from a reaction converting glucose-6-phosphate into 6-phosphogluconolactone. The reaction was catalysed by glucose-6-phosphate dehydrogenase, an enzyme involved in the pentose phosphate pathway. A key feature was the ability to function under conditions where the reaction was sustained independently of the cofactor stoichiometry. The full conversion of the metabolic substrate required the regeneration of the cofactor NAD+, which was provided by a regeneration module made of inverted membrane vesicles extracted from E. coli.
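
To make the role of cofactor regeneration concrete, here is a minimal kinetic sketch of the two coupled steps described above, integrated with a simple forward Euler scheme. It is an illustrative toy model, not the authors' analysis: mass-action kinetics, the rate constants `k_ox` and `k_regen`, the initial concentrations and the time units are all assumptions.

```python
def simulate_minimal_metabolism(g6p0=1.0, nad0=0.1, k_ox=5.0, k_regen=1.0,
                                dt=0.01, t_end=100.0):
    """Forward-Euler integration of two assumed mass-action steps:

    oxidation:     G6P + NAD+ -> 6PGL + NADH   (catalysed by G6PDH)
    regeneration:  NADH -> NAD+                (inverted membrane vesicles)

    All parameter values are hypothetical and in arbitrary units.
    """
    g6p, pgl, nad, nadh = g6p0, 0.0, nad0, 0.0
    for _ in range(int(t_end / dt)):
        v_ox = k_ox * g6p * nad       # oxidation rate
        v_regen = k_regen * nadh      # cofactor regeneration rate
        g6p -= v_ox * dt
        pgl += v_ox * dt
        nad += (v_regen - v_ox) * dt
        nadh += (v_ox - v_regen) * dt
    return {"G6P": g6p, "6PGL": pgl, "NAD+": nad, "NADH": nadh}


# Sub-stoichiometric cofactor (0.1 NAD+ per 1.0 G6P), with and without regeneration.
with_regen = simulate_minimal_metabolism(k_regen=1.0)
without_regen = simulate_minimal_metabolism(k_regen=0.0)
```

Under these assumptions, conversion without the regeneration step stalls once the initial NAD+ pool is consumed, whereas with regeneration the substrate is converted essentially completely even though the cofactor is present at one tenth of the substrate concentration.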

The metabolic pathways exemplified in that microfluidic approach (above) are easily modeled in a knowledge graph, suggesting the possibility of the in silico modeling of pathway reactions in parallel with “wet lab” systems biology approaches. Other relatively easily constructed models (e.g. bioengineered microbes and viruses on microbiological media; transgenic rodents; etc.) could likewise be studied in a combined systems biology and in silico modeling approach. In this regard, clustered regularly interspaced short palindromic repeats/CRISPR associated protein 9 (CRISPR/Cas9) gene editing technology is especially relevant; for example, note Section 3 in CRISPR/Cas9-Based Genome Editing for Disease Modeling and Therapy: Challenges and Opportunities for Nonviral Delivery (Jun 2017), and The Present and Future of Genome Editing in Cancer Research (Jul 2017). CRISPR/Cas9 for Cancer Research and Therapy (Apr 2018) provides an excellent review and perspective on this technology.
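
One way to picture a pathway held in a knowledge graph and perturbed in silico is sketched below. The two reactions are taken from the microcompartment description above, but the data layout, the relation names (`substrate_of`, `produces`, `catalyzes`) and the functions `to_triples` and `producible` are hypothetical choices made for this example, and the perturbation is a purely qualitative reachability check rather than a kinetic simulation.

```python
# Hypothetical encoding of the two reactions described in the text.
REACTIONS = {
    "G6P_oxidation": {
        "substrates": {"G6P", "NAD+"},
        "products": {"6PGL", "NADH"},
        "catalyst": "G6PDH",
    },
    "NAD_regeneration": {
        "substrates": {"NADH"},
        "products": {"NAD+"},
        "catalyst": "inverted membrane vesicles",
    },
}


def to_triples(reactions):
    """Flatten the reactions into (subject, relation, object) triples."""
    triples = []
    for name, r in reactions.items():
        triples += [(s, "substrate_of", name) for s in sorted(r["substrates"])]
        triples += [(name, "produces", p) for p in sorted(r["products"])]
        triples.append((r["catalyst"], "catalyzes", name))
    return triples


def producible(reactions, seeds, knocked_out=()):
    """Species reachable from the seed species when some catalysts are removed."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for r in reactions.values():
            if r["catalyst"] in knocked_out:
                continue  # perturbation: this reaction cannot fire
            if r["substrates"] <= available and not r["products"] <= available:
                available |= r["products"]
                changed = True
    return available


# Starting from G6P and reduced cofactor only, 6PGL is reachable solely
# through the regeneration module.
intact = producible(REACTIONS, {"G6P", "NADH"})
perturbed = producible(REACTIONS, {"G6P", "NADH"},
                       knocked_out={"inverted membrane vesicles"})
```

Here `intact` contains 6PGL while `perturbed` does not, which is the kind of qualitative prediction that could then be checked against a wet lab experiment.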

Reinforcement learning (RL) can be applied to the optimization of chemical and biochemical reactions. Optimizing Chemical Reactions with Deep Reinforcement Learning (Dec 2017) [code] showed that an RL model outperformed a state-of-the-art algorithm and generalized to dissimilar underlying mechanisms. With an LSTM modeling the policy function, the RL agent optimized the chemical reaction, formulated as a Markov decision process (MDP) characterized by $\small \{S, A, P, R\}$, where $\small S$ was the set of experimental conditions (such as temperature, pH, etc.), $\small A$ was the set of all possible actions that can change the experimental conditions, $\small P$ was the transition probability from the current experimental condition to the next condition, and $\small R$ was the reward, which is a function of the state.

• Their Deep Reaction Optimizer model iteratively recorded the results of a chemical reaction and chose new experimental conditions to improve the reaction outcome, outperforming a state-of-the-art black box optimization algorithm by using 71% fewer steps on both simulations and real reactions. Furthermore, they introduced an efficient exploration strategy by drawing the reaction conditions from certain probability distributions, which resulted in an improvement in “regret” from 0.062 to 0.039 compared with a deterministic policy (they used “regret” to evaluate the performance of the model). For the optimization of real-world reactions, the authors carried out four experiments in microdroplets (see their paper: synthesis of ribose phosphate, etc.) and recorded the production yield. Combining the efficient exploration policy with accelerated microdroplet reactions, their Deep Reaction Optimizer not only served as an efficient and effective reaction optimizer (optimal reaction conditions were determined in 30 min for the four reactions considered), it also provided a better understanding of the mechanism of chemical reactions than that obtained using traditional approaches.
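
The MDP framing above can be sketched with a toy loop in which the state is a pair of experimental conditions, the action is a change to those conditions, and the reward is the yield at the new state. This is not the Deep Reaction Optimizer: the LSTM policy is replaced by a plain stochastic hill climber that draws actions from Gaussian distributions, the response surface `yield_fn` is invented, and "regret" is assumed here to mean the average gap between the best achievable yield and the yield obtained at each step, which may differ from the paper's exact definition.

```python
import math
import random


def yield_fn(temp, ph):
    """Hypothetical reward R(s): a smooth yield surface peaking at 60 degrees, pH 7."""
    return math.exp(-((temp - 60.0) / 20.0) ** 2 - ((ph - 7.0) / 2.0) ** 2)


def optimize(steps=50, sigma=(5.0, 0.5), start=(25.0, 4.0), seed=0):
    """Stochastic hill climbing over experimental conditions (a stand-in policy)."""
    rng = random.Random(seed)
    state = start                      # s in S: (temperature, pH)
    best_reward = yield_fn(*state)
    rewards = []
    for _ in range(steps):
        # a in A: a random change to the conditions (the exploration step).
        action = (rng.gauss(0.0, sigma[0]), rng.gauss(0.0, sigma[1]))
        proposed = (state[0] + action[0], state[1] + action[1])
        reward = yield_fn(*proposed)   # run the "experiment" at the new conditions
        rewards.append(reward)
        # P: deterministic here; keep the new conditions only if the yield improved.
        if reward > best_reward:
            state, best_reward = proposed, reward
    # Assumed definition: mean shortfall from the maximum possible yield (1.0).
    regret = sum(1.0 - r for r in rewards) / len(rewards)
    return state, best_reward, regret


best_conditions, best_yield, regret = optimize()
```

Lower regret means fewer low-yield experiments were spent on the way to good conditions, which is why it serves as a measure of how efficiently a policy explores.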

The examples above hint at the potential advances afforded by machine learning in the biochemical/medical domain, which also include the prediction of biomolecular secondary structure [e.g. rawMSA: proper Deep Learning makes protein sequence profiles and feature extraction obsolete (Aug 2018)], biodesign (e.g. of new anticancer drugs), inverse molecular design (very well reviewed in the July 2018 Science paper Inverse Molecular Design using Machine Learning: Generative Models for Matter Engineering), and numerous other applications. These approaches offer excellent opportunities for collaboration in the advancement of our understanding of metabolism, cellular signaling, regulatory mechanisms, and disease (including cancer).