Software and data

Software available in the Integrated Systems Biology and AI — Tailored Pharmacology and Precision Medicine Lab includes:

ANNE

The artificial neural network (ANN) was initially created to model how the human brain works. Over the past few decades, ANNs have evolved into numerous sophisticated algorithms with proven performance in a variety of recognition tasks.

Artificial neural network encoder (ANNE) is a novel weight engineering deep machine learning method that harnesses the power of autoencoder models and demonstrates that it is possible to decode meaningful information encoded in ANN models trained for specific tasks. As a case study, we applied ANNE to breast cancer gene expression data with known clinical properties.

Our work illustrates that the trained autoencoder models are information encoders and that meaningful gene-gene associations with supporting evidence can be retrieved from them. ANNE opens a new avenue in machine intelligence. ANN models need no longer be perceived only as tools for recognition tasks but also as powerful tools for extracting meaningful information embedded within a sea of high-dimensional data.

Reference

Source code

The source code is available for public access.

ASTAR-seq

Assay for single-cell transcriptome and accessibility regions (ASTAR-seq) is an automated method with high sensitivity used to simultaneously measure whole-cell transcriptome and chromatin accessibility within the same single cell.

References

Source code

The source code is available for public access.

CellNet

CellNet is a network biology-based computational platform that assesses the fidelity of cellular engineering more accurately than existing methodologies do and generates hypotheses for improving cell derivations.

References

Materials available

The web interface and other materials are on the website portal.

CLR

Context likelihood of relatedness (CLR) is a network biology algorithm for reverse engineering and inferring regulatory interactions between master regulators and their targets using a compendium of transcriptome profiles.
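
The scoring step of CLR, as it is commonly described in the literature, compares each pairwise relatedness value (typically mutual information) with the background distribution of values for both genes involved. The sketch below illustrates that step under stated assumptions: the relatedness matrix is precomputed, the diagonal is excluded from the background, and the function name and toy values are hypothetical. It is not the lab's distributed implementation.

```python
import math
from statistics import mean, pstdev

def clr_scores(sim):
    """Turn a symmetric relatedness matrix into CLR-style scores.

    sim[i][j] is a relatedness value (e.g. mutual information) between
    genes i and j. Each value is z-scored against the background of gene i
    and of gene j, negative z-scores are clipped to zero, and the two are
    combined as sqrt(z_i**2 + z_j**2).
    """
    n = len(sim)
    # Background mean and standard deviation per gene (diagonal excluded).
    stats = []
    for i in range(n):
        row = [sim[i][j] for j in range(n) if j != i]
        stats.append((mean(row), pstdev(row)))

    def z(value, mu, sigma):
        return max(0.0, (value - mu) / sigma) if sigma > 0 else 0.0

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            zi = z(sim[i][j], *stats[i])
            zj = z(sim[i][j], *stats[j])
            scores[i][j] = scores[j][i] = math.hypot(zi, zj)
    return scores

# Toy, made-up relatedness values for four genes.
sim = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.3],
    [0.2, 0.1, 0.3, 0.0],
]
scores = clr_scores(sim)
for row in scores:
    print([round(v, 2) for v in row])
```

A pair scores highly only when its relatedness stands out against the background of both genes, which is why the weak 0.1 and 0.2 entries in the toy matrix score at or near zero.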

Reference

Source code

Download.

Computational drug discovery platform, machine learning, feature selection, AI drug discoveries

Artificial intelligence (AI), machine learning and feature selection methods that predict specific pharmacodynamic, pharmacokinetic or toxicological properties of pharmaceutical agents can facilitate the discovery and development of new drugs. Pharmaceutical agents are developed and tested to ensure that they have desirable pharmacodynamics and pharmacokinetics and minimal toxicity.

Computational methods have been explored to predict these properties, with the aim of identifying promising leads and eliminating unsuitable ones in the early stages of drug development. AI and machine learning methods have shown considerable potential for predicting these properties across structurally diverse sets of agents. They have been used to predict agents with a variety of pharmacodynamic, pharmacokinetic and toxicological properties.

References

Source code

Download.

DPYD-Varifier

The DPYD gene-specific variant classifier DPYD-Varifier is a highly accurate in silico classifier for predicting the functional impact of DPYD variants on dihydropyrimidine dehydrogenase (DPD) activity. DPYD-Varifier has great potential for systems pharmacology and individualized medicine and for improving the clinical decision-making process.

Reference

EDDI

Expression Dosage Dependent Inferelator (EDDI) is a machine learning and systems biology approach used to characterize dosage-based gene dependencies.

Reference

Source code

The source code is available for public access.

GEDI

Gene Expression Dynamics Inspector (GEDI), developed in the lab of Donald E. Ingber, M.D., Ph.D., at the Wyss Institute for Biologically Inspired Engineering at Harvard University, is a computational program that opens a new perspective for analyzing transcriptome data. By treating each high-dimensional sample — such as one transcriptome experiment — as an object, it accentuates and visualizes the genomewide response of a tissue or a patient and treats it as an integrated biological entity.

GEDI honors the new spirit of a systems-level approach in biology and unites a novel holistic perspective with the traditional gene-centered approach in molecular biology.

Reference

Questions?

Contact Dr. Ingber or Dr. Li.

GUM

The Gene Utility Model (GUM) is a novel computational pipeline used to understand the importance of genes under specific cellular contexts. GUM states that it is the utility of genes that provides selective pressure for the survival and fitness of aberrant cells.

It is possible to use GUM to construct a "utility karyotype" by mapping differentially used genes to their respective chromosomal loci. Further, GUM predicts whether the resulting utility karyotype can recapitulate, to a certain extent, the chromosomal aberrancies observed in diseases.
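
As a rough illustration of the mapping step, the sketch below tallies differentially used genes per chromosome arm. The signed utility-change scores, the threshold, the gene-to-arm annotation and all names are hypothetical placeholders, since the source does not specify how GUM quantifies gene utility.

```python
from collections import defaultdict

def utility_karyotype(utility_change, gene_locus, threshold=1.0):
    """Summarize differentially used genes per chromosome arm.

    utility_change: gene -> signed score (positive = used more in disease).
    gene_locus: gene -> chromosome arm label such as "8q".
    Returns arm -> {"gain": n, "loss": m}.
    """
    karyotype = defaultdict(lambda: {"gain": 0, "loss": 0})
    for gene, score in utility_change.items():
        arm = gene_locus.get(gene)
        if arm is None or abs(score) < threshold:
            continue  # unmapped gene, or no meaningful change in utility
        karyotype[arm]["gain" if score > 0 else "loss"] += 1
    return dict(karyotype)

# Toy, made-up inputs with placeholder gene names.
toy_scores = {"GENE_1": 2.1, "GENE_2": 1.4, "GENE_3": -1.8, "GENE_4": 0.2}
toy_loci = {"GENE_1": "8q", "GENE_2": "8q", "GENE_3": "17p", "GENE_4": "1p"}
toy_karyotype = utility_karyotype(toy_scores, toy_loci)
print(toy_karyotype)
```

The per-arm tallies could then be compared with the chromosomal gains and losses observed in a disease.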

Reference

Hypothesis-driven AI

Hypothesis-driven AI is a new class of AI that has not been previously described. Unlike conventional AI, hypothesis-driven AI is guided by the underlying hypothesis that can explain how a system behaves. This new AI technology offers a way to test a hypothesis and make new discoveries using an AI approach.

Hypothesis-driven AI offers a targeted and informed approach to address many of the challenges in diseases. Hypothesis-driven AI can perform focused investigations by centering on specific hypotheses or research questions and thus uses prior knowledge to guide its exploration. This approach can generate more interpretable and explainable results, compared with those of conventional AI tools. That's because the underlying hypotheses provide a mechanistic framework in which to understand the logic behind certain predictions or outcomes.

Hypothesis-driven AI tends to use resources more efficiently. It encourages the integration of domain-specific knowledge to generate meaningful insights within a specific context. Hypothesis-driven AI allows researchers to test and validate hypotheses via AI-mediated gedankenexperiments, or thought experiments, a term closely associated with Albert Einstein, Ph.D. This in turn guides future experimental designs.

Reference

LIFE

Learning-Based Invariant Feature Engineering (LIFE) is a novel feature engineering platform built on the concept of symmetry. Symmetry refers to properties that remain invariant upon mathematical transformations. Yet symmetry remains unexplored in biology and medicine.

We set out to explore symmetry relationships in gene expression to distinguish between healthy and disease states. We hypothesize that there are relationships between gene expressions that remain invariant across people displaying the same biological phenotypes.

Our Gene Expression Symmetry Hypothesis (GESH) posits that a set of genes exhibiting specific symmetrical relationships defines the invariant nature of phenotypic traits in cells. We deployed a hybrid machine learning approach and implemented it with two symmetric invariant feature functions (IFFs) to identify invariant feature genes (IFGs). IFGs are gene pairs for which IFF single-value outputs remain invariant across individual samples in each phenotype.
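
To make the idea of an invariant gene pair concrete, the sketch below scores every gene pair by how little a symmetric function of the two expression values varies across samples of one phenotype. The two functions used here (sum and product), the coefficient-of-variation criterion, the threshold and the toy data are all illustrative assumptions. The source does not specify the actual IFFs or the selection rule.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical symmetric functions: f(a, b) == f(b, a).
IFFS = {
    "sum": lambda a, b: a + b,
    "product": lambda a, b: a * b,
}

def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean."""
    mu = mean(values)
    return pstdev(values) / abs(mu) if mu else float("inf")

def invariant_pairs(expression, max_cv=0.05):
    """Return (gene_a, gene_b, iff_name) triples whose IFF output is
    nearly constant across samples.

    expression maps gene name -> list of expression values, one per
    sample, with all samples drawn from the same phenotype.
    """
    hits = []
    for g1, g2 in combinations(sorted(expression), 2):
        for name, f in IFFS.items():
            outputs = [f(a, b) for a, b in zip(expression[g1], expression[g2])]
            if coefficient_of_variation(outputs) <= max_cv:
                hits.append((g1, g2, name))
    return hits

# Toy data: GENE_A + GENE_B equals 10 in every sample; GENE_C is unrelated.
toy = {
    "GENE_A": [2.0, 4.0, 7.0, 5.0],
    "GENE_B": [8.0, 6.0, 3.0, 5.0],
    "GENE_C": [1.0, 9.0, 2.0, 6.0],
}
print(invariant_pairs(toy))
```

In the toy data, neither GENE_A nor GENE_B is constant on its own, yet their sum is, which is the kind of pairwise relationship that a single-gene analysis would miss.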

Our multiclass classification identified unique fingerprints across the transcriptomes derived from 25 normal organs, 25 cancer types and blood samples from people with four types of neurodegenerative diseases. We constructed networks from the IFGs (IF-Nets) and found that cancer IF-Net hubs were enriched with approved and clinical trial drugs, highlighting symmetry breaking as a novel treatment approach.

Reference

MALANI

Machine Learning-Assisted Network Inference (MALANI) is a hybrid computational platform that harnesses the power of both machine learning and network biology methodologies to provide new insights and improve the understanding of complex biological systems. MALANI assesses all genes, regardless of expression or mutational status in the context of disease etiology, by building more than 2 million machine learning models for reconstructing gene regulatory networks.

MALANI has the power to uncover "dark" disease genes that are neither mutated nor differentially expressed but play important pathological roles in disease development.

Reference

Source code

Download.

MNI

Mode-of-action by network inference (MNI) is a reverse engineering network biology algorithm that identifies the gene targets and key mediators of a biomedical phenotype based on transcriptome data.

References

Modified RNA

Highly efficient reprogramming to pluripotency and directed differentiation of human cells with synthetic modified mRNA.

Reference

Multiregional GBM imaging and genetics

Integrated molecular and multiparametric MRI mapping of high-grade glioma identifies regional biologic signatures.

Reference

  • Hu LS, D'Angelo F, Weiskittel TM, Caruso FP, Fortin Ensign SP, Blomquist MR, Flick MJ, Wang L, Sereduk CP, Meng-Lin K, De Leon G, Nespodzany A, Urcuyo JC, Gonzales AC, Curtin L, Lewis EM, Singleton KW, Dondlinger T, Anil A, Semmineh NB, Noviello T, Patel RA, Wang P, Wang J, Eschbacher JM, Hawkins-Daarud A, Jackson PR, Grunfeld IS, Elrod C, Mazza GL, McGee SC, Paulson L, Clark-Swanson K, Lassiter-Morris Y, Smith KA, Nakaji P, Bendok BR, Zimmerman RS, Krishna C, Patra DP, Patel NP, Lyons M, Neal M, Donev K, Mrugala MM, Porter AB, Beeman SC, Jensen TR, Schmainda KM, Zhou Y, Baxter LC, Plaisier CL, Li J, Li H, Lasorella A, Quarles CC, Swanson KR, Ceccarelli M, Iavarone A, Tran NL. Integrated molecular and multiparametric MRI mapping of high-grade glioma identifies regional biologic signatures. Nature Communications. 2023; doi:10.1038/s41467-023-41559-1.

Source code

The source code is available for public access.

NetDecoder

NetDecoder is a network biology computational platform used to dissect context-specific biological networks and gene activities. NetDecoder provides freely available source code and web portal resources for researchers to explore genomewide context-dependent information flow profiles and key genes using pairwise phenotypic comparative analyses. NetDecoder also allows researchers to prioritize drug targets for genes that affect pathological contexts.

Reference

Source code

Download.

Pathway modeling and simulation

One of the most commonly used approaches for modeling biological systems is ordinary differential equations (ODEs). In general, a differential equation describes how the amount of a participating species changes over time as a function of the chemical reaction rates. A set of coupled ODEs can capture the temporal dynamic behavior of molecular species in a biological signaling pathway network.
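
As a minimal sketch of this approach, the Python below integrates a hypothetical two-species pathway in which species A converts to species B with mass-action kinetics: dA/dt = -k*A and dB/dt = k*A. The reaction, the rate constant and the function names are illustrative assumptions, not taken from the lab's models. A fixed-step fourth-order Runge-Kutta integrator is used so that the example needs only the standard library.

```python
import math

def derivatives(state, k):
    """Mass-action rate laws for the hypothetical reaction A -> B."""
    a, b = state
    rate = k * a            # reaction rate depends on the amount of A
    return (-rate, rate)    # (dA/dt, dB/dt)

def rk4_step(state, k, dt):
    """Advance the coupled ODEs by one step of size dt (classic RK4)."""
    def shifted(s, d, h):
        return tuple(x + h * dx for x, dx in zip(s, d))
    k1 = derivatives(state, k)
    k2 = derivatives(shifted(state, k1, dt / 2), k)
    k3 = derivatives(shifted(state, k2, dt / 2), k)
    k4 = derivatives(shifted(state, k3, dt), k)
    return tuple(
        x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
        for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )

def simulate(a0, b0, k, t_end, dt=0.01):
    """Return the (A, B) amounts at time t_end."""
    state = (a0, b0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, k, dt)
    return state

a, b = simulate(a0=1.0, b0=0.0, k=0.5, t_end=2.0)
print(f"A(2.0) = {a:.4f}, B(2.0) = {b:.4f}")
print(f"Analytic A(2.0) = {math.exp(-0.5 * 2.0):.4f}")
```

For this simple reaction the result can be checked against the analytic solution A(t) = A0*exp(-k*t). Larger pathway models add one equation per species and generally require a numerical solver like this one.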

References

PERMUTOR

Personalized mutation evaluator (PERMUTOR) is a novel computational pipeline that collects potent disease gene cooperative pathways to envision individualized disease etiology and therapies. Our algorithm constructs individualized disease networks and modules de novo, which enables us to elucidate the importance of mutated genes in specific patients and understand the synthetic penetrance of these genes across patients.

Individualized module disruption enables us to devise customized singular and combinatorial target therapies that are highly varied across patients, demonstrating the need for precision therapeutics pipelines. With the first analysis of de novo individualized disease networks and modules, we illustrate the power of individualized disease modules for precision medicine by providing deep novel insights on the activity of diseased genes in people.

Reference


Source code

The source code is available for public access.

P-Map

Phenotype mapping (P-Map) is a network-based approach used to identify genes and regulatory networks that modulate drug response phenotypes.

Reference

Source code

Download.

RSI

Regulostat Inferelator (RSI) is a novel computational algorithm for deciphering intrinsic molecular devices called regulostats, which predetermine cellular phenotypic responses.

Reference

Web interface and source code

Download.

sn-m6A-CT data analysis

Single-nucleus m6A-CUT&Tag (sn-m6A-CT) is a method for the simultaneous profiling of m6A methylomes and transcriptomes within a single nucleus. sn-m6A-CT can enrich m6A-marked RNA molecules in situ without isolating RNA from cells. sn-m6A-CT profiling is sufficient to determine cell identity, and it allows the generation of cell type-specific m6A methylome landscapes from heterogeneous populations.

Reference

Source code

The source code is available for public access.

SPIN-AI

Spatially resolved sequencing technologies help us dissect how cells are organized in space. Several available computational approaches focus on the identification of spatially variable genes (SVGs). These are genes with expression patterns that vary in space.

Detecting SVGs is analogous to identifying differentially expressed genes. It permits us to understand how genes and associated molecular processes are spatially distributed within cellular niches. However, the expression activities of SVGs fail to encode all information inherent to the spatial distribution of cells.

Here, we devised a deep learning model — Spatially Informed Artificial Intelligence (SPIN-AI) — to identify spatially predictive genes (SPGs). These are genes with expression that can predict how cells are organized in space without any prior assumptions of spatial distribution.

We used SPIN-AI on spatial transcriptomic data from squamous cell carcinoma as a proof of concept. Our results demonstrated that SPGs not only recapitulate the biology of squamous cell carcinoma but also identify genes distinct from SVGs. Moreover, we found a substantial number of ribosomal genes that are SPGs but not SVGs.

Since SPGs can predict spatial cellular organization, we reason that SPGs capture more biologically relevant information for a given cellular niche. Hence, SPIN-AI has broad applications for detecting SPGs and uncovering which biological processes play important roles in governing cellular organization.

Reference

Source code

The source code is available for public access.

StemSite

StemSite is a database network of transcriptional gene regulators for identifying and engineering the developmental origin of mouse hematopoietic stem cells.

Reference

Database

Access the database.