Chief among my personal and professional projects is my passion and vision regarding biomolecular knowledge. This has been a lifelong journey that has evolved from my early fascination with, and basic research on, DNA and molecular genetics, through a deeper appreciation of the information encoded within our genomes, to a broader fascination with informatics.
I envision innovative, impactful solutions to scientific and societal issues through the application of science and technology to the betterment of human health.
I accomplish this through
the practical advancement of knowledge in functional genomics (the phenotypic and functional expression of the information contained within genomes), and
knowledge discovery, through natural language processing and machine learning approaches applied to the biomedical domain.
These efforts build on my thorough grounding in biochemistry (B.Sc.), environmental health (M.Sc.), molecular genetics (Ph.D.), and post-doctoral experience in informatics and knowledge discovery.
I am especially motivated by information retrieval/extraction, the construction of knowledge stores and graphical models, and the application of that knowledge to real-world problems in molecular biology, cellular signaling, cancer genomics, and personalized medicine.
The union of my core domains (genomics; programming; natural language processing; machine learning; bioinformatics) enables a better understanding of implicit and explicit relationships and interactions, facilitating translational knowledge discovery.
I hold a Ph.D. in Biology (2000; specialization in molecular genetics), augmented with considerable programming and computational expertise. The summary below provides a reasonably complete overview of my overlapping personal and professional pursuits.
I possess a unique and well-honed wealth of knowledge and experience, spanning
- biochemistry; cell biology; metabolic pathways; cellular signaling pathways
- molecular genetics and genomics, including cancer biology
- information retrieval and extraction
and, more recently, programming experience including (in descending order of expertise/experience):
- Linux super-user
- command-line, bash scripting expertise
- Python; virtual environments
- NLP: natural language processing
- Apache Solr: high-performance document indexing / storage / search (backend), + user GUI (frontend): Carrot2 clustering engine; D3.js visualizations
- R (GNU S / CRAN)
- graphical knowledge stores/graphs (KG: Neo4j; Cypher query language)
- some relational databasing (PostgreSQL)
- some machine learning (computer vision; vector space models; …)
- basic, “hands-on” familiarity with webdev (HTML / JS / CSS) and Java (some basic scripting; IntelliJ IDEA …).
I am a moderate contributor to:
Regarding machine learning (ML), for a period of ~1.5+ years (2015-2017) I immersed myself in that domain (arXiv; reddit ML subreddit; Y Combinator “Hacker News”; RSS feeds). I can install and debug the major platforms (Theano; Caffe; Torch7; TensorFlow; etc.). I have posted contributions/solutions associated with various GitHub “Issues” and some StackOverflow.com questions, and I can clone and implement basic models as well as follow the recent literature (which is progressing at a staggering pace).
During that period I implemented various personal, self-taught ML projects, including:
- an image captioning system
- a webcam-based personal identification system: faces are marked with bounding boxes, each labeled with the person's name and the match probability
- a webcam-based image classifier: a backend reporting the top five ImageNet categories via a 50-layer residual neural network (ResNet-50), plus a web browser frontend
- a webcam-based (computer vision) age-gender classifier
- other experiments …
That work was fun and rewarding, but my primary motivation in developing those skills and understanding was to supplement biomedical natural language processing (BioNLP) and bioinformatics with ML-based approaches, including:
- RNN/LSTM (recurrent neural nets/long short-term memory-based models)
- VSM (vector space models)
- dimensionality reduction
- knowledge discovery:
- topic modeling
- graph traversal
- biomedical named entity recognition (BioNER)
- dependency parsing, applied e.g. to: relation extraction (to populate RDBMS and derived knowledge graphs; …)
- semantic parsing, applied e.g. to question-answering; …
- fact-checking; quality assurance (“noise” issues)
These tools and approaches support my personal and professional goals, summarized below:
- information retrieval
- information extraction
- knowledge stores
- quality assurance
- knowledge discovery
in support of my long-standing and overarching goal of facilitating a greater understanding of functional genomics: the phenotypic and functional expression of the information contained within our genomes.
I envision, ultimately, the creation of virtual networks (pathways; perhaps cells/tissues/organs), amenable to in silico perturbations and interventions for assessing changes in
- cellular signaling
- cellular growth/death
in response to simulated changes in
- mutations; genomic alterations
- epigenetic alterations
- biochemical entities
- cellular signaling pathways
- environmental conditions (stressors)
that in turn could guide, for example:
- personalized and precision medicine (individualized susceptibilities; therapeutic interventions; …)
- basic research: augmentation of “wet lab” experiments, via identification/ranking of genomic “variants-of-interest” (SNPs); …
- synthetic biology.
My stepwise approach in this regard has been to model biochemical and molecular biology/genomics data, first (to date) as:
- indexed literature (Solr)
- high-quality data/relations extracted to an RDBMS (PostgreSQL)
- a graphical model (Neo4j) populated “on-the-fly” from those data, in response to user queries
- GUIs: Carrot2 clustering/visualization engine; D3.js visualizations; …
Subsequent stages could involve
- dynamically linking those bioentities to biochemical, molecular genetic, and biomolecular databases and other data sources
- constructing in silico models of metabolic and cellular signaling pathways, to aid personalized medicine and basic research …
I believe that all of these aims are fully tractable, and I have been working diligently toward their realization. I also believe those aims are well aligned with current research areas in biomedical literature processing, bioinformatics, molecular genetics, cancer genomics, pharma and personalized medicine that are of significant academic and commercial interest.
If you find my expertise relevant to this or another position, please do not hesitate to contact me. I am also willing to offer my services as a collaborator (academic; commercial). I have experience collaborating and working online, but I am also willing and able to relocate, as needed.
Dr. Victoria A. Stuart, Ph.D.
Vancouver, B.C., Canada