Technical Review

Explainable (Interpretable) Models

Last modified: 2019-02-07

Copyright notice, citation: Copyright
© 2018-present, Victoria A. Stuart



We need to be able to trust and understand how results are generated in NLP- and ML-based models. Recent, relevant discussions include:

The utility of knowledge graphs is outlined in my Why Graphs? subsection. Heterogeneous information networks (HINs) – complex networks widely represented in biological domains – can capture higher-order interactions that reflect the complex relationships among nodes and edges in real-world data. At the same time, network motifs may reveal higher-order interactions and network semantics in homogeneous networks.

Augmenting KG with external resources such as textual knowledge stores and ontologies is an obvious approach to providing explanatory material for KG. This is particularly important when machine learning approaches are employed in the biomedical and clinical sciences domains, where high precision is imperative for supporting, not distracting, practitioners (What Do We Need to Build Explainable AI Systems for the Medical Domain?), and where it is crucial to underpin machine output with reasons that are verifiable by humans.

Paraphrased from What Do We Need to Build Explainable AI Systems for the Medical Domain? (Dec 2017):

  • “The only way forward seems to be the integration of both knowledge-based and neural approaches to combine the interpretability of the former with the high efficiency of the latter. To this end, there have been attempts to retrofit neural embeddings with information from knowledge bases as well as to project embedding dimensions onto interpretable low-dimensional sub-spaces. More promising, in our opinion, is the use of hybrid distributional models that combine sparse graph-based representations with dense vector representations and link them to lexical resources and knowledge bases.

    “Here a hybrid human-in-the-loop approach can be beneficial, where not only the machine learning models for knowledge extraction are supported and improved over time, the final entity graph becomes larger, cleaner, more precise and thus more usable for domain experts. Contrary to classical automatic machine learning, human-in-the-loop approaches do not operate on predefined training or test sets, but assume that human expert input regarding system improvement is supplied iteratively.”

Regarding the “hybrid human-in-the-loop approach” mentioned in What Do We Need to Build Explainable AI Systems for the Medical Domain?, the Never-Ending Language Learner [NELL; project] occasionally interacts with human trainers, mostly for negative feedback identifying NELL’s incorrect beliefs (see Section 7 and Figs. 6 & 7 in that paper), though this human-machine interaction has been decreasing as NELL gets more accurate.

Knowledge graph models attain state-of-the-art accuracy in knowledge base completion, but their predictions are notoriously hard to interpret. The OpenKE project [code] is an open-source framework for knowledge graph embedding (KGE) based on the TensorFlow machine learning platform. A very interesting, heavily modified fork of the OpenKE GitHub repository is XKE (eXplaining Knowledge Embedding models), which contains implementations of XKE-PRED and XKE-TRUE, described in Interpreting Embedding Models of Knowledge Bases: A Pedagogical Approach (ICML 2018). XKE adapts “pedagogical approaches” to interpret embedding models by extracting weighted Horn rules from them.

  • In mathematical logic and logic programming, a Horn clause is a logical formula of a particular rule-like form which gives it useful properties for use in logic programming, formal specification, and model theory.

  • A pedagogical approach is one where – intuitively speaking – a non-interpretable but accurate model is run, and an interpretable model is learned from the output of the non-interpretable one.

  • OpenKE was used in Knowledge Representation Learning: A Quantitative Review (Dec 2018).
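The pedagogical idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the XKE implementation: `black_box`, `candidate_rules` and `fit_surrogate` are hypothetical names, the features are binary, and the "interpretable model" is simply the single conjunctive (Horn-style) rule that best reproduces the opaque model's outputs.

```python
import itertools

def black_box(x):
    """Stand-in for an accurate but opaque model (hypothetical)."""
    # A nonlinear score that we pretend we cannot inspect directly.
    score = 2.0 * x[0] * x[1] + 0.3 * x[2] - 0.9
    return 1 if score > 0 else 0

def candidate_rules(n_features):
    """Enumerate rule bodies: conjunctions of one or two features."""
    for size in (1, 2):
        for body in itertools.combinations(range(n_features), size):
            yield body

def rule_predict(body, x):
    """A rule fires (predicts 1) when every feature in its body is true."""
    return 1 if all(x[i] == 1 for i in body) else 0

def fit_surrogate(n_features):
    """Pick the rule that best agrees with the black box's own outputs."""
    inputs = list(itertools.product([0, 1], repeat=n_features))
    # Labels come from querying the opaque model, not from ground truth.
    labels = [black_box(x) for x in inputs]
    best_body, best_fidelity = None, -1.0
    for body in candidate_rules(n_features):
        agree = sum(rule_predict(body, x) == y for x, y in zip(inputs, labels))
        fidelity = agree / len(inputs)
        if fidelity > best_fidelity:
            best_body, best_fidelity = body, fidelity
    return best_body, best_fidelity

body, fidelity = fit_surrogate(3)
print("rule body (feature indices):", body, "fidelity:", fidelity)
```

Fidelity here measures agreement with the opaque model rather than accuracy on real data, which is the distinguishing feature of a pedagogical approach.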

State-of-the-art knowledge base completion typically relies on embedding models that map entities and relations into a low-dimensional vector space. The existence of a relation triple is determined by some pre-defined function over these representations. More importantly, embedding models turn a complex space of semantic concepts into a smooth space where gradients can be calculated and followed. One difficulty with embeddings is their poor interpretability vis-à-vis their users.
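To make the "pre-defined function over these representations" concrete, here is a minimal sketch that assumes a translation-style scoring function (the distance between head + relation and tail, as used by TransE-like models). The choice of scoring function, the two-dimensional vectors and the threshold are all illustrative assumptions; the text above does not commit to any particular model.

```python
import math

# Toy 2-dimensional embeddings; the values are made up for illustration.
entity_vec = {
    "paris":  [0.9, 0.1],
    "france": [1.0, 1.1],
    "berlin": [-0.8, 0.2],
}
relation_vec = {"capital_of": [0.1, 1.0]}

def score(head, relation, tail):
    """Translation-style score: values closer to 0 mean a more plausible triple."""
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def predict(head, relation, tail, threshold=-0.5):
    """Turn the score into a true/false decision using a fixed threshold."""
    return score(head, relation, tail) >= threshold

plausible = score("paris", "capital_of", "france")
implausible = score("berlin", "capital_of", "france")
print(plausible, implausible)
```

The decision is a single number computed from dense vectors, which illustrates why such predictions are hard to explain: nothing in the score says which facts in the graph support it.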

Addressing this challenge, in Interpreting Embedding Models of Knowledge Bases: A Pedagogical Approach [ICML 2018; code] the authors proposed two models:

  • KGE models with predicted features (XKE-PRED), and
  • KGE models with observed features (XKE-TRUE).

  • XKE-PRED treated the embedding model as a black box and assumed no other source of information for building the interpretable model. By changing the original classifier’s inputs and observing its outputs, the pedagogical approach constructed a training set for an interpretable classifier from which explanations were extracted.

  • XKE-TRUE is a variation of XKE-PRED that assumed an external source of knowledge (regarded as ground truth of their relational domain) besides the embedding model, from which XKE-TRUE extracted interpretable features.

Skipping over additional detail (provided in the paper), the results in Table 3 in that paper are the most interesting with respect to this subsection (Explainable (Interpretable) Models): the application of those models to a Freebase dataset. Five examples were shown (ID #1-5), along with input triples (head-relation-tail), their explanations (reasons: weighted rules and the bias term) with the interpretable classifier’s score (XKE-PRED; XKE-TRUE), and the labels predicted by the embedding model (trained as a binary classifier: 1 = True; 0 = False). While the results (discussed in Section 4.4) vary – a consequence of an early-stage research effort – the models performed remarkably well in explaining (through weighted paths) the selected embedded triples.



Knowledge-based Transfer Learning Explanation (Jul 2018) [code] likewise sought to boost machine learning applications in decision making by making more human-centric explanations, relevant to transfer learning (utilizing models developed in one domain as the starting point for training in another domain). The authors proposed an ontology-based knowledge representation and reasoning framework for human-centric transfer learning explanation. They first modeled a learning domain in transfer learning [including the dataset and the prediction task, with OWL (Web Ontology Language) ontologies], for which they complemented the prediction task-related common sense knowledge using an individual matching and external knowledge importing algorithm. The framework further used a correlative reasoning algorithm to infer three kinds of explanatory evidence with different granularities (general factors, particular narrators and core contexts) to explain a positive feature or a negative transfer from one learning domain to another.



Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation (Nov 2018) [code] addressed the provision of model-generated explanations in recommender systems in structured KG, using a knowledge base representation learning framework (KBE4ER) to embed heterogeneous entities for recommendation. Based on the embedded knowledge base, a soft matching algorithm was proposed to generate personalized explanations for recommended items. The authors designed a novel explainable collaborative filtering (CF) framework over knowledge graphs [collaborative filtering is also discussed on pp. 10-11 in Implementing Recommendation Algorithms in a Large-Scale Biomedical Science Knowledge Base (Oct 2017)]. The main building block was an integration of traditional CF with the learning of knowledge base embeddings. They first defined the concept of a user-item knowledge graph (illustrated in Fig. 1 in their paper), which encoded knowledge about the user behaviors and item properties as a relational graph structure. The user-item knowledge graph focused on how to depict different types of user behaviors and item properties over heterogeneous entities in a unified framework. Then, they extended the design philosophy of CF to learn over the knowledge graph for personalized recommendation. For each recommended item, they further conducted fuzzy reasoning over the paths in the knowledge graph based on soft matching, to construct personalized explanations.



Some of the methods for constructing KG and knowledge discovery over KG can also provide evidence to support the understanding of new facts. For example:

  • MOLIERE: Automatic Biomedical Hypothesis Generation System (May 2017) [project; code] is a system that can identify connections within biomedical literature. MOLIERE finds the shortest path between two query keywords in the KG, and extends this path to identify a significant set of related abstracts (which, due to the network construction process, share common topics). Topic modeling, performed on these documents using PLDA+, returns a set of plain text topics representing concepts that likely connect the queried keywords, supporting hypothesis generation (for example, on historical findings MOLIERE showed the implicit link between Venlafaxine and HTR1A, and the involvement of DDX3 in Wnt signaling).

  • Finding Streams in Knowledge Graphs to Support Fact Checking (Aug 2017) viewed a knowledge graph as a “flow network” and knowledge as a fluid, abstract commodity. They showed that computational fact checking of a (subject, predicate, object) triple then amounted to finding a “knowledge stream” that emanated from the subject node and flowed toward the object node through paths connecting them. Evaluations revealed that this network-flow model was very effective in discerning true statements from false ones, outperforming existing algorithms on many test cases. Moreover, the model was expressive in its ability to automatically discover useful path patterns and relevant facts that may help human fact checkers corroborate or refute a claim.

  • Statistical relational learning (SRL) based approaches [reviewed by Nickel et al. (Kevin Murphy; Volker Tresp) in A Review of Relational Machine Learning for Knowledge Graphs (Sep 2015)] can be used in conjunction with machine reading and information extraction methods to automatically build KG. In SRL, the representation of an object may contain its relationships to other objects. The data is in the form of a graph, consisting of nodes (entities) and labelled edges (relationships between entities). The main goals of SRL include prediction of missing edges, prediction of properties of nodes, and clustering nodes based on their connectivity patterns. These tasks arise in many settings such as analysis of social networks and biological pathways.
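The shortest-path step described for MOLIERE in the list above can be sketched as follows. This is a simplified illustration, not the MOLIERE code: it assumes an unweighted, undirected graph searched breadth-first, and the node names (`keyword_a`, `abstract_1`, ...) and the helper `related_abstracts` are hypothetical placeholders.

```python
from collections import deque

# Hypothetical toy network linking keywords, terms and abstracts.
graph = {
    "keyword_a":  ["abstract_1"],
    "abstract_1": ["keyword_a", "term_x", "abstract_2"],
    "term_x":     ["abstract_1", "abstract_3"],
    "abstract_2": ["abstract_1"],
    "abstract_3": ["term_x", "keyword_b"],
    "keyword_b":  ["abstract_3"],
}

def shortest_path(graph, source, target):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:      # walk back through the parent links
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

def related_abstracts(graph, path):
    """Extend the path: collect abstracts on the path or adjacent to it."""
    found = set()
    for node in path:
        for candidate in [node] + list(graph.get(node, ())):
            if candidate.startswith("abstract_"):
                found.add(candidate)
    return sorted(found)

path = shortest_path(graph, "keyword_a", "keyword_b")
related = related_abstracts(graph, path)
print(path)
print(related)
```

The path itself is the human-readable evidence: each hop names a concrete document or term connecting the two query keywords, and the related abstracts would then be passed to topic modeling.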

An excellent follow-on article (different authors) to Nickel et al., A Review of Relational Machine Learning for Knowledge Graphs (Sep 2015) is On Embeddings as an Alternative Paradigm for Relational Learning (Jul 2018), which systematically compared knowledge graph embedding and logic-based SRL methods on standard relational classification and clustering tasks – including discussion of the Path Ranking Algorithm (PRA). Relation paths can be regarded as bodies of weighted rules (more precisely, Horn clauses), where the weight specifies how predictive the body of the rule is for the head.

  • “A strong advantage of KGEs is their scalability, at the expense of their black-box nature and limited reasoning capabilities. SRL methods are a direct opposite – they can capture very complex reasoning, are interpretable but currently of a limited scalability.”

  • The aim of statistical relational learning is to learn statistical models from relational or graph-structured data. Three of the main statistical relational learning paradigms include weighted rule learning, random walks on graphs, and tensor factorization. These methods were mostly developed and studied in isolation, with few attempts at understanding the relationship among them or combining them. For example, in their survey, [A Review of Relational Machine Learning for Knowledge Graphs (Sep 2015)], Nickel et al. described weighted rules and graph random walks as two separate classes of models for learning from relational data.

  • Kazemi and Poole (citing Nickel et al.’s work) noted that relational models based on weighted rule learning could easily be explained to a broad range of people.

The Path Ranking Algorithm (PRA) was introduced by Ni Lao and William Cohen in Relational Retrieval Using a Combination of Path-Constrained Random Walks (2010) [code], and extended by those authors in Random Walk Inference and Learning in A Large Scale Knowledge Base (2011). PRA is an easily interpretable extension of the idea of using random walks of bounded length to predict links in multi-relational knowledge graphs. As discussed in A Review of Relational Machine Learning for Knowledge Graphs (Sep 2015), the key idea in PRA is to use the probabilities of reaching one entity from another along bounded-length relation paths as features for predicting the probability of missing edges.

  • Interpretability. “A useful property of PRA is that its model is easily interpretable. In particular, relation paths can be regarded as bodies of weighted rules – more precisely Horn clauses – where the weight specifies how predictive the body of the rule is for the head. For instance, Table VI shows some relation paths along with their weights that have been learned by PRA … to predict which college a person attended, i.e., to predict triples of the form $\small \text{(p, college, c)}$. The first relation path in Table VI can be interpreted as follows: ‘it is likely that a person attended a college if the sports team that drafted the person is from the same college.’ This can be written in the form of a Horn clause as follows: $\small \text{(p, college, c) $\leftarrow$ (p, draftedBy, t) $\land$ (t, school, c)}$. By using a sparsity promoting prior on $\small w_k$, we can perform feature selection, which is equivalent to rule learning.”
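The way PRA turns relation paths into weighted, rule-like features can be sketched as follows. This is an illustration under assumptions, not the published implementation: the triples, the two relation paths, the weights and the bias are invented for the example, and the weights would normally be learned by (sparsity-regularized) logistic regression rather than set by hand.

```python
import math
from collections import defaultdict

# Hypothetical toy knowledge graph as (head, relation, tail) triples.
triples = [
    ("alice", "draftedBy", "team_a"),
    ("team_a", "school", "college_x"),
    ("alice", "bornIn", "city_1"),
    ("city_1", "locationOf", "college_x"),
    ("city_1", "locationOf", "college_y"),
]

edges = defaultdict(list)
for h, r, t in triples:
    edges[(h, r)].append(t)

def path_probability(source, path, target):
    """Probability that a random walk following `path` from source ends at target."""
    dist = {source: 1.0}
    for relation in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            successors = edges.get((node, relation), [])
            for s in successors:
                nxt[s] += p / len(successors)   # uniform choice among matching edges
        dist = nxt
    return dist.get(target, 0.0)

# Each path is the body of a rule with head (p, college, c); weights are assumed.
paths = [("draftedBy", "school"), ("bornIn", "locationOf")]
weights = [2.0, 0.5]
bias = -1.0

def edge_probability(person, college):
    """Logistic model over path-probability features."""
    z = bias + sum(w * path_probability(person, path, college)
                   for w, path in zip(weights, paths))
    return 1.0 / (1.0 + math.exp(-z))

print(edge_probability("alice", "college_x"), edge_probability("alice", "college_y"))
```

Because every feature is a named relation path, each weight can be read directly as "how predictive is this rule body for the head", which is the interpretability property the quotation describes.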



A Feb 2018 paper by Kazemi and Poole, Bridging Weighted Rules and Graph Random Walks for Statistical Relational Models, finally addressed the relationship between weighted rules and graph random walks. They studied the relationship between the Path Ranking Algorithm (PRA; one of the best known relational learning methods in the graph random walk paradigm) and Relational Logistic Regression (RLR; one of the recent developments in weighted rule learning). Their result improved the explainability of models learned through graph random walks, by providing a weighted rule interpretation for them.

An Interpretable Reasoning Network for Multi-Relation Question Answering (Jun 2018) addressed multi-relation question answering via elaborated analysis of questions and reasoning over multiple fact triples in a knowledge base. Their Interpretable Reasoning Network (IRN) model dynamically decided which part of an input question should be analyzed at each hop, for which the reasoning module predicted a knowledge base relation (relation triple) that corresponded to the current parsed result. Of particular interest for this subsection, IRN offered traceable and observable intermediate predictions (see their Figure 3), facilitating reasoning analysis and failure diagnosis (thereby also allowing manual manipulation in answer prediction).

The adoption of machine learning in high-stakes applications such as healthcare requires explanations that are comprehensible to the domain user, who often holds the ultimate responsibility for decisions and outcomes. Teaching Meaningful Explanations (Sep 2018), from IBM Research, proposed an approach [TED: Teaching Explanations for Decisions] to generate such explanations, in which training data was augmented to include – in addition to features and labels – explanations elicited from domain users. A joint model then learned to produce both labels and explanations from the input features. This simple idea ensured that explanations were tailored to the complexity expectations and domain knowledge of the user. This new approach was particularly well-suited for explaining a machine learning prediction when all of its input features were inherently incomprehensible to humans, even to deep subject matter experts. Evaluations on a chemical odor dataset, a melanoma dataset and other datasets showed that their approach was generalizable across domains and algorithms, demonstrating that meaningful explanations could be reliably taught to machine learning algorithms (in some cases, also improving modeling accuracy).

“For the present discussion, we define an explanation as information provided in addition to an output that can be used to verify the output. In the ideal case, an explanation should enable a human user to independently determine whether the output is correct. The requirements of meaningful information have two implications for explanations:

  1. Complexity match: the complexity of the explanation needs to match the complexity capability of the user. For example, an explanation in equation form may be appropriate for a statistician, but not for a nontechnical person.

  2. Domain match: an explanation needs to be tailored to the domain, incorporating the relevant terms of the domain. For example, an explanation for a medical diagnosis needs to use terms relevant to the physician (or patient) who will be consuming the prediction.”
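The TED idea described above, training on (features, label, explanation) triples so that one model emits both a label and an explanation, can be sketched as follows. This is a minimal illustration under assumptions: it fuses each label and explanation into a single target class and uses a 1-nearest-neighbour classifier, and the feature names, labels and explanation strings are invented placeholders rather than data from the paper.

```python
# Hypothetical training triples: (features, label, explanation from a domain user).
# Features are (income, debt), both scaled to the range 0..1.
examples = [
    ((0.9, 0.1), "approve", "meets_criteria"),
    ((0.8, 0.2), "approve", "meets_criteria"),
    ((0.2, 0.1), "deny", "income_too_low"),
    ((0.9, 0.9), "deny", "debt_too_high"),
]

def train(examples):
    """Fuse label and explanation into one target per training example."""
    return [(x, (y, e)) for x, y, e in examples]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def predict(model, x):
    """1-nearest-neighbour over fused targets; decodes to (label, explanation)."""
    _, target = min(model, key=lambda item: sq_dist(item[0], x))
    return target

model = train(examples)
label, explanation = predict(model, (0.85, 0.85))
print(label, explanation)
```

Any multi-class classifier could replace the nearest-neighbour step; the point of the sketch is only that the explanation vocabulary comes from the domain user, which is how the complexity and domain matches quoted above are obtained.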

Given a neural network, we are interested in knowing what features it has learned for making classification decisions. Network interpretation is also crucial for (computer vision) tasks involving humans, like autonomous driving and medical image analysis (and in the NLP domain, clinical diagnosis and recommendation, …). In an interesting approach, Neural Network Interpretation via Fine Grained Textual Summarization (Sep 2018), the authors introduced the novel task of interpreting classification models using fine grained textual summarization: along with (image classification) label prediction, the network generated a sentence explaining its decision. For example, a knowledgeable person looking at a photograph of a bird might say,

    "I think this is a Anna's Hummingbird because it has a straight bill, a rose pink throat and crown. It's not a Broad-tailed Hummingbird because the later lacks the red crown."

This kind of textual description carries rich semantic information and is easily understandable, illustrating the use of natural language as a logical medium in which to ground the interpretation of deep convolutional models. Tasks that combine text generation and visual explanation include image captioning and visual question answering (VQA). Although this paper addresses those tasks, the method described (which leverages attention, summarization and natural language) could also be applied in the NLP domain (this is addressed in Sections 3 and 4.3 in their paper).



Machine learning models appear to be particularly sensitive to adversarial challenges; for example, note Table 1 and the accompanying text in Semantically Equivalent Adversarial Rules for Debugging NLP Models (2018).



Those authors also published the heavily-cited LIME algorithm, described in ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier [code; discussion], which supports explaining individual predictions for text classifiers or classifiers that act on tables (numpy arrays of numerical or categorical data) or images. A more recent approach from the same authors is described in Anchors: High-Precision Model-Agnostic Explanations [code; discussion].



Other work in this domain includes A Unified Approach to Interpreting Model Predictions (Nov 2017) [code], which described a unified approach (SHAP) to explaining the output of any machine learning model. SHAP – which united six existing methods, including LIME – assigned an importance value for a particular prediction to each feature, using game theory to guarantee a unique solution that was better aligned with human intuition than existing methods. SHAP is well-discussed in these three blog posts.
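The game-theoretic quantity behind SHAP is the Shapley value, which can be computed exactly for a handful of features. The sketch below is an illustration under assumptions and does not use the SHAP library: the model `toy_model` is invented, "removing" a feature is approximated by substituting a baseline value, and exact enumeration is only feasible for very small feature counts (practical implementations approximate it).

```python
import itertools
import math

def toy_model(x):
    """Hypothetical model with one additive term and one interaction term."""
    return 2.0 * x[0] + x[1] * x[2]

def shapley_values(model, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(instance)

    def value(subset):
        x = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return model(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the model output for the instance and for the baseline, and the interaction term is split evenly between the two features that produce it.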

An interesting extension to LIME is Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME (Nov 2018) [code]. LIME takes a black-box model (such as a neural network, random forest, etc.) and a data sample, and outputs a list of weighted features that contribute most to the classification decision. By going over the same process for all training samples, we can generate a transformed version of the original data set that only contains the relevant features. The algorithm developed in this paper, UFOLD, learns concise logic programs from a transformed data set that is generated by storing the explanations provided by LIME. Specifically, they presented a new inductive logic programming (ILP) algorithm capable of learning non-monotonic logic programs from local explanations of boosted tree models provided by LIME.
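The LIME step summarized above (black-box model plus one data sample in, weighted features out) can be sketched as a locally weighted linear fit. This is a simplified illustration, not the LIME library: `black_box` is a hypothetical model, all on/off perturbations of the sample are enumerated instead of sampled at random, "off" means setting a feature to 0, and the kernel width is an arbitrary assumption.

```python
import itertools
import math

def black_box(x):
    """Hypothetical opaque classifier returning a probability."""
    z = 3.0 * x[0] - 2.0 * x[1] + 0.0 * x[2]
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def explain(instance, kernel_width=0.75):
    """Fit a proximity-weighted linear surrogate around one instance."""
    d = len(instance)
    rows, targets, weights = [], [], []
    for mask in itertools.product([0, 1], repeat=d):
        perturbed = [v if keep else 0.0 for v, keep in zip(instance, mask)]
        distance = math.sqrt(sum(1 - keep for keep in mask) / d)
        rows.append([1.0] + [float(keep) for keep in mask])  # intercept + on/off features
        targets.append(black_box(perturbed))
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
    k = d + 1
    # Weighted normal equations: (X^T W X) coef = X^T W y.
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(k)]
    return solve(xtwx, xtwy)[1:]   # drop the intercept, keep per-feature weights

attributions = explain([1.0, 1.0, 1.0])
print(attributions)
```

Storing such a weight list for every training sample would give the kind of transformed data set, restricted to the relevant features, from which the paper's ILP algorithm then learns logic programs.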



Global Explanations of Neural Networks: Mapping the Landscape of Predictions (Feb 2019) [code] presented an approach for generating global attributions called GAM, which explained the landscape of neural network predictions across subpopulations. GAM augmented global explanations with the proportion of samples that each attribution best explained and specified which samples were described by each attribution. Global explanations also had tunable granularity to detect more or fewer subpopulations. They demonstrated that GAM’s global explanations yielded the known feature importances of simulated data, matched feature weights of interpretable statistical models on real data, and were intuitive to practitioners through user studies. With more transparent predictions, GAM could help ensure neural network decisions were generated for the right reasons.

  • This paper also discusses LIME.

