Interpretable recurrent neural networks in continuous-time control environments

Hasani, Ramin

doi:10.34726/hss.2020.78942

Record citation link:

https://doi.org/10.34726/hss.2020.78942
http://hdl.handle.net/20.500.12708/1068

Title:

Interpretable recurrent neural networks in continuous-time control environments

Citation:

Hasani, R. (2020). Interpretable recurrent neural networks in continuous-time control environments [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.78942

reposiTUm-DOI:

10.34726/hss.2020.78942

CatalogPlus:

AC15644999

Publication type:

Thesis - Dissertation

Language:

English

Authors:

Hasani, Ramin

Supervisor:

Grosu, Radu

Organizational unit:

E191 - Institut für Computer Engineering

Date (published):

2020

Number of pages:

167

Keywords:

recurrent neural networks; interpretable AI; brain-inspired machine learning; deep learning; explainable AI; continuous-time neural networks; neural ODE; robot control; control theory; dynamical systems

Abstract:

Intelligent agents must learn coherent representations of their world from high-dimensional sensory information and use them to generalize well in unseen situations. Although contemporary deep learning algorithms have achieved noteworthy successes in a variety of high-dimensional tasks, their learned causal structure, interpretability, and robustness have been largely overlooked. This dissertation presents methods to address the interpretation, stability, and other overlooked properties of a class of intelligent algorithms, namely recurrent neural networks (RNNs), in continuous-time environments. Accordingly, the contributions of the work fall into two major frameworks. I) Designing interpretable RNN architectures: We first introduce a novel RNN instance that is formulated from computational models originally developed to explain the nervous system of small species. We call these RNNs liquid time-constant networks (LTCs) because they possess nonlinear compartments that regulate the state of a neuron through a variable time-constant. LTCs form a dynamic causal model capable of learning causal relationships between the input, their neural state, and the output dynamics directly from supervised training data. Moreover, we demonstrate that LTCs are universal approximators and can be advantageously used in continuous-time control domains. We then combine LTCs with contemporary scalable deep neural network architectures and structural inspirations from the C. elegans connectome to develop novel neural processing units that can learn to map multidimensional inputs to control commands through sparse, causal, interpretable, and robust neural representations. We extensively evaluate the performance of LTC-based neural network instances in a broad set of simulated and real-world applications ranging from time-series classification and prediction to autonomous robot and vehicle control.
II) Designing interpretation methods for trained RNN instances: In this framework, we develop a quantitative method to interpret the dynamics of modern RNN architectures. As opposed to existing methods, which are proactively constructed by empirical feature visualization algorithms, we propose a systematic pipeline for interpreting individual hidden state dynamics within the network using response characterization methods. Our method is able to uniquely identify neurons with insightful dynamics, quantify relationships between dynamical properties and test accuracy through ablation analysis, and interpret the impact of network capacity on a network’s dynamical distribution. Finally, we demonstrate the scalability of our method by evaluating it on a series of different benchmark sequential datasets. The findings of this dissertation notably improve our understanding of neural information processing systems in continuous-time environments.

License:

Copyright protected

Appears in collections:

Thesis

Full text (Version of Record (published version))

Adobe PDF

(12.45 MB)

Copyright protected
