Hasani, R. (2020). Interpretable recurrent neural networks in continuous-time control environments [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.78942
recurrent neural networks; interpretable AI; brain-inspired machine learning; deep learning; explainable AI; continuous-time neural networks; neural ODE; robot control; control theory; dynamical systems
en
Abstract:
Intelligent agents must learn coherent representations of their world from high-dimensional sensory information and use them to generalize well in unseen situations. Although contemporary deep learning algorithms have achieved noteworthy successes in a variety of high-dimensional tasks, their learned causal structure, interpretability, and robustness have been largely overlooked. This dissertation presents methods to address the interpretability, stability, and other overlooked properties of a class of intelligent algorithms, namely recurrent neural networks (RNNs), in continuous-time environments. Accordingly, the contributions of the work fall into two major frameworks. I) Designing interpretable RNN architectures: We first introduce a novel RNN instance that is formulated using computational models originally developed to explain the nervous systems of small species. We call these RNNs liquid time-constant networks (LTCs) because they possess nonlinear compartments that regulate the state of a neuron through a variable time-constant. LTCs form a dynamic causal model capable of learning causal relationships between the input, their neural state, and the output dynamics directly from supervised training data. Moreover, we demonstrate that LTCs are universal approximators and can be advantageously used in continuous-time control domains. We then combine LTCs with contemporary scalable deep neural network architectures and structural inspirations from the C. elegans connectome to develop novel neural processing units that can learn to map multidimensional inputs to control commands through sparse, causal, interpretable, and robust neural representations. We extensively evaluate the performance of LTC-based neural network instances across a wide range of simulated and real-world applications, from time-series classification and prediction to autonomous robot and vehicle control.
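As an illustration of what a variable time-constant means, the following is a minimal sketch of a single neuron whose decay rate depends on its input. The specific form dx/dt = -(1/tau + f) * x + f * A with a sigmoidal gate f, the explicit Euler integration, and all names and parameter values (`ltc_step`, `tau`, `w`, `mu`, `gamma`, `A`) are assumptions chosen for illustration; the abstract does not state the model equations, and this is not the dissertation's implementation.

```python
import math


def sigmoid(z):
    """Logistic function used here as the nonlinear gate (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(u, w=1.0, mu=0.5, gamma=4.0):
    """Input-dependent nonlinearity f(u) = w * sigmoid(gamma * (u - mu))."""
    return w * sigmoid(gamma * (u - mu))


def effective_time_constant(u, tau=1.0, **gate_params):
    """The state decays at rate (1/tau + f), so tau_eff = tau / (1 + tau * f)."""
    f = gate(u, **gate_params)
    return tau / (1.0 + tau * f)


def ltc_step(x, u, dt, tau=1.0, A=1.0, **gate_params):
    """One explicit-Euler step of dx/dt = -(1/tau + f(u)) * x + f(u) * A."""
    f = gate(u, **gate_params)
    dxdt = -(1.0 / tau + f) * x + f * A
    return x + dt * dxdt


def simulate(inputs, dt=0.01, x0=0.0, **params):
    """Integrate the neuron state over a sequence of scalar inputs."""
    x = x0
    trace = []
    for u in inputs:
        x = ltc_step(x, u, dt, **params)
        trace.append(x)
    return trace


if __name__ == "__main__":
    # A strong input opens the gate and shortens the effective time constant.
    print(effective_time_constant(-10.0), effective_time_constant(10.0))
    print(simulate([10.0] * 2000)[-1])
```

In this sketch a strong input both drives the state toward `A` and shortens the time constant, so the neuron responds quickly when stimulated and relaxes slowly otherwise.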
II) Designing interpretation methods for trained RNN instances: In this framework, we develop a quantitative method to interpret the dynamics of modern RNN architectures. As opposed to the existing methods that are proactively constructed by empirical feature visualization algorithms, we propose a systematic pipeline for interpreting individual hidden state dynamics within the network using response characterization methods. Our method can uniquely identify neurons with insightful dynamics, quantify relationships between dynamical properties and test accuracy through ablation analysis, and interpret the impact of network capacity on a network’s dynamical distribution. Finally, we demonstrate the scalability of our method by evaluating it on a series of benchmark sequential datasets. The findings of this dissertation notably improve our understanding of neural information processing systems in continuous-time environments.
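To make the idea of response characterization concrete, the following minimal sketch computes one classical control-theory metric, the settling time, from the step response of a single hidden unit. The choice of metric, the 5% tolerance band, and the first-order stand-in for a hidden state (`first_order_step_response`) are assumptions made for illustration; the abstract does not specify which response metrics the pipeline uses.

```python
def first_order_step_response(tau, dt=0.01, steps=1000):
    """Hypothetical stand-in for a hidden-state trace: dh/dt = (1 - h) / tau."""
    h = 0.0
    trace = [h]
    for _ in range(steps):
        h += dt * (1.0 - h) / tau
        trace.append(h)
    return trace


def settling_time(trace, dt, tol=0.05):
    """Time after which the trace stays within tol * |final - initial| of its final value."""
    final = trace[-1]
    band = tol * abs(final - trace[0])
    if band == 0.0:
        return 0.0  # the unit did not respond to the step at all
    last_outside = -1
    for k, value in enumerate(trace):
        if abs(value - final) > band:
            last_outside = k
    return (last_outside + 1) * dt


if __name__ == "__main__":
    slow = first_order_step_response(tau=1.0)
    fast = first_order_step_response(tau=0.5)
    print(settling_time(slow, dt=0.01), settling_time(fast, dt=0.01))
```

Computing such a metric for every hidden unit gives a distribution of dynamical properties over the network, which can then be compared across architectures or network sizes.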