Deep learning and Reinforcement learning: New algorithms with improved interpretability, scalability, reliability and efficiency

Computer Science and Engineering
Project type: 
Sponsored Projects
2019 - 2021
Principal Investigator: 
Dr. Chandra Shekar Lakshminarayanan
Project Number: 
Sponsoring Agency: 
SERB - Start-up Research Grant (SRG)
Total Budget: 
Machine learning (ML) methods are a data-driven approach to solving artificial intelligence (AI) tasks, and rest on the concepts of learning and generalisation. Learning is the use of data to learn a functional relationship (also known as the model) between the input and the output, and generalisation is the ability of the learnt model to perform well on unseen data. Recent times have witnessed two ground-breaking paradigm shifts in AI/ML. The first is the wave of deep learning (DL), wherein a deep neural network (DNN) is used as a "general purpose" learning architecture to engineer solutions for a spectrum of AI tasks. The second is in reinforcement learning (RL), a family of ML methods for tackling AI tasks that involve control. RL has applications ranging from self-driving cars to medicine, education and transportation.
The major questions of theoretical and practical interest in DL are: The loss function is non-convex, yet why are deep neural networks capable of achieving zero training error? DL models are over-parameterised, yet how do they generalise? The project's objective is to answer these questions with the aim of developing new DL and RL algorithms. To this end, we propose to obtain additional insights by looking at deep learning at various levels of abstraction, namely i) the black-box level, ii) the level of weights, iii) the level of connections and activations and iv) the level of sub-networks.
With this background, the project will test the following hypotheses:
Hypothesis at the black-box level (H1): In deep learning, depth always helps in reducing the training error.
Hypothesis at the level of weights (H2): Most weights and activations of a deep neural network are irrelevant.
Hypothesis at the level of sub-networks (H3): Large neural networks can always be pruned into a smaller sub-network without loss in performance.
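To make H3 concrete, the following is a minimal sketch of what pruning a set of weights can look like. The project description does not say which pruning criterion will be used; magnitude-based pruning is assumed here purely for illustration, and the function name `magnitude_prune` and the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Assumption: magnitude is used as the relevance criterion; the project
    description does not specify one.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute weight value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


# Hypothetical weights of a single layer, flattened into a list.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
```

Testing H3 would then amount to comparing the performance of the network before and after such a pruning step, for increasing values of `sparsity`.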
Experiment 1: To verify H1, H2 and H3, the plan is to study the training error as a function of width, depth, batch size, step size and initialisation by fitting standard datasets with true labels, random labels and random pixels. The experiments will also study the dynamics of the weights as the network trains, and the variation in the activations across layers and within a given layer during the course of training. The connectivity and information propagation from input to output through the weights and activations will also be examined.
Experiment 2: Atari games are a popular benchmark for evaluating the performance of RL algorithms. The plan is to compare and improve the sample efficiency of temporal-difference-based learning algorithms for value function estimation in RL.
Significance: 1) Make deep learning more interpretable, scalable and reliable. 2) Eliminate hyper-parameter tuning by establishing a closed-form relationship between depth, width, initialisation, batch size and step size. 3) Improve the sample efficiency of reinforcement learning algorithms.
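As an illustration of the temporal difference learning for value function estimation mentioned under Experiment 2, the sketch below runs tabular TD(0) on a small symmetric random walk. This is not the project's algorithm or its Atari setup: the environment, the function name `td0_random_walk` and all parameter values are assumptions chosen so that the example is self-contained and its true values (1/6, ..., 5/6) are known.

```python
import random


def td0_random_walk(num_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values of a symmetric random walk with tabular TD(0).

    Assumed toy environment: states 1..num_states are non-terminal, states 0
    and num_states + 1 are terminal, and a reward of 1 is given only when the
    walk exits on the right.
    """
    rng = random.Random(seed)
    # Terminal states keep a value of 0 throughout.
    values = [0.0] * (num_states + 2)
    for _ in range(episodes):
        state = (num_states + 1) // 2  # each episode starts in the middle
        while 0 < state < num_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == num_states + 1 else 0.0
            # TD(0) update: move V(s) towards the bootstrapped target.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[1:-1]


estimates = td0_random_walk()
true_values = [i / 6 for i in range(1, 6)]
```

Sample efficiency in this setting can be read off as the number of episodes needed for `estimates` to come within a given tolerance of `true_values`.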