# DeepMind's Path to Neuro-inspired General Intelligence

By Jeremy Nixon [jnixon2@gmail.com]. Nov. 2017. Updated June 2018.

Overview

- DeepMind Paper Framing
- DeepMind Papers through Framing
- Current Frontier
- Examples of Systems Neuroscience Inspiration

DeepMind paper categories on the path to date:

- Transfer Learning
- Multi-task Learning
- Tools, Environment & Datasets
- Intuitive Physics
- Reinforcement Learning
- Model-based RL
- Exploration in RL

- Applications
- Safety
- Deep Learning
- RNNs
- CNNs

- Generative Models
- Variational Inference
- Unsupervised Learning
- Representation Learning
- Attention
- Memory
- Multi-Agent Systems
- Imitation Learning
- Metalearning
- Neural Programming

- Evolution
- Game Theory
- Natural Language Processing
- Multi-Modal Learning
- General Machine Learning
- Theory
- Miscellaneous
- Neuroscience

Papers:

- Transfer Learning
- DARLA: Improving Zero-Shot Transfer In Reinforcement Learning
- PathNet: Evolution Channels Gradient Descent in Super Neural Networks
- Matching Networks for One Shot Learning
- Progressive Neural Networks
- Sim-to-Real Robot Learning from Pixels with Progressive Nets
- Successor Features for Transfer in Reinforcement Learning

- Multi-Task Learning
- Multi-task Self-Supervised Visual Learning
- The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously
- Distral: Robust Multitask Reinforcement Learning
- Emergence of Locomotion Behaviors in Rich Environments
- Reinforcement Learning with Unsupervised Auxiliary Tasks
- Learning to Navigate in Complex Environments
- Learning and Transfer of Modulated Locomotor Controllers
- Multi-Task Sequence to Sequence Learning
- Learning by Playing - Solving Sparse Reward Tasks from Scratch
- Unicorn: Continual Learning with a Universal, Off-policy Agent
- Progress & Compress: A Scalable Framework for Continual Learning

- Tools, Environments, Evaluation & Datasets
- StarCraft II: A New Challenge for Reinforcement Learning
- DeepMind Lab
- The Kinetics Human Action Video Dataset
- An approximation of the Universal Intelligence Measure
- Psychlab: A Psychology Laboratory for Deep Reinforcement Learning
- DeepMind Control Suite
- Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

- Intuitive Physics
- Reinforcement Learning (Papers with a pure RL focus)
- Model-Based RL
- Learning Model-Based Planning from Scratch [Also, Planning]
- Recurrent Environment Simulators
- Structure Learning in Motor Control: A Deep Reinforcement Learning Model [Also Transfer, Intuitive Physics]
- Imagination-Augmented Agents for Deep Reinforcement Learning [Also, Planning]
- Continuous Deep Q-Learning with Model-based Acceleration
- Skip Context Tree Switching
- Bayes-Adaptive Simulation-Based Search with Value Function Approximation
- Learning and Querying Fast Generative Models for Reinforcement Learning

- Exploration in RL
- A Distributional Perspective on Reinforcement Learning
- FeUdal Networks for Hierarchical Reinforcement Learning [Also, Planning]
- Combining Policy Gradient and Q-Learning
- Strategic Attentive Writer for Learning Macro-Actions
- Safe and Efficient Off-Policy Reinforcement Learning
- Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
- Thompson Sampling is Asymptotically Optimal in General Environments
- Asynchronous Methods for Deep Reinforcement Learning
- Dueling Network Architectures for Deep Reinforcement Learning
- Increasing the Action Gap: New Operators for Reinforcement Learning
- Deep Reinforcement Learning with Double Q-Learning
- Policy Distillation
- Universal Value Function Approximators
- Human-level Control through Deep Reinforcement Learning
- Learning Continuous Control Policies by Stochastic Value Gradients
- Fictitious Self-Play in Extensive Form Games
- Toward Minimax Off-policy Value Estimation
- Massively Parallel Methods for Deep Reinforcement Learning
- Compress and Control
- Deterministic Policy Gradient Algorithms
- Playing Atari with Deep Reinforcement Learning
- Reinforcement Learning, Efficient Coding, and the Statistics of Natural Tasks
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- Path Consistency Learning in Tsallis Entropy Regularized MDPs
- More Robust Doubly Robust Off-Policy Evaluation
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
- Mix & Match - Agent Curricula for Reinforcement Learning
- Vector-based Navigation Using Grid-Like Representations in Artificial Agents
- Kickstarting Deep Reinforcement Learning

- Model-Based RL
- Applications
- Go
- Poker
- Fairness

- Safety / Security
- Reinforcement Learning with a Corrupted Reward Channel [Also, Safety]
- Safely Interruptible Agents [Also, Safety]
- AI Safety Gridworlds
- Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
- Safe Exploration in Continuous Action Spaces
- Measuring and Avoiding Side Effects Using Relative Reachability

- Deep Learning
- Recurrent Neural Networks
- Convolutional Neural Networks
- Noisy Networks for Exploration
- Sobolev Training for Neural Networks
- Decoupled Neural Interfaces using Synthetic Gradients
- Understanding Synthetic Gradients and Decoupled Neural Interfaces
- Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
- Overcoming Catastrophic Forgetting in Neural Networks
- Local Minima in Training of Neural Networks
- Learning Values Across Many Orders of Magnitude
- MuProp: Unbiased Backpropagation for Stochastic Neural Networks
- ACDC: A Structured Efficient Linear Layer
- Natural Neural Networks
- Gradient Estimation Using Stochastic Computation Graphs
- Weight Uncertainty in Neural Networks
- Stochastic Backpropagation and Approximate Inference in Deep Generative Models
- On the Importance of Single Directions for Generalization

- Variational Inference
- Filtering Variational Objectives
- Variational Inference for Monte Carlo Objectives
- Variational Inference with Normalizing Flows
- Variational Information Maximization for Intrinsically Motivated Reinforcement Learning [Also, Reinforcement Learning]
- Neural Variational Inference and Learning in Belief Networks
- Distribution Matching in Variational Inference [Also, Generative, Unsupervised Learning]

- Generative Models
- The Cramer Distance as a Solution to Biased Wasserstein Gradients
- Variational Approaches for Auto-Encoding Generative Adversarial Networks
- Comparison of Maximum Likelihood and GAN-based training of Real NVPs
- Parallel Multiscale Autoregressive Density Estimation
- Conditional Image Generation with PixelCNN Decoders
- WaveNet: A Generative Model for Raw Audio
- Video Pixel Networks
- Learning in Implicit Generative Models
- Connecting Generative Adversarial Networks and Actor-Critic Methods [Also, Reinforcement Learning]
- Pixel Recurrent Neural Networks
- One-Shot Generalization in Deep Generative Models
- A Test of Relative Similarity for Model Selection in Generative Models
- DRAW: A Recurrent Neural Network for Image Generation [Also, Attention]
- Semi-Supervised Learning with Deep Generative Models
- Deep AutoRegressive Networks
- A Note on the Evaluation of Generative Models
- Parallel WaveNet: Fast High-Fidelity Speech Synthesis
- Efficient Neural Audio Synthesis (WaveRNN)
- Learning and Querying Fast Generative Models for Reinforcement Learning

- Unsupervised Learning
- Representation Learning
- Attention
- Memory
- Neural Episodic Control
- Generative Temporal Models With Memory
- Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
- Model-Free Episodic Control
- One-Shot Learning with Memory-Augmented Neural Networks
- Associative Long Short-Term Memory
- Prioritized Experience Replay
- Sample Efficient Actor-Critic with Experience Replay
- Learning Efficient Algorithms with Hierarchical Attentive Memory [Also, attention]
- Count-Based Frequency Estimation with Bounded Memory [Also, Natural Language Processing]
- Memory-based Parameter Adaptation

- Multi-Agent Systems
- Imitation Learning
- Robust Imitation of Diverse Behaviors
- Learning Human Behaviors from Motion Capture by Adversarial Imitation
- Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
- Reinforcement and Imitation Learning for Diverse Visuomotor Skills
- Playing Hard Exploration Games by Watching YouTube

- Metalearning
- Neural Programming
- Hybrid Computing Using a Neural Network with Dynamic External Memory
- Programmable Agents [Also, Representation Learning]
- Neural Programmer-Interpreters
- Neural Random-Access Machines
- Neural Turing Machines
- Learning Explanatory Rules from Noisy Data
- Synthesizing Programs for Images using Reinforced Adversarial Learning (SPIRAL)

- Neural Programming
- Evolution
- Game Theory
- Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
- The Mechanics of n-Player Differentiable Games [Also, Generative Models (GANs)]
- Symmetric Decomposition of Asymmetric Games
- A Generalised Method for Empirical Game Theoretic Analysis
- Inequity Aversion Resolves Intertemporal Social Dilemmas

- Natural Language Processing
- Generative and Discriminative Text Classification with Recurrent Neural Networks
- Learning to Compose Words Into Sentences with Reinforcement Learning
- Reference-Aware Language Models
- The Neural Noisy Channel
- Latent Predictor Networks for Code Generation
- Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning
- Semantic Parsing with Semi-Supervised Sequential Autoencoders
- On the State of the Art of Evaluation in Neural Language Models
- Teaching Machines to Read and Comprehend
- Learning to Transduce with Unbounded Memory [Also, Memory, Neural Programming]
- Dependency Recurrent Neural Language Models for Sentence Completion
- Towards End-to-End Speech Recognition with Recurrent Neural Networks
- Learning Word Embeddings Efficiently with Noise-Contrastive Estimation
- The NarrativeQA Reading Comprehension Challenge
- Learning to Follow Language Instructions with Adversarial Reward Induction [Also, Loss Function Learning]

- Multi-Modal
- Look, Listen and Learn
- End-to-end Optimization of Goal-Driven and Visually Grounded Dialogue Systems
- GuessWhat?! Visual Object Discovery through Multi-Modal Dialogue
- Grounded Language Learning in a Simulated 3D World
- Understanding Grounded Language Learning Agents [Also, Natural Language Processing]
- Objects that Sound

- General Machine Learning
- The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
- Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees
- Unit Tests for Stochastic Optimization
- Bayesian Hierarchical Community Discovery
- Implicit Reparameterization Gradients
- Cleaning up the Neighborhood: A Full Classification for Adversarial Partial Monitoring

- Theory
- Miscellaneous
- Neuroscience
- The Successor representation in human reinforcement learning
- Dorsal Hippocampus Contributes to Model-Based Planning
- Neuroscience-Inspired Artificial Intelligence
- Computations Underlying Social Hierarchy Learning: Distinct Neural Mechanism for Updating and Representing Self-Relevant Information
- Dorsal Anterior Cingulate Cortex and the Value of Control
- Semantic Representations in the Temporal Pole Predict False Memories
- Towards an Integration of Deep Learning and Neuroscience
- What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated
- Neural Mechanisms of Hierarchical Planning in a Virtual Subway Network [Also, Planning]
- Predictive Representations can Link Model-Based Reinforcement Learning to Model-Free Mechanisms
- Hippocampal place cells construct reward related sequences through unexplored space
- A Probabilistic Approach to Demixing Odors
- Approximate Hubel-Wiesel Modules and the Data Structures of Neural Computation
- The Future of Memory: Remembering, Imagining, and the Brain
- Is the Brain a Good Model for Machine Intelligence?
- Evidence Integration in Model-Based Tree Search
- Commentary on Building Machines that Learn and Think for Themselves
- Prefrontal Cortex as a Meta-Reinforcement Learning System

Current Frontier:

- Hierarchical planning
- Imagination-based planning with generative models
- Unsupervised Learning
- Memory and one-shot learning
- Abstract Concepts
- Continual and Transfer Learning

The emphasis is on systems neuroscience: using the brain as inspiration for the structure and function of algorithms.

Neuroscience-Inspired Artificial Intelligence

Examples of previous success of neuro-inspiration:

- Reinforcement Learning
- Inspired by animal learning
- TD Learning came out of animal behavior research.
- Second-order conditioning (conditioned stimulus) (Sutton and Barto, 1981)

- Deep Learning
- Convolutional Neural Networks. Visual Cortex (V1)
- Uses hierarchical structure (successive processing layers)
- Neurons in the early visual system respond strongly to specific patterns of light (say, precisely oriented bars) but hardly respond to many other patterns.
- Gabor functions describe the weights in V1 cells.
- Nonlinear Transduction
- Divisive Normalization
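The Gabor-function model of V1 receptive fields mentioned above can be made concrete. Below is a minimal sketch (my own illustration, not any specific DeepMind code): a Gabor kernel is a Gaussian envelope multiplied by an oriented sinusoid, and it responds strongly to a bar aligned with its preferred orientation while barely responding to an orthogonal one. The parameter values are arbitrary choices for the example.

```python
import numpy as np

def gabor(size=9, theta=0.0, sigma=2.0, wavelength=4.0):
    """Gabor filter: a Gaussian envelope times an oriented sinusoid,
    the classic model of V1 simple-cell receptive fields."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates into the filter's preferred orientation theta.
    x_t = x * np.cos(theta) + y * np.sin(theta)
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t**2 + y_t**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_t / wavelength)
    return envelope * carrier

kernel = gabor(theta=0.0)  # stripes vary along x, so it prefers vertical bars

vertical_bar = np.zeros((9, 9)); vertical_bar[:, 4] = 1.0
horizontal_bar = np.zeros((9, 9)); horizontal_bar[4, :] = 1.0

# A bar aligned with the preferred orientation drives the unit hard;
# an orthogonal bar mostly cancels against the sinusoid.
resp_aligned = float(np.abs((kernel * vertical_bar).sum()))
resp_orthogonal = float(np.abs((kernel * horizontal_bar).sum()))
```

In a CNN, banks of learned filters end up resembling such Gabor patches in the first layer, which is part of why the V1 analogy holds up.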

- Word / Sentence Vectors - Distributed Embeddings
- Parallel Distributed Processing in the brain for representation and computation
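The contrast between localist (one-hot) and distributed codes can be sketched in a few lines. This is a toy illustration with a hypothetical three-word vocabulary and made-up 8-dimensional vectors: one-hot codes make every pair of distinct words equally (and maximally) dissimilar, while distributed codes spread meaning across dimensions so similarity is graded.

```python
import numpy as np

# Localist (one-hot) codes: distinct words are always orthogonal.
one_hot = np.eye(4)

# Distributed codes: meaning is spread over many dimensions.
# (Hypothetical values; "cat" and "dog" deliberately share structure.)
emb = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0, 0.7, 0.2, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.1, 0.6, 0.3, 0.0, 0.1]),
    "car": np.array([0.0, 0.1, 0.9, 0.8, 0.1, 0.0, 0.7, 0.6]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_cat_dog = cosine(emb["cat"], emb["dog"])   # high: related words
sim_cat_car = cosine(emb["cat"], emb["car"])   # low: unrelated words
sim_onehot = float(one_hot[0] @ one_hot[1])    # always exactly 0
```

The graded-similarity property is what word2vec-style embeddings exploit, and it mirrors the parallel distributed processing view of representation in the brain.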

- Dropout
- Stochasticity in neurons that fire with Poisson-like statistics (Hinton, 2012)
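A minimal inverted-dropout sketch (my illustration, with an arbitrary drop probability) shows the mechanism the Poisson-firing analogy points at: each unit is silenced at random during training, and activations are rescaled so the expected value is unchanged.

```python
import numpy as np

def dropout(x, p_drop=0.5, rng=None, train=True):
    """Inverted dropout: zero each unit with probability p_drop during
    training (loosely analogous to stochastic, Poisson-like firing),
    rescaling so the expected activation is unchanged."""
    if not train:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)

x = np.ones(10_000)
out = dropout(x, p_drop=0.5, rng=np.random.default_rng(0))

frac_zero = float((out == 0).mean())  # roughly half the units silenced
mean_act = float(out.mean())          # mean stays near 1 after rescaling
```

At test time (`train=False`) the layer is deterministic, which is the standard inverted-dropout convention.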
- Attention

- Applying attention to memory
- Thought: it makes little sense to train an attention model over a static image rather than over a time series; with a time series, directing attention to changing aspects of the input is natural.

- Multiple Memory Systems
- Episodic Memory
- Experience Replay
- Especially for one shot experiences
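A minimal experience-replay buffer (a sketch of the standard DQN-style mechanism, not DeepMind's actual code) makes the episodic-memory analogy concrete: transitions are stored as the agent acts, then sampled at random so learning updates are decorrelated from the order of experience.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience-replay buffer: an episodic store of
    transitions, replayed offline in shuffled minibatches - loosely
    analogous to hippocampal replay of past experience."""
    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)  # oldest entries evicted

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks temporal correlations.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(150):            # overfill: the first 50 are evicted
    buf.add(t, t % 4, 1.0, t + 1, False)
batch = buf.sample(8)
```

Prioritized Experience Replay (listed above) replaces the uniform `sample` with sampling proportional to TD error, replaying surprising transitions more often.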

- Working Memory
- LSTM: gating allows conditioning on the current state
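A single LSTM step, sketched in NumPy with arbitrary shapes and random weights, shows what "gating allows conditioning on the current state" means: the input, forget, and output gates are themselves functions of the current input and hidden state, so what gets written to and read from the cell memory is state-dependent - the working-memory analogy.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W):
    """One LSTM step. W maps [x; h] to the four stacked gate
    pre-activations (input, forget, output, candidate)."""
    z = W @ np.concatenate([x, h])
    n = h.size
    i, f, o = sigmoid(z[:n]), sigmoid(z[n:2*n]), sigmoid(z[2*n:3*n])
    g = np.tanh(z[3*n:])          # candidate cell update
    c_new = f * c + i * g         # gated write to the cell memory
    h_new = o * np.tanh(c_new)    # gated read for the next step
    return h_new, c_new

rng = np.random.default_rng(0)
x, h, c = rng.standard_normal(3), np.zeros(4), np.zeros(4)
W = rng.standard_normal((16, 7)) * 0.1   # 4 gates x 4 units, input dim 3+4
h1, c1 = lstm_step(x, h, c, W)
```

Because the gates sit in (0, 1), the network can learn to hold information in `c` across many steps (forget gate near 1) or overwrite it when the current input warrants.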

- Long-term Memory
- External Memory
- Gating in LSTM
- Continual Learning

- Elastic weight consolidation for slowing down learning on weights that are important for previous tasks.
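The elastic weight consolidation idea reduces to a quadratic penalty. Below is a sketch of that penalty with made-up numbers (the real method estimates per-weight importance from the diagonal Fisher information of the old task): weights that mattered for task A are anchored, while unimportant weights remain free to move for task B.

```python
import numpy as np

def ewc_penalty(theta, theta_old, fisher, lam=1.0):
    """EWC penalty: 0.5 * lam * sum_i F_i * (theta_i - theta_old_i)^2.
    F_i (Fisher information) measures how important weight i was for
    the previous task; large F_i slows learning on that weight."""
    return 0.5 * lam * float(np.sum(fisher * (theta - theta_old) ** 2))

theta_old = np.array([1.0, -2.0, 0.5])  # weights after task A
fisher = np.array([10.0, 0.0, 0.1])     # weight 0 matters; weight 1 doesn't
theta = np.array([1.5, 3.0, 0.5])       # proposed weights during task B

# Moving the important weight 0 by 0.5 costs 0.5 * 10 * 0.25 = 1.25;
# moving the unimportant weight 1 by 5.0 costs nothing.
penalty = ewc_penalty(theta, theta_old, fisher)
```

During task B training, this penalty is simply added to the new task's loss, so gradient descent trades off new-task performance against disturbing consolidated weights.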

Example of future success:

- Intuitive Understanding of Physics
- Need to understand space, number, objectness
- Need to disentangle representations for transfer

- Efficient Learning (learning from few examples)

- Transfer Learning
- Transferring generalized knowledge gained in one context to novel domains
- Concept representations for transfer
- No direct evidence of concept representations in brains
- Imagination and Planning

- Toward model-based RL
- Internal model of the environment
- Model needs to include compositional / disentangled representations for flexibility

- Implementing a forecast-based method of action selection
- Monte Carlo Tree Search as simulation-based planning
- In rat brains, we observe 'preplay', where rats appear to imagine likely future experience - measured by comparing neural activations during preplay to activations during the actual activity
- Generalization + Transfer in human planning
- Hierarchical Planning
- Virtual Brain Analytics
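The forecast-based action selection described above can be sketched with a crude rollout planner on a toy chain MDP (my own illustration, far simpler than full Monte Carlo Tree Search): for each candidate action, the agent "preplays" many random futures through its internal model and picks the action whose imagined futures score best on average.

```python
import random

# Toy deterministic chain MDP: states 0..5, reward 1.0 each step at state 5.
# Actions: 0 = left, 1 = right. This stands in for a learned world model.
def step(state, action):
    nxt = min(state + 1, 5) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == 5 else 0.0)

def rollout_value(state, depth, rng):
    """Simulate one random trajectory through the model - a crude
    stand-in for 'preplay' of likely future experience."""
    total = 0.0
    for _ in range(depth):
        state, r = step(state, rng.choice([0, 1]))
        total += r
    return total

def plan(state, n_sims=200, depth=10, seed=0):
    """Forecast-based action selection: imagine many futures per
    action and pick the action with the best average outcome."""
    rng = random.Random(seed)
    values = {}
    for a in (0, 1):
        nxt, r = step(state, a)
        values[a] = r + sum(rollout_value(nxt, depth, rng)
                            for _ in range(n_sims)) / n_sims
    return max(values, key=values.get)

best = plan(state=2)  # moving right leads toward the rewarding state
```

MCTS improves on this by growing a search tree and reusing value estimates across simulations, and imagination-augmented agents replace the hand-coded `step` with a learned model - but the core loop of simulate-then-select is the same.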