DeepMind's Path to Neuro-Inspired General Intelligence

Category: Machine Intelligence

By Jeremy Nixon [jnixon2@gmail.com]. Nov. 2017. Updated June 2018.

Overview

DeepMind Paper Framing
DeepMind Papers through Framing
Current Frontier
Examples of Systems Neuroscience Inspiration

DeepMind Papers

Categories of the path to date:

Transfer Learning
Multi-task Learning
Tools, Environment & Datasets
Intuitive Physics
Reinforcement Learning
1. Model-based RL
2. Exploration in RL
Applications
Safety
Deep Learning
1. RNNs
2. CNNs
Generative Models
Variational Inference
Unsupervised Learning
Representation Learning
Attention
Memory
Multi-Agent Systems
Imitation Learning
Metalearning
Neural Programming
Evolution
Game Theory
Natural Language Processing
Multi-Modal Learning
General Machine Learning
Theory
Miscellaneous
Neuroscience
Transfer Learning
1. DARLA: Improving Zero-Shot Transfer In Reinforcement Learning
  1. https://arxiv.org/pdf/1707.08475.pdf
2. PathNet: Evolution Channels Gradient Descent in Super Neural Networks
  1. https://arxiv.org/pdf/1701.08734.pdf
3. Matching Networks for One Shot Learning
  1. https://arxiv.org/abs/1606.04080
4. Progressive Neural Networks
  1. https://arxiv.org/pdf/1606.04671.pdf
5. Sim-to-Real Robot Learning from Pixels with Progressive Nets
  1. https://arxiv.org/pdf/1610.04286.pdf
6. Successor Features for Transfer in Reinforcement Learning
  1. https://arxiv.org/pdf/1606.05312.pdf
Multi-Task Learning
1. Multi-task Self-Supervised Visual Learning
  1. https://arxiv.org/pdf/1708.07860.pdf
2. The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously
  1. https://arxiv.org/pdf/1707.03300.pdf
3. Distral: Robust Multitask Reinforcement Learning
  1. https://arxiv.org/pdf/1707.04175.pdf
4. Emergence of Locomotion Behaviors in Rich Environments
  1. https://arxiv.org/pdf/1707.02286.pdf
5. Reinforcement Learning with Unsupervised Auxiliary Tasks
  1. https://arxiv.org/pdf/1611.05397.pdf
6. Learning to Navigate in Complex Environments
  1. https://arxiv.org/pdf/1611.03673.pdf
7. Learning and Transfer of Modulated Locomotor Controllers
  1. https://arxiv.org/pdf/1610.05182.pdf
8. Multi-Task Sequence to Sequence Learning
  1. https://arxiv.org/pdf/1511.06114v3.pdf
9. Learning by Playing - Solving Sparse Reward Tasks from Scratch
  1. https://arxiv.org/abs/1802.10567
10. Unicorn: Continual Learning with a Universal, Off-policy Agent
  1. https://arxiv.org/abs/1802.08294
11. Progress & Compress: A Scalable Framework for Continual Learning
  1. https://arxiv.org/abs/1805.06370
Tools, Environments, Evaluation & Datasets
1. Starcraft II: A New Challenge for Reinforcement Learning
  1. https://arxiv.org/pdf/1708.04782.pdf
2. DeepMind Lab
  1. https://arxiv.org/pdf/1612.03801.pdf
3. The Kinetics Human Action Video Dataset
  1. https://arxiv.org/pdf/1705.06950.pdf
4. An approximation of the Universal Intelligence Measure
  1. https://arxiv.org/pdf/1109.5951v2.pdf
5. Psychlab: A Psychology Laboratory for Deep Reinforcement Learning
  1. https://arxiv.org/abs/1801.08116
6. DeepMind Control Suite
  1. https://arxiv.org/pdf/1801.00690v1.pdf
7. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
  1. https://arxiv.org/pdf/1705.07750.pdf
Intuitive Physics
1. Position-Velocity Encoders for Unsupervised Learning of Structured State Representations
  1. https://arxiv.org/pdf/1705.09805.pdf
2. Learning to Perform Physics Experiments via Deep Reinforcement Learning
  1. https://arxiv.org/pdf/1611.01843.pdf
3. Continuous Control with Deep Reinforcement Learning
  1. https://arxiv.org/pdf/1509.02971v2.pdf
Reinforcement Learning (Papers with a pure RL focus)
1. Model-Based RL
  1. Learning Model-Based Planning from Scratch [Also, Planning]
    1. https://arxiv.org/pdf/1707.06170.pdf
  2. Recurrent Environment Simulators
    1. https://arxiv.org/pdf/1704.02254.pdf
  3. Structure Learning in Motor Control: A Deep Reinforcement Learning Model [Also Transfer, Intuitive Physics]
    1. https://arxiv.org/pdf/1706.06827.pdf
  4. Imagination-Augmented Agents for Deep Reinforcement Learning [Also, Planning]
    1. https://arxiv.org/abs/1707.06203
  5. Continuous Deep Q-Learning with Model-based Acceleration
    1. https://arxiv.org/abs/1603.00748
  6. Skip Context Tree Switching
    1. http://proceedings.mlr.press/v32/bellemare14.pdf
  7. Bayes-Adaptive Simulation-Based Search with Value Function Approximation
    1. http://www0.cs.ucl.ac.uk/staff/d.silver/web/Publications_files/bafa.pdf
  8. Learning and Querying Fast Generative Models for Reinforcement Learning
    1. https://arxiv.org/abs/1802.03006
2. Exploration in RL
  1. Count-Based Exploration with Neural Density Models
    1. https://arxiv.org/pdf/1703.01310.pdf
  2. Unifying Count-Based Exploration and Intrinsic Motivation
    1. https://arxiv.org/abs/1606.01868
  3. Deep Exploration via Bootstrapped DQN
    1. https://arxiv.org/abs/1602.04621
  4. Variational Intrinsic Control
    1. https://arxiv.org/pdf/1611.07507.pdf
  5. Learning to Search with MCTSnets
    1. https://arxiv.org/abs/1802.04697v1
  6. Observe and Look Further: Achieving Consistent Performance on Atari
    1. https://arxiv.org/abs/1805.11593
3. A Distributional Perspective on Reinforcement Learning
  1. https://arxiv.org/pdf/1707.06887.pdf
4. FeUdal Networks for Hierarchical Reinforcement Learning [Also, Planning]
  1. https://arxiv.org/pdf/1703.01161.pdf
5. Combining Policy Gradient and Q-Learning
  1. https://arxiv.org/pdf/1611.01626.pdf
6. Strategic Attentive Writer for Learning Macro-Actions
  1. https://arxiv.org/pdf/1606.04695.pdf
7. Safe and Efficient Off-Policy Reinforcement Learning
  1. https://arxiv.org/abs/1606.02647
8. Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
  1. https://arxiv.org/pdf/1610.00633.pdf
9. Thompson Sampling is Asymptotically Optimal in General Environments
  1. https://arxiv.org/pdf/1602.07905.pdf
10. Asynchronous Methods for Deep Reinforcement Learning
  1. https://arxiv.org/abs/1602.01783
11. Dueling Network Architectures for Deep Reinforcement Learning
  1. https://arxiv.org/abs/1511.06581
12. Increasing the Action Gap: New Operators for Reinforcement Learning
  1. https://arxiv.org/abs/1512.04860
13. Deep Reinforcement Learning with Double Q-Learning
  1. https://arxiv.org/abs/1509.06461
14. Policy Distillation
  1. https://arxiv.org/pdf/1511.06295.pdf
15. Universal Value Function Approximators
  1. http://proceedings.mlr.press/v37/schaul15.pdf
16. Human-level Control through Deep Reinforcement Learning
  1. https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf
17. Learning Continuous Control Policies by Stochastic Value Gradients
  1. https://arxiv.org/pdf/1510.09142v1.pdf
18. Fictitious Self-Play in Extensive Form Games
  1. http://proceedings.mlr.press/v37/heinrich15.pdf
19. Toward Minimax Off-policy Value Estimation
  1. http://proceedings.mlr.press/v38/li15b.html
20. Massively Parallel Methods for Deep Reinforcement Learning
  1. https://arxiv.org/pdf/1507.04296.pdf
21. Compress and Control
  1. https://arxiv.org/pdf/1411.5326v1.pdf
22. Deterministic Policy Gradient Algorithms
  1. http://proceedings.mlr.press/v32/silver14.pdf
23. Playing Atari with Deep Reinforcement Learning
  1. https://arxiv.org/pdf/1312.5602v1.pdf
24. Reinforcement Learning, Efficient Coding, and the Statistics of Natural Tasks
  1. http://www.sciencedirect.com/science/article/pii/S2352154615001151
25. Rainbow: Combining Improvements in Deep Reinforcement Learning
  1. https://arxiv.org/abs/1710.02298
26. Path Consistency Learning in Tsallis Entropy Regularized MDPs
  1. https://arxiv.org/abs/1802.03501
27. More Robust Doubly Robust Off-Policy Evaluation
  1. https://arxiv.org/abs/1802.03493
28. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
  1. https://arxiv.org/abs/1802.01561
29. Mis&Match - Agent Curricula for Reinforcement Learning
  1. https://arxiv.org/abs/1806.01780
30. Vector-based Navigation Using Grid-Like Representations in Artificial Agents
  1. https://www.nature.com/articles/s41586-018-0102-6.epdf?
31. Kickstarting Deep Reinforcement Learning
  1. https://arxiv.org/abs/1803.03835
Applications
1. Go
  1. Mastering the Game of Go with Deep Neural Networks and Tree Search
    1. https://storage.googleapis.com/deepmind-media/alphago/AlphaGoNaturePaper.pdf
  2. Move Evaluation in Go using Deep Convolutional Neural Networks [Also, Convolutional Neural Networks]
    1. http://www0.cs.ucl.ac.uk/staff/d.silver/web/Publications_files/deepgo.pdf
  3. Mastering the Game of Go Without Human Knowledge
    1. https://www.nature.com/articles/nature24270.epdf?
2. Poker
  1. Smooth UCT Search in Computer Poker
    1. http://www0.cs.ucl.ac.uk/staff/d.silver/web/Publications_files/smooth_uct.pdf
3. Fairness
  1. Path-Specific Counterfactual Fairness
    1. https://arxiv.org/pdf/1802.08139.pdf
Safety / Security
1. Reinforcement Learning with a Corrupted Reward Channel [Also, Safety]
  1. https://arxiv.org/pdf/1705.08417.pdf
2. Safely Interruptible Agents [Also, Safety]
  1. https://intelligence.org/files/Interruptibility.pdf
3. AI Safety Gridworlds
  1. https://arxiv.org/abs/1711.09883
4. Adversarial Risk and the Dangers of Evaluating Against Weak Attacks
  1. https://arxiv.org/abs/1802.05666
5. Safe Exploration in Continuous Action Spaces
  1. https://arxiv.org/abs/1801.08757
6. Measuring and Avoiding Side Effects Using Relative Reachability
  1. https://arxiv.org/abs/1806.01186
Deep Learning
1. Recurrent Neural Networks
  1. Sequential Neural Models with Stochastic Layers [Also, Planning]
    1. https://arxiv.org/abs/1605.07571
  2. Memory-Efficient Backpropagation Through Time
    1. https://arxiv.org/abs/1606.03401
  3. Adaptive Computation Time for Recurrent Neural Networks
    1. https://arxiv.org/abs/1603.08983
  4. Grid Long Short-Term Memory
    1. https://arxiv.org/pdf/1507.01526v3.pdf
  5. Order Matters: Sequence to Sequence for Sets
    1. https://arxiv.org/pdf/1511.06391v3.pdf
2. Convolutional Neural Networks
  1. Exploiting Cyclic Symmetry in Convolutional Neural Networks
    1. https://arxiv.org/abs/1602.02660
  2. Spatial Transformer Networks
    1. https://arxiv.org/pdf/1506.02025.pdf
  3. Very Deep Convolutional Networks for Large Scale Image Recognition
    1. https://arxiv.org/pdf/1409.1556v6.pdf
  4. Pooling is Neither Necessary Nor Sufficient for Appropriate Deformation Stability in CNNs
    1. https://arxiv.org/abs/1804.04438
3. Noisy Networks for Exploration
  1. https://arxiv.org/pdf/1706.10295.pdf
4. Sobolev Training for Neural Networks
  1. https://arxiv.org/abs/1706.04859
5. Decoupled Neural Interfaces using Synthetic Gradients
  1. https://arxiv.org/pdf/1608.05343.pdf
6. Understanding Synthetic Gradients and Decoupled Neural Interfaces
  1. https://arxiv.org/pdf/1703.00522.pdf
7. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
  1. https://arxiv.org/pdf/1612.01474.pdf
8. Overcoming Catastrophic Forgetting in Neural Networks
  1. https://arxiv.org/pdf/1612.00796.pdf
9. Local Minima in Training of Neural Networks
  1. https://arxiv.org/pdf/1611.06310.pdf
10. Learning Values Across Many Orders of Magnitude
  1. https://arxiv.org/abs/1602.07714
11. MuProp: Unbiased Backpropagation for Stochastic Neural Networks
  1. https://arxiv.org/pdf/1511.05176v2.pdf
12. ACDC: A Structured Efficient Linear Layer
  1. https://arxiv.org/pdf/1511.05946v3.pdf
13. Natural Neural Networks
  1. https://arxiv.org/pdf/1507.00210.pdf
14. Gradient Estimation Using Stochastic Computation Graphs
  1. https://arxiv.org/pdf/1506.05254v1.pdf
15. Weight Uncertainty in Neural Networks
  1. http://proceedings.mlr.press/v37/blundell15.pdf
16. Stochastic Backpropagation and Approximate Inference in Deep Generative Models
  1. https://arxiv.org/abs/1401.4082
17. On the Importance of Single Directions for Generalization
  1. https://arxiv.org/abs/1803.06959
Variational Inference
1. Filtering Variational Objectives
  1. https://arxiv.org/pdf/1705.09279.pdf
2. Variational Inference for Monte Carlo Objectives
  1. https://arxiv.org/abs/1602.06725
3. Variational Inference with Normalizing Flows
  1. https://arxiv.org/pdf/1505.05770.pdf
4. Variational Information Maximization for Intrinsically Motivated Reinforcement Learning [Also, Reinforcement Learning]
  1. https://arxiv.org/pdf/1509.08731v1.pdf
5. Neural Variational Inference and Learning in Belief Networks
  1. https://arxiv.org/pdf/1402.0030v2.pdf
6. Distribution Matching in Variational Inference [Also, Generative, Unsupervised Learning]
  1. https://arxiv.org/abs/1802.06847
Generative Models
The Cramer Distance as a Solution to Biased Wasserstein Gradients
1. https://arxiv.org/pdf/1705.10743.pdf
Variational Approaches for Auto-Encoding Generative Adversarial Networks
1. https://arxiv.org/pdf/1706.04987.pdf
Comparison of Maximum Likelihood and GAN-based training of Real NVPs
1. https://arxiv.org/pdf/1705.05263.pdf
Parallel Multiscale Autoregressive Density Estimation
1. https://arxiv.org/pdf/1703.03664.pdf
Conditional Image Generation with PixelCNN Decoders
1. https://arxiv.org/pdf/1606.05328.pdf
WaveNet: A Generative Model for Raw Audio
1. https://arxiv.org/pdf/1609.03499.pdf
Video Pixel Networks
1. https://arxiv.org/pdf/1610.00527.pdf
Learning in Implicit Generative Models
1. https://arxiv.org/pdf/1610.03483.pdf
Connecting Generative Adversarial Networks and Actor-Critic Methods [Also, Reinforcement Learning]
1. https://arxiv.org/pdf/1610.01945.pdf
Pixel Recurrent Neural Networks
1. https://arxiv.org/abs/1601.06759
One-Shot Generalization in Deep Generative Models
1. https://arxiv.org/abs/1603.05106
A Test of Relative Similarity for Model Selection in Generative Models
1. https://arxiv.org/pdf/1511.04581.pdf
DRAW: A Recurrent Neural Network for Image Generation [Also, Attention]
1. http://proceedings.mlr.press/v37/gregor15.pdf
Semi-Supervised Learning with Deep Generative Models
1. https://arxiv.org/abs/1406.5298
Deep AutoRegressive Networks
1. https://arxiv.org/abs/1310.8499
A Note on the Evaluation of Generative Models
1. https://arxiv.org/pdf/1511.01844v2.pdf
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
1. https://arxiv.org/abs/1711.10433
Efficient Neural Audio Synthesis (WaveRNN)
1. https://arxiv.org/abs/1802.08435
Learning and Querying Fast Generative Models for Reinforcement Learning
1. https://arxiv.org/abs/1802.03006
Unsupervised Learning
Unsupervised Learning of 3D Structure from Images [Also, Computer Vision]
1. https://arxiv.org/pdf/1607.00662.pdf
Early Visual Concept Learning with Unsupervised Deep Learning (beta-VAE)
1. https://arxiv.org/pdf/1606.05579.pdf
Neural Scene Representation and Rendering
1. http://science.sciencemag.org/content/360/6394/1204
Spectral Inference Networks: Unifying Spectral Methods with Deep Learning
1. https://arxiv.org/abs/1806.02215
Representation Learning
SCAN: Learning Abstract Hierarchical Compositional Visual Concepts
1. https://arxiv.org/pdf/1707.03389.pdf
Towards Conceptual Compression
1. https://arxiv.org/abs/1604.08772
Neural Discrete Representation Learning [Also, Unsupervised Learning]
1. https://arxiv.org/abs/1711.00937
Disentangling by Factorising
1. https://arxiv.org/abs/1802.05983
Associative Compression Networks for Representation Learning
1. https://arxiv.org/abs/1804.02476
Attention
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
1. https://arxiv.org/pdf/1603.08575.pdf
Reasoning about Entailment with Neural Attention [Also, Natural Language Processing]
1. https://arxiv.org/pdf/1509.06664v2.pdf
Multiple Object Recognition with Visual Attention
1. https://arxiv.org/pdf/1412.7755v2.pdf
Recurrent Models of Visual Attention
1. https://arxiv.org/abs/1406.6247
Memory
Neural Episodic Control
1. https://arxiv.org/pdf/1703.01988.pdf
Generative Temporal Models With Memory
1. https://arxiv.org/pdf/1702.04649.pdf
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
1. https://arxiv.org/pdf/1610.09027.pdf
Model-Free Episodic Control
1. https://arxiv.org/abs/1606.04460
One-Shot Learning with Memory-Augmented Neural Networks
1. https://arxiv.org/abs/1605.06065
Associative Long Short-Term Memory
1. https://arxiv.org/abs/1602.03032
Prioritized Experience Replay
1. https://arxiv.org/pdf/1511.05952v3.pdf
Sample Efficient Actor-Critic with Experience Replay
1. https://arxiv.org/pdf/1611.01224.pdf
Learning Efficient Algorithms with Hierarchical Attentive Memory [Also, attention]
1. https://arxiv.org/abs/1602.03218
Count-Based Frequency Estimation with Bounded Memory [Also, Natural Language Processing]
1. http://www.ijcai.org/Proceedings/15/Papers/470.pdf
Memory-based Parameter Adaptation
1. https://arxiv.org/abs/1802.10542
Multi-Agent Systems
Value Decomposition Networks For Cooperative Multi-Agent Learning
1. https://arxiv.org/pdf/1706.05296.pdf
Learning to Communicate with Deep Multi-Agent Reinforcement Learning [Also, Multi-Task RL]
1. https://arxiv.org/pdf/1605.06676v2.pdf
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning [Also, Game Theory]
1. https://arxiv.org/abs/1711.00832
Machine Theory of Mind
1. https://arxiv.org/abs/1802.07740
Imitation Learning
Robust Imitation of Diverse Behaviors
1. https://arxiv.org/pdf/1707.02747.pdf
Learning Human Behaviors from Motion Capture by Adversarial Imitation
1. https://arxiv.org/abs/1707.02201
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
1. https://arxiv.org/pdf/1707.08817.pdf
Reinforcement and Imitation Learning for Diverse Visuomotor Skills
1. https://arxiv.org/abs/1802.09564
Playing Hard Exploration Games by Watching YouTube
1. https://arxiv.org/abs/1805.11592
Metalearning
Learning to learn by gradient descent by gradient descent
1. https://arxiv.org/abs/1606.04474
Learning to Reinforcement Learn [Also, Reinforcement Learning]
1. https://arxiv.org/pdf/1611.05763.pdf
Hierarchical Representations for Efficient Architecture Search
1. https://arxiv.org/pdf/1711.00436.pdf
Population Based Training of Neural Networks
1. https://arxiv.org/abs/1711.09846
Meta-Gradient Reinforcement Learning
1. https://arxiv.org/abs/1805.09801
Neural Programming
1. Hybrid Computing Using a Neural Network with Dynamic External Memory
  1. https://www.nature.com/articles/nature20101.epdf?author_access_token=ImTXBI8aWbYxYQ51Plys8NRgN0jAjWel9jnR3ZoTv0MggmpDmwljGswxVdeocYSurJ3hxupzWuRNeGvvXnoO8o4jTJcnAyhGuZzXJ1GEaD-Z7E6X_a9R-xqJ9TfJWBqz
2. Programmable Agents [Also, Representation Learning]
  1. https://arxiv.org/pdf/1706.06383.pdf
3. Neural Programmer-Interpreters
  1. https://arxiv.org/pdf/1511.06279v3.pdf
4. Neural Random-Access Machines
  1. https://arxiv.org/pdf/1511.06392v3.pdf
5. Neural Turing Machines
  1. https://arxiv.org/abs/1410.5401
6. Learning Explanatory Rules from Noisy Data
  1. https://arxiv.org/abs/1711.04574
7. Synthesizing Programs for Images using Reinforced Adversarial Learning (SPIRAL)
  1. https://arxiv.org/abs/1804.01118
Evolution
Convolution by Evolution
1. https://arxiv.org/pdf/1606.02580.pdf
Game Theory
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
1. https://arxiv.org/pdf/1606.08718.pdf
The Mechanics of n-Player Differentiable Games [Also, Generative Models (GANs)]
1. https://arxiv.org/abs/1802.05642
Symmetric Decomposition of Asymmetric Games
1. https://www.nature.com/articles/s41598-018-19194-4
A Generalised Method for Empirical Game Theoretic Analysis
1. https://arxiv.org/abs/1803.06376
Inequity Aversion Resolves Intertemporal Social Dilemmas
1. https://arxiv.org/abs/1803.08884
Natural Language Processing
Generative and Discriminative Text Classification with Recurrent Neural Networks
1. https://arxiv.org/pdf/1703.01898.pdf
Learning to Compose Words Into Sentences with Reinforcement Learning
1. https://arxiv.org/pdf/1611.09100.pdf
Reference-Aware Language Models
1. https://arxiv.org/pdf/1611.01628.pdf
The Neural Noisy Channel
1. https://arxiv.org/pdf/1611.02554.pdf
Latent Predictor Networks for Code Generation
1. https://arxiv.org/pdf/1603.06744.pdf
Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning
1. https://arxiv.org/pdf/1605.03852.pdf
Semantic Parsing with Semi-Supervised Sequential Autoencoders
1. https://arxiv.org/pdf/1609.09315.pdf
On the State of the Art of Evaluation in Neural Language Models
1. https://arxiv.org/abs/1707.05589
Teaching Machines to Read and Comprehend
1. https://arxiv.org/pdf/1506.03340v1.pdf
Learning to Transduce with Unbounded Memory [Also, Memory, Neural Programming]
1. https://arxiv.org/pdf/1506.02516v1.pdf
Dependency Recurrent Neural Language Models for Sentence Completion
1. http://cs.nyu.edu/~mirowski/pub/MirowskiVlachos_ACL2015_DependencyTreeRNN.pdf
Towards End-to-End Speech Recognition with Recurrent Neural Networks
1. http://proceedings.mlr.press/v32/graves14.pdf
Learning Word Embeddings Efficiently with Noise-Contrastive Estimation
1. http://papers.nips.cc/paper/5165-learning-word-embeddings-efficiently-with-noise-contrastive-estimation.pdf
The NarrativeQA Reading Comprehension Challenge
1. https://arxiv.org/abs/1712.07040v1
Learning to Follow Language Instructions with Adversarial Reward Induction [Also, Loss Function Learning]
1. https://arxiv.org/abs/1806.01946
Multi-Modal
Look, Listen and Learn
1. https://arxiv.org/pdf/1705.08168.pdf
End-to-end Optimization of Goal-Driven and Visually Grounded Dialogue Systems
1. https://arxiv.org/pdf/1703.05423.pdf
GuessWhat?! Visual Object Discovery through Multi-Modal Dialogue
1. https://arxiv.org/pdf/1611.08481.pdf
Grounded Language Learning in a Simulated 3D World
1. https://arxiv.org/pdf/1706.06551.pdf
Understanding Grounded Language Learning Agents [Also, Natural Language Processing]
1. https://arxiv.org/abs/1710.09867
Objects that Sound
1. https://arxiv.org/abs/1712.06651
General Machine Learning
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
1. https://arxiv.org/pdf/1611.00712.pdf
Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees
1. https://arxiv.org/pdf/1702.08833.pdf
Unit Tests for Stochastic Optimization
1. https://arxiv.org/pdf/1312.6055v3.pdf
Bayesian Hierarchical Community Discovery
1. http://papers.nips.cc/paper/5048-bayesian-hierarchical-community-discovery.pdf
Implicit Reparameterization Gradients
1. https://arxiv.org/abs/1805.08498
Cleaning up the Neighborhood: A Full Classification for Adversarial Partial Monitoring
1. https://arxiv.org/abs/1805.09247
Theory
Online Learning with Gated Linear Networks
1. https://arxiv.org/abs/1712.01897v1
Miscellaneous
Generalized Probability Smoothing
1. https://arxiv.org/abs/1712.02151
Agents and Devices: A Relative Definition of Agency
1. https://arxiv.org/abs/1805.12387
Neuroscience
The Successor representation in human reinforcement learning
1. http://www.biorxiv.org/content/biorxiv/early/2017/07/04/083824.full.pdf
Dorsal Hippocampus Contributes to Model-Based Planning
1. https://www.nature.com/articles/nn.4613.epdf?author_access_token=OfuqRzRgBmFKQdGE1Qw7FdRgN0jAjWel9jnR3ZoTv0N3IprbEH8EVdPgVTpPLVjgaMNMGW_KprBhEEIm7f1drNjI5FB2fds3h58n3XEtMJPC3kLK1Pp3J2_Qb45cy7uk
Neuroscience-Inspired Artificial Intelligence
1. http://www.cell.com/neuron/fulltext/S0896-6273(17)30509-3
Computations Underlying Social Hierarchy Learning: Distinct Neural Mechanism for Updating and Representing Self-Relevant Information
1. http://www.cell.com/neuron/pdf/S0896-6273(16)30802-9.pdf
Dorsal Anterior Cingulate Cortex and the Value of Control
1. https://www.nature.com/articles/nn.4384.epdf?author_access_token=dq-w7RyWLn3z-0m4nbyTW9RgN0jAjWel9jnR3ZoTv0N7RVyemANPvboWSepiJaSAsTFiGqyORVbog9B6IjN113kC9aqMAEoNVCCfdRA4gLVJXcy4e1klhW0KiKS5F1gp
Semantic Representations in the Temporal Pole Predict False Memories
1. http://www.pnas.org/content/113/36/10180.abstract
Towards an Integration of Deep Learning and Neuroscience
1. http://www.biorxiv.org/content/early/2016/06/13/058545
What Learning Systems do Intelligent Agents Need? Complementary Learning Systems Theory Updated
1. http://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(16)30043-2
Neural Mechanisms of Hierarchical Planning in a Virtual Subway Network [Also, Planning]
1. http://www.cell.com/neuron/abstract/S0896-6273(16)30057-5
Predictive Representations can Link Model-Based Reinforcement Learning to Model-Free Mechanisms
1. https://www.biorxiv.org/content/early/2016/10/27/083857
Hippocampal place cells construct reward related sequences through unexplored space
1. https://elifesciences.org/articles/06063
A Probabilistic Approach to Demixing Odors
1. http://www.nature.com/neuro/journal/v20/n1/full/nn.4444.html
Approximate Hubel-Wiesel Modules and the Data Structures of Neural Computation
1. https://arxiv.org/pdf/1512.08457v1.pdf
The Future of Memory: Remembering, Imagining, and the Brain
1. http://static1.1.sqspcdn.com/static/f/1096238/22043246/1361990370157/FutureMemory--Neuron12.pdf?token=b5gB3ycz3e%2BmKnQQCW3%2FvwZyHwE%3D
Is the Brain a Good Model for Machine Intelligence?
1. http://www.gatsby.ucl.ac.uk/~demis/TuringSpecialIssue(Nature2012).pdf
Evidence Integration in Model-Based Tree Search
1. http://www.pnas.org/content/112/37/11708.full.pdf
(Commentary on) Building Machines that Learn and Think for Themselves
1. https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/article/building-machines-that-learn-and-think-for-themselves/E28DBFEC380D4189FB7754B50066A96F
Prefrontal Cortex as a Meta-Reinforcement Learning System
1. https://www.nature.com/articles/s41593-018-0147-8

Current Frontier:

Hierarchical planning
Imagination-based planning with generative models
Unsupervised Learning
Memory and one-shot learning
Abstract Concepts
Continual and Transfer Learning

Emphasis on systems neuroscience - using the brain as inspiration for the structure and function of algorithms.

Neuroscience Inspired Artificial Intelligence
Examples of previous success of neuro-inspiration:

Reinforcement Learning
- Inspired by animal learning
- TD Learning came out of animal behavior research.
- Second-order conditioning (Conditioned Stimulus) (Sutton and Barto, 1981)
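The TD learning idea noted above can be illustrated with a minimal tabular TD(0) sketch. The five-state random walk, the step size `alpha`, and the function name `td0_random_walk` are assumptions chosen for illustration; they are not taken from any paper listed here.

```python
import random


def td0_random_walk(num_states=5, episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values for a toy random walk with tabular TD(0)."""
    rng = random.Random(seed)
    # States 1..num_states are non-terminal; 0 and num_states + 1 are terminal.
    values = [0.0] * (num_states + 2)
    for _ in range(episodes):
        state = (num_states + 1) // 2  # every episode starts in the middle
        while 0 < state < num_states + 1:
            next_state = state + rng.choice((-1, 1))
            # Reward 1 only for leaving the chain on the right-hand side.
            reward = 1.0 if next_state == num_states + 1 else 0.0
            # TD error: bootstrapped target minus the current estimate.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[1:-1]


estimates = td0_random_walk()
print([round(v, 2) for v in estimates])
```

Each update moves the estimate for the current state toward `reward + gamma * values[next_state]`, so the agent learns from the difference between successive predictions rather than waiting for the final outcome.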
Deep Learning
- Convolutional Neural Networks. Visual Cortex (V1)
  - Uses hierarchical structure (successive processing layers)
  - Neurons in the early visual system respond strongly to specific patterns of light (say, precisely oriented bars) but hardly respond to many other patterns.
  - Gabor functions describe the weights in V1 cells.
  - Nonlinear Transduction
  - Divisive Normalization
- Word / Sentence Vectors - Distributed Embeddings
  - Parallel Distributed Processing in the brain for representation and computation
- Dropout
  - Stochasticity in neurons that fire with Poisson-like statistics (Hinton 2012)
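As a rough illustration of the note that Gabor functions describe V1 weights and that early visual neurons prefer specific oriented patterns, the sketch below builds a small Gabor kernel and compares its response to vertical and horizontal gratings. The parameter names (`wavelength`, `theta`, `sigma`, `gamma`, `phase`) follow a common textbook parameterisation and the numeric values are arbitrary assumptions.

```python
import math


def gabor_kernel(size=7, wavelength=4.0, theta=0.0, sigma=2.0, gamma=0.5, phase=0.0):
    """Return a size x size Gabor kernel as a list of rows."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + phase)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def response(kernel, patch):
    """Linear response: elementwise product of kernel and patch, summed."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))


size = 7
half = size // 2
kernel = gabor_kernel(size=size, wavelength=4.0, theta=0.0)
# Grating that varies along x (vertical bars) versus along y (horizontal bars).
vertical_bars = [[math.cos(2 * math.pi * x / 4.0) for x in range(-half, half + 1)]
                 for _ in range(size)]
horizontal_bars = [[math.cos(2 * math.pi * y / 4.0)] * size
                   for y in range(-half, half + 1)]
vertical_response = response(kernel, vertical_bars)
horizontal_response = response(kernel, horizontal_bars)
print(vertical_response, horizontal_response)
```

With `theta=0.0` the kernel is tuned to vertical bars, so the first response should be much larger in magnitude than the second; rotating `theta` by a quarter turn would swap the preference.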
Attention
- Applying attention to memory
- Thought - it doesn’t make much sense to train an attention model over a static image, rather than over a time series. With a time series, bringing attention to changing aspects of the input makes sense.
Multiple Memory Systems
- Episodic Memory
  - Experience Replay
  - Especially for one shot experiences
- Working Memory
  - LSTM - gating allows for conditioning on current state
- Long-term Memory
  - External Memory
  - Gating in LSTM
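The experience replay idea listed under episodic memory can be sketched as a fixed-size buffer of past transitions that is sampled uniformly at random. The class name `ReplayBuffer`, the transition layout, and the capacity are illustrative assumptions; prioritized variants sample non-uniformly and are not shown here.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest transition once full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Never ask for more items than are currently stored.
        batch_size = min(batch_size, len(self._storage))
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)


buffer = ReplayBuffer(capacity=5, seed=0)
for step in range(10):
    # Toy transitions: the state is just the step index.
    buffer.add(state=step, action=0, reward=0.0, next_state=step + 1, done=False)
batch = buffer.sample(3)
print(len(buffer), batch)
```

Sampling old transitions at random breaks the correlation between consecutive experiences, which is the main reason replay is used when training value-based agents.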
Continual Learning
- Elastic weight consolidation for slowing down learning on weights that are important for previous tasks.
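A minimal sketch of the elastic weight consolidation idea: a quadratic penalty, weighted per parameter by an importance estimate, pulls weights back toward their values after the previous task. The two-parameter setup, the hand-picked `fisher` importances, and the quadratic new-task loss are assumptions for illustration only.

```python
def ewc_penalty(params, anchor, fisher, strength):
    """Quadratic penalty: 0.5 * strength * sum_i fisher_i * (param_i - anchor_i)^2."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor, fisher)
    )


def ewc_penalty_grad(params, anchor, fisher, strength):
    """Gradient of the penalty with respect to each parameter."""
    return [strength * f * (p - a) for p, a, f in zip(params, anchor, fisher)]


def train_new_task(anchor, fisher, target, strength=1.0, lr=0.05, steps=200):
    """Gradient descent on a toy new-task loss plus the EWC penalty."""
    params = list(anchor)
    for _ in range(steps):
        # Toy new-task loss: 0.5 * sum_i (param_i - target_i)^2.
        task_grad = [p - t for p, t in zip(params, target)]
        pen_grad = ewc_penalty_grad(params, anchor, fisher, strength)
        params = [p - lr * (g + h) for p, g, h in zip(params, task_grad, pen_grad)]
    return params


# First weight is marked important for the old task, second is not.
final_params = train_new_task(anchor=[1.0, 1.0], fisher=[10.0, 0.0], target=[0.0, 0.0])
print(final_params)
```

In this toy run the important weight stays close to its old value while the unimportant one is free to move to the new task's optimum, which is the slowing-down effect described above.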

Examples of future success:

Intuitive Understanding of Physics
- Need to understand space, number, objectness
- Need to disentangle representations for transfer. (Dude, I feel so stolen from)
Efficient Learning (Learning from few examples)
Transfer Learning
- Transferring generalized knowledge gained in one context to novel domains
- Concept representations for transfer
  - No direct evidence of concept representations in brains
Imagination and Planning
- Toward model-based RL
- Internal model of the environment
  - Model needs to include compositional / disentangled representations for flexibility
- Implementing a forecast-based method of action selection
- Monte Carlo Tree Search as simulation-based planning
- In rat brains, we observe ‘preplay’ where rats imagine the likely future experience - measured by comparing neural activations at preplay to activations during the activity
- Generalization + Transfer in human planning
- Hierarchical Planning
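Simulation-based planning with an internal model can be sketched with a flat Monte Carlo planner: for each candidate action, roll the model forward with random follow-up actions and pick the action with the best average return. This is a deliberate simplification of Monte Carlo Tree Search (no tree, no UCB selection), and the line-world model `model_step`, the goal position, and all parameters are hypothetical.

```python
import random

GOAL = 5  # hypothetical goal position on a line of states 0..GOAL


def model_step(state, action):
    """Assumed known model: move left or right, reward 1 on reaching GOAL."""
    next_state = max(0, min(GOAL, state + action))
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def rollout_return(state, first_action, depth, gamma, rng):
    """Discounted return of one simulated rollout starting with first_action."""
    total, discount = 0.0, 1.0
    action = first_action
    for _ in range(depth):
        state, reward, done = model_step(state, action)
        total += discount * reward
        if done:
            break
        discount *= gamma
        action = rng.choice((-1, 1))  # random rollout policy after the first step
    return total


def plan(state, num_rollouts=200, depth=10, gamma=0.9, seed=0):
    """Score each action by its average simulated return and pick the best."""
    rng = random.Random(seed)
    scores = {}
    for action in (-1, 1):
        returns = [rollout_return(state, action, depth, gamma, rng)
                   for _ in range(num_rollouts)]
        scores[action] = sum(returns) / num_rollouts
    best_action = max(scores, key=scores.get)
    return best_action, scores


best_action, scores = plan(state=4)
print(best_action, scores)
```

A full tree search would reuse statistics across simulations and focus rollouts on promising branches; the flat version only shows how imagined trajectories from a model can drive action selection.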
Virtual Brain Analytics

Source: Original Google Doc