Google Brain Research Overview

By Jeremy Nixon. Nov 2017. Updated June 2018.

Overview

Categorization of Breakthroughs / Contents
Major / Minor Researchers List (All appearing on papers)
Genealogy
Sorted Researchers by Paper Count
Deep Learning
1. Scalability and Speed
2. Convolutional Neural Networks
3. Recurrent Neural Networks
4. Privacy
5. Understanding / Theory
6. Regularization
Applications
1. Speech Recognition
2. Image Categorization
3. Image Captioning
4. Machine Translation
5. Natural Language Understanding
6. Multi-Modal
7. Pedestrian Detection
8. Grasp Detection
9. Go
10. Video
11. Dialogue
12. 3D Object Reconstruction
13. Speaker Verification
14. Health Care
15. Theorem Proving
16. Music
17. Pose Estimation
18. Speech Generation
19. Super Resolution
20. Chemistry
21. Robotics
22. Autonomous Vehicles
23. Physics
24. Device Placement
25. Games
26. Art
Unsupervised Learning
Attention
Memory
Transfer Learning
Representation Learning
Reinforcement Learning
1. Model-Based Reinforcement Learning
2. Multi-Task Learning
Metalearning
1. Neural Programming
2. Hyperparameter Optimization
Generative
GANs
Interpretability
Tools, Environments & Datasets
Adversarial Examples
Multi-Agent Systems
Variational Inference
Kernel Machines
Collaborative Filtering
Graphical / Relational Learning
Miscellaneous
Deep Learning
1. Scalability and Speed
  1. Large Scale Distributed Deep Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40565.pdf
  2. Multiframe Deep Neural Networks for Acoustic Modeling
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40810.pdf
  3. Asynchronous Stochastic Optimization for Sequence Training of Deep Neural Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42248.pdf
  4. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45166.pdf
  5. Distilling the Knowledge in a Neural Network
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44873.pdf
  6. Deep Networks with Large Output Spaces
    1. https://arxiv.org/pdf/1412.7479.pdf
  7. TensorFlow: A System for Large-Scale Machine Learning
    1. https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf
  8. Revisiting Distributed Synchronous SGD
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45187.pdf
  9. Depthwise Separable Convolutions for Neural Machine Translation
    1. https://arxiv.org/abs/1706.03059
  10. Large Scale Distributed Neural Network Training Through Online Distillation
    1. https://openreview.net/pdf?id=rkr1UDeC-
  11. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
    1. https://openreview.net/pdf?id=SkhQHMW0W
2. Convolutional Neural Networks
  1. Going Deeper with Convolutions [Inception]
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43022.pdf
  2. Rethinking the Inception Architecture for Computer Vision
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44903.pdf
  3. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45169.pdf
  4. Towards Understanding the Invertibility of Convolutional Neural Networks
    1. https://arxiv.org/pdf/1705.08664.pdf
3. Recurrent Neural Networks
  1. Sequence to Sequence Learning with Neural Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43155.pdf
  2. Sequence Discriminative Distributed Training of Long Short-Term Memory Recurrent Neural Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42547.pdf
  3. Recurrent Neural Network Regularization [Also, Language Modeling]
    1. https://arxiv.org/abs/1409.2329
  4. Semi-supervised Sequence Learning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44267.pdf
  5. Learning to Execute
    1. https://research.google.com/pubs/pub45474.html
  6. An Empirical Exploration of Recurrent Network Architectures
    1. https://research.google.com/pubs/pub45473.html
  7. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44961.pdf
  8. Using Fast Weights to Attend to the Recent Past
    1. https://arxiv.org/pdf/1610.06258.pdf
  9. Unsupervised Pre-training for Sequence to Sequence Learning
    1. https://arxiv.org/pdf/1611.02683.pdf
  10. Order Matters: Sequence to Sequence for Sets
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44871.pdf
  11. Multi-Task Sequence to Sequence Learning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44928.pdf
  12. Generating Sentences from a Continuous Space
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45404.pdf
  13. Exponential expressivity in deep neural networks through transient chaos
    1. https://arxiv.org/pdf/1606.05340.pdf
  14. An Online Sequence-to-Sequence Model Using Partial Conditioning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45167.pdf
  15. A Neural Transducer
    1. https://arxiv.org/pdf/1511.04868.pdf
  16. Tuning Recurrent Neural Networks with Reinforcement Learning
    1. https://openreview.net/pdf?id=Syyv2e-Kx
  17. Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
    1. https://arxiv.org/pdf/1611.02796.pdf
  18. SGD Learns the Conjugate Kernel Class of the Network
    1. https://arxiv.org/pdf/1702.08503.pdf
  19. Learning Hierarchical Information Flow with Recurrent Neural Modules
    1. https://arxiv.org/pdf/1706.05744.pdf
  20. Latent Sequence Decompositions
    1. https://arxiv.org/pdf/1610.03035.pdf
  21. Capacity and Trainability in Recurrent Neural Networks
    1. https://openreview.net/pdf?id=BydARw9ex
  22. Initialization Matters: Orthogonal Predictive State Recurrent Neural Networks
    1. https://openreview.net/pdf?id=HJJ23bW0b
4. Privacy
  1. Deep Learning with Differential Privacy
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45428.pdf
  2. Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data
    1. https://arxiv.org/pdf/1610.05755.pdf
  3. Glimmers: Resolving the Privacy / Trust Quagmire
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46128.pdf
  4. Scalable Private Learning with PATE
    1. https://arxiv.org/pdf/1802.08908.pdf
  5. Learning Differentially Private Recurrent Language Models [Also, Language Modeling]
    1. https://openreview.net/pdf?id=BJ0hF1Z0b
5. Understanding / Theory
  1. Qualitatively Characterizing Neural Network Optimization Problems
    1. https://arxiv.org/pdf/1412.6544.pdf
  2. Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
    1. https://arxiv.org/pdf/1602.05897.pdf
  3. Understanding Deep Learning Requires Rethinking Generalization
    1. https://arxiv.org/pdf/1611.03530.pdf
  4. Sharp Minima Can Generalize for Deep Nets
    1. https://arxiv.org/pdf/1703.04933.pdf
  5. On the Expressive Power of Deep Neural Networks
    1. https://arxiv.org/pdf/1606.05336.pdf
  6. Nonlinear Random Matrix Theory for Deep Learning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46342.pdf
  7. Mean Field Residual Networks: On the Edge of Chaos
    1. http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos.pdf
  8. Identity Matters in Deep Learning
    1. https://arxiv.org/pdf/1611.04231.pdf
  9. Geometry of Neural Network Loss Surfaces via Random Matrix Theory
    1. http://proceedings.mlr.press/v70/pennington17a/pennington17a.pdf
  10. Explaining the Learning Dynamics of Direct Feedback Alignment
    1. https://openreview.net/pdf?id=HkXKUTVFl
  11. Deep Information Propagation
    1. https://openreview.net/pdf?id=H1W1UN9gg
  12. The Emergence of Spectral Universality in Deep Networks
    1. https://arxiv.org/pdf/1802.09979.pdf
  13. Sensitivity and Generalization in Neural Networks: An Empirical Study
    1. https://arxiv.org/pdf/1802.08760.pdf
  14. Gradient Descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks
    1. https://arxiv.org/pdf/1802.06093.pdf
  15. Deep Neural Networks as Gaussian Processes
    1. https://openreview.net/pdf?id=B1EA-M-0Z
  16. A Bayesian Perspective on Generalization and Stochastic Gradient Descent
    1. https://openreview.net/pdf?id=BJij4yg0Z
6. Regularization
  1. Adding Gradient Noise Improves Learning for Very Deep Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45137.pdf
  2. Surprising Properties of Dropout in Deep Networks
    1. http://www.phillong.info/publications/HL17_deep_dropout.pdf
  3. Regularizing Neural Networks by Penalizing Confident Output Distributions
    1. https://arxiv.org/pdf/1701.06548.pdf
  4. A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization
    1. https://arxiv.org/pdf/1706.06569.pdf
7. Training Highly Multiclass Classifiers
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41872.pdf
8. Random Walk Initialization for Training Very Deep Feedforward Networks
  1. https://arxiv.org/pdf/1412.6558.pdf
9. Learning Factored Representations in a Deep Mixture of Experts
  1. https://arxiv.org/pdf/1312.4314.pdf
10. Training Deep Neural Networks on Noisy Labels with Bootstrapping
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43273.pdf
11. Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43455.pdf
12. Reward Augmented Maximum Likelihood for Neural Structured Prediction
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45580.pdf
13. MuProp: Unbiased Backpropagation for Stochastic Neural Networks
  1. https://arxiv.org/pdf/1511.05176v3.pdf
14. Chained predictions using convolutional neural networks
  1. https://research.google.com/pubs/pub45945.html
15. Training a Subsampling Mechanism in Expectation
  1. https://openreview.net/pdf?id=BJBkkaNYe
16. Resurrecting the Sigmoid in deep learning through dynamical isometry: theory and practice
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46341.pdf
17. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
  1. https://openreview.net/pdf?id=B1ckMDqlg
18. On Blackbox Backpropagation and Jacobian Sensing
  1. https://research.google.com/pubs/pub46347.html
19. Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
  1. https://arxiv.org/pdf/1703.04363.pdf
20. Critical Hyper-Parameters: No Random, No Cry
  1. https://arxiv.org/pdf/1706.03200.pdf
21. Distilling a Neural Network into a Soft Decision Tree
  1. https://arxiv.org/pdf/1711.09784.pdf
22. Categorical Reparameterization with Gumbel-Softmax
  1. https://arxiv.org/pdf/1611.01144.pdf
23. Training Confidence-Calibrated Classifiers For Detecting Out-of-Distribution Samples
  1. https://openreview.net/pdf?id=ryiAv2xAZ
24. Fidelity-Weighted Learning
  1. https://openreview.net/pdf?id=B1X0mzZCW
25. Don’t Decay the Learning Rate, Increase the Batch Size
  1. https://openreview.net/pdf?id=B1Yy1BxCZ
Applications
1. Speech Recognition
  1. Deep Neural Networks for Acoustic Modeling in Speech Recognition
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/38131.pdf
  2. Application of Pre-trained Deep Neural Networks to Large Vocabulary Speech Recognition
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/38130.pdf
  3. On Rectified Linear Units for Speech Processing
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40811.pdf
  4. Multilingual Acoustic Models Using Distributed Deep Neural Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40807.pdf
  5. An Empirical Study of Learning Rates in DNNs for Speech Recognition
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40808.pdf
  6. Word Embeddings for Speech Recognition
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42543.pdf
  7. Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42947.pdf
  8. Learning the Speech Front-end with Raw Waveform CLDNNs
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43960.pdf
  9. Acoustic Modeling for Google Home
    1. http://www.cs.cmu.edu/~chanwook/MyPapers/b_li_interspeech_2017.pdf
  10. Multilingual Speech Recognition With a Single End-to-End Model
    1. https://arxiv.org/pdf/1711.01694.pdf
  11. An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model
    1. https://arxiv.org/pdf/1712.01996.pdf
2. Image Classification
  1. Using Web Co-occurrence Statistics for Improving Image Categorization
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42244.pdf
  2. The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition
    1. https://arxiv.org/pdf/1511.06789.pdf
3. Image Captioning
  1. Grounded Compositional Semantics for Finding and Describing Images with Sentences
    1. https://nlp.stanford.edu/~socherr/SocherKarpathyLeManningNg_TACL2013.pdf
  2. Show and Tell: A Neural Image Caption Generator
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43274.pdf
  3. Learning Semantic Relationships for Better Action Retrieval in Images
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43443.pdf
4. Machine Translation
  1. Exploiting Similarities among Languages for Machine Translation
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44931.pdf
  2. Addressing the Rare Word Problem in Neural Machine Translation
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44929.pdf
  3. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
    1. https://arxiv.org/abs/1609.08144
  4. Sequence-to-Sequence Models Can Directly Translate Foreign Speech
    1. https://arxiv.org/pdf/1703.08581.pdf
  5. Massive Exploration of Neural Machine Translation Architectures
    1. https://arxiv.org/pdf/1703.03906.pdf
5. Natural Language Understanding
  1. Efficient Estimation of Word Representations in Vector Space
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41224.pdf
  2. Distributed Representations of Words and Phrases and Their Compositionality
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44876.pdf
  3. Zero-Shot Learning by Convex Combination of Semantic Embeddings
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42371.pdf
  4. Distributed Representations of Sentences and Documents
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44930.pdf
  5. Sentence Compression by Deletion with LSTMs
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43852.pdf
  6. Grammar as a Foreign Language
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43799.pdf
  7. BilBOWA: Fast Bilingual Distributed Representations without Word Alignments
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45190.pdf
  8. Multilingual Language Processing From Bytes
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45170.pdf
  9. Exploring the Limits of Language Modeling
    1. https://arxiv.org/pdf/1602.02410.pdf
  10. Towards better decoding and language model integration in sequence to sequence models
    1. https://arxiv.org/pdf/1612.02695.pdf
  11. Learning to Skim Text
    1. https://arxiv.org/pdf/1704.06877.pdf
  12. Get To The Point: Summarization with Pointer-Generator Networks
    1. https://arxiv.org/pdf/1704.04368.pdf
  13. Generating Wikipedia by Summarizing Long Sequences
    1. https://openreview.net/pdf?id=Hyg0vbWC-
  14. An Efficient Framework for Learning Sentence Representations
    1. https://openreview.net/pdf?id=rJvJXZb0W
6. Multi-Modal
  1. DeViSE: A Deep Visual-Semantic Embedding Model
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41869.pdf
  2. Modulating Early Visual Processing by Language
    1. https://arxiv.org/pdf/1707.00683.pdf
  3. Context-aware Captions from Context-agnostic Supervision
    1. http://openaccess.thecvf.com/content_cvpr_2017/papers/Vedantam_Context-Aware_Captions_From_CVPR_2017_paper.pdf
  4. Better Text Understanding Through Image-To-Text Transfer
    1. https://arxiv.org/pdf/1705.08386.pdf
7. Pedestrian Detection
  1. Real Time Pedestrian Detection with Deep Network Cascades
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43850.pdf
  2. Pedestrian Detection with a Large Field-Of-View Deep Network
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43849.pdf
8. Grasp Detection
  1. Real-Time Grasp Detection Using Convolutional Neural Networks
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43875.pdf
9. Go
  1. Move Evaluation in Go Using Deep Convolutional Neural Networks
    1. https://arxiv.org/pdf/1412.6564.pdf
  2. Mastering the game of Go with deep neural networks and tree search
    1. https://www.nature.com/articles/nature16961
10. Video
  1. Beyond Short Snippets: Deep Networks for Video Classification
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43793.pdf
11. Dialogue
  1. A Neural Conversational Model
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44925.pdf
  2. Smart Reply: Automated Response Suggestion for Email
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45189.pdf
  3. Adversarial Evaluation of Dialogue Models
    1. https://arxiv.org/pdf/1701.08198.pdf
  4. Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models
    1. https://arxiv.org/pdf/1701.03185.pdf
12. 3D Object Reconstruction
  1. https://arxiv.org/pdf/1612.00814.pdf
13. Speaker Verification
  1. End-to-End Text-Dependent Speaker Verification
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44681.pdf
14. Health Care
  1. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45732.pdf
15. Theorem Proving
  1. DeepMath - Deep Sequence Models for Premise Selection
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45402.pdf
  2. Deep Network Guided Proof Search
    1. https://arxiv.org/pdf/1701.06972.pdf
16. Music
  1. Audio Deepdream: Optimizing Raw Audio with Convolutional Networks
    1. https://18798-presscdn-pagely.netdna-ssl.com/ismir2016/wp-content/uploads/sites/2294/2016/08/ardila-audio.pdf
  2. Generating Music by Fine-Tuning Recurrent Neural Networks with Reinforcement Learning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45871.pdf
  3. Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
    1. https://arxiv.org/pdf/1704.01279.pdf
17. Pose Estimation
  1. Towards Accurate Multi-person Pose Estimation in the Wild
    1. https://arxiv.org/pdf/1701.01779.pdf
18. Speech Generation
  1. Tacotron: Towards End-to-End Speech Synthesis
    1. https://arxiv.org/pdf/1703.10135.pdf
  2. RNN Approaches to Text Normalization: A Challenge
    1. https://arxiv.org/pdf/1611.00068.pdf
  3. On Using Backpropagation for Speech Texture Generation and Voice Conversion
    1. https://arxiv.org/pdf/1712.08363.pdf
  4. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions [Tacotron 2]
    1. https://arxiv.org/pdf/1712.05884.pdf
19. Super Resolution
  1. Pixel Recursive Super Resolution
    1. https://arxiv.org/pdf/1702.00783.pdf
20. Chemistry
  1. Neural Message Passing for Quantum Chemistry
    1. https://arxiv.org/pdf/1704.01212.pdf
21. Robotics
  1. Learning Robotic Manipulation of Granular Media
    1. https://arxiv.org/pdf/1709.02833.pdf
  2. Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection
    1. https://drive.google.com/file/d/0B0mFoBMu8f8BaHYzOXZMdzVOalU/view
  3. End-to-End Learning of Semantic Grasping
    1. https://arxiv.org/pdf/1707.01932.pdf
  4. Cognitive Mapping and Planning for Visual Navigation
    1. https://arxiv.org/pdf/1702.03920.pdf
  5. Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
    1. https://arxiv.org/pdf/1709.07857.pdf
22. Autonomous Vehicles
  1. Learning with Proxy Supervision for End-To-End Visual Learning
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45985.pdf
23. Physics
  1. Accelerating Eulerian Fluid Simulation with Convolutional Networks
    1. https://arxiv.org/pdf/1607.03597.pdf
24. Device Placement
  1. Device Placement Optimization with Reinforcement Learning
    1. https://arxiv.org/pdf/1706.04972.pdf
  2. A Hierarchical Model for Device Placement
    1. https://openreview.net/pdf?id=Hkc-TeZ0W
25. Games
  1. Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?
    1. https://arxiv.org/pdf/1711.02301.pdf
26. Art
  1. A Neural Representation of Sketch Drawings
    1. https://arxiv.org/pdf/1704.03477.pdf
Unsupervised Learning
1. Building High-level Features Using Large Scale Unsupervised Learning
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/38115.pdf
2. Towards Principled Unsupervised Learning
  1. https://arxiv.org/pdf/1511.06440.pdf
3. Time-Contrastive Networks: Self-Supervised Learning from Video
  1. https://arxiv.org/pdf/1704.06888.pdf
4. Stochastic Variational Video Prediction [Also, Model-Based RL]
  1. https://arxiv.org/pdf/1710.11252.pdf
5. Short and Deep: Sketching Neural Networks
  1. https://openreview.net/pdf?id=r1br_2Kge
6. Geometry-Based Next Frame Prediction from Monocular Video
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45984.pdf
7. Decomposing Motion and Content for Natural Video Sequence Prediction
  1. https://sites.google.com/a/umich.edu/rubenevillegas/iclr2017
8. Cross-View Training for Semi-Supervised Learning
  1. https://openreview.net/forum?id=BJubPWZRW
Attention
1. On Learning Where to Look
  1. https://arxiv.org/pdf/1405.5488.pdf
2. Pointer Networks
  1. https://arxiv.org/pdf/1506.03134.pdf
3. Attention for Fine-Grained Categorization
  1. https://arxiv.org/pdf/1412.7054.pdf
4. Listen, Attend and Spell
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44926.pdf
5. Collective Entity Resolution with Multi-Focal Attention
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45395.pdf
6. Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
  1. https://arxiv.org/pdf/1603.08575.pdf
7. Online and Linear-Time Attention by Enforcing Monotonic Alignments
  1. https://research.google.com/pubs/pub46110.html
8. Learning Hard Alignments with Variational Inference [Hard Attention]
  1. https://arxiv.org/pdf/1705.05524.pdf
9. Efficient Attention using a Fixed-Size Memory Representation
  1. https://arxiv.org/pdf/1707.00110.pdf
10. Attention is All You Need
  1. https://arxiv.org/pdf/1706.03762.pdf
11. An Analysis of “Attention” in Sequence-to-Sequence Models
  1. http://www.isca-speech.org/archive/Interspeech_2017/pdfs/0232.PDF
12. Monotonic Chunkwise Attention
  1. https://openreview.net/pdf?id=Hko85plCW
Memory
1. Learning to Remember Rare Events
  1. https://openreview.net/pdf?id=SJTQLdqlg
Transfer Learning
1. Net2Net: Accelerating Learning via Knowledge Transfer
  1. https://arxiv.org/pdf/1511.05641.pdf
2. Domain Separation Networks
  1. https://arxiv.org/pdf/1608.06019.pdf
3. Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks
  1. https://arxiv.org/pdf/1612.05424.pdf
4. PathNet: Evolution Channels Gradient Descent in Super Neural Networks
  1. https://arxiv.org/pdf/1701.08734.pdf
5. One Model to Learn Them All
  1. https://arxiv.org/pdf/1706.05137.pdf
6. Exploring the structure of a real-time, arbitrary neural artistic stylization network
  1. https://arxiv.org/pdf/1705.06830.pdf
7. A Brief Study of In-Domain Transfer and Learning from Fewer Samples using a Few Simple Priors
  1. https://arxiv.org/pdf/1707.03979.pdf
Representation Learning
1. SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability
  1. https://arxiv.org/pdf/1706.05806.pdf
2. A Learned Representation for Artistic Style
  1. https://arxiv.org/pdf/1610.07629.pdf
3. Learning Latent Permutations with Gumbel-Sinkhorn Networks
  1. https://openreview.net/pdf?id=Byt3oJ-0W
Reinforcement Learning
1. Model-Based Reinforcement Learning
  1. Unsupervised Learning for Physical Interaction through Video Prediction [Also, Robotics]
    1. https://arxiv.org/pdf/1605.07157.pdf
  2. Continuous Deep Q-Learning with Model-based Acceleration
    1. http://proceedings.mlr.press/v48/gu16.pdf
  3. Value Prediction Network
    1. https://arxiv.org/pdf/1707.03497.pdf
  4. Learning to Generate Long-term Future via Hierarchical Prediction
    1. https://arxiv.org/pdf/1704.05831.pdf
  5. Discrete Sequential Prediction of Continuous Actions for Deep RL
    1. https://arxiv.org/pdf/1705.05035.pdf
  6. Deep Visual Foresight for Planning Robot Motion [Also, Robotics]
    1. https://arxiv.org/pdf/1610.00696.pdf
  7. Temporal Difference Models: Model-Free Deep RL for Model-Based Control
    1. https://openreview.net/pdf?id=Skw0n-W0Z
  8. Learning Unsupervised Latent Dynamics Models for Multi-task Continuous Control from Pixels
    1. https://drive.google.com/file/d/1HWDyhEUpVgAiSSQtEYEGb6EIeY8YQay8/view
2. Multi-Task Learning
  1. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
    1. https://arxiv.org/pdf/1706.05064.pdf
3. Unsupervised Perceptual Rewards for Imitation Learning
  1. https://openreview.net/pdf?id=Byf3mmNFl
4. Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
  1. https://arxiv.org/pdf/1707.01891.pdf
5. Robust Adversarial Reinforcement Learning [Also, Multi-Agent Systems]
  1. https://arxiv.org/pdf/1703.02702.pdf
6. REBAR: Low-Variance, unbiased gradient estimates for discrete latent variable models
  1. https://arxiv.org/pdf/1703.07370.pdf
7. Q-Prop: Sample Efficient Policy Gradient with an Off-Policy Critic
  1. https://arxiv.org/pdf/1611.02247.pdf
8. Particle Value Functions
  1. https://openreview.net/pdf?id=BJyBKyHKg
9. Path Integral Guided Policy Search [Also, Robotics]
  1. https://arxiv.org/pdf/1610.00529.pdf
10. Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
  1. https://arxiv.org/pdf/1706.00387.pdf
11. Improving Policy Gradient by Exploring Under-Appreciated Rewards
  1. https://openreview.net/pdf?id=ryT4pvqll
12. Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
  1. https://arxiv.org/pdf/1610.00633.pdf
13. Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search
  1. https://arxiv.org/pdf/1610.00673.pdf
14. Changing Model Behavior at Test Time Using Reinforcement Learning
  1. https://arxiv.org/pdf/1702.07780.pdf
15. Bridging the Gap Between Value and Policy Based Reinforcement Learning
  1. https://arxiv.org/pdf/1702.08892.pdf
16. A comparative study of counterfactual estimators
  1. https://arxiv.org/pdf/1704.00773.pdf
17. PRM-RL: Long Range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning
  1. https://arxiv.org/pdf/1710.03937.pdf
18. Path Consistency Learning in Tsallis Entropy Regularized MDPs
  1. https://arxiv.org/abs/1802.03501
19. Leave No Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
  1. https://openreview.net/pdf?id=S1vuO-bCW
20. Deep Bayesian Bandits Showdown
  1. https://openreview.net/pdf?id=SyYe6k-CW
Metalearning
1. Neural Programming
  1. Reinforcement Learning Neural Turing Machines
    1. https://research.google.com/pubs/pub45478.html
  2. Neural Random-Access Machines
    1. https://research.google.com/pubs/pub45472.html
  3. Neural Programmer: Inducing Latent Programs with Gradient Descent
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/44927.pdf
  4. Neural GPUs Learn Algorithms
    1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45139.pdf
  5. Learning a Natural Language Interface with Neural Programmer
    1. https://arxiv.org/pdf/1611.08945.pdf
2. Hyperparameter Optimization
  1. Toward Optimal Run Racing: Application to Deep Learning Calibration
    1. https://arxiv.org/pdf/1706.03199.pdf
  2. Searching for Activation Functions
    1. https://arxiv.org/pdf/1710.05941.pdf
  3. Neural Optimizer Search with Reinforcement Learning
    1. https://arxiv.org/pdf/1709.07417.pdf
  4. Neural Combinatorial Optimization with Reinforcement Learning
    1. https://openreview.net/pdf?id=Bk9mxlSFx
  5. Neural Architecture Search with Reinforcement Learning
    1. https://arxiv.org/pdf/1611.01578.pdf
  6. Large-Scale Evolution of Image Classifiers
    1. https://arxiv.org/pdf/1703.01041.pdf
3. Learned Optimizers that Scale and Generalize
  1. https://arxiv.org/pdf/1703.04813.pdf
4. HyperNetworks
  1. https://openreview.net/pdf?id=rkpACe1lx
5. Supervised Learning of Unsupervised Learning Rules
  1. http://metalearning.ml/papers/metalearn17_metz.pdf
6. MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
  1. https://arxiv.org/pdf/1711.06798.pdf
7. A Meta-Learning Perspective on Cold-Start Recommendations for Items
  1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46346.pdf
8. Meta-Learning for Semi-Supervised Few-Shot Classification
  1. https://openreview.net/pdf?id=HJcSzz-CZ
9. Generalizing Hamiltonian Monte Carlo with Neural Networks
  1. https://openreview.net/pdf?id=B1n8LexRZ
Generative
GANs
1. Improved Generator Objectives for GANs
  1. https://arxiv.org/pdf/1612.02780.pdf
2. Unrolled Generative Adversarial Networks
  1. https://openreview.net/pdf?id=BydrOIcle
3. Improving Image Generative Models with Human Interactions
  1. https://arxiv.org/pdf/1709.10459.pdf
4. Conditional Image Synthesis with Auxiliary Classifier GANs
  1. https://arxiv.org/pdf/1610.09585.pdf
5. Are GANs Created Equal? A Large-Scale Study
  1. https://arxiv.org/pdf/1711.10337.pdf
6. AdaGAN: Boosting Generative Models
  1. https://arxiv.org/pdf/1701.02386.pdf
7. MaskGAN: Better Text Generation Via Filling in the _____
  1. https://openreview.net/pdf?id=ByOExmWAb
8. Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence at Every Step
  1. https://openreview.net/pdf?id=ByQpn1ZA-
Experiments in Handwriting with a Neural Network
1. https://distill.pub/2016/handwriting/
From optimal transport to generative modeling: the VEGAN cookbook
1. https://arxiv.org/pdf/1705.07642.pdf
Density Estimation Using Real NVP
1. https://arxiv.org/pdf/1605.08803.pdf
A Neural Representation of Sketch Drawings
1. https://arxiv.org/pdf/1704.03477.pdf
Wasserstein Auto-Encoders
1. https://arxiv.org/pdf/1711.01558.pdf
Stochastic Variational Video Prediction
1. https://openreview.net/pdf?id=rk49Mg-CW
Latent Constraints: Learning to Generate Conditionally From Unconditional Generative Models
1. https://openreview.net/pdf?id=Sy8XvGb0-
Interpretability
Deconvolution and Checkerboard Artifacts
1. https://distill.pub/2016/deconv-checkerboard/
Visualizing Dataflow Graphs of Deep Learning Models in Tensorflow
1. http://idl.cs.washington.edu/files/2018-TensorFlowGraph-VAST.pdf
Towards A Rigorous Science of Interpretable Machine Learning
1. https://arxiv.org/pdf/1702.08608.pdf
The (Un)reliability of Saliency Methods
1. https://arxiv.org/pdf/1711.00867.pdf
Input Switched Affine Networks: An RNN Architecture Designed for Interpretability [Also, Recurrent Neural Networks]
1. https://arxiv.org/pdf/1611.09434.pdf
VisualBackProp: Efficient Visualization of CNNs
1. https://arxiv.org/abs/1611.05418
Learning How to Explain Neural Networks: PatternNet and PatternAttribution
1. https://openreview.net/pdf?id=Hkn7CBaTW
Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs [Also, Recurrent Neural Networks]
1. https://openreview.net/pdf?id=rkRwGg-0Z
Tools, Environments & Datasets
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41880.pdf
Adversarial Examples
Intriguing Properties of Neural Networks
1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42503.pdf
Explaining and Harnessing Adversarial Examples
1. https://arxiv.org/pdf/1412.6572.pdf
Virtual Adversarial Training for Semi-Supervised Text Classification
1. https://research.google.com/pubs/pub45403.html
The Space of Transferable Adversarial Examples
1. https://arxiv.org/pdf/1704.03453.pdf
Adversarial Examples in the Physical World
1. https://arxiv.org/pdf/1607.02533.pdf
Adversarial Training Methods for Semi-Supervised Text Classification
1. https://arxiv.org/pdf/1605.07725.pdf
Adversarial Machine Learning at Scale
1. https://arxiv.org/pdf/1611.01236.pdf
Thermometer Encoding: One Hot Way to Resist Adversarial Examples
1. https://openreview.net/pdf?id=S18Su--CW
Intriguing Properties of Adversarial Examples
1. https://arxiv.org/pdf/1711.02846.pdf
Ensemble Adversarial Training: Attacks and Defenses
1. https://openreview.net/pdf?id=rkZvSe-RZ
Adversarial Spheres
1. https://arxiv.org/pdf/1801.02774.pdf
Multi-Agent Systems
Learning to Protect Communications with Adversarial Neural Cryptography
1. https://arxiv.org/pdf/1610.06918.pdf
Adversarial Autoencoders
1. https://arxiv.org/pdf/1511.05644.pdf
XGAN: Unsupervised Image-To-Image Translation for Many-To-Many Mappings
1. https://arxiv.org/pdf/1711.05139.pdf
Supervision via Competition: Robot Adversaries for Learning Tasks
1. https://arxiv.org/pdf/1610.01685.pdf
Variational Inference
Variational Boosting: Iteratively Refining Posterior Approximations
1. https://arxiv.org/pdf/1611.06585.pdf
Reducing Reparameterization Gradient Variance
1. https://arxiv.org/pdf/1705.07880.pdf
Filtering Variational Objectives
Kernel Machines
Fastfood - Approximating Kernel Expansions in Loglinear Time
1. http://www-cs.stanford.edu/~quocle/LeSarlosSmola_ICML13.pdf
Random Features for Compositional Kernels
1. https://arxiv.org/pdf/1703.07872.pdf
The Geometry of Random Features
1. http://storage.googleapis.com/pub-tools-public-publication-data/pdf/70a89b15f9b160dd10248de8862d1584f03ddc22.pdf
Collaborative Filtering
Local Collaborative Ranking
1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42242.pdf
Graphical / Relational Learning
Large-Scale Object Classification Using Label Relation Graphs
1. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42854.pdf
Graph Searching Games and Width Measures for Directed Graphs
1. http://drops.dagstuhl.de/opus/volltexte/2015/4902/pdf/2.pdf
Graph Partition Neural Networks for Semi-Supervised Classification
1. https://arxiv.org/pdf/1803.06272.pdf
Miscellaneous
Tensorflow: Learning Functions at Scale
1. https://dl.acm.org/citation.cfm?id=2976746
Deep Learning Games
1. https://papers.nips.cc/paper/6315-deep-learning-games.pdf
Tangent: Automatic Differentiation Using Source Code Transformation in Python
1. https://arxiv.org/pdf/1711.02712.pdf
ExtDict: Extensible Dictionaries for Data and Platform-Aware Large Scale Learning
1. http://www.aceslab.org/sites/default/files/main_0.pdf
Dynamic Routing between Capsules
1. https://research.google.com/pubs/pub46351.html
Climbing a Shaky Ladder: Better Adaptive Risk Estimation
1. https://arxiv.org/pdf/1706.02733.pdf
Avoiding Discrimination through Causal Reasoning
1. https://arxiv.org/pdf/1706.02744.pdf
Who Said What: Modeling Individual Labelers Improves Classification
1. https://arxiv.org/pdf/1703.08774.pdf
Matrix Capsules with EM Routing
1. https://openreview.net/pdf?id=HJWLfGWRb
Graph Sketching-Based Space-Efficient Data Clustering
1. http://storage.googleapis.com/pub-tools-public-publication-data/pdf/7174df3a5627e483b5d120d8edb5843fa593577e.pdf

Major Researchers [10+ Papers / Founding]

Jeff Dean
Samy Bengio
Geoffrey Hinton
Andrew Ng
Quoc Le
Greg Corrado
Vincent Vanhoucke
Yoram Singer
Ian Goodfellow
Tomas Mikolov
Ilya Sutskever
Oriol Vinyals
Marc’Aurelio Ranzato
Christian Szegedy
Navdeep Jaitly
Mohammad Norouzi
Lukasz Kaiser
Jonathon Shlens

Minor Researchers

Rajat Monga
Kai Chen
Matthieu Devin
Mark Mao
Andrew Senior
Paul Tucker
Ke Yang
Patrick Nguyen
Dumitru Erhan
Eugene Ie
Andrew Rabinovich
Jon Shlens
Yoram Singer
Ciprian Chelba
Mike Schuster
Qi Ge
Thorsten Brants
Tamas Sarlos
Georg Heigold
Andrea Frome
Maya Gupta
David Sussillo
Dragomir Anguelov
Alexander Toshev
Andrew Dai
Anelia Angelova
Alex Krizhevsky
Lukasz Kaiser
Terry Koo
Slav Petrov
Tara Sainath
Hasim Sak
Pierre Sermanet
Esteban Real
Peter Liu
Sergey Levine
Amit Daniely
Roy Frostig
Martin Abadi
Zhifeng Chen
Yonghui Wu
Dale Schuurmans
Jianmin Chen
Rafal Jozefowicz
Sergey Ioffe
Honglak Lee
Manjunath Kudlur
Karol Kurach
Minh-Thang Luong
John Nahm
Alexander Alemi
Jascha Sohl-Dickstein
Noam Shazeer
David Ha
Shan Carter
Chris Olah
Ignacio Moreno
Douglas Eck
Natasha Jaques
Shixiang Gu
Konstantinos Bousmalis
Francois Chollet
Geoffrey Irving
Amarnag Subramanya
Michael Ringgaard
Fernando Pereira
Adam Roberts
Cinjon Resnick
Anjuli Kannan
Ryan Adams
David Dohan
Luke Metz
Kelvin Xu
Jan Chorowski
Colin Raffel
Dieterich Lawson
George Papandreou
Kevin Murphy
Jonathan Tompson
Olivier Bousquet
Sylvain Gelly
Olivier Teytaud
Damien Vincent
Eric Jang
Jasmine Hsu
Been Kim
Bart van Merrienboer
Alexander Wiltschko
Dan Moldovan
Yuxuan Wang
RJ Skerry-Ryan
James Davidson
Ron Weiss
Kunal Talwar
Barret Zoph
Maithra Raghu
Justin Gilmer
Jeffrey Pennington
Samuel Schoenholz
Gabriel Pereyra
George Tucker
Vineet Gupta
Ryan Dahl
Azalia Mirhoseini
Andy Davis
Ashish Vaswani
Krzysztof Maziarz
Vikas Sindhwani
Irwan Bello
Hugo Larochelle
Vijay Vasudevan
Hieu Pham
Jesse Engel
Denny Britz
Anna Goldie
Connor Schenck
Ruben Villegas
Yuliang Zou
Sungryull Sohn
Danijar Hafner
Alex Irpan
Chung-Cheng Chiu
Kevin Swersky
Olga Wichrowska
Jakob Foerster
Andrew Lampinen
David So
Fred Bertsch
Reza Mahjourian
Yasaman Bahri
Ofir Nachum
Melody Guan
Julian Ibarz
Benoit Steiner
Rasmus Larsen
Ethan Holly
Gal Chechik
Augustus Odena
Christopher Olah
Jasmine Collins
Michal Jastrzebski
Philip Haeusser
Mario Lucic
Richard Sproat
Alexey Kurakin
Takeru Miyato
Kristofer Schlachter
Tomer Koren
Ayush Sekhari
Matthew Kelcey
Laura Downs

Genealogy

Founding Team

Jeff Dean
Samy Bengio
Geoffrey Hinton
Andrew Ng
Quoc Le
Greg Corrado
Vincent Vanhoucke
Yoram Singer
Ian Goodfellow
Tomas Mikolov
Rajat Monga
Kai Chen (Brain NY)
Matthieu Devin
Mark Mao
Marc’Aurelio Ranzato (Brain NY)
Andrew Senior
Paul Tucker
Ke Yang
Patrick Nguyen
Dzmitry Bahdanau

In the early days, research focused mainly on scaling deep learning and finding new applications in speech recognition, image categorization, and language modeling.

Tensorflowers: Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng

Massive acceleration of Brain papers into ICLR 2016… Tensorflowers start making their way onto papers.

Noisy Counts (Scraped)

[29, 'Oriol Vinyals'], [27, 'Samy Bengio'], [23, 'Ilya Sutskever'], [20, 'Navdeep Jaitly'], [16, 'Sergey Levine'], [14, 'Mohammad Norouzi'], [14, 'Ian Goodfellow'], [13, 'Lukasz Kaiser'], [13, 'Jonathon Shlens'], [12, 'Vincent Vanhoucke'], [10, 'Quoc Le'], [10, 'Geoffrey Hinton'], [9, 'Dumitru Erhan'], [8, 'Shixiang Gu'], [8, 'Rajat Monga'], [8, 'Honglak Lee'], [8, 'Greg Corrado'], [8, 'Christian Szegedy'], [8, 'Andrew Senior'], [7, 'Yoram Singer'], [7, 'Tomas Mikolov'], [7, 'Sylvain Gelly'], [7, 'Olivier Bousquet'], [7, 'Karol Kurach'], [7, 'Georg Heigold'], [7, 'Anelia Angelova'], [6, 'Zhifeng Chen'], [6, 'Rafal Jozefowicz'], [6, 'Ofir Nachum'], [6, 'Matthieu Devin'], [6, 'Martin Abadi'], [6, 'James Davidson'], [6, 'Dieterich Lawson'], [6, 'Dale Schuurmans'], [5, 'Yonghui Wu'], [5, 'Tara Sainath'], [5, 'Mike Schuster'], [5, 'Manjunath Kudlur'], [5, 'Kevin Murphy'], [5, 'Justin Gilmer'], [5, 'George Tucker'], [5, 'Douglas Eck'], [4, 'Pierre Sermanet'], [4, 'Noam Shazeer'], [4, 'Maithra Raghu'], [4, 'Kunal Talwar'], [4, 'Kelvin Xu'], [4, 'Kai Chen'], [4, 'Jeff Dean'], [4, 'Jan Chorowski'], [4, 'Geoffrey Irving'], [4, 'David Sussillo'], [4, 'David Ha'], [4, 'Colin Raffel'], [4, 'Chris Olah'], [4, 'Andrea Frome'], [4, 'Amit Daniely'], [4, 'Alexander Toshev'], [3, 'Vikas Sindhwani'], [3, 'Vijay Vasudevan'], [3, 'Tomer Koren'], [3, 'Paul Tucker'], [3, 'Patrick Nguyen'], [3, 'Olivier Teytaud'], [3, 'Natasha Jaques'], [3, 'Konstantinos Bousmalis'], [3, 'Julian Ibarz'], [3, 'Jonathan Tompson'], [3, 'Jeffrey Pennington'], [3, 'Hasim Sak'], [3, 'Denny Britz'], [3, 'Damien Vincent'], [3, 'Benoit Steiner'], [3, 'Barret Zoph'], [3, 'Azalia Mirhoseini'], [3, 'Augustus Odena'], [3, 'Andy Davis'], [3, 'Andrew Rabinovich'], [3, 'Alex Krizhevsky'], [2, 'Vineet Gupta'], [2, 'Sergey Ioffe'], [2, 'Ryan Dahl'], [2, 'Ruben Villegas'], [2, 'Roy Frostig'], [2, 'Peter Liu'], [2, 'Melody Guan'], [2, 'Luke Metz'], [2, 'Ke Yang'], [2, 'Jianmin Chen'], [2, 'Irwan Bello'],
[2, 'Hugo Larochelle'], [2, 'Hieu Pham'], [2, 'Fred Bertsch'], [2, 'Francois Chollet'], [2, 'Esteban Real'], [2, 'Eric Jang'], [2, 'Cinjon Resnick'], [2, 'Been Kim'], [2, 'Ashish Vaswani'], [2, 'Anna Goldie'], [2, 'Anjuli Kannan'], [2, 'Andrew Dai'], [2, 'Amarnag Subramanya'], [2, 'Alexey Kurakin'], [2, 'Adam Roberts'], [1, 'Yuxuan Wang'], [1, 'Yuliang Zou'], [1, 'Yasaman Bahri'], [1, 'Thorsten Brants'], [1, 'Terry Koo'], [1, 'Tamas Sarlos'], [1, 'Takeru Miyato'], [1, 'Sungryull Sohn'], [1, 'Slav Petrov'], [1, 'Shan Carter'], [1, 'Ryan Adams'], [1, 'Richard Sproat'], [1, 'Reza Mahjourian'], [1, 'Rasmus Larsen'], [1, 'RJ Skerry-Ryan'], [1, 'Qi Ge'], [1, 'Philip Haeusser'], [1, 'Olga Wichrowska'], [1, 'Michal Jastrzebski'], [1, 'Mark Mao'], [1, 'Krzysztof Maziarz'], [1, 'Kristofer Schlachter'], [1, 'Kevin Swersky'], [1, 'Jesse Engel'], [1, 'Jasmine Hsu'], [1, 'Jasmine Collins'], [1, 'Ignacio Moreno'], [1, 'George Papandreou'], [1, 'Gal Chechik'], [1, 'Gabriel Pereyra'], [1, 'Fernando Pereira'], [1, 'Eugene Ie'], [1, 'Ethan Holly'], [1, 'David Dohan'], [1, 'Danijar Hafner'], [1, 'Dan Moldovan'], [1, 'Connor Schenck'], [1, 'Ciprian Chelba'], [1, 'Chung-Cheng Chiu'], [1, 'Christopher Olah'], [1, 'Ayush Sekhari'], [1, 'Andrew Ng'], [1, 'Andrew Lampinen'], [1, 'Alex Irpan'],

Source: Original Google Doc