NIPS 2016 OpenAI Schedule


#1

OpenAI is excited to be participating in NIPS 2016. You can find members of our research and engineering teams presenting at the following tutorials, posters, talks and workshops.

Monday (Dec 5)

Tutorials

Deep Reinforcement Learning Through Policy Optimization
8:30-10:30am @ Rooms 211 + 212
Pieter Abbeel, John Schulman

Generative Adversarial Networks
2:30-4:30pm @ Area 1 + 2
Ian Goodfellow

Posters

Mon Dec 5th 6:00 - 9:30pm @ Area 5+6+7+8

Unsupervised Learning for Physical Interaction through Video Prediction (#62)
Chelsea Finn, Ian Goodfellow, Sergey Levine

Improving Variational Autoencoders with Inverse Autoregressive Flow (#83)
Diederik P Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling

Improved Techniques for Training GANs (#166)
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen

Tuesday (Dec 6)

Posters

6:00-9:30pm @ Area 5+6+7+8

Learning to learn by gradient descent by gradient descent (#9)
Marcin Andrychowicz, Misha Denil, Sergio Gómez, Matthew W Hoffman, David Pfau, Tom Schaul, Nando de Freitas

Generative Adversarial Imitation Learning (#40)
Jonathan Ho, Stefano Ermon

A Neural Transducer (#53)
Navdeep Jaitly, Quoc V Le, Oriol Vinyals, Ilya Sutskever, David Sussillo, Samy Bengio

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (#107)
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel

VIME: Variational Information Maximizing Exploration (#117)
Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

Wednesday (Dec 7)

Talks

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
5:40 - 6:00pm @ Area 1 + 2
Tim Salimans, Diederik P Kingma

Thursday (Dec 8)

Deep Learning Symposium

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
2:50-3:15pm
Xi Chen*, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel

Panel discussion 1
3:15-4pm
Laurent Dinh, Aaron van den Oord, Xi Chen, Raia Hadsell, Yoshua Bengio

Friday (Dec 9)

Private Multi-Party Machine Learning Workshop

Posters

Machine Learning with Privacy by Knowledge Aggregation and Transfer
Nicolas Papernot, Ulfar Erlingsson, Martin Abadi, Kunal Talwar, Ian Goodfellow

MAchine INtelligence Workshop

8:30am - 6pm

A paradigm for situated and goal-driven language learning
2:50pm - 3pm
Jon Gauthier, Igor Mordatch

Neurorobotics Workshop

Invited talk
4-4:30pm
Pieter Abbeel

Adversarial Training Workshop

8:00am - 6:30pm @ Area 3
Organizers: David Lopez-Paz, Leon Bottou, Alec Radford

Talks

Introduction to Generative Adversarial Networks
9:30-10am
Ian Goodfellow

Panel Discussion
3-4pm
Ian Goodfellow, Soumith Chintala, Arthur Gretton, Sebastian Nowozin, Aaron Courville, Yann LeCun, and Emily Denton

Posters

Adversarial Training Methods for Semi-Supervised Text Classification
Takeru Miyato, Andrew Dai, Ian Goodfellow

A Connection Between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models
Chelsea Finn, Paul Christiano, Pieter Abbeel, Sergey Levine

Deep Reinforcement Learning Workshop

Organizers: Pieter Abbeel, Peter Chen, David Silver, and Satinder Singh

Invited speaker

John Schulman
10:00 - 10:30am

Contributed talk

RL^2: Fast Reinforcement Learning via Slow Reinforcement Learning
Yan Duan, John Schulman, Xi Chen, Peter L. Bartlett, Ilya Sutskever, Pieter Abbeel

Posters

Learning from the Hindsight Plan — Episodic MPC Improvement
Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel

Deep Reinforcement Learning for Tensegrity Robot Locomotion
Xinyang Geng, Marvin Zhang, Jonathan Bruce, Ken Caluwaerts, Massimo Vespignani, Vytas SunSpiral, Pieter Abbeel, Sergey Levine

Probabilistically Safe Policy Transfer
David Held, Zoe McCarthy, Michael Zhang, Fred Shentu, Pieter Abbeel

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

Generalizing Skills with Semi-Supervised Reinforcement Learning
Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine

Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer
Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine

Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States
Harley Montgomery, Anurag Ajay, Chelsea Finn, Pieter Abbeel, Sergey Levine

Stochastic Neural Networks for Hierarchical Reinforcement Learning
Carlos Florensa, Yan Duan, Pieter Abbeel

Learning Visual Servoing with Deep Features and Trust Region Fitted Q-Iteration
Alex X. Lee, Sergey Levine, Pieter Abbeel

A K-fold Method for Baseline Estimation in Policy Gradient Algorithms
Nithyanand Kota, Abhishek Mishra, Sunil Srinivasa, Xi Chen and Pieter Abbeel

Saturday (Dec 10)

Bayesian Deep Learning Workshop

Invited talks

Adversarial Approaches to Bayesian Learning and Bayesian Approaches to Adversarial Robustness
Ian Goodfellow
2-2:25pm

Panel: Will Bayesian deep learning be the next big thing? Or is Bayesian modelling dead?
Shakir Mohamed, David Blei, Ryan Adams, Jose Miguel Hernandez Lobato, Ian Goodfellow, Yarin Gal
4-5pm

Workshop on Machine Learning for Intelligent Transportation Systems

Invited talks

Safety in Reinforcement Learning
Pieter Abbeel
8:45-9:15am

Machine Learning for Education Workshop

Invited talks

Gradescope: AI for Grading
Pieter Abbeel
9:25-9:50am


#2

Looks awesome! Congratulations on all your progress, OpenAI devs!


#3

Are there any videos of this?


#4

At least some video is available. Click the titles in Erika’s post and you will get to a summary page. If there is a video, it will be linked by a button marked “video” just under the title.

E.g. “Deep Reinforcement Learning Through Policy Optimization”:

I’m pretty sure I have stumbled across a few of these talks on YouTube as well, just by searching “Reinforcement Learning”.


#5

Neil Slater,

All right, thanks.


#6

Thanks Neil, excellent video there! Lots of cool ideas packed into that one video.

It is becoming evident that we have created far too many names and 3- or 4-letter abbreviations for algorithms which are really the same thing. I suspect this will get out of hand in the next 1 to 2 years and will probably stump the ML world, due to the overwhelming number of named algorithms that should really be viewed as a small set of models with different configurations rather than as unique models/algorithms.

The community should probably invest some time in coming up with a nomenclature for talking about the algorithms being used. We need something similar to the IUPAC nomenclature we use to talk about molecules.

  1. There should be a globally accepted way of talking about machine learning, and AI in particular, in plain English (no Greek letters, though some Greek/Latin terms could be tolerated?)

  2. A standard shorthand way of drawing models (think Feynman diagrams, not the pretty Adobe Illustrator versions from Nature papers showing the miracle that is 7 identical hidden layers of an MLP) :roll_eyes:

As a side-rant…

All of the recent papers talk about how "parameter x when set to val1 yielded better results than…"
While grad students are affordable and bound to converge, it is evident that there needs to be a global consensus regarding hyper-parameter search and optimization. In my opinion, every bit of code written as the basis for an academic paper should contain a systematic, self-tuning, sampling hyper-parameter search. With packages such as Hyperopt available, it boggles my mind why this is not the basis of all algorithms. On the surface it seems like a waste of time to build this into your code; however, we ALL know that the real waste of compute time, and the lag between your papers/deliverables, is caused by manually tuning parameters. Just imagine if your microwave came with some teensy hyper-parameters such as joules and peak power; we would very quickly deem it obsolete, dangerous and likely to get stuck in local minima… :wink:
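
For concreteness, here is a minimal sketch of what such a self-tuning search could look like with Hyperopt. The search space, the parameter names and the toy train_and_evaluate() stand-in are made up for illustration and not taken from any of the papers above:

```python
# Minimal Hyperopt sketch: wrap training in an objective and let TPE
# sample hyper-parameters instead of tuning them by hand.
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

def train_and_evaluate(params):
    # Placeholder for a real training run: train a model with these
    # hyper-parameters and return a validation loss. The quadratic below
    # is just a fake loss surface so the example runs on its own.
    lr, layers = params["learning_rate"], params["num_layers"]
    return (lr - 0.01) ** 2 + 0.001 * layers

def objective(params):
    return {"loss": train_and_evaluate(params), "status": STATUS_OK}

# Hypothetical search space; swap in whatever knobs your model actually has.
space = {
    "learning_rate": hp.loguniform("learning_rate", -7, 0),  # ~1e-3 .. 1
    "num_layers": hp.choice("num_layers", [2, 3, 4, 5]),
}

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
print("best hyper-parameters found:", best)
```

The point is simply that the search loop lives in the code itself, so rerunning a paper's experiments re-tunes the knobs instead of relying on whatever values someone last settled on by hand.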