Paper: | MLSP-P9.8 |
Session: | Probabilistic Modeling for Learning |
Location: | Churchill: Poster Area D |
Session Time: | Thursday, March 9, 13:30 - 15:30 |
Presentation Time: | Thursday, March 9, 13:30 - 15:30 |
Presentation: | Poster |
Topic: | Machine Learning for Signal Processing: Sequential learning; sequential decision methods |
Paper Title: | LEARNING ONLINE ALIGNMENTS WITH CONTINUOUS REWARDS POLICY GRADIENT |
Authors: | Yuping Luo, Tsinghua University, China; Chung-Cheng Chiu, Navdeep Jaitly, Google Brain, United States; Ilya Sutskever, OpenAI, United States |