| Paper: | MLSP-P9.8 |
| Session: | Probabilistic Modeling for Learning |
| Location: | Churchill: Poster Area D |
| Session Time: | Thursday, March 9, 13:30 - 15:30 |
| Presentation Time: | Thursday, March 9, 13:30 - 15:30 |
| Presentation: | Poster |
| Topic: | Machine Learning for Signal Processing: Sequential learning; sequential decision methods |
| Paper Title: | LEARNING ONLINE ALIGNMENTS WITH CONTINUOUS REWARDS POLICY GRADIENT |
| Authors: | Yuping Luo, Tsinghua University, China; Chung-Cheng Chiu, Navdeep Jaitly, Google Brain, United States; Ilya Sutskever, OpenAI, United States |