| Paper: | MLSP-P9.8 | 
  
| Session: | Probabilistic Modeling for Learning | 
  | Location: | Churchill: Poster Area D | 
  | Session Time: | Thursday, March 9, 13:30 - 15:30 | 
  | Presentation Time: | Thursday, March 9, 13:30 - 15:30 | 
  | Presentation: | Poster | 
  | Topic: | Machine Learning for Signal Processing: Sequential learning; sequential decision methods | 
	
  | Paper Title: | LEARNING ONLINE ALIGNMENTS WITH CONTINUOUS REWARDS POLICY GRADIENT | 
  
  | Authors: | Yuping Luo, Tsinghua University, China; Chung-Cheng Chiu, Navdeep Jaitly, Google Brain, United States; Ilya Sutskever, OpenAI, United States |