The following is the list of accepted SLT 2016 papers, sorted by paper title. Use your web browser's search feature to find your paper number. Notifications have also been sent to all authors by email. If you have not received your notification by email, please contact us at papers@slt2016.org.
1171 | A FACTOR ANALYSIS MODEL OF SEQUENCES FOR LANGUAGE RECOGNITION |
1076 | A LOG-LINEAR WEIGHTING APPROACH IN THE WORD2VEC SPACE FOR SPOKEN LANGUAGE UNDERSTANDING |
1194 | A MULTICHANNEL CONVOLUTIONAL NEURAL NETWORK FOR CROSS-LANGUAGE DIALOG STATE TRACKING |
1130 | A NONPARAMETRIC BAYESIAN APPROACH FOR AUTOMATIC DISCOVERY OF A LEXICON AND ACOUSTIC UNITS |
1024 | A PRIORITIZED GRID LONG SHORT-TERM MEMORY RNN FOR SPEECH RECOGNITION |
1016 | A ROBUST DIARIZATION SYSTEM FOR MEASURING DOMINANCE IN PEER-LED TEAM LEARNING GROUPS |
1022 | A STUDY OF SPEECH DISTORTION CONDITIONS IN REAL SCENARIOS FOR SPEECH PROCESSING APPLICATIONS |
1081 | ABSTRACTIVE HEADLINE GENERATION FOR SPOKEN CONTENT BY ATTENTIVE RECURRENT NEURAL NETWORKS WITH ASR ERROR MODELING |
1160 | ADAPTATION OF SVM FOR MIL FOR INFERRING THE POLARITY OF MOVIES AND MOVIE REVIEWS |
1115 | AN OVERVIEW OF END-TO-END LANGUAGE UNDERSTANDING AND DIALOG MANAGEMENT FOR PERSONAL DIGITAL ASSISTANTS |
1038 | AN UNSUPERVISED VOCABULARY SELECTION TECHNIQUE FOR CHINESE AUTOMATIC SPEECH RECOGNITION |
1110 | ANALYSIS OF THE DNN-BASED SRE SYSTEMS IN MULTI-LANGUAGE CONDITIONS |
1161 | ANALYSIS OF USER BEHAVIOR WITH MULTIMODAL VIRTUAL CUSTOMER SERVICE AGENTS |
1183 | APPROACHES FOR LANGUAGE IDENTIFICATION IN MISMATCHED ENVIRONMENTS |
1043 | ATTRIBUTE BASED SHARED HIDDEN LAYERS FOR CROSS-LANGUAGE KNOWLEDGE TRANSFER |
1089 | AUDIO-VISUAL SPEECH ACTIVITY DETECTION IN A TWO-SPEAKER SCENARIO INCORPORATING DEPTH INFORMATION FROM A PROFILE OR FRONTAL VIEW |
1166 | AUTOMATED OPTIMIZATION OF DECODER HYPER-PARAMETERS FOR ONLINE LVCSR |
1040 | AUTOMATED STRUCTURE DISCOVERY AND PARAMETER TUNING OF NEURAL NETWORK LANGUAGE MODEL BASED ON EVOLUTION STRATEGY |
1048 | AUTOMATIC OPTIMIZATION OF DATA PERTURBATION DISTRIBUTIONS FOR MULTI-STYLE TRAINING IN SPEECH RECOGNITION |
1144 | AUTOMATIC PLAGIARISM DETECTION FOR SPOKEN RESPONSES IN AN ASSESSMENT OF ENGLISH LANGUAGE PROFICIENCY |
1135 | AUTOMATIC TURN SEGMENTATION FOR MOVIE & TV SUBTITLES |
1059 | BATCH-NORMALIZED JOINT TRAINING FOR DNN-BASED DISTANT SPEECH RECOGNITION |
1015 | BBN TECHNOLOGIES' OPENSAD SYSTEM |
1092 | BLIND SPEECH SEGMENTATION USING SPECTROGRAM IMAGE-BASED FEATURES AND MEL CEPSTRAL COEFFICIENTS |
1104 | BOOSTING PERFORMANCE ON LOW-RESOURCE LANGUAGES BY STANDARD CORPORA: AN ANALYSIS |
1023 | CODE-SWITCHING DETECTION USING MULTILINGUAL DNNS |
1098 | COMPARING SPEAKER INDEPENDENT AND SPEAKER ADAPTED CLASSIFICATION FOR WORD PROMINENCE DETECTION |
1129 | CONTEXTUAL LANGUAGE MODEL ADAPTATION USING DYNAMIC CLASSES |
1077 | DEEP BOTTLENECK FEATURES AND SOUND-DEPENDENT I-VECTORS FOR SIMULTANEOUS RECOGNITION OF SPEECH AND ENVIRONMENTAL SOUNDS |
1145 | DEEP LEARNING WITH MAXIMAL FIGURE-OF-MERIT COST TO ADVANCE MULTI-LABEL SPEECH ATTRIBUTE DETECTION |
1012 | DEEP NEURAL NETWORK BASED ACOUSTIC MODEL PARAMETER REDUCTION USING MANIFOLD REGULARIZED LOW RANK MATRIX FACTORIZATION |
1176 | DEEP NEURAL NETWORK DRIVEN MIXTURE OF PLDA FOR ROBUST I-VECTOR SPEAKER VERIFICATION |
1029 | DEEP NEURAL NETWORK-BASED SPEAKER EMBEDDINGS FOR END-TO-END SPEAKER VERIFICATION |
1128 | DEVELOPMENT OF THE MIT ASR SYSTEM FOR THE 2016 ARABIC MULTI-GENRE BROADCAST CHALLENGE |
1193 | DIALOG STATE TRACKING WITH ATTENTION-BASED SEQUENCE-TO-SEQUENCE LEARNING |
1157 | DIALPORT: CONNECTING THE SPOKEN DIALOG RESEARCH COMMUNITY TO REAL USER DATA |
1179 | DISCRIMINATIVE ACOUSTIC WORD EMBEDDINGS: RECURRENT NEURAL NETWORK-BASED APPROACHES |
1027 | DISCRIMINATIVE MULTIPLE SOUND SOURCE LOCALIZATION BASED ON DEEP NEURAL NETWORKS USING INDEPENDENT LOCATION MODEL |
1090 | DNN ADAPTATION FOR RECOGNITION OF CHILDREN SPEECH THROUGH AUTOMATIC UTTERANCE SELECTION |
1070 | DYNAMIC ADJUSTMENT OF LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION USING WORD SIMILARITY |
1168 | EARLY STAGE INJECTION OF A SEMANTIC OBJECTIVE FOR LEARNING OF SPOKEN LANGUAGE TRANSLATION |
1181 | END-TO-END ATTENTION BASED TEXT-DEPENDENT SPEAKER VERIFICATION |
1146 | END-TO-END TRAINING APPROACHES FOR DISCRIMINATIVE SEGMENTAL MODELS |
1060 | ENTROPY-BASED PRUNING OF HIDDEN UNITS TO REDUCE DNN PARAMETERS |
1097 | ENVIRONMENTALLY ROBUST AUDIO-VISUAL SPEAKER IDENTIFICATION |
1150 | EVALUATION AND CALIBRATION OF LOMBARD EFFECTS IN SPEAKER VERIFICATION |
1035 | EXTRACTIVE SPEECH SUMMARIZATION LEVERAGING CONVOLUTIONAL NEURAL NETWORK TECHNIQUES |
1167 | F0 TRANSFORMATION TECHNIQUES FOR STATISTICAL VOICE CONVERSION WITH DIRECT WAVEFORM MODIFICATION WITH SPECTRAL DIFFERENTIAL |
1080 | FURTHER OPTIMISATIONS OF CONSTANT Q CEPSTRAL PROCESSING FOR INTEGRATED UTTERANCE AND TEXT-DEPENDENT SPEAKER VERIFICATION |
1052 | HIERARCHICAL ATTENTION MODEL FOR IMPROVED MACHINE COMPREHENSION OF SPOKEN CONTENT |
1175 | HIGH QUALITY AGREEMENT-BASED SEMI-SUPERVISED TRAINING DATA FOR ACOUSTIC MODELING |
1178 | IMPROVED PREDICTION OF THE ACCENT GAP BETWEEN SPEAKERS OF ENGLISH FOR INDIVIDUAL-BASED CLUSTERING OF WORLD ENGLISHES |
1074 | IMPROVING MULTI-STREAM CLASSIFICATION BY MAPPING SEQUENCE-EMBEDDING IN A HIGH DIMENSIONAL SPACE |
1085 | INCREMENTALLY LEARN THE RELEVANCE OF WORDS IN A DICTIONARY FOR SPOKEN LANGUAGE ACQUISITION |
1019 | INFLUENCE OF CORPUS SIZE AND CONTENT ON THE PERCEPTUAL QUALITY OF A UNIT SELECTION MARYTTS VOICE |
1153 | INTENT DETECTION USING SEMANTICALLY ENRICHED WORD EMBEDDINGS |
1064 | ITERATIVE TRAINING OF A DPGMM-HMM ACOUSTIC UNIT RECOGNIZER IN A ZERO RESOURCE SCENARIO |
1002 | I-VECTOR ESTIMATION AS AUXILIARY TASK FOR MULTI-TASK LEARNING BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION |
1142 | JOINTLY LEARNING TO ALIGN AND CONVERT GRAPHEMES TO PHONEMES WITH NEURAL ATTENTION MODELS |
1095 | LEARNING DIALOGUE DYNAMICS WITH THE METHOD OF MOMENTS |
1102 | LEARNING UTTERANCE-LEVEL NORMALISATION USING VARIATIONAL AUTOENCODERS FOR ROBUST AUTOMATIC SPEECH RECOGNITION |
1191 | LIA@DSTC5: TRACKING DIALOG STATES USING AUTHOR-TOPIC BASED REPRESENTATION |
1054 | LIUM ASR SYSTEMS FOR THE 2016 MULTI-GENRE BROADCAST ARABIC CHALLENGE |
1030 | LOOK, LISTEN, AND DECODE: MULTIMODAL SPEECH RECOGNITION WITH IMAGES |
1108 | LOW-RANK BASES FOR FACTORIZED HIDDEN LAYER ADAPTATION OF DNN ACOUSTIC MODELS |
1190 | LSTM ENCODER-DECODER FOR DIALOGUE RESPONSE GENERATION |
1028 | MAX-POOLING LOSS TRAINING OF LONG SHORT-TERM MEMORY NETWORKS FOR SMALL-FOOTPRINT KEYWORD SPOTTING |
1114 | MEDIAN-BASED GENERATION OF SYNTHETIC SPEECH DURATIONS USING A NON-PARAMETRIC APPROACH |
1149 | MODELLING SPEAKER AND CHANNEL VARIABILITY USING DEEP NEURAL NETWORKS FOR ROBUST SPEAKER VERIFICATION |
1116 | MULTILINGUAL BLSTM AND SPEAKER-SPECIFIC VECTOR ADAPTATION IN 2016 BUT BABEL SYSTEM |
1118 | MULTI-LINGUAL DEEP NEURAL NETWORKS FOR LANGUAGE RECOGNITION |
1101 | MULTIMODAL DEEP NEURAL NETS FOR DETECTING HUMOR IN TV SITCOMS |
1189 | NEURAL DIALOG STATE TRACKER FOR LARGE ONTOLOGIES BY ATTENTION MECHANISM |
1173 | OPTIMIZING NEURAL NETWORK HYPERPARAMETERS WITH GAUSSIAN PROCESSES FOR DIALOG ACT CLASSIFICATION |
1008 | PARALLEL LONG SHORT-TERM MEMORY FOR MULTI-STREAM CLASSIFICATION |
1126 | PERFORMANCE MONITORING FOR AUTOMATIC SPEECH RECOGNITION IN NOISY MULTI-CHANNEL ENVIRONMENTS |
1017 | PHONETIC CONTENT IMPACT ON FORENSIC VOICE COMPARISON |
1088 | PRE-FILTERED DYNAMIC TIME WARPING FOR POSTERIORGRAM BASED KEYWORD SEARCH |
1111 | PUNCTUATED TRANSCRIPTION OF MULTI-GENRE BROADCASTS USING ACOUSTIC AND LEXICAL APPROACHES |
1094 | QCRI ADVANCED TRANSCRIPTION SYSTEM (QATS) FOR THE ARABIC MULTI-DIALECT BROADCAST MEDIA RECOGNITION: MGB-2 CHALLENGE |
1071 | QUATERNION NEURAL NETWORKS FOR SPOKEN LANGUAGE UNDERSTANDING |
1010 | RECOGNIZING EMOTIONS IN SPOKEN DIALOGUE WITH HIERARCHICALLY FUSED ACOUSTIC AND LEXICAL FEATURES |
1184 | RECURRENT CONVOLUTIONAL NEURAL NETWORKS FOR STRUCTURED SPEECH ACT TAGGING |
1084 | ROBUST UTTERANCE CLASSIFICATION USING MULTIPLE CLASSIFIERS IN THE PRESENCE OF SPEECH RECOGNITION ERRORS |
1125 | SEMANTIC MODEL FOR FAST TAGGING OF WORD LATTICES |
1004 | SEQUENCE TRAINING AND ADAPTATION OF HIGHWAY DEEP NEURAL NETWORKS |
1177 | SPEAKER INDEPENDENT DIARIZATION FOR CHILD LANGUAGE ENVIRONMENT ANALYSIS USING DEEP NEURAL NETWORKS |
1036 | SPEECH ENHANCEMENT USING LONG SHORT-TERM MEMORY BASED RECURRENT NEURAL NETWORKS FOR NOISE ROBUST SPEAKER VERIFICATION |
1005 | SPEECH VS. TEXT: A COMPARATIVE ANALYSIS OF FEATURES FOR DEPRESSION DETECTION SYSTEMS |
1031 | SYNTAX OR SEMANTICS? KNOWLEDGE-GUIDED JOINT SEMANTIC FRAME PARSING |
1185 | THE FIFTH DIALOG STATE TRACKING CHALLENGE |
1186 | THE MSIIP SYSTEM FOR DIALOG STATE TRACKING CHALLENGE 5 |
1009 | THE NDSC TRANSCRIPTION SYSTEM FOR THE 2016 MULTI-GENRE BROADCAST CHALLENGE |
1121 | TOWARD HUMAN-ASSISTED LEXICAL UNIT DISCOVERY WITHOUT TEXT RESOURCES |
1117 | TOWARDS A VIRTUAL PERSONAL ASSISTANT BASED ON A USER-DEFINED PORTFOLIO OF MULTI-DOMAIN VOCAL APPLICATIONS |
1096 | TOWARDS ACOUSTIC MODEL UNIFICATION ACROSS DIALECTS |
1134 | UNSUPERVISED CONTEXT LEARNING FOR SPEECH RECOGNITION |
1056 | UNSUPERVISED K-MEANS CLUSTERING BASED OUT-OF-SET CANDIDATE SELECTION FOR ROBUST OPEN-SET LANGUAGE RECOGNITION |
1113 | VERY DEEP CONVOLUTIONAL NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION |
1148 | VOICE SEARCH LANGUAGE MODEL ADAPTATION USING CONTEXTUAL INFORMATION |
1026 | WEAKLY SUPERVISED USER INTENT DETECTION FOR MULTI-DOMAIN DIALOGUES |