IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

Education Short Courses

The IEEE Signal Processing Society (IEEE-SPS) Education Board is planning an inaugural education activity in the form of short courses at ICASSP 2022. The short courses will offer Professional Development Hours (PDHs) and Continuing Education Units (CEUs) certificates to those who complete each course. Given that students, academic and industry researchers, and practitioners worldwide have a broad diversity of interests and areas of experience, the IEEE-SPS goal is to develop meaningful ways of offering beneficial and relevant courses in support of our members' educational needs.

Six courses have been selected by the SPS Education Board and the ICASSP committee. The courses will be conducted in person and online as parallel tracks during the main ICASSP conference. The total duration of each course is 10 hours.

In-Person Live-Stream
Courses SC-1, SC-2, and SC-3 will be held In-Person in Singapore.
  • Speakers will be in-person in Singapore.
  • Attendees will join in-person or virtually.
  • Attendees (in-person) will attend the course in Singapore.
  • Attendees (virtual) will join via Zoom link and attend live while the course is being conducted in Singapore.
Virtual Live-Stream
Courses SC-4, SC-5, and SC-6 will be held virtually.
  • Speakers will conduct the course virtually via Zoom.
  • Attendance is virtual only; attendees will join via Zoom link.

More information and syllabus can be found here.

Fee Structure

Education Short Courses
10-hour courses spanning several days conducted in-person and online live as parallel tracks. The short courses provide a deep and multi-sided understanding of a topic including hands-on experience, and include the course materials as well as a professional development certificate. Cost is per course.

            Regular Rate (by 21 May)    On-Site Rate (after 21 May)
Regular     S$300                       S$350
Student     S$180                       S$250

Short Courses at ICASSP 2022

In-Person Live-Stream
Tuesday, 24 May 2022, 13:00 - 17:00 (UTC+8)
Wednesday, 25 May 2022, 13:00 - 15:00 (UTC+8)
Thursday, 26 May 2022, 13:00 - 17:00 (UTC+8)
  • SC-1: Low-Dimensional Models for High-Dimensional Data: From Linear to Nonlinear, Convex to Nonconvex, and Shallow to Deep
  • SC-2: Inclusive Neural Speech Synthesis -iNSS
  • SC-3: Biomedical Signal Analysis and Healthcare

Virtual Live-Stream
Tuesday, 24 May 2022, 19:00 - 23:00 (UTC+8)
Wednesday, 25 May 2022, 19:00 - 21:00 (UTC+8)
Thursday, 26 May 2022, 19:00 - 23:00 (UTC+8)
  • SC-4: Signal Processing and Learning from Network Data
  • SC-5: Speech Technology for Health: From Technical Foundations to Applications
  • SC-6: Transformer Architectures for Multimodal Signal Processing and Decision Making

SC-1: Low-Dimensional Models for High-Dimensional Data: From Linear to Nonlinear, Convex to Nonconvex, and Shallow to Deep

Presented by Sam Buchanan, Columbia University; Yi Ma, UC Berkeley; Qing Qu, University of Michigan; John Wright, Columbia University; Yuqian Zhang, Rutgers University; Zhihui Zhu, University of Denver

Tuesday, 24 May 2022, 13:00 - 17:00 (UTC+8)
Wednesday, 25 May 2022, 13:00 - 15:00 (UTC+8)
Thursday, 26 May 2022, 13:00 - 17:00 (UTC+8)
In-Person Live-Stream Short Course

The course will start by introducing fundamental linear low-dimensional models (e.g., basic sparse and low-rank models) and convex relaxation approaches with motivating engineering applications, followed by a suite of scalable and efficient optimization methods. Based on these developments, we will introduce nonlinear low-dimensional models for several fundamental learning and inverse problems (e.g., dictionary learning and sparse blind deconvolution), and nonconvex approaches from a symmetry and geometric perspective, followed by their guaranteed correctness and efficient nonconvex optimization. Building upon these results, we will discuss strong conceptual, algorithmic, and theoretical connections between low-dimensional structures and deep models, providing new perspectives to understand state-of-the-art deep models, as well as leading to new principles for designing deep networks for learning low-dimensional structures, with both clear interpretability and practical benefits.

(In-person presentations are subject to the COVID-19 situation and travel permissions. Yuqian Zhang and Zhihui Zhu will be presenting remotely.)

SC-2: Inclusive Neural Speech Synthesis -iNSS

Presented by Yannis Stylianou, Univ of Crete (Greece) & Apple UK; Vassilis Tsiaras, Univ of Crete (Greece); Alistair Conkie, Apple USA; Soumi Maiti, Apple USA; Junichi Yamagishi, NII Japan; Xin Wang, NII Japan; Yutian Chen, DeepMind, China; Malcolm Slaney, Google USA; Petko Petkov, Apple UK; Shifas Padinjaru Veettil, Apple UK; George Kafentzis, Univ of Crete (Greece)

Tuesday, 24 May 2022, 13:00 - 17:00 (UTC+8)
Wednesday, 25 May 2022, 13:00 - 15:00 (UTC+8)
Thursday, 26 May 2022, 13:00 - 17:00 (UTC+8)
In-Person Live-Stream Short Course

In the first part, the main components of a Neural Text-to-Speech (TTS) system, from the front-end to acoustic modelling and vocoders, will be presented. In the second part, concrete applications for making TTS an inclusive technology will be discussed and demonstrated. The presentation of the main Neural TTS components will take into account the plethora of approaches currently in use, draw the links between them, and discuss perspectives. The applications will focus on people with mild to moderate hearing loss, elderly people, blind and low-vision users, and speech-impaired users. iNSS will place strong emphasis on hands-on sessions to provide a balance between theory and practice while introducing methods and tools to the participants.

(In-person presentations are subject to the COVID-19 situation and travel permissions. George Kafentzis will be presenting remotely.)

SC-3: Biomedical Signal Analysis and Healthcare

Presented by Kai Keng ANG; Jung-jae KIM; Mahsa Paknezhad; Arvind Channarayapatna Srinivasa; Pavitra Krishnaswamy; Ramasamy Savitha, Agency for Science, Technology and Research, Singapore

Tuesday, 24 May 2022, 13:00 - 17:00 (UTC+8)
Wednesday, 25 May 2022, 13:00 - 15:00 (UTC+8)
Thursday, 26 May 2022, 13:00 - 17:00 (UTC+8)
In-Person Live-Stream Short Course

Biomedical signal processing for healthcare is experiencing tremendous growth, and biomedical jobs are among the fastest-growing career fields worldwide. This field is fundamental to understanding, visualizing, and quantifying medical images and bio-signals in clinical applications. With the help of artificial intelligence and machine learning techniques, disease diagnosis will become easier, faster, and more accurate, leading to significant developments in medicine in general. The target audience is senior-year undergraduate and postgraduate students, engineers, and practitioners with some background in signal processing. The goal of this course is to help them develop basic skills in computational biomedical signal processing, reinforced by hands-on programming exercises, and to appreciate the considerations involved in developing practical solutions for this domain.

SC-4: Signal Processing and Learning from Network Data

Presented by Marcelo Fiori, Universidad de la Republica, Uruguay; Fernando Gama, Rice University, USA; Federico Larroca, Universidad de la Republica, Uruguay; Gonzalo Mateos, University of Rochester, USA

Tuesday, 24 May 2022, 19:00 - 23:00 (UTC+8)
Wednesday, 25 May 2022, 19:00 - 21:00 (UTC+8)
Thursday, 26 May 2022, 19:00 - 23:00 (UTC+8)
Virtual Live-Stream Short Course

Coping with the challenges found at the intersection of Network Science and Big Data necessitates fundamental breakthroughs in modeling, identification, and controllability of distributed network processes – often conceptualized as information associated with or signals defined on graphs. For instance, graph-supported signals can model vehicle congestion levels over road networks, economic activity observed over a network of production flows between industrial sectors, infectious states of individuals susceptible to an epidemic disease spreading on a social network, brain activity signals supported on brain connectivity networks, and fake news that diffuses on online social networks. There is an evident mismatch between our scientific understanding of signals defined over regular domains (time or space) and graph-supported signals. Knowledge about time series was developed over the course of decades and boosted by real needs in areas such as communications, speech, and control. In contrast, the prevalence of network-related signal processing problems and the access to quality network data are recent developments. Making sense of large-scale datasets from a network-centric perspective will constitute a crucial step toward new insights in various areas of science and engineering, and signal processing can play a key role to that end. Machine learning, in particular, can significantly benefit from graph-based representations, as they are instrumental in unveiling inner data structures that can properly guide, e.g., semi-supervised learning algorithms. In this context, the technical focus of this short course is on fundamentals and algorithmic advances for learning from network (i.e., graph) data. Topics covered range from learning graph representations of complex signals to graph signal processing fundamentals, statistical models for network data, and learning efficient signal representations via state-of-the-art Graph Neural Network architectures.
To better illustrate the concepts taught, a diverse gamut of application domains will be considered, including communication, social, brain, and power networks, multi-agent systems, and artificial intelligence.

Course Web Page

SC-5: Speech Technology for Health: From Technical Foundations to Applications

Presented by Chi-Chun Lee, National Tsing Hua University, Taiwan; Prasanta Kumar Ghosh, Indian Institute of Science, India; Yu Tsao, Yi-Chiao Wu, and Hsin-Min Wang, Academia Sinica, Taiwan

Tuesday, 24 May 2022, 19:00 - 23:00 (UTC+8)
Wednesday, 25 May 2022, 19:00 - 21:00 (UTC+8)
Thursday, 26 May 2022, 19:00 - 23:00 (UTC+8)
Virtual Live-Stream Short Course

Speech technology holds profound promise for health applications, and the field is maturing. Advancements in core speech technologies and their integration – ranging from automatic speech recognition (ASR), text-to-speech (TTS)/voice conversion (VC), and speech enhancement (SE) to the recognition of states and traits from paralinguistics – offer novel tools for both scientific discovery and the creation of innovative solutions for clinical screening, diagnostics, intervention support, and beyond. Owing to the potential for widespread impact, research sites across all continents are actively engaged in this societally important research area, tackling a rich set of challenges, producing a large body of technical research, and leading to grounded system development and deployment. Major speech processing conferences such as ICASSP and INTERSPEECH increasingly feature regular and special sessions on speech for health applications, covering topics such as disordered and atypical speech analysis, mental and behavioral health modeling, and assistive systems via speech processing. The same trend appears in key journals such as IEEE Transactions on Audio, Speech and Language Processing, IEEE Journal of Selected Topics in Signal Processing, Computer Speech and Language, and Speech Communication.

Although research in this broad and integrative area is flourishing, there has not been a dedicated course on this emerging and important topic. The proposed short course meets a timely need by putting together cohesive educational materials on speech for health for the SP community. Specifically, it will include broad-to-specific materials covering an overview of the nuts and bolts of core speech technology, details of algorithmic approaches with connections to health-related topics, a survey of recent advancements in this interdisciplinary effort, and a hands-on exercise. We believe that this short course will help lay out the needed foundational knowledge for the SP community and beyond, and will encourage more students, engineers, and even scientists and clinicians to engage in this research topic.

Course Web Page

SC-6: Transformer Architectures for Multimodal Signal Processing and Decision Making

Presented by Chen Sun, Brown University and Google; Boqing Gong, Google

Tuesday, 24 May 2022, 19:00 - 23:00 (UTC+8)
Wednesday, 25 May 2022, 19:00 - 21:00 (UTC+8)
Thursday, 26 May 2022, 19:00 - 23:00 (UTC+8)
Virtual Live-Stream Short Course

Transformers have become the de facto model of choice in natural language processing (NLP). In computer vision, there has recently been a surge of interest in end-to-end Transformers, prompting efforts to replace hand-wired features or inductive biases with general-purpose neural architectures powered by data-driven training. Transformer architectures have also achieved state-of-the-art performance in multimodal learning, protein structure prediction, decision making, and so on. These results indicate the great potential of Transformer architectures beyond the previously mentioned domains and within the signal processing (SP) community. We envision that these efforts may lead to a unified knowledge base that produces versatile representations for different data modalities, simplifying the inference and deployment of deep learning models in various application scenarios. Hence, a course on Transformer architectures and related learning algorithms is timely.

Course Web Page