Panels
Tue, 24 May, 06:00 - 07:30 UTC
Moderators: Sharon Gannot, Bar-Ilan University, Israel; Zheng-Hua Tan, Aalborg University, Denmark
Panelists:
- Zheng-Hua Tan, Aalborg University, Denmark
- Martin Haardt, Ilmenau University of Technology, Germany
- Nancy F. Chen, Agency for Science, Technology, and Research (A*STAR), Singapore
- Hoi-To Wai, The Chinese University of Hong Kong, Hong Kong
- Ivan Tashev, Microsoft Research, USA
Description
In the last decade, the signal processing community has witnessed a paradigm shift from model-based to data-driven methods. Machine learning, and specifically deep learning, methodologies are nowadays widely used in all signal processing fields, e.g., audio, speech, image, video, multimedia, multi-modal and multi-sensor processing, to name a few. Many data-driven methods also incorporate domain knowledge to improve the problem modeling, especially when computational burden, scarcity of training data, and memory size are important constraints.
Data science, as a research field, emerged from several scientific disciplines, namely mathematics (mainly statistics and optimization), computer science, electrical engineering (mainly signal processing), industrial engineering, and information systems. Each of these disciplines offers an independent teaching program in its core domain with a segment in data science. In recent years, several institutes worldwide have started to offer dedicated data science teaching programs that can be used for different application areas.
We believe that there is a unique signal processing perspective of data science that should be reflected in the education we would like to give to our students. Moreover, we think that now is the right time to start defining our needs and aspirations.
In this panel, we shall focus on these education aspects and, hopefully, draft a manifesto for an SP-oriented data science curriculum.
Discussion Questions
- How are “signals” defined? Specifically, does a signal always represent an underlying physical phenomenon? Or can it represent a cognitive space in our brains, e.g., semantics?
- The key question: Is there a unique perspective of signal processing in data science that is different from the viewpoints of other disciplines? If this is indeed the case, can we come up with a clear definition of this perspective?
- Should data science programs be offered already at the undergraduate level, or should they be postponed to graduate-level studies? We should keep in mind the different education systems in different countries, with either a 4-year BSc program or a 3+2 BSc+MSc program. What should be the different roles of the BSc and MSc programs in educating future data scientists and engineers?
- What should be regarded as core undergraduate education in the field? Which parts are mandatory and which can be elective? We can propose the following topics, and open the floor to more ideas:
  - Mandatory: 1) Mathematics; 2) Statistics; 3) Computer skills and algorithms; 4) Signal processing and Machine Learning; 5) Ethics.
  - Elective: Students should specialize in one or more topics from the proposed list and must also select 3-4 application courses. Proposed list of specialization tracks (which can be changed or extended): 1) Advanced algorithms and optimization; 2) Security and privacy preservation; 3) Data sharing and communication over networks; 4) Applications in diverse fields.
- Teaching methodologies: Does DS education require different teaching methodologies than in regular engineering/CS/math education? Do we need to extend the availability of online courses, flipped classes, hands-on, group work, projects? (If time permits)
- Topics for graduate (MSc and PhD level) studies in the field? (If time permits) Advanced studies are research-oriented, and the list of topics should reflect the research activities of the department. This may include: 1) Audio processing; 2) Image Processing and computer graphics; 3) Machine learning theory; 4) Deep learning; 5) Natural language processing; 6) Multi-modal and multi-sensor processing; 7) Bio-medical processing; 8) Networks and communications
Sharon Gannot is a Professor at the Faculty of Engineering, Bar-Ilan University, Israel, where he heads the Data Science Program. He also serves as the Faculty Vice Dean and the Deputy Director of the Data Science Institute. Dr. Gannot has taken on many leadership roles in the scientific community, including membership in the IEEE Signal Processing Society (SPS) Conferences, Technical Directions, and Education Boards; chairing the Audio and Acoustic Signal Processing (AASP) technical committee; serving as Associate Editor and Senior Area Chair for several journals, including IEEE Transactions on Audio, Speech, and Language Processing and Signal Processing Magazine; and serving as the General Co-Chair for IWAENC 2010 and WASPAA 2013. Dr. Gannot will be the general co-chair of Interspeech, to be held in Jerusalem in 2024. Currently, he is serving as the Chair of the IEEE SPS Data Science Initiative, a member of the IEEE-SPS Education Board, a member of the IEEE-SPS Education Center Editorial Board, and a member of the EURASIP Signal Processing for Multisensor Systems TAC.
Dr. Gannot was selected (with colleagues) to present tutorials at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2012, the European Signal Processing Conference (EUSIPCO) 2012, ICASSP 2013, EUSIPCO 2013, and EUSIPCO 2019, and was a keynote speaker at IWAENC, Aachen, Germany, 2012; the International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), Grenoble, France, 2017; the Informationstechnische Gesellschaft im VDE (ITG) Conference on Speech Communication, Oldenburg, Germany, 2018; and the Audio Analysis Workshop, Aalborg, Denmark, 2018. Dr. Gannot is the recipient of the Bar-Ilan University Outstanding Lecturer Award in 2010 and 2014 and the Bar-Ilan Rector Innovation in Research Award in 2018. He is also a co-recipient of twelve best paper awards. He is the recipient of the EURASIP Group Technical Achievement Award in 2022. Dr. Gannot is an IEEE Fellow for contributions to acoustical modeling and statistical learning in speech enhancement (Class 2021). Dr. Gannot’s research interests include statistical signal processing and machine learning. The methods he develops utilize either single-microphone or multi-microphone (ad hoc) arrays and are applied to speech enhancement, noise reduction, speaker separation and diarization, dereverberation, and speaker localization and tracking. Dr. Gannot has published 285 peer-reviewed papers in the field, co-edited 3 books, and co-authored another book.
Martin Haardt has been a Full Professor in the Department of Electrical Engineering and Information Technology and Head of the Communications Research Laboratory at Ilmenau University of Technology, Germany, since 2001. After studying electrical engineering at the Ruhr-University Bochum, Germany, and at Purdue University, USA, he received his Diplom-Ingenieur (M.S.) degree from the Ruhr-University Bochum in 1991 and his Doktor-Ingenieur (Ph.D.) degree from Munich University of Technology in 1996. In 1997 he joined Siemens Mobile Networks in Munich, Germany, where he was responsible for strategic research for third generation mobile radio systems. From 1998 to 2001 he was the Director for International Projects and University Cooperations in the mobile infrastructure business of Siemens in Munich, where his work focused on mobile communications beyond the third generation. During his time at Siemens, he also taught in the international Master of Science in Communications Engineering program at Munich University of Technology.
In 2018, Martin Haardt was named an IEEE Fellow “for contributions to multi-user MIMO communications and tensor-based signal processing.” He has received the 2009 Best Paper Award from the IEEE Signal Processing Society, the Vodafone (formerly Mannesmann Mobilfunk) Innovations-Award for outstanding research in mobile communications, the ITG best paper award from the Association of Electrical Engineering, Electronics, and Information Technology (VDE), and the Rohde & Schwarz Outstanding Dissertation Award.
His research interests include wireless communications, array signal processing, high-resolution parameter estimation, as well as tensor-based signal processing.
Prof. Haardt has served as a Senior Editor for the IEEE Journal of Selected Topics in Signal Processing (since 2019), as an Associate Editor for the IEEE Transactions on Signal Processing (2002-2006 and 2011-2015), the IEEE Signal Processing Letters (2006-2010), the Research Letters in Signal Processing (2007-2009), the Hindawi Journal of Electrical and Computer Engineering (since 2009), the EURASIP Signal Processing Journal (2011-2014), and as a guest editor for the EURASIP Journal on Wireless Communications and Networking.
From 2011 until 2019 he was an elected member of the Sensor Array and Multichannel (SAM) technical committee of the IEEE Signal Processing Society, where he served as the Vice Chair (2015–2016), Chair (2017–2018), and Past Chair (2019). Since 2020, he has been an elected member of the Signal Processing Theory and Methods (SPTM) technical committee of the IEEE Signal Processing Society.
Moreover, he has served as the technical co-chair of PIMRC 2005 in Berlin, Germany, ISWCS 2010 in York, UK, the European Wireless 2014 in Barcelona, Spain, as well as the Asilomar Conference on Signals, Systems, and Computers 2018, USA, and as the general co-chair of WSA 2013 in Stuttgart, Germany, ISWCS 2013 in Ilmenau, Germany, CAMSAP 2013 in Saint Martin, French Antilles, WSA 2015 in Ilmenau, SAM 2016 in Rio de Janeiro, Brazil, CAMSAP 2017 in Curacao, Dutch Antilles, SAM 2020 in Hangzhou, China, as well as the Asilomar Conference on Signals, Systems, and Computers 2021, USA.
Zheng-Hua Tan is a Professor in the Department of Electronic Systems and a Co-Head of the Centre for Acoustic Signal Processing Research at Aalborg University, Aalborg, Denmark. He is also a Co-Lead of the Pioneer Centre for AI, Denmark. He was a Visiting Scientist at the Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, USA, an Associate Professor at the Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China, and a postdoctoral fellow at the AI Laboratory, KAIST, Daejeon, Korea. His research interests are centred around deep representation learning and generally include machine learning, deep learning, speech and speaker recognition, noise-robust speech processing, and multimodal signal processing. He is the Chair of the IEEE Signal Processing Society Machine Learning for Signal Processing Technical Committee (MLSP TC). He serves on the Conferences Board and the Technical Directions Board of the IEEE Signal Processing Society. He is an Associate Editor for the IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING. He has served as an Associate/Guest Editor for several other journals. He was the General Chair for IEEE MLSP 2018 and a TPC Co-Chair for IEEE SLT 2016.
Nancy F. Chen received her Ph.D. from MIT and Harvard in 2011. She worked at MIT Lincoln Laboratory on her Ph.D. research in multilingual speech processing. She is now leading research efforts in conversational AI and natural language generation with applications related to education, healthcare, journalism, and defense at the Institute for Infocomm Research (I2R), A*STAR. Speech evaluation technology developed by her team has been deployed at the Ministry of Education in Singapore to support home-based learning. Dr. Chen led a cross-continent team working on low-resource spoken language processing, which was one of the top performers in the NIST Open Keyword Search Evaluations (2013-2016), funded by the IARPA Babel program.
Dr. Chen has received numerous awards, including Singapore 100 Women in Tech (2021), Young Scientist Award at MICCAI (2021), Best Paper Award at SIGDIAL (2021), the 2020 P&G Connect + Develop Open Innovation Award, the 2019 L'Oréal UNESCO Singapore For Women in Science National Fellowship, Best Paper at APSIPA ASC (2016), Singapore MOE (Ministry of Education) Outstanding Mentor Award (2012), the Microsoft-sponsored IEEE Spoken Language Processing Grant (2011), Outstanding Paper at ICASSP (2011), and the NIH (National Institutes of Health) Ruth L. Kirschstein National Research Award (2004-2008).
Dr. Chen currently serves on the ISCA (International Speech Communication Association) Board (2021-2025) and is a senior IEEE member (2015-present). She has been an elected member of the IEEE Speech and Language Technical Committee (2016-2018, 2019-2021), senior area editor of Signal Processing Letters (2021-2022), and associate editor of IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020-2023), Neurocomputing (2020-2021), and IEEE Signal Processing Letters (2019-2021). She was the guest editor for the special issue on “End-to-End Speech and Language Processing” in the IEEE Journal of Selected Topics in Signal Processing (2017).
Dr. Chen has also consulted for various companies ranging from startups to multinational corporations in the areas of emotional intelligence (Cogito Health), speech recognition (Vlingo, acquired by Nuance), climate change (normal, an early-stage social impact startup), EdTech (NovoLearning), and defense/aerospace (BAE Systems).
Hoi-To Wai received his PhD degree in Electrical Engineering from Arizona State University (ASU) in Fall 2017, and his B.Eng. (with First Class Honors) and M.Phil. degrees in Electronic Engineering from The Chinese University of Hong Kong (CUHK) in 2010 and 2012, respectively. He is an Assistant Professor in the Department of Systems Engineering & Engineering Management at CUHK. He has held research positions at ASU, UC Davis, Telecom ParisTech, Ecole Polytechnique, and LIDS, MIT.
Hoi-To's research interests are in the broad area of signal processing, machine learning, and distributed optimization, with a focus on their applications to network science. His dissertation received the 2017 Dean's Dissertation Award from the Ira A. Fulton Schools of Engineering of ASU, and he is a recipient of a Best Student Paper Award at ICASSP 2018.
Dr. Ivan Tashev received his Diploma Engineer degree (master’s equivalent) in Electronic Engineering and his PhD in Computer Science from the Technical University of Sofia, Bulgaria, in 1984 and 1990, respectively. He was an assistant professor at the same university, teaching “Data and Signal Processing” and “Programming of Real-time Systems”, when he joined Microsoft in 1998. Currently, Ivan Tashev is a Partner Software Architect and leads the Audio and Acoustics Research Group at Microsoft Research, Redmond, WA, USA. His research interests include audio signal processing, machine learning, multichannel transducers, and bio-signal processing. He also coordinates the Brain-Computer Interfaces project at MSR. Dr. Tashev has published two books, two book chapters, and 100+ scientific papers, and is listed as an inventor on 50 US patents. He is an affiliate professor at the University of Washington in Seattle and an honorary professor at the Technical University of Sofia, Bulgaria. Dr. Tashev transferred algorithms to the RoundTable device, Windows, and the Microsoft Auto platform, and served as the audio architect of Kinect for Xbox and of HoloLens. He is an IEEE Fellow and a member of AES and ASA. More details about him can be found on his web page https://www.microsoft.com/en-us/research/people/ivantash/.
Wed, 25 May, 06:00 - 07:30 UTC
Moderator: C.-C. Jay Kuo, University of Southern California, USA
Panelists:
- Xilin Chen, Chinese Academy of Sciences, China
- Weisi Lin, Nanyang Technological University, Singapore
- Shan Liu, Tencent America, USA
- Yi Ma, University of California at Berkeley, USA
- Helen Meng, Chinese University of Hong Kong, Hong Kong
Description
Machine learning has played an increasingly important role in modern signal processing. It has been widely used to solve multimedia problems, including audio, speech, image and video, graphics, 3D point clouds, etc. The data-driven methodology is expected to continue to grow. Many powerful data-driven solutions are based on deep learning. We have seen the impact of deep learning in numerous conferences and journals. There is, however, a huge gap between deep learning and classic signal processing disciplines. The former is a black box with a large model size. It is computationally expensive and data hungry. The latter is a white box with a smaller model size. It is computationally efficient and can be easily adapted to smaller datasets. It would be desirable to find a way to bridge the two. In this panel, we invite world-leading experts to express their opinions on a couple of key questions, such as: Is there a role for classical signal processing to play in the machine learning era? Will there be non-deep-learning-based alternatives in machine learning?
Discussion Questions
- Big data and machine learning have attracted a lot of attention in the last decade. Through the construction of large datasets, many difficult problems in various domains such as natural language processing (NLP), computer vision (CV), and computer graphics (CG) can be greatly simplified. Do you see any problem in proceeding along this direction? What are the limitations of this data-driven methodology?
- Deep learning is the dominant tool in machine learning and is widely used in acoustics, signal, speech, and multimedia processing nowadays. Some junior researchers may have doubts about the value of classical signal processing training (e.g., linear algebra, probability, etc.). In your opinion, is classical signal processing still valuable? How can it contribute to modern machine learning? Is it possible to find learning-based substitutes without following the deep learning paradigm?
- What are future R&D opportunities and/or directions in the interplay of machine learning, signal processing and multimedia computing? Some concrete examples would be helpful.
- What advice would you offer to junior students, researchers, and engineers majoring in signal processing and multimedia so that they can be better prepared for the job market and/or an academic career?
C.-C. Jay Kuo is the holder of the William M. Hogue Professorship in Electrical and Computer Engineering, a Distinguished Professor of Electrical and Computer Engineering and Computer Science, and the Director of the USC Multimedia Communication Laboratory (MCL) at the University of Southern California. His research activities lie in multimedia and green computing. He has received several awards for his research contributions, including the 2019 IEEE Computer Society Edward J. McCluskey Technical Achievement Award, the 2019 IEEE Signal Processing Society Claude Shannon-Harry Nyquist Technical Achievement Award, the 2020 IEEE TCMC Impact Award, the 72nd annual Technology and Engineering Emmy Award (2020), and the 2021 IEEE Circuits and Systems Society Charles A. Desoer Technical Achievement Award. Dr. Kuo has guided 161 students to their PhD degrees and supervised 31 postdoctoral research fellows. He is listed as the top advisor in the Mathematics Genealogy Project in terms of the number of supervised PhD students. He is the recipient of the 2017 IEEE Leon K. Kirchmayer Graduate Teaching Award. Dr. Kuo is a Fellow of NAI, AAAS, IEEE and SPIE.
Xilin Chen is currently a Professor with the Institute of Computing Technology, Chinese Academy of Sciences (CAS). He has authored one book and more than 300 papers in refereed journals and proceedings in the areas of computer vision, pattern recognition, image processing, and multimodal interfaces. He is a fellow of ACM, IAPR, IEEE and CCF. He was a recipient of several awards, including China’s State Natural Science Award in 2015, and China’s State S&T Progress Award in 2000, 2003, 2005, and 2012. He served as an Organizing Committee member for many conferences, including as General Co-Chair of FG13, FG18, and VCIP 2022. He is or has been an Area Chair of CVPR, ICCV, and ECCV. He is or has been an Associate Editor of the IEEE Transactions on Image Processing and the IEEE Transactions on Multimedia, a Senior Associate Editor of the Journal of Visual Communication and Image Representation, a Leading Editor of the Journal of Computer Science and Technology, and an Associate Editor-in-Chief of the Chinese Journal of Computers and the Chinese Journal of Pattern Recognition and Artificial Intelligence.
Weisi Lin received the bachelor’s degree in electronics and the master’s degree in digital signal processing from Sun Yat-sen University, Guangzhou, China, and the Ph.D. degree in computer vision from King’s College London, U.K. He is currently a Professor with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. His research interests include image processing, perceptual modeling, video compression, multimedia communication, and computer vision. He is a fellow of IEEE and IET, an honorary fellow of the Singapore Institute of Engineering Technologists, and a Chartered Engineer in the U.K. He was the Chair of the IEEE MMTC Special Interest Group on Quality of Experience. He served as a Lead Guest Editor for a Special Issue on Perceptual Signal Processing for the IEEE Journal of Selected Topics in Signal Processing in 2012. He has also served or serves as an Associate Editor for IEEE Transactions on Image Processing, IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Multimedia, IEEE Signal Processing Letters, and Journal of Visual Communication and Image Representation. He was a Distinguished Lecturer of the IEEE Circuits and Systems Society in 2016–2017. He was named a Highly Cited Researcher in 2019, 2020 and 2021 by Clarivate Analytics.
Shan Liu received the B.Eng. degree in electronic engineering from Tsinghua University and the M.S. and Ph.D. degrees in electrical engineering from the University of Southern California. She is currently a Tencent Distinguished Scientist, General Manager of Tencent Media Lab and General Manager, Platform Technologies of Tencent Online Video. She was formerly Director of Media Technology Division at MediaTek USA. She was also formerly with MERL and Sony. She has been an active contributor to international standards for more than a decade and has had numerous technical proposals adopted into various standards, such as VVC, HEVC, OMAF, DASH, MMT, and PCC. She holds more than 400 granted U.S. patents. She served as an Editor of the H.265/HEVC SCC and H.266/VVC standards. She received the Best AE Award from IEEE Transactions on Circuits and Systems for Video Technology in 2019 and 2020. She has been the Vice Chair of IEEE Data Compression Standards Committee since 2019. She was named the APSIPA Distinguished Industry Leader in 2018. Dr. Liu is a Fellow of IEEE.
Yi Ma is a Professor at the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. His research interests include computer vision, high-dimensional data analysis, and intelligent systems. Yi received his Bachelor’s degrees in Automation and Applied Mathematics from Tsinghua University in 1995, two Master’s degrees in EECS and Mathematics in 1997, and a PhD degree in EECS from UC Berkeley in 2000. He was on the faculty of UIUC ECE from 2000 to 2011, the principal researcher and manager of the Visual Computing group of Microsoft Research Asia from 2009 to 2014, and the Executive Dean of the School of Information Science and Technology of ShanghaiTech University from 2014 to 2017. He then joined the faculty of UC Berkeley EECS in 2018. He has published about 60 journal papers, 120 conference papers, and three textbooks in computer vision, generalized principal component analysis, and high-dimensional data analysis. He received the NSF Career award in 2004 and the ONR Young Investigator award in 2005. He also received the David Marr prize in computer vision from ICCV 1999 and best paper awards from ECCV 2004 and ACCV 2009. He served as the Program Chair for ICCV 2013 and the General Chair for ICCV 2015. He is a Fellow of IEEE, ACM, and SIAM.
Helen Meng received the B.S., M.S., and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology. She is currently Chair Professor of the Department of Systems Engineering and Engineering Management at The Chinese University of Hong Kong. In 2019, her inter-disciplinary research team was awarded the first HKSAR Government RGC Theme-based Research Project on Artificial Intelligence. In 2020, she helped establish the CUHK-led Centre for Perceptual and Interactive Intelligence at the Hong Kong Science & Technology Park. She is Chair of Curriculum Development in the CUHK-JC AI4Future Project, which has developed and published (in 2021) the first comprehensive pre-tertiary AI education curriculum that is being taught across Hong Kong. She was formerly Department Chairman and Associate Dean of Research with the CUHK Faculty of Engineering. Her research interests include human–computer interaction via multimodal and multilingual spoken language systems, and spoken language processing to support learning, digital health and wellbeing. She was Editor-in-Chief (2009-2011) of the IEEE Transactions on Audio, Speech and Language Processing and is the recipient of the 2019 IEEE Signal Processing Society Leo L. Beranek Meritorious Service Award. She has served on the International Speech Communication Association (ISCA) Board and International Advisory Council. Prof. Helen Meng is a Fellow of IEEE and ISCA.
Thu, 26 May, 00:00 - 01:30 UTC
Moderators: Mingyi Hong, University of Minnesota, USA; Anthony Kuh, University of Hawaii, USA
Panelists:
- Soummya Kar, Carnegie Mellon University, USA
- H. Vincent Poor, Princeton University, USA
- Michael Rabbat, Meta Platforms Inc, USA
- Anna Scaglione, Cornell University, USA
- Alex Sprintson, Texas A&M University, USA
Description
Advances in hardware and a proliferation of applications for edge devices and systems (mobile phones, sensor networks, IoT devices) have led to more processing and learning at the edge. In particular, Federated Learning (FL), where data gathering and learning take place at the edge devices and systems, has become a key research area in both academia and industry. Additional concerns that Federated Learning addresses include privacy, communications, heterogeneous systems, and heterogeneous data. This panel addresses signal and information processing advances for Federated Learning. Some of the issues that will be addressed include key technical innovations that drive FL research, FL use cases in industry, emerging applications in FL, and future directions of FL research.
Mingyi Hong is an Associate Professor in the Department of Electrical and Computer Engineering, University of Minnesota. Currently, he is serving on the IEEE Signal Processing for Communications and Networking (SPCOM) technical committee, and as an Associate Editor for IEEE Transactions on Signal Processing. His research interests are in optimization theory and its applications in information processing and machine learning. He is the recipient of a Facebook Research Award and an IBM Faculty Research Award. He has received several best (student) paper awards, including an IEEE Signal Processing Society Best Paper Award (2021), an International Consortium of Chinese Mathematicians Best Paper Award (2020), and a Best Student Paper Award from the NeurIPS Workshop on Scalability, Privacy, and Security in Federated Learning.
Anthony Kuh received his B.S. in Electrical Engineering and Computer Science at the University of California, Berkeley in 1979, an M.S. in Electrical Engineering from Stanford University in 1980, and a Ph.D. in Electrical Engineering from Princeton University in 1987. He previously worked at AT&T Bell Laboratories and has been on the faculty in Electrical Engineering at the University of Hawai’i since 1986. He is currently a Professor in the Department and previously served as Department Chair. His research is in the area of neural networks and machine learning, adaptive signal processing, sensor networks, and renewable energy and smart grid applications. He won a National Science Foundation Presidential Young Investigator Award and is an IEEE Fellow. From 2017 to 2021, he served as a program director at NSF in the Electrical, Communications, and Cyber Systems (ECCS) division, working in the Energy, Power, Control, and Network (EPCN) group. At NSF he also assisted in initiatives including Harnessing the Data Revolution (HDR), the Mathematics of Deep Learning (MoDL), the AI Institutes, Cyber Physical Systems (CPS), and Smart and Connected Communities. He previously served on the Awards Board of the IEEE Signal Processing Society and is President of the Asia Pacific Signal and Information Processing Association.
Soummya Kar received a B.Tech. in electronics and electrical communication engineering from the Indian Institute of Technology, Kharagpur, India, in May 2005 and a Ph.D. in electrical and computer engineering from Carnegie Mellon University, Pittsburgh, PA, in 2010. From June 2010 to May 2011, he was with the Electrical Engineering Department, Princeton University, Princeton, NJ, USA, as a Postdoctoral Research Associate. He is currently a Professor of Electrical and Computer Engineering at Carnegie Mellon University, Pittsburgh, PA, USA. His research interests include decision-making in large-scale networked systems, stochastic systems, multi-agent systems and data science, with applications to cyber-physical systems and smart energy systems. Recent recognition of his work includes the 2016 O. Hugo Schuck Best Paper Award from the American Automatic Control Council and a 2016 Dean's Early Career Fellowship from CIT, Carnegie Mellon. He is a fellow of the IEEE.
H. Vincent Poor is the Michael Henry Strater University Professor at Princeton University, where his interests include information theory, machine learning and network science, and their applications in wireless networks, energy systems, and related areas. His publications in these areas include the forthcoming book Machine Learning and Wireless Communications (Cambridge University Press). Dr. Poor is a Member of the U.S. National Academy of Engineering and the U.S. National Academy of Sciences, and a foreign member of the Royal Society and other national and international academies. He received the IEEE Alexander Graham Bell Medal in 2017.
Michael Rabbat is a Research Scientist and Manager in FAIR, the fundamental AI research group of Meta Platforms Inc. He received the BSc degree from the University of Illinois, Urbana-Champaign, the MSc degree from Rice University, and the PhD from the University of Wisconsin, Madison, all in electrical engineering. From 2007 to 2018 he was a professor at McGill University, and he has held visiting positions at IMT-Atlantique, Brest, France, the Inria Bretagne-Atlantique Research Center, Rennes, France, and KTH Royal Institute of Technology, Stockholm, Sweden. His research interests include optimization for machine learning, large-scale and distributed optimization, and federated learning. He is a Senior Member of the IEEE, and he has served on the editorial boards of IEEE Signal Processing Letters, IEEE Transactions on Signal and Information Processing Over Networks, and IEEE Transactions on Control of Network Systems.
Anna Scaglione (M.Sc. '95, Ph.D. '99) is currently a professor in electrical and computer engineering at Cornell Tech, the New York City campus of Cornell University. Prior to that, she held faculty positions at Arizona State University, the University of California at Davis, Cornell University (the first time), and the University of New Mexico. She has been an IEEE Fellow since 2011 and received the 2013 IEEE Donald G. Fink Prize Paper Award, the 2000 IEEE Signal Processing Transactions Best Paper Award, and the NSF CAREER grant (2002). She is a co-recipient, with her students, of several best student paper awards at conferences and received the 2013 IEEE Signal Processing Society Young Author Best Paper Award with one of her PhD students. She was a Distinguished Lecturer of the Signal Processing Society in 2019 and 2020, when most of her travel was cut short by the pandemic. Dr. Scaglione's expertise and research cover theoretical and applied problems in statistical signal processing, communications, optimization theory, and cyber-physical systems.
Dr. Alex Sprintson joined NSF as a rotating Program Director in September 2018, in the Directorate of Computer & Information Science and Engineering (CISE). He manages networking research within the Networking Technologies and Systems (NeTS) and Secure and Trustworthy Cyberspace (SaTC) programs. Alex Sprintson is a faculty member in the Department of Electrical and Computer Engineering, Texas A&M University, College Station, where he conducts research on wireless network coding, distributed storage, and software-defined networks. Dr. Sprintson received the Wolf Award for Distinguished Ph.D. students, the Viterbi Postdoctoral Fellowship, the TAMU College of Engineering Outstanding Contribution Award, and the NSF CAREER award. From 2013 to 2019 he served as an Associate Editor of the IEEE Transactions on Wireless Communications. He has been a member of the Technical Program Committee for IEEE Infocom 2006-2023.
Thu, 26 May, 07:30 - 09:00 UTC
Moderator: Kong Aik LEE, Institute for Infocomm Research, A*STAR, Singapore
Panelists:
- Emmanuel Vincent, Inria, France
- Tomi Kinnunen, University of Eastern Finland, Finland
- Junichi Yamagishi, National Institute of Informatics, Japan
- Oldrich Plchot, Brno University of Technology, Czech Republic
- Rohan Kumar Das, Fortemedia, Singapore
Description
Speech is among the most natural and convenient means of biometric authentication. The individual traits embedded in speech signals form the basis of speaker recognition, or voice authentication. With the widespread availability of speech synthesis tools, the threat from spoofing attacks to speaker recognition systems is growing, since fraudsters can use these tools to produce natural-sounding speech of a victim. While research on speech anti-spoofing has seen significant progress in the past few years, privacy concerns have created a need for speech anonymization. In this panel, we invite world-leading experts to share their opinions on the security and privacy aspects of handling individual traits in speech, the challenges posed by advances in neural speech synthesizers, and the collaborative efforts that could be put together to address these concerns and challenges.
Format
- 10 minutes of introduction (moderator)
- 30 minutes of presentation (panelists)
- 50 minutes of open discussion
- Call for contribution to ASVspoof5
Kong Aik Lee is currently a Senior Scientist at the Institute for Infocomm Research, A*STAR, Singapore. He was a Senior Principal Researcher at the Data Science Research Laboratories, NEC Corporation, Japan, from 2018 to 2020. He received his Ph.D. degree from Nanyang Technological University, Singapore, in 2006. He then joined the Institute for Infocomm Research, Singapore, as a Research Scientist and then a Strategic Planning Manager (concurrent appointment). He was the recipient of the Singapore IES Prestigious Engineering Achievement Award 2013 for his contribution to voice biometrics technology, the Outstanding Service Award by IEEE ICME 2020, and the 2021 A*STAR CRF (UIBR) Award. He was the Lead Guest Editor for the CSL Special Issue on “Two decades into Speaker Recognition Evaluation - are we there yet?” Currently, he serves as an Editorial Board Member for Elsevier Computer Speech and Language (2016 - present) and was an Associate Editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing (2017 - 2021). He is an elected member of the IEEE Speech and Language Processing Technical Committee (2019 - 2021, 2022 - 2024) and was the General Chair of the Speaker Odyssey 2020 Workshop. His research focuses on the automatic and para-linguistic analysis of speaker characteristics, including speaker recognition, language and accent recognition, diarization, voice biometrics, spoofing, and countermeasures.
Emmanuel Vincent received the Ph.D. degree in music signal processing from IRCAM in 2004 and joined Inria, the French national research institute for digital science and technology, in 2006. He is currently a Senior Research Scientist and the Head of Science of Inria Nancy - Grand Est. His research covers several speech and audio processing tasks, with a focus on privacy preservation, learning from little or no labeled data, source separation and speech enhancement, and robust speech and speaker recognition. He is a founder of the MIREX, SiSEC, CHiME, and VoicePrivacy challenge series. He is a scientific advisor of the startup company Nijta, which provides speech anonymization solutions.
Tomi H. Kinnunen is a Professor at the University of Eastern Finland. He received his Ph.D. degree in computer science from the University of Joensuu in 2005. From 2005 to 2007, he was an Associate Scientist at the Institute for Infocomm Research (I2R), Singapore. Since 2007, he has been with UEF. From 2010 to 2012, he was funded by a postdoctoral grant from the Academy of Finland. He has been a PI or co-PI in three other large Academy of Finland-funded projects and a partner in the H2020-funded OCTAVE project. He chaired the Odyssey workshop in 2014. From 2015 to 2018, he served as an Associate Editor for IEEE/ACM Trans. on Audio, Speech, and Language Processing and from 2016 to 2018 as a Subject Editor in Speech Communication. In 2015 and 2016, he visited the National Institute of Informatics, Japan, for 6 months under a mobility grant from the Academy of Finland, with a focus on voice conversion and spoofing. Since 2017, he has been Associate Professor at UEF, where he leads the Computational Speech Group. He is one of the cofounders of the ASVspoof challenge, a nonprofit initiative that seeks to evaluate and improve the security of voice biometric solutions under spoofing attacks.
Junichi Yamagishi is a professor at the National Institute of Informatics in Japan. He is also a senior research fellow in the Centre for Speech Technology Research (CSTR) at the University of Edinburgh, UK. He was awarded a Ph.D. by the Tokyo Institute of Technology in 2006 for a thesis that pioneered speaker-adaptive speech synthesis and was awarded the Tejima Prize as the best Ph.D. thesis at the Tokyo Institute of Technology in 2007. Since 2006, he has authored and co-authored over 250 refereed papers in international journals and conferences. He was awarded the Itakura Prize from the Acoustical Society of Japan, the Kiyasu Special Industrial Achievement Award from the Information Processing Society of Japan, the Young Scientists’ Prize from the Minister of Education, Science and Technology, the JSPS Prize, and the Docomo mobile science award in 2010, 2013, 2014, 2016, and 2018, respectively. He served previously as a co-organizer for the bi-annual ASVspoof special sessions at INTERSPEECH 2013-9, the bi-annual Voice conversion challenge at INTERSPEECH 2016 and Odyssey 2018, an organizing committee member for the 10th ISCA Speech Synthesis Workshop 2019, and a technical program committee member for IEEE ASRU 2019. He also served as a member of the IEEE Speech and Language Technical Committee, as an Associate Editor of the IEEE/ACM TASLP, and as the Lead Guest Editor for the IEEE JSTSP SI on Spoofing and Countermeasures for Automatic Speaker Verification. He is currently a PI of the JST-CREST and ANR supported VoicePersonae project. He also serves as a chairperson of ISCA SynSIG and as a Senior Area Editor of the IEEE/ACM TASLP.
Oldrich Plchot (Ing. [MS], Brno University of Technology, 2007; Ph.D., Brno University of Technology, 2014) is a senior researcher in the BUT Speech@FIT research group. He worked on the EU-sponsored project MOBIO (7th FP) as well as on several projects sponsored at the local Czech level. He was the technical lead of the US Air Force EOARD-sponsored project “Improving the capacity of language recognition systems to handle rare languages using radio broadcast data”, and a key member of personnel in the BEST project and the RATS Patrol project, sponsored by U.S. IARPA and DARPA respectively. He participated in several high-profile international research workshops: BOSARIS, held in Brno in 2010 and 2012, and the Johns Hopkins University (MD, USA) summer research workshop in 2013. He significantly contributed to the success of the BUT team in international evaluations organized by NIST (Speaker recognition since 2010, Language recognition since 2007) as well as in evaluations organized within IARPA and DARPA projects. He has authored or co-authored more than 50 papers, including in IEEE Transactions on Audio, Speech, and Language Processing and at high-profile conferences such as ICASSP and Interspeech. He is a recipient of the 2016 “Josef Hlávka Prize”, awarded to the most talented PhD students and young researchers of Czech technical universities.
Rohan Kumar Das is currently a Research and Development (R&D) Manager at Fortemedia, Singapore division. Prior to that he was associated with the National University of Singapore as a Research Fellow from 2017 to 2021 and with KOVID Research Labs, India, as a Data Scientist in 2017. He is a Ph.D. graduate of the Indian Institute of Technology (IIT) Guwahati. He was one of the organizers of the special sessions on “The Attacker’s Perspective on Automatic Speaker Verification” and “Far-Field Speaker Verification Challenge 2020” at Interspeech 2020, and of the Voice Conversion Challenge 2020. He served as Publication Chair of the IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop 2019 and as one of the Chairs of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020. He is a Senior Member of IEEE and a member of ISCA and APSIPA. His research interests are speech/audio signal processing, speaker verification, anti-spoofing, social signal processing and various applications of deep learning.
Fri, 27 May, 06:00 - 08:00 UTC
Moderator: Athina Petropulu, Rutgers, The State University of New Jersey, USA
Panelists:
- Min Wu, University of Maryland, USA
- Helen Meng, Chinese University of Hong Kong, Hong Kong
- Shrikanth Narayanan, University of Southern California, USA
- Mike Polley, Samsung, USA
- Shuicheng Yan, Sea AI Lab (SAIL), Singapore
- Ana Pérez-Neira, Centre Tecnològic de Telecomunicacions de Catalunya, Spain
Description
Athina Petropulu is a Distinguished Professor of Electrical and Computer Engineering at Rutgers University. Her interests include radar signal processing and PHY security. She received the Presidential Faculty Fellow Award (1995) from NSF and the U.S. White House, and the 2012 IEEE Signal Processing Society (SPS) Meritorious Service Award. She is an AAAS Fellow. She is co-author of papers that received the 2005 IEEE Signal Processing Magazine Best Paper Award, the 2020 IEEE Signal Processing Society Young Author Best Paper Award (B. Li), the 2021 IEEE Signal Processing Society Young Author Best Paper Award (F. Liu), and the 2021 Aerospace and Electronic Systems Society Barry Carlton Best Paper Award. She is currently President-Elect of the IEEE Signal Processing Society.
Min Wu is a Professor of Electrical and Computer Engineering and a Distinguished Scholar-Teacher at the University of Maryland, College Park. She is currently serving as Associate Dean for Graduate Affairs for the University’s Clark School of Engineering. She received her Ph.D. degree in electrical engineering from Princeton University in 2001. At UMD, she leads the Media, Analytics, and Security Team (MAST), with main research interests in information security and forensics, multimedia signal processing, and applications of data science and machine learning in health and IoT. Dr. Wu was elected as IEEE Fellow, AAAS Fellow, and Fellow of the National Academy of Inventors. She chaired the IEEE Technical Committee on Information Forensics and Security, and has served as Vice President - Finance of the IEEE Signal Processing Society and Editor-in-Chief of the IEEE Signal Processing Magazine.
Shrikanth Narayanan is University Professor and Niki & C. L. Max Nikias Chair in Engineering at the University of Southern California, where he is Professor of Electrical & Computer Engineering, Computer Science, Linguistics, Psychology, Neuroscience, Pediatrics, and Otolaryngology—Head & Neck Surgery, Director of the Ming Hsieh Institute and Research Director of the Information Sciences Institute. Prior to USC he was with AT&T Bell Labs and AT&T Research. He is a Fellow of the National Academy of Inventors, the Acoustical Society of America, IEEE, ISCA, the American Association for the Advancement of Science, the Association for Psychological Science, and the American Institute for Medical and Biological Engineering. He is presently VP for Education for the IEEE Signal Processing Society. He has received several honors including the 2015 Engineers Council’s Distinguished Educator Award, a Mellon award for mentoring excellence, the 2005 and 2009 Best Transactions Paper awards from the IEEE Signal Processing Society and serving as its Distinguished Lecturer for 2010-11, a 2018 ISCA CSL Best Journal Paper award, and serving as an ISCA Distinguished Lecturer for 2015-16, Willard R. Zemlin Memorial Lecturer for ASHA in 2017, and the Ten Year Technical Impact Award in 2014 and the Sustained Accomplishment Award in 2020 from ACM ICMI. He has published over 900 papers and has been granted eighteen U.S. patents. [https://sail.usc.edu/people/shri.html].
Helen Meng received the B.S., M.S., and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology. She is currently Chair Professor of the Department of Systems Engineering and Engineering Management at The Chinese University of Hong Kong. In 2019, her inter-disciplinary research team was awarded the first HKSAR Government RGC Theme-based Research Project on Artificial Intelligence. In 2020, she helped establish the CUHK-led Centre for Perceptual and Interactive Intelligence at the Hong Kong Science & Technology Park. She is Chair of Curriculum Development in the CUHK-JC AI4Future Project, which has developed and published (in 2021) the first comprehensive pre-tertiary AI education curriculum that is being taught across Hong Kong. She previously served as Department Chairman and as Associate Dean of Research with the CUHK Faculty of Engineering. Her research interests include human–computer interaction via multimodal and multilingual spoken language systems, and spoken language processing to support learning, digital health and wellbeing. She is a former Editor-in-Chief (2009-2011) of the IEEE Transactions on Audio, Speech and Language Processing and a recipient of the 2019 IEEE Signal Processing Society Leo L. Beranek Meritorious Service Award. She has served on the International Speech Communication Association (ISCA) Board and International Advisory Council. Prof. Helen Meng is a Fellow of IEEE and ISCA.
Mike Polley is Senior Vice President and Head of the Mobile Processor Innovation Lab at Samsung where he leads a team of world-class algorithm and system designers focused on creating advanced technologies for Samsung’s Galaxy smartphones as well as next-generation mobile devices. Prior to Samsung, Mike worked at Texas Instruments for 18 years defining chipset architectures and leading embedded signal processing R&D. He was recognized for his technical accomplishments by election to TI Fellow in 2008.
Mike received his B.S., M.S., and Ph.D. degrees in electrical engineering from MIT. He holds 41 U.S. patents on a broad range of products across communications and multimedia systems.
Dr. Yan Shuicheng is currently director of Sea AI Lab (SAIL) and group chief scientist of Sea, building SAIL from zero. He is a Fellow of the Academy of Engineering, Singapore, AAAI Fellow, ACM Fellow, IEEE Fellow, and IAPR Fellow. His research areas include computer vision, machine learning and multimedia analysis. To date, he has published over 600 papers in top international journals and conferences, with an H-index of 120+. He was among the “Thomson Reuters Highly Cited Researchers” in 2014, 2015, 2016, 2018, 2019, 2020 and 2021. Shuicheng’s team has received winner or honourable-mention prizes 10 times in two core competitions, Pascal VOC and ImageNet (ILSVRC), which are regarded as the “World Cup” of the computer vision community. His team has also won over 10 best paper or best student paper prizes, and notably a grand slam at ACM MM, the top conference in multimedia, including the Best Paper Award three times, the Best Student Paper Award twice and the Best Demo Award once.
Ana Pérez-Neira has been a full professor at Universitat Politècnica de Catalunya in the Signal Theory and Communication department since 2006 and was Vice rector for Research (2010-14). Currently, she is the Director of Centre Tecnològic de Telecomunicacions de Catalunya, Spain. Her research is in signal processing for communications, focused on satellite communications. She has more than 60 journal papers and 300 conference papers. She is co-author of 7 books. She has led more than 20 projects and holds 8 patents. She is the coordinator of the Networks of Excellence on satellite communications, financed by the European Space Agency: SatnexIV-V. She has been associate editor of the IEEE TSP and EURASIP SP and ASP. Currently she is senior area editor of IEEE OJSP. She is a member of the BoG of the IEEE SPS and Vice-President for conferences (2021-23). She is an IEEE Fellow, a EURASIP Fellow, and a member of the Royal Academy of Science and Arts of Barcelona (RACAB). She is the recipient of the 2018 EURASIP Society Award and was the general chair of IEEE ICASSP’20 (the first big virtual conference held by IEEE, with more than 15,000 attendees). In 2020, she was awarded the ICREA Academia distinction by the Catalan government.