IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

ICASSP 2022

Show & Tell Demonstrations

Sun, 8 May, 23:00 - 23:45 China Time (UTC +8)
Sun, 8 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In this demo, we present a low-rate, high-dynamic-range analog-to-digital conversion (ADC) scheme. When sampling signals through an ADC, it is typically assumed that the signal's dynamic range lies within that of the ADC. However, in many applications, such as radar and ultrasound imaging, the dynamic range of the received signal can exceed that of the ADC, which results in clipping and leads to inaccurate reconstruction. Modulo preprocessing can be used to avoid clipping and to sample signals beyond the dynamic range of the ADC. The modulo step folds the signal into the dynamic range of the ADC, and the folded signal is sampled. During reconstruction, the true samples are recovered from the folded ones by using an unfolding algorithm. Typically, unfolding algorithms operate at a much higher rate than would be required without the modulo operation. This results in a large number of bits per second (NMBPS) after sampling and quantization, which may not be suitable in many applications, as a large amount of storage or transmission bandwidth is required.
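
The folding step itself is simple to express; below is a minimal numpy sketch of the centered modulo operation, assuming a placeholder folding threshold and test signal rather than the prototype's actual parameters.

```python
import numpy as np

def modulo_fold(x, lam):
    """Fold a signal into the ADC range [-lam, lam) (centered modulo)."""
    return np.mod(x + lam, 2 * lam) - lam

# Toy example: a signal whose amplitude exceeds the assumed ADC range.
t = np.linspace(0, 1, 1000)
x = 4.0 * np.sin(2 * np.pi * 3 * t) + 2.0 * np.sin(2 * np.pi * 7 * t)
lam = 1.0                      # placeholder ADC dynamic range of [-1, 1)
folded = modulo_fold(x, lam)   # stays within the ADC range; no clipping occurs
print(x.min(), x.max(), folded.min(), folded.max())
```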

In this demo, we propose a dedicated hardware prototype that can handle high-frequency, high-amplitude input signals. Further, we propose a new algorithm, called Beyond Bandwidth Residual Recovery, so that unfolding can be performed robustly at a low sampling rate. In particular, the proposed algorithm uses the time-domain and Fourier-domain separation properties of the given finite-energy bandlimited signal. Through simulations and hardware results, we show that the proposed algorithm can operate at a lower sampling rate than existing methods. In this way, the overall NMBPS is much lower than that of existing methods.

We present the hardware demonstration together with an interactive graphical user interface (GUI) during the on-site conference. In addition, we provide a poster and a video presentation of the entire demo for the virtual conference.

Related Papers:
Gupta, C., Kamath, P., & Wyse, L. (2021). Signal representations for synthesizing audio textures with generative adversarial networks. arXiv preprint arXiv:2103.07390.
Wyse, L., Kamath, P., & Gupta, C. (2021). An Integrated System Architecture for Generative Audio Modeling.

Interactive webpage for all attendees:
https://animatedsound.com/icassp2022/Trumpinet.60.76/
https://animatedsound.com/icassp2022/oreilly_grid2/

Sun, 8 May, 23:00 - 23:45 China Time (UTC +8)
Sun, 8 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

We released the Video Complexity Analyzer (VCA) version 1.0 open-source software on February 14, 2022, as a Valentine's Day gift to video coding enthusiasts across the globe. The primary objective of VCA is to be the best spatial and temporal complexity predictor for every video (segment), which aids in predicting encoding parameters for applications like scene-cut detection and online per-title encoding. VCA leverages x86 SIMD and multi-threading optimizations for effective performance. While VCA is primarily designed as a video complexity analyzer library, a command-line executable is provided to facilitate testing and development. VCA is available as an open-source library, published under the GPLv3 license.
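
As a rough illustration of what spatial and temporal complexity features look like, the following numpy/scipy sketch computes a block-DCT texture energy and a frame-difference measure. These are simplified stand-ins for illustration only, not VCA's exact feature definitions or its optimized SIMD implementation.

```python
import numpy as np
from scipy.fft import dctn

def spatial_complexity(frame, block=32):
    """Mean high-frequency DCT energy over blocks (simplified stand-in
    for a texture-energy feature; not VCA's exact definition)."""
    h, w = frame.shape
    h, w = h - h % block, w - w % block
    energies = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            coeffs = dctn(frame[i:i + block, j:j + block], norm="ortho")
            coeffs[0, 0] = 0.0                 # drop the DC term
            energies.append(np.abs(coeffs).mean())
    return float(np.mean(energies))

def temporal_complexity(prev_frame, frame):
    """Mean absolute luma difference between consecutive frames."""
    return float(np.abs(frame.astype(np.float64) - prev_frame.astype(np.float64)).mean())

# Toy usage on two random "luma" frames.
rng = np.random.default_rng(0)
f0, f1 = rng.integers(0, 256, (2, 1080, 1920))
print(spatial_complexity(f0.astype(np.float64)), temporal_complexity(f0, f1))
```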

According to the Bitmovin Video Developer Report 2021, live streaming at scale has the highest scope for innovation in video streaming services. Currently, there are no open-source implementations available that can predict video complexity for live streaming applications. In this light, we plan to demonstrate the functions of the VCA software, show the accuracy of the complexities it analyzes using heatmaps, and showcase the speed of video complexity analysis. VCA achieves an analysis speed of about 370 fps, compared to the 5 fps of the reference SITI implementation. Hence, we show that it can be used for live streaming applications.

We expect VCA to be impactful in many leading video encoding solutions in the coming years. Video complexity analysis has also proven to be an important step in applications like rate-distortion modeling and modeling QoE evaluation metrics. Fast video complexity analysis can be used in online per-title encoding schemes, which determine the optimized resolution, bitrate ladder, framerate, and other relevant encoding parameters for live streaming applications.

In the demo, attendees can watch the spatial and temporal complexity heatmaps of the test video to assess the accuracy of the predicted features. Finally, we show the shot transitions of the test video detected by VCA as a use case of the analyzed complexity features.

Mon, 9 May, 23:00 - 23:45 China Time (UTC +8)
Mon, 9 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In many emerging machine learning scenarios, intelligent mobile devices, research teams, and startup companies often need to cooperate to gain side information and improve their local learning task, but without sharing sensitive data, models, and objective tasks. This demo will show a novel framework of collaborative learning based on a recently proposed technology named "Assisted Learning" (NeurIPS spotlight presentation). In particular, the demo will provide an online web link to the ICASSP audience, from which a participant can download a Python-based GUI developed and pre-configured by our team. With this GUI, a participant can load local data (pre-installed) and initialize a real-time connection with another anonymous participant. Then, the participant can click a button and start to receive automated assistance from the others by sharing limited statistics, from which reverse engineering of local data, models, and task labels is unlikely. A participant is expected to see consistent improvement of prediction performance over time, given that the underlying participants share common interests (such as sharing features or data cases).

The novelty of this demo is two-fold. First, it shows that learning organizations may enhance their machine learning performance without leaking proprietary information to collaborators. Second, it shows the ICASSP audience how individual data scientists may use the related technology to team up with others quickly, effectively, and anonymously. We envision that the demo will have a broad impact on the signal processing community by raising significant interest in areas such as secure machine learning, decentralized signal processing, and collaborative learning algorithms. In addition, the demonstrated techniques will have numerous positive ethical and societal consequences, which match the theme of human-centric signal processing at ICASSP 2022.
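
The residual-exchange idea behind Assisted Learning can be illustrated with a toy sketch: two parties holding disjoint feature sets iteratively fit models to each other's residuals, sharing only those residual statistics. The synthetic data, the least-squares models, and the number of rounds below are placeholders, not the demo's actual protocol or GUI implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Two parties hold disjoint feature sets for the same samples; only party A holds labels.
Xa, Xb = rng.normal(size=(n, 3)), rng.normal(size=(n, 2))
y = Xa @ [1.0, -2.0, 0.5] + Xb @ [3.0, 1.5] + 0.1 * rng.normal(size=n)

def fit_and_residual(X, target):
    """Least-squares fit; only fitted values / residuals are ever shared."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    fitted = X @ coef
    return fitted, target - fitted

residual, prediction = y.copy(), np.zeros(n)
for round_ in range(5):                                  # a few assistance rounds
    fit_a, residual = fit_and_residual(Xa, residual)     # party A fits its own features
    prediction += fit_a
    fit_b, residual = fit_and_residual(Xb, residual)     # party B assists on A's residuals
    prediction += fit_b
    print(round_, float(np.mean(residual ** 2)))         # the error shrinks over rounds
```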

Mon, 9 May, 23:00 - 23:45 China Time (UTC +8)
Mon, 9 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

The emerging public awareness and government regulation of data privacy motivate new paradigms for collecting and analyzing data that are transparent and acceptable to the general public. This demo will show an entirely novel way of collecting and analyzing data, based on a recently proposed notion of data privacy named "Interval Privacy" (to appear in IEEE Transactions on Signal Processing).

In particular, the demo will provide an online link to the ICASSP audience. Each participant will be able to access a randomly generated survey form that asks privacy-sensitive questions (e.g., salary) in the form of "Is your salary higher than X?". Here, the values of X are randomly generated by our backend server (according to our specified mechanisms). Then, a participant may choose to answer or not answer a question. After a few participants (which we expect to be at least ten) have submitted their data, our backend server will immediately show an estimated population average on the webpage. The novelty of the demo is three-fold: 1. It shows how a data collector (e.g., a company or government agency) can process signals from incomplete/broken information at the individual level but still obtain accurate population-level information. 2. It shows a novel form of survey that, unlike classical surveys with fixed questions/choices, uses randomly generated questions to improve accuracy significantly. 3. Compared with current data privacy implementations that rely on the data collector to add noise to collected data, this demo shows an entirely novel human-computer interface to guarantee privacy, which is perceptible, transparent, and simple for the individuals who own the private data.
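
One simple estimator consistent with the survey mechanism described above (though not necessarily the one used by our backend) illustrates why randomly generated thresholds suffice: if the threshold X is drawn uniformly on an assumed range [0, M], the probability of a "yes" answer equals the population mean divided by M, so the mean can be estimated from the fraction of "yes" answers alone.

```python
import numpy as np

rng = np.random.default_rng(2)
M = 200_000.0                                    # assumed upper bound on salaries
true_salaries = rng.uniform(30_000, 150_000, size=1_000)

# Each participant is asked "Is your salary higher than X?" for a random threshold X.
thresholds = rng.uniform(0, M, size=true_salaries.size)
answers = true_salaries > thresholds             # only this one bit leaves each participant

# With X ~ Uniform(0, M), P(answer is yes) = E[salary] / M, so:
estimated_mean = M * answers.mean()
print(estimated_mean, true_salaries.mean())      # the two should be close
```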

The demo is expected to significantly impact signal processing communities by showing a novel way of sharing/collecting sensitive signals interactively and on-the-fly, featuring human-centric signal processing that matches the ICASSP 2022 theme.

Mon, 9 May, 23:00 - 23:45 China Time (UTC +8)
Mon, 9 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In this demo, we present task-based quantization hardware built for providing an accurate signal estimate in a multi-user wireless communication setting. Specifically, the signals transmitted by multiple users are recovered at the receiver by applying the developed task-based low-bit quantization board.

Quantization plays a critical role in digital signal processing systems, allowing the representation of a continuous-amplitude signal using a finite number of bits. However, for high-dimensional input signals, such as those in multi-user MIMO systems, accurately representing these signals requires a large number of quantization bits, incurring severe cost, power-consumption, and memory burdens. To address this challenge, we recently proposed a task-based quantization approach that guarantees the recovery of high-dimensional signals from a low-bit representation by accounting for the underlying task in the design of the quantizer [1]. A tailored analog precoder pre-processes the signal prior to quantization, making it possible to dramatically reduce the number of bits while still allowing for signal recovery.

In this demo, we present configurable quantization hardware, consisting of an analog combiner that reduces the input dimensionality and scalar quantizers with dynamically adjustable quantization bits. The developed hardware platform is then applied to multi-user signal recovery. Our demonstration platform consists of a 16x2 analog combiner and a configurable quantizer supporting 2-, 3-, 4-, and 12-bit quantization. Using a dedicated GUI, our demo will show that nearly optimal multi-user signal recovery can be achieved with a low-bit quantizer by accounting for the task.
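
A toy numpy sketch of the task-based pipeline is given below, with dimensions following the 16x2 demo setup; the pseudo-inverse combiner, the uniform quantizer, and the sign-based recovery are simple illustrative choices, not the optimized hardware designs of [1].

```python
import numpy as np

rng = np.random.default_rng(3)
n_antennas, n_users, bits = 16, 2, 3

H = rng.normal(size=(n_antennas, n_users))          # known channel (placeholder)
s = rng.choice([-1.0, 1.0], size=n_users)           # user symbols: the "task"
y = H @ s + 0.05 * rng.normal(size=n_antennas)      # 16-dimensional received signal

A = np.linalg.pinv(H)                                # 2x16 analog combiner (toy choice)
z = A @ y                                            # dimensionality reduced before the ADCs

def uniform_quantize(x, bits, dyn_range=2.0):
    """Scalar uniform quantizer over [-dyn_range, dyn_range]."""
    levels = 2 ** bits
    step = 2 * dyn_range / levels
    return np.clip(np.round(x / step), -levels // 2, levels // 2 - 1) * step

z_q = uniform_quantize(z, bits)                      # only 2 low-bit streams are digitized
s_hat = np.sign(z_q)                                 # digital recovery of the user symbols
print(s, s_hat)
```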

[1] N. Shlezinger, Y. C. Eldar, and M. R. Rodrigues, “Hardware-limited task-based quantization,” IEEE Trans. Signal Process., vol. 67, no. 20, pp. 5223–5238, 2019

Tue, 10 May, 23:00 - 23:45 China Time (UTC +8)
Tue, 10 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

You will experience the effects of broad-array binaural beamforming in real time using a hearing assistive device we have developed. The stereo input/output device consists of two earpieces wired to a control box with three tactile switches. The binaural beamforming algorithm, implemented on a Qualcomm QCC5144 general-purpose SoC, operates with low latency and provides a natural hearing experience for people with hearing loss as well as for people with normal hearing.

Imagine you are in a restaurant enjoying a conversation with a friend or loved one. You may have experienced difficulty hearing in such noisy environments. In such cases, using cues from both ears is very important for hearing. Therefore, we have developed a binaural signal processing function that preserves the information cues of both ears and incorporates them into the signal flow of a side-branch filter bank. In addition, hearing devices such as hearing aids are severely constrained in terms of computational cost due to low-latency and battery-life requirements. We address these limitations by implementing MVDR with a pre-designed fixed filter in the frequency domain. The MVDR retains the directional information of the desired sound source but distorts the sense of direction of interfering signals, making spatial separation from the desired source difficult. The MVDR-IC algorithm used in this system can be tuned with interaural coherence as a trade-off parameter and is expected to improve audibility through spatial separation of interfering signals. In the demonstration, the desired direction and the trade-off parameters can be selected via a smartphone application. To achieve low-latency WDRC and MVDR-IC, a frequency-warped filter bank and frequency-domain filtering are employed.

You can experience the effect with headphones by wearing our prototype hearing assistive device on the dummy head. We will also provide online participants with a video, recorded in advance, showing the effects of binaural beamforming.
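
The fixed beamformer at the core of the system builds on the standard MVDR solution, which can be sketched for a single frequency bin as follows; the two-microphone steering vector and noise statistics below are toy placeholders, not the filters actually designed for the device.

```python
import numpy as np

def mvdr_weights(R, d):
    """MVDR beamformer: minimize output power subject to w^H d = 1."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)

# Toy two-microphone (binaural) example at a single frequency bin.
rng = np.random.default_rng(4)
d = np.array([1.0, np.exp(-1j * 0.6)])          # steering vector toward the desired talker
noise = rng.normal(size=(2, 2000)) + 1j * rng.normal(size=(2, 2000))
R = noise @ noise.conj().T / noise.shape[1]     # estimated noise covariance
w = mvdr_weights(R, d)
print(np.abs(w.conj() @ d))                     # distortionless response: equals 1
```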

Tue, 10 May, 23:00 - 23:45 China Time (UTC +8)
Tue, 10 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

Much published work on spatial sound reproduction suggests that the goal is precise sound-field control [1]. In contrast, some perceptual studies and commercial designs demonstrate that spatial error, reverberation, or deliberate decorrelation are preferred, especially with rich captured audio scenes [2,3]. This contradiction continues on paper and in system design, whilst, unfortunately, spatial arrays are rarely available and used for exploring and experimenting with comparative rendering techniques. Using multiple pop-up, on-site 16-channel 1.5 m arrays, we will provide a direct, comparative experience of sound fields rendered with both precision and structured decoherence. Content will range from synthetic material through to high-order captured audio scenes, engaging the community of researchers, new and old, in a discussion of the 'sweet spot' and the overall goals and outcomes of sound-field reproduction work.

The demonstration will comprise a set of audio presentations suitable for a one-minute experience. For those off-site, a dummy-head recording with typical listener mobility will be made available, with the possibility of a live binaural feed during the event. ICASSP is an ideal venue to engage an enthusiastic academic audience at scale, stimulate useful discourse and potential collaborations, and help frame and steer future research. As a display of a low-cost system for perceptual spatial sound work, all work, code, audio samples, and system designs will be made accessible for replication elsewhere.

Dating back to Helmholtz, both the mathematics of theoretical wave fields and the broad space of perception as an "inductive conclusion" [4] are known to be important for volumetric sound reproduction. This proposal brings these closely together with contemporary technology in a compact and interactive demonstration. Through first-hand experience, and through the reactions of others, this will most surely be entertaining and engaging.

1. Ahrens J (2012), Analytic Methods of Sound Field Synthesis. Springer, Berlin.
2. Tucker A (2013), Perception of Soundfields: The Dirty Little Secret. 52nd AES Conference, Guildford.
3. Rumsey F (2014), Spatial Audio: Reconstructing Reality or Creating Illusion. Presentation, AES Section, Chicago.
4. Warren R.M (1968), Helmholtz on Perception. John Wiley & Sons (citing Helmholtz 1896).

Tue, 10 May, 23:00 - 23:45 China Time (UTC +8)
Tue, 10 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

For the ICASSP Show and Tell session, Elevear proposes a real-time demonstrator of an active occlusion cancellation system. Occluding an ear canal, e.g., with a headphone or hearing aid, leads to a muffled sensation of one's own voice: amplification of body-conducted sound together with attenuation of air-conducted sound gives an impression like talking under water. It is one of the biggest problems in headphones and hearing aids, where it leads to decreased acceptance by users.

Our algorithms run on a specialized digital signal processor connected to commercially available headphones with direct access to the speakers and microphones. Our technology creates a compensation signal that counteracts the occlusion effect. The major novelty of our technology is its level of naturalness and its ability to adjust to users. Specifically, the algorithm adapts to different levels of occlusion while keeping ambient sound and the own voice natural; the level of occlusion differs between headphones, between users, and between individual sounds.

The demonstration provides the community with a hands-on example of something usually shown only in graphs. It showcases the power of state-of-the-art signal processing paired with control theory. Attendees can experience and explore the occlusion effect and our proposed solution, and we hope to facilitate an open conversation about the problem and the shortcomings of available approaches. On-site, we will bring our real-time demonstrator: attendees will be able to wear the headphones and switch between different modes via an Android app. They can try out different actions creating strong and weak occlusion signals, including but not limited to talking, drinking, chewing, walking, and jumping. For online attendees, we can transmit the audio signal of an inner microphone within the ear canal.

Wed, 11 May, 23:00 - 23:45 China Time (UTC +8)
Wed, 11 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In this demo, we present our Text-To-Speech (TTS) system, based on FastSpeech, which is capable of synthesizing high-quality Spanish voices. Moreover, we will show a custom algorithm that makes it possible to use our model for voice cloning at a low computational cost. The presentation will be structured as follows:
1. General overview of our machine learning end-to-end text-to-speech algorithm.
2. Audio demos: 2.1. multimedia content generation; 2.2. voice cloning of famous voices and voice reconstruction for patients suffering from phonation pathologies.
3. Q&A.

Nowadays, very few Spanish TTS systems exist on the market, and most of them have a neutral, Spain, or Mexican accent. We focused our efforts on providing an Argentinian Spanish TTS, which feels more natural to Argentinian users and constitutes the main novelty of our project. We also experimented with accents from other Latin American countries, like Colombia and Chile, by curating a multi-accent Spanish dataset. The innovations we hope to show to the signal processing community at ICASSP are:
- Multi-accent model: our model supports a wide variety of Spanish accents (Argentina, Chile, Colombia, España, México, Perú, Puerto Rico, Uruguay, and Venezuela).
- Voice cloning: our model can be adapted to a particular speaker with less than 10 minutes of speech from the target speaker.
- Prosody control: the duration, pitch, and energy of the synthetic voice can be controlled, allowing for voice conversion (see the sketch below).

This project has a high impact on the signal processing community, as speech technologies focused on Latin American countries are scarce, and we think that this project can promote interest in developing high-quality speech synthesis technologies for countries in that region. We plan to allow interaction with our TTS through a website during the conference, so that attendees can listen to the synthesized voices in real time and play with the different available parameters.
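
The prosody-control sketch below illustrates the idea of scaling the predictions of a FastSpeech-style variance adaptor before decoding; the per-phoneme arrays and scaling factors are hypothetical placeholders, not outputs of our actual model.

```python
import numpy as np

# Placeholder per-phoneme predictions from a FastSpeech-style variance adaptor.
durations = np.array([5, 7, 4, 9], dtype=float)   # decoder frames per phoneme
pitch = np.array([180.0, 200.0, 190.0, 170.0])    # Hz
energy = np.array([0.8, 1.0, 0.9, 0.7])

def control_prosody(durations, pitch, energy, speed=1.0, pitch_shift=1.0, energy_gain=1.0):
    """Scale predicted prosody before decoding; speed < 1 slows speech down."""
    new_dur = np.maximum(1, np.round(durations / speed)).astype(int)
    return new_dur, pitch * pitch_shift, energy * energy_gain

# E.g., 20% slower speech with a slightly higher pitch and unchanged energy.
print(control_prosody(durations, pitch, energy, speed=0.8, pitch_shift=1.1))
```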

Wed, 11 May, 23:00 - 23:45 China Time (UTC +8)
Wed, 11 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

We present a deep learning-based speech enhancement mobile application named CITISEN. CITISEN can perform three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC). For SE, pretrained SE models can be installed into CITISEN in order to reduce noise components in instant or saved recordings. The MA function fine-tunes the pretrained SE models to attain improved SE performance. The BNC function first removes the original background noise from the input utterances and then mixes the processed utterances with new background noise. In the demo session, we will show how to install pretrained SE models into CITISEN and how to use CITISEN as a platform for utilizing and evaluating SE models, and for flexibly extending the models to address various noise environments and users.

Wed, 11 May, 23:00 - 23:45 China Time (UTC +8)
Wed, 11 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

Automatic Speech Recognition (ASR) systems have been rapidly improving in accuracy, yet they will inevitably continue to make some transcription errors. For downstream applications that involve direct presentation of ASR output to a user, it can be helpful if the system is able to represent alternatives in an efficient and intuitive manner. To this end, we have developed phrase alternatives: these are similar to traditional N-best lists or word-level alternatives, but have the significant advantages of being more compact and expressive. This Show and Tell demonstration will present three distinct applications of phrase alternatives. The first is a comparative evaluation of the famous Switchboard benchmark, in which the official NIST SCTK software is used to score the oracle Word Error Rate (WER) for various representations of alternatives, achieving nearly 0% WER. The second application integrates phrase alternatives in a Lucene / Elasticsearch text-based search indexing framework, enabling scalable high-recall audio search across a large collection of recordings. The third application is a novel transcript editing user interface, in which phrase alternatives enable an expert practitioner to apply manual corrections to ASR output while listening to audio playback at faster than real-time speed. Each of these applications uses the publicly available Mod9 ASR Engine, which loads any Kaldi-compatible models, and the software demonstrations can be presented with interactivity for both on-site and remote attendees.
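
To give a concrete feel for the representation, the sketch below uses a hypothetical phrase-alternatives structure (illustrative only, not the exact output format of the Mod9 ASR Engine) and expands it into the hypothesis space that an N-best list would need many entries to cover.

```python
from itertools import product

# A hypothetical phrase-alternatives structure: the transcript is a sequence of
# segments, each carrying one or more alternative phrases.
segments = [
    {"alternatives": ["i want to"]},
    {"alternatives": ["recognize speech", "wreck a nice beach"]},
    {"alternatives": ["today", "to day"]},
]

# Expanding the segments yields 1 x 2 x 2 = 4 full hypotheses, which a
# traditional N-best list would have to enumerate one by one.
for words in product(*(seg["alternatives"] for seg in segments)):
    print(" ".join(words))
```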

Thu, 12 May, 23:00 - 23:45 China Time (UTC +8)
Thu, 12 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

Radar technology is one of the most fundamental and omnipresent signal processing technologies and has pushed the frontiers of hardware and algorithm design. Conventionally, dynamic range poses severe limitations on radar system performance. Here, we demo a novel, real-time approach that leverages hardware-software co-design to enable a high-dynamic-range (HDR) FMCW radar. Our work is based on the Unlimited Sensing Framework (USF). In the acquisition part, modulo sampling of the radar baseband signals using the US-ADC results in folded samples that do not clip or saturate (cf. https://youtu.be/JuZg80gUr8M). Thereafter, live reconstruction and processing of the non-linear samples results in HDR recovery. The hardware presented will consist of an FMCW radar, Unlimited Sampling ADCs interfaced with an acquisition board, and a computer that will show both the classic acquisition (and its possible limitations) and the unlimited samples together with the live reconstruction of the radar signal. The radar will use a modulation that is able to recover the range of multiple targets.

This live demo will be the first demonstration of the Unlimited Sampling approach, as a practical solution for a real-time application. Both the hardware and algorithmic validation of the Unlimited Sensing Framework (USF) have substantiated the clear benefits of non-linear acquisition when it comes to HDR sampling, thus circumventing the sensor saturation and clipping problems. Inspired by the USF, the practical live demo will highlight the potential advantages that modulo sampling has for the signal processing community for radio frequency based applications such as radar, telecommunications, …

The demo will be performed live, and attendees will be able to interact with the radar front-end and see the effect on the unlimited samples and the reconstruction, similar to this live demo (https://www.youtube.com/watch?v=cENWT5mQDXA&t=592s). The showcased USF radar will recover the range of multiple targets (e.g., the attendees) from real-time modulo measurements acquired by dedicated modulo ADCs. Corner reflectors placed at different ranges in front of the radar will change the frequency content of the sampled signal and of its modulo counterpart, and attendees will be able to see the reconstruction performed live.
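
The range recovery underlying the demo follows the standard FMCW relation between beat frequency and target range; the sketch below illustrates it on synthetic data, with chirp parameters and target ranges chosen as placeholders rather than the demo hardware's actual configuration.

```python
import numpy as np

c = 3e8
bandwidth, chirp_time = 4e9, 40e-6          # placeholder FMCW chirp parameters
slope = bandwidth / chirp_time              # chirp slope in Hz per second
fs, n = 10e6, 400                           # ADC rate and samples per chirp
t = np.arange(n) / fs

ranges_true = [1.5, 4.0]                    # two targets (e.g., corner reflectors), meters
beat = sum(np.cos(2 * np.pi * (2 * slope * r / c) * t) for r in ranges_true)

spectrum = np.abs(np.fft.rfft(beat * np.hanning(n)))
freqs = np.fft.rfftfreq(n, 1 / fs)
ranges_axis = c * freqs / (2 * slope)       # map beat frequency to range
peaks = ranges_axis[np.argsort(spectrum)[-2:]]
print(sorted(peaks))                        # close to the true target ranges
```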

Thu, 12 May, 23:00 - 23:45 China Time (UTC +8)
Thu, 12 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In this demonstration, we propose a dedicated phantom that simulates human thoracic displacements, through which realistic experiments on contactless vital-signs monitoring using FMCW radars can be performed, circumventing the need for human trials. In addition, we present a technique for contactless monitoring of heart rate (HR) and respiration rate (RR) using the simulated thoracic pattern, based on sparse recovery methods and optimization [1].

Contactless technology for the monitoring of human vital signs, such as respiration and heartbeat, has become a necessity in recent years due to rising cardiopulmonary morbidity, the risk of transmitting diseases, and the heavy burden on medical staff. FMCW radars have shown great potential in addressing these needs; however, to enable the widespread use of this technology, one must provide uncompromising estimation results. To address this challenge, methods must be developed that remotely extract physiological parameters from realistic FMCW signals. However, such signals are difficult to obtain for reasons of privacy, cost, and regulation. Furthermore, existing processing techniques do not have high resolution and do not provide adequate performance. Using the proposed phantom, a unique and repeatable experiment can be performed to examine the performance of different methods for contactless human vital-signs monitoring, while dealing with real physical phenomena that are difficult to emulate in a software simulation. In addition, the ability to compare against the ground truth is enhanced, since the input signal to the phantom is controlled.

Our demonstration platform consists of (1) a vibration generator for generating mechanical thoracic displacements; (2) a flat circular metal plate (24 cm diameter) that imitates a human thorax, from which the transmitted radar signals are reflected; (3) a TI IWR1642 77 GHz mmWave sensor; and (4) a dedicated experimental setup. Using a dedicated GUI, both on-site and online attendees can observe contactless human vital-signs monitoring by several methods utilizing the proposed phantom. We also show the advantage of using our proposed method based on sparse recovery.
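
For intuition, the sketch below shows how respiration and heart rates can be read from the slow-time phase of a target's range bin on synthetic data; it is a simplified illustration that skips phase unwrapping, clutter, and the sparse-recovery method of [1], and its displacement amplitudes and rates are placeholders.

```python
import numpy as np

fs_slow = 20.0                                  # chirps (phase samples) per second
t = np.arange(0, 60, 1 / fs_slow)               # one minute of slow-time data
wavelength = 3e8 / 77e9                         # 77 GHz wavelength, about 3.9 mm

# Placeholder thoracic displacement: respiration (0.25 Hz) plus heartbeat (1.2 Hz).
displacement = 4e-3 * np.sin(2 * np.pi * 0.25 * t) + 3e-4 * np.sin(2 * np.pi * 1.2 * t)
phase = 4 * np.pi * displacement / wavelength   # phase of the target's range bin

spectrum = np.abs(np.fft.rfft(phase - phase.mean()))
freqs = np.fft.rfftfreq(phase.size, 1 / fs_slow)

def peak_in_band(lo, hi):
    band = (freqs >= lo) & (freqs <= hi)
    return freqs[band][np.argmax(spectrum[band])]

print("RR:", 60 * peak_in_band(0.1, 0.6), "breaths/min")   # 15
print("HR:", 60 * peak_in_band(0.8, 2.5), "beats/min")     # 72
```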

[1] Y. Eder, D. Khodyrker, and Y.C. Eldar, “Sparsity Based Contactless Vital Signs Monitoring of Multiple People via FMCW Radar”, In Preparation.

Thu, 12 May, 23:00 - 23:45 China Time (UTC +8)
Thu, 12 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

This demo shows a duplex dual-function radar and communications (DFRC) system based on an agile frequency-modulated continuous-wave (FMCW) MIMO radar for vehicular applications. Because the surroundings and other vehicles block its view, each automotive radar has detection blind spots. However, since the blind spot of one vehicle may be covered by other vehicles, this information can be obtained through communication with them. In this duplex DFRC system, each radar can broadcast target information around itself and receive information from other radars while detecting, which indirectly improves the detection coverage of the radars, reduces the blind areas of vehicles, and improves safety. The joint design of radar and communications also leads to potential gains in system size, power consumption, and spectrum efficiency.

In this duplex DFRC demo, multiple identical FMCW waveforms are mixed with different carriers and transmitted from a selected subset of MIMO radar transmit elements. The digital message communicated is embedded in the selection of the transmit antenna subset and the selection of carriers, which incorporates communication capabilities with minimal effect on the radar transmission.

We designed a dedicated platform for this demo, which consists of a PC and two DFRC transceivers. The PC serves as the controller and processor and contains a GUI to define parameters interactively. The DFRC transceivers are developed on the TI mmWave evaluation module, which enables transmitting and receiving the configured FMCW waveforms with a MIMO radar architecture. As a duplex system, the transceivers can simultaneously detect the target and transmit and receive communication waveforms. This demo has two operation modes. In the hardware mode, the waveforms are transmitted over the air: attendees can input messages on either transceiver and receive them on the other side, and the radar senses the targets in real time and shows the results in the GUI. In the simulation mode, we show that the proposed scheme achieves a resolution performance similar to that of traditional automotive radar, which does not convey communication messages.
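
The index-modulation idea, i.e., embedding message bits in the choice of transmit-antenna subset and carrier, can be sketched as follows; the array size, number of carriers, and the simple enumeration-based mapping are illustrative assumptions, not the demo's actual encoding scheme.

```python
from itertools import combinations
from math import comb, floor, log2

n_tx, k = 4, 2                    # placeholder: choose 2 of 4 transmit elements per pulse
n_carriers = 4                    # and one of 4 carriers for the selected waveform

bits_antenna = floor(log2(comb(n_tx, k)))     # 2 bits carried by the antenna subset
bits_carrier = floor(log2(n_carriers))        # 2 bits carried by the carrier choice
subsets = list(combinations(range(n_tx), k))

def embed(message):
    """Map an integer message to an (antenna subset, carrier index) selection."""
    assert 0 <= message < 2 ** (bits_antenna + bits_carrier)
    subset_idx, carrier_idx = divmod(message, n_carriers)
    return subsets[subset_idx], carrier_idx

def decode(subset, carrier_idx):
    """The communication receiver inverts the selection back to the message."""
    return subsets.index(subset) * n_carriers + carrier_idx

msg = 11
sel = embed(msg)
print(sel, decode(*sel) == msg)   # the embedded message is recovered exactly
```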

Fri, 13 May, 23:00 - 23:45 China Time (UTC +8)
Fri, 13 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In this demo, we present software built for recovering the vasculature of breast lesions from contrast-enhanced ultrasound scans. Specifically, we present demonstrations on in vivo human scans of three different breast lesions acquired with a clinical ultrasound scanner.

Breast cancer is the most common malignancy in women. Early diagnosis of breast cancer is paramount to enable appropriate treatment and improve prognosis. Ultrasound is a widely available and safe imaging tool. It is often used as an adjunct to mammography for screening, especially in women with dense breast tissue. However, it is used only as a support tool and not as a main diagnostic tool due to its inherent disadvantages, such as low specificity and resolution. We recently proposed a way to enhance the use of ultrasound as a diagnostic tool for early breast cancer detection [1]. By using contrast-enhanced ultrasound in combination with an advanced super-resolution algorithm, we were able to depict the microvascular profile of breast lesions. We use a model-based deep learning method for super-resolution ultrasound imaging to achieve sub-diffraction resolution. The network exploits the properties of the ultrasound signal to devise a parameter-efficient network that generalizes well. By leveraging our trained network, the microvasculature structure is recovered in a short time, overcoming challenges such as the need for prior knowledge of the system PSF and limited clinical data for training.

Our demonstration platform consists of an interactive user interface that enables the user to choose an ultrasound scan to reconstruct. The user will be able to see the data as seen by the physician in real time, and will then be able to process the data and generate a super-resolved image showing microvascular structures that are unseen in the original scan. The user will be able to compare the super-resolved images, which exhibit different structures, thus enabling better differentiation between the lesions.

References: [1] Bar-Shira, O., Grubstein, A., Rapson, Y., Suhami, D., Atar, E., Peri-Hanania, K., Rosen, R. and Eldar, YC. "Learned super resolution ultrasound for improved breast lesion characterization." In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 109-118. Springer, Cham, 2021.

Fri, 13 May, 23:00 - 23:45 China Time (UTC +8)
Fri, 13 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

Ultrasound Channel Data Transfer over Wi-Fi
Alon Mamistvalov, Danah Yatim, Shlomi Savariego, Nimrod Glazer, and Yonina C. Eldar

In this demo, we present a software prototype for transferring high-quality ultrasound imaging over Wi-Fi in real time. The most widely used technique in US imaging is delay-and-sum (DAS) beamforming, where appropriate delays are applied to the signals acquired by the transducer elements. However, performing high-resolution digital beamforming requires sampling rates that are much higher than the signal's Nyquist rate. Moreover, producing US images that exhibit good resolution and high image contrast typically requires many transducer elements. This leads to large amounts of data, making it impractical to transmit US channel data over Wi-Fi. We use a compressed frequency-domain convolutional beamforming (CFCOBA) [1] scheme for US imaging, which allows high-quality images to be recovered from small data sets. This method combines sparse Fourier-domain beamforming [2], sparse convolutional beamforming (SCOBA) [3], and compressed sensing to enable the reconstruction of high-resolution images from sub-Nyquist sampled measurements taken at a sparse subset of array elements.

We demonstrate the above-mentioned beamforming technique through a software prototype for real-time ultrasound imaging over Wi-Fi. In our system, we use a Verasonics US machine to transmit ultrasound signals to the Tx computer. On the Tx computer, we sample the signal at a sub-Nyquist rate using a subset of the original element array. The data are transferred to the Rx computer over the wireless link (the protocol used for data transfer is TCP, enabling a reliable data stream). On the Rx computer, we reconstruct the image using a compressed sensing algorithm. In our demo, we achieve an overall data reduction of approximately 21 times: instead of transferring roughly 20 MB over Wi-Fi per frame, we transfer only about 0.95 MB per frame, at a real-time rate of 2 ultrasound frames per second. This paves the way towards wireless ultrasound imaging.
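
For reference, conventional time-domain DAS beamforming of channel data can be sketched as below; this is the baseline that the compressed Fourier-domain scheme improves upon, and the array geometry, sampling rate, and plane-wave transmit assumption are placeholders rather than our Verasonics configuration.

```python
import numpy as np

c, fs = 1540.0, 25e6                             # speed of sound (m/s) and sampling rate (Hz)
n_elem, pitch = 64, 0.3e-3
elem_x = (np.arange(n_elem) - (n_elem - 1) / 2) * pitch

rng = np.random.default_rng(6)
channel_data = rng.normal(size=(n_elem, 4096))   # placeholder per-element RF data

def das_pixel(channel_data, x, z):
    """Delay-and-sum value for one image point (x, z), plane-wave transmit assumed."""
    tx_delay = z / c                                          # plane wave travel to depth z
    rx_delay = np.sqrt((x - elem_x) ** 2 + z ** 2) / c        # echo back to each element
    idx = np.round((tx_delay + rx_delay) * fs).astype(int)
    idx = np.clip(idx, 0, channel_data.shape[1] - 1)
    return channel_data[np.arange(n_elem), idx].sum()

print(das_pixel(channel_data, x=0.0, z=20e-3))
```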

References:
[1] T. Chernyakova and Y. C. Eldar, "Fourier-domain beamforming: the path to compressed ultrasound imaging," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 61, no. 8, pp. 1252–1267, 2014.
[2] A. Mamistvalov and Y. C. Eldar, "Compressed Fourier-domain convolutional beamforming for sub-Nyquist ultrasound imaging," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 69, no. 2, pp. 489–499, 2022.
[3] R. Cohen and Y. C. Eldar, "Sparse convolutional beamforming for ultrasound imaging," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 65, no. 12, pp. 2390–2406, 2018.

Fri, 13 May, 23:00 - 23:45 China Time (UTC +8)
Fri, 13 May, 15:00 - 15:45 UTC
Virtual
Gather.Town
Show & Tell

In this demo, we present a low-rate, high-dynamic-range analog-to-digital conversion (ADC) scheme.

When sampling signals through an ADC, it is typically assumed that the signal's dynamic range lies within that of the ADC. However, in many applications, such as radar and ultrasound imaging, the dynamic range of the received signal can exceed that of the ADC, which results in clipping and leads to inaccurate reconstruction. Modulo preprocessing can be used to avoid clipping and to sample signals beyond the dynamic range of the ADC [1]. The modulo step folds the signal into the dynamic range of the ADC, and the folded signal is sampled. During reconstruction, the true samples are recovered from the folded ones by using an unfolding algorithm. Typically, unfolding algorithms operate at a much higher rate than would be required without the modulo operation. This results in a large number of bits per second (NMBPS) after sampling and quantization, which may not be suitable in many applications, as a large amount of storage or transmission bandwidth is required.

In this demo, we propose modulo hardware that produces side information in addition to performing the folding operation. Specifically, the hardware records the folding instants. It is well known that if the folded samples and the folding instants are given, then unfolding does not require a high sampling rate. However, this requires storage or transmission of the quantized folded samples together with the folding instants, which again increases the NMBPS. To keep the NMBPS small, the proposed hardware distributes the information about the folding instants across all the samples, such that each folded sample is padded with a few additional bits. In this way, the overall NMBPS is much lower than that of methods where the side information is not stored.
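
The principle of unfolding with side information can be sketched as follows: if the change in fold count between consecutive samples is known, the true samples are recovered by a cumulative sum, with no oversampling needed. This toy numpy sketch illustrates the principle only, not the proposed hardware's bit-packing of the folding information.

```python
import numpy as np

lam = 1.0
t = np.linspace(0, 1, 400)
x = 3.0 * np.sin(2 * np.pi * 2 * t)              # exceeds the assumed ADC range [-1, 1)

fold_count = np.floor((x + lam) / (2 * lam))     # how many times each sample was folded
folded = x - 2 * lam * fold_count                # what the modulo ADC actually outputs

# Side information: the change in fold count between consecutive samples
# (a few extra bits per sample, nonzero only at the folding instants).
increments = np.diff(fold_count, prepend=fold_count[0])

reconstructed = folded + 2 * lam * np.cumsum(increments)  # undo the folding
print(np.max(np.abs(reconstructed - x)))                   # ~0 up to floating point
```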

References: [1] A. Bhandari, F. Krahmer and R. Raskar, "On Unlimited Sampling and Reconstruction," in IEEE Transactions on Signal Processing, vol. 69, pp. 3827-3839, 2021.
