An Architecture for Robust Partial Tracking and Onset Localization in Single Channel Audio Signal Mixes


Author: Ludger Solbach
Language: English
Pages: 143
Year: 1998
Download: solbach.ps.gz (3.82 MByte), examples
Abstract: The primary achievement of this work is the development and evaluation of an automated system for partial parameter extraction and partial resynthesis from single channel recordings in nonstationary, noisy environments. Two factors in audio signal analysis have been largely neglected in the past: the relevance of robustness against background noise and the importance of dealing appropriately with the time-frequency resolution trade-off. Moreover, resynthesis was usually performed merely for a-posteriori validation of the analysis process, without making use of the beneficial potential of adaptive feedback cancellation.

In the proposed architecture, the time-frequency trade-off is addressed by introducing two different resolutions, one for signal components concentrated in frequency (partial trackers) and another for those concentrated in time (onset detector). The task of each partial tracker is to track a partial tone originating from an onset. The partial trackers are realized as frequency-locked loops with gammatone tracking filters of variable bandwidth. Their number varies with the extracted properties of the input signal.

The second kind of module used in the architecture is the so-called master module. The master module's task is to create or delete partial trackers when an onset or offset is detected. Central to this module is a wavelet transformer that performs the preprocessing required by the onset detector and a noise floor estimator. After an onset has been detected by threshold level crossing, new partials are roughly localized in frequency through analysis of the wavelet filter bank output. Then the system steps back to the onset location and runs a second pass that takes the newly gained insight into account. Each partial tracker continuously reports its own partial tone back to the master module, where an overall residual signal is formed by subtracting the reported partials from the overall input signal. This adaptive feedback cancellation mechanism facilitates noise floor estimation, onset detection and the separation of partials lying close to each other in frequency. Automated threshold adaptation and continuous noise floor estimation serve to keep the rate of false onset alarms low. The total system threshold is continuously updated, taking previous signal onsets and noise floor estimates into account. Through this design, the different system components (partial trackers, onset detector and noise floor estimator) do not operate independently of each other. Instead, they cooperate, each one taking advantage of the insight acquired by its collaborators.
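
The interplay between feedback cancellation and onset detection described above can be illustrated with a small sketch. It is not the implementation from the thesis: the wavelet preprocessing and the frequency-locked loops are left out, a one-pole envelope follower stands in for the onset detector's input, and the function names (form_residual, detect_onsets, demo) as well as all parameter values are assumptions chosen for illustration only.

```python
import math


def form_residual(mix, partials):
    """Subtract the sum of all reported partial tones from the input, sample by sample."""
    return [x - sum(p[n] for p in partials) for n, x in enumerate(mix)]


def detect_onsets(signal, ratio=4.0, env_coeff=0.2, floor_coeff=0.01, init_floor=0.01):
    """Return the sample indices at which the envelope crosses ratio * noise floor upward.

    The noise floor is a slow running average of the envelope and is only
    updated while the envelope is below the threshold (hypothetical rule).
    """
    onsets = []
    env = 0.0
    floor = init_floor
    above = False
    for n, x in enumerate(signal):
        env += env_coeff * (abs(x) - env)  # one-pole envelope follower
        if env > ratio * floor:
            if not above:
                onsets.append(n)  # upward threshold crossing
            above = True
        else:
            above = False
            floor += floor_coeff * (env - floor)  # adapt the noise floor estimate
    return onsets


def demo():
    """Toy mix: a tracked tone, a second tone entering at sample 400, low-level background."""
    n_samples = 800
    tone_a = [math.sin(0.1 * n) for n in range(n_samples)]
    tone_b = [0.5 * math.sin(0.37 * (n - 400)) if n >= 400 else 0.0
              for n in range(n_samples)]
    background = [0.005 * math.sin(2.3 * n + 0.7) for n in range(n_samples)]
    mix = [a + b + c for a, b, c in zip(tone_a, tone_b, background)]
    # Assume an ideal tracker that reports tone_a exactly back to the master module.
    residual = form_residual(mix, [tone_a])
    return detect_onsets(residual)


if __name__ == "__main__":
    print(demo())
```

In this toy setting the already tracked tone is removed before detection, so the detector sees only the background and the newly entering tone and reports a single onset just after sample 400. A real tracker would report an imperfect estimate of its partial, so the residual would also contain tracking error.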

Although signal-theoretic considerations rather than physiological or psychoacoustic findings were followed as guidelines in the development of the architecture, the proposed approach leads to a system bearing some similarities with properties of the human auditory system, most notably temporal and spectral masking. The system's ability to localize signal components precisely in time and frequency is examined in various experiments. Several examples are given to further illustrate the capabilities of the architecture.


solbach@hypersonic.de