Distribution Mismatch Correction for Acoustic Scene Classification

EasyChair Preprint 10771
5 pages • Date: August 24, 2023

Abstract

While deep learning methods have shown immense benefits for Acoustic Scene Classification (ASC) tasks in terms of performance, they also introduce new challenges, as these methods are prone to large performance degradation on out-of-distribution data. To build robust ASC models that achieve reliable performance across multiple recording devices, the architecture has to be able to adapt quickly to changing input and activation distributions. We present ASCMobConvNet, a CNN architecture based on Mobile Inverted Bottleneck Convolutions. To better adapt to domain shifts and the resulting change in activation distributions, it uses sub-spectral normalization layers in combination with residual normalization instead of batch normalization layers. Furthermore, the model corrects non-parametric mismatches in the activation distributions through the integration of Wasserstein distribution correction layers. Using our proposed architecture, we achieve a test accuracy of 68.10% on the TAU Urban Acoustic Scenes 2020 Mobile development dataset. Using Wasserstein distribution correction layers, we can further improve the accuracy by 0.68%.

Keyphrases: acoustic scene classification, distribution correction, low-complexity neural networks
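
To make the normalization idea in the abstract concrete, the following is a minimal sketch of sub-spectral normalization as it is commonly described: the frequency axis is split into equal sub-bands, and each (channel, sub-band) pair is normalized with its own batch statistics. This is not the authors' implementation. The function name `sub_spectral_norm`, the nested-list layout `[batch][channel][frequency][time]`, and the `eps` value are assumptions made for illustration, and the learned scale and shift as well as the running statistics used at inference time are omitted.

```python
import math


def sub_spectral_norm(x, num_groups, eps=1e-5):
    """Normalize each (channel, frequency sub-band) pair over batch, band, and time.

    x is a nested list indexed as x[batch][channel][frequency][time].
    Hypothetical helper for illustration only; no learned affine parameters.
    """
    n, c, f, t = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    if f % num_groups != 0:
        raise ValueError("frequency bins must be divisible by num_groups")
    band = f // num_groups

    out = [[[[0.0] * t for _ in range(f)] for _ in range(c)] for _ in range(n)]
    for ch in range(c):
        for g in range(num_groups):
            rows = range(g * band, (g + 1) * band)
            # Collect every activation that belongs to this channel and sub-band.
            vals = [x[b][ch][k][j] for b in range(n) for k in rows for j in range(t)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            scale = 1.0 / math.sqrt(var + eps)
            for b in range(n):
                for k in rows:
                    for j in range(t):
                        out[b][ch][k][j] = (x[b][ch][k][j] - mean) * scale
    return out


# Toy input: 2 clips, 1 channel, 4 frequency bins, 3 time frames.
x = [[[[float(10 * k + j + b) for j in range(3)] for k in range(4)]
      for _ in range(1)] for b in range(2)]
y = sub_spectral_norm(x, num_groups=2)
```

With `num_groups=1` this reduces to ordinary per-channel batch normalization without affine parameters. Larger values let low and high frequency regions keep separate statistics, which is the property the abstract relies on when it replaces batch normalization.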