Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-28719
Publication type: | Article in scientific journal |
Type of review: | Open peer review |
Title: | Real world music object recognition |
Authors: | Tuggener, Lukas; Emberger, Raphael; Ghosh, Adhiraj; Sager, Pascal; Satyawan, Yvan Putra; Montoya, Javier; Goldschagg, Simon; Seibold, Florian; Gut, Urs; Ackermann, Philipp; Schmidhuber, Jürgen; Stadelmann, Thilo |
et al.: | No |
DOI: | 10.5334/tismir.157; 10.21256/zhaw-28719 |
Published in: | Transactions of the International Society for Music Information Retrieval |
Volume: | 7 |
Issue: | 1 |
Pages: | 1–14 |
Issue Date: | 2024 |
Publisher / Ed. Institution: | Ubiquity Press |
ISSN: | 2514-3298 |
Language: | English |
Subjects: | Optical music recognition; Deep learning; Data augmentation; Adversarial training; Model ensemble; Open data |
Subject (DDC): | 006: Special computer methods |
Abstract: | We present solutions to two of the most pressing issues in contemporary optical music recognition (OMR). We improve recognition accuracy on low-quality, real-world (i.e. containing ageing, lighting, or dirt artefacts among others) input data and provide confidence-rated model outputs to enable efficient human post-processing. Specifically, we present (i) a sophisticated input augmentation scheme that can reduce the gap between sanitised benchmarks and realistic tasks through a combination of synthetic data and noisy perturbations of real-world documents; (ii) an adversarial discriminative domain adaptation method that can be employed to improve the performance of OMR systems on low-quality data; (iii) a combination of model ensembles and prediction fusion, which generates trustworthy confidence ratings for each prediction. We evaluate our contributions on a newly created test set consisting of manually annotated pages of varying real-world quality, sourced from International Music Score Library Project (IMSLP) / the Petrucci Music Library. With the presented data augmentation scheme, we achieve a doubling in detection performance from 36.0% to 73.3% on noisy real-world data compared to state-of-the-art training. This result is then combined with robust confidence ratings paving the way for OMR to be deployed in the real world. Additionally, we show the merits of unsupervised adversarial domain adaptation for OMR raising the 36.0% baseline to 48.9%. All our code and data are freely available at: https://github.com/raember/s2anet/tree/TISMIR_publication. |
URI: | https://digitalcollection.zhaw.ch/handle/11475/28719 |
Related research data: | https://github.com/raember/s2anet/tree/TISMIR_publication |
Fulltext version: | Published version |
License (according to publishing contract): | CC BY 4.0: Attribution 4.0 International |
Department: | School of Engineering |
Organisational Unit: | Centre for Artificial Intelligence (CAI); Institute of Computer Science (InIT) |
Published as part of the ZHAW project: | RealScore – Scanning of Real-World Sheet Music for a Digital Music Stand |
Appears in collections: | Publikationen School of Engineering |
Files in This Item:
File | Description | Size | Format
---|---|---|---
2024_Tuggener-etal_Real-world-music-object-recognition.pdf | Published Version | 2.13 MB | Adobe PDF
2023_Tuggener-etal_Real-world-music-object-recognition_TISMIR.pdf | Accepted Version | 1.07 MB | Adobe PDF
Tuggener, L., Emberger, R., Ghosh, A., Sager, P., Satyawan, Y. P., Montoya, J., Goldschagg, S., Seibold, F., Gut, U., Ackermann, P., Schmidhuber, J., & Stadelmann, T. (2024). Real world music object recognition. Transactions of the International Society for Music Information Retrieval, 7(1), 1–14. https://doi.org/10.5334/tismir.157
Tuggener, L. et al. (2024) ‘Real world music object recognition’, Transactions of the International Society for Music Information Retrieval, 7(1), pp. 1–14. Available at: https://doi.org/10.5334/tismir.157.
L. Tuggener et al., “Real world music object recognition,” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, pp. 1–14, 2024, doi: 10.5334/tismir.157.
TUGGENER, Lukas, Raphael EMBERGER, Adhiraj GHOSH, Pascal SAGER, Yvan Putra SATYAWAN, Javier MONTOYA, Simon GOLDSCHAGG, Florian SEIBOLD, Urs GUT, Philipp ACKERMANN, Jürgen SCHMIDHUBER and Thilo STADELMANN, 2024. Real world music object recognition. Transactions of the International Society for Music Information Retrieval. 2024. Vol. 7, no. 1, pp. 1–14. DOI 10.5334/tismir.157
Tuggener, Lukas, Raphael Emberger, Adhiraj Ghosh, Pascal Sager, Yvan Putra Satyawan, Javier Montoya, Simon Goldschagg, et al. 2024. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval 7 (1): 1–14. https://doi.org/10.5334/tismir.157.
Tuggener, Lukas, et al. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, 2024, pp. 1–14, https://doi.org/10.5334/tismir.157.