Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-28719
Publication type: Article in scientific journal
Type of review: Open peer review
Title: Real world music object recognition
Authors: Tuggener, Lukas
Emberger, Raphael
Ghosh, Adhiraj
Sager, Pascal
Satyawan, Yvan Putra
Montoya, Javier
Goldschagg, Simon
Seibold, Florian
Gut, Urs
Ackermann, Philipp
Schmidhuber, Jürgen
Stadelmann, Thilo
et al.: No
DOI: 10.5334/tismir.157
10.21256/zhaw-28719
Published in: Transactions of the International Society for Music Information Retrieval
Volume(Issue): 7
Issue: 1
Page(s): 1
Pages to: 14
Issue Date: 2024
Publisher / Ed. Institution: Ubiquity Press
ISSN: 2514-3298
Language: English
Subjects: Optical music recognition; Deep learning; Data augmentation; Adversarial training; Model ensemble; Open data
Subject (DDC): 006: Special computer methods
Abstract: We present solutions to two of the most pressing issues in contemporary optical music recognition (OMR). We improve recognition accuracy on low-quality, real-world (i.e. containing ageing, lighting, or dirt artefacts, among others) input data and provide confidence-rated model outputs to enable efficient human post-processing. Specifically, we present (i) a sophisticated input augmentation scheme that can reduce the gap between sanitised benchmarks and realistic tasks through a combination of synthetic data and noisy perturbations of real-world documents; (ii) an adversarial discriminative domain adaptation method that can be employed to improve the performance of OMR systems on low-quality data; (iii) a combination of model ensembles and prediction fusion, which generates trustworthy confidence ratings for each prediction. We evaluate our contributions on a newly created test set consisting of manually annotated pages of varying real-world quality, sourced from the International Music Score Library Project (IMSLP) / the Petrucci Music Library. With the presented data augmentation scheme, we achieve a doubling in detection performance from 36.0% to 73.3% on noisy real-world data compared to state-of-the-art training. This result is then combined with robust confidence ratings, paving the way for OMR to be deployed in the real world. Additionally, we show the merits of unsupervised adversarial domain adaptation for OMR, raising the 36.0% baseline to 48.9%. All our code and data are freely available at: https://github.com/raember/s2anet/tree/TISMIR_publication.
URI: https://digitalcollection.zhaw.ch/handle/11475/28719
Related research data: https://github.com/raember/s2anet/tree/TISMIR_publication
Fulltext version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: School of Engineering
Organisational Unit: Centre for Artificial Intelligence (CAI)
Institute of Computer Science (InIT)
Published as part of the ZHAW project: RealScore – Scanning of Real-World Sheet Music for a Digital Music Stand
Appears in collections: Publikationen School of Engineering

Files in This Item:
File: 2024_Tuggener-etal_Real-world-music-object-recognition.pdf; Description: Published Version; Size: 2.13 MB; Format: Adobe PDF
File: 2023_Tuggener-etal_Real-world-music-object-recognition_TISMIR.pdf; Description: Accepted Version; Size: 1.07 MB; Format: Adobe PDF
Tuggener, L., Emberger, R., Ghosh, A., Sager, P., Satyawan, Y. P., Montoya, J., Goldschagg, S., Seibold, F., Gut, U., Ackermann, P., Schmidhuber, J., & Stadelmann, T. (2024). Real world music object recognition. Transactions of the International Society for Music Information Retrieval, 7(1), 1–14. https://doi.org/10.5334/tismir.157
Tuggener, L. et al. (2024) ‘Real world music object recognition’, Transactions of the International Society for Music Information Retrieval, 7(1), pp. 1–14. Available at: https://doi.org/10.5334/tismir.157.
L. Tuggener et al., “Real world music object recognition,” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, pp. 1–14, 2024, doi: 10.5334/tismir.157.
TUGGENER, Lukas, Raphael EMBERGER, Adhiraj GHOSH, Pascal SAGER, Yvan Putra SATYAWAN, Javier MONTOYA, Simon GOLDSCHAGG, Florian SEIBOLD, Urs GUT, Philipp ACKERMANN, Jürgen SCHMIDHUBER und Thilo STADELMANN, 2024. Real world music object recognition. Transactions of the International Society for Music Information Retrieval. 2024. Bd. 7, Nr. 1, S. 1–14. DOI 10.5334/tismir.157
Tuggener, Lukas, Raphael Emberger, Adhiraj Ghosh, Pascal Sager, Yvan Putra Satyawan, Javier Montoya, Simon Goldschagg, et al. 2024. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval 7 (1): 1–14. https://doi.org/10.5334/tismir.157.
Tuggener, Lukas, et al. “Real World Music Object Recognition.” Transactions of the International Society for Music Information Retrieval, vol. 7, no. 1, 2024, pp. 1–14, https://doi.org/10.5334/tismir.157.

