E194-01 - Forschungsbereich Information und Software Engineering
Proceedings of the 2nd International Workshop on Reading Music Systems
2nd International Workshop on Reading Music Systems
Delft, The Netherlands, EU
Object Detection; Optical Music Recognition; Music Staff
Music scores written in modern notation use staves as a reference system for assigning semantics to the individual symbols that appear in the score. Detecting this structural element is, therefore, a natural step in most Optical Music Recognition systems. However, many systems struggle to reliably detect staves. This paper investigates whether computers can learn to detect staves with a convolutional neural network given only a small set of images for which annotations are available. After an initial training phase, the network is asked to make predictions on a larger test dataset. A human annotator reviews the predictions and approves or rejects each sample. Approved samples are added to the training set for the next iteration, which incrementally expands the training set and allows the network to operate well on a variety of music scores.
After four iterations, we were able to obtain staff bounding box annotations for 14,000 out of 20,000 scores in our dataset. Although the evaluated approach has structural flaws that lead to imprecise results and deficits when detecting non-straight staves, it can serve as a viable starting point for future staff detection systems.
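The review loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `incremental_annotation`, the `Box` tuple layout, and the `train` and `review` callables are all hypothetical stand-ins for the network training step and the human annotator. It also assumes that rejected samples are re-predicted in the next iteration, which the abstract does not state.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) of one staff.
Box = Tuple[int, int, int, int]
Detector = Callable[[str], List[Box]]


def incremental_annotation(
    labeled: Dict[str, List[Box]],
    unlabeled: List[str],
    train: Callable[[Dict[str, List[Box]]], Detector],
    review: Callable[[str, List[Box]], bool],
    iterations: int = 4,
) -> Tuple[Dict[str, List[Box]], List[str]]:
    """Grow a training set by approving or rejecting model predictions.

    `train` stands in for fitting the detector on the current training set;
    `review` stands in for the human annotator's approve/reject decision.
    """
    labeled = dict(labeled)   # copy so the caller's seed set is not mutated
    pending = list(unlabeled)
    for _ in range(iterations):
        if not pending:
            break
        detector = train(labeled)           # retrain on everything approved so far
        rejected: List[str] = []
        for image_id in pending:
            boxes = detector(image_id)      # predicted staff bounding boxes
            if review(image_id, boxes):
                labeled[image_id] = boxes   # approved: joins the training set
            else:
                rejected.append(image_id)   # assumption: retried next iteration
        pending = rejected
    return labeled, pending
```

The function returns both the enlarged training set and the images that were never approved, which corresponds to the split the abstract reports between annotated and remaining scores.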
Visual Computing and Human-Centered Technology: 100%