Semantically Annotated SVG export
Dear Musescore Community,
I'm a computer science student currently working on an optical music recognition* algorithm for my bachelor thesis. Its aim is not the full recognition of a score, only the recognition and classification of its symbols.
The desired output is a list of symbols, their class and location in a score image, like so:
class: "notehead-filled", attributes: {position: {x: 120, y: 42}}
class: "beam", attributes: {position: {x: 200, y: 400}, angle: 12, thickness: 12, length: 42}
...
The data would then be further processed by a module that actually "reads" the music and arranges the content in, e.g., MusicXML.
To evaluate my results automatically, I thought about using MuseScore's SVG export to generate ground truth data for a given score, i.e. a representation like the one above to compare my recognition results against.
MusicXML would be an option too, but it usually doesn't supply enough geometric information about shapes like ties and slurs (although it is capable of doing so using Bézier splines) to be of any use to me. Furthermore, its tree structure and relative-offset positioning are rather inconvenient if one only wants to know which symbols are where and how they are shaped.
The SVG file format is close to that representation, but it is missing the semantic information.
If each element could have an extra attribute denoting its meaning, the problem would be perfectly solved. An SVG line element for instance could have an additional class attribute set to "ledger-line". An SVG polygon tag could get the class attribute set to "beam", and so on.
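To make the idea concrete, here is a rough sketch of the kind of annotated element I have in mind. The function name annotatedLine and its parameters are made up for illustration and are not MuseScore or Qt code; it also assumes the class name contains no characters that need XML escaping.

```cpp
#include <sstream>
#include <string>

// Build an SVG <line> element that carries a semantic class attribute.
// Hypothetical helper for illustration only; `cls` is written unescaped,
// so it is assumed to contain no quotes or angle brackets.
std::string annotatedLine(const std::string& cls,
                          double x1, double y1, double x2, double y2) {
    std::ostringstream out;
    out << "<line class=\"" << cls << "\""
        << " x1=\"" << x1 << "\" y1=\"" << y1 << "\""
        << " x2=\"" << x2 << "\" y2=\"" << y2 << "\"/>";
    return out.str();
}
```

Calling annotatedLine("ledger-line", 10, 20, 30, 20) would then yield a line element whose class attribute is "ledger-line", which is all the extra information I would need.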
I think this could be fairly easy to implement, depending on exactly how the SVG data is written and what data is accessible at that stage. Since the amount and complexity of the source code scared me off when I tried to locate the relevant pieces, I'd prefer to avoid writing this on my own. And writing it as a plugin is probably well beyond what the plugin engine was designed for.
Could someone write this, if it doesn't take too much effort?
Or give me a hint on where to start and roughly how complex this would be?
Best Regards,
Carl
*) For those who haven't heard of OMR: it's the process of converting scanned music back into a symbolic format, as SmartScore, SharpEye, capella-scan, etc. do. As far as I know, an OMR module is planned for MuseScore 2 as well.
Comments
MuseScore uses the Qt toolkit for exporting scores. Qt uses a "QPainter" and a "QPaintDevice" abstraction to allow export to different formats. Nearly the same code can be used regardless of the actual output format, which is selected by using an appropriate paint device. This means that at the SVG level (using the SVG paint device for output) all semantic information about the painted objects is lost. The interface does not allow the resulting SVG code to be augmented in any way.
Please look at Score::print(..) in libmscore/scorefile.cpp for an example of printing a score page. It's fewer than 20 lines of code and easy to modify so that it outputs a line of text for every painted object, by replacing the "e->draw(painter)" statement. The element "e" has a name and an absolute page position, so something like "printf("%s %f %f\n", e->name(), e->pagePos().x(), e->pagePos().y())" should work.
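As a rough, self-contained sketch of that idea (not the actual MuseScore code): PointF and Element below are hypothetical stand-ins that mirror only the name() and pagePos() calls from the printf above, and describe() and dumpElements() are names invented for this example.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for the point type returned by pagePos().
struct PointF {
    double px, py;
    double x() const { return px; }
    double y() const { return py; }
};

// Hypothetical stand-in for a score element; only name() and pagePos()
// are modelled, nothing else from the real class.
struct Element {
    std::string elemName;
    PointF pos;
    const char* name() const { return elemName.c_str(); }
    PointF pagePos() const { return pos; }
};

// Format one "name x y" record, using the same layout as the printf above.
// Very long names would be truncated to fit the fixed-size buffer.
std::string describe(const Element& e) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "%s %f %f",
                  e.name(), e.pagePos().x(), e.pagePos().y());
    return std::string(buf);
}

// Write one record per element instead of drawing it.
void dumpElements(const std::vector<Element>& elements, std::FILE* out) {
    for (const Element& e : elements)
        std::fprintf(out, "%s\n", describe(e).c_str());
}
```

In the real print routine the loop over elements already exists, so only the body of the loop would change; the stand-in types are here just so the sketch compiles on its own.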
In reply to MuseScore uses the Qt toolkit by [DELETED] 3
The raw types and positions are actually even better for the first evaluation stages.
I got a little stuck searching for the SVG code, since a lot of GUI elements use SVG and I didn't have the time to dig into the developer handbook, so I didn't know about the abstraction mechanisms that MuseScore uses.
Thanks!
In reply to MuseScore uses the Qt toolkit by [DELETED] 3
I wrote a few lines to output the element name (and to resolve types like sharp or natural for the Accidental class).
But how are the pagePos values measured? According to the developer handbook,
The screen position of an visible element is computed from three values:
AP Anchor position of element. This is usually the position of the parent element.
LO Layout Offset, computed in the layout engine. MuseScore calculates it as the normal position of the element
UO User Offset, created by dragging an element by the user
but the unit is not stated. Is it millimeters, tenths of a staff-line spacing, pixels, inches?
What's the difference between canvas position and page position?
Best regards,
Carl
Edit: All values are measured in points (1 point = 1/72 inch), so multiplying by d/72 gives the pixel coordinates for an export resolution of d DPI.
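In case it helps anyone else, here is a minimal sketch of that conversion. It assumes the positions really are in points of 1/72 inch; the type and function names (PointPos, PixelPos, toPixels) are my own and not part of MuseScore.

```cpp
#include <cassert>

// A page position in points (assumed: 1 point = 1/72 inch).
struct PointPos { double x; double y; };

// The same position in pixels of the exported image.
struct PixelPos { double x; double y; };

// Scale a point-based position to pixel coordinates for an export
// resolution of `dpi` dots per inch: pixels = points * dpi / 72.
PixelPos toPixels(PointPos p, double dpi) {
    assert(dpi > 0.0);  // a non-positive resolution makes no sense
    const double scale = dpi / 72.0;
    return PixelPos{p.x * scale, p.y * scale};
}
```

For example, at 144 DPI every coordinate simply doubles, and at 72 DPI points and pixels coincide.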