Fix for #278916 causes crashes
I’ve got crashes on a piece of mine. Specifically, when
- always, when uploading it to mu͒.com without removing the parts
- always, trying to convert two instances of it at the same time using the
-j
option (for example, a normal one and one that starts with an even page number) - sometimes, when loading it in an interactive session after having worked on some files already
gdb bt for Debian’s 3.2.3:
Thread 1 "musescore3" received signal SIGSEGV, Segmentation fault. 0x00007ffff58e7270 in QVariant::QVariant(QVariant const&) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5 (gdb) bt #0 0x00007ffff58e7270 in QVariant::QVariant(QVariant const&) () from /usr/lib/x86_64-linux-gnu/libQt5Core.so.5 #1 0x0000555555f8735d in Ms::MStyle::value (this=<optimized out>, idx=<optimized out>) at /usr/include/c++/10/array:55 #2 0x000055555603ea04 in Ms::Score::styleV (idx=Ms::Sid::NOSTYLE, this=<optimized out>) at ./libmscore/score.h:838 #3 Ms::ScoreElement::styleValue (this=0x555557688a60, pid=<optimized out>, sid=<optimized out>) at ./libmscore/scoreElement.cpp:841 #4 0x000055555603ec54 in Ms::ScoreElement::initElementStyle (this=this@entry=0x555557688a60, ss=ss@entry=0x555556821c20 <Ms::textLineSegmentStyle>) at ./libmscore/scoreElement.cpp:211
Code in question:
void ScoreElement::initElementStyle(const ElementStyle* ss) { _elementStyle = ss; size_t n = _elementStyle->size(); delete[] _propertyFlagsList; _propertyFlagsList = new PropertyFlags[n]; for (size_t i = 0; i < n; ++i) _propertyFlagsList[i] = PropertyFlags::STYLED; for (const StyledProperty& spp : *_elementStyle) // setProperty(spp.pid, styleValue(spp.pid, spp.sid)); setProperty(spp.pid, styleValue(spp.pid, getPropertyStyle(spp.pid))); }
This code is unchanged since 2018, so it applies to most 3.x versions, including those newer than Debian’s.
Supporting code:
Sid ScoreElement::getPropertyStyle(Pid id) const { for (const StyledProperty& p : *_elementStyle) { if (p.pid == id) return p.sid; } return Sid::NOSTYLE; }
The crash occurs because getPropertyStyle returns Sid::NOSTYLE, which is -1.
Looking at https://musescore.org/en/node/278916 the code itself seems to have fixed something, but I cannot see, from looking at the code, what/how… as far as I can see, getPropertyStyle would just return the same as spp.sid is, always? (Because both access _elementStyle.)
In the same file, I see this:
QVariant ScoreElement::propertyDefault(Pid pid) const { Sid sid = getPropertyStyle(pid); if (sid != Sid::NOSTYLE) return styleValue(pid, sid); // qDebug("<%s>(%d) not found in <%s>", propertyQmlName(pid), int(pid), name()); return QVariant(); }
This makes me think that changing the line to…
setProperty(spp.pid, propertyDefault(spp.pid));
… might fix the bug… if passing just a new QVariant to setProperty is the correct thing to do here.
The question is, why does it return Sid::NOSTYLE in the first place?
More detail:
(gdb) frame 4 #4 0x000055555603ec54 in Ms::ScoreElement::initElementStyle (this=this@entry=0x55555768d3d0, ss=ss@entry=0x555556821c20 <Ms::textLineSegmentStyle>) at ./libmscore/scoreElement.cpp:211 211 setProperty(spp.pid, styleValue(spp.pid, getPropertyStyle(spp.pid))); (gdb) print spp $68 = (const Ms::StyledProperty &) @0x5555568a7aa0: {sid = Ms::Sid::textLinePosAbove, pid = Ms::Pid::OFFSET} (gdb) print ss $69 = (const Ms::ElementStyle *) 0x555556821c20 <Ms::textLineSegmentStyle> (gdb) print *ss $70 = std::vector of length 2, capacity 2 = {{sid = Ms::Sid::textLinePosAbove, pid = Ms::Pid::OFFSET}, { sid = Ms::Sid::textLineMinDistance, pid = Ms::Pid::MIN_DISTANCE}} (gdb) print spp.pid $71 = Ms::Pid::OFFSET (gdb) print getPropertyStyle(spp.pid) $72 = Ms::Sid::NOSTYLE (gdb) print _elementStyle $73 = (const Ms::ElementStyle *) 0x555556821c20 <Ms::textLineSegmentStyle> (gdb) print *_elementStyle $74 = std::vector of length 2, capacity 2 = {{sid = Ms::Sid::textLinePosAbove, pid = Ms::Pid::OFFSET}, { sid = Ms::Sid::textLineMinDistance, pid = Ms::Pid::MIN_DISTANCE}}
Scratch the above. WHY does getPropertyStyle not return Ms::Sid::textLinePosAbove but Ms::Sid::NOSTYLE? Oh wait, it’s not a ScoreElement, it’s an Ms::TextLineSegment… HAH! and that has:
Sid TextLineSegment::getPropertyStyle(Pid pid) const { if (pid == Pid::OFFSET) { if (spanner()->anchor() == Spanner::Anchor::NOTE) return Sid::NOSTYLE; else return spanner()->placeAbove() ? Sid::textLinePosAbove : Sid::textLinePosBelow; } return TextLineBaseSegment::getPropertyStyle(pid); }
A breakpoint later:
(gdb) print _spanner->_anchor $81 = Ms::Spanner::Anchor::NOTE
Well, that’s that. So this… somewhat… applies to note-anchored textlines?
Comments
Okaaay… propertyDefault is also lots of overridden, so it’s not the solution either…
Maybe this could be a fix? I’ll test it…
Bah, I always forget that neither the pre nor the code tag work here. So you’ll have to have bogus colouring…
Yasss… this fixes my reproducible crash!
@Jojo-Schmitz might want this for 3.7
So it is 3.2.3, right? There's no 3.x-dev (any longer / any more)
And you're referring to #278916: Numbers only option has no effect on ottava display
And yes, I'd want it for my https://github.com/musescore/MuseScore/pull/9000
(Edit: is is added to that meanwhile)
code tags do work: <cpp>...</cpp>
I wonder whether this is needed for master (which will become MuseScore 4 shortly) too?
Do you have some clear steps to reproduce the issue? Along with a sample score (you mentione a score of your's in the initla post)?
Probably even earlier.
@Jojo-Schmitz you know, the best thing about reproducers is when they don’t reproduce on another system which is supposedly mostly identical, version-wise.
But I was able to see it in valgrind, with even the easiest call:
My previous reproducer involved a
-j
JSON file conversion where it was converted multiple times (PDF, part PDFs, …).Bad news: the fix isn’t eliminating all invalid memory accesses either:
Maybe someone else has a better idea?
Looks like the
getPropertyStyle
in the first place also accesses uninitialised memory.I wonder if the
_elementStyle
, which is set to thess
argument of theinitElementStyle
function, is maybe not correctly initialised in the first place?cough
cough… cough…
What does
std::vector
initialise and how? Are perhaps all other values uninitialised? Or even… not allocated? Is the size used correct wrt. the actual size? Hm well, it does use…… but… I don’t know CFrustFrust enough. Meh. (And I have mentally switched to the Tₑχ/LᴬTᴇΧ side of my project yesternight already…)
In reply to So it is 3.2.3, right? There… by Jojo-Schmitz
The “cpp” tag is not the “code” tag, though. Both “code” and “pre” are documented (in the help you can open when you report a new issue) but do not work, whereas “cpp” and “xml” work, but have wrong syntax highlighting for things that aren’t C++ or XML.
So your propoposed code change does not fix the issue?
What exactly does reproduce the crash, step by step?`
"cpp" in not the "code" tag, but a code tag.
Perhaps, but I was talking about the “code” and “pre” tag. I need one that works but has no syntax highlighting. The code and pre tags do not work, period.
The proposed change at least hides the crash from me. It seems to fix part of the problem, but not the complete problem space spanned by the underlying issue, as the valgrind output shows.
You can see/reproduce that with the score I attached by running “valgrind mscore3 -o foo.pdf foo.mscz” (FSVO foo being the score name).
Note that running under Valgrind is very slow.
So that doesn't reproduce the crash you're talking about in the initial post(s) but `just' shows a memory leak, or am I missing something?
You’re missing something. Valgrind does not just show leaks. These are not leak reports.
Specifically, both
Conditional jump or move depends on uninitialised value(s)
andUse of uninitialised value of size 8
are certain crashes that are just mitigated by accident because they use memory that happens to be at the accessed location at this point in time, but does not always have to be (for example, from a previous allocation of a different object (which means it involves memory corruption) or uninitialised stack space).So these point out places where the program will crash if the conditions are right (and otherwise may misbehave and/or corrupt its memory, including creating data corruption, including in the scores when saved).
Let’s look at one I could figure out quickly:
We have:
So the read access to
symId
accesses uninitialised memory. This is passed in from the caller, so now we look at…… so the thing passed in is
_symId
, which is initialised…… too late. The read access occurs in
initElementStyle
(line 45) but the memory place is initialised only in line 46, afterwards. (Note that, in the valgrind backtrace,articulation.cpp:45
is indeed there, four frames down.)Is
initElementStyle
always called first? If so, maybe my first hunch was correct after all and it should usepropertyDefault
instead?Shiiit.
initElementStyle uses anything used by ::getPropertyStyle(pid), ::styleValue(pid, sid) or ::propertyDefault(pid) (and ::setProperty(pid, qv) in theory) so these must be set first, but some variables that are currently set after initElementStyle must stay being set afterwards because they override what the setProperty call done by initElementStyle writes… gah!
In reply to Shiiit. initElementStyle… by mirabilos
Worse: our symId example must be set both *before* initElementStyle (because it’s used) _and afterwards (because ::setProperty sets it)… unless the value set by ::setProperty is the same, in which case the assignment just must be moved before the initElementStyle call.
This is true for probably many…
I’ve updated my patch, see: https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=alioth/musescor…
Note this is against 3.2.3
With this patch applied, I can convert the score to PDF under Valgrind with no errors.
I suspect the places in question, possibly more, can be found quicker by building with ASan/UBSan (needs recent GCC or Clang+LLVM, ideally on GNU/Linux) and running lots of tests with it.
The method here is: anything
initElementStyle
uses (whether the scoreElement one or the textbase one), recursively, must be initialised in the constructor before the call toinitElementStyle
happens. (I copied the assignments so the slots stay the same value as before the patch in case thesetProperty
calls ininitElementStyle
change them to different values.) This needs reviewing 2–3 methods each in several dozen files manually(!).I’ve done that for 3.2.3 to the best of my power (though I naïvely assume
setProperty
does not access anything and hope that’ll do), but if anyone would apply this to 3.6.2 or @Jojo-Schmitz ’s 3.7 branch, they’d have to do it for that.In reply to I’ve updated my patch, see:… by mirabilos
Wow. I'm in awe.
In reply to I’ve updated my patch, see:… by mirabilos
in 3.6.2, or at least in my 3.7, those intitialisations do happen before the
initElementStyle();
(but not again after!), except for the Lyrics case.So I don't see how your patch (which rather is a diff of diffs) would apply there
In reply to in 3.6.2, or at least in my… by Jojo-Schmitz
> in 3.6.2, or at least in my 3.7, those intitialisations do happen before the initElementStyle(); (but not again after!), except for the Lyrics case.
This is why I said the necessary changes have to be figured out for each branch separately. Using valgrind, ideally, which is extremely slow, or perhaps ASan+UBSan, which is still slow but faster than Valgrind and may also flag these uses (I hope).
> So I don't see how your patch (which rather is a diff of diffs) would apply there
I did write that this is not a patch to apply as-is. (And yes, it’s a diff of a diff because it’s the change I applied to my former patch.)
commit 7ec409722d79bfca2eb1877006b1efc7d5aaa308 changed the order of the initialisations to something I’d have expected, so this is deliberate. Still, at least
_no
can be accessed before it’s initialised (interestingly enough, there’s an_even
, but the code using it might have switched to_no
later…).I guess I’ll backport that commit and reduce the patch accordingly. The change is deliberate, though I wonder how much of it even was a change.
Only overlap is articulation and dynamic, so the others are still needed…
In reply to commit… by mirabilos
How much work would it be to roll forward your debian-packaging changes to 3.7?
In reply to How much work would it be to… by larryz
(Replying to larryz but this is off-topic.)
Quite a lot (plus I’ll have to write something to add (optionally) back the old accidental entry mode that got lost in 3.3 before I’d consider using it myself).
There’s been some discussion around Debian about this already. Basically, upgrading to 3.6.2 (over a year old at this time, with tons of known bugs and no upstream support either) might be a disservice to users; upgrading to Jojo’s 3.7 branch is a disservice to other users because the mu͒.com backend will lack the fixes from there.
Right now, I’m telling users that, unless they need Leland or chord playbacks or a very small handfull of other features, they can load the newer scores with the highly-patched¹ 3.2.3 in Debian and the PPA, click the error message away, and it’ll work. The mu͒.org developers certainly have been telling people for years to use the AppImage² instead, so these who do need the features mentioned could probably do that.
① over 130 patches by now
② which is probably illegal to distribute due to the licences of the components involved, but let’s not care about that apparently
(Off-Topic Part 2)
Mirabilos: Quite a lot (plus I’ll have to write something to add (optionally) back the old accidental entry mode that got lost in 3.3 before I’d consider using it myself).
If understood correctly (apparently one can never be sure these days), I got implemented post-accidental applicators as shortcut commands in the nick of time before they stopped 3.x development (in 3.6) - Just making sure you're aware of their existence:
In reply to (Replying to larryz but this… by mirabilos
3.6.2 should have been ported to Debian long ago. Not doing it now, just because a long time has passed, is not making any sense, it is, and will remain to stay the last 3.x version. Either port it or get the utterly outdated 3.2.3 removed from the repo. There too an AppImage is available after all.
3.6.2 sure does have bugs, but surely not tons of them and surely not more nor worse ones than 3.2.3. That one should never have gotten highly patched in the first place, that effort had better been spent elsewhere, like in doing the 3.6.2 port.
Whether to also do my 3.7 for Debian is debatable.
@worldwideweary interesting, I only saw that someone said someone had to do that, in the relevant thread, but never that someone actually did it. I’ll have a look at it given time.
@Jojo-Schmitz that’s your opinion, which you are entitled to; however, as distro maintainer, I have to weigh multiple concerns and steakholders etc. and am entitled to my own. And I had been very disappointed with the constant push from upstream towards AppImage and against packaged solutions (even when the packages were delivering the latest up-to-date things), and this went on for a long time, so excuse me for weighting the opinions of the people who’d tell users to not use my thing afterwards anyway less strongly.
Back to topic, ok?
Sure: does this problem still exists in 3.6.2? Does it stiill exist in my 3.7. Does it still exist in the current master builds?
What are the exact steps to reproduceit?
In reply to Sure: does this problem… by Jojo-Schmitz
It definitely occurs on the mu͒.com backend, which is 3.6.2 plus some patches TTBOMK.
The mu͒.com backend crashes on http://www.mirbsd.org/music/free/Bortnjanski%20--%20Tebe%20poem.mscz but this is no guarantee. (While editing it, I occasionally got crashes, but mostly involving loading and closing this score more than once in one session, or at least another score.) This is because this is uninitialised memory access, which makes it up to chance.
Valgrind detects it, e.g.
valgrind mscore3 -o x.$extension scorename.msc[xz]
for extension in PDF, MP3, … but this is veeeeeeeery slow. To find all of them, you need a score that exercises every single class, ideally in multiple constellations.If compiling MuseScore with ASan+UBSan, at least some of these accesses should be detected. It’s not as throrough as Valgrind but not that slow, only veeery slow.