Non-ASCII characters in lyrics break alignment?
Context: Linux Mint 17.1, self-compiled GitHub master and the released 2.0 version. In both, I observe the following:
If a lyric ends with a non-ASCII character (for instance an accented vowel), it is aligned as if the non-ASCII character were not there. A couple of screenshots:
Note the different alignment of the «Su» without an accent and the «Sù» with an accent.
Note the «to è», aligned as if the «<space>è» part were not there.
Can anybody confirm? Perhaps on a different OS?
Thanks,
M.
Attachment | Size |
---|---|
Non_ASCII_Lyrics_1.png | 9.11 KB |
Non_ASCII_Lyrics_2.png | 6.01 KB |
Comments
Presumably a result of the code I added to ignore "special characters", in response to requests like #24856: Ignore "parenthesized expressions" that precede lyric syllables. I guess the regular expressions used in Lyrics::layout1() could stand some tweaking.
In reply to Presumably a result of the by Marc Sabatella
Do you refer to this regular expression?
"(^[\\d\\W]*)([^\\d\\W].*?)([\\d\\W]*$)"
I did a test and, indeed, the third capture group captures the trailing «ù» and «<space>è» of the above examples.
This is wrong: 'ù' is obviously not a digit, so it can only have been matched as a non-word character (\W), yet it is a word character.
A Qt bug?
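One possible explanation, which is only an assumption and is not confirmed anywhere in this thread, is that the matcher treats \w and \W as ASCII-only, so that any accented letter counts as a non-word character. The sketch below does not use Qt at all: it emulates that behaviour with Python's re.ASCII flag and compares it with Unicode-aware matching. The helper name split_syllable is hypothetical.

```python
import re

# Pattern quoted above, with the C++ string escaping removed.
PATTERN = r"(^[\d\W]*)([^\d\W].*?)([\d\W]*$)"


def split_syllable(text, ascii_only):
    """Return (leading, core, trailing) for a lyric syllable, or None.

    ascii_only=True emulates a matcher whose \\w and \\W only know ASCII
    (an assumption about the cause, not something confirmed in the thread).
    """
    flags = re.ASCII if ascii_only else 0
    match = re.match(PATTERN, text, flags)
    return match.groups() if match else None


if __name__ == "__main__":
    for text in ["Su", "Sù", "to è", "è il"]:
        print(repr(text),
              "ASCII-only:", split_syllable(text, True),
              "Unicode:", split_syllable(text, False))
```

With ASCII-only classes, «Sù» splits into a core of «S» and a trailing «ù», and «è il» loses its leading «è », which matches the symptoms described here. With Unicode-aware classes the whole text stays in the core group.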
Now, I understand that everybody finds their own issues the most important, but this rather severely affects any Italian text (which is hardly secondary in pre-classical and classical Western music) and possibly French texts too, to some degree (past participles come easily to mind).
I thought that, if the code was added to look for leading and trailing text in parentheses, one could look specifically for parentheses and avoid the word/non-word character classes, which seem unreliable.
However, parentheses may legitimately occur in 'normal' lyrics, like for instance in:
[screenshot not available]
where they signal an editorial addition.
So, I think we hit some wall, here...
Thanks,
M.
In reply to Do you refer to: by Miwarre
Parentheses weren't even the case I was most concerned about. If you follow the links to other threads, you'll see it started with a request to ignore a leading "..." then kind of mushroomed from there.
Anyhow, I definitely agree that we need a fix. I'd even settle for just backing the code out completely, but I know a few people might not like that. In one of those threads, David Bolton posted a specific list of characters he thought should be skipped. At the time it seemed simpler to just use the predefined Qt character classes, but if we can't rely on them, we should probably revisit that.
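As a rough illustration of the explicit-list idea, here is a minimal Python sketch (not the MuseScore code). The character list in SKIP_CHARS is purely hypothetical and is not the list posted in the other thread; the names SKIP_CHARS, EXPLICIT_PATTERN and split_explicit are likewise made up for this example.

```python
import re

# Hypothetical set of characters to ignore at either end of a syllable:
# digits, whitespace and some common punctuation. Not the list from the thread.
SKIP_CHARS = r"\d\s.,;:!?…()\[\]\-"

# Leading skipped characters, then the core text, then trailing skipped characters.
EXPLICIT_PATTERN = re.compile(rf"^([{SKIP_CHARS}]*)(.*?)([{SKIP_CHARS}]*)$")


def split_explicit(text):
    """Return (leading, core, trailing) using only the explicit list."""
    match = EXPLICIT_PATTERN.match(text)
    return match.groups() if match else None


if __name__ == "__main__":
    for text in ["...ah", "2. Sù!", "to è", "è il"]:
        print(repr(text), split_explicit(text))
```

Because accented letters are simply absent from the list, they always stay in the core group regardless of how the regex engine defines a word character. The trade-off is that any symbol missing from the list is never skipped.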
Indeed it happens also with initial non-ASCII characters:
[screenshot not available]
(in «è il»)
In reply to Indeed it happens also with by Miwarre
It would be great to submit a proper issue in the issue tracker so we don't lose track of this one.
In reply to Would be great to submit a by [DELETED] 5
Done: #64856: Accented characters at beginning/end of lyrics ignored with respect to alignment
I should be able to take a look at this in the next day or two. Edit: Done :-)