Non-ASCII characters in lyrics break alignment?

• Jun 10, 2015 - 22:06

Context: Linux Mint 17.1, self-compiled GitHub master and the released 2.0 version. In both, I observe the following:

If a lyric ends with a non-ASCII character (for instance an accented vowel), it is aligned as if the non-ASCII character were not there. A couple of screenshots:

Non_ASCII_Lyrics_1.png

Note the different alignment of the «Su» without an accent and the «Sù» with an accent.

Non_ASCII_Lyrics_2.png

Note the «to è» aligned as if the «<space>è» part were not there.

Can anybody confirm? Perhaps on a different OS?

Thanks,

M.

Attachment Size
Non_ASCII_Lyrics_1.png 9.11 KB
Non_ASCII_Lyrics_2.png 6.01 KB

Comments

In reply to by Marc Sabatella

Do you refer to: "(^[\\d\\W]*)([^\\d\\W].*?)([\\d\\W]*$)"?

I did a test and, indeed, the third capture group captures the trailing «ù» and «<space>è» of the above examples.

This looks wrong: since 'ù' is obviously not a digit, it can only have been matched as a non-word character (\W), yet it is a word character.

A Qt bug?
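To make the test above reproducible outside MuseScore, here is a minimal sketch in Python (not Qt). It assumes that Python's `re.ASCII` flag is a fair stand-in for an engine whose `\w`/`\W` classes only know ASCII; whether that is what actually happens inside Qt is exactly the open question. The function name `split_lyric` is made up for this example, and the pattern is the one quoted above with the C++ string escaping removed.

```python
import re

# Same pattern as quoted above; the doubled backslashes in the original
# are C++ string-literal escaping, so a Python raw string needs only one.
PATTERN = r"(^[\d\W]*)([^\d\W].*?)([\d\W]*$)"


def split_lyric(text, ascii_only):
    """Return the (leading, core, trailing) capture groups, or None.

    ascii_only=True restricts \\w, \\W and \\d to ASCII (re.ASCII), which is
    an assumed stand-in for a regex engine without Unicode-aware classes.
    """
    flags = re.ASCII if ascii_only else 0
    match = re.match(PATTERN, text, flags)
    return match.groups() if match else None


if __name__ == "__main__":
    for lyric in ("Su", "Sù", "to è"):
        print(repr(lyric),
              "ASCII-only:", split_lyric(lyric, True),
              "Unicode-aware:", split_lyric(lyric, False))
```

With ASCII-only classes the third group swallows «ù» and «<space>è», matching the misalignment in the screenshots; with Unicode-aware classes the accented vowels stay in the second group.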

Now, I understand that everybody's own issues seem the most important to them, but this rather severely affects any Italian text (hardly a secondary language in pre-classical and classical Western music) and possibly French texts too, to some degree (past participles come easily to mind).

I thought that, if the code was added to look for leading and trailing text in parentheses, one could look specifically for parentheses and avoid the word/non-word character classes, which seem unreliable.

However, parentheses may legitimately occur in 'normal' lyrics, like for instance in:
Non_ASCII_Lyrics_4.png
where they signal an editorial addition.

So I think we have hit a wall here...

Thanks,

M.

Attachment Size
Non_ASCII_Lyrics_4.png 3.26 KB

In reply to by Miwarre

Parentheses weren't even the case I was most concerned about. If you follow the links to other threads, you'll see it started with a request to ignore a leading "..." then kind of mushroomed from there.

Anyhow, I definitely agree that we need a fix. I'd even settle for just backing the code out completely, but I know a few people might not like that. In one of those threads, David Bolton posted a specific list of characters he thought should be skipped. At the time it seemed simpler to just use the predefined Qt character classes, but if we can't rely on them, we should probably revisit that.
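For illustration only, the explicit-list idea could look roughly like the following Python sketch. The contents of `SKIP_CHARS` are a placeholder chosen for this example, not David Bolton's actual list, and `split_lyric_explicit` is a hypothetical name. Parentheses are left out of the set here because, as noted earlier in the thread, they can be a legitimate part of the lyric.

```python
# Placeholder skip list, assumed for this example only (not the list
# proposed in the other thread). Parentheses are deliberately excluded.
SKIP_CHARS = set("….,;:!?-\"' ")


def _is_skipped(ch):
    """True for characters that alignment should ignore at either end."""
    return ch in SKIP_CHARS or ch.isdigit()


def split_lyric_explicit(text):
    """Split a lyric into (leading, core, trailing) without \\w or \\W.

    Only characters from the explicit skip list (plus digits, e.g. verse
    numbers) are peeled off the ends; everything else, accented letters
    included, stays in the core that is used for alignment.
    """
    start = 0
    while start < len(text) and _is_skipped(text[start]):
        start += 1
    end = len(text)
    while end > start and _is_skipped(text[end - 1]):
        end -= 1
    return text[:start], text[start:end], text[end:]


if __name__ == "__main__":
    for lyric in ("Sù", "...to è,", "1. Su"):
        print(repr(lyric), split_lyric_explicit(lyric))
```

Because nothing here depends on how a regex engine classifies word characters, «Sù» and «to è» keep their accented vowels in the core regardless of locale or engine options.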
