"Project - Subtitles - Import Subtitle File..." ruins non-ascii characters when importing SRT file in UTF-8 encoding.

Jean-Baptiste Mardelle jb at kdenlive.org
Tue Mar 7 05:55:46 GMT 2023


On Tuesday, 7 March 2023 04:01:29 CET Stub wrote:
> Jean-Baptiste Mardelle wrote:
> | Thanks for your report. I could reproduce the problem using very short
> | subtitle files, where it incorrectly thinks a UTF-8 file is encoded as
> | Japanese. I will update the import dialog to show the detected encoding,
> | better handle poor detection confidence and allow the user to enforce an
> | encoding.
> 
> I am quite puzzled by the following. When I have a kdenlive project and I
> have manually created new subtitles, there are two files after saving my
> project:
> myproject.kdenlive
> myproject.kdenlive.srt
> 
> Note: the srt file here is in UTF-8.
> When I save and close kdenlive, and then open my project again, it loads
> the "myproject.kdenlive.srt" file without any problem (non-ascii characters
> are *not* scrambled).
> 
> However, when I make a copy of the srt file, for example "cp
> myproject.kdenlive.srt new.srt", delete all the subtitles in kdenlive, and
> then use "Project - Subtitles - Import Subtitle File..." for the new.srt
> file, the non-ascii characters are scrambled.

The problem is now fixed in git master and the upcoming 23.04.0 version. When
Kdenlive loads the subtitle file it created itself (myproject.kdenlive.srt),
it assumes UTF-8, since that is how Kdenlive saved it. But when importing
another subtitle file, we try to detect the character encoding using a library
designed for that. It turns out this library cannot reliably determine the
encoding in many cases, and returns an "unsure" guess that we nevertheless
used as the result. I have now added a way to manually correct the encoding
on import.
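
To illustrate the idea (this is not Kdenlive's actual code, which is C++/Qt,
only a minimal Python sketch): try a strict UTF-8 decode first, fall back to
another encoding only if that fails, and let an explicit user choice override
any guess. The function name decode_subtitles, its parameters and the latin-1
fallback are assumptions made for this example.

```python
from typing import Optional, Tuple


def decode_subtitles(data: bytes,
                     forced_encoding: Optional[str] = None,
                     fallback: str = "latin-1") -> Tuple[str, str]:
    """Return (text, encoding_used) for raw subtitle bytes.

    Hypothetical helper for illustration only.
    """
    # An explicit user choice always wins over any automatic guess.
    if forced_encoding is not None:
        return data.decode(forced_encoding), forced_encoding
    try:
        # "utf-8-sig" decodes plain UTF-8 and also strips a leading BOM.
        # Decoding is strict, so invalid UTF-8 raises instead of scrambling.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback; a real importer would ask the user instead.
        return data.decode(fallback), fallback


# A very short SRT file with non-ASCII text, saved as UTF-8.
sample = "1\n00:00:01,000 --> 00:00:02,000\nŽluťoučký kůň\n".encode("utf-8")
text, used = decode_subtitles(sample)

# What a wrong guess looks like: the same bytes read as latin-1.
scrambled = sample.decode("latin-1")
```

With this order, a short UTF-8 file is never mistaken for another encoding,
and files in a legacy encoding can still be imported by passing
forced_encoding.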

Best regards,
Jean-Baptiste

> My confusion:
>  the automatic loading of the UTF-8 subtitle files works fine
>  when importing the same UTF-8 subtitle files, it scrambles the non-ascii
> characters.
> 
> -S.
