Sphinx Application Documentation - Image duplication

Julius Künzel jk.kdedev at smartlab.uber.space
Sun Jan 22 18:51:36 GMT 2023


Hi Ben, hi all,

I did a little research on this recently, and unfortunately it seems to me that there is not really a solution on the Sphinx side. One needs a separate build dir for every language, and Sphinx copies all static files (CSS, JS, images, ...) into every build dir. That's just how it works :-/ (Correct me if anyone knows I am wrong.)
However, we can of course try to solve this on our end and make our deploy tools smart enough to keep only one copy of each image file and replace the others with symlinks.
It should be fairly easy to detect translated images, since they follow the pattern filename.de.png, where "de" is the language code: this image would be specific to German, while all other languages use filename.png.
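
A rough sketch of what such a deploy step could look like, in Python. It assumes the deployed layout is <site_root>/<lang>/_images/<file> (as in the sha256sum listing quoted below) and that "en" holds the canonical copies; the function names are made up for illustration and are not part of any existing KDE tooling. Instead of relying on the filename pattern, it compares file contents, so a translated image that differs from the English one is simply left alone.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_images(site_root: Path, canonical_lang: str = "en") -> int:
    """Replace byte-identical images in other language dirs with symlinks.

    Assumed layout (hypothetical, adjust to the real one):
    <site_root>/<lang>/_images/<file>. Returns the number of files replaced.
    """
    canonical_dir = site_root / canonical_lang / "_images"
    replaced = 0
    for lang_dir in sorted(site_root.iterdir()):
        images = lang_dir / "_images"
        if lang_dir.name == canonical_lang or not images.is_dir():
            continue
        for img in sorted(images.iterdir()):
            original = canonical_dir / img.name
            # Skip files that are already symlinks or have no canonical copy.
            if img.is_symlink() or not img.is_file() or not original.is_file():
                continue
            # Cheap size check first, then compare the full content hash.
            if img.stat().st_size != original.stat().st_size:
                continue
            if file_digest(img) != file_digest(original):
                continue  # content differs, e.g. a translated screenshot
            # Relative target, so the tree can be moved or rsynced as a whole.
            target = os.path.relpath(original, start=images)
            img.unlink()
            img.symlink_to(target)
            replaced += 1
    return replaced
```

Calling dedupe_images(Path("/srv/www/generated/docs.krita.org")) after a deploy would be one way to use it. Whether the web server follows symlinks and whether the sync step preserves them are further assumptions that would need checking on the real setup.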

I hope that helps so far. I might be able to look into this, but probably not very soon, so if anybody else can work on it I would be more than happy.

Cheers,
Julius




On 15 January 2023 at 07:45, "Ben Cooksley" <bcooksley at kde.org> wrote:


> 
> Hi all,
> 
> For some time now it has been known to me that the system for generating application documentation websites using Sphinx with l10n support has had issues with duplicating data - particularly images.
> 
> That leads to the following outcome, where aside from sites that we expect to be quite large (like www.kde.org and api.kde.org) all of the application documentation sites are quite big as well:
> 
> root@nicoda /srv/www # du -h --max-depth=1 ./generated/ | grep G
> 2.3G    ./generated/cutehmi.kde.org
> 3.7G    ./generated/docs.digikam.org
> 2.4G    ./generated/api.kde.org
> 2.3G    ./generated/docs.krita.org
> 1.4G    ./generated/www.kde.org
> 7.9G    ./generated/docs.kdenlive.org
> 29G     ./generated/
> 
> This stands in comparison to the Docbook documentation site for all other KDE applications:
> 
> root@nicoda /srv/www # du -h --max-depth=1 . | grep G
> 29G     ./generated
> 16G     ./api.kde.org-legacy
> 6.0G    ./docs.kde.org
> 51G     .
> 
> It would be nice if we could please look into some fixes for this, as it looks like Sphinx is duplicating the images - once for every language - when that isn't necessary.
> I could understand if the screenshots were updated as part of the translation, but it looks like they're not in the majority of cases - below being just a sample:
> 
> root@nicoda /srv/www/generated/docs.krita.org # sha256sum zh_CN/_images/Krita_cpb_mixing.gif
> 12eb4cbad29a5a6486d3438dabb888a0aa0b9579e55b3be2f3c1d6e1d76fc1d7  zh_CN/_images/Krita_cpb_mixing.gif
> root@nicoda /srv/www/generated/docs.krita.org # sha256sum en/_images/Krita_cpb_mixing.gif
> 12eb4cbad29a5a6486d3438dabb888a0aa0b9579e55b3be2f3c1d6e1d76fc1d7  en/_images/Krita_cpb_mixing.gif
> 
> While this isn't a massive issue right now, it is a future scalability issue as for Krita at least each language costs 178MB or so, while for Digikam that sits at 415MB per language and Kdenlive is 392MB.
> 
> Many thanks,
> Ben
> 


Julius Künzel
Volunteer KDE Developer, mainly hacking Kdenlive
KDE GitLab: https://my.kde.org/user/jlskuz/
Matrix: @jlskuz:kde.org

More information about the kde-www mailing list