Sphinx Application Documentation - Image duplication

Ben Cooksley bcooksley at kde.org
Sun Jan 22 20:42:07 GMT 2023


On Mon, Jan 23, 2023 at 8:59 AM L. E. Segovia <amy at amyspark.me> wrote:

> Hi all,
>

Hi Amy,


>
> If I understand correctly, by doing what you said you would instead have
> a copy of each image per supported language, only squashed into a
> massive monolithic folder. Did you instead mean to symlink the
> "localized" image files to the source copies?
>

From what Julius has found, all translated images have the language code
injected into the filename - with untranslated images being left unchanged.

Therefore, as pseudo-code:
- mv $root/en/_images/* $root/_images/
- mv $root/it/_images/* $root/_images/
- mv $root/fr/_images/* $root/_images/

This should achieve the objective of de-duplicating all of the images, as
the untranslated English screenshots will all overwrite each other, leaving
just a single copy of each English screenshot alongside all the translated
ones.
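
The moves above could be sketched in Python roughly as follows. This is only
an illustration under assumptions, not what the deploy tooling does today:
the function name merge_images and the conflict check are hypothetical, and
the final symlink step follows the earlier idea of linking the shared folder
back into each language. Unlike a plain mv, it refuses to overwrite a file
whose content differs, in case the filename convention has an exception.

```python
import filecmp
import os
import shutil
from pathlib import Path


def merge_images(root: Path, languages: list[str]) -> list[Path]:
    """Move every <root>/<lang>/_images/* into <root>/_images/.

    Returns the files that were left in place because a file with the
    same name but different content already exists in the shared folder.
    """
    shared = root / "_images"
    shared.mkdir(exist_ok=True)
    conflicts = []
    for lang in languages:
        lang_images = root / lang / "_images"
        if lang_images.is_symlink() or not lang_images.is_dir():
            continue  # already merged, or this language has no images
        for src in sorted(lang_images.iterdir()):
            if not src.is_file():
                continue  # subfolders are left alone in this sketch
            dest = shared / src.name
            if not dest.exists():
                shutil.move(str(src), str(dest))
            elif filecmp.cmp(src, dest, shallow=False):
                src.unlink()  # byte-identical duplicate, drop it
            else:
                conflicts.append(src)  # same name, different content
        if not any(lang_images.iterdir()):
            # Folder is now empty: replace it with a relative symlink so
            # the existing <img src="_images/..."> paths keep resolving.
            lang_images.rmdir()
            lang_images.symlink_to(os.path.join("..", "_images"),
                                   target_is_directory=True)
    return conflicts
```

Calling merge_images(Path(root), ["en", "it", "fr"]) would cover the three
languages in the pseudo-code; anything returned in the conflict list would
need a manual look before the per-language folder can be replaced.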


>
> Another idea I have is to preserve the localization step as is, but
> ignore the generated image folder, and in a postbuild step replace the
> <img src="localized image path"> with the path to the source folder. (I
> do something like this with a HTMLPipeline filter for my blog's emojis.)
>
> Cheers,
>
> amyspark
>
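
For reference, a post-build rewrite along those lines might look like the
sketch below. It is a rough illustration only: the function name
rewrite_img_src and the "/_images/" target prefix are assumptions, and it
only handles double-quoted src attributes on <img> tags, so links wrapped
around images would need separate handling.

```python
import re

# Matches <img ... src="<anything>/_images/<file>"> and captures the parts
# around the directory portion so only that portion is replaced.
IMG_SRC = re.compile(
    r'(<img\b[^>]*?\bsrc=")(?:[^"]*/)?_images/([^"/]+)(")',
    re.IGNORECASE,
)


def rewrite_img_src(html: str, shared_prefix: str = "/_images/") -> str:
    """Point every <img> that uses a per-language _images folder at a
    single shared location (shared_prefix is a hypothetical path)."""
    return IMG_SRC.sub(
        lambda m: m.group(1) + shared_prefix + m.group(2) + m.group(3),
        html,
    )
```

Run over each generated HTML file, this would leave the per-language image
folders unreferenced, so they could then be dropped from the deployment.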

Cheers,
Ben


>
>
> PS: I've trimmed the CC as I wasn't sure if I should mail four lists at
> once. Feel free to forward the email if necessary.
>
> On 22/01/2023 16:02, Ben Cooksley wrote:
> > On Mon, Jan 23, 2023 at 7:51 AM Julius Künzel
> > <jk.kdedev at smartlab.uber.space> wrote:
> >
> >     Hi Ben, hi all,
> >
> >
> > Hi Julius,
> >
> >
> >
> >     I did a little research about this recently and unfortunately it
> >     seems to me as if there is not really a solution on the Sphinx side.
> >     One needs to have separate build dirs for every language and it
> >     copies all static files (css, js, images, ...) to every build dir.
> >     That's just how it works :-/ (Correct me in case anyone knows I am
> >     wrong).
> >     However we can of course try to solve this on our end and make our
> >     deploy tools smart in a way that they keep only one version of each
> >     image file and replace the others with symlinks.
> >     It should be more or less easy to detect images that are translated
> >     since they follow the pattern filename.de.png where "de" is the
> >     language code, so this image would be special for German, while for
> >     all other languages filename.png is used.
> >
> >
> > I had a very strong feeling that would be the case (very much seems that
> > Sphinx actually doesn't have proper i18n/l10n support and it's been
> > hacked in / bolted on later).
> >
> > My initial thinking on a quick and (somewhat) dirty solution to this had
> > been to merge all of the image files into a single folder at top level
> > and then symlink that from each language.
> > Knowing that translated images actually have a separate filename
> > convention indicates that this might just be crazy enough to work.
> >
> > Thoughts?
> >
> >
> >     I hope that helps so far. I might be able to look into this, but
> >     probably not very soon so if anybody else can work on this I am more
> >     than happy.
> >
> >     Cheers,
> >     Julius
> >
> >
> > Regards,
> > Ben
> >
> >
> >
> >     On 15 January 2023 at 07:45, "Ben Cooksley" <bcooksley at kde.org>
> >     wrote:
> >
> >         Hi all,
> >
> >         For some time now it has been known to me that the system for
> >         generating application documentation websites using Sphinx with
> >         l10n support has had issues with duplicating data - particularly
> >         images.
> >
> >         That leads to the following outcome, where aside from sites that
> >         we expect to be quite large (like www.kde.org and api.kde.org)
> >         all of the application documentation sites are quite big as well:
> >
> >         root@nicoda /srv/www # du -h --max-depth=1 ./generated/ | grep G
> >         2.3G    ./generated/cutehmi.kde.org
> >         3.7G    ./generated/docs.digikam.org
> >         2.4G    ./generated/api.kde.org
> >         2.3G    ./generated/docs.krita.org
> >         1.4G    ./generated/www.kde.org
> >         7.9G    ./generated/docs.kdenlive.org
> >         29G     ./generated/
> >
> >         This stands in comparison to the Docbook documentation site for
> >         all other KDE applications:
> >
> >         root@nicoda /srv/www # du -h --max-depth=1 . | grep G
> >         29G     ./generated
> >         16G     ./api.kde.org-legacy
> >         6.0G    ./docs.kde.org
> >         51G     .
> >
> >         It would be nice if we could please look into some fixes for
> >         this, as it looks like Sphinx is duplicating the images - once
> >         for every language - when that isn't necessary.
> >         I could understand if the screenshots were updated as part of
> >         the translation, but it looks like they're not in the majority
> >         of cases - below being just a sample:
> >
> >         root@nicoda /srv/www/generated/docs.krita.org # sha256sum zh_CN/_images/Krita_cpb_mixing.gif
> >         12eb4cbad29a5a6486d3438dabb888a0aa0b9579e55b3be2f3c1d6e1d76fc1d7  zh_CN/_images/Krita_cpb_mixing.gif
> >         root@nicoda /srv/www/generated/docs.krita.org # sha256sum en/_images/Krita_cpb_mixing.gif
> >         12eb4cbad29a5a6486d3438dabb888a0aa0b9579e55b3be2f3c1d6e1d76fc1d7  en/_images/Krita_cpb_mixing.gif
> >
> >         While this isn't a massive issue right now, it is a future
> >         scalability issue as for Krita at least each language costs
> >         178MB or so, while for Digikam that sits at 415MB per language
> >         and Kdenlive is 392MB.
> >
> >         Many thanks,
> >         Ben
> >
> >
> >
> >     Julius Künzel
> >     Volunteer KDE Developer, mainly hacking Kdenlive
> >     KDE GitLab: https://my.kde.org/user/jlskuz/
> >     Matrix: @jlskuz:kde.org
> >
>
> --
> amyspark 🌸 https://www.amyspark.me
>


More information about the kde-www mailing list