[digikam] [Bug 382361] New: Continuous net usage, server CPU and IO usage while doing nothing but keeping a collection opened [network share & mysql db]

Thomas Debesse bugzilla_noreply at kde.org
Sat Jul 15 07:04:56 BST 2017


https://bugs.kde.org/show_bug.cgi?id=382361

            Bug ID: 382361
           Summary: Continuous net usage, server CPU and IO usage while
                    doing nothing but keeping a collection opened [network
                    share & mysql db]
           Product: digikam
           Version: 5.6.0
          Platform: Other
                OS: MS Windows
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: Thumbnails
          Assignee: digikam-devel at kde.org
          Reporter: dev at illwieckz.net
  Target Milestone: ---

Here digiKam (on a Windows workstation) is configured to store its core,
thumbnail and face databases on a MySQL server (on another host running Linux),
and to store pictures on a network share (a CIFS share hosted by a Samba
service running on the same host as the MySQL server). The digiKam
configuration is stored on the workstation's local drive (see bug #382358). So
only the databases and the picture file system are accessed over the network.

```
 ___________________                             ___________________
|                   |    ___________________    |                   |
|      server       |   (                   )   |    workstation    |
|___________________|   ( 10 Gbit/s network )   |___________________|
|  _______________  |   (                   )   |  _______________  |
| [ MySQL service ]←―――――――――――――――――――――――――――――→[               ] |
|  ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾  |   (                   )   | [    digiKam    ] |
|  _______________  |   (                   )   | [    client     ] |
| [ Samba service ]←―――――――――――――――――――――――――――――→[_______________] |
|  ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾  |   (___________________)   |                   |
|___________________|                           |___________________|

```

There are more than 70 000 pictures stored in that Samba share. I recently ran
the tool to find (but not to recognize) faces. It ended with about 40 000
entries in the “Unknown” people tag.

I noticed that if I open an album or a tag with a few pictures, the network
usage grows a bit while the thumbnails download, then when the thumbnails are
loaded, the network usage goes down to 0.

But if I open the “Unknown” people tag view with 40 000 entries, the thumbnails
load for the visible pictures but the network usage never goes down. On the
server side I see a steady 50 Mbit/s rate outgoing (from server to network) and
a 1 Mbit/s rate incoming. At the same time, on the workstation side, I see the
same rates mirrored (50 Mbit/s incoming from network to workstation and
1 Mbit/s outgoing, of course).

At the same time, `tcpdump` shows an endless stream of MySQL packets going
from and to the workstation, and `iotop` shows multiple processes continuously
using 30% of IO. Samba does not do any IO during this time (the thumbnails are
probably already generated, so there is no need to load the full files).

So, if I just open a large collection, leave the digiKam window open and walk
away, digiKam seems to cause continuous MySQL/IO/network usage (which is
effectively endless given the large number of thumbnails to load).

I suppose digiKam is downloading thumbnails for the complete list of pictures
available in the current view, because when I open a view with only a small
number of pictures, the MySQL/IO/network usage goes down once all the
thumbnails are loaded.

Even though both workstation and server are on a fast 10 Gbit/s network (which
is more like a 2.5 Gbit/s network due to some Windows network inefficiency),
this behavior seems to produce small annoying freezes in digiKam. By the way,
2.5 Gbit/s should be more than enough for 50 Mbit/s of network usage. The
annoyances probably come from the useless disk IO on the server, the useless
MySQL usage, and perhaps from the useless network interruptions getting in the
way of other requests.

For example, these very annoying micro-freezes occur each time I apply a people
tag to a face. In normal usage, even with every “save metadata to picture”
option activated, this kind of action is instantaneous: the tag is applied, the
picture is saved, and the face disappears from the “Unknown” tag list.

But with this kind of large “Unknown” people tag list, with the useless
background network traffic, IO usage and MySQL queries, users have to suffer a
~1 s wait after tagging someone before they can apply a people tag to another
face, even when all “save metadata to picture” options are disabled.

The configuration is solid: the server has two Intel Xeon E5430 CPUs (2×4
cores) with 32 GB of RAM and a RAID10 made of 4× 7200 rpm drives with 64 MB of
cache each, running Debian jessie with the latest kernel from backports. The
workstation has an Intel E3-1270 v5 CPU with 32 GB of RAM and an SSD RAID1
(which is only involved for the config file in this setup), running Windows 10.
The server and the workstation both have a 10 Gbit/s interface and are
connected via a 10 Gbit/s switch. The MySQL server was tuned to fit the digiKam
queries and uses a ramdisk as its tmp disk (so even when it caches to disk,
it's still in RAM). The server is a bit old (the RAM speed is probably a
bottleneck in some scenarios), so I expect somewhat slower RAM IO than more
modern servers can do, but it's still not an anemic one. Likewise, I know my
server's SATA controller is a bit limited, but that RAID10 setup is still able
to read 240 MB/s sequentially and to read 4×100 MB/s at the same time.

So, it looks like those micro-freezes come from digiKam endlessly querying and
downloading thumbnails. Of course, this useless network, disk IO and CPU usage
can also slow down everyone else on the network.

If it's not already implemented, it would be very nice for digiKam to load only
thumbnails for visible pictures. Pre-downloading thumbnails for the next one or
two screens would be fine to get smooth scrolling, but no more.
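
A minimal sketch of that idea, not digiKam code: the function name and its
parameters are hypothetical, and the view is assumed to be a flat list of
thumbnails addressed by index. It computes which thumbnails are worth
requesting: the visible screen plus a bounded number of screens around it.

```python
def thumbnails_to_load(first_visible, per_screen, total, extra_screens=1):
    """Return the index range worth loading: the visible screen plus
    `extra_screens` screens before and after it, clamped to the list."""
    start = max(0, first_visible - extra_screens * per_screen)
    stop = min(total, first_visible + (1 + extra_screens) * per_screen)
    return range(start, stop)


# Hypothetical view: 40 000 faces, 24 per screen, scrolled to index 240.
window = thumbnails_to_load(first_visible=240, per_screen=24, total=40_000)
print(len(window), "thumbnails requested out of 40000")
```

With such a window the number of requested thumbnails depends on the screen
size, not on the size of the tag list.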

Also, the thumbnail download must avoid querying one thumbnail at a time. To
explain: we can imagine digiKam downloading the thumbnails for the 24 pictures
currently displayed on the screen, then, to get smooth scrolling, downloading
the next 24 thumbnails, then the previous 24 thumbnails, then the 24 after
those.
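
The ordering described above could be sketched like this. It is only an
illustration under assumptions: `chunk_order` is a hypothetical name, and each
yielded range stands for one batched query instead of one query per thumbnail.

```python
def chunk_order(first_visible, per_screen, total):
    """Yield index ranges in the order: current screen, next screen,
    previous screen, the screen after next. One range = one batched query."""
    for offset in (0, 1, -1, 2):
        start = first_visible + offset * per_screen
        stop = min(start + per_screen, total)
        start = max(start, 0)
        if start < stop:  # skip chunks that fall outside the list
            yield range(start, stop)


# Hypothetical view scrolled to index 48 with 24 thumbnails per screen.
for chunk in chunk_order(first_visible=48, per_screen=24, total=40_000):
    print("query thumbnails", chunk.start, "to", chunk.stop - 1)
```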

But digiKam must NOT download just the single next thumbnail needed because
one picture was removed from the currently viewed collection. When the user is
setting people tags on pictures within the “Unknown” face list, each picture
disappears from that “Unknown” list once it is tagged.

If digiKam implements a “download missing invisible thumbnails up to N each
time the number of preloaded invisible thumbnails is < N” algorithm, it would
mean digiKam would query a thumbnail and download it every time the user tags
someone in a picture, triggering annoying MySQL queries and thumbnail
download/generation each time a people tag is applied. In this case, it would
be better to chunk the thumbnail query/download, so the micro-freeze only comes
from time to time, not every time. It would be OK for a user to wait a little
after having tagged 10 or 20 pictures rather than after each picture.
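
To make the difference concrete, here is a small simulation. It is a sketch
under assumptions, not a description of what digiKam does: the function name,
the buffer size of 24 and the refill threshold of 5 are all made up for
illustration. It counts how many thumbnail queries are issued while tagging
100 faces, once with a refill after every tag and once with a chunked refill.

```python
def count_queries(tag_actions, target, refill_below):
    """Count thumbnail queries issued while the user tags `tag_actions` faces.

    The preload buffer starts full with `target` thumbnails. Each tag removes
    one face from the list, so one preloaded thumbnail becomes visible. A
    query refills the buffer to `target` once it holds fewer than
    `refill_below` thumbnails.
    """
    buffered = target
    queries = 0
    for _ in range(tag_actions):
        buffered -= 1
        if buffered < refill_below:
            queries += 1
            buffered = target
    return queries


eager = count_queries(100, target=24, refill_below=24)   # refill after each tag
chunked = count_queries(100, target=24, refill_below=5)  # refill in batches
print("eager:", eager, "queries; chunked:", chunked, "queries")
```

With these made-up numbers the chunked variant refills once every 20 tags,
which matches the "wait after 10 or 20 pictures" trade-off described above.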

Also, since people will probably tag different people across the picture list
(having to manually select another label etc.), the chunked thumbnail download
would probably be less noticeable, because of the higher chance of it happening
while the user is doing something else.
