[Owncloud] Archive reading resource hog (bug oc-1224)
samtuke at owncloud.com
Wed Sep 19 18:08:56 UTC 2012
Here's a status update on bug oc-1224 that I've been working on.
When files_archive is enabled, and a new archive is added to ownCloud, all
subfiles and directories are opened on the server and added to the filecache.
This hogs resources to a show-stopping extent.
As far as I can see, it's not necessary to do this. With gzipped files, if only
the archive file itself is read into the cache when it is first uploaded, then
you can browse through the archive in the web interface quite happily, as each
time you open a subdirectory, the contents of that directory are read into the
cache. I have used path regexes to prevent recursive scanning of archives, and
this works for gzip files.
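
A minimal sketch of that kind of path check, written in Python rather than
ownCloud's PHP purely for illustration. It assumes archive members are
addressed as paths beneath the archive's own filename (for example
site.tar.gz/js/app.js). The extension list and the function names are
hypothetical, not taken from files_archive.

```python
import re

# Hypothetical extension list; the set actually handled by files_archive may differ.
# The pattern matches an archive extension followed by "/" and at least one more
# character, i.e. a path that points *below* an archive rather than at the archive.
INSIDE_ARCHIVE = re.compile(r"\.(zip|tar\.gz|tgz|gz)/.", re.IGNORECASE)


def is_inside_archive(path: str) -> bool:
    """Return True if the path refers to an entry inside an archive file."""
    return INSIDE_ARCHIVE.search(path) is not None


def should_scan(path: str) -> bool:
    """Scan the archive file itself, but never recurse into its contents."""
    return not is_inside_archive(path)
```

With a check like this the scanner still caches site.tar.gz itself, and
anything below it is only read when a user opens that directory.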
Zip files, however, don't work the same way. Unless the whole archive is
scanned into the file cache when it's first added, the archive is not browsable
via the web interface. I'm not sure why rescans are triggered for gzip archives
and not for zip archives.
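
For comparison, here is a sketch of listing a single directory level of a zip
file on demand. It uses Python's standard zipfile module, not ownCloud code,
and list_level is a hypothetical helper. It only shows that the zip format
itself allows one level to be listed without extracting anything; it says
nothing about why the rescan is not triggered in ownCloud.

```python
import io
import zipfile


def list_level(zf: zipfile.ZipFile, prefix: str = "") -> list:
    """Return the immediate children of `prefix`; directory names end with '/'."""
    children = set()
    for name in zf.namelist():  # names come from the archive's index, nothing is extracted
        if not name.startswith(prefix) or name == prefix:
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition("/")
        children.add(head + sep)  # sep is "/" when the child is a directory
    return sorted(children)


# Build a small archive in memory to try it out.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.writestr("img/one.png", "x")
    zf.writestr("img/deep/two.png", "y")

with zipfile.ZipFile(buf) as zf:
    top_level = list_level(zf)
    img_level = list_level(zf, "img/")
```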
However, even when recursive scanning of gzip archives is prevented, the
resources required to scan even a few files within an archive are impractically high.
On my dual-core machine, a gzip file with only three subfiles (small images) and
three subdirectories takes about 30 seconds to scan. Scanning the contents of
the top level of any real world web app archive (like tinymce or phplist),
which has about ten files / directories in its root folder, takes more than 5
I'm currently trying to identify exactly where the bottleneck is - commenting
out all of scanfile() (in filecache.php) doesn't ease things, so the issue must
presumably lie somewhere in scan(). I think I need to get xdebug working again
to investigate further.
That's it from me for this week.