I need *your* benchmarking help!

Andreas Stangl andreas-stangl at gmx.net
Sun Oct 25 15:25:31 UTC 2009


Hi Jeff,
here are my benchmarking results (3904 tracks, 418 folders):

* amarok2 master
  - scan time: 04:16.4

* amarok2 jefferai-work.git->uidhash
  - scan time: 04:01.0
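
(In seconds that is 256.4 s for master versus 241.0 s for uidhash, a saving of
about 15 s, or roughly a 6% reduction in scan time on this collection.)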

hope this helps.
-- 

Regards, 
Andreas




On Friday 23 October 2009 at 02:37:57, Jeff Mitchell <mitchell at kde.org> wrote:
> No apologies for the cross-post  :-)
> 
> Over the last few days I've been working extremely hard -- just ask my
> poor neglected wife -- on solving one of the longest-standing issues in
> A2 -- and, for that matter, A1 -- scanning performance.
> 
> I've done all kinds of tweaks in the amarokcollectionscanner binary itself
> ( http://blog.jefferai.org/2009/10/14/speed-never-gets-old-at-least-in-software-1129 )
> but the problem remained that, when it came down to it, we were still just
> accessing the database too damn much.
> 
> So, I finally bit the bullet and did what Leo and I had figured a while
> back was the only way to solve this problem -- replace all SQL queries
> in the middle of the scan with batch queries at the front and back and a
> series of hashes with types like
> QHash<int, QLinkedList<QStringList*> *>. In other words, the
> ScanResultProcessor now runs a batch of SQL queries to populate the hashes
> at the start, then uses *only* the hashes during all of the "inserts" and
> queries, and then writes all the hashes back out to SQL at the end. While
> maintaining cache coherency the whole way. If I didn't mess up. Which I
> didn't. I think. Pretty sure. Possibly.
> 
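
For anyone who wants to see the shape of what Jeff is describing, here is a
minimal, made-up sketch of the pattern: one batch SELECT up front to fill a
hash, hash-only lookups and "inserts" during the scan, and one batch of writes
at the end. The table, column, and variable names are invented, and the hash
types are far simpler than the QHash<int, QLinkedList<QStringList*> *>
structures he mentions; this is not the actual ScanResultProcessor code.

#include <QtSql/QSqlDatabase>
#include <QtSql/QSqlQuery>
#include <QHash>
#include <QString>
#include <QStringList>
#include <QVariant>

int main()
{
    // Throwaway in-memory SQLite database standing in for the collection DB.
    QSqlDatabase db = QSqlDatabase::addDatabase( "QSQLITE" );
    db.setDatabaseName( ":memory:" );
    db.open();

    QSqlQuery q;
    q.exec( "CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT)" );
    q.exec( "INSERT INTO artists (name) VALUES ('Artist A')" );
    q.exec( "INSERT INTO artists (name) VALUES ('Artist B')" );

    // Front: pull the whole table into a hash with a single batch query.
    QHash<QString, int> artistIds;   // name -> id cache
    int maxId = 0;
    q.exec( "SELECT id, name FROM artists" );
    while( q.next() )
    {
        const int id = q.value( 0 ).toInt();
        artistIds.insert( q.value( 1 ).toString(), id );
        maxId = qMax( maxId, id );
    }

    // Middle: the "scan" touches only the hash -- no SQL per track.
    // New entries get an id assigned locally and are remembered for later.
    QStringList scannedArtists;
    scannedArtists << "Artist B" << "Artist C";   // pretend scanner output
    QHash<int, QString> pendingInserts;           // id -> name, not yet in SQL
    foreach( const QString &name, scannedArtists )
    {
        if( !artistIds.contains( name ) )
        {
            const int id = ++maxId;
            artistIds.insert( name, id );
            pendingInserts.insert( id, name );
        }
    }

    // Back: write out everything that is new in one batch at the end.
    q.prepare( "INSERT INTO artists (id, name) VALUES (:id, :name)" );
    QHashIterator<int, QString> it( pendingInserts );
    while( it.hasNext() )
    {
        it.next();
        q.bindValue( ":id", it.key() );
        q.bindValue( ":name", it.value() );
        q.exec();
    }
    return 0;
}

The hard part in the real code is the bit Jeff calls cache coherency: keeping
the hashes and what eventually lands in SQL telling the same story, which a
toy example like this conveniently sidesteps.
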
> If you didn't understand that, don't worry. All you have to know is that
> it was an absolute fuckload of work, and now I need some help
> determining whether or not it was all worth it, which means I need
> people benchmarking.
> 
> If you'd like to help out -- and please, do help out -- you'll need a
> collection large enough that a normal full scan takes a noticeable
> amount of time, and you'll need to be running Amarok built from Git.
> Here's what you do:
> 
> 1) Update master and build. Open Amarok and run two full rescans
> (Settings->Collection->Fully Rescan Collection). So click, let it run
> until the progress bar reaches 100%, then click again. The second time,
> time it with a stopwatch or some such thing. (The reason it's done twice
> is so that the effects of disk caching can be reasonably ignored between
> this version and the new one.)
> 
> 2) Add my clone as a remote (use Google if you need help). My clone is
> at git://gitorious.org/~jefferai/amarok/jefferai-work.git and the branch
> you want is called "uidhash". Build it.
> 
> 3) Run two full rescans again, timing the second one.
> 
> 4) Close Amarok and re-open it.*
> 
> 5) If you see any oddities (that aren't fixed by switching back to
> master and rebuilding, then running a full rescan, then closing and
> reopening Amarok -- yes, all those steps), please be sure to report them
> along with your benchmark results.
> 
> Many thanks in advance to those who help out.
> --Jeff
> 
> * The reason you need to close and reopen Amarok is that there are
> longstanding bugs in the collection browser that cause it not to be
> updated properly when new data is scanned. So it can *look* like your
> collection is messed up when what's really happening is that the browser
> is using stale cached data. Since the browser reloads from SQL every time
> you open Amarok, closing and reopening it ensures that you're seeing the
> browser without these bugs interfering, so you can tell whether something
> is truly not working right.
> 

