Bug#44807: Hayes randomize mode unusable

Neil Stevens neil at qualityassistant.com
Sun Jul 7 13:25:50 BST 2002


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday July 07, 2002 03:50, Eray Ozkural wrote:
> The mp3 directory is a tree with a number of songs at each node. Knowing
> that, we can make a perfect shuffling of all songs without any of these
> problems since we can give each song a number.

No we can't. at least in Hayes.  We can't enumerate files we can't see!  
That's the problem here.  There's a list of directories whose contents are 
unknown to us.  Hayes does not scan directories until absolutely 
necessary.

To go an open an entire Hayes list can be very expensive, especially if the 
directory is on another computer.  So, an essential tradeoff of Hayes is 
that the shuffle cannot be a perfect uniform distribution.  The best we 
can do is get progressively better estimations until the entire tree is 
opened.

> This might be a better
> sol'n *and* to construct the shuffle table you don't need to read the
> file info for each song.

Well, that would be possible, but I don't think getting a *perfect* uniform 
distribution is needed.  We just need to avoid 1) the same ordering and 2) 
high repeats, as you point out.

Hayes will solve 1) by using decent pseudo-random numbers from KApp.  Hayes 
won't have the problem of some files being played over and over a lot if 
we give a high initial weight for directories, like 1000 or something.  
That would make unopened directories will always be preferred to opened 
directories and to files.  And, as an added bonus, it would make unopened 
directories get opened frequently, bringing the distribution close to 
uniform more quickly.

Hm... Maybe the weighting for unopenend dirs should be scaled depending on 
the total known weights.

On real uniform distribution:  Once every directory is openend and the 
weights are accurate, then it *is* a uniform distribution.  Consider a 
directory that contains a file and a subdirectory, and that subdirectory 
contains two files.  The subdirectory would get a weight of 2, and the 
file gets a weight of 1:

      -- 1/3 --  file
     |                        --- 1/2 -- file
- -----                         |
     |-- 2/3 --  directory -- |-- 1/2 -- file


There are three files, and each has a 1/3 chance of being played.  This is 
basic probability here, but if you can find a counterexample to the 
weightings working, please let me know.

> You could simply assume every file is a media
> file, and then those unplayable files will be skipped anyway. This does
> away with 1. To address 2 you can give the user a choice to re-shuffle
> his index since we can handle shuffling by creating an array [1..n] and
> then running a random shuffle algorithm on it. There will be a model of
> the directory hierarchy in memory that maps index -> file.

I agree that if you know the entire tree in advance, then your method 
works.   It might even be best for Dub, for all I know. :-)

- -- 
Neil Stevens - neil at qualityassistant.com
"I always cheer up immensely if an attack is particularly wounding
because I think, well, if they attack one personally, it means they
have not a single political argument left." - Margaret Thatcher
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9KDNOf7mnligQOmERAqEAAKCTrg7xEStNmi8Uyrz+pIDkSbKDOQCeLSKj
uG763KtH6xdtCXZS7Pv+vVI=
=ZF+M
-----END PGP SIGNATURE-----


(Complete bug history is available at http://bugs.kde.org/db/44/44807.html)



More information about the kde-multimedia mailing list