KDevPlatform Itemrepository bucket freeItems

floris flo.ruijt at gmail.com
Tue Mar 1 18:04:42 UTC 2011


On Tue, 2011-03-01 at 10:15 +0100, Milian Wolff wrote:
> you have quite some in-depth knowledge on these 
> tasks, many of us (esp. me) don't have that. I for one am just a
> self-taught 
> programmer and hence would greatly appreciate it if you could shortly
> outline 
> what you want to achieve an 

well a short introduction then :)

the current structure uses a sorted linked list to manage all the free
space in a bucket, except the space that is right at the end.

I could have decided to use AVL trees too, but they require a bit more
information(pun intended): a red black tree needs RED or !RED an AVL
tree needs -1,0,1. and as i was tightly space bound i decided upon a
red-black tree.

a red-black tree is a binary-search-tree that guarantees that the height
of the longer subtree will not be more than twice the height of the
shorter subtree. see the end of this mail for a more information on
binary search trees and red-black trees.

this means searching will always stay O(log N), whereas a sorted
linked-list has O(N/2)==O(N) on average

to decide what to do with new free space, we need to know whether there
is free space right before or right after this item, that requires a
search over the data ON THE INDEX.

to decide where to allocate a new item we need to find an item with
sufficient or if possible exactly fitting size, this requires searching
over the free space ON THE SIZE.

therefore to profit we would need two red-black trees.

which is exactly where this header comes in. by making the algorithms
data-structure neutral we can uses indices instead of pointers and save
*a lot* of bytes.

As several of you noted, an existing implementation is to be preferred.
the problem is, no implementation i found was capable of handling
structs with offsets as left and rights, instead of the usual pointers,
and no parent pointers, the closest probably is boost::intrusive, but it
requires parent-pointers which would mean four bytes extra storage per
node, now David set my storage limit to 10 bytes per node, which is what
i have right now:
ushort: size + in the 2 most significant bits the two red/black flags

short: left_by_size
short: right_by_size
short: left_by_index
short: right_by_index

where the shorts are offsets from the current node.
as long as parent pointers are not permissible, it's impossible to use
boost::intrusive in a sane matter. i looked into the stl_tree header,
which implements a red-black tree also, but it 1) uses parent pointers
and 2) has (without modification) no support for different node types.

For me the only question remaining is whether it's faster in practice.
which i can only measure when it all works.

Floris


BINARY SEARCH TREES:
a binary search tree has a structure like this:
			5
                   /          \
            /                       \
	3				9
    /       \                       /        \
2		4		7		10


that is: every node has TWO children. on the left are all nodes that are
smaller than this node, on the right all nodes that are bigger. this
allows for fast searching:
say we need to find 4.
4<5 ?  yes look left.
4<3 ?  no check equal
4==3?  no, look right
4<4 ?  no check equal
4==4?  yes FOUND IT!

this might not look that impressive but as the number of items gets
bigger this will get more and more impressive imagine a perfectly
balanced tree with the numbers 1..1023  this tree has 2log(1023+1)==10
levels which means that in the worst case scenario we need 10 < and 10
== comparisons, if, on the other hand we have a sorted linked list with
1..1023  we'd have to do on average 512 ==comparisons. but this speedup
only holds if the tree is (reasonably) balanced if you inserted the
items in a sorted order(doing no rebalancing whatsoever) then the tree
would look like this:
				...
			4
		3
	2
1

ie. no right nodes just left nodes. this is basically a linked list with
extra 0 bytes! so this must be avoided. (there are two other insertion
orders in which the tree degenerates into linked list, those are left as
an excercise to the reader)

to solve this some kind of rebalancing has to be done after the
insertion. Two noteworthy solutions are the red-black tree and the
avl-tree.



RED-BLACK TREE
a red black tree is a binary search tree which has the following
additional properties:
1. every node has a color (red or black).
2. the root is black
3. a red node has no red children
4. every path from the root to leave traverses the same amount of
   black nodes

these properties mean that the maximal difference between the two
subtrees of the root can be the height of the shorter subtree:

				B
			R		B
		B				B
	R						...
...

as the amount of blacks is equal the left node can have at most
hight(right)-1 nodes more.

insertion and deletion is tricky but possible I will not cover it here
but the basic algorithm is: find the correct position, insert red node,
fix red violations(property 3 violated) untill the tree is consistent.

AVL TREE
an avl tree is a binary search tree that uses the 'balance factor'. the
balance factor is height(left_child)-height(right_child)
an avl tree guarantees that the balance factor of everey node is 
-1,0 or 1. this guarantees a more balanced tree then the red-black tree
at the cost of somewhat more rebalancing(gee!), to be done at insertions
and deltions.





More information about the KDevelop-devel mailing list