Making SqlQueryMaker case-insensitive

Maximilian Kossick maximilian.kossick at googlemail.com
Sun Oct 25 19:20:25 CET 2009


On Sat, Oct 24, 2009 at 4:06 PM, Jeff Mitchell <mitchell at kde.org> wrote:
> Maximilian Kossick wrote:
>> On Fri, Oct 23, 2009 at 11:05 PM, Jeff Mitchell <mitchell at kde.org> wrote:
>>> Jeff Mitchell wrote:
>>>> Whoever may know this (probably Max),
>>>>
>>>> In a branch I have code that updates the database to use utf8_bin
>>>> collation everywhere. This is a much-requested feature since it allows
>>>> for artists and albums and such with difference cases, but also became
>>>> necessary due to some other work I've been doing with scanning.
>>>>
>>>> A side effect of this, however, is that normal queries now need to
>>>> specify if they want to be case-insensitive. It's as simple as adding
>>>> "COLLATE utf8_unicode_ci" on the end of the query, but I'm not sure
>>>> where to do it or the best way -- whether all SqlQueryMaker queries
>>>> should be run this way, or whether it should be an option/flag, or...?
>>>>
>>>> I'd appreciate input (and hopefully code help) from someone that
>>>> actually knows what they're doing in SqlQueryMaker.
>>
>> Well, SqlQueryMaker is one place. ServiceSqlQueryMaker is another,
>> although I'm not 100% sure that it needs to be updated.
>>
>> The only issue I see is combining case sensitive and case insensitive
>> peredicates in one query, e.g. matching on the album name (which
>> should match exactly) and then filtering on the title (which should
>> match case insensitive).
>>
>> As far as I remember SqlQueryMaker checks if the Meta object passed to
>> one of its addMatch function whether it's part of the same collection,
>> and then uses the internal ID instead of the string. But for other
>> cases (e.g. when actually passing a ProxyCollectionMeta object to
>> SqlQueryMaker) it uses the name, and that has to match exactly.
>>
>> Can you link to some MySQL documentation?
>
> What documentation do you want/need?

Documentation on MySql collation, but I think I found it myself. If
the database is utf8_bin by default, you'll have to make each column
referenced by addFilter/excludeFilter use utf8_uicode_ci, as these
filters have to be case insensitive, e.g albums.name COLLATE
utf8_unicode_ci like '%foo%' (I'm not sure if this example works, but
all the Sql query makers have to be adapted like this)

See http://dev.mysql.com/doc/refman/5.0/en/charset-collate.html and
http://dev.mysql.com/doc/refman/5.1/en/charset-collate-tricky.html



>>> BTW, if you want to actually see that branch in order to play around
>>> with what might be necessary, it's at
>>> git://gitorious.org/~jefferai/amarok/jefferai-work.git, branch name
>>> "uidhash". It does require a schema update, so fair warning -- make a
>>> database backup if you care strongly about your stats and such, just in
>>> case.
>>
>> There are Amarok devs that have statistics? really? My statistics are
>> ful of tracks that I've played  (approximately) -123458384348 times :)
>
> Heh.
>
> --Jeff
>
>


More information about the Amarok-devel mailing list