XMP, Krita, KFileMetaInfo and Strigi

Fri Jun 22 18:09:27 BST 2007

2007/6/22, Cyrille Berger <cberger at cberger.net>:
> > What is important is that all KDE apps show the same metadata for files.
> What do you mean by that ?
In Konqueror 3 you can edit the metadata of various files because of the
shared metadata code. This functionality is inherited by all KDE
applications. Well-thought-out shared functionality like this can
benefit many apps if it lives in a core library.
Consistency is important. A field 'title' should have the same value
in any application that shows metadata for a file or other resource.

> > If we start using different libs for this, this will not happen and
> > that is very confusing for the user.
> I don't think a user should ever see a library, or even be told about a
> library. A user just has to see a user interface, but unless I have missed
> something, that's not what we are talking about?
Yes, the confusion I am talking about is different apps showing
different metadata values. This is hard to avoid when the apps use
different code. Agreeing on a standard is of course also possible, but
it would result in code duplication and is more error-prone.

> > > But the biggest problems and limitations come from KFileMetaInfoItem
> > > - QVariant is not good enough:
> > >  - it lacks some important types like rationals
> >
> > a double is not a rational? QVariant::Double
> Yes it is, but a rational isn't a double, it's a division between two
> integers.
> For instance, there is no double representation of the rational 2 / 3
> (0.66666666666666666666666666666666 is just an approximation, even if the
> mathematical theory says that 0.666 followed only by 6 is equal to 2 / 3, but
> there is no way to represent such a thing in double, 2/3 is represented by
> 0.666..67 in a double anyway).
It is very simple and not expensive to find a nice rational from a
double. In the graphical applications you're referring to, zoom
ratios, resolutions and more are nicely expressed as ratios. The range
for the denominator is, however, very limited, which makes it even
easier to transform a double into a rational. There is no need to
store it as two integers.
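
A rough sketch of what I mean, in plain standard C++ rather than Qt so
that it stands on its own. It recovers a fraction with a bounded
denominator from a double by walking the continued-fraction
convergents. The names Rational and toRational, the default bound of
1000 and the range limits are assumptions made up for this
illustration, not existing KDE or Qt API.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper type; not part of Qt or KFileMetaInfo.
struct Rational {
    std::int64_t num;
    std::int64_t den;
};

// Approximate x by num/den with den <= maxDen, using the convergents of
// the continued-fraction expansion of |x|. The range limits are an
// assumption of this sketch; they keep all integer arithmetic far from
// overflow.
Rational toRational(double x, std::int64_t maxDen = 1000) {
    if (!std::isfinite(x) || std::fabs(x) > 1e6 ||
        maxDen < 1 || maxDen > 1000000) {
        throw std::invalid_argument("toRational: argument out of supported range");
    }
    const bool negative = x < 0.0;
    double r = std::fabs(x);
    std::int64_t hPrev = 0, h = 1;  // numerators of the last two convergents
    std::int64_t kPrev = 1, k = 0;  // denominators of the last two convergents
    for (int i = 0; i < 64; ++i) {
        const double whole = std::floor(r);
        const std::int64_t a = static_cast<std::int64_t>(whole);
        const std::int64_t kNext = a * k + kPrev;
        if (kNext > maxDen) break;  // the next convergent would be too fine
        const std::int64_t hNext = a * h + hPrev;
        hPrev = h; h = hNext;
        kPrev = k; k = kNext;
        const double frac = r - whole;
        if (frac < 1e-12) break;    // what is left is only rounding noise
        r = 1.0 / frac;
    }
    return {negative ? -h : h, k};
}
```

With the small denominators that occur in zoom ratios and resolutions,
the loop stops after a handful of iterations.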

> > >  - no distinction between ordered list / unordered list (I know it might
> > > seem unimportant, just treat an unordered list as ordered, but the fact
> > > is an application should never ever sort an ordered list)
> > I'm not sure I understand, if a list must be sorted, the analyzer
> > should sort it and pass it on sorted. Can you provide the use case?
> No, you didn't understand: there are two kinds of lists: lists for which order
> matters (a list of authors, who are usually sorted in order of the importance
> of their work) and lists for which order doesn't matter (like keywords). And
> while it's very understandable that an application wants to sort a list of
> keywords, it is unacceptable for a list of authors.
Yes, so the lists that the application might like to sort can already
be sorted by the analyzer so that the application does not ever have
to do the sorting. Then it will not do sorting when it is not
appropriate.
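
To make that concrete, here is a small sketch in plain standard C++.
It assumes the analyzer knows which kind of list it is producing. The
names ListKind, ListValue and normalise are hypothetical and exist
only for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical types for illustration; not existing Strigi or KDE API.
enum class ListKind { Ordered, Unordered };

struct ListValue {
    ListKind kind;
    std::vector<std::string> items;
};

// Analyzer-side normalisation: unordered lists (e.g. keywords) are sorted
// once here, ordered lists (e.g. authors) are passed through untouched.
ListValue normalise(ListValue value) {
    if (value.kind == ListKind::Unordered) {
        std::sort(value.items.begin(), value.items.end());
    }
    return value;
}
```

An application that receives the result displays every list in the
order it was given, so it never has a reason to sort an author list.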

> > >  - there are two associative arrays in XMP (and only QMap in QVariant),
> > > Structure and Alternative Array. There is in fact a huge difference
> > > between the two: Structure has a limited set of possible keys, while
> > > Alternative Array is not limited
> > Sounds like QValidator + QMap to me. Input validation when editing is
> > important. That is why KFileWritePlugin provides access to a
> > QValidator for the widgets.
>
> Unrelated.
> Let's take an example out of the XMP spec. It defines a Dimension structure as
> follows:
>  - one field named "w" of type real
>  - one field named "h" of type real
>  - one field named "unit" of type string
>
> How I would use that structure in the code is:
> QMap<QString, Value> dimensionValue;
> dimensionValue["w"] = 10.0;
> dimensionValue["h"] = 20.0;
> dimensionValue["unit"] = "cm";
>
> Then I create a metadata entry:
> metaDataList["nameofthetag"] = Value(dimensionValue, Value::TypeStructure );
How is this better than simply width and height with a default unit
where the GUI decides in which appropriate unit to display?

> The second example is called "Alternative Arrays". The main usecase for this
> is for translation of some metadata (but there are some other uses in the
> spec). So basically in your code you have
> QMap<QString, Value> description;
> description["fr-fr"] = "Ma description";
> description["en-us"] = "My description";
>
> then I create the metadata entry as
> metaDataList["dc:description"] = description;
This can be stored as a QMap. How to get the appropriate text out
in the GUI is indeed not trivial, but a special type with appropriate
display rules would fix it.
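
One possible set of display rules, sketched in plain standard C++
(C++17). It assumes lower-case language tags as in your example and
falls back to XMP's "x-default" entry when nothing matches the
language. The names LangAlt, languageOf and pickText are invented for
this illustration.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical alias: language tag -> translated text.
using LangAlt = std::map<std::string, std::string>;

// Language part of a tag, e.g. "fr-fr" -> "fr".
std::string languageOf(const std::string& tag) {
    return tag.substr(0, tag.find('-'));
}

// Pick the text to display for a locale:
//   1. exact tag match, 2. same language, 3. "x-default", 4. any entry.
std::optional<std::string> pickText(const LangAlt& alts,
                                    const std::string& locale) {
    if (auto it = alts.find(locale); it != alts.end()) {
        return it->second;
    }
    for (const auto& [tag, text] : alts) {
        if (languageOf(tag) == languageOf(locale)) {
            return text;
        }
    }
    if (auto it = alts.find("x-default"); it != alts.end()) {
        return it->second;
    }
    if (!alts.empty()) {
        return alts.begin()->second;  // arbitrary but deterministic choice
    }
    return std::nullopt;
}
```

The order of the fallbacks is a design choice; a real implementation
would probably take it from the user's configured language list.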

> So what is the difference? The difference is that the spec limits the number
> and the names of values in a structure (dimensionValue["x"] = 3.0; is invalid,
> while adding description["de-de"] = "meine Beschreibung" is valid). So
> there is a need to make a distinction between those two for later use
> (not counting that it helps when saving if you have an idea of the real type
> of the value).
I think you mean that description["x"] = 3.0 is not valid, right? So
you need to validate the input. Still sounds like QValidator to me
(although I agree that a QValidator that takes QVariants would be way
better).
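
Something like the following is what I have in mind for a validator
that works on values instead of strings. It is a sketch in plain
standard C++ (C++17): std::variant stands in for QVariant and std::map
for QMap. The Dimension fields come from your example; every name in
the code is hypothetical.

```cpp
#include <map>
#include <string>
#include <variant>

// Stand-ins for QVariant and QMap<QString, QVariant> in this sketch.
using Value = std::variant<double, std::string>;
using ValueMap = std::map<std::string, Value>;

enum class FieldType { Real, Text };
using StructureSchema = std::map<std::string, FieldType>;

// The Dimension structure from the example: w and h are reals, unit is text.
StructureSchema dimensionSchema() {
    return {{"w", FieldType::Real},
            {"h", FieldType::Real},
            {"unit", FieldType::Text}};
}

bool matches(const Value& value, FieldType type) {
    return type == FieldType::Real ? std::holds_alternative<double>(value)
                                   : std::holds_alternative<std::string>(value);
}

// A structure is valid when it has exactly the schema's fields, each with
// the right type: no extra field "x", no missing field "h".
bool isValidStructure(const ValueMap& value, const StructureSchema& schema) {
    if (value.size() != schema.size()) return false;
    for (const auto& [key, type] : schema) {
        const auto it = value.find(key);
        if (it == value.end() || !matches(it->second, type)) return false;
    }
    return true;
}

// An alternative array accepts any key, but in this sketch every value
// must be text.
bool isValidAlternativeArray(const ValueMap& value) {
    for (const auto& [key, entry] : value) {
        if (!std::holds_alternative<std::string>(entry)) return false;
    }
    return true;
}
```

The same map type carries both kinds of data; only the rule applied to
it differs.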

> > >  - all the above can be more or less hacked in QVariant using UserType,
> > > even if a real API to manipulate them is better. But there is another
> > > problem which needs an extended QVariant. In XMP a value can be associated
> > > with a "property qualifier" (unless doing something really horrible like
> > > storing every value in a QList<QVariant> with the value as the first item
> > > and the property qualifier as the second...). The typical use case of a
> > > "property qualifier" is a list of authors of a document. Imagine for
> > > instance a book with illustrations in it: you would have two authors,
> > > "Paul" (the text writer) and "Jane" (the drawer), and the "property
> > > qualifier" allows you to indicate that Paul was "the text writer" and
> > > Jane "the drawer".
> >
> > This type of information is indeed harder, but we should solve it at
> > the KDE level, not for one particular app. A UserType is not a hacked
> > solution in my opinion if it serializes nicely.
>
> And once again you miss the point :) Maybe I shouldn't have spoken of the
> UserType in this sentence. So forget about the first line and reread the
> rest.
I do not see how I missed the point. All data can be represented as a
QVariant. A QVariant is recursive after all. In this case you could
use a QList<QList<QString> > variant. This does require some difficult
typing and I can understand that one would want to do it more nicely.
So why not extend what we have instead of starting a different
framework? If you have the stuff to solve these problems, why not
share it with the rest of KDE by extending KFileMetaInfo, e.g. by
subclassing or wrapping it? Then, when the feature set becomes
clearer, the improvements could go into the core.
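
As an illustration of the wrapping idea, here is a sketch in plain
standard C++ where a typed wrapper sits on top of the nested-list
encoding. std::vector stands in for QList and std::string for QString.
QualifiedValue, toNested and fromNested are made-up names, and the
[value, qualifier] row layout is an assumption of this sketch.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Typed view: a value plus its property qualifier (e.g. an author's role).
struct QualifiedValue {
    std::string value;
    std::string qualifier;
};

// Generic storage form, analogous to QList<QList<QString> >.
using NestedList = std::vector<std::vector<std::string>>;

// Encode typed values as [value, qualifier] rows, preserving their order.
NestedList toNested(const std::vector<QualifiedValue>& values) {
    NestedList rows;
    rows.reserve(values.size());
    for (const auto& v : values) {
        rows.push_back({v.value, v.qualifier});
    }
    return rows;
}

// Decode the rows again; anything that is not a pair is rejected.
std::vector<QualifiedValue> fromNested(const NestedList& rows) {
    std::vector<QualifiedValue> values;
    values.reserve(rows.size());
    for (const auto& row : rows) {
        if (row.size() != 2) {
            throw std::invalid_argument("expected [value, qualifier] rows");
        }
        values.push_back({row[0], row[1]});
    }
    return values;
}
```

Application code would only ever see QualifiedValue; the nested list
stays an implementation detail of the shared storage.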

> > > The whole framework also lacks a validation Schema, but that's a slight
> > > problem: it could be added at the Krita level, even if I do think it's
> > > better to have Value (QVariant), Entry (KFileMetaInfoItem) and Schema
> > > interacting closely together, for instance to have
> > > KFileMetaInfoItem::setValue call the Schema to check if it can accept
> > > this value and then for instance try to convert it if
> > > possible.
> >
> > QValidator in KFileWritePlugin
> As you are aware, QValidator takes a QString as input and tells you whether
> the string is a valid answer.
> What is needed here is more complex, among other things:
>  - each tag is associated with a single type, so when the tag is initialized it
> needs to be initialized with the correct value
>  - structures need to be checked to see if they have the correct fields (no
> field "x" for the Dimension and no missing field "h")
Yes this sounds good. Something like this would not be easy though.

> > > > I'd love to help you with this effort, since obviously XMP is an
> > > > important file format.
> > >
> > > XMP isn't really a file format :) But yes, it is very important, and not
> > > only for image applications but for most document applications. It has
> > > applications for video and audio as well, even if in those areas the use of
> > > XMP is even less widespread than for images.
> >
> > And that is the reason it should be incorporated in the Nepomuk
> > ontology and work with Strigi and KFileMetaInfo. If this is not
> > possible, then these frameworks are inadequate and should be enhanced.
> I am a bit confused now, but I talked with Sebastien about XMP/Exif/IPTC metadata
> and he told me Nepomuk couldn't help me directly in that matter, so I must
> have missed something.
Nepomuk is setting up an ontology for describing the relations between
many data types. It can probably not solve all problems you have, but
perfection is the enemy of success. It is better to collaborate on the
simple cases and let only the complex cases be unique to some
applications.
We would love to have your input on the ontology. You can find it here:

> > > Anyway, here is how I see sharing code, in a way that doesn't bloat
> > > either framework. I much prefer to keep things simple, especially since
> > > neither KisMetaData nor the base KMetaData library (KFileWritePlugin +
> > > KFileMetaInfoItem) is that big.
> >
> > I completely agree, but I don't think KisMetaData should exist at all
> There is other missing stuff in the Strigi analyzers and KFile*, but it doesn't
> need changes, just additions, so I didn't speak about it (unless
> you want me to?).
If it's not too much work, yes please.

> > except as Strigi analyzers and KFileWritePlugins. By going your own
> > way, you completely miss the point of the semantic desktop, which is
> > to allow all applications to know more about files and other objects
> > on the desktop. If krita speaks its own language, it cannot talk to
> > the other desktop apps and you lose the advantages this brings. This
> > is more effort than writing your own stuff because you have to
> > converse with others to get it working and agree on a common approach.
> > This is not always easy, but the result is all the more valuable.
> I really, really don't see any loss. The biggest part of all this is already
> handled by other libraries that are shared.

> > > And honestly I do believe that it is better to do it this way than twist
> > > either framework to adjust to the need of the other.
> > If you like to be on an island then yes.
> ... (I really love to see that kind of comment, that's soooo constructive)
Well, I just think it's so unfair that only Krita will have this stuff.
I first learned about this on your blog and was a bit baffled
that there's a KDE project so completely separated from the existing
architectures. The power of KDE is having good frameworks and sharing
them.

I've seen too much effort lost to reinventing incompatible
technologies. Given the effort going into Nepomuk and Strigi, doing
metadata incompletely but compatibly with them is much more valuable
than doing it more completely but incompatibly.

Cheers,
Jos