Licensing for models and datasets

Volker Krause vkrause at kde.org
Tue Mar 26 16:33:56 GMT 2024


On Montag, 25. März 2024 15:17:48 CET Halla Rempt wrote:
> We're looking into adding an experimental AI-based feature to Krita:
> automated inking. That gives us three components, and we're not sure about
> the license we should use for two of them: the model and the dataset. Would
> CC be best here?

Looking at https://community.kde.org/Policies/Licensing_Policy the closest 
thing would be either "media" files (generalized to "data files") and thus 
CC-BY-SA (and presumably CC-BY/CC0), or "source code" (xGPL, BSD/MIT).

I think this is a bit more tricky though, depending on whether we assume a 
model is a derivative work of the input data, and whether the output generated 
from a model is a derivative work of the model (and thus potentially a 
derivative work of the input data). The industry assumption so far seems to be 
that at least one of those isn't a derivative work (AFAIK that has yet to be 
legally tested though), but I'm not sure that interpretation is in the best 
interest of FOSS developers or artists...

One scenario that would work regardless, I think, is using a license with 
practically no constraints (CC0, MIT, etc.), but that also offers no protection 
for the training or model data (which might or might not be what you want).

Any other scenario I can think of involving more protective licenses runs into 
interesting issues:
- if the output is a derivative work, Krita users would be bound by e.g. the 
attribution or share-alike requirements of the license (which I guess is not 
what you want).
- a Bison/Flex style "code generator exception", stating that the model output 
is free of any license requirements regardless of the model license itself, 
requires that either the model isn't a derivative work of the input or that the 
input data is licensed in a way compatible with that.
- in the latter case we are back to essentially unprotected CC0-like input, or 
a protective license with a special exception, which then gets awfully close 
to developing new licenses.

So I guess this boils down to how much protection you have in mind for the 
input and model data?

Interesting topic, sorry if my ramblings on this are of limited help :)

Regards,
Volker