Resetting metadata after new from template ?

Friedrich W. H. Kossebau kossebau at kde.org
Thu Feb 4 12:55:19 GMT 2016


Hi Pierre,

On Tuesday, 2 February 2016, 22:21:59, Pierre wrote:
> On Tuesday, February 02, 2016 02:08:53 PM Jaroslaw Staniek wrote:
> > On 1 February 2016 at 23:20, Pierre <pinaraf at pinaraf.info> wrote:
> > > Hi
> > > 
> > > Right now, when we create a new empty document (at least for words), like
> > > any other office suite, it is created from a template. But we currently
> > > lack the reset of some metadata from our templates, leading to funny
> > > situations… For instance, any document created by calligra using the A4
> > > template is more than 7 years old :)
> > > 
> > > We have several possible fixes:
> > > - strip metadata from templates: we would still copy the metadata for
> > >   user templates, including creation date, bad imho
> > > - strip all metadata when creating a file from template: bad too, users
> > >   could have specific metadata they don't want to lose
> > > - override specific metadata like the creation date with sane values
> > > - override specific metadata like the creation date with sane values
> > > 
> > > Each one is very simple to implement, I just don't know which one is the
> > > best. I would go for the third option, but I don't have a list of
> > > metadata to erase. (We already override the generator BTW, but elsewhere
> > > in the saving code.)
> > > 
> > > Any thoughts on this ?
> > 
> > Very interesting finding, Pierre.
> > If you ask me, the 3rd option looks best. Documenting the new behaviour,
> > whatever it is, in the API docs, would be useful.
> 
> I just went through the ODF 1.2 spec part regarding metadata, I think we
> should reset all the meta data defined in the spec except
> "meta:user-defined"… And remember to fill in the meta:template with the
> XLink to the source template.
> If nobody disagrees nor sees anything hazardous in it, I'll implement that.

Happy to see you pick this up, I have found this annoying/funny as well :)

I basically agree with and support your implementation plan; I would just
like to propose a few modifications, read on.

IMHO the ODF spec has a flaw here WRT metadata and templates. There should be
one set of metadata for the template itself and a separate set for the
to-be-generated document. The first can be used as usual, to know more about
the template itself when managing templates, and the second can be used to
preset metadata of the generated document, as it fits (e.g. keywords,
language, or whatever user-defined keys are standard with the organisation
using the documents). (Someone should bring this up in the OASIS TC. What,
me?)

But we have to deal now with what there is in the current spec. So I would 
agree that resetting/dumping most metadata on document creation makes sense.

For the pre-defined metadata (as in ODF 1.2, §4.3.2) I think the following 
metadata could be kept though, as they are about the document/content type and 
less about the template, or only really make sense with the actual document.
So if they are present and set, they could be considered to target the created 
document, right?
* <meta:auto-reload>
* <meta:hyperlink-behaviour>
* <dc:language>

For <meta:keyword> and <meta:user-defined> it is hard to tell, given their 
less specific semantics. They could contain data only useful for template 
management or could be preset metadata for the generated document.
Having to choose between the chance to leak template handling data into
generated documents and the inability to preset metadata for documents, the
second seems the greater issue to me (given I have no template tags like "form
letter to shut down stupid customers" ;) ).

So I agree, let's keep <meta:user-defined>, but then also <meta:keyword>.

And then there is RDF metadata (§4.2.2), which for the non-content-specific 
statements has the same problem as <meta:keyword> and <meta:user-defined>. So 
better kept as is.

Custom metadata (§4.3.1) would also be treated like <meta:keyword> and 
<meta:user-defined> for the same reasons, keeping as is.

So in summary, IMHO we should reset/drop these predefined metadata types:
<dc:title>              - reset to empty
<dc:description>        - reset to empty
<dc:subject>            - reset to empty
<meta:initial-creator>  - reset to current author profile
<dc:creator>            - reset to current author profile
<meta:creation-date>    - reset to "now"
<dc:date>               - reset to "now"
<meta:editing-cycles>   - reset to 1
<meta:editing-duration> - reset to 0
<meta:template>         - reset to template IRI

<meta:printed-by> - dump
<meta:print-date> - dump

<meta:generator>           - generated on the fly when saving only
<meta:document-statistic>  - generated on the fly when saving only
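
The summary above can be sketched roughly as follows. This is a minimal
illustration in Python against a bare meta.xml string, not Calligra code: the
function name reset_template_metadata, its parameters, the choice to leave
absent title/description/subject elements absent, and writing the zero
duration as "PT0S" are all assumptions made for the example. Elements not
listed (keywords, user-defined, language, generator, statistics, ...) are
deliberately left untouched.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace URIs as used in ODF 1.2 meta.xml files.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xlink": "http://www.w3.org/1999/xlink",
}
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)  # keep readable prefixes on output


def q(name):
    """Turn 'prefix:local' into ElementTree's '{uri}local' form."""
    prefix, local = name.split(":")
    return "{%s}%s" % (NS[prefix], local)


def reset_template_metadata(meta_xml, author, template_iri, now=None):
    """Hypothetical helper: apply the reset/drop rules to a meta.xml string."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%dT%H:%M:%S")
    root = ET.fromstring(meta_xml)
    meta = root.find("office:meta", NS)
    if meta is None:
        meta = ET.SubElement(root, q("office:meta"))

    reset = {
        "dc:title": "", "dc:description": "", "dc:subject": "",
        "meta:initial-creator": author, "dc:creator": author,
        "meta:creation-date": stamp, "dc:date": stamp,
        "meta:editing-cycles": "1",
        "meta:editing-duration": "PT0S",  # assumption: zero as xsd duration
    }
    for name, value in reset.items():
        el = meta.find(name, NS)
        if el is None:
            if value == "":
                continue  # nothing to clear
            el = ET.SubElement(meta, q(name))
        el.text = value

    # Dump print info; replace any old template link with the new one.
    for name in ("meta:printed-by", "meta:print-date", "meta:template"):
        for el in meta.findall(name, NS):
            meta.remove(el)
    tpl = ET.SubElement(meta, q("meta:template"))
    tpl.set(q("xlink:type"), "simple")
    tpl.set(q("xlink:href"), template_iri)

    return ET.tostring(root, encoding="unicode")
```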

Does this small adaptation to your plan make sense to you as well? :)

Cheers
Friedrich


