Resetting metadata after new from template ?
Friedrich W. H. Kossebau
kossebau at kde.org
Thu Feb 4 12:55:19 GMT 2016
Hi Pierre,
On Tuesday, 2 February 2016, 22:21:59, Pierre wrote:
> On Tuesday, February 02, 2016 02:08:53 PM Jaroslaw Staniek wrote:
> > On 1 February 2016 at 23:20, Pierre <pinaraf at pinaraf.info> wrote:
> > > Hi
> > >
> > > Right now, when we create a new empty document (at least for words),
> > > like any other office suite, it is created from a template. But we
> > > currently lack the reset of some metadata from our templates, leading
> > > to funny situations… For instance, any document created by calligra
> > > using the A4 template is more than 7 years old :)
> > >
> > > We have several possible fix :
> > > - strip metadata from templates : we would still copy the metadata for
> > >   user templates, including creation date, bad imho
> > > - strip all metadata when creating a file from template : bad too, users
> > >   could have specific metadata they don't want to lose
> > > - override specific metadata like the creation date with sane values
> > >
> > > Each one is very simple to implement, I just don't know which one is the
> > > best. I would go for the third option, but I don't have a list of
> > > metadata to erase. (we already override the generator BTW, but elsewhere
> > > in the saving code)
> > >
> > > Any thoughts on this ?
> >
> > Very interesting finding, Pierre.
> > If you ask me, the 3rd option looks best. Documenting the new behaviour,
> > whatever it is, in the API docs, would be useful.
>
> I just went through the ODF 1.2 spec part regarding metadata, I think we
> should reset all the meta data defined in the spec except
> "meta:user-defined"… And remember to fill in the meta:template with the
> XLink to the source template.
> If nobody disagrees nor sees anything hazardous in it, I'll implement that.
Happy to see you pick this up, I have found this annoying/funny as well :)
Basically I agree with and support your implementation plan; I would just
like to propose a few modifications, read on.

IMHO the ODF spec has a flaw here WRT metadata and templates. There should be
one set of metadata for the actual template and a separate set for the
to-be-generated document. The first can be used as usual, to know more about
the template itself when managing templates, and the second can be used to
preset metadata of the actual generated document, as it fits (e.g. keywords,
language, or whatever user-defined keys are standard with the organisation
using the documents). (Someone should bring this up in the OASIS TC. What,
me?)

But for now we have to deal with what is in the current spec. So I would
agree that resetting/dumping most metadata on document creation makes sense.
Of the pre-defined metadata (as in ODF 1.2, §4.3.2), I think the following
could be kept though, as they are about the document/content type and less
about the template, or only really make sense with the actual document.
So if they are present and set, they could be considered to target the created
document, right?
* <meta:auto-reload>
* <meta:hyperlink-behaviour>
* <dc:language>

For <meta:keyword> and <meta:user-defined> it is hard to tell, given their
less specific semantics. They could contain data only useful for template
management, or they could be preset metadata for the generated document.
Having to choose between the chance of leaking template-handling data into
generated documents and the inability to preset metadata for documents, the
second seems the greater issue to me (given I have no template tags like "form
letter to shut down stupid customers" ;) ).
So I agree, let's keep <meta:user-defined>, but then also <meta:keyword>.

And then there is RDF metadata (§4.2.2), which for the non-content-specific
statements has the same problem as <meta:keyword> and <meta:user-defined>. So
it is better kept as is.
Custom metadata (§4.3.1) would also be treated like <meta:keyword> and
<meta:user-defined> for the same reasons, i.e. kept as is.

So in summary, IMHO we should reset/drop these predefined metadata types:
<dc:title> - reset to empty
<dc:description> - reset to empty
<dc:subject> - reset to empty
<meta:initial-creator> - reset to current author profile
<dc:creator> - reset to current author profile
<meta:creation-date> - reset to "now"
<dc:date> - reset to "now"
<meta:editing-cycles> - reset to 1
<meta:editing-duration> - reset to 0
<meta:template> - reset to template IRI
<meta:printed-by> - dump
<meta:print-date> - dump
<meta:generator> - generated on the fly when saving only
<meta:document-statistic> - generated on the fly when saving only
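
To make the list above concrete, here is a minimal sketch in Python, using only the standard library's ElementTree. It is for illustration only: Calligra's own code is C++ and none of this is its API. The function name `reset_template_metadata` and its parameters are made up, "PT0S" as the zero editing duration is my reading of the duration format, and simply removing the generator and statistics elements (assuming the save code writes them again) is an assumption as well. Everything not named in the list is left untouched, so <meta:keyword>, <meta:user-defined>, <dc:language> and friends survive.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Namespace URIs used by ODF meta.xml.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xlink": "http://www.w3.org/1999/xlink",
}
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)  # keep readable prefixes on output


def q(name):
    """Turn 'prefix:local' into ElementTree's '{uri}local' form."""
    prefix, local = name.split(":")
    return "{" + NS[prefix] + "}" + local


def reset_template_metadata(meta_xml, author, template_iri, now=None):
    """Hypothetical helper: apply the reset/dump list to a meta.xml string."""
    if now is None:
        now = datetime.now().replace(microsecond=0).isoformat()
    root = ET.fromstring(meta_xml)
    meta = root.find("office:meta", NS)
    if meta is None:
        meta = ET.SubElement(root, q("office:meta"))

    # Elements that get a fresh value in the new document.
    reset = {
        "dc:title": "",
        "dc:description": "",
        "dc:subject": "",
        "meta:initial-creator": author,
        "dc:creator": author,
        "meta:creation-date": now,
        "dc:date": now,
        "meta:editing-cycles": "1",
        "meta:editing-duration": "PT0S",  # assumption: zero duration
    }
    # Elements that are dropped; the last two are assumed to be
    # written again at save time (ODF 1.2 names the last one in the singular).
    dumped = [
        "meta:printed-by",
        "meta:print-date",
        "meta:generator",
        "meta:document-statistic",
    ]

    # Remove the template's own values first, then write the new ones.
    for name in list(reset) + dumped + ["meta:template"]:
        for element in meta.findall(q(name)):
            meta.remove(element)
    for name, value in reset.items():
        ET.SubElement(meta, q(name)).text = value

    # Point back at the source template.
    template = ET.SubElement(meta, q("meta:template"))
    template.set(q("xlink:type"), "simple")
    template.set(q("xlink:href"), template_iri)

    return ET.tostring(root, encoding="unicode")
```

Called as `reset_template_metadata(xml_text, "Pierre", "file:///path/to/A4.odt")`, it returns the rewritten meta.xml as a string; the IRI here is only a placeholder.
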
Does this small adaptation of your plan make sense to you as well? :)
Cheers
Friedrich