Fwd: Re: Batch import of images

Mon Oct 31 22:39:44 GMT 2022

Hi,

Thanks for the detailed response...

On Monday, 31 October 2022, 15:40:39 CET, Per Funke wrote:
> Structure:
> =======
> 
> 1. There is one file for exif data, “data”.
> 2. There are three files for tags called “what”, “when” and “where”.
> 3. To connect “data” with the tag files there is an index called “datab”.
> 4. The record ids of the datab records are hashed, creating a file structure
>    called “more_txt”. An empty file is created for each record when
>    imported. Annotating will later fill these already existing records with
>    text. The hashing mechanism means there is no index to the texts.
> 5. For searching purposes there is an index connecting all existing tags
>    (tags that are used, not empty ones) to the “datab” file’s record(s)
>    where that tag is used.
> 6. All image directories are named as the current date when created.
>    This means the dir contains images from the name (date) of the dir
>    before it, up to the date that constitutes its own name.
>    When you know the name of the dir where an image resides you also know
>    the approximate date of its creation.
> [...]
> As you see, importing from this troglodyte of a database can only be done
> by an expert! Me!

Ok, that sounds like you have all the data you need in a usable format. You 
also have an expert at hand, which is always good ;-)
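
Regarding point 6 of your description: the date range of each image can be 
derived from the directory names alone. Here is a minimal Python sketch, 
assuming the directories are named in ISO format (YYYY-MM-DD). That format and 
the function name date_ranges are my assumptions, so adjust the parsing to 
your actual naming scheme:

```python
from datetime import date, datetime, time

def date_ranges(dir_names):
    """Map each directory name to a (start, end) pair of ISO timestamps.

    Assumes names like "2020-03-15" (hypothetical format). Images in a
    directory date from the previous directory's date up to its own date.
    The first directory has no predecessor, so its start is its own date.
    """
    days = sorted(date.fromisoformat(name) for name in dir_names)
    ranges = {}
    previous = None
    for day in days:
        start_day = previous if previous is not None else day
        start = datetime.combine(start_day, time(0, 0, 0))
        end = datetime.combine(day, time(23, 59, 59))
        ranges[day.isoformat()] = (start.isoformat(), end.isoformat())
        previous = day
    return ranges
```

The resulting strings have the same shape as the startDate and endDate values 
in the template further down.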

> If I were employed, this would have created my total job security. I'm like
> Sid in User Friendly, though not quite so skilled.

I understood that reference ;-)

> Also I do not use punch cards since my last university course where I
> programmed in Cobol.

... and while I never learned Cobol I do find punch cards handy for taking 
notes...

> It was kind of fun doing this description. That's something I've never
> done, ever. I had it all in my head from the beginning. Now it's in writing.
> I think I'll go and buy a fat trappist beer, I'm worth it!

Joking aside: writing down stuff is actually a great start. 

So, in summary: you already know your data, and probably know how to
1) create a list of all tags
2) create a list of all images and their associated tags
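
If it is easier to start from the second list, the first one can be derived 
from it. A small Python sketch, assuming you can load your data into a dict 
that maps each image path to its tags per category (that shape and the name 
collect_tags are hypothetical, not anything KPhotoAlbum prescribes):

```python
def collect_tags(images):
    """Return {category: sorted list of distinct tags} from per-image tags.

    images: dict mapping a relative image path to {category: [tag, ...]}.
    """
    all_tags = {}
    for tags_by_category in images.values():
        for category, tags in tags_by_category.items():
            all_tags.setdefault(category, set()).update(tags)
    return {category: sorted(tags) for category, tags in all_tags.items()}
```

Note that this only yields tags that are actually used by at least one image.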

Also, it looks like your category names are plain ASCII, so you won't have to 
escape any special characters (before you ask: having non-ASCII characters in 
your tag names is not a problem). If your data is not already encoded in 
UTF-8, I would advise you to make sure that your tag names are valid UTF-8 
before the import.
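
One way to check this is to try decoding the raw bytes of each tag name. A 
minimal Python sketch; the function names are made up, and Latin-1 as the 
source encoding is only a guess that you would need to replace with whatever 
your database actually uses:

```python
def invalid_utf8_tags(raw_tags):
    """Return the byte strings from raw_tags that are not valid UTF-8."""
    bad = []
    for raw in raw_tags:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return bad

def to_utf8(raw, source_encoding="latin-1"):
    """Re-encode a tag from source_encoding (an assumption) to UTF-8 bytes."""
    return raw.decode(source_encoding).encode("utf-8")
```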

If you need a template for creating the index.xml file for KPhotoAlbum, you can 
use this one:

<?xml version="1.0" encoding="UTF-8"?>
<KPhotoAlbum version="7" compressed="1">
    <Categories>
        <Category name="Who" icon="system-users" show="1" viewtype="0" positionable="1">
            <value value="Person A"/>
            <value value="Person B"/>
        </Category>
        <Category name="Where" icon="applications-internet" show="1" viewtype="0">
            <value value="Place A"/>
        </Category>
        <Category name="What" icon="favorites" show="1" viewtype="0">
            <value value="untagged" id="1"/>
            <value value="The What"/>
        </Category>
    </Categories>
    <images>
        <image file="relative/path/to/file.jpg" startDate="2000-01-01T00:00:00" endDate="2000-12-31T23:59:59">
            <options>
                <option name="Who">
                    <value value="Person A"/>
                    <value value="Person B"/>
                </option>
                <option name="Where">
                    <value value="Place A"/>
                </option>
                <option name="What">
                    <value value="The What"/>
                </option>
            </options>
        </image>
    </images>
</KPhotoAlbum>

That means you need to convert your list of Who/Where/What tags into value 
XML elements under the appropriate Category element.
Then you'll need to create an image element for each image, with correct 
startDate and endDate attributes. Inside the options element of each image, 
create an option element for each of Who/Where/What, holding one value element 
for each tag that applies to the image.
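
If you want to script this, here is a rough Python sketch using the standard 
xml.etree.ElementTree module. It only mirrors the structure of the template 
above; the function name build_index and the shape of the input data are my 
own assumptions. The icon and positionable attributes and the id on the 
"untagged" value are left out for brevity, so add them as needed:

```python
import xml.etree.ElementTree as ET

def build_index(categories, images):
    """Build an uncompressed index.xml tree following the template.

    categories: dict mapping category name -> list of tag names
    images: list of dicts with keys "file", "startDate", "endDate" and
            "tags", where "tags" maps category name -> list of tag names
    (Both input shapes are hypothetical.)
    """
    root = ET.Element("KPhotoAlbum", version="7", compressed="1")
    cats = ET.SubElement(root, "Categories")
    for name, tags in categories.items():
        cat = ET.SubElement(cats, "Category", name=name, show="1", viewtype="0")
        for tag in tags:
            ET.SubElement(cat, "value", value=tag)
    imgs = ET.SubElement(root, "images")
    for img in images:
        image = ET.SubElement(imgs, "image", file=img["file"],
                              startDate=img["startDate"],
                              endDate=img["endDate"])
        options = ET.SubElement(image, "options")
        for name, tags in img["tags"].items():
            if not tags:
                continue  # skip categories without tags for this image
            option = ET.SubElement(options, "option", name=name)
            for tag in tags:
                ET.SubElement(option, "value", value=tag)
    return ET.ElementTree(root)
```

The returned tree can be saved with 
tree.write("index.xml", encoding="UTF-8", xml_declaration=True). ElementTree 
escapes special characters in attribute values for you.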

This will result in a very verbose file that contains enough information for 
kphotoalbum to repair it into a valid index.xml file. Using this verbose (or 
in kphotoalbum parlance "uncompressed") file format means you don't have to 
care about tag ids. The compressed="1" attribute tells kphotoalbum to save 
the database in a less verbose and more performance-friendly format.

Once you have created an index.xml file in this manner, you can open the 
database/image directory in kphotoalbum.
KPhotoAlbum will complain about missing image size information, and will tell 
you that you should pick a tag to mark untagged images. You should start by 
selecting the "untagged" tag and saving the database.
After that, you need to run "Recreate MD5 sums..." and "Recreate EXIF 
database" from the maintenance menu. Make sure that the "Update image time" 
checkbox is not active.

When that is done, save your database again, recreate the thumbnails 
(maintenance menu) and go for another trappist beer. If all went well, you now 
have all your images in kphotoalbum.

Hope that helps,
  Johannes