[Digikam-devel] git-annex support?

Wouter Verhelst w at uter.be
Tue Mar 4 18:42:44 GMT 2014


Hi folks,

I used to be a digiKam user, back when storing files on ephemeral
devices wasn't possible yet. When this caused my disk to fill up (I was
an avid photographer), I tried moving things to another disk, which
corrupted the collection and was a sensational failure in that I also
lost a number of my tags.

I haven't invested the time to recreate those tags since then, and I
still have a load of pictures that I need to process, but it hasn't
been happening, which is a shame.

I like the fact that digiKam now supports pictures being "offline" some
of the time, but it doesn't work for the way I want to do things.

When I'm away from home, I want to load pictures into my laptop (or
whatever device), and have them "automatically" be moved somewhere else.
If I need space, I want to be able to drop some of the data from the
hard disk, in the understanding that I'll be able to get it back when I
need it. I _think_ digiKam will allow some of that, but it might be a
bit fiddly. Additionally, doing things this way would require me to use
digiKam on every device where I want to manage my photos, which is not
necessarily what I would want.

Enter git-annex, which has some interesting features:
- It stores the metadata of all your photos in a git repository in a
  format which is designed not to conflict with other repositories if
  used correctly (i.e., you should use "git annex sync" to synchronize).
  This makes it easy to copy files from one machine to another,
  bidirectionally.
- It has support for several forms of "remotes": not only git
  repositories, but also things like Amazon S3 and other cloud services.
  It's even possible to copy files from one device to another using
  XMPP. Most of the remotes also support encryption (where "most" should
  be interpreted as "I think all of them do, but I'm not sure and I'm
  too lazy to go look it up now" ;-)
- It will keep track of where files are. If I want to remove a file from
  my local repository and git-annex cannot be sure that the file is
  still in at least N other remotes (where N is configurable and
  defaults to 1), then it will refuse to remove the file, to prevent
  data loss. As such, I can safely drop files from my local repository,
  secure in the knowledge that getting them back is just a simple "git
  annex get" away. This even works when I'm offline: as git-annex was
  written by Joey Hess, who lives off-grid in a cabin and has to walk a
  long distance for decent connectivity to his server (where his files
  are backed up), it follows that he doesn't want to do that walk just
  to check the state of a single file; so this check is pretty much
  built in.
- Since fairly recently (on the order of weeks, if that), git-annex now
  also supports storing arbitrary metadata; both boolean tags (which can
  be set or not) and name-value pairs can be set on any file, in what is
  pretty much a free format. Note that a name can also have multiple
  values; in fact, while boolean tags have different command-line
  options, internally they are (currently, at least) implemented as
  name-value pairs under the "tag" name.
- There is a "git annex assistant", a daemon mode whereby git-annex will
  watch for new files to appear in the repository, and automatically add
  them and upload them to other remotes that it knows about.
- From the git-annex website, one can download precompiled binaries for
  several platforms, including Mac OS X and Android.
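
As a rough illustration of two of the points above, here is a minimal
Python sketch of the drop rule and of the metadata model as described in
this mail. The names can_drop, AnnexMetadata, set_value and to_command are
made up for the example, and the drop check is a simplification of what
git-annex really does. The only git-annex specifics assumed are the
"git annex metadata" command and its --tag and --set options; the sketch
only builds the command line and does not run it.

```python
from collections import defaultdict


def can_drop(verified_other_copies, numcopies=1):
    """Simplified drop rule: allow a drop only if enough other copies are known."""
    return verified_other_copies >= numcopies


class AnnexMetadata:
    """Toy model of per-file metadata: each field name maps to a set of values."""

    def __init__(self):
        self.fields = defaultdict(set)

    def set_value(self, name, value):
        # A name may carry several values, so adding never overwrites.
        self.fields[name].add(value)

    def tag(self, value):
        # A boolean tag is stored as one more value of the "tag" field.
        self.set_value("tag", value)

    def has_tag(self, value):
        return value in self.fields.get("tag", set())

    def to_command(self, path):
        """Build the argv for one 'git annex metadata' call (not executed here)."""
        argv = ["git", "annex", "metadata", path]
        for name in sorted(self.fields):
            for value in sorted(self.fields[name]):
                if name == "tag":
                    argv += ["--tag", value]
                else:
                    # "field+=value" adds a value without replacing existing ones.
                    argv += ["--set", f"{name}+={value}"]
        return argv


meta = AnnexMetadata()
meta.tag("holiday")
meta.tag("family")
meta.set_value("year", "2014")
print(meta.to_command("IMG_0001.jpg"))
print(can_drop(verified_other_copies=0))
```

The point of the toy class is that tags and name-value pairs share one
structure, so an application like digiKam would only need a single mapping
from its own tags to git-annex fields.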

That could, theoretically, result in a workflow like the following:

- Install git-annex on an Android phone, and configure it so that the
  directory where camera photos are stored is a git-annex repository
  (this mode of operation is the default when installing the client on
  Android). These get added to the repository and copied to remotes
  whenever I run the git-annex assistant, which can be done through an
  app icon.
- Go to computer, start digiKam. digiKam runs the assistant (if not
  already running), which syncs with remotes and creates symlinks for
  new photos just taken on the phone. Assistant may auto-download them,
  or not do so (depending on configuration), in which case digiKam could
  run "git annex get" on photos that would be shown on the screen.
- Tags are applied to photos by digiKam (from EXIF data, or from face
  recognition, or from manual input)
- User selects "sync tags with git-annex" or some similar option
- digiKam sets git-annex metadata on photos, assistant syncs with
  remotes
- assistant syncs new tags to remote running on webserver, where some
  other application (possibly ikiwiki, also written by Joey Hess)
  displays them in some pretty format. Or to a DLNA server, where the
  user runs "git annex view" to filter pictures based on the stored
  metadata and then shows them on the TV.
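
The desktop part of this workflow could be sketched roughly as follows.
This is only an illustration under stated assumptions, not a proposal for
how digiKam should implement it: needs_get and plan_commands are
hypothetical helpers, and the sketch assumes a repository where a photo
whose content is not present locally shows up as a symlink with a missing
target. It only plans the "git annex get", "git annex metadata" and
"git annex sync" calls; nothing is executed.

```python
import os
import tempfile


def needs_get(path):
    """True if path is a symlink whose target is absent (content not local)."""
    return os.path.islink(path) and not os.path.exists(path)


def plan_commands(visible_photos, tags_by_photo):
    """Return the git-annex command lines a hypothetical digiKam hook might run."""
    commands = []
    # Fetch content only for photos that are about to be shown on screen.
    missing = [p for p in visible_photos if needs_get(p)]
    if missing:
        commands.append(["git", "annex", "get"] + missing)
    # Push the tags digiKam knows about into git-annex metadata.
    for photo in sorted(tags_by_photo):
        argv = ["git", "annex", "metadata", photo]
        for tag in sorted(tags_by_photo[photo]):
            argv += ["--tag", tag]
        commands.append(argv)
    # Share the new metadata with the other repositories.
    if tags_by_photo:
        commands.append(["git", "annex", "sync"])
    return commands


# Demo in a scratch directory: one photo present, one dangling symlink.
demo_dir = tempfile.mkdtemp()
present = os.path.join(demo_dir, "present.jpg")
absent = os.path.join(demo_dir, "absent.jpg")
with open(present, "w") as handle:
    handle.write("not really a JPEG")
os.symlink(os.path.join(demo_dir, "no-such-object"), absent)

plan = plan_commands([present, absent], {absent: {"holiday"}})
for argv in plan:
    print(" ".join(argv))
```

Planning the commands separately from running them keeps the sketch
testable without git-annex installed; a real integration would also have to
handle errors and a running assistant.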

So, a simple question:

Assuming a well-written patch, is this something that would be likely to
be merged? Note that I'm willing to do the hard work here.

(If the answer to the above is "yes", expect some questions next on what
the best way to do that might be ;-)

Thanks,

-- 
This end should point toward the ground if you want to go to space.

If it starts pointing toward space you are having a bad problem and you
will not go to space today.

  -- http://xkcd.com/1133/


