<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="http://git.reviewboard.kde.org/r/109811/">http://git.reviewboard.kde.org/r/109811/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On May 5th, 2013, 11:14 a.m. UTC, <b>Vishesh Handa</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">My main problem with this patch is that it is going to be hard to keep this information up to date. Say the zip file moves: we then need to update the URL of every ArchiveItem. We already do that for files in the filewatcher, but we would have to extend it to zip and other protocols.
Also, what happens when someone tries to open the URL in a non-KDE application that doesn't support the zip/tar protocol, say VLC or LibreOffice? It would just lead to an ugly user experience.
I'm not sure if we want this indexer or not.</pre>
</blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I don't know how to solve the first problem you raise. One possible solution would be to store the URLs of the archived files relative to the archive file itself, but I don't think that is possible.
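To make the first problem concrete: when an archive moves, every stored item URL that starts with the old archive path has to be rewritten. The following is only a rough Python sketch of that bookkeeping, assuming the URLs are plain, already percent-encoded strings in the zip:/path/member form used in this review. The helper name rebase_item_urls is hypothetical and is not part of the filewatcher.

```python
def rebase_item_urls(urls, old_archive, new_archive, scheme="zip"):
    """Rewrite item URLs after the archive moved from old_archive to new_archive."""
    old_prefix = scheme + ":" + old_archive + "/"
    new_prefix = scheme + ":" + new_archive + "/"
    rebased = []
    for url in urls:
        # Only URLs that point inside the moved archive are touched.
        if url.startswith(old_prefix):
            url = new_prefix + url[len(old_prefix):]
        rebased.append(url)
    return rebased


urls = [
    "zip:/home/me/test.zip/",
    "zip:/home/me/test.zip/6%20My%20account1.png",
    "zip:/home/me/other.zip/file.txt",
]
moved = rebase_item_urls(urls, "/home/me/test.zip", "/home/me/backup/test.zip")
for url in moved:
    print(url)
```

The trailing slash in the prefix keeps an archive such as /home/me/test.zip.old from matching by accident.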
For your second point, I made a quick test. I have an archive file named "Compta.zip" that contains two Excel files. I indexed this archive and typed part of the name of one of the files in KRunner. Two results were shown: the file itself, marked as an "Archive Item", and the zip file (very handy, now I know where I archived my lost file!). Clicking on the zip file opens it in Ark. Clicking on the archive item opens it in LibreOffice without any problem. Displaying the containing archive file is possible now that the indexer correctly uses hasSubResource and hasPart.
Maybe I was lucky and the .desktop file of LibreOffice was written correctly: its command line uses a %f placeholder, which tells KIO that the application doesn't support URLs, so every URL is first downloaded to /tmp (here, by being extracted from the zip file) and then opened by LibreOffice. Or maybe KRunner always downloads the URLs of archive items.
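The %f path described above amounts to "copy the member to a local temporary file, then hand that local path to the application". Here is a rough Python sketch of that idea, for illustration only: it is not KIO code, the standard zipfile module stands in for the archive protocol, and the name extract_to_temp and the member name "Compta/budget.xls" are made up.

```python
import io
import os
import tempfile
import zipfile


def extract_to_temp(archive, member):
    """Copy one archive member to a temporary file and return its local path."""
    # Keep the extension so the application can still guess the file type.
    suffix = os.path.splitext(member)[1]
    with zipfile.ZipFile(archive) as zf:
        data = zf.read(member)
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as out:
        out.write(data)
    return path


# Build a small in-memory archive to exercise the function.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("Compta/budget.xls", b"spreadsheet bytes")

local_path = extract_to_temp(buf, "Compta/budget.xls")
print(local_path)  # a plain local path that any application can open
```

An application that only understands local files never sees the zip: URL in this scheme, only the temporary copy.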
If we want to ensure a perfect user experience, the listing of archive items in KRunner can be disabled by default. A user looking for a file will then only be presented with the archive file name, and never with archive items directly.</pre>
<br />
<p>- Denis</p>
<br />
<p>On May 11th, 2013, 8:48 a.m. UTC, Denis Steckelmacher wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: url('http://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); background-position: left top; background-repeat: repeat-x; border: 1px black solid;">
<tr>
<td>
<div>Review request for Nepomuk.</div>
<div>By Denis Steckelmacher.</div>
<p style="color: grey;"><i>Updated May 11, 2013, 8:48 a.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">This patch adds a file metadata extractor for archive files. This extractor handles any file that can be read using KArchive.
The extracted metadata are the uncompressed size of the whole archive (shown in Dolphin, but not formatted as a file size with KB or MB suffixes) and the list of files it contains. The extractor creates one Nepomuk resource per file or directory in the archive (root directory included). These resources have the type ArchiveItem, plus FileDataObject (for files) or Folder (for directories). They also have their nie:url property set to a URL that can be used with the Archive KIO (for instance, "zip:/home/me/archive.zip/one/file" or "tar:/usr/src/linux-3.7.2.tar.xz"). For files, fileSize is set to the uncompressed size of the file.
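As a rough illustration of the listing step (not the patch itself, which is C++ and goes through KArchive), here is a Python sketch using the standard zipfile module. The function name archive_entries and the sample paths are invented for the example, and unlike the extractor it does not create an entry for the root directory. It computes the total uncompressed size and one "zip:" URL per entry without decompressing any member.

```python
import io
import zipfile
from urllib.parse import quote


def archive_entries(archive, archive_path):
    """Return (total uncompressed size, list of (url, is_dir, size)).

    `archive` is a path or file object readable by zipfile; `archive_path`
    is the absolute path used to build the "zip:" URLs. Only the archive
    directory is read, member contents are never decompressed.
    """
    entries = []
    total = 0
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            # Percent-encode the member name, keeping "/" as a separator.
            url = "zip:" + quote(archive_path + "/" + info.filename)
            is_dir = info.is_dir()
            size = 0 if is_dir else info.file_size
            total += size
            entries.append((url, is_dir, size))
    return total, entries


# Build a small in-memory archive to exercise the function.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("6 My account1.png", b"x" * 10)
    zf.writestr("docs/readme.txt", b"hello")

total, entries = archive_entries(buf, "/home/me/test.zip")
for url, is_dir, size in entries:
    print(url, "dir" if is_dir else "file", size)
print("uncompressed size:", total)
```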
The files themselves are neither read nor uncompressed. I haven't found a way to recursively extract the metadata of archived files (for instance, by launching the PlainTextExtractor on every plain text file found in the archive).</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">nepomukindexer seems to work. nepomukshow displays meaningful information about the indexed files: the archive itself and the files it contains. For a test archive, nepomukshow displays the following:
$ nepomukshow test.zip
&lt;nepomuk:/res/e5eddbdb-995b-472f-9ef1-3a4ba4c9999d&gt; # Note this ID
rdf:type nfo:FileDataObject
rdf:type nfo:Archive
rdf:type nie:InformationElement
nao:created 2013-04-01T13:57:16.586Z
nao:lastModified 2013-04-01T13:57:17.414Z
nie:lastModified 2013-02-28T20:49:24Z
nie:url file:///home/steckdenis/test.zip
nie:mimeType application/zip
nie:created 2013-02-28T20:49:24Z
nfo:fileSize 3368744
nfo:uncompressedSize 4171547
nfo:fileName test.zip
kext:indexingLevel 2
The metadata of a file contained in the archive can be displayed by passing a URL to nepomukshow:
$ nepomukshow 'zip:/home/steckdenis/test.zip/'
&lt;nepomuk:/res/71458f55-898c-4374-ad00-6ac5b1d9c9e7&gt; # Note this ID: it is the ID of the root directory of the archive
rdf:type nfo:ArchiveItem
rdf:type nfo:Folder
rdf:type nfo:FileDataObject
rdf:type nfo:DataContainer
nao:created 2013-04-01T13:57:17.416Z
nao:lastModified 2013-04-01T13:57:17.416Z
nie:url &lt;zip:/home/steckdenis/test.zip/&gt;
nie:created 1970-01-01T00:00:00Z
nfo:belongsToContainer nepomuk:/res/e5eddbdb-995b-472f-9ef1-3a4ba4c9999d # ID of the archive file itself
$ nepomukshow 'zip:/home/steckdenis/test.zip/6 My account1.png'
&lt;nepomuk:/res/ed73aabc-ce18-4ac7-9db7-f301ce07ffc5&gt;
rdf:type nfo:ArchiveItem
rdf:type nfo:FileDataObject
nao:created 2013-04-01T13:57:17.417Z
nao:lastModified 2013-04-01T13:57:17.417Z
nie:url &lt;zip:/home/steckdenis/test.zip/6%20My%20account1.png&gt;
nie:created 2012-11-21T08:21:08Z
nfo:fileSize 330923 # Uncompressed size
nfo:belongsToContainer nepomuk:/res/71458f55-898c-4374-ad00-6ac5b1d9c9e7 # ID of the root directory
When entering "6 My account1.png" in KRunner, the file is shown as an "Archive entry". When clicking on it, Gwenview is launched and displays the image.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>services/fileindexer/indexer/CMakeLists.txt <span style="color: grey">(3474a43)</span></li>
<li>services/fileindexer/indexer/archiveextractor.h <span style="color: grey">(PRE-CREATION)</span></li>
<li>services/fileindexer/indexer/archiveextractor.cpp <span style="color: grey">(PRE-CREATION)</span></li>
<li>services/fileindexer/indexer/nepomukarchiveextractor.desktop <span style="color: grey">(PRE-CREATION)</span></li>
</ul>
<p><a href="http://git.reviewboard.kde.org/r/109811/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>