[kde-doc-english] [skrooge] /: Import of PDF invoice

Stephane Mankowski stephane at mankowski.fr
Thu Jun 16 20:30:14 UTC 2016


Git commit 3d40b969b2b5de92e7f8ccec07e26f40c47b327c by Stephane Mankowski.
Committed on 16/06/2016 at 20:29.
Pushed by smankowski into branch 'master'.

Import of PDF invoice

M  +1    -0    CHANGELOG
M  +2    -0    CMakeLists.txt
M  +42   -5    doc/index.docbook
M  +1    -1    doc/kde_docbook
M  +1    -0    plugins/import/CMakeLists.txt
C  +20   -21   plugins/import/skrooge_import_pdf/CMakeLists.txt [from: plugins/import/CMakeLists.txt - 059% similarity]
A  +7    -0    plugins/import/skrooge_import_pdf/allopneus.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/biofan.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/easycartouche.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/engie.extractor
A  +8    -0    plugins/import/skrooge_import_pdf/free.extractor
A  +24   -0    plugins/import/skrooge_import_pdf/freemobile.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/ldlc.extractor
A  +18   -0    plugins/import/skrooge_import_pdf/org.kde.skrooge-import-pdf.desktop
A  +7    -0    plugins/import/skrooge_import_pdf/oscaro.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/oxybul.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/pixmania.extractor
A  +253  -0    plugins/import/skrooge_import_pdf/skgimportpluginpdf.cpp     [License: GPL (v2+)]
A  +71   -0    plugins/import/skrooge_import_pdf/skgimportpluginpdf.h     [License: GPL (v2+)]
A  +7    -0    plugins/import/skrooge_import_pdf/spartoo.extractor
A  +7    -0    plugins/import/skrooge_import_pdf/topachat.extractor

http://commits.kde.org/skrooge/3d40b969b2b5de92e7f8ccec07e26f40c47b327c

diff --git a/CHANGELOG b/CHANGELOG
index a56810a..8b020d9 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,6 +4,7 @@ skrooge (2.5.0)
   *Correction: The 31 of the month, in budget page, "Previous month" does not work  
   *New feature: Capability to set/change order of budget rules
   *New feature: Tooltip on modified amount of budget to explain the reasons of modifications
+  *New feature: Import of PDF invoice  
   
  -- Stephane MANKOWSKI <stephane at mankowski.fr>  xxx
  
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 5e4202c..d382923 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -191,6 +191,8 @@ IF(SKG_BUILD_TEST AND NOT WIN32)
     ADD_SUBDIRECTORY(tests)
 ENDIF(SKG_BUILD_TEST AND NOT WIN32)
 
+ADD_SUBDIRECTORY(doc)
+
 #Main application
 ADD_SUBDIRECTORY(skrooge)
 ADD_SUBDIRECTORY(skroogeconvert)
diff --git a/doc/index.docbook b/doc/index.docbook
index cb07435..e603f0b 100644
--- a/doc/index.docbook
+++ b/doc/index.docbook
@@ -44,15 +44,15 @@
       <year>2013</year>
       <year>2014</year>
       <year>2015</year>
+      <year>2016</year>      
       <holder>Stéphane MANKOWSKI</holder>
       <holder>Guillaume DE BURE</holder>
     </copyright>
     
     <legalnotice>&FDLNotice;</legalnotice>
     
-    <date>15/06/2015</date>
-    <releaseinfo>2.0.0</releaseinfo>
-    
+    <date>15/06/2016</date>
+    <releaseinfo>2.5.0</releaseinfo>
     
     <abstract>
       <para>
@@ -693,8 +693,10 @@
 	    <listitem><para>QIF: <trademark>Quicken</trademark> Import File. Maybe the most common financial file format. However, it has some rather annoying limitations, like not giving the unit for operation, or no strict date formatting.</para></listitem>
 	    <listitem><para>IIF: <trademark>Intuit</trademark> Interchange Format is used by <trademark>QuickBooks</trademark>.</para></listitem>
 	    <listitem><para>SKG: This is useful to merge 2 &appname; documents</para></listitem>
+            <listitem><para>PDF: This creates an operation from a PDF invoice. The invoice is also attached to the operation as a property. Read the <link linkend="howto_extractor">How to</link> if you want to know how to extract information from an invoice that is not supported yet.</para></listitem>
 	    <listitem><para>Backend: &appname; can also import operations by using a backend. The only one supported is <ulink url="http://weboob.org/">WEBOOB</ulink>. By using this backend you can import all operations from all your banks in only one click. For that, you just have to install <ulink url="http://weboob.org/">WEBOOB</ulink> and activate the corresponding backend from settings.</para>
-	    <tip><para>If you do not want to store your bank passwords in the configuration file of WEBOOB, you can do that:
+	    
+            <tip><para>If you do not want to store your bank passwords in the configuration file of WEBOOB, you can do that:
 	    
 	    <itemizedlist>
 	       <listitem><para>Add passwords for each bank by doing <quote>kwallet-query -f Weboob kdewallet -w m_bank_name</quote></para></listitem>
@@ -2499,7 +2501,42 @@ file is opened. It is also recommended to create a different account (&eg; "ETF"
 	<para>The size of your document can be very important. If you delete some old transactions, the size will increase. 
 	  This is normal because &appname; keeps the history of all modifications for the undo/redo mechanism. 
 	  So if you want to reduce the size of your document, you just have to clear the history.</para>
-      </sect1>	      
+      </sect1>	
+      
+      <sect1 id="howto_extractor">
+	<title>How to define a new invoice extractor?</title>
+	<para>&appname; uses pdftotext to extract all strings from a PDF. It then uses a text file describing how to find the key values. To define a new invoice extractor, follow these steps:</para>
+	    <itemizedlist>
+                <listitem><para>Launch <command>pdftotext</command> on your PDF file</para></listitem>
+	      <listitem><para>Open the text file generated and the corresponding PDF file</para></listitem>
+              <listitem><para>Create a new text file with the extension ".extractor". Example: google.extractor</para></listitem>
+              <listitem><para>Your file must be like this:</para>
+<programlisting>              
+payee=REGEXPCAP:^(Biofan) SPRL$
+date=REGEXPCAP:^Order Date: (.*)$
+dateformat=dd MMM yyyy
+number=REGEXPCAP:^N° de facture (.*)$
+mode=SET:Carte
+comment=REGEXPCAP:^N° de commande (.*)$|SET:Commande %1
+amount=REGEXP:^Montant global:$|LINEOFFSET:2
+</programlisting>  
+<para>Each attribute (payee, date, number, mode, comment and amount) uses the same syntax: COMMAND:value|COMMAND:value|...</para>
+<para>The command can be:</para>
+    <itemizedlist>
+        <listitem><para><command>REGEXPCAP</command>: A regular expression that captures a value</para></listitem>
+        <listitem><para><command>REGEXP</command>: To find the line in the file matching a regular expression</para></listitem>
+        <listitem><para><command>LINEOFFSET</command>: To change the line index.</para></listitem>
+        <listitem><para><command>SET</command>: To force the value. Can be used as first command or after the REGEXPCAP (see example).</para></listitem>
+    </itemizedlist>         
+<para>dateformat is the format of the date extracted.</para>    
+</listitem>                      
+
+                <listitem><para>Put this file in the same directory as all other .extractor files</para></listitem>
+	    </itemizedlist>        
+      <para>
+      </para>
+      </sect1>	     
+      
     </chapter>    
     
     <chapter id="credits">
diff --git a/doc/kde_docbook b/doc/kde_docbook
index 893e05b..59cd3c7 100755
--- a/doc/kde_docbook
+++ b/doc/kde_docbook
@@ -14,7 +14,7 @@ echo " -> OK !";
 
 # Convert to html
 echo "Running xsltproc... ";
-xsltproc -o $2 /usr/share/kde4/apps/ksgmltools2/customization/kde-nochunk.xsl $1;
+xsltproc -o $2 /usr/share/kf5/kdoctools/customization/kde-nochunk.xsl $1;
 if [ $? -gt 0 ]; then
   echo " -> xsltproc failed !";
   exit 1;
diff --git a/plugins/import/CMakeLists.txt b/plugins/import/CMakeLists.txt
index 0ff1777..85fa0ed 100644
--- a/plugins/import/CMakeLists.txt
+++ b/plugins/import/CMakeLists.txt
@@ -33,6 +33,7 @@ ADD_SUBDIRECTORY(skrooge_import_mt940)
 IF(LIBOFX_FOUND)
   ADD_SUBDIRECTORY(skrooge_import_ofx)
 ENDIF(LIBOFX_FOUND)
+ADD_SUBDIRECTORY(skrooge_import_pdf)
 ADD_SUBDIRECTORY(skrooge_import_skg)
 
 ADD_SUBDIRECTORY(skrooge_import_xhb)
diff --git a/plugins/import/CMakeLists.txt b/plugins/import/skrooge_import_pdf/CMakeLists.txt
similarity index 59%
copy from plugins/import/CMakeLists.txt
copy to plugins/import/skrooge_import_pdf/CMakeLists.txt
index 0ff1777..27086ad 100644
--- a/plugins/import/CMakeLists.txt
+++ b/plugins/import/skrooge_import_pdf/CMakeLists.txt
@@ -14,26 +14,25 @@
 #*   You should have received a copy of the GNU General Public License     *
 #*   along with this program.  If not, see <http://www.gnu.org/licenses/>  *
 #***************************************************************************
-#Correction bug 223848 vvvv
-#FIND_PACKAGE( LibOfx REQUIRED )
-FIND_PACKAGE( LibOfx )
+MESSAGE( STATUS "..:: CMAKE PLUGIN_IMPORT_PDF ::..")
 
-ADD_SUBDIRECTORY(skrooge_import_afb120)
-ADD_SUBDIRECTORY(skrooge_import_backend)
-ADD_SUBDIRECTORY(skrooge_import_iif)
-ADD_SUBDIRECTORY(skrooge_import_qif)
-ADD_SUBDIRECTORY(skrooge_import_csv)
-ADD_SUBDIRECTORY(skrooge_import_gnc)
-ADD_SUBDIRECTORY(skrooge_import_gsb)
-ADD_SUBDIRECTORY(skrooge_import_json)
-ADD_SUBDIRECTORY(skrooge_import_kmy)
-ADD_SUBDIRECTORY(skrooge_import_mmb)
-ADD_SUBDIRECTORY(skrooge_import_mny)
-ADD_SUBDIRECTORY(skrooge_import_mt940)
-IF(LIBOFX_FOUND)
-  ADD_SUBDIRECTORY(skrooge_import_ofx)
-ENDIF(LIBOFX_FOUND)
-ADD_SUBDIRECTORY(skrooge_import_skg)
+PROJECT(plugin_import_PDF)
 
-ADD_SUBDIRECTORY(skrooge_import_xhb)
-ADD_SUBDIRECTORY(skrooge_import_xml)
+LINK_DIRECTORIES (${LIBRARY_OUTPUT_PATH})
+
+SET(skrooge_import_pdf_SRCS
+	skgimportpluginpdf.cpp
+)
+
+ADD_LIBRARY(skrooge_import_pdf MODULE ${skrooge_import_pdf_SRCS})
+TARGET_LINK_LIBRARIES(skrooge_import_pdf KF5::Parts skgbasemodeler skgbasegui skgbankmodeler skgbankgui)
+
+########### install files ###############
+INSTALL(TARGETS skrooge_import_pdf DESTINATION ${KDE_INSTALL_QTPLUGINDIR})
+INSTALL(FILES ${PROJECT_SOURCE_DIR}/org.kde.skrooge-import-pdf.desktop DESTINATION ${KDE_INSTALL_KSERVICES5DIR})
+INSTALL(DIRECTORY . DESTINATION ${KDE_INSTALL_DATADIR}/skrooge/extractors FILES_MATCHING PATTERN "*.extractor" 
+PATTERN ".svn" EXCLUDE
+PATTERN "CMakeLists.txt" EXCLUDE
+PATTERN "CMakeFiles" EXCLUDE
+PATTERN "grantlee_filters" EXCLUDE
+PATTERN "Testing" EXCLUDE)
\ No newline at end of file
diff --git a/plugins/import/skrooge_import_pdf/allopneus.extractor b/plugins/import/skrooge_import_pdf/allopneus.extractor
new file mode 100644
index 0000000..11257d1
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/allopneus.extractor
@@ -0,0 +1,7 @@
+payee=REGEXP:^Réglé par Carte bancaire date de facture|SET:Allo Pneus
+date=REGEXP:^Date$|LINEOFFSET:1
+dateformat=dd/MM/yyyy
+number=REGEXP:^Numéro$|LINEOFFSET:1
+mode=SET:Carte
+comment=REGEXP:^N°Commande$|LINEOFFSET:1|SET:Commande %1
+amount=REGEXP:^Montant TTC :$|LINEOFFSET:1|REGEXPCAP:^([^ ]*) .*$
diff --git a/plugins/import/skrooge_import_pdf/biofan.extractor b/plugins/import/skrooge_import_pdf/biofan.extractor
new file mode 100644
index 0000000..85346c7
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/biofan.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(Biofan) SPRL$
+date=REGEXPCAP:^Order Date: (.*)$
+dateformat=dd MMM yyyy
+number=REGEXPCAP:^N° de facture (.*)$
+mode=SET:Carte
+comment=REGEXPCAP:^N° de commande (.*)$|SET:Commande %1
+amount=REGEXP:^Montant global:$|LINEOFFSET:2
diff --git a/plugins/import/skrooge_import_pdf/easycartouche.extractor b/plugins/import/skrooge_import_pdf/easycartouche.extractor
new file mode 100644
index 0000000..ef5a95f
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/easycartouche.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(EasyCartouche) est une marque et un service
+date=REGEXPCAP:^Passée le ([^ ]*) et traitée le
+dateformat=dd/MM/yyyy
+number=REGEXPCAP:^FACTURE n° (.*)$
+mode=SET:Carte
+comment=REGEXPCAP:^Commande n° (.*)$|SET:Commande %1
+amount=REGEXP:^Total TTC$|LINEOFFSET:4
diff --git a/plugins/import/skrooge_import_pdf/engie.extractor b/plugins/import/skrooge_import_pdf/engie.extractor
new file mode 100644
index 0000000..2f6d99a
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/engie.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(ENGIE) - SA au capital de
+date=REGEXPCAP:^VOTRE FACTURE DU (.*)
+dateformat=dd/MM/yy
+number=REGEXPCAP:^N°(.*)$
+mode=SET:Carte
+comment=LINEOFFSET:1
+amount=REGEXP:^MONTANT TTC$|LINEOFFSET:2|REGEXPCAP:^([^ ]*) .*$
diff --git a/plugins/import/skrooge_import_pdf/free.extractor b/plugins/import/skrooge_import_pdf/free.extractor
new file mode 100644
index 0000000..e68fb2c
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/free.extractor
@@ -0,0 +1,8 @@
+payee=REGEXPCAP:^(Free) Service Abonné$
+date=REGEXPCAP:Facture no .* du (.*)$
+dateformat=dd MMM yyyy
+number=REGEXPCAP:Facture no (.*) du
+comment=REGEXPCAP:(^Communications de la ligne .*$)
+mode=SET:Prélèvement
+amount=REGEXP:^Total facture$|LINEOFFSET:13
+
diff --git a/plugins/import/skrooge_import_pdf/freemobile.extractor b/plugins/import/skrooge_import_pdf/freemobile.extractor
new file mode 100644
index 0000000..99a5704
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/freemobile.extractor
@@ -0,0 +1,24 @@
+#The payee extraction.
+#If the payee is not found, then the process is stopped with this extractor
+payee=REGEXPCAP:^(Free Mobile) – SAS
+
+#The date extraction.
+date=REGEXPCAP:Facture no .* du (.*)$
+
+#The date format
+dateformat=dd MMM yyyy
+
+#The number extraction
+number=REGEXPCAP:Facture no (.*) du
+
+#The comment extraction
+comment=REGEXPCAP:(^Consommations du .*$)
+
+#The mode extraction
+mode=SET:Prélèvement
+
+#The amount extraction
+amount=REGEXP:^Somme à payer TTC$|LINEOFFSET:-4|REGEXPCAP:^([^ ]*) .*$
+
+
+#For the regexp syntax, you can consult this page: http://qt-project.org/doc/qt-4.8/qregexp.html
diff --git a/plugins/import/skrooge_import_pdf/ldlc.extractor b/plugins/import/skrooge_import_pdf/ldlc.extractor
new file mode 100644
index 0000000..aebb7cf
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/ldlc.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(LDLC) AURORE EP5-4-S1 SANS SYSTEME$
+date=REGEXPCAP:^Date de la facture : (.*)$
+dateformat=dd/MM/yyyy
+number=REGEXP:^N° de facture :$|LINEOFFSET:1
+mode=SET:Carte
+comment=REGEXP:^Désignation$|LINEOFFSET:7
+amount=REGEXP:^Les conditions générales de vente|LINEOFFSET:-5
diff --git a/plugins/import/skrooge_import_pdf/org.kde.skrooge-import-pdf.desktop b/plugins/import/skrooge_import_pdf/org.kde.skrooge-import-pdf.desktop
new file mode 100644
index 0000000..4a83972
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/org.kde.skrooge-import-pdf.desktop
@@ -0,0 +1,18 @@
+[Desktop Entry]
+Name=Skrooge import PDF plugin
+Comment=A Skrooge plugin to import PDF files
+Encoding=UTF-8
+Icon=skrooge
+Type=Service
+X-KDE-ServiceTypes=SKG IMPORT/Plugin
+X-KDE-Library=skrooge_import_pdf
+X-Krunner-ID=Skrooge import PDF plugin
+X-KDE-PluginInfo-Author=Stephane MANKOWSKI
+X-KDE-PluginInfo-Email=stephane at mankowski.fr
+X-KDE-PluginInfo-Name=skrooge_import_pdf
+X-KDE-PluginInfo-Version=1.0
+X-KDE-PluginInfo-Website=http://skrooge.org/
+X-KDE-PluginInfo-Category=Plugins
+X-KDE-PluginInfo-Depends=
+X-KDE-PluginInfo-License=GPL
+X-KDE-PluginInfo-EnabledByDefault=true
diff --git a/plugins/import/skrooge_import_pdf/oscaro.extractor b/plugins/import/skrooge_import_pdf/oscaro.extractor
new file mode 100644
index 0000000..ff2154b
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/oscaro.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(OSCARO).COM$
+date=REGEXPCAP:^Date de facturation : (.*)$
+dateformat=dd/MM/yyyy
+number=REGEXPCAP:^Facture : (.*)$
+mode=SET:Carte
+comment=REGEXPCAP:^Numéro de commande : (.*)$|SET:Commande %1
+amount=REGEXP:^Montant TTC :$|LINEOFFSET:4
diff --git a/plugins/import/skrooge_import_pdf/oxybul.extractor b/plugins/import/skrooge_import_pdf/oxybul.extractor
new file mode 100644
index 0000000..db583e8
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/oxybul.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(Oxybul) Eveil et Jeux vous remercie
+date=REGEXP:^Date de commande :$|LINEOFFSET:4
+dateformat=dd/MM/yyyy
+number=REGEXPCAP:^FACTURE N° : (.*)$
+mode=SET:Carte
+comment=REGEXP:^N° commande :$|LINEOFFSET:4|SET:Commande %1
+amount=REGEXP:^TOTAL TTC de votre commande$|LINEOFFSET:-3
diff --git a/plugins/import/skrooge_import_pdf/pixmania.extractor b/plugins/import/skrooge_import_pdf/pixmania.extractor
new file mode 100644
index 0000000..49a6ff7
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/pixmania.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(PIXMANIA).COM :
+date=REGEXP:^Date de facture :$|LINEOFFSET:2
+dateformat=yyyy-MM-dd
+number=REGEXPCAP:^BON DE LIVRAISON ET FACTURE (.*)$
+mode=SET:Carte
+comment=REGEXP:^N° de commande :$|LINEOFFSET:2|SET:Commande %1
+amount=REGEXP:^Total TTC$|LINEOFFSET:2|REGEXPCAP:^(.*) .*$
diff --git a/plugins/import/skrooge_import_pdf/skgimportpluginpdf.cpp b/plugins/import/skrooge_import_pdf/skgimportpluginpdf.cpp
new file mode 100644
index 0000000..5208b3a
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/skgimportpluginpdf.cpp
@@ -0,0 +1,253 @@
+/***************************************************************************
+ *   Copyright (C) 2008 by S. MANKOWSKI / G. DE BURE support at mankowski.fr  *
+ *                                                                         *
+ *   This program is free software; you can redistribute it and/or modify  *
+ *   it under the terms of the GNU General Public License as published by  *
+ *   the Free Software Foundation; either version 2 of the License, or     *
+ *   (at your option) any later version.                                   *
+ *                                                                         *
+ *   This program is distributed in the hope that it will be useful,       *
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
+ *   GNU General Public License for more details.                          *
+ *                                                                         *
+ *   You should have received a copy of the GNU General Public License     *
+ *   along with this program.  If not, see <http://www.gnu.org/licenses/>  *
+ ***************************************************************************/
+/** @file
+ * This file is Skrooge plugin for PDF import / export.
+ * http://jerome.girod.perso.sfr.fr/finance/pdfx.php
+ *
+ * @author Stephane MANKOWSKI / Guillaume DE BURE
+ */
+#include "skgimportpluginpdf.h"
+
+#include <kpluginfactory.h>
+
+#include <qfileinfo.h>
+#include <qstandardpaths.h>
+#include <qdiriterator.h>
+#include <qtemporaryfile.h>
+#include <qprocess.h>
+
+#include "skgtraces.h"
+#include "skgservices.h"
+#include "skgbankincludes.h"
+#include "skgimportexportmanager.h"
+
+/**
+ * This plugin factory.
+ */
+K_PLUGIN_FACTORY(SKGImportPluginPDFFactory, registerPlugin<SKGImportPluginPDF>();)
+
+SKGImportPluginPDF::SKGImportPluginPDF(QObject* iImporter, const QVariantList& iArg)
+    : SKGImportPlugin(iImporter)
+{
+    SKGTRACEINFUNC(10);
+    Q_UNUSED(iArg);
+}
+
+SKGImportPluginPDF::~SKGImportPluginPDF()
+{
+}
+
+bool SKGImportPluginPDF::isImportPossible()
+{
+    SKGTRACEINFUNC(10);
+    return (!m_importer ? true : m_importer->getFileNameExtension() == QStringLiteral("PDF"));
+}
+
+QString SKGImportPluginPDF::extract(const QStringList& iLine, const QString& iSyntax)
+{
+    QString output;
+    int currentIndex = -1;
+    bool setpossible = true;
+
+    // Interpret the syntax
+    auto items = SKGServices::splitCSVLine(iSyntax, '|', false);
+    for (const auto& item : items) {
+        if (item.startsWith(QLatin1String("REGEXPCAP:"))) {
+            QRegExp regexp(item.right(item.length() - 10));
+            if (output.isEmpty()) {
+                for (const auto& line : iLine) {
+                    if (regexp.indexIn(line) > -1) {
+                        output = regexp.cap(1);
+                        break;
+                    }
+                }
+            } else {
+                if (regexp.indexIn(output) > -1) {
+                    output = regexp.cap(1);
+                }
+            }
+        } else if (item.startsWith(QLatin1String("REGEXP:"))) {
+            setpossible = false;
+            QRegExp regexp(item.right(item.length() - 7));
+            int nb = iLine.count();
+            for (int i = 0; i < nb; ++i) {
+                if (regexp.indexIn(iLine.at(i)) > -1) {
+                    currentIndex = i;
+                    setpossible = true;
+                    break;
+                }
+            }
+        } else if (item.startsWith(QLatin1String("LINEOFFSET:"))) {
+            currentIndex += SKGServices::stringToInt(item.right(item.length() - 11));
+            if (currentIndex >= 0 && currentIndex < iLine.count()) {
+                output = iLine.at(currentIndex);
+            }
+        } else if (item.startsWith(QLatin1String("SET:")) && setpossible) {
+            QString s = item.right(item.length() - 4);
+            if (s.contains(QLatin1String("%1"))) {
+                output = s.arg(output);
+            } else {
+                output = s;
+            }
+        }
+    }
+
+
+    return output;
+}
+
+
+SKGError SKGImportPluginPDF::importFile()
+{
+    if (!m_importer) {
+        return SKGError(ERR_ABORT, i18nc("Error message", "Invalid parameters"));
+    }
+    SKGError err;
+    SKGTRACEINFUNCRC(2, err);
+
+    // Begin transaction
+    err = m_importer->getDocument()->beginTransaction("#INTERNAL#" % i18nc("Import step", "Import %1 file", "PDF"), 2);
+    IFOK(err) {
+        // Open file
+        IFOK(err) {
+            // Extract text from PDF
+            QString file = m_importer->getLocalFileName();
+            QTemporaryFile txtFile;
+            txtFile.open();
+            QString cmd = "pdftotext \"" % file % "\" \"" % txtFile.fileName() % "\"";
+
+            QProcess p;
+            p.start(cmd);
+            if (!p.waitForFinished(1000 * 60 * 2) || p.exitCode() != 0) {
+                err.setReturnCode(ERR_FAIL).setMessage(i18nc("Error message",  "The following command line failed with code %2:\n'%1'", cmd, p.exitCode()));
+
+            } else {
+                // Step 1 done
+                IFOKDO(err, m_importer->getDocument()->stepForward(1))
+
+                // Read the text file
+                QStringList lines;
+                QTextStream stream(&txtFile);
+                while (!stream.atEnd()) {
+                    // Read line
+                    lines.push_back(stream.readLine());
+                }
+
+                // Search extractors
+                bool found = false;
+                QString a = QStringLiteral("skrooge/extractors");
+                const auto dirs = QStandardPaths::locateAll(QStandardPaths::GenericDataLocation, a, QStandardPaths::LocateDirectory);
+                for (const auto& dir : dirs) {
+                    QDirIterator it(dir, QStringList() << QStringLiteral("*.extractor"));
+                    while (it.hasNext()) {
+                        // Read extractor
+                        QString fileName = it.next();
+                        QString extractor = QFileInfo(fileName).baseName().toUpper();
+                        QHash< QString, QString > properties;
+                        err = SKGServices::readPropertyFile(fileName, properties);
+                        IFOK(err) {
+                            // Check if this extractor is done for this file
+                            QString payee = extract(lines, properties[QStringLiteral("payee")]);
+                            if (!payee.isEmpty()) {
+                                // Search the date
+                                QString date = extract(lines, properties[QStringLiteral("date")]);
+                                QString dateFormat = properties[QStringLiteral("dateformat")];
+                                auto d = QDate::fromString(date, dateFormat);
+                                if (!dateFormat.contains(QStringLiteral("yyyy")) && d.year() < 2000) {
+                                    d = d.addYears(100);
+                                }
+
+                                // Search the amount
+                                double amount = SKGServices::stringToDouble(extract(lines, properties[QStringLiteral("amount")]));
+
+                                // Search the comment
+                                QString comment = extract(lines, properties[QStringLiteral("comment")]);
+
+                                // Search the number
+                                QString number = extract(lines, properties[QStringLiteral("number")]);
+
+                                // Search the mode
+                                QString mode = extract(lines, properties[QStringLiteral("mode")]);
+
+                                // Get account
+                                SKGAccountObject account;
+                                SKGOperationObject act;
+                                m_importer->getDocument()->getObject("v_account_display", "t_close='N' AND t_type='C' ORDER BY i_NBOPERATIONS DESC LIMIT 1", act);
+                                if (act.exist()) {
+                                    account = act;
+                                    IFOKDO(err, m_importer->getDocument()->sendMessage(i18nc("An information message",  "Using account '%1' for import", account.getName())))
+                                } else {
+                                    IFOKDO(err, m_importer->getDefaultAccount(account))
+                                }
+
+                                // Get unit
+                                SKGUnitObject unit;
+                                IFOKDO(err, m_importer->getDefaultUnit(unit))
+
+                                // Create operation
+                                SKGOperationObject operation;
+                                IFOKDO(err, account.addOperation(operation, true));
+                                IFOKDO(err, operation.setDate(d))
+
+
+                                IFOKDO(err, operation.setUnit(unit))
+                                SKGPayeeObject payeeObj;
+                                IFOKDO(err, SKGPayeeObject::createPayee(m_importer->getDocument(), payee, payeeObj));
+                                IFOKDO(err, operation.setPayee(payeeObj))
+                                IFOKDO(err, operation.setComment(comment))
+                                IFOKDO(err, operation.setMode(mode))
+                                IFOKDO(err, operation.setImportID(QStringLiteral("PDF-") % extractor % QStringLiteral("-") % number))
+                                // This is normal. PDF import is for only one operation, so there is no check whether it was already imported
+                                IFOKDO(err, operation.setAttribute(QStringLiteral("t_imported"), QStringLiteral("Y")))
+                                IFOKDO(err, operation.save(false))
+
+                                SKGSubOperationObject subop;
+                                IFOKDO(err, operation.addSubOperation(subop))
+                                IFOKDO(err, subop.setComment(comment))
+                                IFOKDO(err, subop.setQuantity(-amount));
+                                IFOKDO(err, subop.save(false, false))
+
+                                // Add file
+                                IFOKDO(err, err = operation.setProperty(i18n("Invoice"), file, file))
+
+                                found = true;
+                                break;
+                            }
+                        }
+                    }
+                }
+
+                if (!found) {
+                    IFOKDO(err, m_importer->getDocument()->sendMessage(i18nc("An information message",  "Invoice %1 has not been imported because it was not recognized.", file), SKGDocument::Error))
+                }
+
+                // Step 2 done
+                IFOKDO(err, m_importer->getDocument()->stepForward(2))
+            }
+        }
+    }
+    SKGENDTRANSACTION(m_importer->getDocument(),  err);
+
+    return err;
+}
+
+QString SKGImportPluginPDF::getMimeTypeFilter() const
+{
+    return "*.pdf|" % i18nc("A file format", "PDF file (invoice)");
+}
+
+#include <skgimportpluginpdf.moc>
diff --git a/plugins/import/skrooge_import_pdf/skgimportpluginpdf.h b/plugins/import/skrooge_import_pdf/skgimportpluginpdf.h
new file mode 100644
index 0000000..b75e583
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/skgimportpluginpdf.h
@@ -0,0 +1,71 @@
+/***************************************************************************
+ *   Copyright (C) 2008 by S. MANKOWSKI / G. DE BURE support at mankowski.fr  *
+ *                                                                         *
+ *   This program is free software; you can redistribute it and/or modify  *
+ *   it under the terms of the GNU General Public License as published by  *
+ *   the Free Software Foundation; either version 2 of the License, or     *
+ *   (at your option) any later version.                                   *
+ *                                                                         *
+ *   This program is distributed in the hope that it will be useful,       *
+ *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         *
+ *   GNU General Public License for more details.                          *
+ *                                                                         *
+ *   You should have received a copy of the GNU General Public License     *
+ *   along with this program.  If not, see <http://www.gnu.org/licenses/>  *
+ ***************************************************************************/
+#ifndef SKGIMPORTPLUGINPDF_H
+#define SKGIMPORTPLUGINPDF_H
+/** @file
+* This file is Skrooge plugin for PDF import / export.
+*
+* @author Stephane MANKOWSKI / Guillaume DE BURE
+*/
+#include "skgimportplugin.h"
+
+/**
+ * This file is Skrooge plugin for PDF import / export.
+ */
+class SKGImportPluginPDF : public SKGImportPlugin
+{
+    Q_OBJECT
+    Q_INTERFACES(SKGImportPlugin)
+
+public:
+    /**
+     * Default constructor
+     * @param iImporter the parent importer
+     * @param iArg the arguments
+     */
+    explicit SKGImportPluginPDF(QObject* iImporter, const QVariantList& iArg);
+
+    /**
+     * Default Destructor
+     */
+    virtual ~SKGImportPluginPDF();
+
+    /**
+     * To know if import is possible with this plugin
+     */
+    virtual bool isImportPossible() override;
+
+    /**
+     * Import a file
+     * @return an object managing the error.
+     *   @see SKGError
+     */
+    virtual SKGError importFile() override;
+
+    /**
+     * Return the mime type filter
+     * @return the mime type filter. Example: "*.csv|CSV file"
+     */
+    virtual QString getMimeTypeFilter() const override;
+
+
+private:
+    Q_DISABLE_COPY(SKGImportPluginPDF)
+    QString extract(const QStringList& iLine, const QString& iSyntax);
+};
+
+#endif  // SKGIMPORTPLUGINPDF_H
diff --git a/plugins/import/skrooge_import_pdf/spartoo.extractor b/plugins/import/skrooge_import_pdf/spartoo.extractor
new file mode 100644
index 0000000..443174a
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/spartoo.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(Spartoo)$
+date=REGEXPCAP:^Date : (.*)$
+dateformat=dd/MM/yyyy
+number=REGEXPCAP:^Facture : (.*)$
+mode=SET:Carte
+comment=REGEXP:^Désignation$|LINEOFFSET:4
+amount=REGEXP:^Total TTC$|LINEOFFSET:2|REGEXPCAP:^(.*) .*$
diff --git a/plugins/import/skrooge_import_pdf/topachat.extractor b/plugins/import/skrooge_import_pdf/topachat.extractor
new file mode 100644
index 0000000..81bbada
--- /dev/null
+++ b/plugins/import/skrooge_import_pdf/topachat.extractor
@@ -0,0 +1,7 @@
+payee=REGEXPCAP:^(TOPACHAT)-CLUST
+date=REGEXP:^Date facture$|LINEOFFSET:7
+dateformat=dd/MM/yyyy
+number=REGEXP:^N° Commande$|LINEOFFSET:7
+mode=SET:Carte
+comment=REGEXP:^Facture$|LINEOFFSET:1|SET:Commande %1
+amount=REGEXP:^TTC à payer en euros$|LINEOFFSET:3
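
Note on the rule syntax used by the .extractor files: the following is a minimal standalone sketch of how a rule chain such as "REGEXP:...|LINEOFFSET:2" is evaluated, modelled on SKGImportPluginPDF::extract() in this commit. It is not the plugin code. It uses std::regex instead of QRegExp so that it compiles without Qt, and the two regular expression dialects are not identical. The names extractValue, splitRule and startsWith are hypothetical. Splitting on a bare '|' is an assumption standing in for SKGServices::splitCSVLine, and std::stoi stands in for SKGServices::stringToInt (it throws on a non-numeric offset, which the plugin may not do).

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Split a rule such as "REGEXP:^Total$|LINEOFFSET:2" on '|'.
// Assumption: no quoting or escaping of '|' is handled here.
static std::vector<std::string> splitRule(const std::string& rule)
{
    std::vector<std::string> items;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = rule.find('|', start);
        if (pos == std::string::npos) {
            items.push_back(rule.substr(start));
            return items;
        }
        items.push_back(rule.substr(start, pos - start));
        start = pos + 1;
    }
}

static bool startsWith(const std::string& s, const std::string& prefix)
{
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Apply the commands of one rule, left to right, to the lines of the text file.
std::string extractValue(const std::vector<std::string>& lines, const std::string& rule)
{
    std::string output;
    int currentIndex = -1;    // line selected by the last REGEXP / LINEOFFSET
    bool setPossible = true;  // SET is skipped after a REGEXP that found nothing

    for (const std::string& item : splitRule(rule)) {
        if (startsWith(item, "REGEXPCAP:")) {
            const std::regex re(item.substr(10));
            std::smatch m;
            if (output.empty()) {
                // No value yet: capture from the first matching line.
                for (const std::string& line : lines) {
                    if (std::regex_search(line, m, re)) {
                        output = m[1].str();
                        break;
                    }
                }
            } else if (std::regex_search(output, m, re)) {
                // A value exists already: refine it.
                const std::string captured = m[1].str();
                output = captured;
            }
        } else if (startsWith(item, "REGEXP:")) {
            setPossible = false;
            const std::regex re(item.substr(7));
            for (std::size_t i = 0; i < lines.size(); ++i) {
                if (std::regex_search(lines[i], re)) {
                    currentIndex = static_cast<int>(i);
                    setPossible = true;
                    break;
                }
            }
        } else if (startsWith(item, "LINEOFFSET:")) {
            currentIndex += std::stoi(item.substr(11));
            if (currentIndex >= 0 && currentIndex < static_cast<int>(lines.size())) {
                output = lines[static_cast<std::size_t>(currentIndex)];
            }
        } else if (startsWith(item, "SET:") && setPossible) {
            std::string s = item.substr(4);
            const std::string previous = output;
            // Replace every "%1" by the value found so far; otherwise force the value.
            for (std::string::size_type pos = s.find("%1"); pos != std::string::npos;
                 pos = s.find("%1", pos + previous.size())) {
                s.replace(pos, 2, previous);
            }
            output = s;
        }
    }
    return output;
}
```

With a text file whose fourth line is "Montant global:" and whose sixth line is "12.50 EUR", the rule "REGEXP:^Montant global:$|LINEOFFSET:2" selects the sixth line, and appending "|REGEXPCAP:^([^ ]*) .*$" keeps only the part before the first space. Because the rule is split on '|' first, a pattern in this sketch cannot itself use '|' for alternation.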

