<html><head></head><body><div class="ydp1abe9a47yahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:16px;"><div></div>
<div dir="ltr" data-setdir="false">Well,</div><div dir="ltr" data-setdir="false">I hear you, but I am doing this not for myself but for an average user: I ask my questions, experiment, and then write procedures showing how to import a PDF file.</div><div dir="ltr" data-setdir="false">The minute I start doing my own scripting, or expecting users to do theirs, I forget who my audience is. There is data about the education/intellectual level of the average user, and it rules out scripting.</div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false">If such a script already existed, it would be another matter.</div><div dir="ltr" data-setdir="false">I am more interested in making it as easy as possible; I realize it won't be perfect.</div><div dir="ltr" data-setdir="false">Making things clear and easy for the end user is never a losing battle.</div><div dir="ltr" data-setdir="false">Aaron </div><div><br></div>
</div><div id="yahoo_quoted_0148785097" class="yahoo_quoted">
<div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">
<div>
On Thursday, December 31, 2020, 06:08:42 PM EST, Jack &lt;ostroffjh@users.sourceforge.net&gt; wrote:
</div>
<div><br></div>
<div><br></div>
<div>I really hate to be negative, but I think you're fighting a losing <br clear="none">battle. If you can program with almost any scripting language, and are <br clear="none">willing to spend some time experimenting, you can likely pull together <br clear="none">something that works for you, depending on how long you think the <br clear="none">effort is worth.<br clear="none"><br clear="none">On the sign of transactions, how would KMM know whether it's a deposit <br clear="none">or withdrawal? The csv import gives you two ways. First, the amount <br clear="none">column needs to have minus signs on withdrawals. (There is a check box <br clear="none">to reverse sign if the deposits show up as negative.) The other way is <br clear="none">to have separate columns for credits and for debits. If the statement <br clear="none">actually uses positive numbers for both, and doesn't give you any way <br clear="none">to reverse the appropriate ones, you will probably end up with as much <br clear="none">effort in post-import editing as you would have had just typing them in <br clear="none">manually in the first place. 
Remember, you will probably also need to <br clear="none">adjust most of the categories after import.<br clear="none"><br clear="none">On 2020.12.31 17:22, Aaron Mehl wrote:<br clear="none">> Just as an experiment I manually deleted the overflow lines. But <br clear="none">> that isn't automatic. And as I read on and experiment, I think that <br clear="none">> semi-automatic might be the best option. So to rephrase my <br clear="none">> question: What is the best semi-automatic way to bring a pdf bank <br clear="none">> statement into KMyMoney?<br clear="none">> I see that, short of serious programming, the converters from text <br clear="none">> to Qif or to csv (I googled and tried a few) all require manual input. The <br clear="none">> question is where in the food chain is the best place to make these <br clear="none">> changes. I see that pdftotext doesn't like a wide column length, and I <br clear="none">> gather there is no way to change it? Qif seems to want deposits listed <br clear="none">> with a plus sign and expenses with a minus. There are probably other <br clear="none">> things that would need tweaking.<br clear="none">> So I wonder what is the best way to get bank statements into <br clear="none">> KMyMoney. 
My bank only lets me get a pdf. Aaron<br clear="none">> On Thursday, December 31, 2020, 04:41:34 PM EST, <br clear="none">> <<a shape="rect" ymailto="mailto:pjfarley3@earthlink.net" href="mailto:pjfarley3@earthlink.net">pjfarley3@earthlink.net</a>> wrote:<br clear="none">> <div class="yqt9543492191" id="yqtfd60229"><br clear="none">> Jack,<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> It is quite common in bank statement PDFs to have transactions be <br clear="none">> formatted like this (I hope the alignment works, I will format as <br clear="none">> fixed-font to try to help):<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> MM/DD/YY&nbsp;&nbsp; Payee Name&nbsp;&nbsp; Amount paid&nbsp;&nbsp; Running balance<br clear="none">> <br clear="none">> &nbsp;&nbsp;&nbsp;&nbsp;Additional info about payment<br clear="none">> <br clear="none">> &nbsp;&nbsp;&nbsp;&nbsp;Can be multiple lines<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> MM/DD/YY&nbsp;&nbsp; Next Payee Name&nbsp;&nbsp; Amount Paid&nbsp;&nbsp; Running balance<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> MM/DD/YY&nbsp;&nbsp; DEPOSIT&nbsp;&nbsp; Amount deposited&nbsp;&nbsp; Running Balance<br clear="none">> <br 
clear="none">> <br clear="none">> <br clear="none">> So when the PDF is translated to text, those “additional info” <br clear="none">> line(s) appear as separate physical lines without the MM/DD/YY header <br clear="none">> or any money amounts following.<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> Depending heavily on the PDF construction, I have also (but rarely) <br clear="none">> seen the money amounts (paid or deposited and balance) show up on the <br clear="none">> SECOND line after conversion of the PDF to text. The pdftotext <br clear="none">> “-layout” switch has improved over time to where I seldom see this <br clear="none">> any more, but it can happen.<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> Like I said, it can get complicated.<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> Peter<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> From: KMyMoney <<a shape="rect" ymailto="mailto:kmymoney-bounces@kde.org" href="mailto:kmymoney-bounces@kde.org">kmymoney-bounces@kde.org</a>> On Behalf Of Jack<br clear="none">> Sent: Thursday, December 31, 2020 3:14 PM<br clear="none">> To: <a shape="rect" ymailto="mailto:kmymoney@kde.org" href="mailto:kmymoney@kde.org">kmymoney@kde.org</a><br clear="none">> Subject: Re: More pdf2kmymoney (overflos/wrapping lines)<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> I started this yesterday, and I know there have been additional posts <br clear="none">> since, but I think this particular point hasn't been resolved.<br clear="none">> <br clear="none">> <br clear="none">> <br clear="none">> On 12/30/20 8:59 PM, <a shape="rect" ymailto="mailto:pjfarley3@earthlink.net" href="mailto:pjfarley3@earthlink.net">pjfarley3@earthlink.net</a> wrote:<br clear="none">> <br clear="none">> <br clear="none">> In my experience pdftotext does not “overflow lines”. 
That is <br clear="none">> probably “extra information” (i.e., “Memo” field data) related to the <br clear="none">> transaction on the previous line. That is quite common in bank <br clear="none">> statements. You have to expect such lines and be prepared to attach <br clear="none">> them to the prior transaction. I do it as the “Memo” field in my <br clear="none">> output.<br clear="none">> <br clear="none">> <br clear="none">> Aaron would have to confirm, but I suspect he refers to a case where <br clear="none">> a single table row as shown in the PDF has two rows of text in each <br clear="none">> cell, because there is just too much text for one line. Because PDF <br clear="none">> knows only where exactly on the page each piece of text is, not why <br clear="none">> it is there (it has no information about things like tables), the text output <br clear="none">> would have two lines. The first would have the first line of text <br clear="none">> from each cell, and the second would have the second line of text from <br clear="none">> each cell. Putting them back together is theoretically possible, but <br clear="none">> only if there is some way to know that the second line is not a new <br clear="none">> row (missing header info?), or as part of a manually controlled cleanup <br clear="none">> phase of the conversion.<br clear="none">> <br clear="none"></div></div>
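<div>The two cleanup steps discussed in this thread (attaching dateless "additional info" lines to the prior transaction as a memo, and giving withdrawals a minus sign for the CSV import) can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing KMyMoney tool: the column layout is taken from Peter's sample above, the function names <code>parse_statement</code>, <code>signed</code> and <code>to_csv</code> are made up for this example, and recognizing deposits by the payee text "DEPOSIT" is an assumption that will not hold for every bank.</div>
<pre>
```python
import csv
import io
import re
from decimal import Decimal

# Assumed layout (from the sample in this thread): a transaction line starts
# with MM/DD/YY and ends with two money amounts (amount, running balance).
TXN = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{2})\s+(.+?)\s+(-?[\d,]+\.\d{2})\s+(-?[\d,]+\.\d{2})\s*$"
)


def parse_statement(lines):
    """Group text lines into transactions, collecting extra lines as memo."""
    txns = []
    for line in lines:
        m = TXN.match(line)
        if m:
            date, payee, amount, _balance = m.groups()
            txns.append({
                "date": date,
                "payee": payee.strip(),
                "amount": Decimal(amount.replace(",", "")),
                "memo": "",
            })
        elif line.strip() and txns:
            # No date and no amounts: treat it as extra info for the
            # previous transaction (lines before the first one are skipped).
            txns[-1]["memo"] = (txns[-1]["memo"] + " " + line.strip()).strip()
    return txns


def signed(txn):
    """Return the amount with a sign: deposits positive, the rest negative."""
    # Assumption: deposits are recognizable by the payee text "DEPOSIT".
    amount = abs(txn["amount"])
    if txn["payee"].upper().startswith("DEPOSIT"):
        return amount
    return -amount


def to_csv(txns):
    """Write Date, Payee, Amount, Memo columns for a CSV importer."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Date", "Payee", "Amount", "Memo"])
    for t in txns:
        writer.writerow([t["date"], t["payee"], str(signed(t)), t["memo"]])
    return out.getvalue()
```
</pre>
<div>The input would be the lines of a text file produced with something like <code>pdftotext -layout statement.pdf statement.txt</code>. The sketch does not handle the rarer case Peter mentions, where the amounts land on the second line, and repeated page headers between transactions would end up in a memo, so a manual review pass is still needed.</div>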
</div>
</div></body></html>