OFX Import Matching Problem

Sat Jun 22 21:06:58 BST 2019

On 2019.06.22 11:51, Thomas Baumgart wrote:
> On Samstag, 22. Juni 2019 17:16:45 CEST Brendan Coupe wrote:
> 
> > I see why my 30 day limit did not help. It does when I manually copy
> > the most recent paycheck and then import the OFX data.
> >
> > I have an idea how to deal with this. In the Default Account tab for
> > the payee there is a checkbox "Use the default category...". If
> > checked, you can select a single default category.
> >
> > How about making 4 radio buttons:
> >
> > - None
> > - Most recent transaction
> > - Closest amount
> > - Use the default category... (enable the dropdown list when selected)
> 
> How about a system wide setting with the above option set (maybe  
> without the last one) and a per payee override option? Introduction  
> of this feature would be done as follows:
> 
> a) the system wide default setting is "closest amount" (which  
> reflects today's default)
> b) payees that don't have the category set will use the system wide  
> setting
> c) payees that have a default category set will override the system  
> wide setting with the default category
I THINK that sounds right, but I'm wondering what should be per account
vs. per payee vs. per category.

I may be overthinking it, but are we still missing clear terminology to
distinguish two different things? One is finding an existing transaction
to use as a "model" [again, not a formal term] for an imported
transaction. The other is what I think of as "true" matching: finding
out whether the imported transaction is a duplicate of one already
present. I hate to admit it, but I'm still not completely clear on the
steps followed: first (assuming the imported transaction is not a
duplicate) finding the best transaction to model (based on what?), and
then deciding whether to use the payee and/or category of that
transaction, or the default category of the assumed payee.

Just to add to the mix: for a payee whose transactions usually have a
single category (marked as default for that payee), I sometimes create
split transactions, and it is almost always wrong to use one of these
split transactions as the model for a newly imported transaction. How
might that fit into this process?
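
To make the question concrete, here is a rough Python sketch of how I read the proposal (a, b, c) above, with my split concern bolted on. Everything in it is hypothetical: the names `ModelPolicy`, `Payee`, `Txn`, `pick_category` and the `skip_splits` flag are mine, not anything in the KMyMoney code, and the per-payee `policy_override` is only my guess at what "a per payee override option" would mean.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import List, Optional


class ModelPolicy(Enum):
    NONE = "none"
    MOST_RECENT = "most recent transaction"
    CLOSEST_AMOUNT = "closest amount"


@dataclass
class Txn:
    post_date: date
    amount: Decimal
    category: str
    split_count: int = 2  # 2 = simple transaction, >2 = split transaction


@dataclass
class Payee:
    name: str
    default_category: Optional[str] = None
    policy_override: Optional[ModelPolicy] = None  # hypothetical per-payee setting


def pick_category(payee: Payee, history: List[Txn], amount: Decimal,
                  system_policy: ModelPolicy = ModelPolicy.CLOSEST_AMOUNT,
                  skip_splits: bool = True) -> Optional[str]:
    # (c) a default category on the payee wins over the system-wide setting
    if payee.default_category is not None:
        return payee.default_category
    # (b) otherwise use the per-payee override, or the system-wide setting
    policy = payee.policy_override or system_policy
    if policy is ModelPolicy.NONE:
        return None
    # Assumption: split transactions are optionally excluded as models.
    candidates = [t for t in history if not (skip_splits and t.split_count > 2)]
    if not candidates:
        return None
    if policy is ModelPolicy.MOST_RECENT:
        model = max(candidates, key=lambda t: t.post_date)
    else:
        # Closest amount; ties go to the more recent transaction.
        model = min(candidates,
                    key=lambda t: (abs(t.amount - amount), -t.post_date.toordinal()))
    return model.category
```

With Brendan's numbers ($8.29 vs. $8.30 away from the new paycheck), "closest amount" would still pick the 18-month-old transaction, while "most recent transaction" would pick the latest non-split one. Whether split transactions should be skipped by default is exactly what I'm unsure about.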

> 
> Does that make sense? Any objections anyone?
> 
> Thomas
> 
> 
> 
> > On Sat, Jun 22, 2019 at 4:25 AM Thomas Baumgart <thb at net-bembel.de> wrote:
> > >
> > > On Freitag, 21. Juni 2019 22:55:29 CEST Brendan Coupe wrote:
> > >
> > > > I'm running a week old build from the 5.0 branch on Fedora 29.
> > > >
> > > > When I download my savings account transaction using online banking
> > > > the paycheck frequently matches with a very old paycheck. This results
> > > > in the splits being way off.
> > > >
> > > > This happens when the amount of the new paycheck is not very close to
> > > > the most recent paycheck, which has been happening a lot lately due to
> > > > reimbursed business expenses.
> > > >
> > > > On the import tab of the ledger settings I have tried setting "Match
> > > > transaction within days" from 7 days (paycheck is weekly) to 30 days
> > > > and the same thing happens. KMM is definitely matching transactions
> > > > that are much more than 30 days old. In fact the transaction that it
> > > > matched was only $0.01 closer to the new transaction than the previous
> > > > paycheck (difference was $8.29 versus $8.30). The transaction it
> > > > matched is over 18 months old. It appears to be ignoring the "Match
> > > > transaction within days" setting. It's simply matching the transaction
> > > > from the same payee that is closest in value.
> > > >
> > > > I'm pretty sure this is fairly new behavior but I'm not sure if it
> > > > started with the initial version of KMM5 that I used or more recently.
> > >
> > > This probably goes back to a change I made in January this year:
> > >
> > >     https://cgit.kde.org/kmymoney.git/commit/?id=447213e04d6e7ab9022caeb5c258800625036967
> > >
> > > which added the part of choosing an ancient transaction based on the
> > > smallest difference in amount, whereas before it only used old
> > > transactions if the amount was identical.
> > >
> > > Here's what I found in the code (which perfectly explains what you
> > > encounter):
> > >
> > > In case the payee name has been found, the following will take place:
> > >
> > >       // Fill in other side of the transaction (category/etc) based on payee
> > >       //
> > >                 // [...]
> > >                 //
> > >       // We'll search for the most recent transaction in this account with
> > >       // this payee.  If this reference transaction is a simple 2-split
> > >       // transaction, it's simple.  If it's a complex split, and the amounts
> > >       // are different, we have a problem.  Somehow we have to balance the
> > >       // transaction.  For now, we'll leave it unbalanced, and let the user
> > >       // handle it.
> > >
> > > For the category to be found, the first thing is to check if the payee
> > > has a default category assigned. If yes, it is taken and we're done. If
> > > not, all transactions for that payee in the account will be searched
> > > backwards. Note: no date filtering here, which certainly is the cause of
> > > the behavior you encounter. The algorithm then works as follows:
> > >
> > >           // if there is more than one matching transaction, try to be a little
> > >           // smart about which one we use.  we scan them all and check if
> > >           // we find an exact match or use the one with the closest value
> > >
> > > The scan works backwards with the last one being the default. So we have
> > > at least one transaction for that payee, and in case of multiple the one
> > > with the least difference in amount will be selected. Then we continue
> > > with:
> > >
> > >                 // in case the old transaction has two splits
> > >                 // we simply inverse the amount of the current
> > >                 // transaction found in s1. In other cases (more
> > >                 // than two splits we copy all splits and don't
> > >                 // modify the splits. This may lead to unbalanced
> > >                 // transactions which the user has to fix manually
> > >
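
To check that I follow the description so far, here is how I would sketch it in Python. This is only my reading of the quoted comments, not the actual KMyMoney code: `choose_reference` and `build_splits` are hypothetical names, a transaction is modelled as a plain list of (name, amount) splits with the account side first, and I am assuming "the last one being the default" means the most recent transaction.

```python
from decimal import Decimal
from typing import List, Tuple

Split = Tuple[str, Decimal]  # (account or category name, amount)


def choose_reference(history: List[List[Split]], amount: Decimal) -> List[Split]:
    """Pick the reference transaction for an imported amount.

    history holds earlier transactions for the payee, oldest first and
    non-empty. Note that there is no date filtering at all.
    """
    best = history[-1]  # the most recent transaction is the default
    best_diff = abs(best[0][1] - amount)
    for txn in reversed(history):  # scan backwards in time
        diff = abs(txn[0][1] - amount)
        if diff == 0:
            return txn  # exact match on the amount
        if diff < best_diff:
            best, best_diff = txn, diff
    return best


def build_splits(reference: List[Split], account: str,
                 amount: Decimal) -> List[Split]:
    """Fill in the other side of the imported transaction."""
    if len(reference) == 2:
        # Simple case: reuse the category with the inverted amount.
        return [(account, amount), (reference[1][0], -amount)]
    # More than two splits: copy the old splits unmodified, which may
    # leave the transaction unbalanced for the user to fix.
    return [(account, amount)] + list(reference[1:])
```

If that reading is right, a reference that is $8.29 off beats one that is $8.30 off regardless of how old it is, and a multi-split reference is copied with its old amounts, which is what produces splits that are "way off".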
> > > The point is that we are not talking about 'matching' at this point but
> > > about automatic categorization of the imported transaction. Matching
> > > happens in the next step, when KMyMoney tries to figure out if you
> > > already have the said transaction on file (entered manually, for
> > > example). It is for that matching that the interval is used, not for
> > > the automatic categorization happening in the step before. Matching
> > > actually means merging two transactions (the one on file and the
> > > imported one) into a single one. This is not what is happening for you,
> > > and it is certainly not what you want with older transactions.
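
For contrast with the categorization step, this is roughly what I picture for the matching step, again as a hedged Python sketch. The thread only says that the "Match transaction within days" interval applies here; the criteria I use (same payee and identical amount) are purely my assumption for illustration, and `LedgerTxn` and `find_duplicate` are made-up names.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal
from typing import List, Optional


@dataclass
class LedgerTxn:
    payee: str
    post_date: date
    amount: Decimal


def find_duplicate(on_file: List[LedgerTxn], imported: LedgerTxn,
                   match_days: int) -> Optional[LedgerTxn]:
    """Return a transaction on file that the imported one could be merged with.

    Only transactions within match_days of the imported date are considered,
    so an old transaction is never a merge candidate.
    """
    window = timedelta(days=match_days)
    for txn in on_file:
        if (txn.payee == imported.payee
                and txn.amount == imported.amount  # assumed criterion
                and abs(txn.post_date - imported.post_date) <= window):
            return txn
    return None
```

Seen this way, the date window only ever guards the merge, which would explain why raising it from 7 to 30 days changed nothing about the category that was picked.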
> > >
> > > I am not sure at this point what happens if I increase the matching
> > > period beyond one month and another salary payment comes in and it
> > > matches. It is certainly not detected as a duplicate, but does it match
> > > the transactions? I honestly don't know and have never tried.
> > >
> > > Why did I implement the feature as it is? I receive two payments with
> > > very different amounts from the same payee each month, and they differ
> > > in categories. One of the amounts varies each month and the other one
> > > is fixed (we talk salary and reimbursement here as well, but I receive
> > > them in two payments). The old behavior was always wrong, because taking
> > > the last payment from that payee as categorization base is certainly
> > > false and only worked when there was no reimbursement (which means I
> > > received two salary payments in a row). So for me, a matching period of
> > > a few days is OK, but for the categorization I probably need a few
> > > months. The default to take the last one on file if nothing else was
> > > found is probably a good decision.
> > >
> > > Would a new setting to limit the search for transactions to do the auto
> > > categorization help here? What would best describe it and what would be
> > > a neat name for it?
> > >
> > > Any ideas, anyone?
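
One way I could imagine such a limit working, sketched in Python under my own assumptions: the setting name `max_age_days` and the function `choose_model` are hypothetical, transactions are reduced to (date, amount, category) tuples, and the fallback to the most recent transaction when nothing falls inside the limit follows the "take the last one on file" default mentioned above.

```python
from datetime import date, timedelta
from decimal import Decimal
from typing import List, Optional, Tuple

Entry = Tuple[date, Decimal, str]  # (post date, amount, category)


def choose_model(history: List[Entry], amount: Decimal, import_date: date,
                 max_age_days: int) -> Optional[Entry]:
    """Closest-amount selection restricted to recent transactions."""
    if not history:
        return None
    cutoff = import_date - timedelta(days=max_age_days)
    recent = [e for e in history if e[0] >= cutoff]
    if not recent:
        # Nothing inside the limit: fall back to the last one on file.
        return max(history, key=lambda e: e[0])
    # Closest amount within the limit; ties go to the more recent entry.
    return min(recent, key=lambda e: (abs(e[1] - amount), -e[0].toordinal()))
```

With a limit of a few months, the 18-month-old paycheck from the original report would no longer be a candidate, while two alternating payments per month from the same payee would both still be within reach.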
> 
> --
> 
> Regards
> 
> Thomas Baumgart