OFX Import Matching Problem

Brendan Coupe brendan at coupeware.com
Sat Jun 22 21:04:22 BST 2019


I like having a global setting but I don't think just having a global
setting works. In fact the more I think about it the more I think the
closest amount option should have an option for how far back it can
look, just like the matching does now. And I think it needs to be
global and per payee.

Consider two payees.

One is a weekly paycheck that can have expenses added to it. It has
many categories in the splits. Only the reimbursed expenses change
significantly week to week. It would be best to set this to match
categories based on the closest total amount over the past several
weeks or months. In my case I would probably set this to match based on
the closest total amount over the past 30 days. Going back further than
that is likely to pick up an old paycheck that has very different
splits.

The second example is the Department of Motor Vehicles. Every year I
pay to register my car, let's say it's $200. Every 4 or 5 years I pay
to renew my driver's license, let's say that's $50. It would be nice if
KMM could select the correct categories based on the transaction that is
closest in value over a much longer period of time, probably 4 or 5
years in this case.

Setting this to most recent transaction globally would not work well
for either of these cases. Closest amount doesn't work well for the
first example and probably won't work for the second example over time
without a limit on the time frame that KMM looks for the closest
amount. Making it global means only one of the two examples will work
well (the paycheck payee has only been active for 2 years, and more
paychecks will only make this worse with no time limit).
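
As a rough illustration of the time-limited closest amount idea for
the two payees above, here is a small Python sketch. It is not KMM
code (KMM is C++); PastTransaction, closest_within and the sample
amounts and dates are made up for the example, and amounts are held
in cents to avoid rounding issues.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Sequence


@dataclass
class PastTransaction:
    """Hypothetical stand-in for a ledger entry; not a KMyMoney type."""
    posted: date
    amount_cents: int
    label: str


def closest_within(history: Sequence[PastTransaction], amount_cents: int,
                   today: date, max_age_days: int) -> Optional[PastTransaction]:
    """Return the entry closest in amount that is no older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    recent = [t for t in history if t.posted >= cutoff]
    if not recent:
        return None
    # Smallest amount difference wins; on a tie the newer entry is preferred.
    return min(recent, key=lambda t: (abs(t.amount_cents - amount_cents),
                                      -t.posted.toordinal()))


today = date(2019, 6, 22)

# Illustrative paychecks: the old one is $0.01 closer to the new amount.
paychecks = [
    PastTransaction(date(2017, 12, 8), 149171, "18 months ago"),
    PastTransaction(date(2019, 6, 14), 150830, "last week"),
]

# Illustrative DMV payments: yearly registration, rare license renewal.
dmv = [
    PastTransaction(date(2015, 5, 10), 5000, "license renewal"),
    PastTransaction(date(2018, 3, 1), 20000, "registration 2018"),
    PastTransaction(date(2019, 3, 1), 20000, "registration 2019"),
]
```

With a 30 day limit, closest_within(paychecks, 150000, today, 30)
returns the paycheck from last week, while a limit of several years
returns the 18 month old one. For the DMV, a limit of about 5 years
lets a $55 payment find the license renewal, whereas a 1 year limit
only sees the most recent registration.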

If you can make a quick fix as a global setting, that would help with
example one, but I hope the per-payee setting can be implemented some
time. If global and per-payee settings both existed, I would probably
set global to most recent and then use the time-limited closest amount
for specific payees. This might make assigning default categories
unnecessary for most payees.
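
A minimal sketch of how a global setting with per-payee overrides
could be resolved, again in Python and purely illustrative. Strategy,
CategorizationRule and rule_for are hypothetical names, not existing
KMM settings, and the payee names are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Strategy(Enum):
    NONE = "none"
    MOST_RECENT = "most recent transaction"
    CLOSEST_AMOUNT = "closest amount"


@dataclass(frozen=True)
class CategorizationRule:
    strategy: Strategy
    max_age_days: Optional[int] = None  # None means no time limit


def rule_for(payee: str, global_rule: CategorizationRule,
             overrides: Dict[str, CategorizationRule]) -> CategorizationRule:
    """A per-payee rule wins; otherwise fall back to the global rule."""
    return overrides.get(payee, global_rule)


# Global default: most recent transaction, as suggested above.
global_rule = CategorizationRule(Strategy.MOST_RECENT)

# Per-payee overrides using the time-limited closest amount.
overrides = {
    "Employer": CategorizationRule(Strategy.CLOSEST_AMOUNT, 30),
    "Department of Motor Vehicles": CategorizationRule(Strategy.CLOSEST_AMOUNT, 5 * 365),
}
```

Any payee without an entry in overrides, such as a grocery store,
simply gets the global rule.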

Sorry for top posting but my email is hosted on GMail (probably a
bigger sin many years ago before they revealed how evil they would
become). Apparently they no longer support bottom posting. Moving to
another email host won't be easy since many of my 100 free accounts
are in use by challenged users.

----
Brendan Coupe

On Sat, Jun 22, 2019 at 9:51 AM Thomas Baumgart <thb at net-bembel.de> wrote:
>
> On Samstag, 22. Juni 2019 17:16:45 CEST Brendan Coupe wrote:
>
> > I see why my 30 day limit did not help. It does when I manually copy
> > the most recent paycheck and then import the OFX data.
> >
> > I have an idea how to deal with this. In the Default Account tab for
> > the payee there is a checkbox "Use the default category..." If checked
> > you can select a single default category.
> >
> > How about making 4 radio buttons:
> >
> > - None
> > - Most recent transaction
> > - Closest amount
> > - Use the default category... (enable the dropdown list when selected)
>
> How about a system wide setting with the above option set (maybe without the last one) and a per payee override option? Introduction of this feature would be done as follows:
>
> a) the system wide default setting is "closest amount" (which reflects today's default)
> b) payees that don't have the category set will use the system wide setting
> c) payees that have a default category set will override the system wide setting with the default category
>
> Does that make sense? Any objections anyone?
>
> Thomas
>
>
>
> > On Sat, Jun 22, 2019 at 4:25 AM Thomas Baumgart <thb at net-bembel.de> wrote:
> > >
> > > On Freitag, 21. Juni 2019 22:55:29 CEST Brendan Coupe wrote:
> > >
> > > > I'm running a week old build from the 5.0 branch on Fedora 29.
> > > >
> > > > When I download my savings account transactions using online banking
> > > > the paycheck frequently matches with a very old paycheck. This results
> > > > in the splits being way off.
> > > >
> > > > This happens when the amount of the new paycheck is not very close to
> > > > the most recent paycheck which has been happening a lot lately due to
> > > > reimbursed business expenses.
> > > >
> > > > On the import tab of the ledger settings I have tried setting "Match
> > > > transaction within days" from 7 days (paycheck is weekly) to 30 days
> > > > and the same thing happens. KMM is definitely matching transactions
> > > > that are much more than 30 days old. In fact the transaction that it
> > > > matched was only $0.01 closer to the new transaction than the previous
> > > > paycheck (difference was $8.29 versus $8.30). The transaction it
> > > > matched is over 18 months old. It appears to be ignoring the "Match
> > > > transaction within days" setting. It's simply matching the transaction
> > > > from the same payee that is closest in value.
> > > >
> > > > I'm pretty sure this is fairly new behavior but I'm not sure if it
> > > > started with the initial version of KMM5 that I used or more recently.
> > >
> > > This probably goes back to a change I made in January this year:
> > >
> > >    https://cgit.kde.org/kmymoney.git/commit/?id=447213e04d6e7ab9022caeb5c258800625036967
> > >
> > > which added the part of choosing an ancient transaction based on the smallest difference in amount whereas before it only used old transactions if the amount was identical.
> > >
> > > Here's what I found in the code (which perfectly explains what you encounter):
> > >
> > > In case the payee name has been found, the following will take place:
> > >
> > >       // Fill in other side of the transaction (category/etc) based on payee
> > >       //
> > >                 // [...]
> > >                 //
> > >       // We'll search for the most recent transaction in this account with
> > >       // this payee.  If this reference transaction is a simple 2-split
> > >       // transaction, it's simple.  If it's a complex split, and the amounts
> > >       // are different, we have a problem.  Somehow we have to balance the
> > >       // transaction.  For now, we'll leave it unbalanced, and let the user
> > >       // handle it.
> > >
> > > For the category to be found, the first thing is to check if the payee has a default category assigned. If yes, it is taken and we're done. If not, all transactions for that payee in the account will be searched backwards. Note: no date filtering here, which certainly is the cause of the behavior you encounter. The algorithm then works as follows:
> > >
> > >           // if there is more than one matching transaction, try to be a little
> > >           // smart about which one we use.  we scan them all and check if
> > >           // we find an exact match or use the one with the closest value
> > >
> > > The scan works backwards with the last one being the default. So we have at least one transaction for that payee, and in case of multiple the one with the least difference in amount will be selected. Then we continue with:
> > >
> > >                 // in case the old transaction has two splits
> > >                 // we simply inverse the amount of the current
> > >                 // transaction found in s1. In other cases (more
> > >                 // than two splits we copy all splits and don't
> > >                 // modify the splits. This may lead to unbalanced
> > >                 // transactions which the user has to fix manually
> > >
> > > The point is that we are not talking about 'matching' at this point but about automatic categorization of the imported transaction. Matching happens in the next step, when KMyMoney tries to figure out if you already have the said transaction on file (entered manually, for example). It is for that matching that the interval is used, not for the automatic categorization happening in the step before. Matching actually means merging two transactions (the one on file and the imported one) into a single one. This is not what is happening for you, and it is certainly not what you want with older transactions.
> > >
> > > I am not sure at this point what happens, if I increase the matching period beyond one month and another salary payment comes in and it matches. It is certainly not detected as a duplicate but does it match the transactions? I honestly don't know and have never tried.
> > >
> > > Why did I implement the feature as it is: I receive two payments with very different amounts from the same payee each month and they differ in categories. One of the amounts varies each month and the other one is fixed (we are talking about salary and reimbursement here as well, but I receive them in two payments). The old behavior was always wrong, because taking the last payment from that payee as the categorization base is certainly wrong and only worked when there was no reimbursement (which means I received two salary payments in a row). So for me, a matching period of a few days is OK, but for the categorization I probably need a few months. The default to take the last one on file if nothing else was found is probably a good decision.
> > >
> > > Would a new setting to limit the search for transactions to do the auto categorization help here? What would best describe it and what would be a neat name for it?
> > >
> > > Any ideas, anyone?
>
> --
>
> Regards
>
> Thomas Baumgart
>
> https://www.signal.org/       Signal, the better WhatsApp
> -------------------------------------------------------------
> A: Because it destroys the flow of the conversation
> Q: Why is top-posting bad?
> A: Top-posting
> Q: What is the most annoying thing in e-mail?
> -------------------------------------------------------------
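
The automatic categorization described in Thomas's quoted message
above can be paraphrased roughly as follows. This is a Python sketch
of the quoted description only, not the KMyMoney source. Txn,
pick_reference, categorize and the category names are invented for
illustration, and resolving ties in favor of the newer transaction is
an assumption. Handling of the copied splits is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Txn:
    """Hypothetical past transaction for one payee in one account."""
    posted: date
    amount_cents: int
    categories: List[str]


def pick_reference(history: List[Txn], amount_cents: int) -> Optional[Txn]:
    """Choose the transaction whose categories get reused.

    No date filter is applied: every transaction for the payee is a
    candidate. The most recent one is the starting default, and any
    transaction that is strictly closer in amount replaces it.
    """
    if not history:
        return None
    newest_first = sorted(history, key=lambda t: t.posted, reverse=True)
    best = newest_first[0]
    for candidate in newest_first[1:]:
        if abs(candidate.amount_cents - amount_cents) < abs(best.amount_cents - amount_cents):
            best = candidate
    return best


def categorize(default_category: Optional[str], history: List[Txn],
               amount_cents: int) -> List[str]:
    """A payee default category wins; otherwise reuse the reference's categories."""
    if default_category is not None:
        return [default_category]
    reference = pick_reference(history, amount_cents)
    return list(reference.categories) if reference is not None else []


# The reported case: the 18 month old paycheck differs by $8.29,
# last week's paycheck by $8.30, so the old one is chosen.
history = [
    Txn(date(2017, 12, 8), 149171, ["Salary", "Old reimbursement"]),
    Txn(date(2019, 6, 14), 150830, ["Salary", "Expenses"]),
]
```

Here pick_reference(history, 150000) returns the 2017 transaction.
The "Match transaction within days" setting never enters this step,
which is consistent with the behavior reported in the original
message.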

