OFX Import Matching Problem
Jack
ostroffjh at users.sourceforge.net
Sat Jun 22 21:06:58 BST 2019
On 2019.06.22 11:51, Thomas Baumgart wrote:
> On Samstag, 22. Juni 2019 17:16:45 CEST Brendan Coupe wrote:
>
> > I see why my 30 day limit did not help. It does when I manually copy
> > the most recent paycheck and then import the OFX data.
> >
> > I have an idea how to deal with this. In the Default Account tab for
> > the payee there is a checkbox "Use the default category..." If
> > checked you can select a single default category.
> >
> > How about making 4 radio buttons:
> >
> > - None
> > - Most recent transaction
> > - Closest amount
> > - Use the default category... (enable the dropdown list when
> >   selected)
>
> How about a system wide setting with the above option set (maybe
> without the last one) and a per payee override option? Introduction
> of this feature would be done as follows:
>
> a) the system wide default setting is "closest amount" (which
> reflects today's default)
> b) payees that don't have the category set will use the system wide
> setting
> c) payees that have a default category set will override the system
> wide setting with the default category
I THINK that sounds right, but I'm wondering what should be per account
vs per payee vs per category.

I may be overthinking it - but when looking for a transaction to
"match," am I missing something, or do we still lack clear terminology
to distinguish finding an existing transaction to use as a "model"
[again - not a formal term] for an imported transaction vs. what I
think of as "true" matching - finding out whether the imported
transaction is a duplicate of one already present? I hate to admit it,
but I'm still not completely clear on the steps followed: first
(assuming the imported transaction is not a duplicate) finding the best
transaction to model (based on what?), and then deciding whether to use
the payee and/or category of that transaction, or the default category
of the assumed payee.

Just to add to the mix here, the problem I often face is with a payee
which usually has transactions with a single category (marked default
for that payee), but for which I sometimes create split transactions -
and it is almost always wrong to use one of these split transactions as
the model for a newly imported transaction. How might that fit into
this process?
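
To make the question concrete, here is a rough sketch of one way the
proposed settings could resolve. It is Python purely for illustration
(KMyMoney itself is C++), and every name in it is hypothetical. The
precedence between a payee's default category and an explicit per-payee
override is an assumption, as is the idea of a flag that keeps split
transactions from being used as a model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Strategy(Enum):
    # The options from the radio-button proposal (names are hypothetical).
    NONE = "none"
    MOST_RECENT = "most recent transaction"
    CLOSEST_AMOUNT = "closest amount"
    DEFAULT_CATEGORY = "use the default category"


@dataclass
class Payee:
    name: str
    default_category: Optional[str] = None  # the existing per-payee setting
    override: Optional[Strategy] = None     # hypothetical per-payee override


def resolve_strategy(system_default: Strategy, payee: Payee) -> Strategy:
    # (c) a payee with a default category overrides the system-wide setting
    if payee.default_category is not None:
        return Strategy.DEFAULT_CATEGORY
    # an explicit per-payee choice comes next (assumed precedence)
    if payee.override is not None:
        return payee.override
    # (b) everything else falls back to the system-wide setting
    return system_default


def usable_as_model(split_count: int, skip_split_transactions: bool) -> bool:
    # A plain transaction has two splits: the account side and the
    # category side. Anything with more is a "split transaction" here.
    return not (skip_split_transactions and split_count > 2)
```

With the system-wide setting at "closest amount" (a), a payee without
any settings resolves to "closest amount", and a payee with a default
category resolves to that category, which matches (b) and (c) above.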
>
> Does that make sense? Any objections anyone?
>
> Thomas
>
>
>
> > On Sat, Jun 22, 2019 at 4:25 AM Thomas Baumgart <thb at net-bembel.de>
> > wrote:
> > >
> > > On Freitag, 21. Juni 2019 22:55:29 CEST Brendan Coupe wrote:
> > >
> > > > I'm running a week old build from the 5.0 branch on Fedora 29.
> > > >
> > > > When I download my savings account transaction using online
> > > > banking the paycheck frequently matches with a very old
> > > > paycheck. This results in the splits being way off.
> > > >
> > > > This happens when the amount of the new paycheck is not very
> > > > close to the most recent paycheck, which has been happening a
> > > > lot lately due to reimbursed business expenses.
> > > >
> > > > On the import tab of the ledger settings I have tried setting
> > > > "Match transaction within days" from 7 days (paycheck is weekly)
> > > > to 30 days and the same thing happens. KMM is definitely
> > > > matching transactions that are much more than 30 days old. In
> > > > fact the transaction that it matched was only $0.01 closer to
> > > > the new transaction than the previous paycheck (difference was
> > > > $8.29 versus $8.30). The transaction it matched is over 18
> > > > months old. It appears to be ignoring the "Match transaction
> > > > within days" setting. It's simply matching the transaction from
> > > > the same payee that is closest in value.
> > > >
> > > > I'm pretty sure this is fairly new behavior but I'm not sure if
> > > > it started with the initial version of KMM5 that I used or more
> > > > recently.
> > >
> > > This probably goes back to a change I made in January this year:
> > >
> > >
> > > https://cgit.kde.org/kmymoney.git/commit/?id=447213e04d6e7ab9022caeb5c258800625036967
> > >
> > > which added the part of choosing an ancient transaction based on
> > > the smallest difference in amount whereas before it only used old
> > > transactions if the amount was identical.
> > >
> > > Here's what I found in the code (which perfectly explains what
> > > you encounter):
> > >
> > > In case the payee name has been found, the following will take
> > > place:
> > >
> > > // Fill in other side of the transaction (category/etc) based on payee
> > > //
> > > // [...]
> > > //
> > > // We'll search for the most recent transaction in this account with
> > > // this payee. If this reference transaction is a simple 2-split
> > > // transaction, it's simple. If it's a complex split, and the amounts
> > > // are different, we have a problem. Somehow we have to balance the
> > > // transaction. For now, we'll leave it unbalanced, and let the user
> > > // handle it.
> > >
> > > For the category to be found, the first thing is to check if the
> > > payee has a default category assigned. If yes, it is taken and
> > > we're done. If not, all transactions for that payee in the account
> > > will be searched backwards. Note: no date filtering here, which
> > > certainly is the cause of the behavior you encounter. The
> > > algorithm then works as follows:
> > >
> > > // if there is more than one matching transaction, try to be a little
> > > // smart about which one we use. we scan them all and check if
> > > // we find an exact match or use the one with the closest value
> > >
> > > The scan works backwards with the last one being the default. So
> > > we have at least one transaction for that payee, and in case of
> > > multiple the one with the least difference in amount will be
> > > selected. Then we continue with:
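
The selection described above can be sketched roughly as follows. This
is an illustrative Python reading of the description, not the KMyMoney
code (which is C++); the `Txn` type and the function names are made up
for the example, and the tie-breaking (an older transaction only wins
if it is strictly closer in amount) is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class Txn:
    when: date
    amount: Decimal
    category: str


def pick_reference(history: List[Txn], imported_amount: Decimal) -> Optional[Txn]:
    """history: earlier transactions for the same payee and account."""
    best = None
    # Walk from newest to oldest, so the most recent one is the default.
    # Note that there is no date filtering anywhere in this loop.
    for txn in sorted(history, key=lambda t: t.when, reverse=True):
        if best is None:
            best = txn
        elif abs(txn.amount - imported_amount) < abs(best.amount - imported_amount):
            best = txn  # strictly closer in amount, however old it is
        if best.amount == imported_amount:
            break  # exact match, nothing can be closer
    return best


def auto_category(default_category: Optional[str], history: List[Txn],
                  imported_amount: Decimal) -> Optional[str]:
    # A default category on the payee short-circuits the search.
    if default_category is not None:
        return default_category
    ref = pick_reference(history, imported_amount)
    return ref.category if ref else None
```

With a recent paycheck that is $8.30 away from the imported amount and
an 18 month old one that is $8.29 away (the differences reported in
this thread; the absolute amounts below are invented), the old one wins:

```python
history = [
    Txn(date(2019, 6, 14), Decimal("2000.00"), "recent paycheck"),
    Txn(date(2017, 12, 8), Decimal("2016.59"), "old paycheck"),
]
print(pick_reference(history, Decimal("2008.30")).category)  # old paycheck
```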
> > >
> > > // in case the old transaction has two splits
> > > // we simply inverse the amount of the current
> > > // transaction found in s1. In other cases (more
> > > // than two splits we copy all splits and don't
> > > // modify the splits. This may lead to unbalanced
> > > // transactions which the user has to fix manually
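
As an illustration of that comment, a small Python sketch (not the
actual C++ code) of the two cases. The `Split` tuple, the function
names, and the assumption that the first split of the reference
transaction is the one in the imported account are all hypothetical.

```python
from decimal import Decimal
from typing import List, Tuple

Split = Tuple[str, Decimal]  # (account or category name, signed amount)


def build_splits(account: str, imported_amount: Decimal,
                 reference: List[Split]) -> List[Split]:
    """reference[0] is assumed to be the split in the imported account."""
    new = [(account, imported_amount)]
    others = reference[1:]
    if len(reference) == 2:
        # Two splits: reuse the category and invert the imported amount.
        new.append((others[0][0], -imported_amount))
    else:
        # More than two splits: copy them unmodified. If the amount
        # differs from the reference, the result does not balance.
        new.extend(others)
    return new


def imbalance(splits: List[Split]) -> Decimal:
    # Zero for a balanced transaction.
    return sum((amount for _, amount in splits), Decimal("0"))
```

For a reference paycheck of 2000.00 split into -2500.00 salary and
500.00 tax, importing 2008.30 leaves the copied splits 8.30 out of
balance, which is the part the user then has to fix manually.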
> > >
> > > The point is that we are not talking about 'matching' at this
> > > point but about automatic categorization of the imported
> > > transaction. Matching happens in the next step, when KMyMoney tries
> > > to figure out if you already have the said transaction on file
> > > (entered manually, for example). And it is for that matching that
> > > the interval is used, but not for the automatic categorization
> > > happening in the step before. Matching actually means merging two
> > > transactions (the one on file and the imported one) into a single
> > > one. This is not what is happening for you and what you certainly
> > > don't want with older transactions.
> > >
> > > I am not sure at this point what happens if I increase the
> > > matching period beyond one month and another salary payment comes
> > > in and it matches. It is certainly not detected as a duplicate but
> > > does it match the transactions? I honestly don't know and have
> > > never tried.
> > >
> > > Why did I implement the feature as it is: I receive two payments
> > > with very different amounts from the same payee each month and
> > > they differ in categories. One of the amounts varies each month
> > > and the other one is fixed (we talk salary and reimbursement here
> > > as well, but I receive them in two payments). The old behavior was
> > > always wrong, because taking the last payment from that payee as
> > > categorization base is certainly false and only worked when there
> > > was no reimbursement (which means I received two salary payments
> > > in a row). So for me, a matching period of a few days is OK, but
> > > for the categorization I probably need a few months. The default
> > > to take the last one on file if nothing else was found is probably
> > > a good decision.
> > >
> > > Would a new setting to limit the search for transactions to do
> > > the auto categorization help here? What would best describe it
> > > and what would be a neat name for it?
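
A sketch of what such a limit might look like, again in Python and only
as an illustration. The `max_age_days` parameter is a hypothetical name
for the new setting, and falling back to the most recent transaction
when nothing is inside the window follows the "take the last one on
file" default mentioned above.

```python
from datetime import date, timedelta
from decimal import Decimal
from typing import List, Optional, Tuple

Candidate = Tuple[date, Decimal]  # (post date, amount) for one payee


def pick_recent_reference(history: List[Candidate], imported_date: date,
                          imported_amount: Decimal,
                          max_age_days: int) -> Optional[Candidate]:
    if not history:
        return None
    cutoff = imported_date - timedelta(days=max_age_days)
    recent = [c for c in history if c[0] >= cutoff]
    if not recent:
        # Nothing inside the window: take the last one on file.
        return max(history, key=lambda c: c[0])
    # Closest amount inside the window; on a tie the newer one wins.
    return min(recent, key=lambda c: (abs(c[1] - imported_amount),
                                      -c[0].toordinal()))
```

With a limit of a few months, the 18 month old paycheck from the
original report is no longer a candidate and the most recent one is
used, while two alternating payments from the same payee within the
window are still told apart by amount.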
> > >
> > > Any ideas, anyone?
>
> --
>
> Regards
>
> Thomas Baumgart
>
>