OFX Import Matching Problem
Jack
ostroffjh at users.sourceforge.net
Sun Jun 23 16:08:33 BST 2019
On 2019.06.23 03:56, Thomas Baumgart wrote:
> On Samstag, 22. Juni 2019 23:33:43 CEST Jack wrote:
>
>> Minor point - I hope you mean default category (rather than account)
>> for a payee.
>>
>> Primarily, I'm just trying to think of cases that might end up with
>> unintended consequences, such as your current problem, after the
>> change Thomas made in January. I'm also partly just talking out
>> loud, to make sure I understand how things work, as I often discover
>> is not the case. One thing I was not framing correctly in my mind
>> is that a split transaction has only one payee, but multiple
>> categories. You are asking (do I have this right?) to choose the
>> matching transaction not based on total amount of the transaction,
>> but the closest amount (within a specified time limit) for a
>> category specified in the configuration for the payee. Given the
>> newly imported transaction is not yet split, are you trying to match
>> the total amount of the new transaction to the amount of the
>> specified category in past transactions? (Or am I further off the
>> mark than I thought?)
>
> This may get clearer for you if you start thinking in splits. Each
> (non-zero and balanced) transaction has at least two splits: one for
> the account and at least one for a category. The imported new
> transaction only has one split (as the category is not yet known). So
> what KMyMoney does is take a list of transactions filtered by payee
> and account, which means transactions that have a split with the
> payee in that account. (It would even work if each split of a
> transaction could have a different payee, which exists as a wish list
> item.) Amount comparison of the new and existing transactions happens
> on the split referencing the account (which in fact is what you refer
> to as the total amount). Anything else would not really work.
Thanks, that does give me a much better picture than I had.
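
To check my understanding, here is a minimal Python sketch of that split picture. The names (Split, Transaction, candidates_for) are made up for illustration and are not KMyMoney's actual classes; the sign convention is also just an assumption for the example.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class Split:
    account: str                  # an account or a category (hypothetical model)
    amount: Decimal
    payee: Optional[str] = None


@dataclass
class Transaction:
    splits: List[Split] = field(default_factory=list)

    def is_balanced(self) -> bool:
        # A balanced transaction's splits sum to zero.
        return sum((s.amount for s in self.splits), Decimal("0")) == 0

    def account_amount(self, account: str) -> Optional[Decimal]:
        # Amount of the split referencing the given account,
        # i.e. what I had been calling the "total amount".
        for s in self.splits:
            if s.account == account:
                return s.amount
        return None


def candidates_for(transactions, account, payee):
    # Transactions that have a split with this payee in this account.
    return [t for t in transactions
            if any(s.account == account and s.payee == payee for s in t.splits)]


# An existing paycheck on file: one account split, two category splits.
paycheck = Transaction([
    Split("Checking", Decimal("1000.00"), "Employer"),
    Split("Salary", Decimal("-1200.00")),
    Split("Taxes", Decimal("200.00")),
])
# A freshly imported transaction: only the account split, category unknown.
imported = Transaction([Split("Checking", Decimal("1008.30"), "Employer")])
other = Transaction([
    Split("Checking", Decimal("-50.00"), "Grocer"),
    Split("Food", Decimal("50.00")),
])
```

So the imported transaction starts out unbalanced, and the comparison is between imported.account_amount("Checking") and the same value on each candidate.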
>
> The old implementation (prior to my January change) looked for a
> transaction in that list that has the exact same amount and copied
> the categories that were assigned. In case no transaction with the
> exact same amount exists, it simply took the last one entered for
> that payee.
>
> This is what bugged me with two alternating transactions from the
> same payee with different amounts each month: it took the wrong one
> most of the time. Hence my change, which now works as follows:
>
> KMyMoney looks for a transaction in the list filtered by payee and
> account that has the exact same amount and copies the categories that
> were assigned. In case no transaction with the exact same amount
> exists, it simply takes the one with the smallest difference in
> amount for that payee. While doing so, it goes back to day one of
> your data in that account.
>
> Brendan now asks to limit this search further by adding a date filter
> which should be configurable on a per payee basis.
OK, so I have no objection to these changes. As far as I can tell,
they will have no effect on the problem I've mentioned, but for now,
the only thing I can think of that would help me is a setting to never
match to a split transaction, or perhaps to only use the category with
the largest split amount from the previous transaction, and I suspect
that would not be a good rule in general (even as an optional setting.)
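
Thinking out loud again, here is a rough Python sketch of the selection as I understand it from the description above: prefer an exact amount, otherwise take the smallest difference. The function pick_model and the two options max_age_days (Brendan's per payee date limit) and allow_split (my "never use a split transaction" idea) are hypothetical and do not exist in KMyMoney; letting the most recent transaction win a tie is also my assumption.

```python
from datetime import date
from decimal import Decimal
from typing import NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    post_date: date
    amount: Decimal               # amount of the split in the imported account
    categories: Tuple[str, ...]   # categories of the other splits


def pick_model(candidates, amount, today,
               max_age_days: Optional[int] = None,
               allow_split: bool = True) -> Optional[Candidate]:
    best = None
    best_diff = None
    # Scan from newest to oldest so that ties go to the most recent one.
    for c in sorted(candidates, key=lambda c: c.post_date, reverse=True):
        if max_age_days is not None and (today - c.post_date).days > max_age_days:
            continue              # hypothetical per payee date filter
        if not allow_split and len(c.categories) > 1:
            continue              # hypothetical "skip split transactions" option
        diff = abs(c.amount - amount)
        if best_diff is None or diff < best_diff:
            best, best_diff = c, diff
        if diff == 0:
            break                 # exact amount found, no need to look further
    return best


# Brendan's case: the 18 month old paycheck is $0.01 closer than last week's.
today = date(2019, 6, 21)
recent = Candidate(date(2019, 6, 14), Decimal("1008.30"), ("Salary", "Taxes"))
old = Candidate(date(2017, 12, 1), Decimal("991.71"), ("Salary",))
new_amount = Decimal("1000.00")
```

With no limit, pick_model([recent, old], new_amount, today) returns the old paycheck; with max_age_days=30 it returns the recent one. With allow_split=False the recent (split) paycheck is skipped, which is the behavior I would want for my case.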
>
> That seems to be doable with an addition to the payee editor and a
> new storage attribute.
>
> Thomas
>
>
> > Separately, I'm trying to think how I could use this for my problem,
> > which is that I never (or very rarely) want to match a newly
> > imported transaction to a split transaction, which seems to happen
> > fairly often when the most recent transaction for the payee is split.
> >
> > On 2019.06.22 16:23, Brendan Coupe wrote:
> > > If I understood Thomas correctly, matching only looks at existing
> > > transactions in the account. That works fine for me when I
> > > duplicate the previous paycheck prior to importing the OFX file
> > > from my bank. Not an ideal way to do this, but when I don't, it
> > > matches the closest amount for that payee since the beginning of
> > > time.
> > >
> > > The paycheck has 15 splits, so a single default account does not
> > > work. Even if I could assign 15 default accounts, I would have to
> > > update them fairly often or they would become less and less useful.
> > >
> > > Basically what I am asking for is an option in the payee default
> > > account settings that says pick the closest amount in the past xx
> > > days and use that/those category(ies). That alone would eliminate
> > > this weekly problem for me and probably many others that are less
> > > frequent. The global settings and my original suggestion are
> > > probably not needed if this setting was added for each payee.
> > >
> > > ----
> > > Brendan Coupe
> > >
> > > On Sat, Jun 22, 2019 at 2:07 PM Jack
> > > <ostroffjh at users.sourceforge.net> wrote:
> > > >
> > > > On 2019.06.22 11:51, Thomas Baumgart wrote:
> > > > > On Samstag, 22. Juni 2019 17:16:45 CEST Brendan Coupe wrote:
> > > > >
> > > > > > I see why my 30 day limit did not help. It does when I
> > > > > > manually copy the most recent paycheck and then import the
> > > > > > OFX data.
> > > > > >
> > > > > > I have an idea how to deal with this. In the Default Account
> > > > > > tab for the payee there is a checkbox "Use the default
> > > > > > category..." If checked, you can select a single default
> > > > > > category.
> > > > > >
> > > > > > How about making 4 radio buttons:
> > > > > >
> > > > > > - None
> > > > > > - Most recent transaction
> > > > > > - Closest amount
> > > > > > - Use the default category... (enable the dropdown list when
> > > > > >   selected)
> > > > >
> > > > > How about a system wide setting with the above option set
> > > > > (maybe without the last one) and a per payee override option?
> > > > > Introduction of this feature would be done as follows:
> > > > >
> > > > > a) the system wide default setting is "closest amount" (which
> > > > >    reflects today's default)
> > > > > b) payees that don't have the category set will use the system
> > > > >    wide setting
> > > > > c) payees that have a default category set will override the
> > > > >    system wide setting with the default category
> > > > I THINK that sounds right, but I'm wondering what should be per
> > > > account vs per payee vs per category.
> > > >
> > > > I may be overthinking it, but when looking for a transaction to
> > > > "match," am I missing something, or do we still lack clear
> > > > terminology to distinguish finding the existing transaction to
> > > > use as a "model" [again - not a formal term] for an imported
> > > > transaction vs. what I think of as "true" matching - to find if
> > > > the imported transaction is a duplicate of one already present?
> > > > I hate to admit it, but I'm still not completely clear on the
> > > > steps followed - first (assuming the imported transaction is not
> > > > a duplicate) to find the best transaction to model (based on
> > > > what) and then whether to use the payee and/or category of that
> > > > transaction, or the default category of the assumed payee. Just
> > > > to add to the mix here, the problem I often face is that for a
> > > > payee which usually has transactions with a single category
> > > > (marked default for that payee) I sometimes create split
> > > > transactions - and it is almost always wrong to use one of these
> > > > split transactions as the model for a newly imported
> > > > transaction. How might that fit into this process?
> > > >
> > > > >
> > > > > Does that make sense? Any objections anyone?
> > > > >
> > > > > Thomas
> > > > >
> > > > >
> > > > >
> > > > > > On Sat, Jun 22, 2019 at 4:25 AM Thomas Baumgart
> > > > > > <thb at net-bembel.de> wrote:
> > > > > > >
> > > > > > > On Freitag, 21. Juni 2019 22:55:29 CEST Brendan Coupe wrote:
> > > > > > >
> > > > > > > > I'm running a week old build from the 5.0 branch on
> > > > > > > > Fedora 29.
> > > > > > > >
> > > > > > > > When I download my savings account transactions using
> > > > > > > > online banking, the paycheck frequently matches with a
> > > > > > > > very old paycheck. This results in the splits being way
> > > > > > > > off.
> > > > > > > >
> > > > > > > > This happens when the amount of the new paycheck is not
> > > > > > > > very close to the most recent paycheck, which has been
> > > > > > > > happening a lot lately due to reimbursed business
> > > > > > > > expenses.
> > > > > > > >
> > > > > > > > On the import tab of the ledger settings I have tried
> > > > > > > > setting "Match transaction within days" from 7 days
> > > > > > > > (paycheck is weekly) to 30 days and the same thing
> > > > > > > > happens. KMM is definitely matching transactions that
> > > > > > > > are much more than 30 days old. In fact the transaction
> > > > > > > > that it matched was only $0.01 closer to the new
> > > > > > > > transaction than the previous paycheck (difference was
> > > > > > > > $8.29 versus $8.30). The transaction it matched is over
> > > > > > > > 18 months old. It appears to be ignoring the "Match
> > > > > > > > transaction within days" setting. It's simply matching
> > > > > > > > the transaction from the same payee that is closest in
> > > > > > > > value.
> > > > > > > >
> > > > > > > > I'm pretty sure this is fairly new behavior but I'm not
> > > > > > > > sure if it started with the initial version of KMM5 that
> > > > > > > > I used or more recently.
> > > > > > >
> > > > > > > This probably goes back to a change I made in January this
> > > > > > > year:
> > > > > > >
> > > > > > > https://cgit.kde.org/kmymoney.git/commit/?id=447213e04d6e7ab9022caeb5c258800625036967
> > > > > > >
> > > > > > > which added the part of choosing an ancient transaction
> > > > > > > based on the smallest difference in amount, whereas before
> > > > > > > it only used old transactions if the amount was identical.
> > > > > > >
> > > > > > > Here's what I found in the code (which perfectly explains
> > > > > > > what you encounter):
> > > > > > >
> > > > > > > In case the payee name has been found, the following will
> > > > > > > take place:
> > > > > > >
> > > > > > >     // Fill in other side of the transaction (category/etc) based on payee
> > > > > > >     //
> > > > > > >     // [...]
> > > > > > >     //
> > > > > > >     // We'll search for the most recent transaction in this account with
> > > > > > >     // this payee. If this reference transaction is a simple 2-split
> > > > > > >     // transaction, it's simple. If it's a complex split, and the amounts
> > > > > > >     // are different, we have a problem. Somehow we have to balance the
> > > > > > >     // transaction. For now, we'll leave it unbalanced, and let the user
> > > > > > >     // handle it.
> > > > > > >
> > > > > > > For the category to be found, the first thing is to check
> > > > > > > if the payee has a default category assigned. If yes, it
> > > > > > > is taken and we're done. If not, all transactions for that
> > > > > > > payee in the account will be searched backwards. Note: no
> > > > > > > date filtering here, which certainly is the cause of the
> > > > > > > behavior you encounter. The algorithm then works as
> > > > > > > follows:
> > > > > > >
> > > > > > >     // if there is more than one matching transaction, try to be a little
> > > > > > >     // smart about which one we use. we scan them all and check if
> > > > > > >     // we find an exact match or use the one with the closest value
> > > > > > >
> > > > > > > The scan works backwards with the last one being the
> > > > > > > default. So we have at least one transaction for that
> > > > > > > payee, and in case of multiple the one with the least
> > > > > > > difference in amount will be selected. Then we continue
> > > > > > > with:
> > > > > > >
> > > > > > >     // in case the old transaction has two splits
> > > > > > >     // we simply inverse the amount of the current
> > > > > > >     // transaction found in s1. In other cases (more
> > > > > > >     // than two splits we copy all splits and don't
> > > > > > >     // modify the splits. This may lead to unbalanced
> > > > > > >     // transactions which the user has to fix manually
> > > > > > > The point is that we are not talking about 'matching' at
> > > > > > > this point but about automatic categorization of the
> > > > > > > imported transaction. Matching happens in the next step
> > > > > > > when KMyMoney tries to figure out if you already have the
> > > > > > > said transaction on file (entered manually for example).
> > > > > > > And it is for that matching that the interval is used, but
> > > > > > > not for the automatic categorization happening in the step
> > > > > > > before. Matching actually means merging two transactions
> > > > > > > (the one on file and the imported one) into a single one.
> > > > > > > This is not what is happening for you and what you
> > > > > > > certainly don't want with older transactions.
> > > > > > >
> > > > > > > I am not sure at this point what happens if I increase the
> > > > > > > matching period beyond one month and another salary
> > > > > > > payment comes in and it matches. It is certainly not
> > > > > > > detected as a duplicate but does it match the
> > > > > > > transactions? I honestly don't know and have never tried.
> > > > > > >
> > > > > > > Why did I implement the feature as it is: I receive two
> > > > > > > payments with very different amounts from the same payee
> > > > > > > each month and they differ in categories. One of the
> > > > > > > amounts varies each month and the other one is fixed (we
> > > > > > > talk salary and reimbursement here as well, but I receive
> > > > > > > them in two payments). The old behavior was always wrong,
> > > > > > > because taking the last payment from that payee as
> > > > > > > categorization base is certainly false and only worked
> > > > > > > when there was no reimbursement (which means I received
> > > > > > > two salary payments in a row). So for me, a matching
> > > > > > > period of a few days is OK, but for the categorization I
> > > > > > > probably need a few months. The default to take the last
> > > > > > > one on file if nothing else was found is probably a good
> > > > > > > decision.
> > > > > > >
> > > > > > > Would a new setting to limit the search for transactions
> > > > > > > to do the auto categorization help here? What would best
> > > > > > > describe it and what would be a neat name for it?
> > > > > > >
> > > > > > > Any ideas, anyone?
> > > > >
> > > > > --
> > > > >
> > > > > Regards
> > > > >
> > > > > Thomas Baumgart
> > > > >
> > > > > https://www.signal.org/ Signal, the better WhatsApp
> > > > > -------------------------------------------------------------
> > > > > A: Because it destroys the flow of the conversation
> > > > > Q: Why is top-posting bad?
> > > > > A: Top-posting
> > > > > Q: What is the most annoying thing in e-mail?
> > > > > -------------------------------------------------------------
> > > > >
> > > >
> > >
> >
>
> --
>
> Regards
>
> Thomas Baumgart
>
> https://www.signal.org/ Signal, the better WhatsApp
> -------------------------------------------------------------
> 'Good code is not created, it evolves.'
> -- George Anzinger
> -------------------------------------------------------------
>