OFX Import Matching Problem

Jack ostroffjh at users.sourceforge.net
Sun Jun 23 17:24:16 BST 2019


On 2019.06.23 12:16, Brendan Coupe wrote:
> Thomas,
> 
> Let me know if you have a patch for me to test. I can easily add it  
> with my build scripts.
> 
> Jack, do you have a default category for the payee that sometimes has
> multiple categories and sometimes (most times) does not? I assume
> that setting a default category for a payee takes precedence. If the
> transactions with one category have a different category each time
> then I'm guessing that's an edge case that will be hard to handle
> every time.
Ah - I did NOT have a default category set for that payee.  I have just  
set it to the most common use, and I will keep an eye out to see if  
that seems to solve my problem.  Thanks for the suggestion.
> 
> Your question about account versus category points out what I think  
> is an error in the user interface. I used the Payee tab label  
> "Default Account" in my email but the tab should probably be labeled  
> "Default Category" since that's what it really is.
The underlying "problem" is that internally to KMM, categories ARE  
accounts.  I think that is mentioned in the handbook, but it is  
certainly a frequent source of confusion.  I agree it would be good to  
change the tab label from "account" to "category."  Thomas - should I  
open a wishlist so we don't lose this?

Jack
> 
> 
> ----
> Brendan Coupe
> 
> On Sun, Jun 23, 2019 at 9:08 AM Jack  
> <ostroffjh at users.sourceforge.net> wrote:
> >
> > On 2019.06.23 03:56, Thomas Baumgart wrote:
> > > > On Saturday, 22 June 2019 23:33:43 CEST Jack wrote:
> > >
> > > >> Minor point - I hope you mean default category (rather than account)
> > > >> for a payee.
> > > >>
> > > >> Primarily, I'm just trying to think of cases that might end up with
> > > >> unintended consequences, such as your current problem, after the
> > > >> change Thomas made in January.  I'm also partly just talking out
> > > >> loud, to make sure I understand how things work, as I often discover
> > > >> is not the case.  One thing I was not framing correctly in my mind
> > > >> is that a split transaction has only one payee, but multiple
> > > >> categories.  You are asking (do I have this right?) to choose the
> > > >> matching transaction not based on total amount of the transaction,
> > > >> but the closest amount (within a specified time limit) for a
> > > >> category specified in the configuration for the payee.  Given the
> > > >> newly imported transaction is not yet split, are you trying to match
> > > >> the total amount of the new transaction to the amount of the
> > > >> specified category in past transactions?  (Or am I further off the
> > > >> mark than I thought?)
> > >
> > > > This may get clearer for you if you start thinking in splits. Each
> > > > (non-zero and balanced) transaction has at least two splits: one for
> > > > the account and at least one for a category. The imported new
> > > > transaction only has one split (as the category is yet unknown). So
> > > > what KMyMoney does is to take a list of transactions filtered by
> > > > payee and account (which means: transactions that have a split with
> > > > the payee in that account. It would even work if each split of a
> > > > transaction can have a different payee, which exists as wish list
> > > > item). Amount comparison of the new and existing transactions happens
> > > > on the split referencing the account (which in fact is what you refer
> > > > to as the total amount). Anything else would not really work.
> > Thanks, that does give me a much better picture than I had.
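
To make the split picture above concrete, here is a minimal Python
sketch. It is not KMyMoney's actual C++ code: the Split and Transaction
classes, the candidates() helper and all account names and amounts are
hypothetical and exist only for illustration. It assumes each split
carries its own account, amount and optional payee, and it selects the
transactions that have a split with the given payee in the given account.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class Split:
    account: str                 # asset account or category (categories are accounts internally)
    amount: Decimal
    payee: Optional[str] = None


@dataclass
class Transaction:
    splits: List[Split] = field(default_factory=list)

    def is_balanced(self) -> bool:
        # A balanced transaction's splits sum to zero.
        return sum((s.amount for s in self.splits), Decimal("0")) == 0

    def account_split(self, account: str, payee: str) -> Optional[Split]:
        # The split that references the given account with the given payee, if any.
        for s in self.splits:
            if s.account == account and s.payee == payee:
                return s
        return None


def candidates(ledger, account, payee):
    # Transactions that have a split with this payee in this account.
    return [t for t in ledger if t.account_split(account, payee) is not None]


# An existing three-split paycheck and a freshly imported one-split transaction.
paycheck = Transaction([
    Split("Checking", Decimal("900.00"), payee="Employer"),
    Split("Income:Salary", Decimal("-1000.00")),
    Split("Expense:Tax", Decimal("100.00")),
])
imported = Transaction([Split("Checking", Decimal("905.00"), payee="Employer")])
```

In this sketch the amount that gets compared is the one on the
"Checking" split, which is what the thread calls the total amount.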
> > >
> > > The old implementation (prior to my January change) looked for a
> > > transaction in that list that has the exact same amount and copied
> > > the categories that were assigned. In case no transaction with the
> > > exact same amount exists, it simply took the last one entered for
> > > that payee.
> > >
> > > > This is what bugged me with two alternating transactions from the
> > > > same payee with different amounts each month: it took the wrong one
> > > > most of the time. Hence my change, which now works as follows:
> > >
> > > > KMyMoney looks for a transaction in the list filtered by payee and
> > > > account that has the exact same amount and copies the categories that
> > > > were assigned. In case no transaction with the exact same amount
> > > > exists, it simply takes the one with the smallest difference in
> > > > amount for that payee. While doing so, it goes back to day one of
> > > > your data in that account.
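
The difference between the old and the new selection rule can be
sketched in a few lines of Python. This is an illustration only, not
the KMyMoney source: pick_old() and pick_new() are hypothetical names,
the list is assumed to be ordered oldest first, and the tie-breaking
(most recent wins) is an assumption. The amounts are invented so that
the differences come out as the $8.29 versus $8.30 reported earlier in
the thread.

```python
from decimal import Decimal


def pick_old(amounts, new_amount):
    # Old rule: exact amount if present, else the last one entered.
    for a in reversed(amounts):
        if a == new_amount:
            return a
    return amounts[-1]


def pick_new(amounts, new_amount):
    # New rule: exact amount if present, else the smallest difference.
    # Scanning backwards with "<" keeps the most recent one on ties (an assumption).
    best = amounts[-1]
    for a in reversed(amounts):
        if a == new_amount:
            return a
        if abs(a - new_amount) < abs(best - new_amount):
            best = a
    return best


# Oldest first: the first entry stands for the 18-month-old paycheck,
# the last entry for the previous paycheck.
history = [Decimal("1008.29"), Decimal("950.00"), Decimal("991.70")]
new_amount = Decimal("1000.00")

old_choice = pick_old(history, new_amount)   # the previous paycheck
new_choice = pick_new(history, new_amount)   # the 18-month-old paycheck
```

With no date limit, a one-cent advantage is enough for the oldest
transaction to win under the new rule.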
> > >
> > > > Brendan now asks to limit this search further by adding a date filter
> > > > which should be configurable on a per payee basis.
> > OK, so I have no objection to these changes.  As far as I can tell,
> > they will have no effect on the problem I've mentioned, but for now,
> > the only thing I can think of that would help me is a setting to never
> > match to a split transaction, or perhaps to only use the category with
> > the largest split amount from the previous transaction, and I suspect
> > that would not be a good rule in general (even as an optional setting.)
> > >
> > > That seems to be doable with an addition to the payee editor and a
> > > new storage attribute.
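
A per-payee date filter of the kind discussed here might look roughly
like the following Python sketch. Everything in it is an assumption
made for illustration: the Past record, the pick_within() function and
the max_age_days parameter (standing in for the proposed new storage
attribute) are hypothetical, and returning None when nothing falls
inside the window is just one possible choice that the thread does not
settle.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal
from typing import List, Optional


@dataclass
class Past:
    posted: date
    amount: Decimal


def pick_within(history: List[Past], new_amount: Decimal, today: date,
                max_age_days: Optional[int]) -> Optional[Past]:
    # max_age_days stands in for the proposed per-payee attribute; None means no limit.
    if max_age_days is not None:
        cutoff = today - timedelta(days=max_age_days)
        history = [p for p in history if p.posted >= cutoff]
    if not history:
        return None  # nothing recent enough; the caller would need a fallback
    # Closest amount among the remaining candidates.
    return min(history, key=lambda p: abs(p.amount - new_amount))


# Invented data: one very old paycheck and last week's paycheck.
old = Past(date(2017, 12, 1), Decimal("1008.29"))
recent = Past(date(2019, 6, 14), Decimal("991.70"))
today = date(2019, 6, 21)

unlimited = pick_within([old, recent], Decimal("1000.00"), today, None)
limited = pick_within([old, recent], Decimal("1000.00"), today, 30)
```

Here the unlimited search picks the old paycheck, while a 30 day
window picks last week's.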
> > >
> > > Thomas
> > >
> > >
> > > > > Separately, I'm trying to think how I could use this for my problem,
> > > > > which is that I never (or very rarely) want to match a newly
> > > > > imported transaction to a split transaction, which seems to happen
> > > > > fairly often when the most recent transaction for the payee is split.
> > > >
> > > > On 2019.06.22 16:23, Brendan Coupe wrote:
> > > > > > If I understood Thomas correctly, matching is only looking at
> > > > > > existing transactions in the account. That works fine for me when I
> > > > > > duplicate the previous paycheck prior to importing the OFX file from
> > > > > > my bank. Not an ideal way to do this, but when I don't, it matches
> > > > > > the closest amount for that payee since the beginning of time.
> > > > > >
> > > > > > The paycheck has 15 splits so a single default account does not work.
> > > > > > Even if I could assign 15 default accounts I would have to update
> > > > > > them fairly often or they would become less and less useful.
> > > > > >
> > > > > > Basically what I am asking for is an option in the payee default
> > > > > > account settings that says pick the closest amount in the past xx
> > > > > > days and use that/those category(ies). That alone would eliminate
> > > > > > this weekly problem for me and probably many others that are less
> > > > > > frequent. The global settings and my original suggestion are
> > > > > > probably not needed if this setting was added for each payee.
> > > > >
> > > > > ----
> > > > > Brendan Coupe
> > > > >
> > > > > On Sat, Jun 22, 2019 at 2:07 PM Jack
> > > > > <ostroffjh at users.sourceforge.net> wrote:
> > > > > >
> > > > > > On 2019.06.22 11:51, Thomas Baumgart wrote:
> > > > > > > > On Saturday, 22 June 2019 17:16:45 CEST Brendan Coupe wrote:
> > > > > > >
> > > > > > > > > I see why my 30 day limit did not help. It does when I
> > > > > > > > > manually copy the most recent paycheck and then import the
> > > > > > > > > OFX data.
> > > > > > > > >
> > > > > > > > > I have an idea how to deal with this. In the Default Account
> > > > > > > > > tab for the payee there is a checkbox "Use the default
> > > > > > > > > category..." If checked you can select a single default
> > > > > > > > > category.
> > > > > > > > >
> > > > > > > > > How about making 4 radio buttons:
> > > > > > > > >
> > > > > > > > > - None
> > > > > > > > > - Most recent transaction
> > > > > > > > > - Closest amount
> > > > > > > > > - Use the default category... (enable the dropdown list when
> > > > > > > > >   selected)
> > > > > > >
> > > > > > > > How about a system wide setting with the above option set
> > > > > > > > (maybe without the last one) and a per payee override option?
> > > > > > > > Introduction of this feature would be done as follows:
> > > > > > > >
> > > > > > > > a) the system wide default setting is "closest amount" (which
> > > > > > > >    reflects today's default)
> > > > > > > > b) payees that don't have the category set will use the system
> > > > > > > >    wide setting
> > > > > > > > c) payees that have a default category set will override the
> > > > > > > >    system wide setting with the default category
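
The precedence in a) to c) can be written down as a small Python
sketch. This is only a reading of the proposal, not existing KMyMoney
behaviour or API: the AutoCategorize enum, the Payee class, the
SYSTEM_DEFAULT constant and the resolve() function are all hypothetical
names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AutoCategorize(Enum):
    NONE = "none"
    MOST_RECENT = "most recent transaction"
    CLOSEST_AMOUNT = "closest amount"


@dataclass
class Payee:
    name: str
    default_category: Optional[str] = None


# (a) the system wide default reflects today's behaviour
SYSTEM_DEFAULT = AutoCategorize.CLOSEST_AMOUNT


def resolve(payee: Payee, system_setting: AutoCategorize = SYSTEM_DEFAULT):
    # (c) a payee default category overrides the system wide setting
    if payee.default_category is not None:
        return ("default category", payee.default_category)
    # (b) otherwise the payee uses the system wide setting
    return ("system setting", system_setting)


without_default = resolve(Payee("Employer"))
with_default = resolve(Payee("Grocer", default_category="Expense:Food"))
```

Keeping the decision in one function would make it easy to see where a
further per account or per category level could be added later, if
that turns out to be wanted.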
> > > > > > > I THINK that sounds right, but I'm wondering what should be per
> > > > > > > account vs per payee vs per category.
> > > > > > >
> > > > > > > I may be overthinking it - but when looking for a transaction to
> > > > > > > "match," am I missing something, or do we still have a lack of
> > > > > > > clear terminology to distinguish finding the existing transaction
> > > > > > > to use as a "model" [again - not a formal term] for an imported
> > > > > > > transaction vs. what I think of as "true" matching - to find if
> > > > > > > the imported transaction is a duplicate of one already present?
> > > > > > > I hate to admit it, but I'm still not completely clear on the
> > > > > > > steps followed - first (assuming the imported transaction is not
> > > > > > > a duplicate) to find the best transaction to model (based on
> > > > > > > what) and then whether to use the payee and/or category of that
> > > > > > > transaction, or the default category of the assumed payee.  Just
> > > > > > > to add to the mix here, the problem I often face is for a payee
> > > > > > > which usually has transactions with a single category (marked
> > > > > > > default for that payee) I sometimes create split transactions -
> > > > > > > and it is almost always wrong to use one of these split
> > > > > > > transactions as the model for a newly imported transaction.  How
> > > > > > > might that fit into this process?
> > > > > >
> > > > > > >
> > > > > > > Does that make sense? Any objections anyone?
> > > > > > >
> > > > > > > Thomas
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > On Sat, Jun 22, 2019 at 4:25 AM Thomas Baumgart
> > > > > > > > > <thb at net-bembel.de> wrote:
> > > > > > > > > >
> > > > > > > > > > On Friday, 21 June 2019 22:55:29 CEST Brendan Coupe wrote:
> > > > > > > > >
> > > > > > > > > > > I'm running a week old build from the 5.0 branch on
> > > > > > > > > > > Fedora 29.
> > > > > > > > > > >
> > > > > > > > > > > When I download my savings account transaction using
> > > > > > > > > > > online banking the paycheck frequently matches with a
> > > > > > > > > > > very old paycheck. This results in the splits being way
> > > > > > > > > > > off.
> > > > > > > > > > >
> > > > > > > > > > > This happens when the amount of the new paycheck is not
> > > > > > > > > > > very close to the most recent paycheck which has been
> > > > > > > > > > > happening a lot lately due to reimbursed business
> > > > > > > > > > > expenses.
> > > > > > > > > > >
> > > > > > > > > > > On the import tab of the ledger settings I have tried
> > > > > > > > > > > setting "Match transaction within days" from 7 days
> > > > > > > > > > > (paycheck is weekly) to 30 days and the same thing
> > > > > > > > > > > happens. KMM is definitely matching transactions that
> > > > > > > > > > > are much more than 30 days old. In fact the transaction
> > > > > > > > > > > that it matched was only $0.01 closer to the new
> > > > > > > > > > > transaction than the previous paycheck (difference was
> > > > > > > > > > > $8.29 versus $8.30). The transaction it matched is over
> > > > > > > > > > > 18 months old. It appears to be ignoring the "Match
> > > > > > > > > > > transaction within days" setting. It's simply matching
> > > > > > > > > > > the transaction from the same payee that is closest in
> > > > > > > > > > > value.
> > > > > > > > > > >
> > > > > > > > > > > I'm pretty sure this is fairly new behavior but I'm not
> > > > > > > > > > > sure if it started with the initial version of KMM5 that
> > > > > > > > > > > I used or more recently.
> > > > > > > > >
> > > > > > > > > > This probably goes back to a change I made in January this
> > > > > > > > > > year:
> > > > > > > > > >
> > > > > > > > > > https://cgit.kde.org/kmymoney.git/commit/?id=447213e04d6e7ab9022caeb5c258800625036967
> > > > > > > > > >
> > > > > > > > > > which added the part of choosing an ancient transaction
> > > > > > > > > > based on the smallest difference in amount whereas before
> > > > > > > > > > it only used old transactions if the amount was identical.
> > > > > > > > > >
> > > > > > > > > > Here's what I found in the code (which perfectly explains
> > > > > > > > > > what you encounter):
> > > > > > > > >
> > > > > > > > > > In case the payee name has been found, the following will
> > > > > > > > > > take place:
> > > > > > > > > >
> > > > > > > > > >       // Fill in other side of the transaction (category/etc) based on payee
> > > > > > > > > >       //
> > > > > > > > > >                 // [...]
> > > > > > > > > >                 //
> > > > > > > > > >       // We'll search for the most recent transaction in this account with
> > > > > > > > > >       // this payee.  If this reference transaction is a simple 2-split
> > > > > > > > > >       // transaction, it's simple.  If it's a complex split, and the amounts
> > > > > > > > > >       // are different, we have a problem.  Somehow we have to balance the
> > > > > > > > > >       // transaction.  For now, we'll leave it unbalanced, and let the user
> > > > > > > > > >       // handle it.
> > > > > > > > >
> > > > > > > > > > For the category to be found, the first thing is to check
> > > > > > > > > > if the payee has a default category assigned. If yes, it
> > > > > > > > > > is taken and we're done. If not, all transactions for that
> > > > > > > > > > payee in the account will be searched backwards. Note: no
> > > > > > > > > > date filtering here, which certainly is the cause of the
> > > > > > > > > > behavior you encounter. The algorithm then works as
> > > > > > > > > > follows:
> > > > > > > > >
> > > > > > > > > >           // if there is more than one matching transaction, try to be a little
> > > > > > > > > >           // smart about which one we use.  we scan them all and check if
> > > > > > > > > >           // we find an exact match or use the one with the closest value
> > > > > > > > >
> > > > > > > > > > The scan works backwards with the last one being the
> > > > > > > > > > default. So we have at least one transaction for that
> > > > > > > > > > payee, and in case of multiple the one with the least
> > > > > > > > > > difference in amount will be selected. Then we continue
> > > > > > > > > > with:
> > > > > > > > >
> > > > > > > > > >                 // in case the old transaction has two splits
> > > > > > > > > >                 // we simply inverse the amount of the current
> > > > > > > > > >                 // transaction found in s1. In other cases (more
> > > > > > > > > >                 // than two splits) we copy all splits and don't
> > > > > > > > > >                 // modify the splits. This may lead to unbalanced
> > > > > > > > > >                 // transactions which the user has to fix manually
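
The two cases named in that source comment can be illustrated with a
short Python sketch. It is a simplified reading of the comment, not the
KMyMoney code: the categorize() and is_balanced() functions, the tuple
representation of a split and all names and amounts are hypothetical,
and the reference transaction's first split is assumed to be the
account side.

```python
from decimal import Decimal
from typing import List, Tuple

Split = Tuple[str, Decimal]  # (account or category, amount)


def categorize(imported: Split, reference: List[Split]) -> List[Split]:
    # reference[0] is taken to be the account-side split of the old transaction.
    others = reference[1:]
    if len(reference) == 2:
        # Two splits: reuse the category with the imported amount inverted.
        return [imported, (others[0][0], -imported[1])]
    # More than two splits: copy the remaining splits unchanged.
    return [imported] + list(others)


def is_balanced(splits: List[Split]) -> bool:
    return sum((amount for _, amount in splits), Decimal("0")) == 0


imported = ("Checking", Decimal("905.00"))

simple_ref = [("Checking", Decimal("900.00")),
              ("Income:Salary", Decimal("-900.00"))]
complex_ref = [("Checking", Decimal("900.00")),
               ("Income:Salary", Decimal("-1000.00")),
               ("Expense:Tax", Decimal("100.00"))]

from_simple = categorize(imported, simple_ref)    # balanced
from_complex = categorize(imported, complex_ref)  # off by 5.00, left for the user
```

The second result is the unbalanced case the comment warns about: the
copied splits still add up to the old amount, not the imported one.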
> > > > > > > > >
> > > > > > > > > > The point is that we are not talking about 'matching' at
> > > > > > > > > > this point but automatic categorization of the imported
> > > > > > > > > > transaction. Matching happens in the next step when
> > > > > > > > > > KMyMoney tries to figure out if you already have the said
> > > > > > > > > > transaction on file (entered manually for example). And it
> > > > > > > > > > is for that matching that the interval is used, but not
> > > > > > > > > > the automatic categorization happening in the step before.
> > > > > > > > > > Matching actually means merging two transactions (the one
> > > > > > > > > > on file and the imported one) into a single one. This is
> > > > > > > > > > not what is happening for you and what you certainly don't
> > > > > > > > > > want with older transactions.
> > > > > > > > >
> > > > > > > > > > I am not sure at this point what happens if I increase the
> > > > > > > > > > matching period beyond one month and another salary
> > > > > > > > > > payment comes in and it matches. It is certainly not
> > > > > > > > > > detected as a duplicate but does it match the
> > > > > > > > > > transactions? I honestly don't know and have never tried.
> > > > > > > > >
> > > > > > > > > > Why did I implement the feature as it is: I receive two
> > > > > > > > > > payments with very different amounts from the same payee
> > > > > > > > > > each month and they differ in categories. One of the
> > > > > > > > > > amounts varies each month and the other one is fixed (we
> > > > > > > > > > talk salary and reimbursement here as well, but I receive
> > > > > > > > > > them in two payments). The old behavior was always wrong,
> > > > > > > > > > because taking the last payment from that payee as
> > > > > > > > > > categorization base is certainly false and only worked
> > > > > > > > > > when there was no reimbursement (which means I received
> > > > > > > > > > two salary payments in a row). So for me, a matching
> > > > > > > > > > period of a few days is OK, but for the categorization I
> > > > > > > > > > probably need a few months. The default to take the last
> > > > > > > > > > one on file if nothing else was found is probably a good
> > > > > > > > > > decision.
> > > > > > > > >
> > > > > > > > > > Would a new setting to limit the search for transactions
> > > > > > > > > > to do the auto categorization help here? What would best
> > > > > > > > > > describe it and what would be a neat name for it?
> > > > > > > > >
> > > > > > > > > Any ideas, anyone?
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Regards
> > > > > > >
> > > > > > > Thomas Baumgart
> > > > > > >
> > > > > > > https://www.signal.org/       Signal, the better WhatsApp
> > > > > > >  
> -------------------------------------------------------------
> > > > > > > A: Because it destroys the flow of the conversation
> > > > > > > Q: Why is top-posting bad?
> > > > > > > A: Top-posting
> > > > > > > Q: What is the most annoying thing in e-mail?
> > > > > > >  
> -------------------------------------------------------------
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > > --
> > >
> > > Regards
> > >
> > > Thomas Baumgart
> > >
> > > https://www.signal.org/       Signal, the better WhatsApp
> > > -------------------------------------------------------------
> > > 'Good code is not created, it evolves.'
> > > -- George Anzinger
> > > -------------------------------------------------------------
> > >
> >
> 


