[Kmymoney-devel] csv file that fails to import (see comments from the review request)

Mon Nov 28 12:40:01 UTC 2011

On 26/11/11 13:58, Cristian Oneț wrote:
> On Saturday 26 November 2011 14:58:13 Cristian Oneț wrote:
>> On Saturday 26 November 2011 14:44:43 you wrote:
>>>>> One other thing. Importing the same csv twice should not add already
>>>>> imported transactions as new transactions but I have to check why that
>>>>> is not working at the moment.
>>>>
>>>> That seems to be the case for Banking transactions, but investment
>>>> transactions being re-imported do get matched.  Don't know why that
>>>> happens.
>>>
>>> That's because banking transactions need a correct bankid. If you take a
>>> look at CSVDialog.cpp:781 you can see that you set the bankid to a
>>> random number. I think there should be a way to extract a unique ID from
>>> a CSV record, just like the date and the amount are extracted. In my
>>> case, for example, I would need to run a regexp on the details field to
>>> obtain a unique bank id.
>>
>> If the id is not available it would be better to do something like in
>> mymoneyqifreader.cpp:1028 instead of a random number.
>
> The attached patch uses the same implementation as the qif importer without
> the need to implement any id extraction logic. It only works for transactions
> that are imported after the patch is applied. If it seems OK to you too please
> also include this in your commit of the reviewboard patch.
>

| Thanks Cristian.

| Sadly, it doesn't produce the desired effect - it finds a match on
| the payee, but does not report "Detected as duplicate", so a bit more
| research is needed first.  If I import the same qif file twice, it
| finds the duplicates, but not with the csv version.
__________________________________________________________________
| It works in my tests but as I've told you it will only work on
| transactions imported after the patch is applied. So let's say you've
| imported and matched transaction A without the patch then you import
| transaction A with the patch - that will not work since transaction A
| already has the bankid set by the old version so the duplicate will
| not be detected.

| To test the patch you need to import and match new transactions
| several times and duplicates should be detected then.

| Regards,

| Cristian Oneț
______________________________________________________________________

| Yes, but I did the same for both qif and csv.  I had an empty file,
| patch already applied, imported the csv, saved the file then
| reimported the csv again.

| Allan
______________________________________________________________________

I've spent ages looking at this and it definitely does not work for me.
I've compared what happens in the qif importer with the same in the csv 
importer and I can see why it doesn't work.

1) Import the same file, containing just one transaction, in each of the
two formats.
    QIF -  a hash is generated with an idx suffix of '-1'.
    CSV -  exactly the same result, with an idx suffix of '-1'.
2) Import the same files a second time.
    QIF -  exactly the same sequence and the same idx suffix of '-1'.
    CSV -  the same hash with an idx suffix of '-1' is created, but this
           time the hash is discovered in the hashmap, so the idx is
           incremented and the for loop is re-entered, producing another
           hash, but with the suffix of '-2'.  So, the hashes are
           different and no match occurs.

So, with the qif import, it seems that the hashmap has been cleared,
although not explicitly, as far as I can see.  Then it occurred to me
that for the second pass with the QIF file, the file has had to be
reselected, whereas that's not necessary with the CSV file.  So it
looks as though reselecting the file is what clears the hashmap before
the second qif import.

I therefore cleared the hashmap in the csv file selector method, selected
the file again, and now the second import gets matched.
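
To make the behaviour above concrete, here is a minimal sketch in plain
C++.  It is not the KMyMoney code: the class and method names are
hypothetical, std::hash stands in for whatever hash the importers really
use, and a std::set stands in for the hashmap.  It only illustrates how a
hashmap that survives between imports pushes the suffix to '-2', and how
clearing it at the start of each import brings back '-1'.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for the importer; none of these names are from KMyMoney.
class Importer {
public:
    // Builds "<hash>-<idx>", incrementing idx until the id is not yet in the set.
    std::string makeBankId(const std::string& record) {
        // Assumption: std::hash replaces the real hash function.
        const std::string hash = std::to_string(std::hash<std::string>{}(record));
        int idx = 1;
        std::string id;
        for (;;) {
            id = hash + "-" + std::to_string(idx);
            if (m_seen.find(id) == m_seen.end())
                break;          // unused id found
            ++idx;              // already known: try the next suffix
        }
        m_seen.insert(id);
        return id;
    }

    // What clearing the hashmap in the file selector amounts to.
    void startNewImport() { m_seen.clear(); }

private:
    std::set<std::string> m_seen;  // ids handed out since the last clear
};
```

Without a call to startNewImport(), importing the same record twice gives
ids ending in '-1' and then '-2', so they never compare equal.  With the
call between the two imports, both end in '-1'.  Within a single import
the set is still populated, so two identical records in one file would
still get different suffixes in this sketch.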

However, I'm not clear whether clearing the hashmap before each import
defeats its purpose.  Otherwise, I cannot see how the hashmap routine
can work.  I have to admit, though, that I don't yet understand the
significance of the 'd' pointer in all this.

Help!

Allan