<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <div class="moz-cite-prefix">I started this yesterday, and I know

      there have been additional posts since, but I think this

      particular point hasn't been resolved.<br>

    </div>

    <div class="moz-cite-prefix"><br>

    </div>

    <div class="moz-cite-prefix">On 12/30/20 8:59 PM, <a

        class="moz-txt-link-abbreviated"

        href="mailto:pjfarley3@earthlink.net">pjfarley3@earthlink.net</a>

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:000901d6df18$9749e5a0$c5ddb0e0$@earthlink.net">


      <style>@font-face

        {font-family:Helvetica;

        panose-1:2 11 6 4 2 2 2 2 2 4;}@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        font-size:11.0pt;

        font-family:"Calibri",sans-serif;}span.EmailStyle18

        {mso-style-type:personal-reply;

        font-family:"Calibri",sans-serif;

        color:windowtext;}.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}div.WordSection1

        {page:WordSection1;}</style>

      <div class="WordSection1">In my experience pdftotext does not

        “overflow lines”.  That is probably “extra information” (i.e.,

        “Memo” field data) related to the transaction on the previous

        line.  That is quite common in bank statements.  You have to

        expect such lines and be prepared to attach them  to the prior

        transaction.   I do it as the “Memo” field in my output. </div>

    </blockquote>

    Aaron would have to confirm, but I suspect he refers to a case where
    a single table row as shown in the PDF has two rows of text in each
    cell, because there is just too much text for one line.  Because PDF
    knows only where exactly on the page any text is, but not why it is
    there (it has no information about things like tables), the text
    output would have two lines.  The first would have the first line of
    text from each cell, and the second would have the second line of text

    from each cell.  Putting them back together is theoretically
    possible, but only if there is some way to know that the second line
    is not a new row (missing header info?), or if it is done as part of
    a manually controlled cleanup phase of the conversion.
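    <p>To make the reassembly idea concrete, here is a minimal Python
    sketch, not a tested converter.  Everything specific in it is an
    assumption for illustration: the column offsets in COLUMN_STARTS, the
    rule that a real row starts with a date like 12/30, and the sample
    layout.  A line that does not start with a date is treated as the
    second line of the cells in the row above and is joined cell by
    cell.</p>
    <pre>
```python
import re

# Hypothetical layout: character offsets where each column starts.
COLUMN_STARTS = [0, 12, 40]
# Assumption: a real table row begins with a date such as 12/30.
ROW_START = re.compile(r"^\d{2}/\d{2}\b")


def split_cells(line):
    """Cut one fixed-width text line into stripped cell strings."""
    ends = COLUMN_STARTS[1:] + [None]
    return [line[a:b].strip() for a, b in zip(COLUMN_STARTS, ends)]


def merge_wrapped_rows(lines):
    """Fold continuation lines into the row above them, cell by cell."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        cells = split_cells(line)
        if ROW_START.match(line) or not rows:
            rows.append(cells)  # looks like a new row
        else:
            # No date: treat as the second line of each cell above.
            rows[-1] = [" ".join(part for part in (old, new) if part)
                        for old, new in zip(rows[-1], cells)]
    return rows
```
    </pre>
    <p>If the statement has rows that legitimately lack a date, this
    rule will merge them wrongly, which is exactly the "some way to
    know" problem; a manual review pass would still be needed
    there.</p>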


  </body>

</html>