Review Request 129557: [okular] Enable searching for a phrase split by a newline character in a PDF

Marduk Bolanos mardukbp at mac.com
Thu Jan 5 10:21:04 UTC 2017



> On Jan. 4, 2017, 10:13 p.m., Albert Astals Cid wrote:
> > home/marduk/textpage.cpp, line 57
> > <https://git.reviewboard.kde.org/r/129557/diff/1/?file=486539#file486539line57>
> >
> >     so
> >       from = "Hola\n"
> >       to = "Adios "
> >     returns true?
> 
> Marduk Bolanos wrote:
>     The function does not receive words but characters. The comparison between the text in the PDF and the search query is performed char by char. `from` is a char in the PDF and `to` is a char in the search query. Therefore, the actual behaviour is:
>     
>     ```
>     from = "\n"
>     to = " "
>     returns true
>     ```
>     
>     As a result, the phrase "hola\nadios" in the PDF generates a match when the search query is "hola adios".
> 
> Albert Astals Cid wrote:
>     This is how the function is used *now*, but nothing in the function itself forces only characters to be sent, since it takes a QString. So no, your line of reasoning doesn't work: the function totally fails with the input I gave, and that needs fixing.

Ok. I modified the comparison to make it more precise. Let's consider the last two words of a line in the PDF and the first word of the next line:

```
given the
effort
```

The search query is `the effort`.

What actually happens is that, after `th` matches, `e\n` (from the PDF) is compared against `e` followed by a space (from the search query), and that also matches.
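
To make the intended behaviour concrete, here is a minimal sketch of such a comparison. It is not the code from the patch: the function name `matchesAcrossLineBreak` is hypothetical, and it uses `std::string` instead of `QString` so that it stands alone. It assumes the two chunks must have the same length and that only a newline on the PDF side may stand in for a space on the query side.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not the function in textpage.cpp): returns true when a
// chunk of PDF text equals a chunk of the search query, treating a newline in
// the PDF text as equivalent to a space in the query.
static bool matchesAcrossLineBreak(const std::string &pdfText, const std::string &query)
{
    // Chunks of different length cannot match character by character.
    if (pdfText.size() != query.size())
        return false;

    for (std::size_t i = 0; i < pdfText.size(); ++i) {
        const char from = pdfText[i]; // character in the PDF
        const char to = query[i];     // character in the search query
        if (from == to)
            continue;
        // The only tolerated difference: newline in the PDF, space in the query.
        if (from == '\n' && to == ' ')
            continue;
        return false;
    }
    return true;
}
```

With this shape, `e\n` against `e` plus a space matches, while `Hola\n` against `Adios ` does not, because every position is checked rather than only the trailing character.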


- Marduk


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/129557/#review101805
-----------------------------------------------------------


On Jan. 5, 2017, 10:20 a.m., Marduk Bolanos wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/129557/
> -----------------------------------------------------------
> 
> (Updated Jan. 5, 2017, 10:20 a.m.)
> 
> 
> Review request for Okular and Oliver Sander.
> 
> 
> Repository: okular
> 
> 
> Description
> -------
> 
> A blank space in the query is matched against a newline character in the PDF.
> 
> 
> Diffs
> -----
> 
>   okular/core/textpage.cpp 44dfa14 
> 
> Diff: https://git.reviewboard.kde.org/r/129557/diff/
> 
> 
> Testing
> -------
> 
> Tried a few PDF files. It works.
> 
> 
> Thanks,
> 
> Marduk Bolanos
> 
>
