[okular] [Bug 458516] New: Spaces in content filenames causes a second copy of the book's content to be shown; TOC points to the second copy.

Duane bugzilla_noreply at kde.org
Tue Aug 30 15:46:01 BST 2022


https://bugs.kde.org/show_bug.cgi?id=458516

            Bug ID: 458516
           Summary: Spaces in content filenames causes a second copy of
                    the book's content to be shown; TOC points to the
                    second copy.
           Product: okular
           Version: 22.08.0
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: EPub backend
          Assignee: okular-devel at kde.org
          Reporter: duane-tech at evenson.ca
  Target Milestone: ---

Created attachment 151708
  --> https://bugs.kde.org/attachment.cgi?id=151708&action=edit
source html file

SUMMARY
An EPUB file whose HTML content filenames (inside the EPUB zip archive) contain
spaces is displayed with its content doubled, and the TOC points to the second
copy.

STEPS TO REPRODUCE
1. create epub file with spaces in component filenames
  1.1. create html file: "te st.html" (with space)
    nano "te st.html"
<html><body>
    <h1>Chapter 1</h1>
    <h1>Chapter 2</h1>
    <h1>Chapter 3</h1>
    <h1>Chapter 4</h1>
    <h1>Chapter 5</h1>
</body></html>

  1.2. convert to epub file (with spaces in component filenames)
    ebook-convert "te st".{html,epub}
  1.3. review contents
    unzip -l "te st.epub"
2. view with okular
  okular "te st.epub"
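
The check in step 1.3 can also be scripted. The following is a minimal sketch
using only Python's standard zipfile module; the helper name
members_with_spaces is hypothetical and not part of okular or calibre.

```python
import zipfile


def members_with_spaces(epub_path):
    """Return the names of archive members whose path contains a space."""
    with zipfile.ZipFile(epub_path) as zf:
        return [name for name in zf.namelist() if " " in name]


if __name__ == "__main__":
    import sys

    # Usage: python check_spaces.py "te st.epub"
    for name in members_with_spaces(sys.argv[1]):
        print(name)
```

An empty result would mean the converter did not carry the space into the
component filenames, in which case the file is not a valid test case for this
report.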

OBSERVED RESULT
The reader will show Title Page, Chapter 1, Chapter 2, Chapter 3, Chapter 4,
Chapter 5, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5.
The table of contents will point to the second occurrence so Chapter 1 will be
on page 7.

EXPECTED RESULT
The reader should show Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5.
The TOC should place Chapter 1 on page 2.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
5.19.4-arch1-1 x86_64 GNU/Linux
Window Manager:
jwm 2.3.7-3

ADDITIONAL INFORMATION
Manually removing the spaces from the component file names (te st_split_000.html,
etc.) and editing content.opf and toc.ncx to remove the space and %20 from the
references corrects the problem.
ebook-viewer does not share this problem.
There is no doubling of content references in either content.opf or toc.ncx.
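
The manual workaround described above could be automated roughly as follows.
This is only a sketch under several assumptions: the function name
strip_spaces_from_epub is hypothetical, content.opf and toc.ncx are assumed to
be UTF-8, only the file-name part of each member is renamed (not directories),
links between the HTML files themselves are not rewritten, and name collisions
(for example "test.html" next to "te st.html") are not handled.

```python
import zipfile
from urllib.parse import quote


def strip_spaces_from_epub(src, dst):
    """Copy src to dst, dropping spaces from member file names and from
    the references to them in .opf and .ncx files."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Map every base name containing a space to its space-free form.
        renames = {}
        for name in zin.namelist():
            base = name.rsplit("/", 1)[-1]
            if " " in base:
                renames[base] = base.replace(" ", "")

        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.endswith((".opf", ".ncx")):
                text = data.decode("utf-8")
                for old, new in renames.items():
                    # References may be written as "te%20st..." or "te st...".
                    text = text.replace(quote(old), new).replace(old, new)
                data = text.encode("utf-8")

            # Rename only the base name; keep the directory part as-is.
            head, sep, base = info.filename.rpartition("/")
            new_name = head + sep + renames.get(base, base)
            # Preserve each entry's compression (mimetype stays stored).
            zout.writestr(new_name, data, compress_type=info.compress_type)
```

For example, strip_spaces_from_epub("te st.epub", "test.epub") would write a
copy that can then be opened in okular to compare against the original.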

Playing around with it:  If I have:
test__split_000.html
te st__split_001.html
te st__split_002.html
test__split_003.html
te st__split_004.html
and edit content.opf to have lines:
    <item id="html5" href="test_split_000.html" media-type="application/xhtml+xml"/>
    <item id="html4" href="te st_split_001.html" media-type="application/xhtml+xml"/>
    <item id="html3" href="te st_split_002.html" media-type="application/xhtml+xml"/>
    <item id="html2" href="test_split_003.html" media-type="application/xhtml+xml"/>
    <item id="html1" href="te st_split_004.html" media-type="application/xhtml+xml"/>
and edit toc.ncx, changing lines:
      <content src="te%20st_split_000.html"/>
...
      <content src="te%20st_split_003.html"/>
changed to:
      <content src="test_split_000.html"/>
...
      <content src="test_split_003.html"/>

The book shows Title Page, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter
5, Chapter 2, Chapter 3, Chapter 5.
The second copies of Chapters 1 and 4 are missing.
The TOC shows Chapters 1-5 pointing to pages 2, 7, 8, 4, 9.
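
To see how the manifest and the TOC each refer to the component files after an
edit like this (and to confirm that neither lists a file twice), the references
can be tallied after percent-decoding. This is a sketch only: the function name
decoded_references is hypothetical, it needs Python 3.8+ for the "{*}" namespace
wildcard, and it makes no claim about how okular's EPub backend resolves these
references.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import unquote, urldefrag


def decoded_references(epub_path):
    """Return (manifest, toc) Counters of percent-decoded file references."""
    manifest, toc = Counter(), Counter()
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.endswith(".opf"):
                root = ET.fromstring(zf.read(name))
                # <item href="..."> entries of the OPF manifest.
                for item in root.findall(".//{*}item"):
                    manifest[unquote(item.get("href", ""))] += 1
            elif name.endswith(".ncx"):
                root = ET.fromstring(zf.read(name))
                # <content src="..."> targets of the NCX navigation points.
                for content in root.findall(".//{*}content"):
                    target, _fragment = urldefrag(content.get("src", ""))
                    toc[unquote(target)] += 1
    return manifest, toc
```

Any key with a count above 1 in the manifest would indicate a file listed more
than once there; a key containing a space shows that the reference decodes to a
name with a space, whether it was written with a literal space or as %20.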

