KHTML Paged Media - Status Report
Allan Sandfeld Jensen
kde at carewolf.com
Sat Aug 20 12:31:57 CEST 2005
Hi
I've noticed a few status reports here, and would just like to add mine.
GROUND-WORK:
I started the project with a thorough reading of the spec and some small
experiments to figure out what was possible.
First, it should be noted that the current implementation of pagination in
KHTML is, well... wrong. It paints the document unpaged, and then just tries
to choose good places to cut. This leads to many poor cuts, and means that
forced page-breaks (such as page-break-after: always) make a full premature
break across the whole page.
The first step in fixing printing in KHTML was rewriting all the page-break
logic and moving page-break decisions from painting time to layout time. This
makes it possible to move multiple blocks, each by a different distance.
I considered a wide range of implementations, with the first goal of
paginating the entire document during one layout. It turned out there were
many problems with such a solution. I wrote a little framework for it, but
ultimately abandoned the idea, at least until I have layout/page-breaking of
one page at a time working.
Before messing too much with the block layout, I first ported a major
clean-up from WebCore, making sure our code bases were as similar as possible
so that my project would not increase the split.
IMPLEMENTATION:
I then removed all the old truncation code and put page-break decisions into
the layout of blocks. I've written two different page-break logics: one for
blocks containing block children and one for blocks containing inline
children.
The block-children code is simple. It assumes it is the responsibility of the
children to split themselves, setting a flag if they succeed. If the parent
discovers a child that crosses a page-break but has not set the flag, it will
attempt to move the child below the page-break. It will do the same for
blocks that have a CSS forced page-break set.
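
As a rough illustration of the parent's side of this (not the actual KHTML
code), here is a minimal sketch. The Child struct and paginateChildren() are
made-up names for the example, positions are plain integers, and only a
forced break *before* a child is modelled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified stand-in for a laid-out child block.
struct Child {
    int top;                // y position of the top edge, in px
    int height;             // height in px
    bool splitItself;       // the child handled the page-break internally
    bool forcedBreakBefore; // e.g. page-break-before: always
};

// Walks the children in document order and pushes a child to the top of the
// next page when it crosses a page boundary without having split itself, or
// when it carries a forced break. Later siblings are shifted by the same
// accumulated amount.
void paginateChildren(std::vector<Child>& children, int pageHeight) {
    assert(pageHeight > 0);
    int shift = 0; // total distance earlier moves have pushed later siblings
    for (Child& c : children) {
        c.top += shift;
        const int pageBottom = (c.top / pageHeight + 1) * pageHeight;
        const bool atPageTop = c.top % pageHeight == 0;
        const bool crosses = c.top + c.height > pageBottom;
        // A child already at the top of a page is never moved; moving it
        // again could not help (it may simply be taller than a page).
        const bool mustMove =
            !atPageTop && (c.forcedBreakBefore || (crosses && !c.splitItself));
        if (mustMove) {
            const int delta = pageBottom - c.top;
            c.top += delta;
            shift += delta;
        }
    }
}
```

Because each child gets its own delta, several blocks can end up moved by
different distances, which the old cut-at-paint-time approach could not do.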
To handle page-break-*: avoid, I decided to introduce new anonymous blocks
that contain runs of children that "avoid" page-breaks between them, and to
set page-break-inside: avoid on those blocks. With this being done before
layout, the only CSS the layout has to handle is page-break-after/before:
always and page-break-inside: avoid. This means it can be done progressively.
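
The grouping step could look roughly like the sketch below. It is only an
illustration: Break, Block and groupAvoidRuns() are invented for the example,
and letting a forced break win over "avoid" is my assumption here, not
something settled above.

```cpp
#include <cstddef>
#include <vector>

enum class Break { Auto, Avoid, Always };

// Hypothetical view of a child: only its page-break-before/after values.
struct Block {
    Break before;
    Break after;
};

// True when a page-break between two adjacent siblings should be avoided.
// Assumption: a forced break on either side overrides "avoid".
bool avoidsBreakBetween(const Block& a, const Block& b) {
    const bool forced = a.after == Break::Always || b.before == Break::Always;
    const bool avoid = a.after == Break::Avoid || b.before == Break::Avoid;
    return avoid && !forced;
}

// Partitions the children into runs of indices. Every run with more than one
// member would be wrapped in an anonymous block that gets
// page-break-inside: avoid; single-member runs need no wrapper.
std::vector<std::vector<std::size_t>> groupAvoidRuns(const std::vector<Block>& kids) {
    std::vector<std::vector<std::size_t>> runs;
    for (std::size_t i = 0; i < kids.size(); ++i) {
        const bool joinPrevious = i > 0 && avoidsBreakBetween(kids[i - 1], kids[i]);
        if (joinPrevious)
            runs.back().push_back(i);
        else
            runs.push_back(std::vector<std::size_t>{i});
    }
    return runs;
}
```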
Page-breaking inline children is, in its simple form, done in much the same
way, except that lines are always assumed not to break themselves, and are
always moved clear of the page-break if they cross it. Handling the CSS
orphans and widows properties turned out to make it harder, though. A
violation of orphans is simple to handle: just don't break, and let the
parent move the block across the page-break. A violation of widows is much
harder. Since we lay out one line at a time, it is impossible to know, when
we encounter a page-break, whether we will later violate widows. The solution
so far has been to postpone the widows check until all lines have been laid
out. If there is a violation, I then set a hint for which line _should_ have
been broken, and redo the layout.
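
To make the decision concrete, here is a small sketch that models it on line
counts alone. chooseBreakLine() and its parameters are invented for the
example; real line boxes have varying heights, and the "redo the layout" step
is collapsed here into simply returning the hinted line.

```cpp
#include <cassert>

// Returns the index of the first line that goes on the next page.
// 0 means "do not break, let the parent move the whole block";
// lineCount means "no break needed".
// linesThatFit is how many lines fit before the page-break (an assumption of
// this sketch; the real code discovers it one line at a time).
int chooseBreakLine(int lineCount, int linesThatFit, int orphans, int widows) {
    assert(lineCount >= 0 && linesThatFit >= 0);
    if (linesThatFit >= lineCount)
        return lineCount; // everything fits on the current page

    int breakAt = linesThatFit; // first pass: break where the page runs out

    // Orphans: too few lines would be left before the break.
    if (breakAt < orphans)
        return 0;

    // Widows: only checkable once the total number of lines is known.
    if (lineCount - breakAt < widows) {
        const int hint = lineCount - widows; // line that should have broken
        if (hint < orphans)
            return 0; // cannot satisfy both, keep the block together
        breakAt = hint; // the second layout pass would break here
    }
    return breakAt;
}
```

For example, with 10 lines, room for 9, and orphans = widows = 2, the first
pass would break after line 9 and leave a single widow, so the hint moves the
break up to line 8.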
With the new page-break code implemented, I've spent much of the remaining
time fixing corner cases and bugs in the implementation. At this point tables
are still generating many page-break bugs, and many objects are not cropped
correctly at page-breaks.
STANDARD TROUBLE:
Another issue I've been studying is the DPI question. Previously we have been
using 72 DPI. Since that is lower than most screen DPI, it generates larger
fonts than seen on the screen. It seems that for good rendering of most
webpages the DPI should be assumed constant (at 96). This has the consequence
of requiring a dppx (dots per px) factor, making the CSS px value an abstract
size.
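
The arithmetic behind this is small enough to show. The function names below
are made up for the example, and the 600 DPI printer is just an assumed
device.

```cpp
// CSS px are treated as an abstract unit fixed at 96 per inch.
const double kCssPxPerInch = 96.0;

// Device dots per CSS px for a device of the given resolution.
double dotsPerPx(double deviceDpi) {
    return deviceDpi / kCssPxPerInch;
}

// Converts a CSS px length to device dots.
double pxToDots(double px, double deviceDpi) {
    return px * dotsPerPx(deviceDpi);
}
```

On a 600 DPI printer this gives 6.25 dots per px, so a 16px font is drawn 100
dots high, i.e. 1/6 inch, the same physical size it has on a 96 DPI screen.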
The last, ironic, standards problem is websites importing their style-sheets
with a "screen" media selector. This means the style-sheet doesn't apply to
"print" media, basically producing an unstyled webpage. It appears that, to
support such broken websites, a "screen" media selector should be treated as
"all".
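
In code, the workaround amounts to something like the sketch below.
mediaListMatches() and the screenAsAll switch are hypothetical names for the
example; the handling of an empty media list is left to the caller.

```cpp
#include <string>
#include <vector>

// Decides whether a style-sheet with the given explicit media list applies
// to the target medium (for example "print").
// screenAsAll enables the compatibility behaviour described above: a
// "screen" entry is treated as if it were "all".
bool mediaListMatches(const std::vector<std::string>& mediaList,
                      const std::string& target,
                      bool screenAsAll) {
    for (const std::string& medium : mediaList) {
        if (medium == "all" || medium == target)
            return true;
        if (screenAsAll && medium == "screen")
            return true;
    }
    return false;
}
```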
I've decided to extend CSS 2.1 as well. For presentations and other non-print
paged media it is just not good enough to use a "@media print" selector. I am
experimenting with adding media groups as valid selectors, making "@media
paged" and "@media static" possible.
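
A sketch of how such group selectors could be matched follows. The names are
invented for the example, and the table is deliberately partial: it covers
only three media types, using the groups CSS 2.1 assigns to them.

```cpp
#include <string>

// Which media groups a media type belongs to (only the two axes used here).
struct MediaGroups {
    bool paged;
    bool continuous;
    bool isStatic;
    bool interactive;
};

// Partial lookup table; returns false for types this sketch does not cover.
bool lookupGroups(const std::string& type, MediaGroups& out) {
    if (type == "print")      { out = {true,  false, true,  false}; return true; }
    if (type == "projection") { out = {true,  false, false, true};  return true; }
    if (type == "screen")     { out = {false, true,  true,  true};  return true; }
    return false;
}

// Matches an @media selector against the device's media type, accepting the
// group names "paged" and "static" in addition to ordinary media types.
bool selectorMatches(const std::string& selector, const std::string& deviceType) {
    if (selector == "all" || selector == deviceType)
        return true;
    MediaGroups groups;
    if (!lookupGroups(deviceType, groups))
        return false;
    if (selector == "paged")
        return groups.paged;
    if (selector == "static")
        return groups.isStatic;
    return false;
}
```

With this, "@media paged" would apply to both print and projection, which is
the point for presentations.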
WORK-IN-PROGRESS:
Besides bug fixes, I am working on parsing and applying page-context CSS
("@page"), and I need to look into how to handle FRAMESET documents and
documents with IFRAMEs.
`Allan