Escape sequences and new CSS parser?

Lars Knoll khtml-devel@kde.org
Mon, 24 Feb 2003 08:40:42 +0100


--Boundary-00=_6xcW+l4lmYjuZTB
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

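The attached patch to CSSParser::text() should fix this. It handles an
escape that is immediately followed by another backslash, and an escape
that runs to the end of the string.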
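
For anyone following along, here is a minimal standalone sketch of the
decoding behaviour the test case expects. It is not the KHTML code:
decodeCssEscapes(), hexValue() and isCssSpace() are hypothetical names, and
it uses std::u16string rather than the parser's unsigned short buffer. It
assumes the CSS 2 rule that a backslash followed by one to six hex digits
denotes that code point, with one optional trailing whitespace character
swallowed. Like the patch, it maps anything above 0xffff to U+FFFD.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: numeric value of a hex digit, or -1 if c is not one.
static int hexValue(char16_t c) {
    if (c >= u'0' && c <= u'9') return c - u'0';
    if (c >= u'a' && c <= u'f') return c - u'a' + 10;
    if (c >= u'A' && c <= u'F') return c - u'A' + 10;
    return -1;
}

// Hypothetical helper: whitespace that may terminate a hex escape.
static bool isCssSpace(char16_t c) {
    return c == u' ' || c == u'\t' || c == u'\r' || c == u'\n' || c == u'\f';
}

// Illustrative decoder only; the name and signature are assumptions.
std::u16string decodeCssEscapes(const std::u16string &in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        // Ordinary character, or a lone backslash at the very end: copy it.
        if (in[i] != u'\\' || i + 1 == in.size()) {
            out += in[i++];
            continue;
        }
        ++i; // skip the backslash
        if (hexValue(in[i]) < 0) {
            out += in[i++]; // "\x" with a non-hex x stands for x itself
            continue;
        }
        // Collect up to six hex digits. The loop also stops at the end of
        // the input, so an escape that ends the string is still emitted.
        unsigned long uc = 0;
        int digits = 0;
        while (i < in.size() && digits < 6 && hexValue(in[i]) >= 0) {
            uc = uc * 16 + static_cast<unsigned long>(hexValue(in[i]));
            ++i;
            ++digits;
        }
        if (uc > 0xffff)
            uc = 0xfffd; // cannot represent it in one 16-bit unit
        out += static_cast<char16_t>(uc);
        // One whitespace character (CR LF counts as one) ends the escape
        // and is swallowed.
        if (i < in.size()) {
            if (in[i] == u'\r' && i + 1 < in.size() && in[i + 1] == u'\n')
                i += 2;
            else if (isCssSpace(in[i]))
                ++i;
        }
    }
    return out;
}
```

With this sketch, the string from the test case below, \a0\a0\a0\a0\a0\a0,
should decode to six U+00A0 characters with no stray backslashes, including
the last escape, which runs to the end of the string.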

Cheers,
Lars


> The main CSS page on W3C (which worked in the old parser) is now hosed
> because of the escape characters used in conjunction with the content
> property. Did I miss something in the merge, or does the flex/bison
> combo have trouble with these?
>
> <html>
> <head>
> <style>
> p.date:before {
>    content: "\a0\a0\a0\a0\a0\a0";
> }
> </style>
> </head>
> <body>
>
> <p class="date">Hello</p>
> </body>
> </html>
>
> In the above example, I see what appear to be spaces, but I also see
> literal \ characters: one fewer than the number of "\a0" escapes I
> specify. I also get a line break (oddly) at the end.
>
> Any tips/pointers?  I see this ParseString object with an unsigned
> short* and a length field, but I'm totally at a loss as to how that
> gets filled in.
>
> dave
>
> _______________________________________________
> Khtml-devel@mail.kde.org
> http://mail.kde.org/mailman/listinfo/khtml-devel

--Boundary-00=_6xcW+l4lmYjuZTB
Content-Type: text/x-diff;
  charset="iso-8859-1";
  name="escapes.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="escapes.diff"

cvs -f log -N -r1.261 cssparser.cpp

RCS file: /home/kde/kdelibs/khtml/css/cssparser.cpp,v
Working file: cssparser.cpp
head: 1.261
branch:
locks: strict
access list:
keyword substitution: kv
total revisions: 296;	selected revisions: 1
description:
----------------------------
revision 1.261
date: 2003/02/24 07:46:06;  author: knoll;  state: Exp;  lines: +22 -5
fix parsing of escape sequences.
=============================================================================
cvs -f diff -bp -u -r1.260 -r1.261 cssparser.cpp
Index: cssparser.cpp
===================================================================
RCS file: /home/kde/kdelibs/khtml/css/cssparser.cpp,v
retrieving revision 1.260
retrieving revision 1.261
diff -b -p -u -r1.260 -r1.261
--- cssparser.cpp	24 Feb 2003 07:21:31 -0000	1.260
+++ cssparser.cpp	24 Feb 2003 07:46:06 -0000	1.261
@@ -3,7 +3,7 @@
  *
  * Copyright (C) 2003 Lars Knoll (knoll@kde.org)
  *
- * $Id: cssparser.cpp,v 1.260 2003/02/24 07:21:31 knoll Exp $
+ * $Id: cssparser.cpp,v 1.261 2003/02/24 07:46:06 knoll Exp $
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Library General Public
@@ -1674,10 +1674,6 @@ unsigned short *DOM::CSSParser::text(int
 
     for ( int i = 0; i < l; i++ ) {
 	unsigned short *current = start+i;
-	if ( !escape && *current == '\\' ) {
-	    escape = current;
-	    continue;
-	}
 	if ( escape == current - 1 ) {
 	    if ( ( *current >= '0' && *current <= '9' ) ||
 		 ( *current >= 'a' && *current <= 'f' ) ||
@@ -1721,6 +1717,7 @@ unsigned short *DOM::CSSParser::text(int
 	    if ( uc > 0xffff )
 		uc = 0xfffd;
 	    *(out++) = (unsigned short)uc;
+	    escape = 0;
 	    if ( *current == ' ' ||
 		 *current == '\t' ||
 		 *current == '\r' ||
@@ -1728,7 +1725,27 @@ unsigned short *DOM::CSSParser::text(int
 		 *current == '\f' )
 		continue;
 	}
+	if ( !escape && *current == '\\' ) {
+	    escape = current;
+	    continue;
+	}
 	*(out++) = *current;
+    }
+    if ( escape ) {
+	// add escaped char
+	int uc = 0;
+	escape++;
+	while ( escape < start+l ) {
+	    // 		qDebug("toHex( %c = %x", (char)*escape, toHex( *escape ) );
+	    uc *= 16;
+	    uc += toHex( *escape );
+	    escape++;
+	}
+	// 	    qDebug(" converting escape: string='%s', value=0x%x", QString( (QChar *)e, current-e ).latin1(), uc );
+	// can't handle chars outside ucs2
+	if ( uc > 0xffff )
+	    uc = 0xfffd;
+	*(out++) = (unsigned short)uc;
     }
 
     *length = out - start;

--Boundary-00=_6xcW+l4lmYjuZTB--