patch to fix bug with long hexadecimal entities

Darin Adler darin at apple.com
Tue Nov 18 03:56:50 CET 2003


I haven't looked at the latest version of htmltokenizer.cpp, so this may 
already be fixed, but I made this fix so that 8-character hexadecimal 
entities parse correctly.

-------------- next part --------------
Index: ChangeLog
===================================================================
RCS file: /local/home/cvs/Labyrinth/WebCore/ChangeLog,v
retrieving revision 1.2286
diff -p -u -u -p -r1.2286 ChangeLog
--- ChangeLog	2003/11/17 23:35:21	1.2286
+++ ChangeLog	2003/11/17 23:36:33
@@ -2,6 +2,15 @@
 
         Reviewed by John.
 
+        - fixed 3485925: Safari does not correctly parse eight-digit hex character entities
+
+        * khtml/html/htmltokenizer.cpp: (HTMLTokenizer::parseEntity): Replaced puzzling variable limit
+        on number of hexadecimal characters to parse with an 8-character limit.
+
+2003-11-17  Darin Adler  <darin at apple.com>
+
+        Reviewed by John.
+
         - fixed 3485572 -- secure form check in KHTML uses case-sensitive comparison with "https"
 
         * khtml/html/html_formimpl.h:
Index: khtml/html/htmltokenizer.cpp
===================================================================
RCS file: /local/home/cvs/Labyrinth/WebCore/khtml/html/htmltokenizer.cpp,v
retrieving revision 1.43
diff -p -u -u -p -r1.43 khtml/html/htmltokenizer.cpp
--- khtml/html/htmltokenizer.cpp	2003/11/17 23:27:42	1.43
+++ khtml/html/htmltokenizer.cpp	2003/11/17 23:36:33
@@ -731,20 +731,19 @@ void HTMLTokenizer::parseEntity(DOMStrin
 
         case Hexadecimal:
         {
-            int ll = kMin(src.length(), 9-cBufferPos);
+            int ll = kMin(src.length(), 8);
             while(ll--) {
                 QChar csrc(src->lower());
                 cc = csrc.cell();
 
                 if(csrc.row() || !((cc >= '0' && cc <= '9') || (cc >= 'a' && cc <= 'f'))) {
-                    Entity = SearchSemicolon;
                     break;
                 }
                 EntityUnicodeValue = EntityUnicodeValue*16 + (cc - ( cc < 'a' ? '0' : 'a' - 10));
                 cBuffer[cBufferPos++] = cc;
                 ++src;
             }
-            if(cBufferPos == 9) Entity = SearchSemicolon;
+            Entity = SearchSemicolon;
             break;
         }
         case Decimal:
-------------- next part --------------


I could not figure out what the "9-cBufferPos" limit and the check for 
cBufferPos == 9 were good for. Please let me know if I missed something 
important.
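
For anyone who wants to try the patched loop outside the tokenizer, here is a 
minimal standalone sketch. It assumes a plain std::string in place of the 
tokenizer's source buffer, and parseHexDigits is a hypothetical name, not a 
function in WebCore. The explicit uppercase branch stands in for the lower() 
call in the real code. It reads at most 8 hex digits and stops at the first 
non-hex character, which is roughly what the Hexadecimal case does after the 
patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the Hexadecimal case of parseEntity.
// Consumes at most 8 hex digits from src starting at pos, advances pos
// past the digits consumed, and returns the accumulated value.
std::uint32_t parseHexDigits(const std::string& src, std::size_t& pos)
{
    assert(pos <= src.size());
    std::uint32_t value = 0;
    // Fixed 8-digit limit, mirroring kMin(src.length(), 8) in the patch.
    std::size_t limit = std::min<std::size_t>(src.size() - pos, 8);
    while (limit--) {
        unsigned char c = static_cast<unsigned char>(src[pos]);
        std::uint32_t digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            break; // first non-hex character ends the digit run
        value = value * 16 + digit;
        ++pos;
    }
    // The caller would go on to look for the terminating semicolon,
    // whether the loop ended on the limit or on a non-hex character.
    return value;
}
```

Eight hex digits fit exactly in 32 bits, so the accumulator cannot overflow 
under this limit.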

I guess the other entity parsing code nearby does something similar, so 
once I understand it, there may be something to fix there too.

     -- Darin

