diff htmltokenizer.h

Maciej Stachowiak mjs at apple.com
Sat Oct 25 17:22:21 CEST 2003


On Oct 25, 2003, at 8:59 AM, Dirk Mueller wrote:

>
> Hi,
>
> +++ WebCore/khtml/html/htmltokenizer.h  2003-08-04 12:16:06.000000000 +0200
> @@ -340,7 +340,10 @@
>      int scriptStartLineno;
>      int tagStartLineno;
>
> -#define CBUFLEN 14
> +// This buffer can hold arbitrarily long user-defined attribute names, such as in EMBED tags.
> +// So any fixed number might be too small, but rather than rewriting all usage of this buffer
> +// we'll just make it large enough to handle all imaginable cases.
> +#define CBUFLEN 1024
>      char cBuffer[CBUFLEN+2];
>      unsigned int cBufferPos;
>
>
> That again makes no sense at all. the cBuffer is solely used for the
> attrname / tagname to id mapping, done by the gperf generated code in
> misc/htmltags.*.

We had an actual bug report that user-defined attribute names on embed  
tags were being cut off.

The test case is this page:

https://sss-web.usps.com/ds/jsps/pl_dummy.jsp

The offending embed tag looks like this:

<EMBED type="application/x-java-applet;version=1.3"
MAYSCRIPT=true
java_CODE = "com.ibm.lex.printapplet.LabelPrinter.class"
java_ARCHIVE = "LabelPrinter.jar"
ALT = "United States Postal Service - Label Printer"
WIDTH = 760 HEIGHT = 574  cabbase = LabelPrinter.cab
singleton = "1067123829893"
display_label_url = "/ds/servlet/com.usps.shipping.domestic.servlets.GetLabelServlet?test=true&labelName=DISPLAY"
display_label_width = ""
display_label_height = ""
do_cancel_unsaved_url = "/ds/index.html"
do_cancel_saved_url = "/ds/index.html"
do_save_url = ""
do_payment_url = ""
num_print_labels = "1"
print_label_1_url = "/ds/servlet/com.usps.shipping.domestic.servlets.GetLabelServlet?test=true&labelName=MAIN"
payment_fail_button_1_label = ""
payment_fail_button_1_url = ""
payment_fail_button_2_label = ""
payment_fail_button_2_url = ""
payment_fail_button_3_label = ""
payment_fail_button_3_url = ""
payment_fail_button_4_label = ""
payment_fail_button_4_url = ""
success_url = "/ds/index.html"
do_void_url = ""
void_fail_url = ""
void_success_url = ""
pluginspage="http://java.sun.com/products/archive/j2se/1.3.1_02/jre/index.html">
</EMBED>

We found that the attribute names came out of the tokenizer split every
14 characters (the old CBUFLEN), like this:

java_archive = LabelPrinter.jar;
utton_3_url = ;
num_print_labe = ;
cabbase = LabelPrinter.cab;
print_label_1_ = ;
utton_1_label = ;
utton_1_url = ;
display_label_ = ;
alt = United States Postal Service - Label Printer;
singleton = 1058479194254;
d_url = /ds/index.html;
utton_3_label = ;
url = /ds/servlet/com.usps.shipping.domestic.servlets.GetLabelServlet?test=true&labelName=DISPLAY;
utton_4_url = ;
do_cancel_save = ;
mayscript = true;
rl = ;
do_cancel_unsa = ;
void_success_u = ;
utton_2_url = ;
java_code = com.ibm.lex.printapplet.LabelPrinter.class;
do_save_url = ;
success_url = /ds/index.html;
pluginspage = http://java.sun.com/products/archive/j2se/1.3.1_02/jre/index.html;
utton_2_label = ;
void_fail_url = ;
payment_fail_b = ;
height = 574;
type = application/x-java-applet;version=1.3;
ved_url = /ds/index.html;
do_void_url = ;
do_payment_url = ;
utton_4_label = ;
width = 760;
ls = 1;

And the change you cite fixed the bug.
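
To make the symptom concrete, here is a minimal sketch of the splitting
pattern. It is a hypothetical model, not the actual tokenizer code: the
function name splitAtBufferLimit and the "emit the buffer whenever it
fills" rule are assumptions chosen only to reproduce the output above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model (not the KHTML code): collect a name into a buffer of
// bufLen characters and emit the buffer as its own name whenever it fills.
std::vector<std::string> splitAtBufferLimit(const std::string& name,
                                            std::size_t bufLen)
{
    assert(bufLen > 0);
    std::vector<std::string> pieces;
    std::string buffer;
    for (char c : name) {
        buffer.push_back(c);
        if (buffer.size() == bufLen) {
            // Buffer is full: the name gets cut here.
            pieces.push_back(buffer);
            buffer.clear();
        }
    }
    // Whatever is left over becomes a separate, shorter name.
    if (!buffer.empty())
        pieces.push_back(buffer);
    return pieces;
}
```

With bufLen = 14, "payment_fail_button_3_url" comes back as
"payment_fail_b" and "utton_3_url", matching the list above. With
bufLen = 1024 every name in the test case comes back in one piece.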

> embed does not use any of those tags. Thats also why it does not matter that
> the "| 0x20" part in htmltokenizer.cpp mangles the '_'. because no valid html
> attribute or tagname that we map to an id has it in its name anyway. I
> checked, no such tags have been added to your tree either.

These changes were to support passing custom attribute names unmodified
to plugins and Java applets, not for valid standard HTML attributes and
tag names.
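
As a side note on the "| 0x20" point, the sketch below shows why that
trick only lowercases letters and changes '_' into a different
character. Both function names are made up for illustration; this is
not a description of what htmltokenizer.cpp does today.

```cpp
// Hypothetical helpers for illustration only.

// Setting bit 0x20 lowercases ASCII 'A'..'Z', but it also alters other
// characters: '_' is 0x5F, and 0x5F | 0x20 is 0x7F (DEL).
constexpr char foldWithOr(char c)
{
    return static_cast<char>(c | 0x20);
}

// A fold that only touches 'A'..'Z' leaves '_' and digits alone.
constexpr char foldLettersOnly(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c | 0x20) : c;
}
```

A name such as "java_CODE" would survive foldLettersOnly as
"java_code", while foldWithOr would replace the underscore.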

Hope this helps,
Maciej



