HTML XML DTD

Ruan Strydom ruan at jcell.co.za
Tue Mar 23 17:56:05 UTC 2010


Thanks, I will have a look.

I need more advice:

With regard to the tree: in SGML, whether a tag has to be opened or closed, 
and what content it may have, depends on the declarations in the DTD. Simply 
assuming that a tag does not have to be closed, or that its content is CDATA 
(e.g. SCRIPT), breaks SGML.

With that said: 
Lexing, parsing and validating against a DTD is a pretty complex and lengthy 
task, as I just found out (duh ;) ).

1) Should I ignore the DTD for now and just parse with a fixed set of rules, 
for HTML loose for instance (the same goes for the formatter)? If not, I have 
a couple of questions on implementing the DTD parser (I have to rewrite it 
:( ).
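
A minimal sketch of what such a fixed rule set could look like, in Python 
only for brevity. All names here are made up for illustration, and the 
element lists are my reading of the HTML 4.01 loose DTD, so they should be 
checked against the DTD before being relied on:

```python
# Hypothetical hard-coded rules standing in for a parsed DTD.
# The element lists are assumed to match the HTML 4.01 loose DTD.

# Elements declared EMPTY: no content, end tag forbidden.
EMPTY_ELEMENTS = frozenset({
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
})

# Elements whose end tag may be omitted.
OPTIONAL_END_TAG = frozenset({
    "body", "colgroup", "dd", "dt", "head", "html", "li", "option",
    "p", "tbody", "td", "tfoot", "th", "thead", "tr",
})

# Elements whose content is declared CDATA (must not be parsed as markup).
CDATA_ELEMENTS = frozenset({"script", "style"})


def sgml_no_closing(name: str) -> bool:
    """True if the element never has an end tag (HTML names ignore case)."""
    return name.lower() in EMPTY_ELEMENTS


def end_tag_optional(name: str) -> bool:
    """True if the parser has to cope with a missing end tag."""
    return name.lower() in OPTIONAL_END_TAG


def content_is_cdata(name: str) -> bool:
    """True if the lexer should treat the element content as raw text."""
    return name.lower() in CDATA_ELEMENTS
```

The parser and the formatter could both ask these three questions, so a 
real DTD parser could replace the tables later without changing the callers.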

2) Also, I need to rename the project and move everything into a 'namespace'; 
I was thinking of 'Ml', as in 'Markup Language'. If I am going to hard-code 
the rules for HTML, I could create a separate project for HTML and leave this 
project as XML. (I think HTML takes priority, so I am going to leave XML alone 
for now, apart from simple bug fixes.)

I hope these come out aligned; I am trying to show directory structures.

Either:

        Html  - formatter
              - language
              - ....

        Xml   - formatter
              - catalog
              - validator
              - language

or:

        Ml    - formatter
              - catalog
              - validator
              - language  - dtd
                          - sgml
                          - xml

On Tuesday 23 March 2010 12:23:57 Jonathan Schmidt-Dominé - Developer wrote:
> Hi Ruan!
> 
> Some comments:
> -Why do you need multiple CDATA tokens? It would be most convenient to
>  create a simple TEXT token. The parser should not care whether there is a
>  CDATA_END or anything like this. If you think that there is any use for
>  differentiating between normal text and CDATA text, there should be a single
>  CDATA token, not CDATA_END and CDATA_START.
> -It is quite simple to build a tree:
>     (element | TEXT)*
> -> block ;;
>     LT TEXT [: QString text = currentToken(); /* pseudo-code */ :]
>     (#attributes=attribute)*
>     ( FSGT
>     | GT ( ?[: sgmlNoClosing(text) /* pseudo-code */ :] 0
>          | block=block LT TEXT
>            [: /* xml: case-sensitive, sgml: case-insensitive */
>               if(!matches(text, currentToken())) { /* error */ } :]
>            GT
>          )
>     )
> -> element ;;
> 
> So you can build a tree.
> 
> Jonathan
> 
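
To make the tree-building idea from the quoted grammar concrete, here is a 
rough recursive-descent sketch in Python. It is not the kdev-pg-qt parser: 
the token tuples, the Node class and the NO_CLOSING set are invented for the 
example, attributes are left out, and the lexer is assumed to have already 
produced "open", "empty" (for <x/>), "close" and "text" tokens.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


# Illustrative subset only; a real list would come from the DTD.
NO_CLOSING = {"br", "hr", "img"}


def names_match(open_name, close_name, sgml):
    # xml: case-sensitive, sgml: case-insensitive
    if sgml:
        return open_name.lower() == close_name.lower()
    return open_name == close_name


def parse_block(tokens, pos, sgml):
    """block: (element | TEXT)* ; returns (children, next position)."""
    children = []
    while pos < len(tokens):
        kind, value = tokens[pos]
        if kind == "text":
            children.append(value)
            pos += 1
        elif kind == "close":
            break  # the end tag belongs to the enclosing element
        else:
            node, pos = parse_element(tokens, pos, sgml)
            children.append(node)
    return children, pos


def parse_element(tokens, pos, sgml):
    """element: start tag, then nothing, or a block plus matching end tag."""
    kind, name = tokens[pos]
    node = Node(name)
    pos += 1
    if kind == "empty":
        return node, pos  # <name/>
    if sgml and name.lower() in NO_CLOSING:
        return node, pos  # SGML element that never has an end tag
    node.children, pos = parse_block(tokens, pos, sgml)
    if pos >= len(tokens):
        raise SyntaxError(f"missing end tag for <{name}>")
    _, close_name = tokens[pos]
    if not names_match(name, close_name, sgml):
        raise SyntaxError(f"<{name}> closed by </{close_name}>")
    return node, pos + 1


def parse(tokens, sgml=True):
    """Parse a whole token list into a list of top-level nodes and text."""
    children, pos = parse_block(tokens, 0, sgml)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected end tag </{tokens[pos][1]}>")
    return children
```

For example, parse([("open", "P"), ("text", "Hello"), ("open", "br"), 
("text", "world"), ("close", "p")]) gives a single P node whose children are 
the text "Hello", a br node and the text "world". With sgml=False the same 
input is rejected, because "P" and "p" no longer match and br needs an end 
tag.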




More information about the KDevelop-devel mailing list