Parsing multi-language files

Sascha Cunz mail at SaCu.DE
Sun Feb 29 14:08:04 UTC 2004


Hi Roberto, hi all,

does the current class store / AST have the ability to correctly parse files 
that are put together out of more than one language? I know that this might 
sound very weird at first glance, and I doubt that such a thing is possible 
currently. But let me explain how real the problem is:

Basically, we can consider everything that is embedded in HTML/XHTML/XML as 
forming a multi-language file. Thus, our PHP support falls into this category. 
Generally speaking, the PHP part of a PHP file can be seen as just a special 
part of an XML file (though it could be even more). PHP is nothing more than 
a preprocessor, which processes everything contained between "<?php" and 
"?>"; thus many people use it for embedding small pieces of code into static 
HTML pages.

Some problems arise out of this:
- Using Ctrl+D might insert a C/C++-style comment inside a block of the file
  which is not PHP (this one has a bug report).

- Using the Doxygen-style documenter might cause a comment to be inserted
  before the <?php tag. Imagine: "<html>\n<body>\n<?php function test(){..."
  (this is a rare case, but it could happen; it came up in a talk between
  Jonas and me on IRC).

- If we ever intend to add an "add function" dialog to the PHP support, it
  might get confused by this, too.

The only way to solve this (as far as I can see) is to know which blocks of 
a file are actually PHP. Maybe we could store a simple list of "blocks" 
saying:
  byte    0 -  167 is HTML
  byte  168 -  230 is embedded CSS
  byte  231 -  240 is HTML
  byte  241 -  299 is embedded JavaScript
  byte  300 -  380 is HTML
  byte  381 -  553 is PHP
  byte  554 -  594 is HTML
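
To make the idea a bit more concrete, here is a rough sketch in Python of how
such a block list could be stored and queried. It is only an illustration, not
a proposal for the actual class store: the names BLOCKS, COMMENT_STYLE,
language_at and comment_line are made up for this example, and the comment
delimiters per language are an assumption about what a "comment line" action
would want to insert.

```python
from bisect import bisect_right

# Block list from the example above: (first byte, last byte, language).
BLOCKS = [
    (0, 167, "html"),
    (168, 230, "css"),
    (231, 240, "html"),
    (241, 299, "javascript"),
    (300, 380, "html"),
    (381, 553, "php"),
    (554, 594, "html"),
]

# Hypothetical comment delimiters per language: (opener, closer).
COMMENT_STYLE = {
    "html": ("<!-- ", " -->"),
    "css": ("/* ", " */"),
    "javascript": ("// ", ""),
    "php": ("// ", ""),
}

# Start offsets only, kept sorted so they can be searched with bisect.
_STARTS = [start for start, _end, _lang in BLOCKS]


def language_at(offset):
    """Return the language of the block containing byte `offset`, or None."""
    i = bisect_right(_STARTS, offset) - 1
    if i < 0:
        return None
    start, end, lang = BLOCKS[i]
    return lang if start <= offset <= end else None


def comment_line(line, offset):
    """Wrap `line` in the comment syntax of the block found at `offset`."""
    lang = language_at(offset)
    if lang is None:
        raise ValueError("offset %d is outside the file" % offset)
    opener, closer = COMMENT_STYLE[lang]
    return opener + line + closer
```

With something like this, a comment action at byte 400 would pick the PHP
style, while the same action at byte 10 would pick an HTML comment instead.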

Further, I could imagine creating a parser for XML files, which would allow us 
to use the problem reporter in conjunction with XHTML/XML files. This parser 
could provide the data needed.
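
As a very rough sketch of how that data could be produced, the following
Python snippet splits a file into blocks in the format above. It is not a real
XML parser: it uses a regular expression as a stand-in, it counts character
offsets rather than bytes, it treats the enclosing <style>/<script> tags as
part of the embedded block, and it ignores PHP nested inside those tags. The
name split_blocks is invented for this example.

```python
import re

# One alternative per embedded language. This is a simplification; real
# (X)HTML needs a proper parser. An unterminated "<?php" runs to end of file.
_EMBEDDED = re.compile(
    r"(?P<php><\?php.*?(?:\?>|\Z))"
    r"|(?P<css><style\b[^>]*>.*?</style>)"
    r"|(?P<javascript><script\b[^>]*>.*?</script>)",
    re.DOTALL | re.IGNORECASE,
)


def split_blocks(text):
    """Return a list of (first offset, last offset, language) tuples."""
    blocks = []
    pos = 0
    for match in _EMBEDDED.finditer(text):
        # Everything between two embedded regions is plain HTML.
        if match.start() > pos:
            blocks.append((pos, match.start() - 1, "html"))
        # lastgroup is the name of the alternative that matched.
        blocks.append((match.start(), match.end() - 1, match.lastgroup))
        pos = match.end()
    # Trailing HTML after the last embedded region, if any.
    if pos < len(text):
        blocks.append((pos, len(text) - 1, "html"))
    return blocks
```

For the Doxygen example from above, "<html>\n<body>\n<?php function test(){..."
would come out as one HTML block followed by one PHP block, so a documenter
could check that its insert position lies inside the PHP block.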

So finally, is this vision an absolute horror? Or is it something that could 
be done in order to provide better support for web scripting?

Cheers, Sascha




More information about the KDevelop-devel mailing list