Supporting multiple languages in a single document

Milian Wolff mail at milianw.de
Mon Jun 7 00:24:40 UTC 2010


On Sunday 06 June 2010 21:04:47 Nicolás Alvarez wrote:
> Milian Wolff wrote:
> > I've now started working on it and at least the documents get parsed one
> > after the other already. There are lots of problems though and the code
> > is still a bit dirty. The biggest problem I see right now is that
> > apparently most tools (outline, context browser, ...) use
> > DUChainUtils::standardContextForUrl() (most functions in DUChainUtils use
> > that) which will always return the "inner" language. In my test case that
> > would be CSS. How should that be fixed...
> > 
> > I mean now we have multiple TopDUContexts for each file, one for each
> > contained language...
> 
> I don't really understand the minor technical details of what you're
> doing; but here are use cases for you to think about :)
> 
> In the case of PHP, the default language is actually HTML. It only
> becomes PHP when you get to a <?php ?> tag. However, all the PHP
> chunks together form a single program.
> 
> Another way to think about it is that the document language is PHP,
> but everything between begin of file and <?php, between ?> and <?php,
> and between ?> and end of file is really text passed to an "echo"
> statement. This is how the internal PHP implementation works; but I'm
> not sure if this model is useful for us at all.
> 
> <?php
> function foo() {
>     $one = 42;
> ?>
> hello world
> <?php
>     echo $one;
> }
> ?>
> 
> In "echo $one", the parser has to understand that we are still inside
> the function. This is important so it knows that $one is the same
> local variable that was declared above.

Uhm, thanks Nicolas, but you seem to have forgotten that I have years of 
experience with PHP and know all this :P :P

Btw.: That code example is so ugly, I doubt anyone ever uses that in 
production code :P

Anyhow: PHP is a preprocessor (look at the name), hence it's deciding what 
gets spit out in the end. That it's mostly HTML is actually just for 
historical reasons. I've worked with lots of different PHP applications and 
know that you also use PHP for:

plain text
JavaScript
CSS
XML (feeds, dumps, ...)
YAML
...

So PHP _is_ definitely the main language. The question I want to solve 
properly for KDevelop is what it spits out. In the standard "HTML template" 
case we'd have two TopDUContexts: one for PHP, one for HTML. Then maybe 
another one for all contained CSS, one for JavaScript, and so on.
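
A rough sketch of that split, in Python just to keep it short. Nothing here
is KDevelop API: split_regions and text_for are made-up names, the regex is
naive (it ignores "?>" inside strings or comments), and everything outside
the PHP tags is lumped together as "output" instead of being classified as
HTML, CSS and so on.

```python
import re

# Hypothetical helper, not KDevelop API: matches one <?php ... ?> block.
# A block that is never closed runs to the end of the file.
PHP_BLOCK = re.compile(r"<\?php(.*?)(?:\?>|\Z)", re.DOTALL)


def split_regions(source):
    """Split a PHP file into (language, text) regions, in document order."""
    regions = []
    pos = 0
    for match in PHP_BLOCK.finditer(source):
        if match.start() > pos:
            # Text between two PHP blocks is what PHP would echo.
            regions.append(("output", source[pos:match.start()]))
        regions.append(("php", match.group(1)))
        pos = match.end()
    if pos < len(source):
        regions.append(("output", source[pos:]))
    return regions


def text_for(regions, language):
    """Join all regions of one language: the input for that language's parser."""
    return "".join(text for lang, text in regions if lang == language)


source = "<?php $title = 'x'; ?><h1>hi</h1><?php echo $title; ?>\n"
regions = split_regions(source)
```

Each language then gets its own parse run over text_for(regions, ...), which
is where the "one top context per contained language" comes from. A real
implementation would have to keep the original offsets so that ranges still
map back to the document.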

> Another situation: Javascript embedded in HTML. In this case, separate
> <script> tags are executed separately, they don't form a single
> program. If you do something like in the PHP example above, it's a
> syntax error. However, global variables *are* shared between different
> script tags, and built-ins like "document" or "window" are the same in
> the entire HTML document. In addition, if you declare a function
> anywhere, and call it from another function anywhere, it will work.
> All this is important in the creation of the Javascript duchain.

I know, see above :P
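
To make the <script> case concrete, here is a rough Python sketch. All names
are hypothetical and none of this is KDevelop API. The brace counter is only
a stand-in for "this block parses on its own", and the function regex is a
stand-in for a real declaration builder.

```python
import re

# Hypothetical sketch, not KDevelop API.
SCRIPT = re.compile(r"<script[^>]*>(.*?)</script>", re.DOTALL | re.IGNORECASE)
FUNCTION = re.compile(r"\bfunction\s+([A-Za-z_$][\w$]*)")


def script_blocks(html):
    """Return the body of every <script> element, in document order."""
    return [match.group(1) for match in SCRIPT.finditer(html)]


def is_balanced(code):
    """Very rough check that a block is complete by itself (braces match)."""
    depth = 0
    for ch in code:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def shared_globals(blocks):
    """Collect named functions from all blocks into one shared table.

    This also picks up nested functions; a real builder would only
    export top-level declarations.
    """
    names = set()
    for block in blocks:
        names.update(FUNCTION.findall(block))
    return names


ok = script_blocks(
    "<script>function greet() { return 1; }</script>"
    "<p>x</p><script>greet();</script>"
)
broken = script_blocks("<script>function f() {</script><script>}</script>")
```

The blocks in "ok" each pass is_balanced() and share the name "greet", while
both blocks in "broken" fail, which mirrors the syntax error you get when a
function is split across two script tags.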

> There is a chance we can handle this situation very similar to PHP,
> actually. I haven't analyzed it carefully.

You didn't understand my problem ;-)

> But here is another hypothetical case of multiple languages that is
> definitely different, with chunks of fully-independent programs
> embedded in another:
> 
> MACRO(AC_STRUCT_TIMEZONE)
>     MESSAGE(STATUS "Checking for struct tm.tm_zone")
>     CHECK_C_SOURCE_COMPILES("
> #include <time.h>
> 
> int main() {
>     static struct tm obj;
>     if (obj.tm_zone) {
>         return 0;
>     }
>     return 0;
> }
> " struct_tm_check)
> ENDMACRO(AC_STRUCT_TIMEZONE)
> 
> This is C embedded in CMake. It would be quite interesting if we could
> have C++ semantic highlighting, code completion, etc. inside that
> string. But note that each call to CHECK_C_SOURCE_COMPILES uses a
> totally independent C program. If I declare a function in one, and use
> it in another, the second should appear underlined in yellow, because
> one program won't see the other's functions.

This is very similar to the above. You'd just need even more TopDUContexts. 
Or, well, actually no: a single one with distinct contexts would suffice.
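
Roughly what I mean, as a toy model in Python. The Context class below is
made up for illustration and is not the real DUChain class; it only shows
that lookup walks up to the parents and never into sibling contexts.

```python
# Hypothetical model, not the real DUChain classes.
class Context:
    def __init__(self, parent=None):
        self.parent = parent
        self.declarations = set()
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def declare(self, name):
        self.declarations.add(name)

    def find(self, name):
        """Search this context, then its parents. Siblings are never searched."""
        ctx = self
        while ctx is not None:
            if name in ctx.declarations:
                return True
            ctx = ctx.parent
        return False


# One top context for all embedded C code in the CMake file, with one
# child context per CHECK_C_SOURCE_COMPILES call.
top = Context()
first_check = Context(top)
second_check = Context(top)
first_check.declare("helper")
```

Here first_check.find("helper") succeeds and second_check.find("helper")
does not, so a use of helper in the second snippet would be flagged as
unresolved.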

> Even if nobody implements this in the CMake language support, I think
> it's a valid use case for multiple languages in a document, and the
> parsing frameworks should support it.

I know, this is why I asked :P

-- 
Milian Wolff
mail at milianw.de
http://milianw.de


More information about the KDevelop-devel mailing list