Fixing the preprocessor

Tue May 26 09:26:42 UTC 2009

Hi,

I am trying to fix the preprocessor. My goal is to correctly parse the 
following code:

#define MA(x) T<x> a
#define MB(x) T<x>
#define MC(X) int
#define MD(X) c

template <typename P1> struct A {};
template <typename P2> struct T {};

int main(int argc, char ** argv) {
  MA(A<int>);
  A<MB(int)> b;
  MC(a)MD(b);
  MC(a)d;
}

Currently the output of the preprocessor is:

template <typename P1> struct A {};
template <typename P2> struct T {};

int main(int argc, char ** argv) {
  T<A<int>> a;
  A<T<int>> b;
  intc;
  intd;
}

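All four declarations are wrong, and all of them are instances of the same 
error: tokens must not merge after preprocessing, but kdevelop ignores this. 
In the first two, "> >" becomes the single token ">>"; in the last two, "int" 
merges with the following identifier.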

cpp avoids this by inserting spaces where necessary.

What is the best way to handle this in kdevelop?

1. One solution would be to check, when a macro is expanded, what the last 
character in the output stream is, and to insert a space if necessary. This 
would fix the first three declarations. The last one requires an additional 
check after the macro expansion. The check would also need a table of 
character combinations that must not become adjacent. Altogether, such a fix 
would be quite big and ugly.

2. Another solution would be to always add a space before and after a macro 
expansion. This would produce different output than cpp, but would it cause 
harm?

3. A third idea: the output stream of the preprocessor consists of a string 
split into substrings. What about making these substrings match the tokens of 
the program? Then later processing steps could no longer merge any tokens. 
This also requires a lot of work, but seems quite clean to me.
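
To make the table check from idea 1 more concrete, here is a minimal sketch 
that applies it to a token list of the kind idea 3 describes. The names 
wouldMerge and joinTokens, and the use of std::string for tokens, are 
hypothetical and not kdevelop's actual classes. The table of character pairs 
is deliberately incomplete:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Characters that can continue an identifier or a number.
static bool isWordChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Conservative test: would writing `right` directly after `left` let a
// lexer read the boundary as part of a single, different token?
bool wouldMerge(const std::string& left, const std::string& right) {
    if (left.empty() || right.empty()) return false;
    const char a = left.back();
    const char b = right.front();

    // Identifier/number next to identifier/number: "int" + "c" -> "intc".
    if (isWordChar(a) && isWordChar(b)) return true;

    // Small, non-exhaustive table of character pairs that form (or start)
    // a longer punctuator or a comment when they become adjacent.
    static const char* const pairs[] = {
        ">>", "<<", "++", "--", "&&", "||", "::", "->", "##", "..",
        "==", "!=", "<=", ">=", "+=", "-=", "*=", "/=", "%=", "&=",
        "|=", "^=", "//", "/*"
    };
    for (const char* p : pairs) {
        if (a == p[0] && b == p[1]) return true;
    }
    return false;
}

// Concatenate tokens, inserting a space only where two neighbours
// would otherwise merge.
std::string joinTokens(const std::vector<std::string>& tokens) {
    std::string out;
    std::string prev;
    for (const std::string& tok : tokens) {
        if (tok.empty()) continue;
        if (wouldMerge(prev, tok)) out += ' ';
        out += tok;
        prev = tok;
    }
    return out;
}
```

With this sketch, joinTokens on the tokens "int", "c", ";" gives "int c;", 
and two adjacent ">" tokens are written as "> >". Keeping the tokens separate 
until the very end means the check runs in exactly one place, instead of 
before and after every macro expansion.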

What is your opinion? Do you have a better solution to the problem?

Christoph