Macros
Some terms used in the source code (note the double meanings):
▪macro: the #defined/#macroed object that will be expanded to its replacement text
▪macro: a function-like macro, e.g. #define m(a, b)
▪define: an object-like macro, e.g. #define simple
▪argless define (should be called parameter-less): a function-like macro without parameters, e.g. #define f()
How macros are stored
Macros are basically stored as raw text, not as token runs (as in GCC's libcpp, for example). The body of a simple #define without parameters is stored as one string. Macros with parameters are stored as a sequence of "macro tokens". The following types of macro tokens exist:
▪text("<text>") Raw text, but with spaces and empty lines trimmed (like in a #define without parameters).
▪textw("<wstring text>") Same as above, just for Unicode input.
▪parameter(index) A macro parameter was used here in the declaration. The index specifies which one. During expansion, the text of argument(index) is inserted where the parameter was in the declaration.
▪stringify_parameter(index) Same as above, except the argument will be stringified during expansion.
Note: macro tokens are actually symb.bi:FB_DEFTOK structures, and they contain an id field holding one of the FB_DEFTOK_TYPE_* values to tell what they contain.
For example:
    #define add(x, y) x + y
becomes:
    parameter(0), text(" + "), parameter(1)
And the expansion text will be:
    argument(0) + " + " + argument(1)
Storing macros as text is a fairly easy implementation, but it requires re-parsing the macro body over and over again. GCC, for example, works with preprocessing tokens and token runs, so macros are stored as tokens, which makes expansion very fast because there is no need to tokenize the macro body again and again. fbc's implementation is not as flexible and maybe not as efficient, but it is less complex (regarding code and memory management) and has an upside too: implementation of ## (PP token merge) is trivial. ## is simply omitted while recording the macro's body, whereas with token runs the tokens need to be merged explicitly.
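The recording step described above can be sketched roughly as follows. This is a small Python model, not fbc's actual FreeBASIC code: the names record_body and MacroToken are hypothetical, macro tokens are modeled as plain (kind, payload) tuples instead of FB_DEFTOK structures, parameter names are matched case-sensitively, and dropping the blanks around ## is an assumption made so that the neighbouring pieces end up adjacent.

```python
import re
from typing import List, Tuple, Union

# Hypothetical stand-in for a macro token: (kind, payload).
MacroToken = Tuple[str, Union[str, int]]

# One piece of a macro body: "##" (with surrounding blanks), "#param",
# an identifier, a number, or any other single character.
_PIECE = re.compile(
    r"(?P<merge>\s*##\s*)"
    r"|#\s*(?P<strfy>[A-Za-z_]\w*)"
    r"|(?P<ident>[A-Za-z_]\w*)"
    r"|(?P<other>\d\w*|.)",
    re.DOTALL,
)


def record_body(params: List[str], body: str) -> List[MacroToken]:
    """Turn a macro body into text / parameter / stringify_parameter tokens."""
    tokens: List[MacroToken] = []
    pending_text = ""

    def flush_text() -> None:
        nonlocal pending_text
        if pending_text:
            tokens.append(("text", pending_text))
            pending_text = ""

    for match in _PIECE.finditer(body.strip()):
        if match.group("merge") is not None:
            # "##" is simply not recorded (assumption: neither are the
            # blanks around it), so no explicit token merge is needed later.
            continue
        name = match.group("strfy") or match.group("ident")
        if name is not None and name in params:
            flush_text()
            kind = "stringify_parameter" if match.group("strfy") else "parameter"
            tokens.append((kind, params.index(name)))
        else:
            # Anything that is not a parameter reference stays raw text.
            pending_text += match.group(0)
    flush_text()
    return tokens


if __name__ == "__main__":
    print(record_body(["x", "y"], "x + y"))
    print(record_body(["a", "b"], "a ## b"))
```

In this model the ## case needs no special handling at expansion time: the two parameter tokens are stored next to each other, so their argument texts are concatenated and later re-lexed as one token.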
When are macros expanded?
Because of token look ahead, macros must be expanded during tokenization, otherwise the wrong tokens might be loaded into the token queue. After all, the parser should only get to see the final tokens, even during look ahead.
In lexNextToken(), each alphanumeric identifier is looked up in the symb module to check whether it is a keyword or a macro. Macros and keywords are kept in the same hash table. Note that macros cannot have the name of keywords; "#define integer" causes an error. If a macro is detected, it is immediately expanded, a process also called "loading" the macro (pp-define.bas:ppDefineLoad()).
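The shared lookup can be pictured with a small Python model. The class and method names here (SymbolTable, define_macro, lookup) are hypothetical and are not the symb module's API; the sketch only assumes one table for both kinds of entries and folds case because FreeBASIC identifiers are case-insensitive.

```python
from typing import Dict, Iterable, Optional, Tuple

KEYWORD = "keyword"
MACRO = "macro"


class SymbolTable:
    """One table holding both keywords and macros (hypothetical model)."""

    def __init__(self, keywords: Iterable[str]) -> None:
        self.table: Dict[str, Tuple[str, Optional[str]]] = {
            name.upper(): (KEYWORD, None) for name in keywords
        }

    def define_macro(self, name: str, body: str) -> None:
        key = name.upper()
        entry = self.table.get(key)
        if entry is not None and entry[0] == KEYWORD:
            # Mirrors the rule that "#define integer" is an error.
            raise ValueError(f"cannot define macro named like keyword: {name}")
        self.table[key] = (MACRO, body)

    def lookup(self, identifier: str) -> Optional[Tuple[str, Optional[str]]]:
        # A single lookup tells the lexer whether the identifier is a
        # keyword, a macro (to be loaded immediately) or neither.
        return self.table.get(identifier.upper())
```

A lexer built on this model would call lookup() once per identifier and start the macro expansion whenever the returned kind is MACRO.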
Macro call parsing
If the macro takes arguments, the macro "call" must be parsed, much like a function call, syntax-wise. Since macro expansion already happens in lexNextToken(), the source of tokens, the parsing here is a little tricky. Forward movement is only possible by replacing (and losing) the current token. The token queue and token look ahead cannot be relied upon. Instead, the parser can only replace the current token to move forward while parsing the macro's arguments.
Since lexNextToken() is used to parse the arguments, macros in the arguments themselves are recursively macro-expanded while the arguments are being parsed and recorded in text form. The argument texts are stored for use during the expansion.
So, a macro's arguments are expanded before that macro itself is expanded, which can be seen as both a good and a bad feature:
    #define stringify(s) #s
    stringify(__LINE__)
results in 2 in FB, but __LINE__ in C, because in C, macro parameters are not expanded when used with # or ##. In C, two macros have to be used to get the 2:
    #define stringize(s) #s
    #define stringify(s) stringize(s)
    stringify(__LINE__)
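The difference between the two behaviours can be modeled in a few lines of Python. This is only an illustration: stringify_call is a hypothetical helper, arguments are split on whitespace instead of being lexed, and the stringified result is written as a plain double-quoted string.

```python
from typing import Dict


def stringify_call(arg_text: str, macros: Dict[str, str],
                   expand_args_first: bool) -> str:
    """Model `#define stringify(s) #s` applied to a single argument."""
    if expand_args_first:
        # fbc-like: the argument is macro-expanded while it is parsed,
        # before the macro body ever sees it.
        arg_text = " ".join(macros.get(word, word) for word in arg_text.split())
    # C-like (expand_args_first=False): a parameter used with # receives
    # the argument exactly as written.
    return '"' + arg_text + '"'


if __name__ == "__main__":
    macros = {"__LINE__": "2"}
    print(stringify_call("__LINE__", macros, expand_args_first=True))
    print(stringify_call("__LINE__", macros, expand_args_first=False))
```

With expand_args_first=True the model matches the FB result (the line number is stringified); with False it matches the single-macro C result, which is why C needs the extra stringize() level.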
Putting together the macro expansion text
The expansion text is a string built up from the macro's body tokens. For macro parameters, the argument text is retrieved from the argument array created by the macro call parser, using the indices stored in the parameter tokens. Parameter stringification is done here.
There is a specialty for the builtin defines (__LINE__, __FUNCTION__, __FB_DEBUG__, etc.): a callback is used to retrieve their "value". For example, __LINE__'s callback simply returns a string containing the lexer's current line number.
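Both steps, building the expansion text from macro tokens and asking a callback for a builtin's value, can be sketched in Python. All names (build_expansion_text, LexerState, BUILTIN_CALLBACKS, builtin_value) are hypothetical, macro tokens are modeled as (kind, payload) tuples, and stringification is reduced to wrapping the argument text in double quotes, which is an assumption rather than fbc's exact rule.

```python
from typing import Callable, Dict, List, Tuple, Union

MacroToken = Tuple[str, Union[str, int]]


def build_expansion_text(tokens: List[MacroToken], args: List[str]) -> str:
    """Concatenate body tokens, substituting argument texts by index."""
    parts: List[str] = []
    for kind, payload in tokens:
        if kind == "text":
            parts.append(str(payload))
        elif kind == "parameter":
            parts.append(args[int(payload)])
        elif kind == "stringify_parameter":
            # Simplified stringification (assumption): just add quotes.
            parts.append('"' + args[int(payload)] + '"')
        else:
            raise ValueError(f"unknown macro token kind: {kind}")
    return "".join(parts)


class LexerState:
    """Minimal stand-in for the lexer: only the current line number."""

    def __init__(self, line: int) -> None:
        self.line = line


# Builtin defines have no stored body; a callback produces their text.
BUILTIN_CALLBACKS: Dict[str, Callable[[LexerState], str]] = {
    "__LINE__": lambda lexer: str(lexer.line),
}


def builtin_value(name: str, lexer: LexerState) -> str:
    return BUILTIN_CALLBACKS[name](lexer)


if __name__ == "__main__":
    add_body = [("parameter", 0), ("text", " + "), ("parameter", 1)]
    print(build_expansion_text(add_body, ["1", "2"]))
    print(builtin_value("__LINE__", LexerState(line=7)))
```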
Expansion
The macro expansion text (deftext) is stored by the lexer, which then reads characters from there for a while instead of reading from the file input buffer. Skipping chars in the macro text is like skipping chars in the file input: once skipped, it's lost; there is no going back. So there never is "old" (parsed) macro text, only the current char and the to-be-parsed text. New macro text is prepended to the front of existing macro text. That way macros inside macros are expanded.
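The character source described above behaves like the following Python model. CharSource, load_macro and next_char are hypothetical names; the sketch only assumes a single pending macro text string that is consumed from the front and takes priority over the file input.

```python
class CharSource:
    """Reads from pending macro text first, then from the file text."""

    def __init__(self, file_text: str) -> None:
        self.file_text = file_text
        self.file_pos = 0
        self.deftext = ""  # macro expansion text still to be read

    def load_macro(self, expansion_text: str) -> None:
        # New macro text goes in front of whatever macro text is left,
        # so a macro found inside macro text is expanded in place.
        self.deftext = expansion_text + self.deftext

    def next_char(self) -> str:
        if self.deftext:
            ch = self.deftext[0]
            self.deftext = self.deftext[1:]  # consumed text is gone for good
            return ch
        if self.file_pos < len(self.file_text):
            ch = self.file_text[self.file_pos]
            self.file_pos += 1
            return ch
        return ""  # end of input


if __name__ == "__main__":
    src = CharSource("Z")
    src.load_macro("ab")
    first = src.next_char()   # comes from the macro text
    src.load_macro("XY")      # nested macro: prepended before the rest
    rest = ""
    while True:
        ch = src.next_char()
        if not ch:
            break
        rest += ch
    print(first, rest)
```

Because only the unread remainder is kept, the model has no record of where one macro's text ends and the next begins, which is the limitation the recursion discussion relies on.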
This implementation does not (easily) allow macro recursion to be detected. It would be hard to keep track of which characters in the macro text buffer belong to which macro, but that would be needed to be able to push and pop macros properly. It could be done more easily with a token run implementation as seen in GCC's libcpp. However, C doesn't allow recursive macros in the first place: in C, a macro's identifier is undefined (does not trigger expansion) inside that macro's body. That is not the case in fbc, because (again) a way to detect when a macro body ends is not implemented.
Currently fbc only keeps track of the first (toplevel) macro expanded, because it's easy to detect when that specific macro's end is reached: as soon as there is no more macro text.
That's why the recursion is detected here:
    #define a a
    a
and here too:
    #define a b
    #define b a
    a
but not here (note that fbc will run into an infinite loop):
    #define a a
    #define m a
    m
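The three cases can be reproduced with a word-level Python model of "only the toplevel macro is tracked". This is a hypothetical reconstruction from the description above, not fbc's code: expand, MacroRecursion and ExpansionLimit are made-up names, macros have no parameters, input is a list of words, and a step limit stands in for the infinite loop so the example terminates.

```python
from typing import Dict, List, Optional


class MacroRecursion(Exception):
    """The toplevel macro was found again inside its own expansion."""


class ExpansionLimit(Exception):
    """Stand-in for the infinite loop of an undetected recursion."""


def expand(words: List[str], macros: Dict[str, str],
           max_steps: int = 1000) -> List[str]:
    file_words = list(words)       # unread file input
    pending: List[str] = []        # unread macro text
    toplevel: Optional[str] = None
    output: List[str] = []
    steps = 0
    while pending or file_words:
        if pending:
            word, from_macro = pending.pop(0), True
        else:
            # No macro text left: the toplevel macro has ended.
            toplevel = None
            word, from_macro = file_words.pop(0), False
        if word not in macros:
            output.append(word)
            continue
        if not from_macro:
            toplevel = word        # only this one macro is remembered
        elif word == toplevel:
            raise MacroRecursion(word)
        steps += 1
        if steps > max_steps:
            raise ExpansionLimit(word)
        # New macro text is prepended to the remaining macro text.
        pending = macros[word].split() + pending
    return output


if __name__ == "__main__":
    for source, macros in (
        (["a"], {"a": "a"}),
        (["a"], {"a": "b", "b": "a"}),
        (["m"], {"a": "a", "m": "a"}),
    ):
        try:
            print(expand(source, macros))
        except (MacroRecursion, ExpansionLimit) as exc:
            print(type(exc).__name__, exc)
```

In the third case the remembered toplevel macro is m, so the self-referencing a is never compared against anything and keeps re-expanding until the model's step limit is hit.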