所以,我取了一些生成的lex代码并使其独立。
我希望,我可以用C++代码,虽然我认识到
c
只是。嗯,这只关系到你的生活,不是这样的
std::string
.)
scanC.l
:
%{
#include <iostream>
#include <string>
#ifdef _WIN32
/// disables #include <unistd.h>
#define YY_NO_UNISTD_H
#endif // _WIN32
// buffer for collected C/C++ code
static std::string cCode;
// counter for braces
static int nBraces = 0;
%}
/* Options */
/* make never interactive (prevent usage of certain C functions) */
%option never-interactive
/* force lexer to process 8 bit ASCIIs (unsigned characters) */
%option 8bit
/* prevent usage of yywrap */
%option noyywrap
EOL ("\n"|"\r"|"\r\n")
SPC ([ \t]|"\\"{EOL})*
LITERAL "\""("\\".|[^\\"])*"\""
%s CODE
%%
<INITIAL>"{" { cCode = '{'; nBraces = 1; BEGIN(CODE); }
<INITIAL>. |
<INITIAL>{EOL} { std::cout << yytext; }
<INITIAL><<EOF>> { return 0; }
<CODE>"{" {
cCode += '{'; ++nBraces;
//updateFilePos(yytext, yyleng);
} break;
<CODE>"}" {
cCode += '}'; //updateFilePos(yytext, yyleng);
if (!--nBraces) {
BEGIN(INITIAL);
//return new Token(filePosCCode, Token::TkCCode, cCode.c_str());
std::cout << '\n'
<< "Embedded C code:\n"
<< cCode << "// End of embedded C code\n";
}
} break;
<CODE>"/*" { // C comments
cCode += "/*"; //_filePosCComment = _filePos;
//updateFilePos(yytext, yyleng);
char c1 = ' ';
do {
char c0 = c1; c1 = yyinput();
switch (c1) {
case '\r': break;
case '\n':
cCode += '\n'; //updateFilePos(&c1, 1);
break;
default:
if (c0 == '\r' && c1 != '\n') {
c0 = '\n'; cCode += '\n'; //updateFilePos(&c0, 1);
} else {
cCode += c1; //updateFilePos(&c1, 1);
}
}
if (c0 == '*' && c1 == '/') break;
} while (c1 != EOF);
if (c1 == EOF) {
//ErrorFile error(_filePosCComment, "'/*' without '*/'!");
//throw ErrorFilePrematureEOF(_filePos);
std::cerr << "ERROR! '/*' without '*/'!\n";
return -1;
}
} break;
<CODE>"//"[^\r\n]* | /* C++ one-line comments */
<CODE>"'"("\\".|[^\\'])+"'" | /*"/* C/C++ character constants */
<CODE>{LITERAL} | /* C/C++ string constants */
<CODE>"#"[^\r\n]* | /* preprocessor commands */
<CODE>[ \t]+ | /* non-empty white space */
<CODE>[^\r\n] { // any other character except EOL
cCode += yytext;
//updateFilePos(yytext, yyleng);
} break;
<CODE>{EOL} { // special handling for EOL
cCode += '\n';
//updateFilePos(yytext, yyleng);
} break;
<CODE><<EOF>> { // premature EOF
//ErrorFile error(_filePosCCode,
// compose("%1 '{' without '}'!", _nBraces));
//_errorManager.add(error);
//throw ErrorFilePrematureEOF(_filePos);
std::cerr << "ERROR! Premature end of input. (Not enough '}'s.)\n";
}
%%
int main(int argc, char **argv)
{
return yylex();
}
要扫描的示例文本
scanC.txt
Hello juhist.
The text without braces doesn't need to have any syntax.
It just echoes the characters until it finds a block:
{ // the start of C code
// a C++ comment
/* a C comment
* (Remember that nested /*s are not supported.)
*/
#define MAX 1024
static char buffer[MAX] = "", empty="\"\"";
/* It is important that tokens are recognized to a limited amount.
* Otherwise, it would be too easy to fool the scanner with }}}
* where they have no meaning.
*/
char *theSameForStringConstants = "}}}";
char *andCharConstants = '}}}';
int main() { return yylex(); }
}
This code should be just copied
(with a remark that the scanner recognized the C code a such.)
Greetings, Scheff.
编译和测试
cygwin64
$ flex --version
flex 2.6.4
$ flex -o scanC.cc scanC.l
$ g++ --version
g++ (GCC) 7.3.0
$ g++ -std=c++11 -o scanC scanC.cc
$ ./scanC < scanC.txt
Hello juhist.
The text without braces doesn't need to have any syntax.
It just echoes the characters until it finds a block:
Embedded C code:
{ // the start of C code
// a C++ comment
/* a C comment
* (Remember that nested /*s are not supported.)
*/
#define MAX 1024
static char buffer[MAX] = "", empty="\"\"";
/* It is important that tokens are recognized to a limited amount.
* Otherwise, it would be too easy to fool the scanner with }}}
* where they have no meaning.
*/
char *theSameForStringConstants = "}}}";
char *andCharConstants = '}}}';
int main() { return yylex(); }
}// End of embedded C code
This code should be just copied
(with a remark that the scanner recognized the C code a such.)
Greetings, Scheff.
$
笔记:
-
这是从一个助手工具(不是为了销售)。因此,这不是防弹的,但对于高效代码来说已经足够了。
-
-
用宏与非平衡宏的创造性组合来愚弄这个工具当然是可能的
{
}
所以,这至少是进一步发展的开始。
为了对照C lex规范检查这一点,我
ANSI C grammar, Lex specification