[ANTLR]: A few questions


Posted on 22-02-2008 at 10:09:58

Hello,

I'm new to ANTLR and I'm counting on you to clear up a few sticking points...

My goal is to parse C files and extract their typedefs and structures...

So I took the C.g grammar from ANTLR; you can find it at the end of this message.

I'm working in C#.

[B][U]Question 1:[/U][/B]
I get a few warnings when generating the Lexer and Parser files. Here are the logs:

Code:
ANTLR Parser Generator  Version 3.0 (May 17, 2007)  1989-2007
warning(200): C.g:468:38: Decision can match input such as "'else'" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): C.g:517:4: Decision can match input such as "{'U', 'u'}{'L', 'l'}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): C.g:522:9: Decision can match input such as "'0'..'9'{'E', 'e'}{'+', '-'}'0'..'9'{'D', 'F', 'd', 'f'}" using multiple alternatives: 3, 4
As a result, alternative(s) 4 were disabled for that input


 
Does anyone have an idea?


[B][U]Question 2:[/U][/B]
At each iteration, I'd like to print the string the parser is currently working on. How can I do that?
I need to add a {Console.WriteLine()} action to the grammar, but where, and with what argument?

[B][U]Question 3:[/U][/B]
Could someone explain the following two excerpts to me? I'm out of my depth here...   :cry:
[U]Excerpt 1:[/U]

Code:
translation_unit
    : external_declaration+
    ;

external_declaration
options {k=1;}
    : ( declaration_specifiers? declarator declaration* '{' )=> function_definition
    | declaration
    ;


 
[U]Excerpt 2:[/U]
Here the ? operators start to appear and I no longer follow what the rule is really doing...

Code:
declaration
scope {
  bool isTypedef;
}
@init {
  $declaration::isTypedef = false;
}
    : 'typedef' declaration_specifiers? {$declaration::isTypedef=true;}
      init_declarator_list ';' // special case, looking for typedef
    | declaration_specifiers init_declarator_list? ';'
    ;

declaration_specifiers
    :   (   storage_class_specifier
        |   type_specifier
        |   type_qualifier
        )+
    ;

init_declarator_list
    : init_declarator (',' init_declarator)*
    ;


 
 
Thanks a lot!

Pascal

[U]Appendix:[/U]

Grammar used:

Code:
/** ANSI C ANTLR v3 grammar

Translated from Jutta Degener's 1995 ANSI C yacc grammar by Terence Parr
July 2006.  The lexical rules were taken from the Java grammar.

Jutta says: "In 1985, Jeff Lee published his Yacc grammar (which
is accompanied by a matching Lex specification) for the April 30, 1985 draft
version of the ANSI C standard.  Tom Stockfisch reposted it to net.sources in
1987; that original, as mentioned in the answer to question 17.25 of the
comp.lang.c FAQ, can be ftp'ed from ftp.uu.net,
   file usenet/net.sources/ansi.c.grammar.Z.
I intend to keep this version as close to the current C Standard grammar as
possible; please let me know if you discover discrepancies. Jutta Degener, 1995"

Generally speaking, you need symbol table info to parse C; typedefs
define types and then IDENTIFIERS are either types or plain IDs.  I'm doing
the min necessary here tracking only type names.  This is a good example
of the use of the global scope (called Symbols).  Every rule that declares its usage
of Symbols pushes a new copy on the stack effectively creating a new
symbol scope.  Also note rule declaration declares a rule scope that
lets any invoked rule see isTypedef boolean.  It's much easier than
passing that info down as parameters.  Very clean.  Rule
direct_declarator can then easily determine whether the IDENTIFIER
should be declared as a type name.

I have only tested this on a single file, though it is 3500 lines.

This grammar requires ANTLR v3 (3.0b3 or higher)

Terence Parr
July 2006

ANTLR C# version - Kunle Odutola, November 2006
*/
grammar C;

options {
    language=CSharp;
    backtrack=true;
    memoize=true;
    k=2;
}

scope Symbols {
    IDictionary types;
}

@header {
}

@members {
bool isTypeName(string name)
{
    for (int i = Symbols_stack.Count-1; i>=0; i--)
    {
        Symbols_scope scope = (Symbols_scope)Symbols_stack[i];
        if ( scope.types.Contains(name) )
        {
            return true;
        }
    }
    return false;
}

bool isfinStruct(string token)
{
    if (token.ToString() == "}" )
    {
        return true;
    }
    return false;
}
}

translation_unit
scope Symbols; // entire file is a scope
@init {
  $Symbols::types = new Hashtable();
}
    : external_declaration+
    ;

/** Either a function definition or any other kind of C decl/def.
 *  The LL(*) analysis algorithm fails to deal with this due to
 *  recursion in the declarator rules.  I'm putting in a
 *  manual predicate here so that we don't backtrack over
 *  the entire function.  Further, you get a better error
 *  as errors within the function itself don't make it fail
 *  to predict that it's a function.  Weird errors previously.
 *  Remember: the goal is to avoid backtrack like the plague
 *  because it makes debugging, actions, and errors harder.
 *
 *  Note that k=1 results in a much smaller predictor for the
 *  fixed lookahead; k=2 made a few extra thousand lines. ;)
 *  I'll have to optimize that in the future.
 */
external_declaration
options {k=1;}
    : ( declaration_specifiers? declarator declaration* '{' )=> function_definition
    | declaration
    ;

function_definition
scope Symbols; // put parameters and locals into same scope for now
@init {
  $Symbols::types = new Hashtable();
}
    : declaration_specifiers? declarator
      ( declaration+ compound_statement // K&R style
      | compound_statement              // ANSI style
      )
    ;

declaration
scope {
  bool isTypedef;
}
@init {
  $declaration::isTypedef = false;
}
    : 'typedef' declaration_specifiers? {$declaration::isTypedef=true;}
      init_declarator_list ';' // special case, looking for typedef
    | declaration_specifiers init_declarator_list? ';'
    ;

declaration_specifiers
    :   (   storage_class_specifier
        |   type_specifier
        |   type_qualifier
        )+
    ;

init_declarator_list
    : init_declarator (',' init_declarator)*
    ;

init_declarator
    : declarator ('=' initializer)?
    ;

storage_class_specifier
    : 'extern'
    | 'static'
    | 'auto'
    | 'register'
    ;

type_specifier
    : 'void'
    | 'char'
    | 'short'
    | 'int'
    | 'long'
    | 'float'
    | 'double'
    | 'signed'
    | 'unsigned'
    | struct_or_union_specifier
    | enum_specifier
    | type_id
    ;

type_id
    :   {isTypeName(input.LT(1).Text)}? IDENTIFIER
        {Console.Out.WriteLine("\t" + input.LT(-1).Text + " " + input.LT(1).Text);}
    ;

struct_or_union_specifier
options {k=3;}
scope Symbols; // structs are scopes
@init {
  $Symbols::types = new Hashtable();
}
    : struct_or_union IDENTIFIER? '{' struct_declaration_list '}'
    | struct_or_union IDENTIFIER
    ;

struct_or_union
    : 'struct' {Console.Out.WriteLine("\r\nStruct\r\n{" );}
    | 'union' {Console.Out.WriteLine("\r\nUnion\r\n{" );}
    ;

struct_declaration_list
    : struct_declaration+
    ;

struct_declaration
    : specifier_qualifier_list struct_declarator_list ';'
    ;

specifier_qualifier_list
    : ( type_qualifier | type_specifier )+
    ;

struct_declarator_list
    : struct_declarator (',' struct_declarator)*
    ;

struct_declarator
    : declarator (':' constant_expression)?
    | ':' constant_expression
    ;

enum_specifier
options {k=3;}
    : 'enum' '{' enumerator_list '}'
    | 'enum' IDENTIFIER '{' enumerator_list '}'
    | 'enum' IDENTIFIER
    ;

enumerator_list
    : enumerator (',' enumerator)*
    ;

enumerator
    : IDENTIFIER ('=' constant_expression)?
    ;

type_qualifier
    : 'const'
    | 'volatile'
    ;

declarator
    : pointer? direct_declarator
    | pointer
    ;

direct_declarator
    :   ( IDENTIFIER
            {
            if ($declaration.Count>0 && $declaration::isTypedef) {
                $Symbols::types[$IDENTIFIER.Text] = $IDENTIFIER.Text;
                Console.Out.WriteLine("using " + input.LT(-1).Text + " = System." + input.LT(-2).Text + ";" );
            }
            }
        | '(' declarator ')'
        )
        declarator_suffix*
    ;

declarator_suffix
    :   '[' constant_expression ']'
    |   '[' ']'
    |   '(' parameter_type_list ')'
    |   '(' identifier_list ')'
    |   '(' ')'
    ;

pointer
    : '*' type_qualifier+ pointer?
    | '*' pointer
    | '*'
    ;

parameter_type_list
    : parameter_list (',' '...')?
    ;

parameter_list
    : parameter_declaration (',' parameter_declaration)*
    ;

parameter_declaration
    : declaration_specifiers (declarator|abstract_declarator)*
    ;

identifier_list
    : IDENTIFIER (',' IDENTIFIER)*
    ;

type_name
    : specifier_qualifier_list abstract_declarator?
    ;

abstract_declarator
    : pointer direct_abstract_declarator?
    | direct_abstract_declarator
    ;

direct_abstract_declarator
    : ( '(' abstract_declarator ')' | abstract_declarator_suffix ) abstract_declarator_suffix*
    ;

abstract_declarator_suffix
    : '[' ']'
    | '[' constant_expression ']'
    | '(' ')'
    | '(' parameter_type_list ')'
    ;

initializer
    : assignment_expression
    | '{' initializer_list ','? '}'
    ;

initializer_list
    : initializer (',' initializer)*
    ;

// E x p r e s s i o n s

argument_expression_list
    :   assignment_expression (',' assignment_expression)*
    ;

additive_expression
    : (multiplicative_expression) ('+' multiplicative_expression | '-' multiplicative_expression)*
    ;

multiplicative_expression
    : (cast_expression) ('*' cast_expression | '/' cast_expression | '%' cast_expression)*
    ;

cast_expression
    : '(' type_name ')' cast_expression
    | unary_expression
    ;

unary_expression
    : postfix_expression
    | '++' unary_expression
    | '--' unary_expression
    | unary_operator cast_expression
    | 'sizeof' unary_expression
    | 'sizeof' '(' type_name ')'
    ;

postfix_expression
    :   primary_expression
        (   '[' expression ']'
        |   '(' ')'
        |   '(' argument_expression_list ')'
        |   '.' IDENTIFIER
        |   '*' IDENTIFIER
        |   '->' IDENTIFIER
        |   '++'
        |   '--'
        )*
    ;

unary_operator
    : '&'
    | '*'
    | '+'
    | '-'
    | '~'
    | '!'
    ;

primary_expression
    : IDENTIFIER
    | constant
    | '(' expression ')'
    ;

constant
    :   HEX_LITERAL
    |   OCTAL_LITERAL
    |   DECIMAL_LITERAL
    |   CHARACTER_LITERAL
    |   STRING_LITERAL
    |   FLOATING_POINT_LITERAL
    ;

/////

expression
    : assignment_expression (',' assignment_expression)*
    ;

constant_expression
    : conditional_expression
    ;

assignment_expression
    : lvalue assignment_operator assignment_expression
    | conditional_expression
    ;

lvalue
    : unary_expression
    ;

assignment_operator
    : '='
    | '*='
    | '/='
    | '%='
    | '+='
    | '-='
    | '<<='
    | '>>='
    | '&='
    | '^='
    | '|='
    ;

conditional_expression
    : logical_or_expression ('?' expression ':' conditional_expression)?
    ;

logical_or_expression
    : logical_and_expression ('||' logical_and_expression)*
    ;

logical_and_expression
    : inclusive_or_expression ('&&' inclusive_or_expression)*
    ;

inclusive_or_expression
    : exclusive_or_expression ('|' exclusive_or_expression)*
    ;

exclusive_or_expression
    : and_expression ('^' and_expression)*
    ;

and_expression
    : equality_expression ('&' equality_expression)*
    ;

equality_expression
    : relational_expression (('=='|'!=') relational_expression)*
    ;

relational_expression
    : shift_expression (('<'|'>'|'<='|'>=') shift_expression)*
    ;

shift_expression
    : additive_expression (('<<'|'>>') additive_expression)*
    ;

// S t a t e m e n t s

statement
    : labeled_statement
    | compound_statement
    | expression_statement
    | selection_statement
    | iteration_statement
    | jump_statement
    ;

labeled_statement
    : IDENTIFIER ':' statement
    | 'case' constant_expression ':' statement
    | 'default' ':' statement
    ;

compound_statement
scope Symbols; // blocks have a scope of symbols
@init {
  $Symbols::types = new Hashtable();
}
    : '{' declaration* statement_list? '}'
    ;

statement_list
    : statement+
    ;

expression_statement
    : ';'
    | expression ';'
    ;

selection_statement
    : 'if' '(' expression ')' statement (options {k=1; backtrack=false;}:'else' statement)?
    | 'switch' '(' expression ')' statement
    ;

iteration_statement
    : 'while' '(' expression ')' statement
    | 'do' statement 'while' '(' expression ')' ';'
    | 'for' '(' expression_statement expression_statement expression? ')' statement
    ;

jump_statement
    : 'goto' IDENTIFIER ';'
    | 'continue' ';'
    | 'break' ';'
    | 'return' ';'
    | 'return' expression ';'
    ;

IDENTIFIER
    : LETTER (LETTER|'0'..'9')*
    ;

fragment
LETTER
    : '$'
    | 'A'..'Z'
    | 'a'..'z'
    | '_'
    ;

CHARACTER_LITERAL
    :   '\'' ( EscapeSequence | ~('\''|'\\') ) '\''
    ;

STRING_LITERAL
    :  '"' ( EscapeSequence | ~('\\'|'"') )* '"'
    ;

HEX_LITERAL : '0' ('x'|'X') HexDigit+ IntegerTypeSuffix? ;

DECIMAL_LITERAL : ('0' | '1'..'9' '0'..'9'*) IntegerTypeSuffix? ;

OCTAL_LITERAL : '0' ('0'..'7')+ IntegerTypeSuffix? ;

fragment
HexDigit : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
IntegerTypeSuffix
    : ('u'|'U')? ('l'|'L')
    | ('u'|'U')  ('l'|'L')?
    ;

FLOATING_POINT_LITERAL
    :   ('0'..'9')+ '.' ('0'..'9')* Exponent? FloatTypeSuffix?
    |   '.' ('0'..'9')+ Exponent? FloatTypeSuffix?
    |   ('0'..'9')+ Exponent FloatTypeSuffix?
    |   ('0'..'9')+ Exponent? FloatTypeSuffix
    ;

fragment
Exponent : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

fragment
FloatTypeSuffix : ('f'|'F'|'d'|'D') ;

fragment
EscapeSequence
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   OctalEscape
    ;

fragment
OctalEscape
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UnicodeEscape
    :   '\\' 'u' HexDigit HexDigit HexDigit HexDigit
    ;

WS  :  (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;}
    ;

COMMENT
    :   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

LINE_COMMENT
    : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    ;

// ignore #line info for now
LINE_COMMAND
    : '#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    ;
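The comment above external_declaration in the grammar describes a manual syntactic predicate: the parenthesised fragment before => is matched speculatively, the input is rewound, and the first alternative is chosen only if that trial match succeeded. The following Java sketch imitates this behaviour by hand on a toy token list. It is only an illustration under simplifying assumptions: the class SynPredSketch, its helper methods and the reduced "function header" shape ('int'? name '(' ')' '{') are hypothetical and are not code generated by ANTLR.

```java
import java.util.Arrays;
import java.util.List;

class SynPredSketch {
    private final List<String> tokens;
    private int pos = 0;

    SynPredSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    // Consume the next token if it equals the expected text.
    private boolean accept(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equals(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    // Consume the next token if it looks like an identifier.
    private boolean acceptIdentifier() {
        if (pos < tokens.size() && tokens.get(pos).matches("[A-Za-z_]\\w*")) {
            pos++;
            return true;
        }
        return false;
    }

    // Toy stand-in for: ( declaration_specifiers? declarator declaration* '{' )=>
    // Here a function header is reduced to: 'int'? name '(' ')' '{'
    boolean looksLikeFunction() {
        int mark = pos;        // remember where the trial match started
        accept("int");         // optional specifier, result deliberately ignored
        boolean ok = acceptIdentifier() && accept("(") && accept(")") && accept("{");
        pos = mark;            // rewind: the predicate itself consumes nothing
        return ok;
    }

    // Mirrors the two alternatives of external_declaration.
    String externalDeclaration() {
        return looksLikeFunction() ? "function_definition" : "declaration";
    }

    public static void main(String[] args) {
        System.out.println(new SynPredSketch("int", "main", "(", ")", "{", "}").externalDeclaration());
        System.out.println(new SynPredSketch("int", "x", ";").externalDeclaration());
    }
}
```

The point of the predicate is that only the header up to the opening brace is tried speculatively; the function body is then parsed once, normally, which is what the grammar comment means by not backtracking over the entire function.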


Posted on 25-02-2008 at 15:20:10

Anyone?


Posted on 02-03-2008 at 17:45:50

The messages seem to indicate that the grammar is not fully unambiguous: in the cases listed, the same input can be matched by more than one alternative, so ANTLR disables one of them.
As for fixing that, good luck... but there is a good chance it won't affect what you want to do.
For the rest, well, there is no way around diving into the copious documentation [:spamafote]
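To make the second warning (C.g:517:4) concrete, here is a minimal Java sketch, independent of ANTLR, that transcribes the two alternatives of IntegerTypeSuffix as regular expressions and lists which of them accept a given suffix. The class name SuffixAmbiguity, the helper matchingAlternatives and the REWRITTEN variant are assumptions made for illustration; the rewrite is one possible way to remove the overlap, not a fix taken from the ANTLR distribution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SuffixAmbiguity {
    // The two alternatives of IntegerTypeSuffix from the grammar, as regexes.
    static final Pattern[] ORIGINAL = {
        Pattern.compile("[uU]?[lL]"),   // alt 1: ('u'|'U')? ('l'|'L')
        Pattern.compile("[uU][lL]?")    // alt 2: ('u'|'U')  ('l'|'L')?
    };

    // Hypothetical rewrite whose alternatives no longer overlap.
    static final Pattern[] REWRITTEN = {
        Pattern.compile("[uU][lL]?"),   // ('u'|'U') ('l'|'L')?
        Pattern.compile("[lL]")         // ('l'|'L')
    };

    /** Returns the 1-based numbers of the alternatives matching the whole input. */
    static List<Integer> matchingAlternatives(Pattern[] alts, String input) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < alts.length; i++) {
            if (alts[i].matcher(input).matches()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"u", "l", "UL"}) {
            System.out.println(s + " original=" + matchingAlternatives(ORIGINAL, s)
                    + " rewritten=" + matchingAlternatives(REWRITTEN, s));
        }
        // The last line printed is: UL original=[1, 2] rewritten=[1]
    }
}
```

The suffix "UL" fits both original alternatives, which is exactly the input the warning quotes. The same reasoning applies to alternatives 3 and 4 of FLOATING_POINT_LITERAL, where a number with both an exponent and a type suffix fits both. The 'else' warning is the classic dangling-else case: disabling alternative 2 means the else is attached to the nearest if, which is the usual C behaviour.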
 
Briefly, the syntax for declaring a symbolic type (a rule) is roughly:

type : tokens { associated code fragment, optional; } ;

bearing in mind that it can be spread over several lines, as in

type :
   (token1 | token2) { ... }
   ;
The associated code fragment is written in the target language (Java by default; C# here, since the grammar sets language=CSharp) and can use ANTLR attribute references, which start with $.
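As a rough picture of what such a rule turns into, the Java sketch below hand-writes a recursive-descent method for a rule of the form kind : ('struct' | 'union') { action } ;. The class RuleWithAction, the method kind and the log list are hypothetical names chosen for this illustration; real ANTLR output is considerably more elaborate, but the principle is the same: the rule becomes a method, and the action is executed right after the preceding element has been matched.

```java
import java.util.ArrayList;
import java.util.List;

class RuleWithAction {
    private final String[] tokens;
    private int pos = 0;
    final List<String> log = new ArrayList<>(); // what the embedded action recorded

    RuleWithAction(String... tokens) {
        this.tokens = tokens;
    }

    // Hand-written equivalent of:
    //   kind : ('struct' | 'union') { record the text just matched } ;
    void kind() {
        if (pos >= tokens.length) {
            throw new IllegalStateException("unexpected end of input");
        }
        String t = tokens[pos];
        if (t.equals("struct") || t.equals("union")) {
            pos++;                     // match the token
            log.add("matched " + t);   // embedded action, run right after the match
        } else {
            throw new IllegalStateException("unexpected token: " + t);
        }
    }

    public static void main(String[] args) {
        RuleWithAction parser = new RuleWithAction("union", "U");
        parser.kind();
        System.out.println(parser.log); // prints [matched union]
    }
}
```

This also suggests an answer to Question 2: an action placed after an element can print the text of the token just consumed, which the grammar in the first post already does with input.LT(-1).Text in the type_id rule.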

