The representation of terminal symbols (tokens) is not defined
by the <i>Accent</i> specification. An <i>Accent</i> parser
cooperates with a lexical scanner that converts the source text into
a sequence of tokens. This scanner is implemented by a function
<tt>yylex()</tt> that reads the next token and returns a value
representing the kind of the token.
<h3>The Kind of a Token</h3>
The kind of a token is indicated by a number.
<p>
A terminal symbol denoted by a literal in the <i>Accent</i> specification,
e.g. <tt>'+'</tt>, is represented by the numerical value of the character.
So <tt>yylex()</tt> returns this value if it has recognized this literal:
<pre>
return '+';
</pre>
A terminal symbol denoted by a symbolic name declared
in the token declaration part of the <i>Accent</i> specification,
e.g. <tt>NUMBER</tt>, is represented by a constant with a symbolic name
that is the same as the token name. So <tt>yylex</tt> returns
this constant:
<pre>
return NUMBER;
</pre>
The definitions of these constants are generated by <i>Accent</i>
and are contained in the generated file <tt>yygrammar.h</tt>.
Hence the file that defines <tt>yylex</tt> should include this file:
<pre>
#include "yygrammar.h"
</pre>
<h3>The Attribute of a Token</h3>
Besides having a kind (e.g. <tt>NUMBER</tt>)
a token can also be augmented with a semantic attribute.
The function <tt>yylex</tt>
assigns this attribute value to the variable <tt>yylval</tt>.
For example
<pre>
yylval = atoi(yytext);
</pre>
(here <tt>yytext</tt> is the text of the token that has been recognized
as a <tt>NUMBER</tt>; the function <tt>atoi()</tt> converts this
string into a numerical value).
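<p>
To make the interplay of token kind and attribute concrete, here is a minimal
sketch of a hand-written <tt>yylex()</tt> that scans an in-memory string.
It is only an illustration under stated assumptions: the value <tt>257</tt> for
<tt>NUMBER</tt>, the local definition of <tt>yylval</tt>, and the helper
<tt>set_scan_input()</tt> are stand-ins invented for this example; in a real
build the token codes and <tt>yylval</tt> come from the generated files.
The sketch returns <tt>0</tt> at the end of the input, as a
<i>Lex</i>-generated scanner does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Stand-ins for what the generated files provide in a real build.
   The value 257 is an assumption made for this sketch only. */
#define NUMBER 257
#ifndef YYSTYPE
#define YYSTYPE long
#endif
YYSTYPE yylval;

/* Hypothetical input source: the scanner reads from this string. */
static const char *scan_input = "";

/* Hypothetical helper that selects the text to be scanned. */
void set_scan_input(const char *text)
{
    assert(text != NULL);
    scan_input = text;
}

int yylex(void)
{
    /* Skip blanks between tokens. */
    while (*scan_input != '\0' && isspace((unsigned char)*scan_input))
        scan_input++;

    if (*scan_input == '\0')
        return 0;                     /* end of input */

    if (isdigit((unsigned char)*scan_input)) {
        char *end;
        yylval = strtol(scan_input, &end, 10);   /* attribute of the token */
        scan_input = end;
        return NUMBER;                /* kind: symbolic token */
    }

    /* Literal token: the kind is the numerical value of the character. */
    return (unsigned char)*scan_input++;
}
```

For the input <tt>12 + 7</tt>, successive calls yield <tt>NUMBER</tt> (with
<tt>yylval</tt> set to 12), <tt>'+'</tt>, <tt>NUMBER</tt> (with <tt>yylval</tt>
set to 7), and finally <tt>0</tt>.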
<p>
The variable <tt>yylval</tt> is defined
in the generated file <tt>yygrammar.c</tt>.
An <tt>extern</tt> declaration for this variable
is provided in the generated file <tt>yygrammar.h</tt>.
<p>
<tt>yylval</tt> has type <tt>YYSTYPE</tt>.
By default, <i>Accent</i> defines this
in the file <tt>yygrammar.h</tt> as a macro standing for <tt>long</tt>:
<pre>
#ifndef YYSTYPE
#define YYSTYPE long
#endif
</pre>
Users can define their own type before including the file
<tt>yygrammar.h</tt>.
For example, a file <tt>yystype.h</tt> may define
<pre>
typedef union {
int intval;
float floatval;
} ATTRIBUTE;
#define YYSTYPE ATTRIBUTE
</pre>
Now the file defining <tt>yylex()</tt> includes two header files:
<pre>
#include "yystype.h"
#include "yygrammar.h"
</pre>
and defines the semantic attribute by:
<pre>
yylval.intval = atoi(yytext);
</pre>
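<p>
A union does not record which of its members was stored last, so the kind of
the token has to tell the parser which member to read. The following sketch
shows one way this might look. The token <tt>REAL</tt>, the numeric token
codes, the local <tt>yylval</tt>, and the helper <tt>classify_number()</tt>
are hypothetical names introduced for this illustration only; they are not
part of <i>Accent</i>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the yystype.h example above. */
typedef union {
    int intval;
    float floatval;
} ATTRIBUTE;
#define YYSTYPE ATTRIBUTE

/* Stand-ins for the generated token codes; the values and the
   token REAL are assumptions made for this sketch. */
#define NUMBER 257
#define REAL   258
YYSTYPE yylval;

/* Hypothetical helper: stores the attribute in the matching union
   member and returns the kind that tells the parser which one to read. */
int classify_number(const char *text)
{
    assert(text != NULL);
    if (strchr(text, '.') != NULL) {
        yylval.floatval = strtof(text, NULL);
        return REAL;
    }
    yylval.intval = atoi(text);
    return NUMBER;
}
```

A scanner action could then simply <tt>return classify_number(yytext);</tt>,
and a grammar rule for <tt>REAL</tt> would read <tt>floatval</tt> while a rule
for <tt>NUMBER</tt> reads <tt>intval</tt>.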
<h3>The <i>Lex</i> Specification</h3>
The function <tt>yylex</tt> can be generated by the scanner generator
<i>Lex</i> (or the GNU implementation <i>Flex</i>).
<p>
The <a href="http://dinosaur.compilertools.net"><i>Lex & Yacc Page</i></a>
has online documentation for <i>Lex</i> and <i>Flex</i>.
<p>
A <i>Lex</i> specification gives rules that define for each token how it
is represented and how it is processed.
A rule has the form
<pre>
pattern { action }
</pre>
<tt>pattern</tt> is a regular expression
that specifies the representation of the token.
<p>
<tt>action</tt> is <i>C</i> code that specifies how the token is processed.
This code sets the attribute value and returns the kind of the token.
<p>
For example, here is a rule for the token <tt>NUMBER</tt>:
<pre>
[0-9]+ { yylval.intval = atoi(yytext); return NUMBER; }
</pre>
The <i>Lex</i> specification starts with a definition section
which can be used to import header files and to declare variables.
For example,
<pre>
%{
#include "yystype.h"
#include "yygrammar.h"
%}
%%
</pre>
Here the section imports <tt>yystype.h</tt> to provide a user specific
definition of <tt>YYSTYPE</tt> and <tt>yygrammar.h</tt>
that defines the token codes.
The <tt>%%</tt> separates this section from the rules part.
<h3>The <i>Accent</i> Specification</h3>
In the <i>Accent</i> specification, tokens are introduced in the token
declaration part.
<p>
For example
<pre>
%token NUMBER;
</pre>
introduces a token with name <tt>NUMBER</tt>.
<p>
Inside a rule the token can be used with a parameter,
for example
<pre>
NUMBER<x>
</pre>
This parameter can then be used in actions to access the attribute of the token.
It is of type <tt>YYSTYPE</tt>.
<pre>
Value : NUMBER<x> { printf("%d", x.intval); } ;
</pre>
or simply
<pre>
Value : NUMBER<x> { printf("%ld", x); } ;
</pre>
if there is no user specific definition of <tt>YYSTYPE</tt>.
<p>
Unlike the <i>Lex</i> specification, the <i>Accent</i> specification
does not include <tt>yygrammar.h</tt>.
If the user supplies their own definition of <tt>YYSTYPE</tt>,
it has to be made available in the global prelude part, e.g.
<pre>
%prelude {
#include "yystype.h"
}
</pre>
<h3>Tracking the Source Position</h3>
Besides <tt>yylval</tt>, which holds the attribute of a token,
there is a further variable, <tt>yypos</tt>, that holds the source position
of the token.
<p>
<tt>yypos</tt> is defined in the <i>Accent</i> runtime
as a global variable of type <tt>long</tt>;
other files can access it through an <tt>extern</tt> declaration.
Its initial value is <tt>1</tt>.
<p>
This variable can be set in rules of the <i>Lex</i> specification.
For example,
<pre>
\n { yypos++; /* adjust linenumber and skip newline */ }
</pre>
Whenever a newline character is seen, <tt>yypos</tt> is incremented,
so it holds the current line number.
<p>
The variable <tt>yypos</tt> is managed in such a way that
it holds the correct value when <tt>yyerror</tt> is invoked to
report a syntax error
(even though, due to lookahead, the next token has already been read).
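<p>
Tokens that span several lines, such as block comments, need more than a
single increment, and an error report typically wants to print the line
number. The following sketch illustrates both under stated assumptions:
<tt>yypos</tt> is defined locally as a stand-in for the runtime variable, and
<tt>count_lines()</tt> and <tt>format_error()</tt> are hypothetical helpers
that are not part of <i>Accent</i>. Since <tt>yypos</tt> is a <tt>long</tt>,
it is printed with <tt>%ld</tt>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the variable provided by the Accent runtime. */
long yypos = 1;

/* Hypothetical helper: advance yypos by the number of newline
   characters contained in the text of a matched token. */
void count_lines(const char *text)
{
    assert(text != NULL);
    for (; *text != '\0'; text++)
        if (*text == '\n')
            yypos++;
}

/* Hypothetical helper: write "line N: msg" into buf and return
   the result of snprintf. */
int format_error(char *buf, size_t size, const char *msg)
{
    assert(buf != NULL && msg != NULL);
    return snprintf(buf, size, "line %ld: %s", yypos, msg);
}
```

A <i>Lex</i> action for a multi-line token could call
<tt>count_lines(yytext)</tt>, and a user-supplied <tt>yyerror</tt> could print
the buffer filled by <tt>format_error()</tt>.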
<p>
It also holds the correct value when semantic actions are executed
(note that this happens after lexical analysis and parsing).
Hence it can be used inside semantic actions,
for example
<pre>
value:
NUMBER<n> { printf("value in line %ld is %ld\n", yypos, n); }
;
</pre> <!--- end main content -->
<br>
<br>
<font face="helvetica" size="1">
<a href="http://accent.compilertools.net">accent.compilertools.net</a>
</font>
</TD>
</TR>
</TABLE>
</BODY>
</HTML>