[antlr-interest] How to recognize a String

The Researcher researcher0x00 at gmail.com
Mon Sep 12 09:17:07 PDT 2011


On Mon, Sep 12, 2011 at 9:45 AM, Gabriel Miro <gsmiro at gmail.com> wrote:

> Eric,
>
> Here's the grammar I'm using:
>
> grammar test;
>
> rule    :    STRING;
>
> RPAREN    :    ')';
> LPAREN    :    '(';
>
> WS  :   ( ' '
>        | '\t'
>        | '\r'
>        | '\n'
>        ) {$channel=HIDDEN;}
>    ;
>
> CHAR:  '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
>    ;
>
> STRING
>    :  '\'' ( ESC_SEQ | ~('\\'|'\'') )* '\''
>    ;
>
> fragment
> HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
>
> fragment
> ESC_SEQ
>    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
>    |   UNICODE_ESC
>    |   OCTAL_ESC
>    ;
>
> fragment
> OCTAL_ESC
>    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
>    |   '\\' ('0'..'7') ('0'..'7')
>    |   '\\' ('0'..'7')
>    ;
>
> fragment
> UNICODE_ESC
>    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
>    ;
>
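
One thing that stands out in the grammar: CHAR and STRING overlap. Any quoted
text containing exactly one character fits both rules, and as far as I know
ANTLR 3 gives the token to the rule listed first when two lexer rules match
the same text. Here that would be CHAR, so the parser would never see a
STRING for such input. The Java sketch below only approximates the two rules
with regular expressions to show which inputs fit both. It is not what ANTLR
generates: escape sequences are simplified to "backslash plus any character",
and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Regex approximations of the CHAR and STRING rules from the quoted grammar.
// Assumption: escape sequences are simplified to "backslash + any character".
class TokenOverlapSketch {
    // Exactly one escape or ordinary character between single quotes.
    static final Pattern CHAR = Pattern.compile("'(\\\\.|[^'\\\\])'");
    // Zero or more escapes or ordinary characters between single quotes.
    static final Pattern STRING = Pattern.compile("'(\\\\.|[^'\\\\])*'");

    // Lists every rule whose pattern matches the whole input, in grammar order.
    static String matchingRules(String input) {
        StringBuilder names = new StringBuilder();
        if (CHAR.matcher(input).matches()) names.append("CHAR ");
        if (STRING.matcher(input).matches()) names.append("STRING ");
        return names.toString().trim();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"''", "' '", "'a'", "'ab'"}) {
            System.out.println(input + " -> " + matchingRules(input));
        }
    }
}
```

Inputs that fit only STRING (the empty string '' and anything longer than one
character) are unambiguous; the single-character inputs are the ones where
rule order would decide the token type.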

Since the debugger is giving you problems, use the command line.

Create these four files and run the commands shown at the prompts below. I've
included a full run so you can see what to expect.


Mail001.g

grammar Mail001;

stmt: val EOF;

val : '(' '\'' ID '\'' ')' ;

ID  :   ('a'..'z'|'A'..'Z')+ ;
NEWLINE:'\r'? '\n' {skip();} ;
WS  :   (' '|'\t')+ {skip();} ;



Mail001.java

import org.antlr.runtime.*;

public class Mail001 {
    public static void main(String[] args) throws Exception {
        // Read the input to parse from standard input.
        ANTLRInputStream input = new ANTLRInputStream(System.in);
        Mail001Lexer lexer = new Mail001Lexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        Mail001Parser parser = new Mail001Parser(tokens);
        // Start parsing at the stmt rule; errors are printed to the console.
        parser.stmt();
    }
}



Mail001.dat

('val')



bad.dat

(val)



student@antlr:~/projects/antlr/mail/mail001$ ls -l
total 16
-rw-r--r-- 1 student student   6 Sep 12 12:05 bad.dat
-rw-r--r-- 1 student student   9 Sep 12 11:48 Mail001.dat
-rw-r--r-- 1 student student 157 Sep 12 12:04 Mail001.g
-rw-r--r-- 1 student student 389 Sep 12 11:41 Mail001.java
student@antlr:~/projects/antlr/mail/mail001$ java org.antlr.Tool Mail001.g
student@antlr:~/projects/antlr/mail/mail001$ ls -l
total 36
-rw-r--r-- 1 student student    6 Sep 12 12:05 bad.dat
-rw-r--r-- 1 student student    9 Sep 12 11:48 Mail001.dat
-rw-r--r-- 1 student student  157 Sep 12 12:04 Mail001.g
-rw-r--r-- 1 student student  389 Sep 12 11:41 Mail001.java
-rw-r--r-- 1 student student 9218 Sep 12 12:09 Mail001Lexer.java
-rw-r--r-- 1 student student 3237 Sep 12 12:09 Mail001Parser.java
-rw-r--r-- 1 student student   60 Sep 12 12:09 Mail001.tokens
student@antlr:~/projects/antlr/mail/mail001$ javac *.java
student@antlr:~/projects/antlr/mail/mail001$ ls -l
total 48
-rw-r--r-- 1 student student    6 Sep 12 12:05 bad.dat
-rw-r--r-- 1 student student  744 Sep 12 12:09 Mail001.class
-rw-r--r-- 1 student student    9 Sep 12 11:48 Mail001.dat
-rw-r--r-- 1 student student  157 Sep 12 12:04 Mail001.g
-rw-r--r-- 1 student student  389 Sep 12 11:41 Mail001.java
-rw-r--r-- 1 student student 4028 Sep 12 12:09 Mail001Lexer.class
-rw-r--r-- 1 student student 9218 Sep 12 12:09 Mail001Lexer.java
-rw-r--r-- 1 student student 2754 Sep 12 12:09 Mail001Parser.class
-rw-r--r-- 1 student student 3237 Sep 12 12:09 Mail001Parser.java
-rw-r--r-- 1 student student   60 Sep 12 12:09 Mail001.tokens
student@antlr:~/projects/antlr/mail/mail001$ java Mail001 < bad.dat
line 1:1 missing '\'' at 'val'
line 1:4 missing '\'' at ')'
student@antlr:~/projects/antlr/mail/mail001$ java Mail001 < Mail001.dat
student@antlr:~/projects/antlr/mail/mail001$



Notice that Mail001.dat contains valid data, so no errors are reported.
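
If you want to check what the val rule accepts without running ANTLR at all,
here is a rough Java sketch that approximates the Mail001 grammar with one
regular expression. It is only an approximation under my own assumptions:
skipped whitespace is modelled as optional spaces, tabs and line breaks
between tokens, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Approximates the stmt rule of the Mail001 grammar with a single regex.
// Assumption: skipped whitespace is modelled as [ \t\r\n]* between tokens.
class Mail001Sketch {
    static final String WS = "[ \\t\\r\\n]*";
    // '(' quote letters quote ')' with optional whitespace around each token.
    static final Pattern STMT = Pattern.compile(
        WS + "\\(" + WS + "'" + WS + "[a-zA-Z]+" + WS + "'" + WS + "\\)" + WS);

    // Returns true when the whole input fits the approximated stmt rule.
    static boolean accepts(String input) {
        return STMT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println("Mail001.dat accepted: " + accepts("('val')\n"));
        System.out.println("bad.dat accepted: " + accepts("(val)\n"));
    }
}
```

The contents of Mail001.dat fit the pattern and the contents of bad.dat do
not, because the two quote tokens around the ID are missing.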



Hope that helps,



Eric



>
> Please note that, apart from the parentheses, the rest is generated by
> ANTLRWorks. Using the interpreter to match the string ' ' (single quotes
> and one space, or any letter between the quotes), I get a
> MismatchedTokenException(4!=10). It only matches '' (single quotes without
> a space).
>
> I tried removing the CHAR rule and then, for the same input, I get:
>
> MismatchedTokenException(-1!=9)
>
> and in the console
>
> problem matching token at 1:3 NoViableAltException('?'@[()* loopback of
> 17:13: ( ESC_SEQ |~ ( '\\' | '\'' ) )*])
>
> I tried using the debugger, but on Windows I get a javac "cannot create
> process" error, and on my Mac I get compile-time errors in the generated
> classes because it cannot find the ANTLR classes (probably just a classpath
> problem). Is there any documentation on how to use ANTLRWorks features? The
> site documentation is pretty shallow and I cannot find any tutorials.
>
> Regards,
> Gabriel Miró
> ANTLR Newbie
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of The Researcher
> > Sent: Friday, September 09, 2011 2:46 PM
> > To: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] How to recognize a String
> >
> > Hi Gabriel,
> >
> > It would help if you could post your entire grammar and the exact error
> > message here.
> >
> > Also, don't you mean
> >
> > LPAREN: '(';
> > RPAREN: ')';
> >
> > Don't worry we have all been there.
> >
> > Thanks, Eric
> >
>