Discussion:
looking for a lexical scanner (lexer)
Guido Franzke
2009-10-02 12:33:07 UTC
Hello NG,

I'm looking for a lexical scanner (lexer) that extracts words from a
sentence, including punctuation such as . , - etc.
I cannot find any example code.
Do you know where I could find some?
Thanks,
Guido
Mateusz Loskot
2009-10-02 12:55:29 UTC
Post by Guido Franzke
Hello NG,
I'm looking for a lexical scanner (lexer), that extracts words out of a
sentence including interpunction like . , - etc.
I cannot find an example code.
Do you know where to find it?
I would look at Boost Wave and Boost Spirit 2.0.

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Tom Serface
2009-10-02 15:21:13 UTC
I didn't think of this in the other forum (mfc), but you may want to just
look up a version of Lex (it usually comes with YACC), since then you could
write your own parser or analyzer based on your criteria. It would take a
while to learn, but it would be a lot more flexible.

Tom
Carl Daniel [VC++ MVP]
2009-10-02 16:54:53 UTC
Post by Tom Serface
I didn't think of this in the other forum (mfc), but you may want to
just look up a version of Lex (usually comes with YACC) since then
you could write your own parser or analyzer based on your criteria. It
would take a while to learn, but be a lot more flexible.
I'll second Tom's suggestion - look at "compiler toolkits" like YACC/LEX,
Bison/Flex, ANTLR. They can be intimidating at first, but building a simple
lexer with them is actually quite easy.

If you'd like a pure C++ solution, you might consider Boost::Spirit
(www.boost.org).

-cd
legalize+ (Richard)
2009-10-02 17:13:35 UTC
[Please do not mail me a copy of your followup]
Post by Carl Daniel [VC++ MVP]
I'll second Tom's suggestion - look at "compiler toolkits" like YACC/LEX,
Bison/Flex, ANTLR. They can be intimidating at first, but building a simple
lexer with them is actually quite easy.
If you'd like a pure C++ solution, you might consider Boost::Spirit
(www.boost.org).
I learned lex/yacc via the "Dragon Book" and over the past year or
so became familiar with Spirit. Within the last month or so I looked
at ANTLR.

Of all those approaches, if I needed to write a little lexer or parser
now I would choose Spirit. It's like having a domain-specific language
for your grammar right in C++. It can also do things that lex/yacc
just can't accomplish, and it generally outperforms those other code
generation techniques as well. ANTLR has too many external
dependencies for my tastes, and it feels like a Mac tool that was
ported to Windows.
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>
Carl Daniel [VC++ MVP]
2009-10-02 22:18:50 UTC
Post by Carl Daniel [VC++ MVP]
Post by Tom Serface
I didn't think of this in the other forum (mfc), but you may want to
just look up a version of Lex (usually comes with YACC) since then
you could write your own parser or analyzer based on your criteria.
It would take a while to learn, but be a lot more flexible.
I'll second Tom's suggestion - look at "compiler toolkits" like
YACC/LEX, Bison/Flex, ANTLR. They can be intimidating at first, but
building a simple lexer with them is actually quite easy.
If you'd like a pure C++ solution, you might consider Boost::Spirit
(www.boost.org).
If your needs are very simple, you might be able to get by with something
like boost::tokenizer as well:

http://www.boost.org/doc/libs/1_40_0/libs/tokenizer/index.html

-cd
Tom Serface
2009-10-02 22:30:58 UTC
Hi Carl,

That's a good suggestion and may work for the OP, but in my experience that
tokenizer is almost as difficult to figure out as YACC/LEX :o) Its
documentation is not exactly over-explanatory ...

Tom
Wayne A. King
2009-10-02 22:47:41 UTC
Post by Guido Franzke
I'm looking for a lexical scanner (lexer), that extracts words out of a
sentence including interpunction like . , - etc.
Let's Build a Compiler
http://compilers.iecc.com/crenshaw/


Compiler Construction
http://www-old.oberon.ethz.ch/WirthPubl/CBEAll.pdf

- Wayne A. King
***@rogers.com