.. |nbsp| unicode:: 0xA0

==================================================
re2c |nbsp| --- |nbsp| Regular Expressions to Code
==================================================

.. toctree::
    :hidden:

    User manual
    How to build
    Benchmarks
    Release notes
    Changelog

*re2c* stands for *Regular Expressions to Code*. It is a free and open-source
lexer generator that supports C/C++, D, Go, Haskell, Java, JavaScript, OCaml,
Python, Rust, V, Zig, and can be extended to other languages by implementing a
single :ref:`syntax file `.

The primary focus of re2c is on generating *fast* code: it compiles regular
expressions to deterministic finite automata and translates them into
direct-coded lexers in the target language (such lexers are generally faster
and easier to debug than their table-driven analogues). The secondary focus of
re2c is on *flexibility*: it does not assume a fixed program template; instead,
it allows the user to embed lexers anywhere in the source code and configure
them to avoid unnecessary buffering and bounds checks.

The internal algorithm used by re2c is based on a special kind of deterministic
finite automata:
`lookahead TDFA <2022_borsotti_trofimovich_a_closer_look_at_tdfa.pdf>`_.
These automata are as fast as ordinary DFA, but they are also capable of
performing submatch extraction with minimal overhead.

re2c is used in other open-source projects, such as `php `_, `ninja `_,
`yasm `_, `spamassassin `_, `BRL-CAD `_, `wake `_, etc.

.. |man| image:: _static/manual.png
    :target: manual/manual.html
    :class: feed
    :width: 2em

|man| |nbsp| Read the manual for `C/C++ `_, `D `_, `Go `_, `Haskell `_,
`Java `_, `JS `_, `OCaml `_, `Python `_, `Rust `_, `V `_, `Zig `_.

.. |play| image:: _static/play.png
    :target: ../playground
    :class: feed
    :width: 2em

|play| |nbsp| Run examples in the `playground <../playground>`_.

.. |feed| image:: feed/feed/feed.png
    :target: feed/atom.xml
    :class: feed
    :width: 2em

|feed| |nbsp| `Subscribe `_ to receive release notes.

Download
--------

You can get the `latest release `_ on GitHub, as well as the
`older releases `_. Many Linux distributions and other systems provide their
own packages. The source code is hosted on both GitHub and SourceForge. GitHub
serves as the main repository, bug tracker and tarball hosting. SourceForge is
used as a backup repository and email hosting.

Bugs & patches
--------------

Please send bug reports, patches and other feedback to the
`GitHub issue tracker `_ or email them to the
`re2c-devel@lists.sourceforge.net `_ and
`re2c-general@lists.sourceforge.net `_ mailing lists.
There is an IRC channel ``#re2c`` on `irc.libera.chat `_ and `irc.oftc.net `_.
Questions and contributions are welcome!

Papers
------

- 2022 `A closer look at TDFA `_
  by Angelo Borsotti and Ulya Trofimovich.
  arXiv:2206.01398,
  `[pdf 2022] <2022_borsotti_trofimovich_a_closer_look_at_tdfa.pdf>`_

- 2020 `RE2C: A lexer generator based on lookahead-TDFA `_
  by Ulya Trofimovich.
  Software Impacts 6 (2020) 100027,
  `[pdf 2021] <2020_trofimovich_re2c_a_lexer_generator_based_on_lookahead_tdfa.pdf>`_

- 2019 `Efficient POSIX submatch extraction on NFA `_
  by Angelo Borsotti and Ulya Trofimovich.
  Software: Practice and Experience 51, 2, pp. 159–192,
  `[pdf 2019] <2019_borsotti_trofimovich_efficient_posix_submatch_extraction_on_nfa.pdf>`_

- 2017 `Tagged Deterministic Finite Automata with Lookahead `_
  by Ulya Trofimovich.
  arXiv:1907.08837,
  `[pdf 2017] <2017_trofimovich_tagged_deterministic_finite_automata_with_lookahead.pdf>`_

- 1994 `RE2C: a more versatile scanner generator `_
  by Peter Bumbulis and Donald D. Cowan.
  ACM Letters on Programming Languages and Systems (LOPLAS),
  `[ps 1994] <1994_bumbulis_cowan_re2c_a_more_versatile_scanner_generator.ps>`_

Authors
-------

re2c was originally written by Peter Bumbulis (peter@csg.uwaterloo.ca) in 1993.
Marcus Boerger and Dan Nuffer spent several years turning the original idea
into a production-ready code generator.
Since then it has been maintained and developed by multiple volunteers, most
notably Brian Young (bayoung@acm.org), `Marcus Boerger `_,
Dan Nuffer (nuffer@users.sourceforge.net),
`Ulya Trofimovich `_ (skvadrik@gmail.com), `Serghei Iakovlev `_,
`Sergei Trofimovich `_, `Petr Skocik `_, `ligfx `_ and `raekye `_.
Many thanks to all other contributors!

License
-------

.. include:: LICENSE

Version
-------

This website describes re2c version |version|.