VERSION 2.0.2 Fixed incomplete (read: buggy) implementation of Nfa.completeToSkip(). See also http://cs.stackexchange.com/q/50259/42743 for a deeper discussion of this feature. VERSION 2.0.1 Removed left-over debugging output in Nfa.java. VERSION 2.0.0 Added two new syntax features to the regular expression parser. 1) \xhh and \uhhhh with 2 or 4 hex digits is now understood as the respective unicode character. 2) The syntax (re){low,high} or (re){count} or (re){low,} is now understood. Use with care, though, with high numbers, since for a Dfa, the re is duplicated as many times as needed, since a Dfa cannot count. In particular (2) is an imcompatible change, since '{' now has a special meaning which changes the meaning of available regular expressions. For more about the new syntax see: http://haraldki.github.io/monqjfa/monqApiDoc/monq/jfa/doc-files/resyntax.html VERSION 1.7.1 Removed dependency on XStream to read in a single configuration file. The same can be done much easier with a standard Java properties file. VERSION 1.7.0 Added two new operations to the NFA: allPrefixes --- makes all useful states, except the start state, into stop states such that the resulting automaton accepts all non-empty prefixes of the original automaton. In a regular expression, this can be performed with the suffix operator '@'. Its age old and deprecated use to mark reporting subexpressions can now no longer be used. completeToSkip --- completes the automaton nearly with its own complement such that it nearly never fails to match. See the javadoc for more information. The goal was that during filtering the automaton nearly never hits a no-match and thereby is faster overall. My measurements indicate that this is not the case. The speed is the same is just handling non-matches specially, but it is also not slower. VERSION 1.6.2 Fixed a seemingly long standing bug in Dfa.toNfa(), which did not make sure that the start state used for the generated Nfa satisfies the condition that it has no incoming transitions, which is needed for subsequent Thompson construction steps. VERSION 1.6.1 Improved on the number of warnings due to still Java 1.4 code without generics and got rid of a C/FORTRAN style "optimization". VERSION 1.6 Replaced a three internal algorithms to not use recursion, since this bombs out easily on degenerate automata with a large depth. The ugliest was the one to remove useless states. The way it was implemented, I was not able anymore to convince me that is really worked in all cases. The new algorithm is still not exactly "clean code", but it is rather straight forward. VERSION 1.5 Inspection of the test coverage data revealed a subtle bug in the Dfa.toNfa() method. A not so subtle bug was a stack based recursion visiting all state of an Nfa or Dfa that easily led to a StackOverflowError for deep finite automata. A visitor pattern was introduced and used in two places to circumvent the error while also leading to much better code readability. VERSION 1.4 Added a new setting to Nfa objects which trades RAM for speed when generating transition tables during DFA compilation. A performance test with randomly generated text reflecting the bi-gram statistics of English text and search for around 100 words in parallel, showed a speedup of around 20% and more. VERSION 1.3 Replaced all instances of StringBuffer with StringBuilder. For a certain use case of finding N=35 English words in a text of 2.2 million English words, this resulted in a speed-up of ~1.8 (reduction in runtime from 56 to 30 seconds on the same machine). Version 1.2 Got the build system going again and publishing on my homepage. VERSION 1.1.1 This is a bugfix release to correct improper unsynchronized reuse of an object within monq.jfa.actions.Printf. The user visible symptom would be garbled output. VERSION 1.1 With this version, monqjfa made the move to Java 1.5. When compiling, there is only one 'unchecked' warning left. Where necessary, generic parameters were added and in the process, a few loops turned out to be more readable with 1.5 syntax. In addition, there is a new FaAction called monq.jfa.actions.Call. It simplifies writing applications a lot, because all callbacks for a Dfa can be combined as methods in one class and need not be implemented as individual (anonymous) classes. Some minor improvements and some cleanup leads to slightly incompatible changes in monq.clifj. The necessary changes should be obvious from compiler output. VERSION 1.0 This version is functionally equivalent to version 0.19. With version 1.0 we celebrate the move to berlios.de and the entry of 6 developers. VERSION 0.19 IMPORTANT: The communication between Dist(Pipe)?Filter and FilterServiceFactory changed. You have to recompile and restart servers in order to work with the new Dist(Pipe)?Filter. The ServiceFactory interface was changed so that createService accepts an additional arbitrary Object that can be used to parameterize the Service created. FilterServiceFactory, DistPipeFilter and DistFilter make use of this by providing for a set of key/value pairs to be send upstream to individual servers. Introduced Feeder and Drainer interfaces in monq.stuff. A Feeder is needed to use a DistPipeFilter. Together they result in a pipe, i.e. a Runnable that passes data from an InputStream into an OutputStream. AbstractPipe has a basic implementation that can be used to implement Feeders and Drainers easily. Feeders feeding from a Reader, a CharSequence and a DfaRun are available. VERSION 0.18 Bug fixes: a) The templates of DictFilter went wrong when they contained character entities like "<". This bug was introduced in version 0.13. VERSION 0.17 Bug fixes: a) TcpServer could get stuck on unchecked exceptions thrown by a Service. b) DfaRun could gobble all memory in mode UNMATCHED_COPY if no match was found at all. Features: a) Historically DictFilter always added a default word in addition to what was found in the dictionary. This can now be switched off on the command line. VERSION 0.16 A region-of-interest (ROI) --- similar to monq.programs.Grep --- can now be defined when setting up a finite automaton (FA) via Jython (monq.jfa.JyFA). The ROI restricts the work of the pattern/action pairs of the FA to the region of interest. Text outside the ROI can either be deleted or ignored. Unit tests were added for JyFA. Exception reporting for CallbackExceptions was (hopefully) improved. VERSION 0.15a Maintenance release: Example java program and description of the regular expression syntax where missing in the documentation package. VERSION 0.15 Jython can now be easily used to set up all the regular expressions for an Nfa. The class monq.jfa.JyFA loads an Nfa from a jython module and monq.programs.JythonFilter allows to set up a series of JyFAs from the command lin. VERSION 0.14 Maintenence release: The example mentioned in README did not compile. VERSION 0.13 Major code rearrangements and cleanup: 1) Because the way to handle unmatched input as well as which action callback to call at EOF are design issues with an automaton, they are now properly stored in the Dfa when it is compiled from the Nfa. The DfaRun retrieves both pieces of information from the Dfa. 2) Several constructors of DfaRun as well as of Nfa were deprecated/deleted to reduce code bloat. Similarly DfaRun.setIn(). The major impact is that creation of a CharSource must be done explicitely. 3) Formatter.format() has an explicit Map argument to avoid storing the Map in the Formatter which, for Printf(Formatter), would result in the Dfa not be shareable between threads. 4) All monq.jfa.actions.* got their internal state removed so that the can be used in a Dfa which is shared between threads. They retrieve the state to manipulate from DfaRun.clientData. 5) Xml.splitElement versions to fill a TextStore are no longer available. Now always a Map is filled. VERSION 0.12 This version was internal only. Some of the more visible changes include: Sizeof also records parent statistics for each type; Term2Re does not add optional plural 's' to words not ending in a character; DictFilter's got an attribute for trailing context, i/o encodings handled more seriously; DfaService deprecated due to improper encodings handling, use DfaRunService instead. VERSION 0.10 This is a bug fix release. In monq.net.TcpServer, creating the service object was done in the connection related thread, which meant that the ServiceFactory to create the Service object had to be thread safe. Now createService is called in the server's main thread only. As a result, FilterServiceFactory had to be changed to not potentially block while creating a service. Another fix regards a bad allocation strategy in the code handling submatches, which lead to extremely slow operation for huge matches. VERSION 0.9 The package monq.jfa.ctx, which supports recursive parsing, now has a tutorial introduction in the package description. This also triggered a rewrite which simplified things and corrected a serious bug. Unit tests with nearly 100% coverage were added. VERSION 0.8 The new package monq.jfa.ctx is an experiment in recursive parsing with jfa. The package provides the infrastructure to call callbacks based on a regular expression AND a stack context. More info can be found in the api documentation, in particular for monq.jfa.ctx.ContextManager. VERSION 0.7 First public GPLed version. VERSION 0.6 Stable version used for quite some time in house.