Version 2.2
------------------------------
11/01/06: beazley
          Added lexpos() and lexspan() methods to grammar symbols.  These
          mirror the same functionality of lineno() and linespan().  For
          example:

          def p_expr(p):
              'expr : expr PLUS expr'
              p.lexpos(1)               # Lexing position of left-hand expression
              p.lexpos(2)               # Lexing position of PLUS
              start,end = p.lexspan(3)  # Lexing range of right-hand expression

11/01/06: beazley
          Minor change to error handling.  The recommended way to skip
          characters in the input is to use t.lexer.skip() as shown here:

          def t_error(t):
              print "Illegal character '%s'" % t.value[0]
              t.lexer.skip(1)

          The old approach of just using t.skip(1) will still work, but won't
          be documented.

10/31/06: beazley
          Discarded tokens can now be specified as simple strings instead of
          functions.  To do this, simply include the text "ignore_" in the
          token declaration.  For example:

          t_ignore_cppcomment = r'//.*'

          Previously, this had to be done with a function.  For example:

          def t_ignore_cppcomment(t):
              r'//.*'
              pass

          If start conditions/states are being used, state names should
          appear before the "ignore_" text.

10/19/06: beazley
          The Lex module now provides support for flex-style start conditions
          as described at
          http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
          Please refer to that document to understand this change note, and
          to the PLY documentation for a PLY-specific explanation of how this
          works.

          To use start conditions, you first need to declare a set of states
          in your lexer file:

          states = (
                    ('foo','exclusive'),
                    ('bar','inclusive')
          )

          This serves the same role as the %s and %x specifiers in flex.

          Once a state has been declared, tokens for that state can be
          declared by defining rules of the form t_state_TOK.  For example:

          t_PLUS = r'\+'          # Rule defined in INITIAL state
          t_foo_NUM = r'\d+'      # Rule defined in foo state
          t_bar_NUM = r'\d+'      # Rule defined in bar state

          t_foo_bar_NUM = r'\d+'  # Rule defined in both foo and bar
          t_ANY_NUM = r'\d+'      # Rule defined in all states

          In addition to defining tokens for each state, the t_ignore and
          t_error specifications can be customized for specific states.
          For example:

          t_foo_ignore = " "      # Ignored characters for foo state

          def t_bar_error(t):
              # Handle errors in bar state
              t.lexer.skip(1)

          Within token rules, the following methods can be used to change
          states:

          def t_TOKNAME(t):
              t.lexer.begin('foo')       # Begin state 'foo'
              t.lexer.push_state('foo')  # Begin state 'foo', push old state
                                         # onto a stack
              t.lexer.pop_state()        # Restore previous state
              t.lexer.current_state()    # Returns name of current state

          These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(),
          and yy_top_state() functions in flex.

          Start states offer one way to write sub-lexers.  For example, the
          lexer or parser might instruct the lexer to start generating a
          different set of tokens depending on the context (see the sketch
          at the end of this entry).

          example/yply/ylex.py shows the use of start states to grab C/C++
          code fragments out of traditional yacc specification files.
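          As an illustrative sketch (hypothetical rule names, not part of
          the original release notes), a lexer might use an exclusive state
          to scan quoted strings:

          import ply.lex as lex

          tokens = ('BEGINSTRING','STRING')

          states = (('string','exclusive'),)

          def t_BEGINSTRING(t):
              r'"'
              t.lexer.push_state('string')   # enter the 'string' state

          def t_string_STRING(t):
              r'[^"]*"'
              t.value = t.value[:-1]         # drop the closing quote
              t.lexer.pop_state()            # restore the previous state
              return t

          t_string_ignore = ''               # ignore nothing inside strings

          def t_string_error(t):
              t.lexer.skip(1)

          t_ignore = ' \t'

          def t_error(t):
              t.lexer.skip(1)

          lexer = lex.lex()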
          *** NEW FEATURE *** Suggested by Daniel Larraz, with whom I also
          discussed various aspects of the design.

10/19/06: beazley
          Minor change to the way in which yacc.py was reporting shift/reduce
          conflicts.  Although the underlying LALR(1) algorithm was correct,
          PLY was under-reporting the number of conflicts compared to
          yacc/bison when precedence rules were in effect.  This change
          should make PLY report the same number of conflicts as yacc.

10/19/06: beazley
          Modified yacc so that grammar rules could also include the '-'
          character.  For example:

          def p_expr_list(p):
              'expression-list : expression-list expression'

          Suggested by Oldrich Jedlicka.

10/18/06: beazley
          Attribute lexer.lexmatch added so that token rules can access the
          re match object that was generated.  For example:

          def t_FOO(t):
              r'some regex'
              m = t.lexer.lexmatch
              # Do something with m

          This may be useful if you want to access named groups specified
          within the regex for a specific token.  Suggested by Oldrich
          Jedlicka.

10/16/06: beazley
          Changed the error message that results if an illegal character
          is encountered and no default error function is defined in lex.
          The exception is now more informative about the actual cause of
          the error.

Version 2.1
------------------------------
10/02/06: beazley
          The last Lexer object built by lex() can be found in lex.lexer.
          The last Parser object built by yacc() can be found in yacc.parser.

10/02/06: beazley
          New example added:  examples/yply

          This example uses PLY to convert Unix-yacc specification files to
          PLY programs with the same grammar.  This may be useful if you
          want to convert a grammar from bison/yacc to use with PLY.

10/02/06: beazley
          Added support for a start symbol to be specified in the yacc
          input file itself.  Just do this:

          start = 'name'

          where 'name' matches some grammar rule.  For example:

          def p_name(p):
              'name : A B C'
              ...

          This mirrors the functionality of the yacc %start specifier.

09/30/06: beazley
          Some new examples added:

          examples/GardenSnake : A simple indentation-based language similar
                                 to Python.  Shows how you might handle
                                 whitespace.  Contributed by Andrew Dalke.

          examples/BASIC       : An implementation of 1964 Dartmouth BASIC.
                                 Contributed by Dave against his better
                                 judgement.

09/28/06: beazley
          Minor patch to allow named groups to be used in lex regular
          expression rules.  For example:

          t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''

          Patch submitted by Adam Ring.

09/28/06: beazley
          LALR(1) is now the default parsing method.  To use SLR, use
          yacc.yacc(method="SLR").  Note: there is no performance impact
          on parsing when using LALR(1) instead of SLR.  However,
          constructing the parsing tables will take a little longer.

09/26/06: beazley
          Change to line number tracking.  To modify line numbers, modify
          the line number of the lexer itself.  For example:

          def t_NEWLINE(t):
              r'\n'
              t.lexer.lineno += 1

          This modification is both a cleanup and a performance optimization.
          In past versions, lex was monitoring every token for changes in
          the line number.  This extra processing is unnecessary for the
          vast majority of tokens.  Thus, this new approach cleans it up a
          bit.

          *** POTENTIAL INCOMPATIBILITY ***
          You will need to change code in your lexer that updates the line
          number.  For example, "t.lineno += 1" becomes "t.lexer.lineno += 1"

09/26/06: beazley
          Added the lexing position to tokens as an attribute lexpos.  This
          is the raw index into the input text at which a token appears.
          This information can be used to compute column numbers and other
          details (e.g., scan backwards from lexpos to the first newline
          to get a column position).
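          For illustration (a minimal sketch, not part of the original
          release notes), a column number could be computed from lexpos like
          this, assuming the original input string is available:

          def find_column(input, token):
              # Index of the last newline before the token (-1 if none)
              last_newline = input.rfind('\n', 0, token.lexpos)
              return token.lexpos - last_newline   # 1-based column number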
09/25/06: beazley
          Changed the name of the __copy__() method on the Lexer class
          to clone().  This is used to clone a Lexer object (e.g., if
          you're running different lexers at the same time).

09/21/06: beazley
          Limitations related to the use of the re module have been
          eliminated.  Several users reported problems with regular
          expressions exceeding more than 100 named groups.  To solve this,
          lex.py is now capable of automatically splitting its master
          regular expression into smaller expressions as needed.  This
          should, in theory, make it possible to specify an arbitrarily
          large number of tokens.

09/21/06: beazley
          Improved error checking in lex.py.  Rules that match the empty
          string are now rejected (otherwise they cause the lexer to enter
          an infinite loop).  An extra check for rules containing '#' has
          also been added.  Since lex compiles regular expressions in
          verbose mode, '#' is interpreted as a regex comment; it is
          critical to use '\#' instead.
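          For example (an illustrative rule with a hypothetical token name,
          not from the original notes):

          t_POUND = r'\#'    # a bare '#' would be read as a regex comment
                             # in verbose mode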
09/18/06: beazley
          Added a @TOKEN decorator function to lex.py that can be used to
          define token rules where the documentation string might be
          computed in some way.

          digit = r'([0-9])'
          nondigit = r'([_A-Za-z])'
          identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'

          from ply.lex import TOKEN

          @TOKEN(identifier)
          def t_ID(t):
              # Do whatever
              pass

          The @TOKEN decorator merely sets the documentation string of the
          associated token function as needed for lex to work.

          Note: An alternative solution is the following:

          def t_ID(t):
              # Do whatever
              pass

          t_ID.__doc__ = identifier

          Note: Decorators require the use of Python 2.4 or later.  If
          compatibility with older versions is needed, use the latter
          solution.

          The need for this feature was suggested by Cem Karan.

09/14/06: beazley
          Support for single-character literal tokens has been added to
          yacc.  These literals must be enclosed in quotes.  For example:

          def p_expr(p):
              "expr : expr '+' expr"
              ...

          def p_expr(p):
              'expr : expr "-" expr'
              ...

          In addition to this, it is necessary to tell the lexer module
          about literal characters.  This is done by defining the variable
          'literals' as a list of characters.  This should be defined in
          the module that invokes the lex.lex() function.  For example:

          literals = ['+','-','*','/','(',')','=']

          or simply

          literals = '+-*/()='

          It is important to note that literals can only be a single
          character.  When the lexer fails to match a token using its
          normal regular expression rules, it will check the current
          character against the literal list.  If found, it will be
          returned with a token type set to match the literal character.
          Otherwise, an illegal character will be signalled.

09/14/06: beazley
          Modified PLY to install itself as a proper Python package called
          'ply'.  This will make it a little more friendly to other modules.
          This changes the usage of PLY only slightly.  Just do this to
          import the modules:

          import ply.lex as lex
          import ply.yacc as yacc

          Alternatively, you can do this:

          from ply import *

          which imports both the lex and yacc modules.
          Change suggested by Lee June.

09/13/06: beazley
          Changed the handling of negative indices when used in production
          rules.  A negative production index now accesses already parsed
          symbols on the parsing stack.  For example,

          def p_foo(p):
              "foo : A B C D"
              print p[1]       # Value of 'A' symbol
              print p[2]       # Value of 'B' symbol
              print p[-1]      # Value of whatever symbol appears before A
                               # on the parsing stack.

              p[0] = some_val  # Sets the value of the 'foo' grammar symbol

          This behavior makes it easier to work with embedded actions within
          the parsing rules.  For example, in C-yacc, it is possible to
          write code like this:

          bar: A { printf("seen an A = %d\n", $1); } B { do_stuff; }

          In this example, the printf() code executes immediately after A
          has been parsed.  Within the embedded action code, $1 refers to
          the A symbol on the stack.

          To perform the equivalent action in PLY, you need to write a pair
          of rules like this:

          def p_bar(p):
              "bar : A seen_A B"
              do_stuff

          def p_seen_A(p):
              "seen_A :"
              print "seen an A =", p[-1]

          The second rule "seen_A" is merely an empty production which
          should be reduced as soon as A is parsed in the "bar" rule above.
          The negative index p[-1] is used to access whatever symbol
          appeared before the seen_A symbol.

          This feature also makes it possible to support inherited
          attributes.  For example:

          def p_decl(p):
              "decl : scope name"

          def p_scope(p):
              """scope : GLOBAL
                       | LOCAL"""
              p[0] = p[1]

          def p_name(p):
              "name : ID"
              if p[-1] == "GLOBAL":
                  # ...
                  pass
              elif p[-1] == "LOCAL":
                  # ...
                  pass

          In this case, the name rule is inheriting an attribute from the
          scope declaration that precedes it.

          *** POTENTIAL INCOMPATIBILITY ***
          If you are currently using negative indices within existing
          grammar rules, your code will break.  This should be extremely
          rare, if not non-existent, in practice, since the argument to a
          grammar rule is usually not processed in the same way as a list
          of items.

Version 2.0
------------------------------
09/07/06: beazley
          Major cleanup and refactoring of the LR table generation code.
          Both SLR and LALR(1) table generation is now performed by the same
          code base with only minor extensions for extra LALR(1) processing.

09/07/06: beazley
          Completely reimplemented the entire LALR(1) parsing engine to use
          the DeRemer and Pennello algorithm for calculating lookahead sets.
          This significantly improves the performance of generating LALR(1)
          tables and has the added feature of actually working correctly!
          If you experienced weird behavior with LALR(1) in prior releases,
          this should hopefully resolve all of those problems.  Many thanks
          to Andrew Waters and Markus Schoepflin for submitting bug reports
          and helping me test out the revised LALR(1) support.

Version 1.8
------------------------------
08/02/06: beazley
          Fixed a problem related to the handling of default actions in
          LALR(1) parsing.  If you experienced subtle and/or bizarre
          behavior when trying to use the LALR(1) engine, this may correct
          those problems.  Patch contributed by Russ Cox.  Note: This patch
          has been superseded by revisions for LALR(1) parsing in Ply-2.0.

08/02/06: beazley
          Added support for slicing of productions in yacc.
          Patch contributed by Patrick Mezard.
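          For example (an illustrative sketch with hypothetical rule names,
          not from the original notes), the values of all right-hand-side
          symbols can now be taken in one step:

          def p_call(p):
              'call : NAME LPAREN arg RPAREN'
              p[0] = p[1:]     # list of the four symbol values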
Version 1.7
------------------------------
03/02/06: beazley
          Fixed an infinite recursion problem in the ReduceToTerminals()
          function that would sometimes come up in LALR(1) table generation.
          Reported by Markus Schoepflin.

03/01/06: beazley
          Added "reflags" argument to lex().  For example:

          lex.lex(reflags=re.UNICODE)

          This can be used to specify optional flags to the re.compile()
          function used inside the lexer.  This may be necessary for special
          situations such as processing Unicode (e.g., if you want escapes
          like \w and \b to consult the Unicode character property
          database).  The need for this was suggested by Andreas Jung.

03/01/06: beazley
          Fixed a bug with an uninitialized variable on repeated
          instantiations of parser objects when the write_tables=0 argument
          was used.  Reported by Michael Brown.

03/01/06: beazley
          Modified lex.py to accept Unicode strings both as the regular
          expressions for tokens and as input.  Hopefully this is the only
          change needed for Unicode support.  Patch contributed by Johan
          Dahl.

03/01/06: beazley
          Modified the class-based interface to work with new-style or
          old-style classes.  Patch contributed by Michael Brown (although
          I tweaked it slightly so it would work with older versions of
          Python).

Version 1.6
------------------------------
05/27/05: beazley
          Incorporated a patch contributed by Christopher Stawarz to fix an
          extremely devious bug in LALR(1) parser generation.  This patch
          should fix problems numerous people reported with LALR parsing.

05/27/05: beazley
          Fixed a problem with the lex.py copy constructor.  Reported by
          Dave Aitel, Aaron Lav, and Thad Austin.

05/27/05: beazley
          Added outputdir option to yacc() to control the output directory.
          Contributed by Christopher Stawarz.

05/27/05: beazley
          Added rununit.py test script to run tests using the Python
          unittest module.  Contributed by Miki Tebeka.

Version 1.5
------------------------------
05/26/04: beazley
          Major enhancement.  LALR(1) parsing support is now working.
          This feature was implemented by Elias Ioup
          (ezioup@alumni.uchicago.edu) and optimized by David Beazley.
          To use LALR(1) parsing do the following:

          yacc.yacc(method="LALR")

          Computing LALR(1) parsing tables takes about twice as long as
          the default SLR method.  However, LALR(1) allows you to handle
          more complex grammars.  For example, the ANSI C grammar
          (in example/ansic) has 13 shift-reduce conflicts with SLR, but
          only 1 shift-reduce conflict with LALR(1).

05/20/04: beazley
          Added a __len__ method to parser production lists.  Can
          be used in parser rules like this:

          def p_somerule(p):
              """a : B C D
                   | E F"""
              if len(p) == 4:
                  pass    # Must have been the first rule
              elif len(p) == 3:
                  pass    # Must be the second rule

          Suggested by Joshua Gerth and others.

Version 1.4
------------------------------
04/23/04: beazley
          Incorporated a variety of patches contributed by Eric Raymond.
          These include:

          0. Cleans up some comments so they don't wrap on an 80-column
             display.
          1. Directs compiler errors to stderr where they belong.
          2. Implements and documents automatic line counting when \n is
             ignored.
          3. Changes the way progress messages are dumped when debugging is
             on.  The new format is both less verbose and conveys more
             information than the old, including shift and reduce actions.

04/23/04: beazley
          Added a Python setup.py file to simplify installation.
          Contributed by Adam Kerrison.

04/23/04: beazley
          Added patches contributed by Adam Kerrison.

          - Some output is now only shown when debugging is enabled.  This
            means that PLY will be completely silent when not in debugging
            mode.

          - An optional parameter "write_tables" can be passed to yacc() to
            control whether or not the parsing tables are written.  By
            default, it is true, but it can be turned off if you don't want
            the yacc table file.  Note: disabling this will cause yacc() to
            regenerate the parsing table each time it runs.
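          For example (illustration only, not from the original notes):

          parser = yacc.yacc(write_tables=0)   # do not write parsetab.py;
                                               # tables are rebuilt each run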
04/23/04: beazley
          Added patches contributed by David McNab.  This patch adds two
          features:

          - The parser can be supplied as a class instead of a module.
            For an example of this, see the example/classcalc directory.

          - Debugging output can be directed to a filename of the user's
            choice.  Use

            yacc(debugfile="somefile.out")

Version 1.3
------------------------------
12/10/02: jmdyck
          Various minor adjustments to the code that Dave checked in today.
          Updated test/yacc_{inf,unused}.exp to reflect today's changes.

12/10/02: beazley
          Incorporated a variety of minor bug fixes to empty production
          handling and infinite recursion checking.  Contributed by
          Michael Dyck.

12/10/02: beazley
          Removed bogus recover() method call in yacc.restart().

Version 1.2
------------------------------
11/27/02: beazley
          Lexer and parser objects are now available as an attribute
          of tokens and slices respectively.  For example:

          def t_NUMBER(t):
              r'\d+'
              print t.lexer

          def p_expr_plus(t):
              'expr : expr PLUS expr'
              print t.lexer
              print t.parser

          This can be used for state management (if needed).

10/31/02: beazley
          Modified yacc.py to work with Python optimize mode.  To make
          this work, you need to use

          yacc.yacc(optimize=1)

          Furthermore, you need to first run Python in normal mode
          to generate the necessary parsetab.py files.  After that,
          you can use python -O or python -OO.

          Note: optimized mode turns off a lot of error checking.
          Only use it when you are sure that your grammar is working.
          Make sure parsetab.py is up to date!

10/30/02: beazley
          Added cloning of Lexer objects.  For example:

          import copy
          l = lex.lex()
          lc = copy.copy(l)

          l.input("Some text")
          lc.input("Some other text")
          ...

          This might be useful if the same "lexer" is meant to
          be used in different contexts---or if multiple lexers
          are running concurrently.

10/30/02: beazley
          Fixed a subtle bug with first set computation and empty
          productions.  Patch submitted by Michael Dyck.

10/30/02: beazley
          Fixed error messages to use "filename:line: message" instead
          of "filename:line. message".  This makes error reporting more
          friendly to emacs.  Patch submitted by François Pinard.

10/30/02: beazley
          Improvements to parser.out file.  Terminals and nonterminals
          are sorted instead of being printed in random order.
          Patch submitted by François Pinard.

10/30/02: beazley
          Improvements to parser.out file output.  Rules are now printed
          in a way that's easier to understand.  Contributed by Russ Cox.

10/30/02: beazley
          Added 'nonassoc' associativity support.  This can be used
          to disable the chaining of operators like a < b < c.
          To use, simply specify 'nonassoc' in the precedence table:

          precedence = (
              ('nonassoc', 'LESSTHAN', 'GREATERTHAN'),  # Nonassociative operators
              ('left', 'PLUS', 'MINUS'),
              ('left', 'TIMES', 'DIVIDE'),
              ('right', 'UMINUS'),                      # Unary minus operator
          )

          Patch contributed by Russ Cox.

10/30/02: beazley
          Modified the lexer to provide optional support for the Python -O
          and -OO modes.  To make this work, Python *first* needs to be run
          in unoptimized mode.  This reads the lexing information and
          creates a file "lextab.py".  Then, run lex like this:

          # module foo.py
          ...
          lex.lex(optimize=1)

          Once the lextab file has been created, subsequent calls to
          lex.lex() will read data from the lextab file instead of using
          introspection.
          In optimized mode (-O, -OO) everything should work normally
          despite the loss of doc strings.

          To change the name of the file 'lextab.py' use the following:

          lex.lex(lextab="footab")

          (this creates a file footab.py)

Version 1.1   October 25, 2001
------------------------------

10/25/01: beazley
          Modified the table generator to produce much more compact data.
          This should greatly reduce the size of the parsetab.py[c] file.
          Caveat: the tables still need to be constructed so a little more
          work is done in parsetab on import.

10/25/01: beazley
          There may be a possible bug in the cycle detector that reports
          errors about infinite recursion.  I'm having a little trouble
          tracking it down, but if you get this problem, you can disable
          the cycle detector as follows:

          yacc.yacc(check_recursion = 0)

10/25/01: beazley
          Fixed a bug in lex.py that sometimes caused illegal characters to
          be reported incorrectly.  Reported by Sverre Jørgensen.

7/8/01  : beazley
          Added a reference to the underlying lexer object when tokens are
          handled by functions.  The lexer is available as the 'lexer'
          attribute.  This was added to provide better lexing support for
          languages such as Fortran where certain types of tokens can't be
          conveniently expressed as regular expressions (and where the
          tokenizing function may want to perform a little backtracking).
          Suggested by Pearu Peterson.

6/20/01 : beazley
          Modified the yacc() function so that an optional starting symbol
          can be specified.  For example:

          yacc.yacc(start="statement")

          Normally yacc always treats the first production rule as the
          starting symbol.  However, if you are debugging your grammar it
          may be useful to specify an alternative starting symbol.  Idea
          suggested by Rich Salz.

Version 1.0  June 18, 2001
--------------------------
Initial public offering
