1324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver/* 2324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * [The "BSD license"] 3324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Copyright (c) 2010 Terence Parr 4324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * All rights reserved. 5324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 6324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Redistribution and use in source and binary forms, with or without 7324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * modification, are permitted provided that the following conditions 8324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * are met: 9324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 1. Redistributions of source code must retain the above copyright 10324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * notice, this list of conditions and the following disclaimer. 11324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 2. Redistributions in binary form must reproduce the above copyright 12324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * notice, this list of conditions and the following disclaimer in the 13324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * documentation and/or other materials provided with the distribution. 14324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 3. The name of the author may not be used to endorse or promote products 15324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * derived from this software without specific prior written permission. 16324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 17324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR 18324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES 19324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 20324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, 21324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT 22324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 23324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 24324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 25324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF 26324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 27324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 28324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverpackage org.antlr.analysis; 29324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 30324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverimport org.antlr.misc.Utils; 31324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverimport org.antlr.tool.Grammar; 32324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 33324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverimport java.util.HashSet; 34324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverimport java.util.Set; 35324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 36324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver/** A module to perform optimizations on DFAs. 37324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 38324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * I could more easily (and more quickly) do some optimizations (such as 39324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * PRUNE_EBNF_EXIT_BRANCHES) during DFA construction, but then it 40324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * messes up the determinism checking. For example, it looks like 41324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * loop exit branches are unreachable if you prune exit branches 42324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * during DFA construction and before determinism checks. 43324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 44324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * In general, ANTLR's NFA->DFA->codegen pipeline seems very robust 45324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * to me which I attribute to a uniform and consistent set of data 46324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * structures. Regardless of what I want to "say"/implement, I do so 47324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * within the confines of, for example, a DFA. The code generator 48324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * can then just generate code--it doesn't have to do much thinking. 49324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Putting optimizations in the code gen code really starts to make 50324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * it a spagetti factory (uh oh, now I'm hungry!). The pipeline is 51324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * very testable; each stage has well defined input/output pairs. 52324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 53324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * ### Optimization: PRUNE_EBNF_EXIT_BRANCHES 54324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 55324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * There is no need to test EBNF block exit branches. Not only is it 56324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * an unneeded computation, but counter-intuitively, you actually get 57324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * better errors. You can report an error at the missing or extra 58324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * token rather than as soon as you've figured out you will fail. 59324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 60324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Imagine optional block "( DOT CLASS )? SEMI". ANTLR generates: 61324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 62324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * int alt=0; 63324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * if ( input.LA(1)==DOT ) { 64324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * alt=1; 65324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * } 66324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * else if ( input.LA(1)==SEMI ) { 67324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * alt=2; 68324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * } 69324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 70324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Clearly, since Parser.match() will ultimately find the error, we 71324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * do not want to report an error nor do we want to bother testing 72324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * lookahead against what follows the (...)? We want to generate 73324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * simply "should I enter the subrule?": 74324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 75324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * int alt=2; 76324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * if ( input.LA(1)==DOT ) { 77324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * alt=1; 78324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * } 79324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 80324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * NOTE 1. Greedy loops cannot be optimized in this way. For example, 81324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * "(greedy=false:'x'|.)* '\n'". You specifically need the exit branch 82324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * to tell you when to terminate the loop as the same input actually 83324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * predicts one of the alts (i.e., staying in the loop). 84324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 85324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * NOTE 2. I do not optimize cyclic DFAs at the moment as it doesn't 86324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * seem to work. ;) I'll have to investigate later to see what work I 87324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * can do on cyclic DFAs to make them have fewer edges. Might have 88324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * something to do with the EOT token. 89324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 90324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * ### PRUNE_SUPERFLUOUS_EOT_EDGES 91324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 92324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * When a token is a subset of another such as the following rules, ANTLR 93324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * quietly assumes the first token to resolve the ambiguity. 94324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 95324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * EQ : '=' ; 96324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * ASSIGNOP : '=' | '+=' ; 97324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 98324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * It can yield states that have only a single edge on EOT to an accept 99324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * state. This is a waste and messes up my code generation. ;) If 100324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Tokens rule DFA goes 101324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 102324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * s0 -'='-> s3 -EOT-> s5 (accept) 103324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 104324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * then s5 should be pruned and s3 should be made an accept. Do NOT do this 105324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * for keyword versus ID as the state with EOT edge emanating from it will 106324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * also have another edge. 107324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 108324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * ### Optimization: COLLAPSE_ALL_INCIDENT_EDGES 109324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 110324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Done during DFA construction. See method addTransition() in 111324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * NFAToDFAConverter. 112324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 113324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * ### Optimization: MERGE_STOP_STATES 114324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * 115324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * Done during DFA construction. See addDFAState() in NFAToDFAConverter. 116324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 117324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverpublic class DFAOptimizer { 118324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver public static boolean PRUNE_EBNF_EXIT_BRANCHES = true; 119324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver public static boolean PRUNE_TOKENS_RULE_SUPERFLUOUS_EOT_EDGES = true; 120324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver public static boolean COLLAPSE_ALL_PARALLEL_EDGES = true; 121324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver public static boolean MERGE_STOP_STATES = true; 122324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 123324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /** Used by DFA state machine generator to avoid infinite recursion 124324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * resulting from cycles int the DFA. This is a set of int state #s. 125324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * This is a side-effect of calling optimize; can't clear after use 126324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * because code gen needs it. 127324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 128324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver protected Set visited = new HashSet(); 129324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 130324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver protected Grammar grammar; 131324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 132324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver public DFAOptimizer(Grammar grammar) { 133324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver this.grammar = grammar; 134324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 135324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 136324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver public void optimize() { 137324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver // optimize each DFA in this grammar 138324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver for (int decisionNumber=1; 139324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver decisionNumber<=grammar.getNumberOfDecisions(); 140324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver decisionNumber++) 141324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver { 142324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver DFA dfa = grammar.getLookaheadDFA(decisionNumber); 143324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver optimize(dfa); 144324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 145324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 146324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 147324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver protected void optimize(DFA dfa) { 148324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( dfa==null ) { 149324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver return; // nothing to do 150324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 151324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /* 152324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver System.out.println("Optimize DFA "+dfa.decisionNFAStartState.decisionNumber+ 153324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver " num states="+dfa.getNumberOfStates()); 154324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 155324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver //long start = System.currentTimeMillis(); 156324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( PRUNE_EBNF_EXIT_BRANCHES && dfa.canInlineDecision() ) { 157324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver visited.clear(); 158324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver int decisionType = 159324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver dfa.getNFADecisionStartState().decisionStateType; 160324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( dfa.isGreedy() && 161324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver (decisionType==NFAState.OPTIONAL_BLOCK_START || 162324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver decisionType==NFAState.LOOPBACK) ) 163324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver { 164324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver optimizeExitBranches(dfa.startState); 165324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 166324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 167324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver // If the Tokens rule has syntactically ambiguous rules, try to prune 168324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( PRUNE_TOKENS_RULE_SUPERFLUOUS_EOT_EDGES && 169324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver dfa.isTokensRuleDecision() && 170324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver dfa.probe.stateToSyntacticallyAmbiguousTokensRuleAltsMap.size()>0 ) 171324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver { 172324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver visited.clear(); 173324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver optimizeEOTBranches(dfa.startState); 174324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 175324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 176324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /* ack...code gen needs this, cannot optimize 177324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver visited.clear(); 178324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver unlinkUnneededStateData(dfa.startState); 179324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 180324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver //long stop = System.currentTimeMillis(); 181324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver //System.out.println("minimized in "+(int)(stop-start)+" ms"); 182324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 183324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 184324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver protected void optimizeExitBranches(DFAState d) { 185324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver Integer sI = Utils.integer(d.stateNumber); 186324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( visited.contains(sI) ) { 187324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver return; // already visited 188324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 189324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver visited.add(sI); 190324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver int nAlts = d.dfa.getNumberOfAlts(); 191324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver for (int i = 0; i < d.getNumberOfTransitions(); i++) { 192324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver Transition edge = (Transition) d.transition(i); 193324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver DFAState edgeTarget = ((DFAState)edge.target); 194324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /* 195324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver System.out.println(d.stateNumber+"-"+ 196324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edge.label.toString(d.dfa.nfa.grammar)+"->"+ 197324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edgeTarget.stateNumber); 198324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 199324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver // if target is an accept state and that alt is the exit alt 200324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( edgeTarget.isAcceptState() && 201324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edgeTarget.getUniquelyPredictedAlt()==nAlts) 202324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver { 203324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /* 204324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver System.out.println("ignoring transition "+i+" to max alt "+ 205324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.dfa.getNumberOfAlts()); 206324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 207324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.removeTransition(i); 208324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver i--; // back up one so that i++ of loop iteration stays within bounds 209324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 210324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver optimizeExitBranches(edgeTarget); 211324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 212324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 213324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 214324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver protected void optimizeEOTBranches(DFAState d) { 215324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver Integer sI = Utils.integer(d.stateNumber); 216324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( visited.contains(sI) ) { 217324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver return; // already visited 218324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 219324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver visited.add(sI); 220324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver for (int i = 0; i < d.getNumberOfTransitions(); i++) { 221324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver Transition edge = (Transition) d.transition(i); 222324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver DFAState edgeTarget = ((DFAState)edge.target); 223324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /* 224324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver System.out.println(d.stateNumber+"-"+ 225324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edge.label.toString(d.dfa.nfa.grammar)+"->"+ 226324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edgeTarget.stateNumber); 227324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 228324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver // if only one edge coming out, it is EOT, and target is accept prune 229324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( PRUNE_TOKENS_RULE_SUPERFLUOUS_EOT_EDGES && 230324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edgeTarget.isAcceptState() && 231324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.getNumberOfTransitions()==1 && 232324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edge.label.isAtom() && 233324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edge.label.getAtom()==Label.EOT ) 234324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver { 235324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver //System.out.println("state "+d+" can be pruned"); 236324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver // remove the superfluous EOT edge 237324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.removeTransition(i); 238324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.setAcceptState(true); // make it an accept state 239324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver // force it to uniquely predict the originally predicted state 240324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.cachedUniquelyPredicatedAlt = 241324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver edgeTarget.getUniquelyPredictedAlt(); 242324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver i--; // back up one so that i++ of loop iteration stays within bounds 243324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 244324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver optimizeEOTBranches(edgeTarget); 245324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 246324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 247324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 248324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver /** Walk DFA states, unlinking the nfa configs and whatever else I 249324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver * can to reduce memory footprint. 250324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver protected void unlinkUnneededStateData(DFAState d) { 251324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver Integer sI = Utils.integer(d.stateNumber); 252324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver if ( visited.contains(sI) ) { 253324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver return; // already visited 254324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 255324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver visited.add(sI); 256324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver d.nfaConfigurations = null; 257324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver for (int i = 0; i < d.getNumberOfTransitions(); i++) { 258324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver Transition edge = (Transition) d.transition(i); 259324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver DFAState edgeTarget = ((DFAState)edge.target); 260324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver unlinkUnneededStateData(edgeTarget); 261324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 262324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver } 263324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver */ 264324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 265324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver} 266