#!/usr/bin/ruby
# encoding: utf-8

=begin LICENSE

[The "BSD licence"]
Copyright (c) 2009-2010 Kyle Yetter
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=end

module ANTLR3

=begin rdoc ANTLR3::Token

At a minimum, tokens are data structures that bind together a chunk of text and
a corresponding type symbol, which categorizes/characterizes the content of the
text. Tokens also usually carry information about their location in the input,
such as absolute character index, line number, and position within the line (or
column).

Furthermore, ANTLR tokens are assigned a "channel" number, an extra degree of
categorization that groups things on a larger scale. Parsers will usually ignore
tokens that have channel value 99 (the HIDDEN_CHANNEL), so you can keep things
like comments and white space huddled together with neighboring tokens,
effectively ignoring them without discarding them.

ANTLR tokens also keep a reference to the source stream from which they
originated.
Token streams will also provide an index value for the token, which
indicates the position of the token relative to other tokens in the stream,
starting at zero. For example, the 22nd token pulled from a lexer by
CommonTokenStream will have index value 21.

== Token as an Interface

This library provides a token implementation (see CommonToken). Additionally,
you may write your own token class as long as you provide methods that give
access to the attributes expected by a token. Even though most of the ANTLR
library tries to use duck-typing techniques instead of pure object-oriented type
checking, it's a good idea to include the ANTLR3::Token module in your customized
token class.
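
As a rough sketch of that interface, a custom token can be as small as a Struct
that exposes the expected attributes. The name +SketchToken+ is made up for this
illustration and is not part of the library; the <tt>include</tt> line assumes
ANTLR3::Token has been loaded and is skipped otherwise.

```ruby
# Hypothetical custom token class; SketchToken is not part of the library.
SketchToken = Struct.new( :type, :channel, :text, :input, :start,
                          :stop, :index, :line, :column ) do
  # pull in the shared token behaviour only if the runtime has been loaded
  include ANTLR3::Token if defined?( ANTLR3::Token )
end

# type 5 is an arbitrary value picked for the example
tok = SketchToken.new( 5, 0, "foo", nil, 10, 12, 21, 3, 4 )
tok.text     # the chunk of text carried by the token
tok.index    # 21, the position of the 22nd token in a stream
```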

=end

module Token
  include ANTLR3::Constants
  include Comparable

  # the token's associated chunk of text
  attr_accessor :text

  # the integer value associated with the token's type
  attr_accessor :type

  # the text's starting line number within the source (indexed starting at 1)
  attr_accessor :line

  # the text's starting position in the line within the source (indexed starting at 0)
  attr_accessor :column

  # the integer value of the channel to which the token is assigned
  attr_accessor :channel

  # the index of the token with respect to the other tokens produced during lexing
  attr_accessor :index

  # a reference to the input
  # stream from which the token was extracted
  attr_accessor :input

  # the absolute character index in the input at which the text starts
  attr_accessor :start

  # the absolute character index in the input at which the text ends
  attr_accessor :stop

  alias :input_stream :input
  alias :input_stream= :input=
  alias :token_index :index
  alias :token_index= :index=

  #
  # The match operator has been implemented to match against several different
  # attributes of a token for convenience in quick scripts
  #
  # @example Match against an integer token type constant
  #   token =~ VARIABLE_NAME => true/false
  # @example Match against a token type name as a Symbol
  #   token =~ :FLOAT => true/false
  # @example Match the token text against a Regular Expression
  #   token =~ /^@[a-z_]\w*$/i
  # @example Compare the token's text to a string
  #   token =~ "class"
  #
  def =~ obj
    case obj
    when Integer then type == obj
    when Symbol then name == obj.to_s
    when Regexp then obj =~ text
    when String then text == obj
    else super
    end
  end

  #
  # Tokens are comparable by their stream index values
  #
  def <=> tk2
    index <=> tk2.index
  end

  def initialize_copy( orig )
    self.index   = -1
    self.type    = orig.type
    self.channel = orig.channel
    self.text    = orig.text.clone if orig.text
    self.start   = orig.start
    self.stop    = orig.stop
    self.line    = orig.line
    self.column  = orig.column
    self.input   = orig.input
  end

  def concrete?
    input && start && stop ? true : false
  end

  def imaginary?
    input && start && stop ? false : true
  end

  def name
    token_name( type )
  end

  def source_name
    i = input and i.source_name
  end

  def hidden?
    channel == HIDDEN_CHANNEL
  end

  def source_text
    concrete? ?
      input.substring( start, stop ) : text
  end

  #
  # Sets the token's channel value to HIDDEN_CHANNEL
  #
  def hide!
    self.channel = HIDDEN_CHANNEL
  end

  def inspect
    text_inspect    = text ? "[#{ text.inspect }] " : ' '
    text_position   = line > 0 ? "@ line #{ line } col #{ column } " : ''
    stream_position = start ? "(#{ range.inspect })" : ''

    front = index >= 0 ? "#{ index } " : ''
    rep = front << name << text_inspect <<
          text_position << stream_position
    rep.strip!
    channel == DEFAULT_CHANNEL or rep << " (#{ channel.to_s })"
    return( rep )
  end

  def pretty_print( printer )
    printer.text( inspect )
  end

  def range
    start..stop rescue nil
  end

  def to_i
    index.to_i
  end

  def to_s
    text.to_s
  end

private

  def token_name( type )
    BUILT_IN_TOKEN_NAMES[ type ]
  end
end

CommonToken = Struct.new( :type, :channel, :text, :input, :start,
                          :stop, :index, :line, :column )

=begin rdoc ANTLR3::CommonToken

The base class for the standard implementation of Token. It is implemented as a
simple Struct because tokens are basically simple data structures binding
together a bunch of different information, and Structs are slightly faster than
a standard Object with accessor methods.

By default, ANTLR-generated Ruby code will provide a customized subclass of
CommonToken to track token-type names efficiently for debugging, inspection, and
general utility. Thus code generated for a standard combo lexer-parser grammar
named XYZ will have a base module named XYZ and a customized CommonToken
subclass named XYZ::Token.
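
The arrangement can be pictured with the sketch below. It is only an
illustration under assumed names: +BaseSketchToken+, +XYZSketch+ and the +NAMES+
table are invented here, and the code ANTLR actually generates may track names
differently.

```ruby
# Stand-in for a Struct-based token base class (not ANTLR3::CommonToken itself).
BaseSketchToken = Struct.new( :type, :channel, :text ) do
  # look the type name up in the NAMES table of whichever subclass is asked
  def self.token_name( type )
    self::NAMES[ type ]
  end

  def name
    self.class.token_name( type )
  end
end

# Hypothetical grammar module, analogous to XYZ and XYZ::Token.
module XYZSketch
  class Token < BaseSketchToken
    NAMES = { 4 => 'ID', 5 => 'INT' }.freeze
  end
end

tk = XYZSketch::Token.new( 5, 0, '42' )
tk.name   # the name registered for type 5 in XYZSketch::Token::NAMES
```

Keeping the table on the subclass means each token stores only its integer
type, while the name stays available through the token's class.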

Here is the token structure attribute list in order:

* <tt>type</tt>
* <tt>channel</tt>
* <tt>text</tt>
* <tt>input</tt>
* <tt>start</tt>
* <tt>stop</tt>
* <tt>index</tt>
* <tt>line</tt>
* <tt>column</tt>

=end

class CommonToken
  include Token
  DEFAULT_VALUES = {
    :channel => DEFAULT_CHANNEL,
    :index   => -1,
    :line    => 0,
    :column  => -1
  }.freeze

  def self.token_name( type )
    BUILT_IN_TOKEN_NAMES[ type ]
  end

  def self.create( fields = {} )
    fields = DEFAULT_VALUES.merge( fields )
    args = members.map { |name| fields[ name.to_sym ] }
    new( *args )
  end

  # allows you to make a copy of a token with a different class
  def self.from_token( token )
    new(
      token.type, token.channel, token.text ? token.text.clone : nil,
      token.input, token.start, token.stop, -1, token.line, token.column
    )
  end

  def initialize( type = nil, channel = DEFAULT_CHANNEL, text = nil,
                 input = nil, start = nil, stop = nil, index = -1,
                 line = 0, column = -1 )
    super
    block_given? and yield( self )
    self.text.nil? &&
      self.start && self.stop and
      self.text = self.input.substring( self.start, self.stop )
  end

  alias :input_stream :input
  alias :input_stream= :input=
  alias :token_index :index
  alias :token_index= :index=
end

module Constants

  # End of File / End of Input character and token type
  EOF_TOKEN = CommonToken.new( EOF ).freeze
  INVALID_TOKEN = CommonToken.new( INVALID_TOKEN_TYPE ).freeze
  SKIP_TOKEN = CommonToken.new( INVALID_TOKEN_TYPE ).freeze
end



=begin rdoc ANTLR3::TokenSource

TokenSource is a simple mixin module that demands an
implementation of the method #next_token.
In return, it
defines methods #next and #each, which provide basic
iterator methods for token generators. Furthermore, it
includes Enumerable to provide the standard Ruby iteration
methods to token generators, like lexers.

=end

module TokenSource
  include Constants
  include Enumerable
  extend ClassMacros

  abstract :next_token

  def next
    token = next_token()
    raise StopIteration if token.nil? || token.type == EOF
    return token
  end

  def each
    block_given? or
      return enum_for( :each )
    while token = next_token and token.type != EOF
      yield( token )
    end
    return self
  end

  def to_stream( options = {} )
    if block_given?
      CommonTokenStream.new( self, options ) { | t, stream | yield( t, stream ) }
    else
      CommonTokenStream.new( self, options )
    end
  end
end


=begin rdoc ANTLR3::TokenFactory

There are a variety of different entities throughout the ANTLR runtime library
that need to create token objects. This module serves as a mixin that provides
methods for constructing tokens.

Including this module provides a +token_class+ attribute.
Instances of the
including class can create tokens using the token class (which defaults to
ANTLR3::CommonToken). Token classes are presumed to have an #initialize method
that can be called without any parameters and the token objects are expected to
have the standard token attributes (see ANTLR3::Token).

=end

module TokenFactory
  attr_writer :token_class
  def token_class
    @token_class ||= begin
      self.class.token_class rescue
      self::Token rescue
      ANTLR3::CommonToken
    end
  end

  def create_token( *args )
    if block_given?
      token_class.new( *args ) do |*targs|
        yield( *targs )
      end
    else
      token_class.new( *args )
    end
  end
end


=begin rdoc ANTLR3::TokenScheme

TokenSchemes exist to handle the problem of defining token types as integer
values while maintaining meaningful text names for the types. They are
dynamically defined modules that map integer values to constants with token-type
names.

---

Fundamentally, tokens exist to take a chunk of text and identify it as belonging
to some category, like "VARIABLE" or "INTEGER". In code, the category is
represented by an integer -- some arbitrary value that ANTLR will decide to use
as it is creating the recognizer.
The purpose of using an integer (instead of,
say, a Ruby symbol) is that ANTLR's decision logic often needs to test whether a
token's type falls within a range, which is not possible with symbols.

The downside of token types being represented as integers is that a developer
needs to be able to reference the unknown type value by name in action code.
Furthermore, code that references the type by name and tokens that can be
inspected with names in place of type values are more meaningful to a developer.

Since ANTLR requires token type names to follow capital-letter naming
conventions, defining types as named constants of the recognizer class resolves
the problem of referencing type values by name. Thus, a token type like
``VARIABLE'' can be represented by a number like 5 and referenced within code by
+VARIABLE+. However, when a recognizer creates tokens, the name of the token's
type cannot be seen without using the data defined in the recognizer.

Of course, tokens could be defined with a name attribute that could be specified
when tokens are created.
However, doing so would make tokens take up more space
than necessary, as well as making it difficult to change the type of a token
while maintaining a correct name value.

TokenSchemes exist as a technique to manage token type referencing and name
extraction. They:

1. keep token type references clear and understandable in recognizer code
2. permit access to a token's type-name independently of recognizer objects
3. allow multiple classes to share the same token information

== Building Token Schemes

TokenScheme is a subclass of Module. Thus, it has the method
<tt>TokenScheme.new(tk_class = nil) { ... module-level code ...}</tt>, which
will evaluate the block in the context of the scheme (module), similarly to
Module#module_eval. Before evaluating the block, <tt>.new</tt> will set up the
module with the following actions:

1. define a customized token class (more on that below)
2. add a new constant, TOKEN_NAMES, which is a hash that maps types to names
3. dynamically populate the new scheme module with a few instance methods
4. include ANTLR3::Constants in the new scheme module

Because the TokenScheme class functions as a metaclass, some of the scoping
behavior can be mildly confusing if you're trying to get a handle on the entity
for your own purposes. Remember that all of the instance methods of TokenScheme
function as module-level methods of TokenScheme instances, like
+attr_accessor+ and friends.

<tt>TokenScheme#define_token(name_symbol, int_value)</tt> adds a constant
definition <tt>name_symbol</tt> with the value <tt>int_value</tt>. It is
essentially like <tt>Module#const_set</tt>, except it refuses to redefine an
existing token name with a different value (which would mess up recognizer code
fairly badly) and adds an inverse type-to-name entry to the scheme's
<tt>TOKEN_NAMES</tt> table. <tt>TokenScheme#define_tokens</tt> is a convenience
method for defining many types with a hash pairing names to values.
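
The behavior described above can be sketched without the ANTLR3 runtime. The
following is only an illustration under simplifying assumptions: +MiniScheme+
and +MiniTokens+ are hypothetical names, and the sketch omits the token class,
the built-in token names and automatic value assignment that the real
TokenScheme handles.

```ruby
# Hypothetical stand-in for TokenScheme (not the ANTLR3 implementation).
class MiniScheme < Module
  def self.new( &body )
    # the explicit block is evaluated with self set to the new module
    super() do
      @types = {}                     # name => integer value
      const_set( :TOKEN_NAMES, {} )   # integer value => name
      body and module_eval( &body )
    end
  end

  # like const_set, but refuses to rebind a name to a different value
  def define_token( name, value )
    name = name.to_s
    if @types.key?( name )
      unless @types[ name ] == value
        raise NameError, "#{ name } is already defined as #{ @types[ name ] }"
      end
    else
      const_set( name, @types[ name ] = value )
      self::TOKEN_NAMES[ value ] = name
    end
    self
  end

  # convenience wrapper: define many types from a name => value hash
  def define_tokens( token_map )
    token_map.each { |name, value| define_token( name, value ) }
    self
  end
end

MiniTokens = MiniScheme.new do
  define_tokens( :INT => 4, :ID => 6 )
end

MiniTokens::INT               # => 4
MiniTokens::TOKEN_NAMES[ 6 ]  # => "ID"
```

Repeating a definition with the same value is accepted in this sketch, while
<tt>MiniTokens.define_token( :INT, 9 )</tt> raises a NameError.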

<tt>TokenScheme#register_name(value, name_string)</tt> specifies a custom
type-to-name definition. This is particularly useful for the anonymous tokens
that ANTLR generates for literal strings in the grammar specification. For
example, if you refer to the literal <tt>'='</tt> in some parser rule in your
grammar, ANTLR will add a lexer rule for the literal and give the token a name
like <tt>T__<i>x</i></tt>, where <tt><i>x</i></tt> is the type's integer value.
Since this is pretty meaningless to a developer, generated code should add a
special name definition for type value <tt><i>x</i></tt> with the string
<tt>"'='"</tt>.

=== Sample TokenScheme Construction

  TokenData = ANTLR3::TokenScheme.new do
    define_tokens(
      :INT  => 4,
      :ID   => 6,
      :T__5 => 5,
      :WS   => 7
    )

    # note the self:: scoping below is due to the fact that
    # ruby lexically-scopes constant names instead of
    # looking up in the current scope
    register_name(self::T__5, "'='")
  end

  TokenData::ID           # => 6
  TokenData::T__5         # => 5
  TokenData.token_name(4) # => 'INT'
  TokenData.token_name(5) # => "'='"

  class ARecognizerOrSuch < ANTLR3::Parser
    include TokenData
    ID   # => 6
  end

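The <tt>self::</tt> note in the sample can be checked with plain Ruby. This is
a small sketch with hypothetical names (+ScopeDemo+, +DEMO_ONLY_ID+, +seen+); it
assumes no top-level constant named +DEMO_ONLY_ID+ exists. A block passed to
<tt>Module.new</tt> does not open a new lexical scope for constants, so a bare
constant name is resolved where the block was written, not in the new module.

```ruby
seen = {}

ScopeDemo = Module.new do
  const_set( :DEMO_ONLY_ID, 6 )

  # bare lookup is lexical: it searches the enclosing (top-level) scope
  seen[ :bare ] = ( DEMO_ONLY_ID rescue :name_error )

  # explicit scoping looks the constant up on the module itself
  seen[ :scoped ] = self::DEMO_ONLY_ID
end

seen[ :bare ]    # => :name_error
seen[ :scoped ]  # => 6
```
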
== Custom Token Classes and Relationship with Tokens

When a TokenScheme is created, it will define a subclass of ANTLR3::CommonToken
and assign it to the constant name +Token+. This token class will both include
and extend the scheme module. Since token schemes define the private instance
method <tt>token_name(type)</tt>, instances of the token class are now able to
provide their type names. The Token method <tt>name</tt> uses the
<tt>token_name</tt> method to provide the type name as if it were a simple
attribute without storing the name itself.

When a TokenScheme is included in a recognizer class, the class will now have
the token types as named constants, a type-to-name map constant +TOKEN_NAMES+,
and a grammar-specific subclass of ANTLR3::CommonToken assigned to the constant
Token. Thus, when recognizers need to manufacture tokens, instead of using the
generic CommonToken class, they can create tokens using the customized Token
class provided by the token scheme.
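
The idea of a name that is derived from the type, not stored on the token,
can be sketched in isolation. Everything here is hypothetical (+DemoScheme+, its
Struct-based +Token+, the fallback string) and it stands in for the real scheme
and CommonToken subclass only loosely.

```ruby
# Hypothetical stand-in for a scheme module and its Token class.
module DemoScheme
  TOKEN_NAMES = { 4 => "INT", 5 => "'='", 6 => "ID" }.freeze

  def token_name( type )
    TOKEN_NAMES.fetch( type ) { "<unknown #{ type }>" }
  end
  # private instance method plus a public module-level copy
  module_function :token_name

  class Token < Struct.new( :type, :text )
    include DemoScheme

    # computed from the type on every call, so it cannot go stale
    def name
      token_name( type )
    end
  end
end

tok = DemoScheme::Token.new( 6, "count" )
tok.name                    # => "ID"
tok.type = 4
tok.name                    # => "INT"
DemoScheme.token_name( 5 )  # => "'='"
```

Because the token stores only its integer type, changing the type changes the
reported name with no extra bookkeeping.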

If you need to use a token class other than CommonToken, you can pass the class
as a parameter to TokenScheme.new, which will be used in place of the
dynamically-created CommonToken subclass.

=end

class TokenScheme < ::Module
  include TokenFactory

  def self.new( tk_class = nil, &body )
    super() do
      tk_class ||= Class.new( ::ANTLR3::CommonToken )
      self.token_class = tk_class

      const_set( :TOKEN_NAMES, ::ANTLR3::Constants::BUILT_IN_TOKEN_NAMES.clone )

      @types  = ::ANTLR3::Constants::BUILT_IN_TOKEN_NAMES.invert
      @unused = ::ANTLR3::Constants::MIN_TOKEN_TYPE

      scheme = self
      define_method( :token_scheme ) { scheme }
      define_method( :token_names )  { scheme::TOKEN_NAMES }
      define_method( :token_name ) do |type|
        begin
          # pass the argument explicitly: a bare `super` is not supported
          # inside a method created with define_method
          token_names[ type ] or super( type )
        rescue NoMethodError
          ::ANTLR3::CommonToken.token_name( type )
        end
      end
      module_function :token_name, :token_names

      include ANTLR3::Constants

      body and module_eval( &body )
    end
  end

  def self.build( *token_names )
    token_names = [ token_names ].flatten!
    token_names.compact!
    token_names.uniq!
    tk_class = Class === token_names.first ? token_names.shift : nil
    value_maps, names = token_names.partition { |i| Hash === i }
    new( tk_class ) do
      for value_map in value_maps
        define_tokens( value_map )
      end

      for name in names
        define_token( name )
      end
    end
  end


  def included( mod )
    super
    mod.extend( self )
  end
  private :included

  attr_reader :unused, :types

  def define_tokens( token_map = {} )
    for token_name, token_value in token_map
      define_token( token_name, token_value )
    end
    return self
  end

  def define_token( name, value = nil )
    name = name.to_s

    if current_value = @types[ name ]
      # token type has already been defined
      # raise an error unless value is the same as the current value
      value ||= current_value
      unless current_value == value
        raise NameError.new(
          "new token type definition ``#{ name } = #{ value }'' conflicts " <<
          "with existing type definition ``#{ name } = #{ current_value }''", name
        )
      end
    else
      value ||= @unused
      if name =~ /^[A-Z]\w*$/
        const_set( name, @types[ name ] = value )
      else
        constant = "T__#{ value }"
        const_set( constant, @types[ constant ] = value )
        @types[ name ] = value
      end
      register_name( value, name ) unless built_in_type?( value )
    end

    value >= @unused and @unused = value + 1
    return self
  end

  def register_names( *names )
    if names.length == 1 and Hash === names.first
      names.first.each do |value, name|
        register_name( value, name )
      end
    else
      names.each_with_index do |name, i|
        type_value = Constants::MIN_TOKEN_TYPE + i
        register_name( type_value, name )
      end
    end
  end

  def register_name( type_value, name )
    name = name.to_s.freeze
    if token_names.has_key?( type_value )
      current_name = token_names[ type_value ]
      current_name == name and return name

      if current_name == "T__#{ type_value }"
        # only an anonymous name is registered -- upgrade the name to the full literal name
        token_names[ type_value ] = name
      elsif name == "T__#{ type_value }"
        # ignore name downgrade from literal to anonymous constant
        return current_name
      else
        error = NameError.new(
          "attempted assignment of token type #{ type_value }" <<
          " to name #{ name } conflicts with existing name #{ current_name }", name
        )
        raise error
      end
    else
      token_names[ type_value ] = name.to_s.freeze
    end
  end

  def built_in_type?( type_value )
    Constants::BUILT_IN_TOKEN_NAMES.fetch( type_value, false ) and true
  end

  def token_defined?( name_or_value )
    # dispatch on the argument itself (the original tested an undefined `value`)
    case name_or_value
    when Integer then token_names.has_key?( name_or_value )
    else const_defined?( name_or_value.to_s )
    end
  end

  def []( name_or_value )
    case name_or_value
    when Integer then token_names.fetch( name_or_value, nil )
    # Hash#key replaces Hash#index, which was removed in Ruby 3.0
    else const_get( name_or_value.to_s ) rescue token_names.key( name_or_value )
    end
  end

  def token_class
    self::Token
  end

  def token_class=( klass )
    Class === klass or raise( TypeError, "token_class must be a Class" )
    Util.silence_warnings do
      klass < self or klass.send( :include, self )
      const_set( :Token, klass )
    end
  end

end

end