#!/usr/bin/ruby
# encoding: utf-8

=begin LICENSE

[The "BSD licence"]
Copyright (c) 2009-2010 Kyle Yetter
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=end

module ANTLR3


=begin rdoc ANTLR3::Stream

= ANTLR3 Streams

This documentation first covers the general concept of streams as used by ANTLR
recognizers, and then discusses the specific <tt>ANTLR3::Stream</tt> module.

== ANTLR Stream Classes

ANTLR recognizers need a way to walk through input data in a serialized IO-style
fashion. They also need some book-keeping about the input to provide useful
information to developers, such as current line number and column. Furthermore,
to implement backtracking and various error recovery techniques, recognizers
need a way to record various locations in the input at a number of points in the
recognition process so the input state may be restored to a prior state.

ANTLR bundles all of this functionality into a number of Stream classes, each
designed to be used by recognizers for a specific recognition task. Most of the
Stream hierarchy is implemented in antlr3/stream.rb, which is loaded by default
when 'antlr3' is required.

---

Here's a brief overview of the various stream classes and their respective
purposes:

StringStream::
  Similar to StringIO from the standard Ruby library, StringStream wraps raw
  String data in a Stream interface for use by ANTLR lexers.
FileStream::
  A subclass of StringStream, FileStream simply wraps data read from an IO or
  File object for use by lexers.
CommonTokenStream::
  The job of a TokenStream is to read lexer output and then provide ANTLR
  parsers with the means to sequentially walk through a series of tokens.
  CommonTokenStream is the default TokenStream implementation.
TokenRewriteStream::
  A subclass of CommonTokenStream, TokenRewriteStreams provide rewriting-parsers
  the ability to produce new output text from an input token-sequence by
  managing rewrite "programs" on top of the stream.
CommonTreeNodeStream::
  In a similar fashion to CommonTokenStream, CommonTreeNodeStream feeds tokens
  to recognizers in a sequential fashion. However, the stream object serializes
  an Abstract Syntax Tree into a flat, one-dimensional sequence, but preserves
  the two-dimensional shape of the tree using special UP and DOWN tokens. The
  sequence is primarily used by ANTLR Tree Parsers. *note* -- this is not
  defined in antlr3/stream.rb, but antlr3/tree.rb

---

The next few sections cover the most significant methods of all stream classes.

=== consume / look / peek

<tt>stream.consume</tt> is used to advance a stream one unit. StringStreams are
advanced by one character and TokenStreams are advanced by one token.

<tt>stream.peek(k = 1)</tt> is used to quickly retrieve the object of interest
to a recognizer at look-ahead position specified by <tt>k</tt>. For
<b>StringStreams</b>, this is the <i>integer value of the character</i>
<tt>k</tt> characters ahead of the stream cursor. For <b>TokenStreams</b>, this
is the <i>integer token type of the token</i> <tt>k</tt> tokens ahead of the
stream cursor.

<tt>stream.look(k = 1)</tt> is used to retrieve the full object of interest at
look-ahead position specified by <tt>k</tt>. While <tt>peek</tt> provides the
<i>bare-minimum lightweight information</i> that the recognizer needs,
<tt>look</tt> provides the <i>full object of concern</i> in the stream. For
<b>StringStreams</b>, this is a <i>string object containing the single
character</i> <tt>k</tt> characters ahead of the stream cursor. For
<b>TokenStreams</b>, this is the <i>full token structure</i> <tt>k</tt> tokens
ahead of the stream cursor.
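
As a rough illustration of the difference between <tt>peek</tt> and
<tt>look</tt> on character input, here is a minimal sketch. +ToyCharStream+ is
a hypothetical stand-in written for this example and is not the library's
StringStream; in particular, returning +nil+ past the end of the input is an
assumption of the sketch, not documented behavior of the runtime.

```ruby
# Hypothetical stand-in for a character stream (not ANTLR3::StringStream).
class ToyCharStream
  def initialize(text)
    @chars = text.chars.to_a
    @position = 0
  end

  # advance the cursor by one character, stopping at the end of the input
  def consume
    @position += 1 if @position < @chars.length
  end

  # lightweight look-ahead: the integer value of the character k ahead
  # (nil past the end of the input -- an assumption of this sketch)
  def peek(k = 1)
    ch = look(k)
    ch && ch.ord
  end

  # full look-ahead: a one-character String, or nil past the end
  def look(k = 1)
    @chars[@position + k - 1]
  end
end

stream = ToyCharStream.new("ab")
stream.peek     # => 97
stream.look     # => "a"
stream.consume
stream.look     # => "b"
```

Here <tt>k = 1</tt> refers to the character sitting at the cursor, that is,
the next one that <tt>consume</tt> would step over.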

<b>Note:</b> in most ANTLR runtime APIs for other languages, <tt>peek</tt> is
implemented by some method with a name like <tt>LA(k)</tt> and <tt>look</tt> is
implemented by some method with a name like <tt>LT(k)</tt>. When writing this
Ruby runtime API, I found this naming practice confusing, ambiguous, and
un-Ruby-like. Thus, I chose <tt>peek</tt> and <tt>look</tt> to represent a
quick-look (peek) and a full-fledged look-ahead operation (look). If this causes
confusion or any sort of compatibility strife for developers using this
implementation, all apologies.

=== mark / rewind / release

<tt>marker = stream.mark</tt> causes the stream to record important information
about the current stream state, place the data in an internal memory table, and
return a memento, <tt>marker</tt>. The marker object is typically an integer key
to the stream's internal memory table.

Used in tandem with <tt>stream.rewind(marker = last_marker)</tt>, the marker can
be used to restore the stream to an earlier state. This is used by recognizers
to perform tasks such as backtracking and error recovery.

<tt>stream.release(marker = last_marker)</tt> can be used to release an existing
state marker from the memory table.

=== seek

<tt>stream.seek(position)</tt> moves the stream cursor to an absolute position
within the stream, much like typical Ruby <tt>IO#seek</tt> style methods.
However, unlike <tt>IO#seek</tt>, ANTLR streams currently always use absolute
position seeking.

== The Stream Module

<tt>ANTLR3::Stream</tt> is an abstract-ish base mixin for all IO-like stream
classes used by ANTLR recognizers.

The module doesn't do much on its own besides define arguably annoying
``abstract'' pseudo-methods that demand implementation when it is mixed into a
class that wants to be a Stream. Right now this exists as an artifact of porting
the ANTLR Java/Python runtime library to Ruby. In Java, of course, this is
represented as an interface. In Ruby, however, objects are duck-typed and
interfaces aren't that useful as programmatic entities -- in fact, it's mildly
wasteful to have a module like this hanging out. Thus, I may axe it.

When mixed in, it does give the class the #size and #source_name attribute
methods.

Except in a small handful of places, most of the ANTLR runtime library uses
duck-typing and not type checking on objects. This means that the methods which
manipulate stream objects don't usually bother checking that the object is a
Stream and assume that the object implements the proper stream interface. Thus,
it is not strictly necessary that custom stream objects include ANTLR3::Stream,
though it isn't a bad idea.
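
To make the duck-typing point concrete, the following sketch defines a
stream-like class that does not include ANTLR3::Stream but still responds to the
methods described above. +ToyToken+ and +ArrayStream+ are hypothetical names
invented for this example, and saving only the cursor position per marker is a
simplifying assumption; a real stream may need to record more state.

```ruby
# Hypothetical token structure: an integer type plus the matched text.
ToyToken = Struct.new(:type, :text)

# Hypothetical stream over an Array of tokens. It does not include
# ANTLR3::Stream; it only responds to the same method names.
class ArrayStream
  attr_reader :size
  attr_accessor :source_name

  def initialize(tokens, source_name = '(array)')
    @tokens = tokens
    @size = tokens.length
    @source_name = source_name
    @position = 0
    @markers = {}       # marker key => saved cursor position
    @last_marker = nil
    @next_key = 0
  end

  # current cursor position
  def index
    @position
  end

  def consume
    @position += 1 if @position < @size
  end

  # full token k tokens ahead of the cursor, or nil past the end
  def look(k = 1)
    @tokens[@position + k - 1]
  end

  # integer token type k tokens ahead of the cursor, or nil past the end
  def peek(k = 1)
    token = look(k)
    token && token.type
  end

  # save the cursor position and return an integer key for it
  def mark
    @next_key += 1
    @last_marker = @next_key
    @markers[@last_marker] = @position
    @last_marker
  end

  # restore the cursor position saved under the given marker
  def rewind(marker = @last_marker)
    @position = @markers.fetch(marker)
  end

  # forget the state saved under the given marker
  def release(marker = @last_marker)
    @markers.delete(marker)
  end

  # jump to an absolute position
  def seek(position)
    @position = position
  end
end

stream = ArrayStream.new([ToyToken.new(4, 'x'), ToyToken.new(5, '='), ToyToken.new(6, '1')])
marker = stream.mark
2.times { stream.consume }
stream.peek             # => 6
stream.rewind(marker)
stream.peek             # => 4
stream.release(marker)
```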

=end

module Stream
  include ANTLR3::Constants
  extend ClassMacros

  ##
  # :method: consume
  # used to advance a stream one unit (such as character or token)
  abstract :consume

  ##
  # :method: peek( k = 1 )
  # used to quickly retrieve the object of interest to a recognizer at lookahead
  # position specified by <tt>k</tt> (such as integer value of a character or an
  # integer token type)
  abstract :peek

  ##
  # :method: look( k = 1 )
  # used to retrieve the full object of interest at lookahead position specified
  # by <tt>k</tt> (such as a character string or a token structure)
  abstract :look

  ##
  # :method: mark
  # saves the current position for the purposes of backtracking and
  # returns a value to pass to #rewind at a later time
  abstract :mark

  ##
  # :method: index
  # returns the current position of the stream
  abstract :index

  ##
  # :method: rewind( marker = last_marker )
  # restores the stream position using the state information previously saved
  # by the given marker
  abstract :rewind

  ##
  # :method: release( marker = last_marker )
  # clears the saved state information associated with the given marker value
  abstract :release

  ##
  # :method: seek( position )
  # moves the stream to the absolute index given by +position+
  abstract :seek

  ##
  # the total number of symbols in the stream
  attr_reader :size

  ##
  # indicates an identifying name for the stream -- usually the file path of the input
  attr_accessor :source_name
end

=begin rdoc ANTLR3::CharacterStream

CharacterStream further extends the abstract-ish base mixin Stream to add
methods specific to navigating character-based input data. Thus, it serves as an
imitation of the Java interface for text-based streams, which are primarily
used by lexers.

It adds the ``abstract'' method, <tt>substring(start, stop)</tt>, which must be
implemented to return a slice of the input string from position <tt>start</tt>
to position <tt>stop</tt>.
It also adds attribute accessor methods <tt>line</tt>
and <tt>column</tt>, which are expected to indicate the current line number and
position within the current line, respectively.

== A Word About <tt>line</tt> and <tt>column</tt> attributes

Presumably, the concept of <tt>line</tt> and <tt>column</tt> attributes of text
is familiar to most developers. Line numbers of text are indexed from number 1
up (not 0). Column numbers are indexed from 0 up. Thus, examining sample text:

  Hey this is the first line.
  Oh, and this is the second line.

Line 1 is the string "Hey this is the first line\\n". If a character stream is at
line 2, column 0, the stream cursor is sitting between the characters "\\n"
and "O".

*Note:* most ANTLR runtime APIs for other languages refer to <tt>column</tt>
with the more-precise, but lengthy name <tt>charPositionInLine</tt>. I preferred
to keep it simple and familiar in this Ruby runtime API.
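
The indexing convention above can be checked with a small helper.
+line_and_column+ is a hypothetical function written for this illustration and
is not part of the runtime; it assumes one character per index and a newline
character as the only line terminator.

```ruby
# Returns [line, column] for a cursor sitting just before text[position].
# Lines are counted from 1, columns from 0.
def line_and_column(text, position)
  consumed = text[0, position]            # everything left of the cursor
  line = 1 + consumed.count("\n")
  last_newline = consumed.rindex("\n")    # nil while still on line 1
  column = last_newline ? position - last_newline - 1 : position
  [line, column]
end

text = "Hey this is the first line.\nOh, and this is the second line.\n"
line_and_column(text, 0)                  # => [1, 0]
line_and_column(text, text.index("Oh"))   # => [2, 0]
```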

=end

module CharacterStream
  include Stream
  extend ClassMacros
  include Constants

  ##
  # :method: substring(start,stop)
  abstract :substring

  attr_accessor :line
  attr_accessor :column
end


=begin rdoc ANTLR3::TokenStream

TokenStream further extends the abstract-ish base mixin Stream to add methods
specific to navigating token sequences. Thus, it serves as an imitation of the
Java interface for token-based streams, which are used by many different
components in ANTLR, including parsers and tree parsers.

== Token Streams

Token streams wrap a sequence of token objects produced by some token source,
usually a lexer. They provide the operations required by higher-level
recognizers, such as parsers and tree parsers, for navigating through the
sequence of tokens. Unlike simple character-based streams, such as StringStream,
token-based streams have an additional level of complexity because they must
manage the task of "tuning" to a specific token channel.

One of the main advantages of ANTLR-based recognition is the token
<i>channel</i> feature, which allows you to hold on to all tokens of interest
while only presenting a specific set of interesting tokens to a parser. For
example, if you need to hide whitespace and comments from a parser, but hang on
to them for some other purpose, you have the lexer assign the comments and
whitespace to channel value HIDDEN as it creates the tokens.

When you create a token stream, you can tune it to some specific channel value.
Then, all <tt>peek</tt>, <tt>look</tt>, and <tt>consume</tt> operations only
yield tokens that have the same value for <tt>channel</tt>. The stream skips
over any non-matching tokens in between.

== The TokenStream Interface

In addition to the abstract methods and attribute methods provided by the base
Stream module, TokenStream adds a number of additional method implementation
requirements and attributes.

=end

module TokenStream
  include Stream
  extend ClassMacros

  ##
  # expected to return the token source object (such as a lexer) from which
  # all tokens in the stream were retrieved
  attr_reader :token_source

  ##
  # expected to return the value of the last marker produced by a call to
  # <tt>stream.mark</tt>
  attr_reader :last_marker

  ##
  # expected to return the integer index of the stream cursor
  attr_reader :position

  ##
  # the integer channel value to which the stream is ``tuned''
  attr_accessor :channel

  ##
  # :method: to_s(start=0,stop=tokens.length-1)
  # should take the tokens between start and stop in the sequence, extract their text
  # and return the concatenation of all the text chunks
  abstract :to_s

  ##
  # :method: at( i )
  # return the stream symbol at index +i+
  abstract :at
end

=begin rdoc ANTLR3::StringStream

A StringStream's purpose is to wrap the basic, naked text input of a recognition
system.
Like all other stream types, it provides serial navigation of the input; 344324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruvera recognizer can arbitrarily step forward and backward through the stream's 345324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruversymbols as it requires. StringStream and its subclasses are they main way to 346324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverfeed text input into an ANTLR Lexer for token processing. 347324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 348324c4644fee44b9898524c09511bd33c3f12e2dfBen GruverThe stream's symbols of interest, of course, are character values. Thus, the 349324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver#peek method returns the integer character value at look-ahead position 350324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver<tt>k</tt> and the #look method returns the character value as a +String+. They 351324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruveralso track various pieces of information such as the line and column numbers at 352324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverthe current position. 353324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 354324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver=== Note About Text Encoding 355324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 356324c4644fee44b9898524c09511bd33c3f12e2dfBen GruverThis version of the runtime library primarily targets ruby version 1.8, which 357324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverdoes not have strong built-in support for multi-byte character encodings. Thus, 358324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruvercharacters are assumed to be represented by a single byte -- an integer between 359324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver0 and 255. 
Ruby 1.9 does provide built-in encoding support for multi-byte 360324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruvercharacters, but currently this library does not provide any streams to handle 361324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruvernon-ASCII encoding. However, encoding-savvy recognition code is a future 362324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverdevelopment goal for this project. 363324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 364324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver=end 365324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 366324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverclass StringStream 367324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver NEWLINE = ?\n.ord 368324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 369324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver include CharacterStream 370324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 371324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver # current integer character index of the stream 372324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver attr_reader :position 373324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 374324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver # the current line number of the input, indexed upward from 1 375324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver attr_reader :line 376324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 377324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver # the current character position within the current line, indexed upward from 0 378324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver attr_reader :column 379324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 380324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver # the name associated with the stream -- usually a file name 381324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver # defaults to <tt>"(string)"</tt> 382324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver attr_accessor :name 383324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver 
  # the entire string that is wrapped by the stream
  attr_reader :data
  attr_reader :string

  if RUBY_VERSION =~ /^1\.9/

    # creates a new StringStream object where +data+ is the string data to stream.
    # accepts the following options in a symbol-to-value hash:
    #
    # [:file or :name] the (file) name to associate with the stream; default: <tt>'(string)'</tt>
    # [:line] the initial line number; default: +1+
    # [:column] the initial column number; default: +0+
    #
    def initialize( data, options = {} ) # for 1.9
      @string = data.to_s.encode( Encoding::UTF_8 ).freeze
      @data = @string.codepoints.to_a.freeze
      @position = options.fetch :position, 0
      @line = options.fetch :line, 1
      @column = options.fetch :column, 0
      @markers = []
      @name ||= options[ :file ] || options[ :name ] # || '(string)'
      mark
    end

    #
    # identical to #peek, except it returns the character value as a String
    #
    def look( k = 1 ) # for 1.9
      k == 0 and return nil
      k += 1 if k < 0

      index = @position + k - 1
      index < 0 and return nil

      @string[ index ]
    end

  else

    # creates a new StringStream object where +data+ is the string data to stream.
    # accepts the following options in a symbol-to-value hash:
    #
    # [:file or :name] the (file) name to associate with the stream; default: <tt>'(string)'</tt>
    # [:line] the initial line number; default: +1+
    # [:column] the initial column number; default: +0+
    #
    def initialize( data, options = {} ) # for 1.8
      @data = data.to_s
      @data.equal?( data ) and @data = @data.clone
      @data.freeze
      @string = @data
      @position = options.fetch :position, 0
      @line = options.fetch :line, 1
      @column = options.fetch :column, 0
      @markers = []
      @name ||= options[ :file ] || options[ :name ] # || '(string)'
      mark
    end

    #
    # identical to #peek, except it returns the character value as a String
    #
    def look( k = 1 ) # for 1.8
      k == 0 and return nil
      k += 1 if k < 0

      index = @position + k - 1
      index < 0 and return nil

      c = @data[ index ] and c.chr
    end

  end

  def size
    @data.length
  end

  alias length size

  #
  # rewinds the stream back to the start and clears out any existing marker entries
  #
  def reset
    initial_location = @markers.first
    @position, @line, @column = initial_location
    @markers.clear
    @markers << initial_location
    return self
  end

  #
  # advance the stream by one character; returns the character consumed
  #
  def consume
    c = @data[ @position ] || EOF
    if @position < @data.length
      @column += 1
      if c == NEWLINE
        @line += 1
        @column = 0
      end
      @position += 1
    end
    return( c )
  end

  #
  # return the character at look-ahead distance +k+ as an integer. <tt>k = 1</tt> represents
  # the current character. +k+ greater than 1 represents upcoming characters. A negative
  # value of +k+ returns previous characters consumed, where <tt>k = -1</tt> is the last
  # character consumed. <tt>k = 0</tt> is not a meaningful distance; the method returns +nil+
  #
  def peek( k = 1 )
    k == 0 and return nil
    k += 1 if k < 0
    index = @position + k - 1
    index < 0 and return nil
    @data[ index ] or EOF
  end

  #
  # return a substring around the stream cursor at a distance +k+
  # if <tt>k >= 0</tt>, return the next k characters
  # if <tt>k < 0</tt>, return the previous <tt>|k|</tt> characters
  #
  def through( k )
    if k >= 0 then @string[ @position, k ] else
      start = ( @position + k ).at_least( 0 ) # start cannot be negative or index will wrap around
      @string[ start ... @position ]
    end
  end

  # operator style look-ahead
  alias >> look

  # operator style look-behind: delegates to #look with a negated distance
  def <<( k )
    self >> -k
  end

  alias index position
  alias character_index position

  alias source_name name

  #
  # Returns true if the stream appears to be at the beginning of a new line.
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def beginning_of_line?
    @position.zero? or @data[ @position - 1 ] == NEWLINE
  end

  #
  # Returns true if the stream appears to be at the end of a line.
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def end_of_line?
    @data[ @position ] == NEWLINE #if @position < @data.length
  end

  #
  # Returns true if the stream has been exhausted.
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def end_of_string?
    @position >= @data.length
  end

  #
  # Returns true if the stream appears to be at the beginning of a stream (position = 0).
  # This is an extra utility method for use inside lexer actions if needed.
  #
  def beginning_of_string?
    @position == 0
  end

  alias eof? end_of_string?
  alias bof? beginning_of_string?

  #
  # record the current stream location parameters in the stream's marker table and
  # return an integer-valued bookmark that may be used to restore the stream's
  # position with the #rewind method. This method is used to implement backtracking.
  #
  def mark
    state = [ @position, @line, @column ].freeze
    @markers << state
    return @markers.length - 1
  end

  #
  # restore the stream to an earlier location recorded by #mark. If no marker value is
  # provided, the last marker generated by #mark will be used.
  #
  def rewind( marker = @markers.length - 1, release = true )
    ( marker >= 0 and location = @markers[ marker ] ) or return( self )
    @position, @line, @column = location
    release( marker ) if release
    return self
  end

  #
  # the total number of markers currently in existence
  #
  def mark_depth
    @markers.length
  end

  #
  # the last marker value created by a call to #mark
  #
  def last_marker
    @markers.length - 1
  end

  #
  # let go of the bookmark data for the marker and all marker
  # values created after the marker.
  #
  def release( marker = @markers.length - 1 )
    marker.between?( 1, @markers.length - 1 ) or return
    @markers.pop( @markers.length - marker )
    return self
  end

  #
  # jump to the absolute position value given by +index+.
  # note: if +index+ is before the current position, the +line+ and +column+
  # attributes of the stream will probably be incorrect
  #
  def seek( index )
    index = index.bound( 0, @data.length ) # ensures index is within the stream's range
    if index > @position
      skipped = through( index - @position )
      if lc = skipped.count( "\n" ) and lc.zero?
        @column += skipped.length
      else
        @line += lc
        @column = skipped.length - skipped.rindex( "\n" ) - 1
      end
    end
    @position = index
    return nil
  end

  #
  # customized object inspection that shows:
  # * the stream class
  # * the stream's location in <tt>index / line:column</tt> format
  # * +before_chars+ characters before the cursor (6 characters by default)
  # * +after_chars+ characters after the cursor (10 characters by default)
  #
  def inspect( before_chars = 6, after_chars = 10 )
    before = through( -before_chars ).inspect
    @position - before_chars > 0 and before.insert( 0, '... ' )

    after = through( after_chars ).inspect
    @position + after_chars + 1 < @data.length and after << ' ...'

    location = "#@position / line #@line:#@column"
    "#<#{ self.class }: #{ before } | #{ after } @ #{ location }>"
  end

  #
  # return the string slice between position +start+ and +stop+
  #
  def substring( start, stop )
    @string[ start, stop - start + 1 ]
  end

  #
  # identical to String#[]
  #
  def []( start, *args )
    @string[ start, *args ]
  end
end


=begin rdoc ANTLR3::FileStream

FileStream is a character stream that uses data stored in some external file. It
is nearly identical to StringStream and functions the same way, except that it
reads its data from a file and automatically sets up the +source_name+ and
+line+ parameters. It does not actually use any buffered IO operations
throughout the stream navigation process. Instead, it reads the file data once
when the stream is initialized.

=end

class FileStream < StringStream

  #
  # creates a new FileStream object using the given +file+ object.
  # If +file+ is a path string, the file will be read, its contents
  # will be used, and the +name+ attribute will be set to the path.
  # If +file+ is an IO-like object (that responds to :read),
  # the content of the object will be used and the stream will
  # attempt to set its +name+ attribute, first trying the method #name
  # on the object, then trying the method #path on the object.
  #
  # see StringStream.new for a list of additional options
  # the constructor accepts
  #
  def initialize( file, options = {} )
    case file
    when $stdin then
      data = $stdin.read
      @name = '(stdin)'
    when ARGF
      data = file.read
      @name = file.path
    when ::File then
      file = file.clone
      file.reopen( file.path, 'r' )
      @name = file.path
      data = file.read
      file.close
    else
      if file.respond_to?( :read )
        data = file.read
        if file.respond_to?( :name ) then @name = file.name
        elsif file.respond_to?( :path ) then @name = file.path
        end
      else
        @name = file.to_s
        if test( ?f, @name ) then data = File.read( @name )
        else raise ArgumentError, "could not find an existing file at %p" % @name
        end
      end
    end
    super( data, options )
  end

end

=begin rdoc ANTLR3::CommonTokenStream

CommonTokenStream serves as the primary token stream implementation for feeding
sequential token input into parsers.

Using some TokenSource (such as a lexer), the stream collects a token sequence,
setting each token's <tt>index</tt> attribute to indicate the token's position
within the stream. The stream may be tuned to some channel value; off-channel
tokens will be filtered out by the #peek, #look, and #consume methods.
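
The channel filtering described above can be sketched without the runtime. This
is only an illustration under stated assumptions, not the library's
implementation: +Token+, +DEFAULT+, +HIDDEN+ and +look_on_channel+ are
hypothetical names invented for this sketch.

```ruby
# Hypothetical stand-ins; none of these names come from the ANTLR3 runtime.
Token = Struct.new( :type, :text, :channel, :index )

DEFAULT = :default
HIDDEN  = :hidden

# Return the k-th token (k >= 1) at or after +position+ whose channel matches
# +channel+, or nil when no such token exists. Off-channel tokens are skipped.
def look_on_channel( tokens, position, channel, k = 1 )
  return nil if k < 1
  tokens[ position..-1 ].to_a.select { |t| t.channel == channel }[ k - 1 ]
end

tokens = [
  Token.new( :INT,  '35', DEFAULT ),
  Token.new( :WS,   ' ',  HIDDEN ),
  Token.new( :MULT, '*',  DEFAULT ),
  Token.new( :WS,   ' ',  HIDDEN ),
  Token.new( :INT,  '4',  DEFAULT )
]

# number each token by its position in the full sequence,
# counting off-channel tokens as well
tokens.each_with_index { |t, i| t.index = i }

first  = look_on_channel( tokens, 0, DEFAULT )     # first on-channel token
second = look_on_channel( tokens, 0, DEFAULT, 2 )  # skips the whitespace token
hidden = look_on_channel( tokens, 0, HIDDEN )      # same list, tuned to HIDDEN
```

Because indexes are assigned before any filtering, the second on-channel token
in this sketch keeps index 2 even though only one on-channel token precedes it.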

=== Sample Usage

  source_input = ANTLR3::StringStream.new("35 * 4 - 1")
  lexer = Calculator::Lexer.new(source_input)
  tokens = ANTLR3::CommonTokenStream.new(lexer)

  # assume this grammar defines whitespace as tokens on channel HIDDEN
  # and numbers and operations as tokens on channel DEFAULT
  tokens.look         # => 0 INT["35"] @ line 1 col 0 (0..1)
  tokens.look(2)      # => 2 MULT["*"] @ line 1 col 3 (3..3)
  tokens.tokens(0, 2)
  # => [0 INT["35"] @ line 1 col 0 (0..1),
  #     1 WS[" "] @ line 1 col 2 (2..2),
  #     2 MULT["*"] @ line 1 col 3 (3..3)]
  # notice the #tokens method does not filter off-channel tokens

  lexer.reset
  hidden_tokens =
    ANTLR3::CommonTokenStream.new(lexer, :channel => ANTLR3::HIDDEN)
  hidden_tokens.look  # => 1 WS[" "] @ line 1 col 2 (2..2)

=end

class CommonTokenStream
  include TokenStream
  include Enumerable

  #
  # constructs a new token stream using the +token_source+ provided. +token_source+ is
  # usually a lexer, but can be any object that implements +next_token+ and includes
  # ANTLR3::TokenSource.
  #
  # If a block is provided, each token harvested will be yielded and if the block
  # returns a +nil+ or +false+ value, the token will not be added to the stream --
  # it will be discarded.
  #
  # === Options
  # [:channel] The channel value the stream should be tuned to initially
  # [:source_name] The source name (file name) attribute of the stream
  #
  # === Example
  #
  #   # create a new token stream that is tuned to channel :comment, and
  #   # discard all WHITE_SPACE tokens
  #   ANTLR3::CommonTokenStream.new(lexer, :channel => :comment) do |token|
  #     token.name != 'WHITE_SPACE'
  #   end
  #
  def initialize( token_source, options = {} )
    case token_source
    when CommonTokenStream
      # this is useful in cases where you want to convert a CommonTokenStream
      # to a RewriteTokenStream or other variation of the standard token stream
      stream = token_source
      @token_source = stream.token_source
      @channel = options.fetch( :channel ) { stream.channel or DEFAULT_CHANNEL }
      @source_name = options.fetch( :source_name ) { stream.source_name }
      tokens = stream.tokens.map { | t | t.dup }
    else
      @token_source = token_source
      @channel = options.fetch( :channel, DEFAULT_CHANNEL )
      @source_name = options.fetch( :source_name ) { @token_source.source_name rescue nil }
      tokens = @token_source.to_a
    end
    @last_marker = nil
    @tokens = block_given? ? tokens.select { | t | yield( t, self ) } : tokens
    @tokens.each_with_index { |t, i| t.index = i }
    # start at the first on-channel token, or at the end if there is none
    @position =
      if first_token = @tokens.find { |t| t.channel == @channel }
        @tokens.index( first_token )
      else @tokens.length
      end
  end

  #
  # resets the token stream and rebuilds it with a potentially new token source.
  # If no +token_source+ value is provided, the stream will attempt to reset the
  # current +token_source+ by calling +reset+ on the object. The stream will
  # then clear the token buffer and attempt to harvest new tokens. As with
  # CommonTokenStream.new, if a block is provided, each token is yielded and
  # discarded if the block returns a +false+ or +nil+ value.
  #
  def rebuild( token_source = nil )
    if token_source.nil?
      @token_source.reset rescue nil
    else @token_source = token_source
    end
    @tokens = block_given? ? @token_source.select { |token| yield( token ) } :
                             @token_source.to_a
    @tokens.each_with_index { |t, i| t.index = i }
    @last_marker = nil
    @position =
      if first_token = @tokens.find { |t| t.channel == @channel }
        @tokens.index( first_token )
      else @tokens.length
      end
    return self
  end

  #
  # tune the stream to a new channel value
  #
  def tune_to( channel )
    @channel = channel
  end

  #
  # the token class reported by the token source; if the source does not
  # respond to +token_class+, fall back to the class of the first buffered
  # token, or to CommonToken when the buffer is empty
  #
  def token_class
    @token_source.token_class
  rescue NoMethodError
    @position == -1 and fill_buffer
    @tokens.empty? ? CommonToken : @tokens.first.class
  end

  alias index position

  def size
    @tokens.length
  end

  alias length size

  ###### State-Control ################################################

  #
  # rewind the stream to its initial state
  #
  def reset
    @position = 0
    @position += 1 while token = @tokens[ @position ] and
                         token.channel != @channel
    @last_marker = nil
    return self
  end

  #
  # bookmark the current position of the input stream
  #
  def mark
    @last_marker = @position
  end

  def release( marker = nil )
    # do nothing
  end

  #
  # return to the position saved by +marker+ (by default, the last #mark)
  #
  def rewind( marker = @last_marker, release = true )
    seek( marker )
  end

  #
  # saves the current stream position, yields to the block,
  # and then ensures the stream's position is restored before
  # returning the value of the block
  #
  def hold( pos = @position )
    block_given? or return enum_for( :hold, pos )
    begin
      yield
    ensure
      seek( pos )
    end
  end

  ###### Stream Navigation ###########################################

  #
  # advance the stream one step to the next on-channel token
  #
  def consume
    token = @tokens[ @position ] || EOF_TOKEN
    if @position < @tokens.length
      @position = future?( 2 ) || @tokens.length
    end
    return( token )
  end

  #
  # jump to the stream position specified by +index+
  # note: seek does not check whether or not the
  #       token at the specified position is on-channel.
  #
  def seek( index )
    @position = index.to_i.bound( 0, @tokens.length )
    return self
  end

  #
  # return the type of the on-channel token at look-ahead distance +k+. <tt>k = 1</tt> represents
  # the current token. +k+ greater than 1 represents upcoming on-channel tokens. A negative
  # value of +k+ returns previous on-channel tokens consumed, where <tt>k = -1</tt> is the last
  # on-channel token consumed. <tt>k = 0</tt> has undefined behavior and returns +nil+.
  #
  def peek( k = 1 )
    tk = look( k ) and return( tk.type )
  end

  #
  # operates similarly to #peek, but returns the full token object at look-ahead position +k+
  #
  def look( k = 1 )
    index = future?( k ) or return nil
    @tokens.fetch( index, EOF_TOKEN )
  end

  # <tt>stream >> k</tt> looks ahead +k+ tokens; <tt>stream << k</tt> looks behind +k+ tokens
  alias >> look
  def << k
    self >> -k
  end

  #
  # returns the index of the on-channel token at look-ahead position +k+ or nil if no other
  # on-channel tokens exist
  #
  def future?( k = 1 )
    @position == -1 and fill_buffer

    case
    when k == 0 then nil
    when k < 0 then past?( -k )
    when k == 1 then @position
    else
      # since the stream only yields on-channel
      # tokens, the stream can't just go to the
      # next position, but rather must skip
      # over off-channel tokens
      ( k - 1 ).times.inject( @position ) do |cursor, |
        begin
          tk = @tokens.at( cursor += 1 ) or return( cursor )
          # ^- if tk is nil (i.e. cursor is outside array limits)
        end until tk.channel == @channel
        cursor
      end
    end
  end

  #
  # returns the index of the on-channel token at look-behind position +k+ or nil if no other
  # on-channel tokens exist before the current token
  #
  def past?( k = 1 )
    @position == -1 and fill_buffer

    case
    when k == 0 then nil
    when @position - k < 0 then nil
    else

      k.times.inject( @position ) do |cursor, |
        begin
          cursor <= 0 and return( nil )
          tk = @tokens.at( cursor -= 1 ) or return( nil )
        end until tk.channel == @channel
        cursor
      end

    end
  end

  #
  # yields each token in the stream (including off-channel tokens).
  # If no block is provided, the method returns an Enumerator object.
  # #each accepts the same arguments as #tokens.
  #
  def each( *args )
    block_given? or return enum_for( :each, *args )
    tokens( *args ).each { |token| yield( token ) }
  end


  #
  # yields each token in the stream with the given channel value.
  # If no channel value is given, the stream's tuned channel value will be used.
  # If no block is given, an enumerator will be returned.
  #
  def each_on_channel( channel = @channel )
    block_given? or return enum_for( :each_on_channel, channel )
    for token in @tokens
      token.channel == channel and yield( token )
    end
  end

  #
  # iterates through the token stream, yielding each on-channel token along the way.
  # After iteration has completed, the stream's position will be restored to where
  # it was before #walk was called. While #each and #each_on_channel do not change
  # the stream's position during iteration, #walk advances through the stream. This
  # makes it possible to look ahead and behind the current token during iteration.
  # If no block is given, an enumerator will be returned.
  #
  def walk
    block_given? or return enum_for( :walk )
    initial_position = @position
    begin
      while token = look and token.type != EOF
        consume
        yield( token )
      end
      return self
    ensure
      @position = initial_position
    end
  end

  #
  # returns a copy of the token buffer. If +start+ and +stop+ are provided, tokens
  # returns a slice of the token buffer from <tt>start..stop</tt>. The parameters
  # are converted to integers with their <tt>to_i</tt> methods, and thus tokens
  # can be provided to specify start and stop. If a block is provided, tokens are
  # yielded and filtered out of the return array if the block returns a +false+
  # or +nil+ value.
  #
  def tokens( start = nil, stop = nil )
    stop.nil? || stop >= @tokens.length and stop = @tokens.length - 1
    start.nil? || stop < 0 and start = 0
    tokens = @tokens[ start..stop ]

    if block_given?
      tokens.delete_if { |t| not yield( t ) }
    end

    return( tokens )
  end


  #
  # identical to Array#at, as applied to the stream's token buffer
  #
  def at( i )
    @tokens.at i
  end

  #
  # identical to Array#[], as applied to the stream's token buffer
  #
  def []( i, *args )
    @tokens[ i, *args ]
  end

  ###### Standard Conversion Methods ###############################
  def inspect
    string = "#<%p: @token_source=%p @ %p/%p" %
      [ self.class, @token_source.class, @position, @tokens.length ]
    tk = look( -1 ) and string << " #{ tk.inspect } <--"
    tk = look( 1 ) and string << " --> #{ tk.inspect }"
    string << '>'
  end

  #
  # fetches the text content of all tokens between +start+ and +stop+ and
  # joins the chunks into a single string
  #
  def extract_text( start = 0, stop = @tokens.length - 1 )
    start = start.to_i.at_least( 0 )
    stop = stop.to_i.at_most( @tokens.length )
    @tokens[ start..stop ].map! { |t| t.text }.join( '' )
  end

  alias to_s extract_text

end

end
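=begin

=== Sketch: position restoring with a block

The save-and-restore pattern used by CommonTokenStream#hold (and, in a similar
way, by #walk) can be tried in isolation. The following is a minimal sketch,
not the library's implementation: +Cursor+ is a hypothetical class invented for
this illustration, it ignores channels entirely, and its +seek+ clamps with
plain Array#max and Array#min rather than the library's own helper.

```ruby
# Hypothetical stand-in for illustration only; this is NOT the ANTLR3 class.
class Cursor
  attr_reader :position

  def initialize(items)
    @items    = items
    @position = 0
  end

  # return the current item and advance by one (never past the end)
  def consume
    item = @items[@position]
    @position += 1 if @position < @items.length
    item
  end

  # jump to index, clamped to the range 0..@items.length
  def seek(index)
    @position = [[index.to_i, 0].max, @items.length].min
    self
  end

  # run the block, then restore the saved position even if the block raises
  def hold(pos = @position)
    return enum_for(:hold, pos) unless block_given?
    begin
      yield
    ensure
      seek(pos)
    end
  end
end

cursor = Cursor.new(%w[a b c d])
cursor.consume                                       # move to position 1
seen       = cursor.hold { [cursor.consume, cursor.consume] }
after_hold = cursor.position                         # restored after the block

begin
  cursor.hold { cursor.consume; raise "parse failed" }
rescue RuntimeError
  after_error = cursor.position                      # restored despite the error
end
```

Because the restore happens in an +ensure+ clause, speculative look-ahead code
inside the block can consume freely or fail without leaving the cursor in an
unexpected place.

=end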