#!/usr/bin/ruby
# encoding: utf-8

=begin LICENSE

[The "BSD licence"]
Copyright (c) 2009-2010 Kyle Yetter
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

=end

module ANTLR3


=begin rdoc ANTLR3::Stream

= ANTLR3 Streams

This documentation first covers the general concept of streams as used by ANTLR
recognizers, and then discusses the specific <tt>ANTLR3::Stream</tt> module.

== ANTLR Stream Classes

ANTLR recognizers need a way to walk through input data in a serialized IO-style
fashion. They also need some book-keeping about the input to provide useful
information to developers, such as current line number and column. Furthermore,
to implement backtracking and various error recovery techniques, recognizers
need a way to record various locations in the input at a number of points in the
recognition process so the input state may be restored to a prior state.

ANTLR bundles all of this functionality into a number of Stream classes, each
designed to be used by recognizers for a specific recognition task. Most of the
Stream hierarchy is implemented in antlr3/stream.rb, which is loaded by default
when 'antlr3' is required.

---

Here's a brief overview of the various stream classes and their respective
purposes:

StringStream::
  Similar to StringIO from the standard Ruby library, StringStream wraps raw
  String data in a Stream interface for use by ANTLR lexers.
FileStream::
  A subclass of StringStream, FileStream simply wraps data read from an IO or
  File object for use by lexers.
CommonTokenStream::
  The job of a TokenStream is to read lexer output and then provide ANTLR
  parsers with the means to walk sequentially through a series of tokens.
  CommonTokenStream is the default TokenStream implementation.
TokenRewriteStream::
  A subclass of CommonTokenStream, TokenRewriteStream gives rewriting parsers
  the ability to produce new output text from an input token sequence by
  managing rewrite "programs" on top of the stream.
CommonTreeNodeStream::
  In a similar fashion to CommonTokenStream, CommonTreeNodeStream feeds tokens
  to recognizers in a sequential fashion. However, the stream object serializes
  an Abstract Syntax Tree into a flat, one-dimensional sequence, but preserves
  the two-dimensional shape of the tree using special UP and DOWN tokens. The
  sequence is primarily used by ANTLR Tree Parsers. *Note:* this class is
  defined in antlr3/tree.rb, not in antlr3/stream.rb.

---

The next few sections cover the most significant methods of all stream classes.

=== consume / look / peek

<tt>stream.consume</tt> is used to advance a stream one unit. StringStreams are
advanced by one character and TokenStreams are advanced by one token.

<tt>stream.peek(k = 1)</tt> is used to quickly retrieve the object of interest
to a recognizer at the look-ahead position specified by <tt>k</tt>. For
<b>StringStreams</b>, this is the <i>integer value of the character</i>
<tt>k</tt> characters ahead of the stream cursor. For <b>TokenStreams</b>, this
is the <i>integer token type of the token</i> <tt>k</tt> tokens ahead of the
stream cursor.

<tt>stream.look(k = 1)</tt> is used to retrieve the full object of interest at
the look-ahead position specified by <tt>k</tt>. While <tt>peek</tt> provides the
<i>bare-minimum lightweight information</i> that the recognizer needs,
<tt>look</tt> provides the <i>full object of concern</i> in the stream. For
<b>StringStreams</b>, this is a <i>string object containing the single
character</i> <tt>k</tt> characters ahead of the stream cursor. For
<b>TokenStreams</b>, this is the <i>full token structure</i> <tt>k</tt> tokens
ahead of the stream cursor.

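As an illustration only, the following minimal sketch shows how
<tt>consume</tt>, <tt>peek</tt>, and <tt>look</tt> relate on a character
stream. <tt>TinyCharStream</tt> is a hypothetical toy class, not part of this
library, and returning +nil+ past the end of input is an assumption of the
sketch rather than the documented behavior of the runtime.

```ruby
# Hypothetical toy class: NOT ANTLR3::StringStream, only an illustration
# of the consume / peek / look contract described above.
class TinyCharStream
  attr_reader :position

  def initialize(text)
    @text = text
    @position = 0 # index of the next unread character
  end

  # Advance the cursor by one character, stopping at the end of input.
  def consume
    @position += 1 if @position < @text.length
  end

  # One-character String k characters ahead (k = 1 is the next character).
  # Assumption of this sketch: nil is returned past the end of input.
  def look(k = 1)
    @text[@position + k - 1]
  end

  # Integer value of the character that look(k) would return, or nil.
  def peek(k = 1)
    char = look(k)
    char && char.ord
  end
end

stream = TinyCharStream.new("ab")
stream.peek      # => 97
stream.look      # => "a"
stream.consume
stream.look      # => "b"
stream.peek(2)   # => nil
```

The sketch implements <tt>peek</tt> in terms of <tt>look</tt> for brevity; a
real stream may well do the opposite so that <tt>peek</tt> stays cheap.
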
<b>Note:</b> in most ANTLR runtime APIs for other languages, <tt>peek</tt> is
implemented by some method with a name like <tt>LA(k)</tt> and <tt>look</tt> is
implemented by some method with a name like <tt>LT(k)</tt>. When writing this
Ruby runtime API, I found this naming practice confusing, ambiguous, and
un-Ruby-like. Thus, I chose <tt>peek</tt> and <tt>look</tt> to represent a
quick-look (peek) and a full-fledged look-ahead operation (look). If this causes
confusion or any sort of compatibility strife for developers using this
implementation, all apologies.

=== mark / rewind / release

<tt>marker = stream.mark</tt> causes the stream to record important information
about the current stream state, place the data in an internal memory table, and
return a memento, <tt>marker</tt>. The marker object is typically an integer key
to the stream's internal memory table.

Used in tandem with <tt>stream.rewind(marker = last_marker)</tt>, the marker can
be used to restore the stream to an earlier state. This is used by recognizers
to perform tasks such as backtracking and error recovery.

<tt>stream.release(marker = last_marker)</tt> can be used to release an existing
state marker from the memory table.

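The marker table can be pictured with a small sketch. <tt>MarkableStream</tt>
below is a hypothetical toy class, and the choice to keep a marker in the table
until <tt>release</tt> is called is an assumption made for the sketch; the
actual stream classes may save more state and may release markers differently.

```ruby
# Hypothetical toy class illustrating mark / rewind / release.
class MarkableStream
  attr_reader :position, :last_marker

  def initialize(text)
    @text = text
    @position = 0
    @markers = {}      # marker (Integer) => saved cursor position
    @next_marker = 0
    @last_marker = nil
  end

  def consume
    @position += 1 if @position < @text.length
  end

  # Save the current state and return the integer key it is stored under.
  def mark
    @next_marker += 1
    @markers[@next_marker] = @position
    @last_marker = @next_marker
  end

  # Restore the state saved under marker. In this sketch the entry stays
  # in the table; an unknown marker raises KeyError.
  def rewind(marker = @last_marker)
    @position = @markers.fetch(marker)
  end

  # Drop the state saved under marker from the table.
  def release(marker = @last_marker)
    @markers.delete(marker)
  end
end

stream = MarkableStream.new("abcdef")
stream.consume
marker = stream.mark          # remembers position 1
3.times { stream.consume }    # speculative work moves the cursor to 4
stream.rewind(marker)         # the cursor is back at position 1
stream.release(marker)
```

A backtracking recognizer would follow the same pattern: mark, attempt an
alternative, then rewind if the attempt fails.
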
=== seek

<tt>stream.seek(position)</tt> moves the stream cursor to an absolute position
within the stream, basically like typical Ruby <tt>IO#seek</tt> style methods.
However, unlike <tt>IO#seek</tt>, ANTLR streams currently always use absolute
position seeking.

== The Stream Module

<tt>ANTLR3::Stream</tt> is an abstract-ish base mixin for all IO-like stream
classes used by ANTLR recognizers.

The module doesn't do much on its own besides define arguably annoying
``abstract'' pseudo-methods that demand implementation when it is mixed in to a
class that wants to be a Stream. Right now this exists as an artifact of porting
the ANTLR Java/Python runtime library to Ruby. In Java, of course, this is
represented as an interface. In Ruby, however, objects are duck-typed and
interfaces aren't that useful as programmatic entities -- in fact, it's mildly
wasteful to have a module like this hanging out. Thus, I may axe it.

When mixed in, it does give the class #size and #source_name attribute
methods.

Except in a small handful of places, most of the ANTLR runtime library uses
duck-typing and not type checking on objects. This means that the methods which
manipulate stream objects don't usually bother checking that the object is a
Stream and assume that the object implements the proper stream interface. Thus,
it is not strictly necessary that custom stream objects include ANTLR3::Stream,
though it isn't a bad idea.

=end

module Stream
  include ANTLR3::Constants
  extend ClassMacros

  ##
  # :method: consume
  # used to advance a stream one unit (such as character or token)
  abstract :consume

  ##
  # :method: peek( k = 1 )
  # used to quickly retrieve the object of interest to a recognizer at lookahead
  # position specified by <tt>k</tt> (such as integer value of a character or an
  # integer token type)
  abstract :peek

  ##
  # :method: look( k = 1 )
  # used to retrieve the full object of interest at lookahead position specified
  # by <tt>k</tt> (such as a character string or a token structure)
  abstract :look

  ##
  # :method: mark
  # saves the current position for the purposes of backtracking and
  # returns a value to pass to #rewind at a later time
  abstract :mark

  ##
  # :method: index
  # returns the current position of the stream
  abstract :index

  ##
  # :method: rewind( marker = last_marker )
  # restores the stream position using the state information previously saved
  # by the given marker
  abstract :rewind

  ##
  # :method: release( marker = last_marker )
  # clears the saved state information associated with the given marker value
  abstract :release

  ##
  # :method: seek( position )
  # moves the stream to the absolute index given by +position+
  abstract :seek

  ##
  # the total number of symbols in the stream
  attr_reader :size

  ##
  # indicates an identifying name for the stream -- usually the file path of the input
  attr_accessor :source_name
end

=begin rdoc ANTLR3::CharacterStream

CharacterStream further extends the abstract-ish base mixin Stream to add
methods specific to navigating character-based input data. Thus, it serves as an
imitation of the Java interface for text-based streams, which are primarily
used by lexers.

It adds the ``abstract'' method, <tt>substring(start, stop)</tt>, which must be
implemented to return a slice of the input string from position <tt>start</tt>
to position <tt>stop</tt>. It also adds attribute accessor methods <tt>line</tt>
and <tt>column</tt>, which are expected to indicate the current line number and
position within the current line, respectively.

== A Word About <tt>line</tt> and <tt>column</tt> attributes

Presumably, the concept of <tt>line</tt> and <tt>column</tt> attributes of text
is familiar to most developers. Line numbers of text are indexed from number 1
up (not 0). Column numbers are indexed from 0 up. Thus, examining sample text:

  Hey this is the first line.
  Oh, and this is the second line.

Line 1 is the string "Hey this is the first line.\\n". If a character stream is at
line 2, column 0, the stream cursor is sitting between the characters "\\n"
and "O".

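The indexing convention can be checked with a short sketch. The helper
<tt>line_and_column</tt> is hypothetical and is not how the stream classes track
these values; it simply recomputes them from a cursor offset, assuming the
offset lies within the text and lines end with "\n".

```ruby
# Hypothetical helper: returns [line, column] for a cursor sitting just
# before text[offset], counting lines from 1 and columns from 0.
def line_and_column(text, offset)
  consumed = text[0, offset]              # everything left of the cursor
  line = consumed.count("\n") + 1         # each newline starts a new line
  last_newline = consumed.rindex("\n")
  column = last_newline ? offset - last_newline - 1 : offset
  [line, column]
end

text = "Hey this is the first line.\nOh, and this is the second line.\n"
line_and_column(text, 0)    # => [1, 0]
line_and_column(text, 4)    # => [1, 4]
line_and_column(text, 28)   # => [2, 0]  (cursor between "\n" and "O")
```
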
*Note:* most ANTLR runtime APIs for other languages refer to <tt>column</tt>
with the more precise but lengthier name <tt>charPositionInLine</tt>. I preferred
to keep it simple and familiar in this Ruby runtime API.

=end

module CharacterStream
  include Stream
  extend ClassMacros
  include Constants

  ##
  # :method: substring(start,stop)
  abstract :substring

  attr_accessor :line
  attr_accessor :column
end


=begin rdoc ANTLR3::TokenStream

TokenStream further extends the abstract-ish base mixin Stream to add methods
specific to navigating token sequences. Thus, it serves as an imitation of the
Java interface for token-based streams, which are used by many different
components in ANTLR, including parsers and tree parsers.

== Token Streams

Token streams wrap a sequence of token objects produced by some token source,
usually a lexer. They provide the operations required by higher-level
recognizers, such as parsers and tree parsers, for navigating through the
sequence of tokens. Unlike simple character-based streams, such as StringStream,
token-based streams have an additional level of complexity because they must
manage the task of "tuning" to a specific token channel.

One of the main advantages of ANTLR-based recognition is the token
<i>channel</i> feature, which allows you to hold on to all tokens of interest
while only presenting a specific set of interesting tokens to a parser. For
example, if you need to hide whitespace and comments from a parser, but hang on
to them for some other purpose, you have the lexer assign the comments and
whitespace to channel value HIDDEN as it creates the tokens.

When you create a token stream, you can tune it to some specific channel value.
Then, all <tt>peek</tt>, <tt>look</tt>, and <tt>consume</tt> operations only
yield tokens that have the same value for <tt>channel</tt>. The stream skips
over any non-matching tokens in between.

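A rough sketch of channel tuning follows. Everything in it is hypothetical:
<tt>SketchToken</tt>, <tt>ChannelTunedStream</tt>, and the two channel
constants (whose values are chosen only for the example) stand in for the real
token and stream classes, which are more involved.

```ruby
# Hypothetical stand-ins; the constant values are illustrative only.
SKETCH_DEFAULT_CHANNEL = 0
SKETCH_HIDDEN_CHANNEL  = 99
SketchToken = Struct.new(:type, :text, :channel)

class ChannelTunedStream
  attr_accessor :channel

  def initialize(tokens, channel)
    @tokens = tokens
    @channel = channel
    @position = skip_off_channel(0)   # start on the first on-channel token
  end

  # Full token k on-channel tokens ahead of the cursor, or nil at the end.
  def look(k = 1)
    index = @position
    (k - 1).times { index = skip_off_channel(index + 1) }
    @tokens[index]
  end

  # Integer token type of the token that look(k) would return, or nil.
  def peek(k = 1)
    token = look(k)
    token && token.type
  end

  # Step to the next on-channel token.
  def consume
    @position = skip_off_channel(@position + 1) if @position < @tokens.length
  end

  private

  # First index at or after index whose token is on the tuned channel.
  def skip_off_channel(index)
    index += 1 while index < @tokens.length && @tokens[index].channel != @channel
    index
  end
end

tokens = [
  SketchToken.new(1, "x", SKETCH_DEFAULT_CHANNEL),
  SketchToken.new(2, " ", SKETCH_HIDDEN_CHANNEL),
  SketchToken.new(3, "=", SKETCH_DEFAULT_CHANNEL)
]
stream = ChannelTunedStream.new(tokens, SKETCH_DEFAULT_CHANNEL)
stream.look.text      # => "x"
stream.look(2).text   # => "="  (the hidden whitespace token is skipped)
stream.consume
stream.peek           # => 3
```

The hidden token is still in the underlying array, so other code can retrieve
it later even though the tuned stream never presents it.
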
== The TokenStream Interface

In addition to the abstract methods and attribute methods provided by the base
Stream module, TokenStream adds several method implementation requirements and
attributes of its own.

=end

module TokenStream
  include Stream
  extend ClassMacros

  ##
  # expected to return the token source object (such as a lexer) from which
  # all tokens in the stream were retrieved
  attr_reader :token_source

  ##
  # expected to return the value of the last marker produced by a call to
  # <tt>stream.mark</tt>
  attr_reader :last_marker

  ##
  # expected to return the integer index of the stream cursor
  attr_reader :position

  ##
  # the integer channel value to which the stream is ``tuned''
  attr_accessor :channel

  ##
  # :method: to_s(start=0,stop=tokens.length-1)
  # should take the tokens between start and stop in the sequence, extract their text
  # and return the concatenation of all the text chunks
  abstract :to_s

  ##
  # :method: at( i )
  # return the stream symbol at index +i+
  abstract :at
end

340324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver=begin rdoc ANTLR3::StringStream
341324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver
342324c4644fee44b9898524c09511bd33c3f12e2dfBen GruverA StringStream's purpose is to wrap the basic, naked text input of a recognition
343324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruversystem. Like all other stream types, it provides serial navigation of the input;
344324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruvera recognizer can arbitrarily step forward and backward through the stream's
345324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruversymbols as it requires. StringStream and its subclasses are they main way to
346324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverfeed text input into an ANTLR Lexer for token processing.
347324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver
348324c4644fee44b9898524c09511bd33c3f12e2dfBen GruverThe stream's symbols of interest, of course, are character values. Thus, the
349324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver#peek method returns the integer character value at look-ahead position
350324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver<tt>k</tt> and the #look method returns the character value as a +String+. They
351324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruveralso track various pieces of information such as the line and column numbers at
352324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruverthe current position.
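
The difference between #peek and #look can be sketched with a small stand-in
class. This is a minimal illustration, not the runtime's implementation: the
class name +LookaheadSketch+ and the +EOF+ value of -1 are assumptions made for
the example.

```ruby
# Hypothetical stand-in for illustration only; not the runtime's StringStream.
class LookaheadSketch
  EOF = -1   # assumed end-of-input sentinel

  def initialize( text )
    @string   = text
    @data     = text.codepoints.to_a
    @position = 0
  end

  # integer character value at look-ahead distance k (k = 1 is the current character)
  def peek( k = 1 )
    k == 0 and return nil
    k += 1 if k < 0
    index = @position + k - 1
    index < 0 and return nil
    @data[ index ] or EOF
  end

  # same position as #peek, but returned as a one-character String
  def look( k = 1 )
    k == 0 and return nil
    k += 1 if k < 0
    index = @position + k - 1
    index < 0 and return nil
    @string[ index ]
  end

  # step forward by one character, stopping at the end of the input
  def consume
    @position += 1 if @position < @data.length
  end
end

stream = LookaheadSketch.new( "ab" )
stream.peek        # => 97
stream.look        # => "a"
stream.consume
stream.peek( -1 )  # => 97, the character just consumed
stream.peek( 2 )   # => -1, past the end of the input
```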

=== Note About Text Encoding

This version of the runtime library primarily targets Ruby version 1.8, which
does not have strong built-in support for multi-byte character encodings. Thus,
characters are assumed to be represented by a single byte -- an integer between
0 and 255. Ruby 1.9 does provide built-in encoding support for multi-byte
characters, but currently this library does not provide any streams to handle
non-ASCII encoding. However, encoding-savvy recognition code is a future
development goal for this project.
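
To see why the single-byte assumption matters, the following sketch (assuming
Ruby 1.9 or later) compares the byte values and the code point values of a short
UTF-8 string. A recognizer that works on bytes would see one more symbol than a
recognizer that works on code points.

```ruby
text = "caf\u00e9"            # "café"; the last character needs two bytes in UTF-8

text.bytes.to_a               # => [99, 97, 102, 195, 169]
text.codepoints.to_a          # => [99, 97, 102, 233]

text.bytes.to_a.length        # => 5
text.codepoints.to_a.length   # => 4
```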

=end

class StringStream
  NEWLINE = ?\n.ord
  
  include CharacterStream
  
  # current integer character index of the stream
  attr_reader :position
  
  # the current line number of the input, indexed upward from 1
  attr_reader :line
  
  # the current character position within the current line, indexed upward from 0
  attr_reader :column
  
  # the name associated with the stream -- usually a file name
  # defaults to <tt>"(string)"</tt>
  attr_accessor :name
  
  # the entire string that is wrapped by the stream
  attr_reader :data
  attr_reader :string
  
  if RUBY_VERSION >= '1.9'   # string comparison; also selects this branch on Ruby 2.x and 3.x
    
    # creates a new StringStream object where +data+ is the string data to stream.
    # accepts the following options in a symbol-to-value hash:
    #
    # [:file or :name] the (file) name to associate with the stream; default: <tt>'(string)'</tt>
    # [:line] the initial line number; default: +1+
    # [:column] the initial column number; default: +0+
    # [:position] the initial character index; default: +0+
    # 
    def initialize( data, options = {} )      # for 1.9
      @string   = data.to_s.encode( Encoding::UTF_8 ).freeze
      @data     = @string.codepoints.to_a.freeze
      @position = options.fetch :position, 0
      @line     = options.fetch :line, 1
      @column   = options.fetch :column, 0
      @markers  = []
      @name   ||= options[ :file ] || options[ :name ] # || '(string)'
      mark
    end
    
    #
    # identical to #peek, except it returns the character value as a String
    # 
    def look( k = 1 )               # for 1.9
      k == 0 and return nil
      k += 1 if k < 0
      
      index = @position + k - 1
      index < 0 and return nil
      
      @string[ index ]
    end
    
  else
    
    # creates a new StringStream object where +data+ is the string data to stream.
    # accepts the following options in a symbol-to-value hash:
    #
    # [:file or :name] the (file) name to associate with the stream; default: <tt>'(string)'</tt>
    # [:line] the initial line number; default: +1+
    # [:column] the initial column number; default: +0+
    # [:position] the initial character index; default: +0+
    # 
    def initialize( data, options = {} )    # for 1.8
      @data = data.to_s
      @data.equal?( data ) and @data = @data.clone
      @data.freeze
      @string = @data
      @position = options.fetch :position, 0
      @line = options.fetch :line, 1
      @column = options.fetch :column, 0
      @markers = []
      @name ||= options[ :file ] || options[ :name ] # || '(string)'
      mark
    end
    
    #
    # identical to #peek, except it returns the character value as a String
    # 
    def look( k = 1 )                        # for 1.8
      k == 0 and return nil
      k += 1 if k < 0
      
      index = @position + k - 1
      index < 0 and return nil
      
      c = @data[ index ] and c.chr
    end
    
  end
  
  def size
    @data.length
  end
  
  alias length size
  
  # 
  # rewinds the stream back to the start and clears out any existing marker entries
  # 
  def reset
    initial_location = @markers.first
    @position, @line, @column = initial_location
    @markers.clear
    @markers << initial_location
    return self
  end
  
  #
  # advance the stream by one character; returns the character consumed,
  # or +EOF+ if the stream is already exhausted
  # 
  def consume
    c = @data[ @position ] || EOF
    if @position < @data.length
      @column += 1
      if c == NEWLINE
        @line += 1
        @column = 0
      end
      @position += 1
    end
    return( c )
  end
  
  #
  # return the character at look-ahead distance +k+ as an integer. <tt>k = 1</tt> represents
  # the current character. +k+ greater than 1 represents upcoming characters. A negative
  # value of +k+ returns previous characters consumed, where <tt>k = -1</tt> is the last
  # character consumed. <tt>k = 0</tt> is not a meaningful look-ahead distance and returns +nil+.
  # Positions past the end of the input return +EOF+.
  # 
  def peek( k = 1 )
    k == 0 and return nil
    k += 1 if k < 0
    index = @position + k - 1
    index < 0 and return nil
    @data[ index ] or EOF
  end
  
  #
  # return a substring around the stream cursor at a distance +k+
  # if <tt>k >= 0</tt>, return the next k characters
  # if <tt>k < 0</tt>, return the previous <tt>|k|</tt> characters
  # 
  def through( k )
    if k >= 0 then @string[ @position, k ] else
      start = ( @position + k ).at_least( 0 ) # start cannot be negative or index will wrap around
      @string[ start ... @position ]
    end
  end
  
  # operator style look-ahead
  alias >> look
  
  # operator style look-behind
  # (delegates to #>> with a negated distance; calling #<< here would recurse forever)
  def <<( k )
    self >> -k
  end
  
  alias index position
  alias character_index position
  
  alias source_name name
  
  #
  # Returns true if the stream appears to be at the beginning of a new line.
  # This is an extra utility method for use inside lexer actions if needed.
  # 
  def beginning_of_line?
    @position.zero? or @data[ @position - 1 ] == NEWLINE
  end
  
  #
  # Returns true if the stream appears to be at the end of a line, that is,
  # if the current character is a newline.
  # This is an extra utility method for use inside lexer actions if needed.
  # 
  def end_of_line?
    @data[ @position ] == NEWLINE #if @position < @data.length
  end
  
  #
  # Returns true if the stream has been exhausted.
  # This is an extra utility method for use inside lexer actions if needed.
  # 
  def end_of_string?
    @position >= @data.length
  end

  #
  # Returns true if the stream is at the beginning of the stream (position = 0).
  # This is an extra utility method for use inside lexer actions if needed.
  # 
  def beginning_of_string?
    @position == 0
  end
  
  alias eof? end_of_string?
  alias bof? beginning_of_string?
  
  #
  # record the current stream location parameters in the stream's marker table and
  # return an integer-valued bookmark that may be used to restore the stream's
  # position with the #rewind method. This method is used to implement backtracking.
  # 
  def mark
    state = [ @position, @line, @column ].freeze
    @markers << state
    return @markers.length - 1
  end
  
  #
  # restore the stream to an earlier location recorded by #mark. If no marker value is
  # provided, the last marker generated by #mark will be used. If +release+ is true
  # (the default), the marker and all markers created after it are released.
  # 
  def rewind( marker = @markers.length - 1, release = true )
    ( marker >= 0 and location = @markers[ marker ] ) or return( self )
    @position, @line, @column = location
    release( marker ) if release
    return self
  end
  
  #
  # the total number of markers currently in existence
  # 
  def mark_depth
    @markers.length
  end
  
  #
  # the last marker value created by a call to #mark
  # 
  def last_marker
    @markers.length - 1
  end
  
  #
  # let go of the bookmark data for the marker and all marker
  # values created after the marker. The initial marker (0) is never released.
  # 
  def release( marker = @markers.length - 1 )
    marker.between?( 1, @markers.length - 1 ) or return
    @markers.pop( @markers.length - marker )
    return self
  end
  
  #
  # jump to the absolute position value given by +index+.
  # note: if +index+ is before the current position, the +line+ and +column+
  #       attributes of the stream are not updated and will probably be incorrect
  # 
  def seek( index )
    index = index.bound( 0, @data.length )  # ensures index is within the stream's range
    if index > @position
      skipped = through( index - @position )
      if lc = skipped.count( "\n" ) and lc.zero?
        @column += skipped.length
      else
        @line += lc
        @column = skipped.length - skipped.rindex( "\n" ) - 1
      end
    end
    @position = index
    return nil
  end
  
  # 
  # customized object inspection that shows:
  # * the stream class
  # * the stream's location in <tt>index / line:column</tt> format
  # * +before_chars+ characters before the cursor (6 characters by default)
  # * +after_chars+ characters after the cursor (10 characters by default)
  # 
  def inspect( before_chars = 6, after_chars = 10 )
    before = through( -before_chars ).inspect
    @position - before_chars > 0 and before.insert( 0, '... ' )
    
    after = through( after_chars ).inspect
    @position + after_chars + 1 < @data.length and after << ' ...'
    
    location = "#@position / line #@line:#@column"
    "#<#{ self.class }: #{ before } | #{ after } @ #{ location }>"
  end
  
  #
  # return the string slice between positions +start+ and +stop+ (inclusive)
  # 
  def substring( start, stop )
    @string[ start, stop - start + 1 ]
  end
  
  #
  # identical to String#[]
  # 
  def []( start, *args )
    @string[ start, *args ]
  end
end


=begin rdoc ANTLR3::FileStream

FileStream is a character stream that uses data stored in some external file. It
is nearly identical to StringStream, except that it reads its data from a file
and automatically sets up the +source_name+ and +line+ parameters. It does
not actually use any buffered IO operations throughout the stream navigation
process. Instead, it reads the file data once when the stream is initialized.
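
The read-once behavior can be sketched as follows. This is only an
illustration under stated assumptions: +slurp_once+ is a hypothetical helper,
not part of the runtime, and it covers only the path-string case.

```ruby
require 'tempfile'

# Hypothetical helper for illustration; FileStream itself also handles
# IO-like objects and sets up the stream name.
def slurp_once( path )
  data = File.read( path )   # a single read; no further IO happens afterwards
  [ data, path ]
end

file = Tempfile.new( 'sketch' )
file.write( "1 + 2\n" )
file.close

data, name = slurp_once( file.path )
file.unlink                  # the file may disappear; the data is already in memory

data          # => "1 + 2\n"
data[ 0, 1 ]  # => "1"
```

Because all navigation happens on the in-memory string, later changes to the
file are not visible to the stream.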

=end

class FileStream < StringStream
  
  #
  # creates a new FileStream object using the given +file+ object.
  # If +file+ is a path string, the file will be read, its contents
  # will be used, and the +name+ attribute will be set to the path.
  # If +file+ is an IO-like object (that responds to :read),
  # the content of the object will be used and the stream will
  # attempt to set its +name+ attribute, first trying the method #name
  # on the object, then trying the method #path on the object.
  #
  # see StringStream.new for a list of additional options
  # the constructor accepts
  # 
  def initialize( file, options = {} )
    case file
    when $stdin then
      data = $stdin.read
      @name = '(stdin)'
    when ARGF
      data = file.read
      @name = file.path
    when ::File then
      file = file.clone
      file.reopen( file.path, 'r' )
      @name = file.path
      data = file.read
      file.close
    else
      if file.respond_to?( :read )
        data = file.read
        if file.respond_to?( :name ) then @name = file.name
        elsif file.respond_to?( :path ) then @name = file.path
        end
      else
        @name = file.to_s
        if test( ?f, @name ) then data = File.read( @name )
        else raise ArgumentError, "could not find an existing file at %p" % @name
        end
      end
    end
    super( data, options )
  end
  
end

=begin rdoc ANTLR3::CommonTokenStream

CommonTokenStream serves as the primary token stream implementation for feeding
sequential token input into parsers.

Using some TokenSource (such as a lexer), the stream collects a token sequence,
setting each token's <tt>index</tt> attribute to indicate the token's position
within the stream. The stream may be tuned to some channel value; off-channel
tokens will be filtered out by the #peek, #look, and #consume methods.
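
The idea of indexing tokens and filtering by channel can be sketched without
the runtime. Everything in this sketch is hypothetical: the token struct, the
channel numbers 0 and 99, and the +look+ lambda are assumptions made for the
illustration, and the lambda handles only forward look-ahead (k >= 1).

```ruby
# Hypothetical token type and channel numbers, for illustration only;
# the runtime defines its own token class and channel constants.
sketch_token    = Struct.new( :type, :text, :channel, :index )
default_channel = 0
hidden_channel  = 99

raw = [ [ :INT,  "35", default_channel ],
        [ :WS,   " ",  hidden_channel  ],
        [ :MULT, "*",  default_channel ] ]

# collect the sequence, recording each token's position in its index attribute
tokens = raw.each_with_index.map do | ( type, text, channel ), i |
  sketch_token.new( type, text, channel, i )
end

# look-ahead that skips tokens on other channels (k >= 1 only)
look = lambda do | position, k, channel |
  tokens.drop( position ).select { | t | t.channel == channel }[ k - 1 ]
end

look.call( 0, 1, default_channel ).text    # => "35"
look.call( 0, 2, default_channel ).index   # => 2
look.call( 0, 1, hidden_channel ).text     # => " "
```

Note that the index of the second on-channel token is 2, not 1: indexes count
every collected token, while look-ahead counts only tokens on the tuned channel.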
730324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver
731324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver=== Sample Usage
732324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver
733324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  
734324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  source_input = ANTLR3::StringStream.new("35 * 4 - 1")
735324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  lexer = Calculator::Lexer.new(source_input)
736324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  tokens = ANTLR3::CommonTokenStream.new(lexer)
737324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  
738324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  # assume this grammar defines whitespace as tokens on channel HIDDEN
739324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  # and numbers and operations as tokens on channel DEFAULT
740324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  tokens.look         # => 0 INT['35'] @ line 1 col 0 (0..1)
741324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  tokens.look(2)      # => 2 MULT["*"] @ line 1 col 2 (3..3)
742324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  tokens.tokens(0, 2)
743324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver    # => [0 INT["35"] @line 1 col 0 (0..1), 
744324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver    #     1 WS[" "] @line 1 col 2 (1..1), 
745324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver    #     2 MULT["*"] @ line 1 col 3 (3..3)]
746324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver    # notice the #tokens method does not filter off-channel tokens
747324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  
748324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  lexer.reset
749324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  hidden_tokens = 
750324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver    ANTLR3::CommonTokenStream.new(lexer, :channel => ANTLR3::HIDDEN)
751324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver  hidden_tokens.look # => 1 WS[' '] @ line 1 col 2 (1..1)
752324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver
753324c4644fee44b9898524c09511bd33c3f12e2dfBen Gruver=end

class CommonTokenStream
  include TokenStream
  include Enumerable
  
  #
  # constructs a new token stream using the +token_source+ provided. +token_source+ is
  # usually a lexer, but can be any object that implements +next_token+ and includes
  # ANTLR3::TokenSource.
  #
  # If a block is provided, each token harvested will be yielded and if the block
  # returns a +nil+ or +false+ value, the token will not be added to the stream --
  # it will be discarded.
  #
  # === Options
  # [:channel] The channel value the stream should be tuned to initially
  # [:source_name] The source name (file name) attribute of the stream
  # 
  # === Example
  #
  #   # create a new token stream that is tuned to channel :comment, and
  #   # discard all WHITE_SPACE tokens
  #   ANTLR3::CommonTokenStream.new(lexer, :channel => :comment) do |token|
  #     token.name != 'WHITE_SPACE'
  #   end
  # 
  def initialize( token_source, options = {} )
    case token_source
    when CommonTokenStream
      # this is useful in cases where you want to convert a CommonTokenStream
      # to a RewriteTokenStream or other variation of the standard token stream
      stream = token_source
      @token_source = stream.token_source
      @channel = options.fetch( :channel ) { stream.channel or DEFAULT_CHANNEL }
      @source_name = options.fetch( :source_name ) { stream.source_name }
      tokens = stream.tokens.map { | t | t.dup }
    else
      @token_source = token_source
      @channel = options.fetch( :channel, DEFAULT_CHANNEL )
      @source_name = options.fetch( :source_name ) { @token_source.source_name rescue nil }
      tokens = @token_source.to_a
    end
    @last_marker = nil
    @tokens = block_given? ? tokens.select { | t | yield( t, self ) } : tokens
    @tokens.each_with_index { |t, i| t.index = i }
    @position = 
      if first_token = @tokens.find { |t| t.channel == @channel }
        @tokens.index( first_token )
      else @tokens.length
      end
  end
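=begin EXAMPLE

The buffer-building steps in the constructor above (optional block filter,
index assignment, locating the first on-channel token) can be sketched in
isolation. This is only an illustration under stated assumptions: BufferToken,
build_buffer and sample_tokens are hypothetical names that are not part of the
ANTLR3 runtime, and channels are plain symbols here.

```ruby
# Hypothetical stand-in for a token; the real runtime supplies its own class.
BufferToken = Struct.new(:name, :channel, :index)

# Filters raw tokens through an optional block, numbers the survivors, and
# finds the first token on +channel+ (or the buffer length when none exists).
def build_buffer(raw_tokens, channel)
  kept = block_given? ? raw_tokens.select { |t| yield(t) } : raw_tokens.dup
  kept.each_with_index { |t, i| t.index = i }
  position = kept.index { |t| t.channel == channel } || kept.length
  [kept, position]
end

# Assumed sample input: whitespace on :hidden, everything else on :default.
def sample_tokens
  [
    BufferToken.new('WS', :hidden),
    BufferToken.new('INT', :default),
    BufferToken.new('WS', :hidden),
    BufferToken.new('MULT', :default)
  ]
end

all_tokens, first_default = build_buffer(sample_tokens, :default)
no_space, first_kept = build_buffer(sample_tokens, :default) { |t| t.name != 'WS' }
puts first_default                    # start position with whitespace kept
puts no_space.map(&:index).inspect    # indexes are reassigned after filtering
puts first_kept                       # start position once whitespace is dropped
```

Discarding tokens in the block changes the indexes of the tokens that remain,
which is why the indexes are assigned only after filtering.
=end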
  
  #
  # resets the token stream and rebuilds it with a potentially new token source.
  # If no +token_source+ value is provided, the stream will attempt to reset the
  # current +token_source+ by calling +reset+ on the object. The stream will
  # then clear the token buffer and attempt to harvest new tokens. Identical in
  # behavior to CommonTokenStream.new, if a block is provided, tokens will be
  # yielded and discarded if the block returns a +false+ or +nil+ value.
  # 
  def rebuild( token_source = nil )
    if token_source.nil?
      @token_source.reset rescue nil
    else @token_source = token_source
    end
    @tokens = block_given? ? @token_source.select { |token| yield( token ) } :
                             @token_source.to_a
    @tokens.each_with_index { |t, i| t.index = i }
    @last_marker = nil
    @position = 
      if first_token = @tokens.find { |t| t.channel == @channel }
        @tokens.index( first_token )
      else @tokens.length
      end
    return self
  end
  
  #
  # tune the stream to a new channel value
  # 
  def tune_to( channel )
    @channel = channel
  end
  
  def token_class
    @token_source.token_class
  rescue NoMethodError
    @position == -1 and fill_buffer
    @tokens.empty? ? CommonToken : @tokens.first.class
  end
  
  alias index position
  
  def size
    @tokens.length
  end
  
  alias length size
  
  ###### State-Control ################################################
  
  #
  # rewind the stream to its initial state
  # 
  def reset
    @position = 0
    @position += 1 while token = @tokens[ @position ] and
                         token.channel != @channel
    @last_marker = nil
    return self
  end
  
  #
  # bookmark the current position of the input stream
  # 
  def mark
    @last_marker = @position
  end
  
  def release( marker = nil )
    # do nothing
  end
  
  
  def rewind( marker = @last_marker, release = true )
    seek( marker )
  end
  
  #
  # saves the current stream position, yields to the block,
  # and then ensures the stream's position is restored before
  # returning the value of the block
  #  
  def hold( pos = @position )
    block_given? or return enum_for( :hold, pos )
    begin
      yield
    ensure
      seek( pos )
    end
  end
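=begin EXAMPLE

The save-and-restore behavior described for #hold can be illustrated with a
small self-contained sketch. HoldCursor is a hypothetical class written for
this example only; it is not part of the ANTLR3 runtime and it ignores
channels entirely.

```ruby
# Hypothetical cursor over a plain array, used only to illustrate #hold.
class HoldCursor
  attr_reader :position

  def initialize(items)
    @items = items
    @position = 0
  end

  # Returns the current item and advances, stopping at the end of the array.
  def consume
    item = @items[@position]
    @position += 1 if @position < @items.length
    item
  end

  # Yields to the block and restores the saved position afterwards, even if
  # the block raises. The block's value is returned.
  def hold(pos = @position)
    yield
  ensure
    @position = pos
  end
end

cursor = HoldCursor.new(%w[a b c])
cursor.consume
last_seen = cursor.hold { cursor.consume; cursor.consume }
puts last_seen         # value of the block
puts cursor.position   # position as it was before the block ran
```

Because the restore happens in an +ensure+ clause, speculative look-ahead that
fails with an exception still leaves the cursor where it started.
=end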
  
  ###### Stream Navigation ###########################################
  
  #
  # advance the stream one step to the next on-channel token
  # 
  def consume
    token = @tokens[ @position ] || EOF_TOKEN
    if @position < @tokens.length
      @position = future?( 2 ) || @tokens.length
    end
    return( token )
  end
  
  #
  # jump to the stream position specified by +index+
  # note: seek does not check whether or not the
  #       token at the specified position is on-channel.
  #
  def seek( index )
    @position = index.to_i.bound( 0, @tokens.length )
    return self
  end
  
  #
  # return the type of the on-channel token at look-ahead distance +k+. <tt>k = 1</tt> represents
  # the current token. +k+ greater than 1 represents upcoming on-channel tokens. A negative
  # value of +k+ returns previous on-channel tokens consumed, where <tt>k = -1</tt> is the last
  # on-channel token consumed. <tt>k = 0</tt> is not a valid look-ahead distance and returns +nil+.
  # 
  def peek( k = 1 )
    tk = look( k ) and return( tk.type )
  end
  
  #
  # operates similarly to #peek, but returns the full token object at look-ahead position +k+
  #
  def look( k = 1 )
    index = future?( k ) or return nil
    @tokens.fetch( index, EOF_TOKEN )
  end
  
  alias >> look
  def << k
    self >> -k
  end
  
  #
  # returns the index of the on-channel token at look-ahead position +k+. If the
  # look-ahead runs past the last token, an index beyond the end of the buffer is
  # returned. Returns +nil+ when +k+ is 0 or when a negative +k+ reaches before the
  # first on-channel token.
  # 
  def future?( k = 1 )
    @position == -1 and fill_buffer
    
    case
    when k == 0 then nil
    when k < 0 then past?( -k )
    when k == 1 then @position
    else
      # since the stream only yields on-channel
      # tokens, the stream can't just go to the
      # next position, but rather must skip
      # over off-channel tokens
      ( k - 1 ).times.inject( @position ) do |cursor, |
        begin
          tk = @tokens.at( cursor += 1 ) or return( cursor )
          # ^- if tk is nil (i.e. cursor is outside array limits)
        end until tk.channel == @channel
        cursor
      end
    end
  end
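=begin EXAMPLE

The channel-skipping look-ahead performed by #future? for positive +k+ can be
sketched on its own. LookToken and lookahead_index are hypothetical names
written for this example, and channels are plain symbols; the loop is a
restatement of the idea, not the runtime's implementation.

```ruby
# Hypothetical stand-in for a token carrying only text and a channel.
LookToken = Struct.new(:text, :channel)

# Index of the k-th on-channel token counting from +position+, where k = 1 is
# the position itself. When the scan runs off the end of the buffer, the first
# out-of-range index is returned.
def lookahead_index(tokens, position, channel, k)
  return nil if k < 1
  cursor = position
  (k - 1).times do
    loop do
      cursor += 1
      tk = tokens[cursor] or return cursor
      break if tk.channel == channel
    end
  end
  cursor
end

# Assumed token sequence for the input "35 * 4".
stream = [
  LookToken.new('35', :default),
  LookToken.new(' ', :hidden),
  LookToken.new('*', :default),
  LookToken.new(' ', :hidden),
  LookToken.new('4', :default)
]

puts lookahead_index(stream, 0, :default, 2)   # index of "*", skipping whitespace
puts lookahead_index(stream, 0, :default, 4)   # past the end of the buffer
```

Returning an out-of-range index instead of +nil+ lets a caller such as #look
map "past the end" onto an end-of-file token with a single fetch.
=end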
  
  #
  # returns the index of the on-channel token at look-behind position +k+ or nil if no other
  # on-channel tokens exist before the current token
  # 
  def past?( k = 1 )
    @position == -1 and fill_buffer
    
    case
    when k == 0 then nil
    when @position - k < 0 then nil
    else
      
      k.times.inject( @position ) do |cursor, |
        begin
          cursor <= 0 and return( nil )
          tk = @tokens.at( cursor -= 1 ) or return( nil )
        end until tk.channel == @channel
        cursor
      end
      
    end
  end
  
  #
  # yields each token in the stream (including off-channel tokens)
  # If no block is provided, the method returns an Enumerator object.
  # #each accepts the same arguments as #tokens
  # 
  def each( *args )
    block_given? or return enum_for( :each, *args )
    tokens( *args ).each { |token| yield( token ) }
  end
  
  
  #
  # yields each token in the stream with the given channel value
  # If no channel value is given, the stream's tuned channel value will be used.
  # If no block is given, an enumerator will be returned. 
  # 
  def each_on_channel( channel = @channel )
    block_given? or return enum_for( :each_on_channel, channel )
    for token in @tokens
      token.channel == channel and yield( token )
    end
  end
  
  #
  # iterates through the token stream, yielding each on-channel token along the way.
  # After iteration has completed, the stream's position will be restored to where
  # it was before #walk was called. While #each and #each_on_channel do not change
  # the stream's position during iteration, #walk advances through the stream. This
  # makes it possible to look ahead and behind the current token during iteration.
  # If no block is given, an enumerator will be returned. 
  # 
  def walk
    block_given? or return enum_for( :walk )
    initial_position = @position
    begin
      while token = look and token.type != EOF
        consume
        yield( token )
      end
      return self
    ensure
      @position = initial_position
    end
  end
  
  # 
  # returns a copy of the token buffer. If +start+ and +stop+ are provided, tokens
  # returns a slice of the token buffer from <tt>start..stop</tt>. The parameters
  # are converted to integers with their <tt>to_i</tt> methods, and thus tokens
  # can be provided to specify start and stop. If a block is provided, tokens are
  # yielded and filtered out of the return array if the block returns a +false+
  # or +nil+ value. 
  # 
  def tokens( start = nil, stop = nil )
    stop.nil?  || stop >= @tokens.length and stop = @tokens.length - 1
    start.nil? || start < 0 and start = 0
    tokens = @tokens[ start..stop ]
    
    if block_given?
      tokens.delete_if { |t| not yield( t ) }
    end
    
    return( tokens )
  end
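=begin EXAMPLE

The slicing and filtering behavior documented for #tokens can be sketched
against a plain array. slice_tokens is a hypothetical helper written for this
example, and the "tokens" are bare strings; it assumes integer bounds and does
not attempt the +to_i+ conversion mentioned in the documentation.

```ruby
# Returns a copy of +buffer+ limited to start..stop, clamping missing or
# out-of-range bounds, and keeps only items for which the block is truthy.
def slice_tokens(buffer, start = nil, stop = nil)
  stop = buffer.length - 1 if stop.nil? || stop >= buffer.length
  start = 0 if start.nil? || start < 0
  slice = buffer[start..stop] || []
  slice = slice.select { |t| yield(t) } if block_given?
  slice
end

buffer = ['35', ' ', '*', ' ', '4']

puts slice_tokens(buffer, 0, 2).inspect               # off-channel items are kept
puts slice_tokens(buffer) { |t| t != ' ' }.inspect    # block filters the copy
puts buffer.length                                    # original buffer is untouched
```

Working on a copy means callers can filter or reorder the result without
disturbing the stream's own buffer.
=end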
  
  
  def at( i )
    @tokens.at i
  end
  
  #
  # identical to Array#[], as applied to the stream's token buffer
  # 
  def []( i, *args )
    @tokens[ i, *args ]
  end
  
  ###### Standard Conversion Methods ###############################
  def inspect
    string = "#<%p: @token_source=%p @ %p/%p" %
      [ self.class, @token_source.class, @position, @tokens.length ]
    tk = look( -1 ) and string << " #{ tk.inspect } <--"
    tk = look( 1 ) and string << " --> #{ tk.inspect }"
    string << '>'
  end
  
  #
  # fetches the text content of all tokens between +start+ and +stop+ and
  # joins the chunks into a single string
  # 
  def extract_text( start = 0, stop = @tokens.length - 1 )
    start = start.to_i.at_least( 0 )
    stop = stop.to_i.at_most( @tokens.length )
    @tokens[ start..stop ].map! { |t| t.text }.join( '' )
  end
  
  alias to_s extract_text
  
end

end