/*
**********************************************************************
*   Copyright (C) 2001-2011 IBM and others. All rights reserved.
**********************************************************************
*   Date        Name        Description
*   03/22/2000  helena      Creation.
**********************************************************************
*/

#ifndef SEARCH_H
#define SEARCH_H

#include "unicode/utypes.h"

/**
 * \file
 * \brief C++ API: SearchIterator object.
 */

#if !UCONFIG_NO_COLLATION && !UCONFIG_NO_BREAK_ITERATION

#include "unicode/uobject.h"
#include "unicode/unistr.h"
#include "unicode/chariter.h"
#include "unicode/brkiter.h"
#include "unicode/usearch.h"

/**
* @stable ICU 2.0
*/
struct USearch;
/**
* @stable ICU 2.0
*/
typedef struct USearch USearch;

U_NAMESPACE_BEGIN

/**
 *
 * <tt>SearchIterator</tt> is an abstract base class that provides
 * methods to search for a pattern within a text string.
 * Instances of <tt>SearchIterator</tt> maintain a current position and
 * scan over the target text, returning the indices at which the pattern
 * is matched and the length of each match.
 * <p>
 * <tt>SearchIterator</tt> defines a protocol for text searching.
 * Subclasses provide concrete implementations of various search algorithms.
 * For example, <tt>StringSearch</tt> implements language-sensitive pattern
 * matching based on the comparison rules defined in a
 * <tt>RuleBasedCollator</tt> object.
 * <p>
 * Other options for searching include using a <tt>BreakIterator</tt> to
 * restrict the points at which matches are detected.
 * <p>
 * <tt>SearchIterator</tt> provides an API that is similar to that of
 * other text iteration classes such as <tt>BreakIterator</tt>. Using
 * this class, it is easy to scan through text looking for all occurrences
 * of a given pattern.
 * The following example uses a <tt>StringSearch</tt>
 * object to find all instances of "fox" in the target string. Any other
 * subclass of <tt>SearchIterator</tt> can be used in an identical
 * manner.
 * <pre><code>
 * UnicodeString target("The quick brown fox jumped over the lazy fox");
 * UnicodeString pattern("fox");
 *
 * UErrorCode error = U_ZERO_ERROR;
 * SearchIterator *iter = new StringSearch(pattern, target, Locale::getUS(),
 *                                         NULL, error);
 * // Visit every match; the loop ends when USEARCH_DONE is returned.
 * for (int32_t pos = iter->first(error); pos != USEARCH_DONE;
 *                                    pos = iter->next(error)) {
 *     printf("Found match at %d pos, length is %d\n", pos,
 *                                             iter->getMatchedLength());
 * }
 * delete iter;
 * </code></pre>
 *
 * @see StringSearch
 * @see RuleBasedCollator
 */
class U_I18N_API SearchIterator : public UObject {

public:

    // public constructors and destructors -------------------------------

    /**
     * Copy constructor that creates a SearchIterator instance with the same
     * behavior, and iterating over the same text.
     * @param other the SearchIterator instance to be copied.
     * @stable ICU 2.0
     */
    SearchIterator(const SearchIterator &other);

    /**
     * Destructor. Cleans up the search iterator data struct.
     * @stable ICU 2.0
     */
    virtual ~SearchIterator();

    // public get and set methods ----------------------------------------

    /**
     * Sets the index to point to the given position, and clears any state
     * that's affected.
     * <p>
     * This method takes the argument index and sets the position in the text
     * string accordingly without checking if the index is pointing to a
     * valid starting point to begin searching.
     * @param position within the text to be set.
     *                 If position is outside the text range for searching,
     *                 status is set to U_INDEX_OUTOFBOUNDS_ERROR.
     * @param status for errors, if any occur
     * @stable ICU 2.0
     */
    virtual void setOffset(int32_t position, UErrorCode &status) = 0;

    /**
     * Return the current index in the text being searched.
     * If the iteration has gone past the end of the text
     * (or past the beginning for a backwards search), USEARCH_DONE
     * is returned.
     * @return current index in the text being searched.
     * @stable ICU 2.0
     */
    virtual int32_t getOffset(void) const = 0;

    /**
     * Sets the text searching attributes located in the enum
     * USearchAttribute with values from the enum USearchAttributeValue.
     * USEARCH_DEFAULT can be used with any attribute to reset it.
     * @param attribute text attribute (enum USearchAttribute) to be set
     * @param value text attribute value
     * @param status for errors, if any occur
     * @stable ICU 2.0
     */
    void setAttribute(USearchAttribute       attribute,
                      USearchAttributeValue  value,
                      UErrorCode            &status);

    /**
     * Gets the value of a text searching attribute.
     * @param attribute text attribute (enum USearchAttribute) to be retrieved
     * @return text attribute value
     * @stable ICU 2.0
     */
    USearchAttributeValue getAttribute(USearchAttribute attribute) const;

    /**
     * Returns the index to the match in the text string that was searched.
     * This call returns a valid result only after a successful call to
     * <tt>first</tt>, <tt>next</tt>, <tt>previous</tt>, or <tt>last</tt>.
     * Just after construction, or after a searching method returns
     * <tt>USEARCH_DONE</tt>, this method will return <tt>USEARCH_DONE</tt>.
     * <p>
     * Use <tt>getMatchedLength</tt> to get the matched string length.
     * @return index of a substring within the text string that is being
     *         searched.
     * @see #first
     * @see #next
     * @see #previous
     * @see #last
     * @stable ICU 2.0
     */
    int32_t getMatchedStart(void) const;

    /**
     * Returns the length of text in the string which matches the search
     * pattern. This call returns a valid result only after a successful call
     * to <tt>first</tt>, <tt>next</tt>, <tt>previous</tt>, or <tt>last</tt>.
     * Just after construction, or after a searching method returns
     * <tt>USEARCH_DONE</tt>, this method will return 0.
     * @return The length of the match in the target text, or 0 if there
     *         is no match currently.
     * @see #first
     * @see #next
     * @see #previous
     * @see #last
     * @stable ICU 2.0
     */
    int32_t getMatchedLength(void) const;

    /**
     * Returns the text that was matched by the most recent call to
     * <tt>first</tt>, <tt>next</tt>, <tt>previous</tt>, or <tt>last</tt>.
     * If the iterator is not pointing at a valid match (e.g. just after
     * construction or after <tt>USEARCH_DONE</tt> has been returned),
     * returns an empty string.
     * @param result stores the matched string or an empty string if a match
     *        is not found.
     * @see #first
     * @see #next
     * @see #previous
     * @see #last
     * @stable ICU 2.0
     */
    void getMatchedText(UnicodeString &result) const;

    /**
     * Set the BreakIterator that will be used to restrict the points
     * at which matches are detected. The user is responsible for deleting
     * the BreakIterator.
     * @param breakiter A BreakIterator that will be used to restrict the
     *                  points at which matches are detected. If a match is
     *                  found, but the match's start or end index is not a
     *                  boundary as determined by the <tt>BreakIterator</tt>,
     *                  the match will be rejected and another will be searched
     *                  for. If this parameter is <tt>NULL</tt>, no break
     *                  detection is attempted.
     * @param status for errors, if any occur
     * @see BreakIterator
     * @stable ICU 2.0
     */
    void setBreakIterator(BreakIterator *breakiter, UErrorCode &status);

    /**
     * Returns the BreakIterator that is used to restrict the points at
     * which matches are detected. This will be the same object that was
     * passed to the constructor or to <tt>setBreakIterator</tt>.
     * Note that <tt>NULL</tt> is a legal value; it means that break
     * detection should not be attempted.
     * @return BreakIterator used to restrict matches.
     * @see #setBreakIterator
     * @stable ICU 2.0
     */
    const BreakIterator * getBreakIterator(void) const;

    /**
     * Set the string text to be searched.
     * Text iteration will hence begin at the start of the text string.
     * This method is useful if you want to re-use an iterator to search
     * for the same pattern within a different body of text. The user is
     * responsible for deleting the text.
     * @param text string to be searched.
     * @param status for errors. If the text length is 0,
     *        a U_ILLEGAL_ARGUMENT_ERROR is returned.
     * @stable ICU 2.0
     */
    virtual void setText(const UnicodeString &text, UErrorCode &status);

    /**
     * Set the string text to be searched. Text iteration will hence begin at
     * the start of the text string. This method is useful if you want to
     * re-use an iterator to search for the same pattern within a different
     * body of text.
     * <p>
     * Note: No parsing of the text within the <tt>CharacterIterator</tt>
     * will be done during searching for this version.
     * The block of text in <tt>CharacterIterator</tt> will be used as it is.
     * The user is responsible for deleting the text.
     * @param text string iterator to be searched.
     * @param status for errors if any. If the text length is 0 then a
     *        U_ILLEGAL_ARGUMENT_ERROR is returned.
     * @stable ICU 2.0
     */
    virtual void setText(CharacterIterator &text, UErrorCode &status);

    /**
     * Return the string text to be searched.
     * @return text string to be searched.
     * @stable ICU 2.0
     */
    const UnicodeString & getText(void) const;

    // operator overloading ----------------------------------------------

    /**
     * Equality operator.
     * @param that SearchIterator instance to be compared.
     * @return TRUE if both SearchIterators are of the same class, have the
     *         same behavior, iterate over the same text and have the same
     *         attributes. FALSE otherwise.
     * @stable ICU 2.0
     */
    virtual UBool operator==(const SearchIterator &that) const;

    /**
     * Not-equal operator.
     * @param that SearchIterator instance to be compared.
     * @return FALSE if operator== returns TRUE, and vice versa.
     * @stable ICU 2.0
     */
    UBool operator!=(const SearchIterator &that) const;

    // public methods ----------------------------------------------------

    /**
     * Returns a copy of SearchIterator with the same behavior, and
     * iterating over the same text, as this one.
     * Note that all data will be replicated, except for the text string
     * to be searched.
     * @return cloned object
     * @stable ICU 2.0
     */
    virtual SearchIterator* safeClone(void) const = 0;

    /**
     * Returns the first index at which the string text matches the search
     * pattern. The iterator is adjusted so that its current index (as
     * returned by <tt>getOffset</tt>) is the match position if one
     * was found.
     * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and
     * the iterator will be adjusted to the index <tt>USEARCH_DONE</tt>.
     * @param status for errors, if any occur
     * @return The character index of the first match, or
     *         <tt>USEARCH_DONE</tt> if there are no matches.
     * @see #getOffset
     * @stable ICU 2.0
     */
    int32_t first(UErrorCode &status);

    /**
     * Returns the first index equal to or greater than <tt>position</tt> at
     * which the string text matches the search pattern. The iterator is
     * adjusted so that its current index (as returned by <tt>getOffset</tt>)
     * is the match position if one was found.
     * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and the
     * iterator will be adjusted to the index <tt>USEARCH_DONE</tt>.
     * @param position where search is to start from. If position is outside
     *        the text range for searching, status is set to
     *        U_INDEX_OUTOFBOUNDS_ERROR.
     * @param status for errors, if any occur
     * @return The character index of the first match at or after
     *         <tt>position</tt>, or <tt>USEARCH_DONE</tt> if there are no
     *         matches.
     * @see #getOffset
     * @stable ICU 2.0
     */
    int32_t following(int32_t position, UErrorCode &status);

    /**
     * Returns the last index in the target text at which it matches the
     * search pattern. The iterator is adjusted so that its current index
     * (as returned by <tt>getOffset</tt>) is the match position if one was
     * found.
     * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and
     * the iterator will be adjusted to the index <tt>USEARCH_DONE</tt>.
     * @param status for errors, if any occur
     * @return The index of the last match, or <tt>USEARCH_DONE</tt> if
     *         there are no matches.
     * @see #getOffset
     * @stable ICU 2.0
     */
    int32_t last(UErrorCode &status);

    /**
     * Returns the first index less than <tt>position</tt> at which the string
     * text matches the search pattern. The iterator is adjusted so that its
     * current index (as returned by <tt>getOffset</tt>) is the match
     * position if one was found. If a match is not found,
     * <tt>USEARCH_DONE</tt> will be returned and the iterator will be
     * adjusted to the index <tt>USEARCH_DONE</tt>.
     * <p>
     * When the <tt>USEARCH_OVERLAP</tt> option is off, the last index of the
     * result match is always less than <tt>position</tt>.
     * When <tt>USEARCH_OVERLAP</tt> is on, the result match may span across
     * <tt>position</tt>.
     *
     * @param position where search is to start from. If position is outside
     *                 the text range for searching,
     *                 U_INDEX_OUTOFBOUNDS_ERROR is reported in
     *                 <tt>status</tt>.
     * @param status receives the error code if an error occurs
     * @return The character index of the first match preceding
     *         <tt>position</tt>, or <tt>USEARCH_DONE</tt> if there are
     *         no matches.
     * @see #getOffset
     * @stable ICU 2.0
     */
    int32_t preceding(int32_t position, UErrorCode &status);

    /**
     * Returns the index of the next point at which the text matches the
     * search pattern, starting from the current position.
     * The iterator is adjusted so that its current index (as returned by
     * <tt>getOffset</tt>) is the match position if one was found.
     * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and
     * the iterator will be adjusted to a position after the end of the text
     * string.
     * @param status receives the error code if an error occurs
     * @return The index of the next match after the current position,
     *         or <tt>USEARCH_DONE</tt> if there are no more matches.
     * @see #getOffset
     * @stable ICU 2.0
     */
    int32_t next(UErrorCode &status);

    /**
     * Returns the index of the previous point at which the string text
     * matches the search pattern, starting at the current position.
     * The iterator is adjusted so that its current index (as returned by
     * <tt>getOffset</tt>) is the match position if one was found.
     * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and
     * the iterator will be adjusted to the index <tt>USEARCH_DONE</tt>.
     * @param status receives the error code if an error occurs
     * @return The index of the previous match before the current position,
     *         or <tt>USEARCH_DONE</tt> if there are no more matches.
     * @see #getOffset
     * @stable ICU 2.0
     */
    int32_t previous(UErrorCode &status);

    /**
     * Resets the iteration.
     * Search will begin at the start of the text string if a forward
     * iteration is initiated before a backwards iteration. Otherwise if a
     * backwards iteration is initiated before a forwards iteration, the
     * search will begin at the end of the text string.
     * @stable ICU 2.0
     */
    virtual void reset();

protected:
    // protected data members ---------------------------------------------

    /**
     * C search data struct
     * @stable ICU 2.0
     */
    USearch *m_search_;

    /**
     * Break iterator.
     * Currently the C++ breakiterator does not have getRules etc to reproduce
     * another in C. Hence we keep the original around and do the verification
     * at the end of the match. The user is responsible for deleting this
     * break iterator.
     * @stable ICU 2.0
     */
    BreakIterator *m_breakiterator_;

    /**
     * Unicode string version of the search text
     * @stable ICU 2.0
     */
    UnicodeString m_text_;

    // protected constructors and destructors -----------------------------

    /**
     * Default constructor.
     * Initializes data to the default values.
     * @stable ICU 2.0
     */
    SearchIterator();

    /**
     * Constructor for use by subclasses.
     * @param text The target text to be searched.
     * @param breakiter A {@link BreakIterator} that is used to restrict the
     *                points at which matches are detected. If
     *                <tt>handleNext</tt> or <tt>handlePrev</tt> finds a
     *                match, but the match's start or end index is not a
     *                boundary as determined by the <tt>BreakIterator</tt>,
     *                the match is rejected and <tt>handleNext</tt> or
     *                <tt>handlePrev</tt> is called again. If this parameter
     *                is <tt>NULL</tt>, no break detection is attempted.
     * @see #handleNext
     * @see #handlePrev
     * @stable ICU 2.0
     */
    SearchIterator(const UnicodeString &text,
                   BreakIterator *breakiter = NULL);

    /**
     * Constructor for use by subclasses.
     * <p>
     * Note: No parsing of the text within the <tt>CharacterIterator</tt>
     * will be done during searching for this version. The block of text
     * in <tt>CharacterIterator</tt> will be used as it is.
     * @param text The target text to be searched.
     * @param breakiter A {@link BreakIterator} that is used to restrict the
     *                points at which matches are detected. If
     *                <tt>handleNext</tt> or <tt>handlePrev</tt> finds a
     *                match, but the match's start or end index is not a
     *                boundary as determined by the <tt>BreakIterator</tt>,
     *                the match is rejected and <tt>handleNext</tt> or
     *                <tt>handlePrev</tt> is called again. If this parameter
     *                is <tt>NULL</tt>, no break detection is attempted.
     * @see #handleNext
     * @see #handlePrev
     * @stable ICU 2.0
     */
    SearchIterator(CharacterIterator &text, BreakIterator *breakiter = NULL);

    // protected methods --------------------------------------------------

    /**
     * Assignment operator. Sets this iterator to have the same behavior,
     * and iterate over the same text, as the one passed in.
     * @param that instance to be copied.
     * @stable ICU 2.0
     */
    SearchIterator & operator=(const SearchIterator &that);

    /**
     * Abstract method which subclasses override to provide the mechanism
     * for finding the next match in the target text. This allows different
     * subclasses to provide different search algorithms.
     * <p>
     * If a match is found, the implementation should return the index at
     * which the match starts and should call
     * <tt>setMatchLength</tt> with the number of characters
     * in the target text that make up the match. If no match is found, the
     * method should return <tt>USEARCH_DONE</tt>.
     * <p>
     * @param position The index in the target text at which the search
     *                 should start.
     * @param status receives the error code if an error occurs.
     * @return index at which the match starts, or <tt>USEARCH_DONE</tt> if
     *         no match is found
     * @see #setMatchLength
     * @stable ICU 2.0
     */
    virtual int32_t handleNext(int32_t position, UErrorCode &status) = 0;

    /**
     * Abstract method which subclasses override to provide the mechanism for
     * finding the previous match in the target text. This allows different
     * subclasses to provide different search algorithms.
     * <p>
     * If a match is found, the implementation should return the index at
     * which the match starts and should call
     * <tt>setMatchLength</tt> with the number of characters
     * in the target text that make up the match. If no match is found, the
     * method should return <tt>USEARCH_DONE</tt>.
     * <p>
     * @param position The index in the target text at which the search
     *                 should start.
     * @param status receives the error code if an error occurs.
     * @return index at which the match starts, or <tt>USEARCH_DONE</tt> if
     *         no match is found
     * @see #setMatchLength
     * @stable ICU 2.0
     */
    virtual int32_t handlePrev(int32_t position, UErrorCode &status) = 0;

    /**
     * Sets the length of the currently matched string in the text string to
     * be searched.
     * Subclasses' <tt>handleNext</tt> and <tt>handlePrev</tt>
     * methods should call this when they find a match in the target text.
     * @param length length of the matched text.
     * @see #handleNext
     * @see #handlePrev
     * @stable ICU 2.0
     */
    virtual void setMatchLength(int32_t length);

    /**
     * Sets the offset of the currently matched string in the text string to
     * be searched.
     * Subclasses' <tt>handleNext</tt> and <tt>handlePrev</tt>
     * methods should call this when they find a match in the target text.
     * @param position start offset of the matched text.
     * @see #handleNext
     * @see #handlePrev
     * @stable ICU 2.0
     */
    virtual void setMatchStart(int32_t position);

    /**
     * Sets the match state to "not found".
     * @stable ICU 2.0
     */
    void setMatchNotFound();
};

inline UBool SearchIterator::operator!=(const SearchIterator &that) const
{
   return !operator==(that);
}
U_NAMESPACE_END

#endif /* #if !UCONFIG_NO_COLLATION && !UCONFIG_NO_BREAK_ITERATION */

#endif
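The subclass contract documented above (override `handleNext`, return the match start or the "done" sentinel, and call `setMatchLength` on success) can be illustrated with a small standalone model. This is only a sketch and does not use ICU: `ToySearchIterator`, `LiteralSearch`, `allMatches`, `kDone`, `matchedStart` and `matchedLength` are hypothetical names invented here, `std::string` stands in for `UnicodeString`, the `UErrorCode` parameter is omitted, and an out-of-range position simply yields the sentinel instead of reporting `U_INDEX_OUTOFBOUNDS_ERROR`. Resuming after the end of the previous match is also an assumption of the sketch (it corresponds to non-overlapping iteration).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using std::int32_t;

// Hypothetical stand-in for USEARCH_DONE (the real constant is declared in
// unicode/usearch.h).
constexpr int32_t kDone = -1;

// Toy model of the SearchIterator template-method contract. Not ICU code.
class ToySearchIterator {
public:
    explicit ToySearchIterator(std::string text) : text_(std::move(text)) {}
    virtual ~ToySearchIterator() = default;

    // First match in the whole text.
    int32_t first() { return following(0); }

    // First match starting at or after `position`.
    int32_t following(int32_t position) {
        if (position < 0 || position > textLength()) {
            return fail();  // assumption: the real API reports an error via status
        }
        matchLength_ = 0;
        const int32_t start = handleNext(position);
        if (start == kDone) {
            return fail();
        }
        matchStart_ = start;
        // Resume after the match; advance at least one unit so that
        // zero-length matches cannot stall the iteration.
        resume_ = start + std::max<int32_t>(matchLength_, 1);
        return start;
    }

    // Next match after the previous one.
    int32_t next() { return following(resume_); }

    int32_t matchedStart() const { return matchStart_; }
    int32_t matchedLength() const { return matchLength_; }

protected:
    // Subclasses implement the actual search algorithm here.
    virtual int32_t handleNext(int32_t position) = 0;
    void setMatchLength(int32_t length) { matchLength_ = length; }
    const std::string &text() const { return text_; }

private:
    int32_t textLength() const { return static_cast<int32_t>(text_.size()); }

    // Records "no match" and parks the resume point at the end of the text.
    int32_t fail() {
        matchStart_ = kDone;
        matchLength_ = 0;
        resume_ = textLength();
        return kDone;
    }

    std::string text_;
    int32_t matchStart_ = kDone;
    int32_t matchLength_ = 0;
    int32_t resume_ = 0;
};

// Plain substring search as one possible handleNext implementation.
class LiteralSearch : public ToySearchIterator {
public:
    LiteralSearch(std::string pattern, std::string text)
        : ToySearchIterator(std::move(text)), pattern_(std::move(pattern)) {}

protected:
    int32_t handleNext(int32_t position) override {
        const std::size_t hit =
            text().find(pattern_, static_cast<std::size_t>(position));
        if (hit == std::string::npos) {
            return kDone;
        }
        setMatchLength(static_cast<int32_t>(pattern_.size()));
        return static_cast<int32_t>(hit);
    }

private:
    std::string pattern_;
};

// Collects every match start using the first()/next() loop idiom.
std::vector<int32_t> allMatches(ToySearchIterator &it) {
    std::vector<int32_t> hits;
    for (int32_t pos = it.first(); pos != kDone; pos = it.next()) {
        hits.push_back(pos);
    }
    return hits;
}
```

With the pattern `"ab"` and the text `"ab_ab__ab"`, `allMatches` yields the offsets 0, 3 and 7. Keeping the algorithm behind a single virtual hook is what lets the public iteration methods stay identical across subclasses.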