/*
**********************************************************************
*   Copyright (C) 2001-2008 IBM and others. All rights reserved.
**********************************************************************
*   Date        Name        Description
*  03/22/2000   helena      Creation.
**********************************************************************
*/

#ifndef STSEARCH_H
#define STSEARCH_H

#include "unicode/utypes.h"

/**
 * \file
 * \brief C++ API: Service for searching text based on RuleBasedCollator.
 */

#if !UCONFIG_NO_COLLATION && !UCONFIG_NO_BREAK_ITERATION

#include "unicode/tblcoll.h"
#include "unicode/coleitr.h"
#include "unicode/search.h"

U_NAMESPACE_BEGIN

/**
 *
 * <tt>StringSearch</tt> is a <tt>SearchIterator</tt> that provides
 * language-sensitive text searching based on the comparison rules defined
 * in a {@link RuleBasedCollator} object.
 * StringSearch takes language-specific behavior into account, e.g. with
 * the German collator, the characters ß and SS will match if case is
 * chosen to be ignored.
 * See the <a href="http://source.icu-project.org/repos/icu/icuhtml/trunk/design/collation/ICU_collation_design.htm">
 * "ICU Collation Design Document"</a> for more information.
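 * <p>
 * As a rough, ICU-free illustration of the idea above (a sketch only, not
 * how StringSearch is implemented), matching can be done on folded
 * comparison keys instead of raw characters. The names <tt>Key</tt>,
 * <tt>Match</tt>, <tt>foldKeys</tt> and <tt>findFolded</tt>, and the tiny
 * folding table (ASCII case folding plus U+00DF expanding to "ss"), are
 * hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One folded key plus the index of the source character it came from.
struct Key { char32_t value; std::size_t source; };

// Result of a search: start == -1 means "no match".
struct Match { long start; long length; };

// Toy stand-in for collation elements (an assumption for illustration):
// ASCII upper case folds to lower case, and U+00DF expands to two 's' keys.
static std::vector<Key> foldKeys(const std::u32string &s) {
    const char32_t kSharpS = 0x00DF;
    std::vector<Key> keys;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t c = s[i];
        if (c == kSharpS) {
            keys.push_back({U's', i});
            keys.push_back({U's', i});
        } else if (c >= U'A' && c <= U'Z') {
            keys.push_back({static_cast<char32_t>(c - U'A' + U'a'), i});
        } else {
            keys.push_back({c, i});
        }
    }
    return keys;
}

// Find the first place where the folded pattern equals a run of folded text
// keys, and report it in source-character offsets.
static Match findFolded(const std::u32string &text,
                        const std::u32string &pattern) {
    std::vector<Key> t = foldKeys(text);
    std::vector<Key> p = foldKeys(pattern);
    if (p.empty() || p.size() > t.size()) return {-1, 0};
    for (std::size_t i = 0; i + p.size() <= t.size(); ++i) {
        std::size_t end = i + p.size();
        // A match must not start or end in the middle of an expansion.
        if (i > 0 && t[i - 1].source == t[i].source) continue;
        if (end < t.size() && t[end - 1].source == t[end].source) continue;
        bool same = true;
        for (std::size_t j = 0; j < p.size() && same; ++j) {
            same = (t[i + j].value == p[j].value);
        }
        if (same) {
            std::size_t first = t[i].source;
            std::size_t last = t[end - 1].source;
            return {static_cast<long>(first),
                    static_cast<long>(last - first + 1)};
        }
    }
    return {-1, 0};
}
```

 * With this toy table, searching for "SS" in "Stra\u00DFe" reports a match
 * of length 1 at offset 4, while "as" is not found because it would end in
 * the middle of the expansion.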
 * <p>
 * The algorithm implemented is a modified form of the Boyer-Moore search.
 * For further information on the algorithm, see
 * <a href="http://icu-project.org/docs/papers/efficient_text_searching_in_java.html">
 * "Efficient Text Searching in Java"</a>, published in <i>Java Report</i>
 * in February, 1999.
 * <p>
 * There are 2 match options for selection:<br>
 * Let S' be the sub-string of a text string S between the offsets start and
 * end <start, end>.
 * <br>
 * A pattern string P matches a text string S at the offsets <start, end>
 * if
 * <pre>
 * option 1. Some canonical equivalent of P matches some canonical equivalent
 *           of S'
 * option 2. P matches S' and if P starts or ends with a combining mark,
 *           there exists no non-ignorable combining mark before or after S'
 *           in S respectively.
 * </pre>
 * Option 2 is the default.
 * <p>
 * This search has APIs similar to those of other text iteration mechanisms
 * such as the break iterators in <tt>BreakIterator</tt>. Using these
 * APIs, it is easy to scan through text looking for all occurrences of
 * a given pattern. This search iterator allows changing of direction by
 * calling <tt>reset</tt> followed by <tt>next</tt> or <tt>previous</tt>.
 * Though a direction change can occur without calling <tt>reset</tt> first,
 * this operation comes with some speed penalty.
 * Matches found in the forward direction are the same as the matches found
 * in the backward direction, in reverse order.
 * <p>
 * <tt>SearchIterator</tt> provides APIs to specify the starting position
 * within the text string to be searched, e.g. <tt>setOffset</tt>,
 * <tt>preceding</tt> and <tt>following</tt>.
 * Since the starting position is used exactly as specified, please note
 * that there are some danger points at which the search may return
 * incorrect results:
 * <ul>
 * <li> The midst of a substring that requires normalization.
 * <li> If the following match is to be found, the position should not be the
 *      second character which requires to be swapped with the preceding
 *      character. Vice versa, if the preceding match is to be found, the
 *      position to search from should not be the first character which
 *      requires to be swapped with the next character. E.g. certain Thai and
 *      Lao characters require swapping.
 * <li> If a following pattern match is to be found, any position within a
 *      contracting sequence except the first will fail. Vice versa, if a
 *      preceding pattern match is to be found, an invalid starting point
 *      would be any character within a contracting sequence except the last.
 * </ul>
 * <p>
 * A breakiterator can be used if only matches at logical breaks are desired.
 * Using a breakiterator will only give you results that exactly match the
 * boundaries given by the breakiterator. For instance, the pattern "e" will
 * not be found in the string "\u00e9" if a character break iterator is used.
 * <p>
 * Options are provided to handle overlapping matches.
 * E.g. in English, overlapping matches produce the results 0 and 2
 * for the pattern "abab" in the text "ababab", whereas mutually
 * exclusive matches only produce the result 0.
 * <p>
 * Though collator attributes will be taken into consideration while
 * performing matches, there are no APIs here for setting and getting the
 * attributes. These attributes can be set by getting the collator
 * from <tt>getCollator</tt> and using the APIs in <tt>coll.h</tt>.
 * Lastly, to update StringSearch to the new collator attributes,
 * reset() has to be called.
 * <p>
 * Restriction: <br>
 * Currently there are no composite characters that consist of a
 * character with combining class > 0 before a character with combining
 * class == 0. However, if such a character exists in the future,
 * StringSearch does not guarantee the results for option 1.
 * <p>
 * Consult the <tt>SearchIterator</tt> documentation for information on
 * and examples of how to use instances of this class to implement text
 * searching.
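 * <p>
 * The difference between overlapping and mutually exclusive matches
 * described earlier can be sketched without ICU, using plain
 * <tt>std::string</tt> searching. This is only an illustration of the two
 * result sets; the helper name <tt>findAll</tt> and its <tt>overlap</tt>
 * flag are hypothetical and are not part of this class.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect the offsets of every match of `pattern` in `text`.
// With overlap == true the scan resumes one position after each match start;
// otherwise it resumes after the match end, so matches never share text.
static std::vector<std::size_t> findAll(const std::string &text,
                                        const std::string &pattern,
                                        bool overlap) {
    std::vector<std::size_t> offsets;
    if (pattern.empty()) {
        return offsets;  // an empty pattern is treated as "no matches" here
    }
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos) {
        offsets.push_back(pos);
        pos = text.find(pattern, pos + (overlap ? 1 : pattern.size()));
    }
    return offsets;
}
```

 * For the pattern "abab" in the text "ababab", the overlapping scan yields
 * the offsets 0 and 2, and the mutually exclusive scan yields only 0.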
 * <pre><code>
 * UnicodeString target("The quick brown fox jumps over the lazy dog.");
 * UnicodeString pattern("fox");
 *
 * UErrorCode      error = U_ZERO_ERROR;
 * // Pass the same UErrorCode that the loop below uses.
 * StringSearch iter(pattern, target, Locale::getUS(), NULL, error);
 * for (int32_t pos = iter.first(error);
 *      pos != USEARCH_DONE;
 *      pos = iter.next(error))
 * {
 *     printf("Found match at %d pos, length is %d\n", pos,
 *                                             iter.getMatchLength());
 * }
 * </code></pre>
 * <p>
 * Note, StringSearch is not to be subclassed.
 * </p>
 * @see SearchIterator
 * @see RuleBasedCollator
 * @since ICU 2.0
 */

class U_I18N_API StringSearch : public SearchIterator
{
public:

    // public constructors and destructors --------------------------------

    /**
     * Creates a <tt>StringSearch</tt> instance using the language rules of
     * the argument locale. A collator will be created in the process, which
     * will be owned by this instance and will be deleted during
     * destruction.
     * @param pattern The text for which this object will search.
     * @param text    The text in which to search for the pattern.
     * @param locale  A locale which defines the language-sensitive
     *                comparison rules used to determine whether text in the
     *                pattern and target match.
     * @param breakiter A <tt>BreakIterator</tt> object used to constrain
     *                the matches that are found. Matches whose start and end
     *                indices in the target text are not boundaries as
     *                determined by the <tt>BreakIterator</tt> are
     *                ignored. If this behavior is not desired,
     *                <tt>NULL</tt> can be passed in instead.
     * @param status  for errors if any. If pattern or text is NULL, or if
     *                either the length of pattern or text is 0 then a
     *                U_ILLEGAL_ARGUMENT_ERROR is returned.
     * @stable ICU 2.0
     */
    StringSearch(const UnicodeString &pattern, const UnicodeString &text,
                 const Locale        &locale,
                       BreakIterator *breakiter,
                       UErrorCode    &status);

    /**
     * Creates a <tt>StringSearch</tt> instance using the language rules of
     * the argument collator. Note, the user retains ownership of this
     * collator; it is not destroyed during this instance's destruction.
     * @param pattern The text for which this object will search.
     * @param text    The text in which to search for the pattern.
     * @param coll    A <tt>RuleBasedCollator</tt> object which defines
     *                the language-sensitive comparison rules used to
     *                determine whether text in the pattern and target
     *                match. User is responsible for the clearing of this
     *                object.
     * @param breakiter A <tt>BreakIterator</tt> object used to constrain
     *                the matches that are found. Matches whose start and end
     *                indices in the target text are not boundaries as
     *                determined by the <tt>BreakIterator</tt> are
     *                ignored. If this behavior is not desired,
     *                <tt>NULL</tt> can be passed in instead.
     * @param status  for errors if any. If either the length of pattern or
     *                text is 0 then a U_ILLEGAL_ARGUMENT_ERROR is returned.
     * @stable ICU 2.0
     */
    StringSearch(const UnicodeString     &pattern,
                 const UnicodeString     &text,
                       RuleBasedCollator *coll,
                       BreakIterator     *breakiter,
                       UErrorCode        &status);

    /**
     * Creates a <tt>StringSearch</tt> instance using the language rules of
     * the argument locale.
     * A collator will be created in the process, which
     * will be owned by this instance and will be deleted during
     * destruction.
     * <p>
     * Note: No parsing of the text within the <tt>CharacterIterator</tt>
     * will be done during searching for this version. The block of text
     * in <tt>CharacterIterator</tt> will be used as it is.
     * @param pattern The text for which this object will search.
     * @param text    The text iterator in which to search for the pattern.
     * @param locale  A locale which defines the language-sensitive
     *                comparison rules used to determine whether text in the
     *                pattern and target match. User is responsible for
     *                the clearing of this object.
     * @param breakiter A <tt>BreakIterator</tt> object used to constrain
     *                the matches that are found. Matches whose start and end
     *                indices in the target text are not boundaries as
     *                determined by the <tt>BreakIterator</tt> are
     *                ignored.
     *                If this behavior is not desired,
     *                <tt>NULL</tt> can be passed in instead.
     * @param status  for errors if any. If either the length of pattern or
     *                text is 0 then a U_ILLEGAL_ARGUMENT_ERROR is returned.
     * @stable ICU 2.0
     */
    StringSearch(const UnicodeString &pattern, CharacterIterator &text,
                 const Locale        &locale,
                       BreakIterator *breakiter,
                       UErrorCode    &status);

    /**
     * Creates a <tt>StringSearch</tt> instance using the language rules of
     * the argument collator. Note, the user retains ownership of this
     * collator; it is not destroyed during this instance's destruction.
     * <p>
     * Note: No parsing of the text within the <tt>CharacterIterator</tt>
     * will be done during searching for this version. The block of text
     * in <tt>CharacterIterator</tt> will be used as it is.
     * @param pattern The text for which this object will search.
     * @param text    The text in which to search for the pattern.
     * @param coll    A <tt>RuleBasedCollator</tt> object which defines
     *                the language-sensitive comparison rules used to
     *                determine whether text in the pattern and target
     *                match. User is responsible for the clearing of this
     *                object.
     * @param breakiter A <tt>BreakIterator</tt> object used to constrain
     *                the matches that are found. Matches whose start and end
     *                indices in the target text are not boundaries as
     *                determined by the <tt>BreakIterator</tt> are
     *                ignored. If this behavior is not desired,
     *                <tt>NULL</tt> can be passed in instead.
     * @param status  for errors if any. If either the length of pattern or
     *                text is 0 then a U_ILLEGAL_ARGUMENT_ERROR is returned.
     * @stable ICU 2.0
     */
    StringSearch(const UnicodeString     &pattern, CharacterIterator &text,
                       RuleBasedCollator *coll,
                       BreakIterator     *breakiter,
                       UErrorCode        &status);

    /**
     * Copy constructor that creates a StringSearch instance with the same
     * behavior, and iterating over the same text.
     * @param that StringSearch instance to be copied.
     * @stable ICU 2.0
     */
    StringSearch(const StringSearch &that);

    /**
     * Destructor. Cleans up the search iterator data struct.
     * If a collator is created in the constructor, it will be destroyed here.
     * @stable ICU 2.0
     */
    virtual ~StringSearch(void);

    /**
     * Clone this object.
     * Clones can be used concurrently in multiple threads.
     * If an error occurs, then NULL is returned.
     * The caller must delete the clone.
     *
     * @return a clone of this object
     *
     * @see getDynamicClassID
     * @stable ICU 2.8
     */
    StringSearch *clone() const;

    // operator overloading ---------------------------------------------

    /**
     * Assignment operator. Sets this iterator to have the same behavior,
     * and iterate over the same text, as the one passed in.
     * @param that instance to be copied.
     * @stable ICU 2.0
     */
    StringSearch & operator=(const StringSearch &that);

    /**
     * Equality operator.
     * @param that instance to be compared.
     * @return TRUE if both instances have the same attributes,
     *         breakiterators, collators and iterate over the same text
     *         while looking for the same pattern.
     * @stable ICU 2.0
     */
    virtual UBool operator==(const SearchIterator &that) const;

    // public get and set methods ----------------------------------------

    /**
     * Sets the index to point to the given position, and clears any state
     * that's affected.
     * <p>
     * This method takes the argument index and sets the position in the text
     * string accordingly without checking if the index is pointing to a
     * valid starting point to begin searching.
     * @param position within the text to be set. If position is less
     *          than or greater than the text range for searching,
     *          a U_INDEX_OUTOFBOUNDS_ERROR will be returned.
     * @param status for errors if any occur
     * @stable ICU 2.0
     */
    virtual void setOffset(int32_t position, UErrorCode &status);

    /**
     * Return the current index in the text being searched.
     * If the iteration has gone past the end of the text
     * (or past the beginning for a backwards search), USEARCH_DONE
     * is returned.
     * @return current index in the text being searched.
328ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 329ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 330ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual int32_t getOffset(void) const; 331ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 332ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 333ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Set the target text to be searched. 334ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Text iteration will hence begin at the start of the text string. 335ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * This method is 336ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * useful if you want to re-use an iterator to search for the same 337ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * pattern within a different body of text. 338ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param text text string to be searched 339ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param status for errors if any. If the text length is 0 then an 340ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * U_ILLEGAL_ARGUMENT_ERROR is returned. 341ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 342ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 343ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual void setText(const UnicodeString &text, UErrorCode &status); 344ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 345ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 346ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Set the target text to be searched. 347ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Text iteration will hence begin at the start of the text string. 
348ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * This method is 349ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * useful if you want to re-use an iterator to search for the same 350ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * pattern within a different body of text. 351ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Note: No parsing of the text within the <tt>CharacterIterator</tt> 352ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * will be done during searching for this version. The block of text 353ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * in <tt>CharacterIterator</tt> will be used as it is. 354ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param text text string to be searched 355ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param status for errors if any. If the text length is 0 then an 356ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * U_ILLEGAL_ARGUMENT_ERROR is returned. 357ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 358ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 359ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual void setText(CharacterIterator &text, UErrorCode &status); 360ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 361ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 362ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Gets the collator used for the language rules. 363ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * <p> 364ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Caller may modify but <b>must not</b> delete the <tt>RuleBasedCollator</tt>! 
365ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Modifications to this collator will affect the original collator passed in to 366ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * the <tt>StringSearch></tt> constructor or to setCollator, if any. 367ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @return collator used for string search 368ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 369ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 370ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru RuleBasedCollator * getCollator() const; 371ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 372ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 373ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Sets the collator used for the language rules. User retains the 374ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * ownership of this collator, thus the responsibility of deletion lies 375ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * with the user. This method causes internal data such as Boyer-Moore 376ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * shift tables to be recalculated, but the iterator's position is 377ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * unchanged. 
378ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param coll collator 379ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param status for errors if any 380ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 381ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 382ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru void setCollator(RuleBasedCollator *coll, UErrorCode &status); 383ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 384ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 385ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Sets the pattern used for matching. 386ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Internal data like the Boyer Moore table will be recalculated, but 387ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * the iterator's position is unchanged. 388ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param pattern search pattern to be found 389ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param status for errors if any. If the pattern length is 0 then an 390ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * U_ILLEGAL_ARGUMENT_ERROR is returned. 391ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 392ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 393ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru void setPattern(const UnicodeString &pattern, UErrorCode &status); 394ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 395ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 396ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Gets the search pattern. 
397ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @return pattern used for matching 398ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 399ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 400ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru const UnicodeString & getPattern() const; 401ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 402ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru // public methods ---------------------------------------------------- 403ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 404ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 405ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Reset the iteration. 406ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Search will begin at the start of the text string if a forward 407ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * iteration is initiated before a backwards iteration. Otherwise if 408ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * a backwards iteration is initiated before a forwards iteration, the 409ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * search will begin at the end of the text string. 410ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 411ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 412ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual void reset(); 413ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 414ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 415ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Returns a copy of StringSearch with the same behavior, and 416ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * iterating over the same text, as this one. 
Note that all data will be 417ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * replicated, except for the user-specified collator and the 418ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * breakiterator. 419ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @return cloned object 420ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 421ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 422ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual SearchIterator * safeClone(void) const; 423ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 424ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 425ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * ICU "poor man's RTTI", returns a UClassID for the actual class. 426ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * 427ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.2 428ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 429ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual UClassID getDynamicClassID() const; 430ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 431ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 432ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * ICU "poor man's RTTI", returns a UClassID for this class. 
433ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * 434ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.2 435ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 436ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru static UClassID U_EXPORT2 getStaticClassID(); 437ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 438ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queruprotected: 439ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 440ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru // protected method ------------------------------------------------- 441ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 442ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 443ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Search forward for matching text, starting at a given location. 444ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Clients should not call this method directly; instead they should 445ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * call {@link SearchIterator#next }. 446ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * <p> 447ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * If a match is found, this method returns the index at which the match 448ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * starts and calls {@link SearchIterator#setMatchLength } with the number 449ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * of characters in the target text that make up the match. If no match 450ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * is found, the method returns <tt>USEARCH_DONE</tt>. 
451ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * <p> 452ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * The <tt>StringSearch</tt> is adjusted so that its current index 453ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * (as returned by {@link #getOffset }) is the match position if one was 454ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * found. 455ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and 456ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * the <tt>StringSearch</tt> will be adjusted to the index USEARCH_DONE. 457ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param position The index in the target text at which the search 458ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * starts 459ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param status for errors if any occurs 460ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @return The index at which the matched text in the target starts, or 461ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * USEARCH_DONE if no match was found. 462ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 463ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 464ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual int32_t handleNext(int32_t position, UErrorCode &status); 465ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 466ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 467ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Search backward for matching text, starting at a given location. 468ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Clients should not call this method directly; instead they should call 469ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * <tt>SearchIterator.previous()</tt>, which this method overrides. 
470ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * <p> 471ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * If a match is found, this method returns the index at which the match 472ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * starts and calls {@link SearchIterator#setMatchLength } with the number 473ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * of characters in the target text that make up the match. If no match 474ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * is found, the method returns <tt>USEARCH_DONE</tt>. 475ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * <p> 476ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * The <tt>StringSearch</tt> is adjusted so that its current index 477ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * (as returned by {@link #getOffset }) is the match position if one was 478ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * found. 479ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * If a match is not found, <tt>USEARCH_DONE</tt> will be returned and 480ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * the <tt>StringSearch</tt> will be adjusted to the index USEARCH_DONE. 481ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param position The index in the target text at which the search 482ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * starts. 483ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @param status for errors if any occurs 484ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @return The index at which the matched text in the target starts, or 485ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * USEARCH_DONE if no match was found. 
486ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 487ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 488ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru virtual int32_t handlePrev(int32_t position, UErrorCode &status); 489ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 490ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queruprivate : 491ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru StringSearch(); // default constructor not implemented 492ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 493ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru // private data members ---------------------------------------------- 494ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 495ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 496ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * RuleBasedCollator, contains exactly the same UCollator * in m_strsrch_ 497ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 498ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 499ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru RuleBasedCollator m_collator_; 500ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 501ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * Pattern text 502ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 503ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 504ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru UnicodeString m_pattern_; 505ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru /** 506ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * String search struct data 507ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru * @stable ICU 2.0 508ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru */ 509ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru UStringSearch *m_strsrch_; 
510ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 511ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru}; 512ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 513ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste QueruU_NAMESPACE_END 514ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 515ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru#endif /* #if !UCONFIG_NO_COLLATION */ 516ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 517ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru#endif 518ac04d0bbe12b3ef54518635711412f178cb4d16Jean-Baptiste Queru 519
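// Illustration only, not part of ICU. The setCollator() and setPattern()
// comments above say that Boyer-Moore shift tables are recalculated when the
// collator or pattern changes. The sketch below shows why such a table
// depends on the pattern. It is a simplified Boyer-Moore-Horspool variant
// over plain bytes; ICU's StringSearch compares collation elements instead,
// so this is not its implementation. The names buildShiftTable and
// horspoolFind are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a bad-character shift table for `pattern`.
// Every byte defaults to a shift of the full pattern length; bytes that occur
// in the pattern (except at its last position) get a smaller shift.
std::array<std::size_t, 256> buildShiftTable(const std::string &pattern) {
    assert(!pattern.empty());
    std::array<std::size_t, 256> shift;
    shift.fill(pattern.size());
    for (std::size_t i = 0; i + 1 < pattern.size(); ++i) {
        shift[static_cast<unsigned char>(pattern[i])] = pattern.size() - 1 - i;
    }
    return shift;
}

// Hypothetical helper: returns the index of the first match at or after
// `start`, or std::string::npos if there is none. An empty pattern is
// rejected, loosely mirroring the zero-length check documented above.
std::size_t horspoolFind(const std::string &text, const std::string &pattern,
                         std::size_t start = 0) {
    if (pattern.empty() || text.size() < pattern.size()) {
        return std::string::npos;
    }
    // The table is derived from the pattern alone, so it must be rebuilt
    // whenever the pattern changes; the text position is independent of it.
    const std::array<std::size_t, 256> shift = buildShiftTable(pattern);
    std::size_t pos = start;
    while (pos + pattern.size() <= text.size()) {
        if (text.compare(pos, pattern.size(), pattern) == 0) {
            return pos;
        }
        // Shift by the table entry for the text byte aligned with the
        // last pattern position.
        const unsigned char last =
            static_cast<unsigned char>(text[pos + pattern.size() - 1]);
        pos += shift[last];
    }
    return std::string::npos;
}
```

// Passing a non-zero `start` plays a role loosely comparable to calling
// setOffset() before searching: the table stays the same, only the starting
// position moves.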