1f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/* 2f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)********************************************************************** 3f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* Copyright (C) 1998-2010, International Business Machines 4f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* Corporation and others. All Rights Reserved. 5f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)********************************************************************** 6f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 7f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* File ustring.h 8f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 9f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* Modification History: 10f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 11f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* Date Name Description 12f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 12/07/98 bertrand Creation. 13f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)****************************************************************************** 14f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)*/ 15f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 16f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#ifndef USTRING_H 17f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#define USTRING_H 18f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 19f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#include "unicode/utypes.h" 20f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#include "unicode/putil.h" 21f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#include "unicode/uiter.h" 22f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 23f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** Simple declaration for u_strToTitle() to avoid including unicode/ubrk.h. @stable ICU 2.1*/ 24f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#ifndef UBRK_TYPEDEF_UBREAK_ITERATOR 25f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define UBRK_TYPEDEF_UBREAK_ITERATOR 26f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) typedef struct UBreakIterator UBreakIterator; 27f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 28f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 29f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 30f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \file 31f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \brief C API: Unicode string handling functions 32f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 33f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * These C API functions provide general Unicode string handling. 34f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 35f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Some functions are equivalent in name, signature, and behavior to the ANSI C <string.h> 36f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * functions. (For example, they do not check for bad arguments like NULL string pointers.) 37f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * In some cases, only the thread-safe variant of such a function is implemented here 38f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (see u_strtok_r()). 39f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 40f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Other functions provide more Unicode-specific functionality like locale-specific 41f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * upper/lower-casing and string comparison in code point order. 42f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 43f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * ICU uses 16-bit Unicode (UTF-16) in the form of arrays of UChar code units. 44f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * UTF-16 encodes each Unicode code point with either one or two UChar code units. 45f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (This is the default form of Unicode, and a forward-compatible extension of the original, 46f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * fixed-width form that was known as UCS-2. UTF-16 superseded UCS-2 with Unicode 2.0 47f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * in 1996.) 48f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 49f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Some APIs accept a 32-bit UChar32 value for a single code point. 50f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 51f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * ICU also handles 16-bit Unicode text with unpaired surrogates. 52f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Such text is not well-formed UTF-16. 53f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Code-point-related functions treat unpaired surrogates as surrogate code points, 54f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * i.e., as separate units. 55f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 56f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Although UTF-16 is a variable-width encoding form (like some legacy multi-byte encodings), 57f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * it is much more efficient even for random access because the code unit values 58f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * for single-unit characters vs. lead units vs. trail units are completely disjoint. 59f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This means that it is easy to determine character (code point) boundaries from 60f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * random offsets in the string. 61f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 62f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unicode (UTF-16) string processing is optimized for the single-unit case. 63f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Although it is important to support supplementary characters 64f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (which use pairs of lead/trail code units called "surrogates"), 65f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * their occurrence is rare. Almost all characters in modern use require only 66f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * a single UChar code unit (i.e., their code point values are <=0xffff). 67f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 68f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * For more details see the User Guide Strings chapter (http://icu-project.org/userguide/strings.html). 69f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * For a discussion of the handling of unpaired surrogates see also 70f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Jitterbug 2145 and its icu mailing list proposal on 2002-sep-18. 71f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 72f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 73f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 74f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \defgroup ustring_ustrlen String Length 75f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \ingroup ustring_strlen 76f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 77f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/*@{*/ 78f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 79f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Determine the length of an array of UChar. 80f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 81f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The array of UChars, NULL (U+0000) terminated. 82f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The number of UChars in <code>chars</code>, minus the terminator. 83f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 84f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 85f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 86f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strlen(const UChar *s); 87f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/*@}*/ 88f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 89f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 90f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Count Unicode code points in the length UChar code units of the string. 91f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A code point may occupy either one or two UChar code units. 92f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Counting code points involves reading all code units. 93f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 94f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This functions is basically the inverse of the U16_FWD_N() macro (see utf.h). 95f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 96f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The input string. 97f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length The number of UChar code units to be checked, or -1 to count all 98f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * code points before the first NUL (U+0000). 99f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The number of code points in the specified code units. 100f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 101f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 102f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 103f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_countChar32(const UChar *s, int32_t length); 104f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 105f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 106f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Check if the string contains more Unicode code points than a certain number. 107f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is more efficient than counting all code points in the entire string 108f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and comparing that number with a threshold. 109f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function may not need to scan the string at all if the length is known 110f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (not -1 for NUL-termination) and falls within a certain range, and 111f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * never needs to count more than 'number+1' code points. 112f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Logically equivalent to (u_countChar32(s, length)>number). 113f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A Unicode code point may occupy either one or two UChar code units. 114f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 115f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The input string. 116f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length The length of the string, or -1 if it is NUL-terminated. 117f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param number The number of code points in the string is compared against 118f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the 'number' parameter. 119f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return Boolean value for whether the string contains more Unicode code points 120f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * than 'number'. Same as (u_countChar32(s, length)>number). 121f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 122f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 123f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UBool U_EXPORT2 124f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strHasMoreChar32Than(const UChar *s, int32_t length, int32_t number); 125f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 126f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 127f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Concatenate two ustrings. Appends a copy of <code>src</code>, 128f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * including the null terminator, to <code>dst</code>. The initial copied 129f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * character from <code>src</code> overwrites the null terminator in <code>dst</code>. 130f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 131f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 132f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 133f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 134f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 135f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 136f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 137f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strcat(UChar *dst, 138f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src); 139f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 140f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 141f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Concatenate two ustrings. 142f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Appends at most <code>n</code> characters from <code>src</code> to <code>dst</code>. 143f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Adds a terminating NUL. 144f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If src is too long, then only <code>n-1</code> characters will be copied 145f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * before the terminating NUL. 146f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If <code>n<=0</code> then dst is not modified. 147f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 148f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 149f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 150f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters to append. 151f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 152f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 153f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 154f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 155f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strncat(UChar *dst, 156f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 157f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t n); 158f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 159f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 160f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the first occurrence of a substring in a string. 161f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The substring is found at code point boundaries. 162f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * That means that if the substring begins with 163f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * a trail surrogate or ends with a lead surrogate, 164f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * then it is found only if these surrogates stand alone in the text. 165f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Otherwise, the substring edge units would be matched against 166f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * halves of surrogate pairs. 167f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 168f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (NUL-terminated). 169f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param substring The substring to find (NUL-terminated). 170f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the first occurrence of <code>substring</code> in <code>s</code>, 171f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>s</code> itself if the <code>substring</code> is empty, 172f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>substring</code> is not in <code>s</code>. 173f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 174f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 175f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrstr 176f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindFirst 177f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 178f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 179f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 180f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strstr(const UChar *s, const UChar *substring); 181f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 182f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 183f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the first occurrence of a substring in a string. 184f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The substring is found at code point boundaries. 185f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * That means that if the substring begins with 186f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * a trail surrogate or ends with a lead surrogate, 187f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * then it is found only if these surrogates stand alone in the text. 188f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Otherwise, the substring edge units would be matched against 189f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * halves of surrogate pairs. 190f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 191f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search. 192f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length The length of s (number of UChars), or -1 if it is NUL-terminated. 193f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param substring The substring to find (NUL-terminated). 194f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subLength The length of substring (number of UChars), or -1 if it is NUL-terminated. 195f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the first occurrence of <code>substring</code> in <code>s</code>, 196f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>s</code> itself if the <code>substring</code> is empty, 197f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>substring</code> is not in <code>s</code>. 198f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 199f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 200f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strstr 201f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 202f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 203f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 204f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFindFirst(const UChar *s, int32_t length, const UChar *substring, int32_t subLength); 205f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 206f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 207f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the first occurrence of a BMP code point in a string. 208f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 209f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 210f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 211f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 212f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (NUL-terminated). 213f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The BMP code point to find. 214f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the first occurrence of <code>c</code> in <code>s</code> 215f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 216f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 217f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 218f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strchr32 219f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memchr 220f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strstr 221f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindFirst 222f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 223f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 224f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strchr(const UChar *s, UChar c); 225f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 226f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 227f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the first occurrence of a code point in a string. 228f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 229f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 230f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 231f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 232f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (NUL-terminated). 233f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The code point to find. 234f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the first occurrence of <code>c</code> in <code>s</code> 235f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 236f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 237f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 238f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strchr 239f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memchr32 240f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strstr 241f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindFirst 242f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 243f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 244f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strchr32(const UChar *s, UChar32 c); 245f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 246f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 247f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the last occurrence of a substring in a string. 248f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The substring is found at code point boundaries. 249f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * That means that if the substring begins with 250f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * a trail surrogate or ends with a lead surrogate, 251f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * then it is found only if these surrogates stand alone in the text. 252f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Otherwise, the substring edge units would be matched against 253f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * halves of surrogate pairs. 254f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 255f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (NUL-terminated). 256f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param substring The substring to find (NUL-terminated). 257f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the last occurrence of <code>substring</code> in <code>s</code>, 258f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>s</code> itself if the <code>substring</code> is empty, 259f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>substring</code> is not in <code>s</code>. 260f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 261f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 262f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strstr 263f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindFirst 264f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 265f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 266f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 267f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strrstr(const UChar *s, const UChar *substring); 268f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 269f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 270f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the last occurrence of a substring in a string. 271f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The substring is found at code point boundaries. 272f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * That means that if the substring begins with 273f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * a trail surrogate or ends with a lead surrogate, 274f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * then it is found only if these surrogates stand alone in the text. 275f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Otherwise, the substring edge units would be matched against 276f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * halves of surrogate pairs. 277f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 278f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search. 279f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length The length of s (number of UChars), or -1 if it is NUL-terminated. 280f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param substring The substring to find (NUL-terminated). 281f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subLength The length of substring (number of UChars), or -1 if it is NUL-terminated. 282f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the last occurrence of <code>substring</code> in <code>s</code>, 283f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>s</code> itself if the <code>substring</code> is empty, 284f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>substring</code> is not in <code>s</code>. 285f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 286f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 287f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strstr 288f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 289f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 290f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 291f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFindLast(const UChar *s, int32_t length, const UChar *substring, int32_t subLength); 292f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 293f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 294f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the last occurrence of a BMP code point in a string. 295f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 296f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 297f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 298f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 299f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (NUL-terminated). 300f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The BMP code point to find. 301f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the last occurrence of <code>c</code> in <code>s</code> 302f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 303f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 304f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 305f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrchr32 306f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memrchr 307f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrstr 308f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 309f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 310f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 311f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strrchr(const UChar *s, UChar c); 312f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 313f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 314f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the last occurrence of a code point in a string. 315f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 316f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 317f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 318f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 319f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (NUL-terminated). 320f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The code point to find. 321f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the last occurrence of <code>c</code> in <code>s</code> 322f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 323f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 324f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 325f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrchr 326f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memchr32 327f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrstr 328f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 329f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 330f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 331f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strrchr32(const UChar *s, UChar32 c); 332f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 333f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 334f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Locates the first occurrence in the string <code>string</code> of any of the characters 335f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * in the string <code>matchSet</code>. 336f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Works just like C's strpbrk but with Unicode. 337f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 338f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param string The string in which to search, NUL-terminated. 339f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param matchSet A NUL-terminated string defining a set of code points 340f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * for which to search in the text string. 341f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the character in <code>string</code> that matches one of the 342f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * characters in <code>matchSet</code>, or NULL if no such character is found. 343f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 344f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 345f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 346f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strpbrk(const UChar *string, const UChar *matchSet); 347f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 348f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 349f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Returns the number of consecutive characters in <code>string</code>, 350f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * beginning with the first, that do not occur somewhere in <code>matchSet</code>. 351f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Works just like C's strcspn but with Unicode. 352f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 353f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param string The string in which to search, NUL-terminated. 354f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param matchSet A NUL-terminated string defining a set of code points 355f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * for which to search in the text string. 356f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The number of initial characters in <code>string</code> that do not 357f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * occur in <code>matchSet</code>. 358f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strspn 359f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 360f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 361f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 362f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strcspn(const UChar *string, const UChar *matchSet); 363f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 364f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 365f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Returns the number of consecutive characters in <code>string</code>, 366f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * beginning with the first, that occur somewhere in <code>matchSet</code>. 367f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Works just like C's strspn but with Unicode. 368f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 369f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param string The string in which to search, NUL-terminated. 370f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param matchSet A NUL-terminated string defining a set of code points 371f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * for which to search in the text string. 372f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The number of initial characters in <code>string</code> that do 373f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * occur in <code>matchSet</code>. 374f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strcspn 375f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 376f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 377f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 378f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strspn(const UChar *string, const UChar *matchSet); 379f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 380f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 381f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The string tokenizer API allows an application to break a string into 382f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * tokens. Unlike strtok(), the saveState (the current pointer within the 383f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * original string) is maintained in saveState. In the first call, the 384f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * argument src is a pointer to the string. In subsequent calls to 385f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * return successive tokens of that string, src must be specified as 386f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * NULL. The value saveState is set by this function to maintain the 387f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function's position within the string, and on each subsequent call 388f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * you must give this argument the same variable. This function does 389f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * handle surrogate pairs. This function is similar to the strtok_r() 390f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the POSIX Threads Extension (1003.1c-1995) version. 391f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 392f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src String containing token(s). This string will be modified. 393f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * After the first call to u_strtok_r(), this argument must 394f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * be NULL to get to the next token. 395f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param delim Set of delimiter characters (Unicode code points). 396f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param saveState The current pointer within the original string, 397f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which is set by this function. The saveState 398f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * parameter should the address of a local variable of type 399f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * UChar *. (i.e. defined "Uhar *myLocalSaveState" and use 400f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * &myLocalSaveState for this parameter). 401f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the next token found in src, or NULL 402f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * when there are no more tokens. 403f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 404f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 405f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 406f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strtok_r(UChar *src, 407f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *delim, 408f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar **saveState); 409f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 410f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 411f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two Unicode strings for bitwise equality (code unit order). 412f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 413f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 414f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 415f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return 0 if <code>s1</code> and <code>s2</code> are bitwise equal; a negative 416f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * value if <code>s1</code> is bitwise less than <code>s2,</code>; a positive 417f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * value if <code>s1</code> is bitwise greater than <code>s2</code>. 418f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 419f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 420f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 421f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strcmp(const UChar *s1, 422f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *s2); 423f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 424f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 425f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two Unicode strings in code point order. 426f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * See u_strCompare for details. 427f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 428f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 429f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 430f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return a negative/zero/positive integer corresponding to whether 431f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the first string is less than/equal to/greater than the second one 432f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * in code point order 433f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 434f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 435f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 436f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strcmpCodePointOrder(const UChar *s1, const UChar *s2); 437f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 438f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 439f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two Unicode strings (binary order). 440f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 441f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The comparison can be done in code unit order or in code point order. 442f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * They differ only in UTF-16 when 443f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * comparing supplementary code points (U+10000..U+10ffff) 444f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * to BMP code points near the end of the BMP (i.e., U+e000..U+ffff). 445f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * In code unit order, high BMP code points sort after supplementary code points 446f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * because they are stored as pairs of surrogates which are at U+d800..U+dfff. 447f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 448f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This functions works with strings of different explicitly specified lengths 449f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * unlike the ANSI C-like u_strcmp() and u_memcmp() etc. 450f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * NUL-terminated strings are possible with length arguments of -1. 451f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 452f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 First source string. 453f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length1 Length of first source string, or -1 if NUL-terminated. 454f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 455f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 Second source string. 456f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length2 Length of second source string, or -1 if NUL-terminated. 457f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 458f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param codePointOrder Choose between code unit order (FALSE) 459f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and code point order (TRUE). 460f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 461f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return <0 or 0 or >0 as usual for string comparisons 462f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 463f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.2 464f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 465f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 466f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strCompare(const UChar *s1, int32_t length1, 467f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *s2, int32_t length2, 468f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UBool codePointOrder); 469f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 470f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 471f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two Unicode strings (binary order) 472f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * as presented by UCharIterator objects. 473f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Works otherwise just like u_strCompare(). 474f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 475f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Both iterators are reset to their start positions. 476f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * When the function returns, it is undefined where the iterators 477f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * have stopped. 478f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 479f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param iter1 First source string iterator. 480f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param iter2 Second source string iterator. 481f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param codePointOrder Choose between code unit order (FALSE) 482f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and code point order (TRUE). 483f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 484f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return <0 or 0 or >0 as usual for string comparisons 485f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 486f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strCompare 487f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 488f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.6 489f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 490f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 491f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strCompareIter(UCharIterator *iter1, UCharIterator *iter2, UBool codePointOrder); 492f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 493f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#ifndef U_COMPARE_CODE_POINT_ORDER 494f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/* see also unistr.h and unorm.h */ 495f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 496f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Option bit for u_strCaseCompare, u_strcasecmp, unorm_compare, etc: 497f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare strings in code point order instead of code unit order. 498f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.2 499f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 500f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#define U_COMPARE_CODE_POINT_ORDER 0x8000 501f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 502f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 503f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 504f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two strings case-insensitively using full case folding. 505f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is equivalent to 506f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * u_strCompare(u_strFoldCase(s1, options), 507f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * u_strFoldCase(s2, options), 508f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (options&U_COMPARE_CODE_POINT_ORDER)!=0). 509f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 510f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The comparison can be done in UTF-16 code unit order or in code point order. 511f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * They differ only when comparing supplementary code points (U+10000..U+10ffff) 512f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * to BMP code points near the end of the BMP (i.e., U+e000..U+ffff). 513f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * In code unit order, high BMP code points sort after supplementary code points 514f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * because they are stored as pairs of surrogates which are at U+d800..U+dfff. 515f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 516f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This functions works with strings of different explicitly specified lengths 517f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * unlike the ANSI C-like u_strcmp() and u_memcmp() etc. 518f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * NUL-terminated strings are possible with length arguments of -1. 519f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 520f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 First source string. 521f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length1 Length of first source string, or -1 if NUL-terminated. 522f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 523f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 Second source string. 524f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length2 Length of second source string, or -1 if NUL-terminated. 525f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 526f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param options A bit set of options: 527f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_DEFAULT or 0 is used for default options: 528f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Comparison in code unit order with default case folding. 529f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 530f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_COMPARE_CODE_POINT_ORDER 531f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to choose code point order instead of code unit order 532f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (see u_strCompare for details). 533f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 534f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_EXCLUDE_SPECIAL_I 535f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 536f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 537f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 538f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 539f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return <0 or 0 or >0 as usual for string comparisons 540f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 541f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.2 542f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 543f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 544f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strCaseCompare(const UChar *s1, int32_t length1, 545f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *s2, int32_t length2, 546f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) uint32_t options, 547f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 548f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 549f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 550f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two ustrings for bitwise equality. 551f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compares at most <code>n</code> characters. 552f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 553f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param ucs1 A string to compare. 554f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param ucs2 A string to compare. 555f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters to compare. 556f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return 0 if <code>s1</code> and <code>s2</code> are bitwise equal; a negative 557f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * value if <code>s1</code> is bitwise less than <code>s2</code>; a positive 558f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * value if <code>s1</code> is bitwise greater than <code>s2</code>. 559f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 560f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 561f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 562f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strncmp(const UChar *ucs1, 563f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *ucs2, 564f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t n); 565f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 566f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 567f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two Unicode strings in code point order. 568f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is different in UTF-16 from u_strncmp() if supplementary characters are present. 569f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * For details, see u_strCompare(). 570f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 571f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 572f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 573f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters to compare. 574f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return a negative/zero/positive integer corresponding to whether 575f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the first string is less than/equal to/greater than the second one 576f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * in code point order 577f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 578f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 579f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 580f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strncmpCodePointOrder(const UChar *s1, const UChar *s2, int32_t n); 581f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 582f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 583f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two strings case-insensitively using full case folding. 584f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is equivalent to u_strcmp(u_strFoldCase(s1, options), u_strFoldCase(s2, options)). 585f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 586f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 587f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 588f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param options A bit set of options: 589f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_DEFAULT or 0 is used for default options: 590f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Comparison in code unit order with default case folding. 591f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 592f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_COMPARE_CODE_POINT_ORDER 593f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to choose code point order instead of code unit order 594f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (see u_strCompare for details). 595f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 596f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_EXCLUDE_SPECIAL_I 597f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 598f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A negative, zero, or positive integer indicating the comparison result. 599f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 600f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 601f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 602f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strcasecmp(const UChar *s1, const UChar *s2, uint32_t options); 603f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 604f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 605f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two strings case-insensitively using full case folding. 606f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is equivalent to u_strcmp(u_strFoldCase(s1, at most n, options), 607f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * u_strFoldCase(s2, at most n, options)). 608f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 609f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 610f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 611f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters each string to case-fold and then compare. 612f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param options A bit set of options: 613f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_DEFAULT or 0 is used for default options: 614f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Comparison in code unit order with default case folding. 615f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 616f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_COMPARE_CODE_POINT_ORDER 617f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to choose code point order instead of code unit order 618f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (see u_strCompare for details). 619f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 620f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_EXCLUDE_SPECIAL_I 621f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 622f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A negative, zero, or positive integer indicating the comparison result. 623f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 624f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 625f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 626f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strncasecmp(const UChar *s1, const UChar *s2, int32_t n, uint32_t options); 627f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 628f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 629f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two strings case-insensitively using full case folding. 630f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is equivalent to u_strcmp(u_strFoldCase(s1, n, options), 631f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * u_strFoldCase(s2, n, options)). 632f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 633f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 634f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 635f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length The number of characters in each string to case-fold and then compare. 636f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param options A bit set of options: 637f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_DEFAULT or 0 is used for default options: 638f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Comparison in code unit order with default case folding. 639f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 640f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_COMPARE_CODE_POINT_ORDER 641f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to choose code point order instead of code unit order 642f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (see u_strCompare for details). 643f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 644f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - U_FOLD_CASE_EXCLUDE_SPECIAL_I 645f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 646f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A negative, zero, or positive integer indicating the comparison result. 647f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 648f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 649f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 650f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memcasecmp(const UChar *s1, const UChar *s2, int32_t length, uint32_t options); 651f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 652f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 653f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copy a ustring. Adds a null terminator. 654f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 655f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 656f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 657f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 658f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 659f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 660f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 661f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strcpy(UChar *dst, 662f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src); 663f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 664f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 665f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copy a ustring. 666f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copies at most <code>n</code> characters. The result will be null terminated 667f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * if the length of <code>src</code> is less than <code>n</code>. 668f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 669f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 670f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 671f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters to copy. 672f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 673f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 674f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 675f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 676f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strncpy(UChar *dst, 677f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 678f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t n); 679f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 680f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#if !UCONFIG_NO_CONVERSION 681f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 682f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 683f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copy a byte string encoded in the default codepage to a ustring. 684f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Adds a null terminator. 685f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Performs a host byte to UChar conversion 686f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 687f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 688f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 689f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 690f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 691f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 692f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 u_uastrcpy(UChar *dst, 693f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *src ); 694f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 695f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 696f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copy a byte string encoded in the default codepage to a ustring. 697f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copies at most <code>n</code> characters. The result will be null terminated 698f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * if the length of <code>src</code> is less than <code>n</code>. 699f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Performs a host byte to UChar conversion 700f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 701f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 702f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 703f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters to copy. 704f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 705f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 706f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 707f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 u_uastrncpy(UChar *dst, 708f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *src, 709f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t n); 710f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 711f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 712f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copy ustring to a byte string encoded in the default codepage. 713f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Adds a null terminator. 714f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Performs a UChar to host byte conversion 715f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 716f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 717f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 718f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 719f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 720f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 721f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE char* U_EXPORT2 u_austrcpy(char *dst, 722f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src ); 723f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 724f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 725f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copy ustring to a byte string encoded in the default codepage. 726f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Copies at most <code>n</code> characters. The result will be null terminated 727f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * if the length of <code>src</code> is less than <code>n</code>. 728f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Performs a UChar to host byte conversion 729f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 730f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dst The destination string. 731f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string. 732f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param n The maximum number of characters to copy. 733f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dst</code>. 734f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 735f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 736f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE char* U_EXPORT2 u_austrncpy(char *dst, 737f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 738f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t n ); 739f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 740f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 741f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 742f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 743f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Synonym for memcpy(), but with UChars only. 744f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest The destination string 745f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string 746f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The number of characters to copy 747f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dest</code> 748f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 749f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 750f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 751f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memcpy(UChar *dest, const UChar *src, int32_t count); 752f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 753f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 754f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Synonym for memmove(), but with UChars only. 755f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest The destination string 756f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The source string 757f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The number of characters to move 758f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dest</code> 759f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 760f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 761f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 762f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memmove(UChar *dest, const UChar *src, int32_t count); 763f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 764f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 765f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Initialize <code>count</code> characters of <code>dest</code> to <code>c</code>. 766f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 767f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest The destination string. 768f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The character to initialize the string. 769f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The maximum number of characters to set. 770f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to <code>dest</code>. 771f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 772f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 773f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 774f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memset(UChar *dest, UChar c, int32_t count); 775f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 776f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 777f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare the first <code>count</code> UChars of each buffer. 778f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 779f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param buf1 The first string to compare. 780f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param buf2 The second string to compare. 781f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The maximum number of UChars to compare. 782f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return When buf1 < buf2, a negative number is returned. 783f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * When buf1 == buf2, 0 is returned. 784f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * When buf1 > buf2, a positive number is returned. 785f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 786f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 787f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 788f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memcmp(const UChar *buf1, const UChar *buf2, int32_t count); 789f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 790f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 791f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Compare two Unicode strings in code point order. 792f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is different in UTF-16 from u_memcmp() if supplementary characters are present. 793f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * For details, see u_strCompare(). 794f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 795f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s1 A string to compare. 796f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s2 A string to compare. 797f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The maximum number of characters to compare. 798f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return a negative/zero/positive integer corresponding to whether 799f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the first string is less than/equal to/greater than the second one 800f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * in code point order 801f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 802f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 803f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 804f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memcmpCodePointOrder(const UChar *s1, const UChar *s2, int32_t count); 805f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 806f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 807f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the first occurrence of a BMP code point in a string. 808f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 809f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 810f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 811f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 812f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (contains <code>count</code> UChars). 813f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The BMP code point to find. 814f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The length of the string. 815f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the first occurrence of <code>c</code> in <code>s</code> 816f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 817f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 818f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 819f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strchr 820f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memchr32 821f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindFirst 822f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 823f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 824f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memchr(const UChar *s, UChar c, int32_t count); 825f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 826f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 827f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the first occurrence of a code point in a string. 828f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 829f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 830f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 831f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 832f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (contains <code>count</code> UChars). 833f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The code point to find. 834f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The length of the string. 835f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the first occurrence of <code>c</code> in <code>s</code> 836f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 837f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 838f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 839f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strchr32 840f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memchr 841f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindFirst 842f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 843f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 844f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memchr32(const UChar *s, UChar32 c, int32_t count); 845f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 846f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 847f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the last occurrence of a BMP code point in a string. 848f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 849f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 850f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 851f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 852f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (contains <code>count</code> UChars). 853f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The BMP code point to find. 854f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The length of the string. 855f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the last occurrence of <code>c</code> in <code>s</code> 856f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 857f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 858f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 859f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrchr 860f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memrchr32 861f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 862f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 863f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 864f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memrchr(const UChar *s, UChar c, int32_t count); 865f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 866f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 867f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Find the last occurrence of a code point in a string. 868f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A surrogate code point is found only if its match in the text is not 869f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * part of a surrogate pair. 870f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A NUL character is found at the string terminator. 871f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 872f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param s The string to search (contains <code>count</code> UChars). 873f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param c The code point to find. 874f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param count The length of the string. 875f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return A pointer to the last occurrence of <code>c</code> in <code>s</code> 876f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or <code>NULL</code> if <code>c</code> is not in <code>s</code>. 877f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.4 878f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 879f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strrchr32 880f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_memrchr 881f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFindLast 882f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 883f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 884f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_memrchr32(const UChar *s, UChar32 c, int32_t count); 885f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 886f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 887f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unicode String literals in C. 888f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * We need one macro to declare a variable for the string 889f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and to statically preinitialize it if possible, 890f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and a second macro to dynamically intialize such a string variable if necessary. 891f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 892f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The macros are defined for maximum performance. 893f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * They work only for strings that contain "invariant characters", i.e., 894f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * only latin letters, digits, and some punctuation. 895f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * See utypes.h for details. 896f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 897f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A pair of macros for a single string must be used with the same 898f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * parameters. 899f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The string parameter must be a C string literal. 900f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The length of the string, not including the terminating 901f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * <code>NUL</code>, must be specified as a constant. 902f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The U_STRING_DECL macro should be invoked exactly once for one 903f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * such string variable before it is used. 904f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 905f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Usage: 906f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * <pre> 907f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_STRING_DECL(ustringVar1, "Quick-Fox 2", 11); 908f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_STRING_DECL(ustringVar2, "jumps 5%", 8); 909f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * static UBool didInit=FALSE; 910f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 911f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * int32_t function() { 912f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * if(!didInit) { 913f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_STRING_INIT(ustringVar1, "Quick-Fox 2", 11); 914f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_STRING_INIT(ustringVar2, "jumps 5%", 8); 915f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * didInit=TRUE; 916f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * } 917f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * return u_strcmp(ustringVar1, ustringVar2); 918f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * } 919f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * </pre> 920f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 921f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Note that the macros will NOT consistently work if their argument is another #define. 922f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The following will not work on all platforms, don't use it. 923f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 924f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * <pre> 925f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * #define GLUCK "Mr. Gluck" 926f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_STRING_DECL(var, GLUCK, 9) 927f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_STRING_INIT(var, GLUCK, 9) 928f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * </pre> 929f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 930f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Instead, use the string literal "Mr. Gluck" as the argument to both macro 931f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * calls. 932f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 933f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 934f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 935f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 936f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#if defined(U_DECLARE_UTF16) 937f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_DECL(var, cs, length) static const UChar var[(length)+1]=U_DECLARE_UTF16(cs) 938f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) /**@stable ICU 2.0 */ 939f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_INIT(var, cs, length) 940f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#elif U_SIZEOF_WCHAR_T==U_SIZEOF_UCHAR && (U_CHARSET_FAMILY==U_ASCII_FAMILY || (U_SIZEOF_UCHAR == 2 && defined(U_WCHAR_IS_UTF16))) 941f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_DECL(var, cs, length) static const UChar var[(length)+1]=L ## cs 942f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) /**@stable ICU 2.0 */ 943f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_INIT(var, cs, length) 944f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#elif U_SIZEOF_UCHAR==1 && U_CHARSET_FAMILY==U_ASCII_FAMILY 945f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_DECL(var, cs, length) static const UChar var[(length)+1]=cs 946f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) /**@stable ICU 2.0 */ 947f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_INIT(var, cs, length) 948f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#else 949f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_DECL(var, cs, length) static UChar var[(length)+1] 950f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) /**@stable ICU 2.0 */ 951f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)# define U_STRING_INIT(var, cs, length) u_charsToUChars(cs, var, length+1) 952f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 953f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 954f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 955f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unescape a string of characters and write the resulting 956f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unicode characters to the destination buffer. The following escape 957f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * sequences are recognized: 958f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 959f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\uhhhh 4 hex digits; h in [0-9A-Fa-f] 960f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\Uhhhhhhhh 8 hex digits 961f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\xhh 1-2 hex digits 962f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\x{h...} 1-8 hex digits 963f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\ooo 1-3 octal digits; o in [0-7] 964f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\cX control-X; X is masked with 0x1F 965f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 966f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * as well as the standard ANSI C escapes: 967f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 968f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\a => U+0007, \\b => U+0008, \\t => U+0009, \\n => U+000A, 969f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\v => U+000B, \\f => U+000C, \\r => U+000D, \\e => U+001B, 970f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * \\" => U+0022, \\' => U+0027, \\? => U+003F, \\\\ => U+005C 971f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 972f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Anything else following a backslash is generically escaped. For 973f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * example, "[a\\-z]" returns "[a-z]". 974f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 975f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If an escape sequence is ill-formed, this method returns an empty 976f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * string. An example of an ill-formed sequence is "\\u" followed by 977f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * fewer than 4 hex digits. 978f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 979f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The above characters are recognized in the compiler's codepage, 980f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * that is, they are coded as 'u', '\\', etc. Characters that are 981f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * not parts of escape sequences are converted using u_charsToUChars(). 982f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 983f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function is similar to UnicodeString::unescape() but not 984f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * identical to it. The latter takes a source UnicodeString, so it 985f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * does escape recognition but no conversion. 986f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 987f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src a zero-terminated string of invariant characters 988f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest pointer to buffer to receive converted and unescaped 989f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * text and, if there is room, a zero terminator. May be NULL for 990f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * preflighting, in which case no UChars will be written, but the 991f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * return value will still be valid. On error, an empty string is 992f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * stored here (if possible). 993f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity the number of UChars that may be written at 994f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest. Ignored if dest == NULL. 995f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return the length of unescaped string. 996f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_unescapeAt 997f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see UnicodeString#unescape() 998f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see UnicodeString#unescapeAt() 999f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1000f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1001f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 1002f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_unescape(const char *src, 1003f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar *dest, int32_t destCapacity); 1004f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1005f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_CDECL_BEGIN 1006f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1007f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Callback function for u_unescapeAt() that returns a character of 1008f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the source text given an offset and a context pointer. The context 1009f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pointer will be whatever is passed into u_unescapeAt(). 1010f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1011f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param offset pointer to the offset that will be passed to u_unescapeAt(). 1012f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param context an opaque pointer passed directly into u_unescapeAt() 1013f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return the character represented by the escape sequence at 1014f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * offset 1015f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_unescapeAt 1016f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1017f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1018f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)typedef UChar (U_CALLCONV *UNESCAPE_CHAR_AT)(int32_t offset, void *context); 1019f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_CDECL_END 1020f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1021f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1022f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unescape a single sequence. The character at offset-1 is assumed 1023f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (without checking) to be a backslash. This method takes a callback 1024f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pointer to a function that returns the UChar at a given offset. By 1025f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * varying this callback, ICU functions are able to unescape char* 1026f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * strings, UnicodeString objects, and UFILE pointers. 1027f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1028f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If offset is out of range, or if the escape sequence is ill-formed, 1029f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (UChar32)0xFFFFFFFF is returned. See documentation of u_unescape() 1030f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * for a list of recognized sequences. 1031f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1032f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param charAt callback function that returns a UChar of the source 1033f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * text given an offset and a context pointer. 1034f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param offset pointer to the offset that will be passed to charAt. 1035f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The offset value will be updated upon return to point after the 1036f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * last parsed character of the escape sequence. On error the offset 1037f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * is unchanged. 1038f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param length the number of characters in the source text. The 1039f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * last character of the source text is considered to be at offset 1040f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * length-1. 1041f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param context an opaque pointer passed directly into charAt. 1042f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return the character represented by the escape sequence at 1043f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * offset, or (UChar32)0xFFFFFFFF on error. 1044f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_unescape() 1045f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see UnicodeString#unescape() 1046f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see UnicodeString#unescapeAt() 1047f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1048f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1049f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar32 U_EXPORT2 1050f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_unescapeAt(UNESCAPE_CHAR_AT charAt, 1051f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *offset, 1052f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t length, 1053f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) void *context); 1054f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1055f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1056f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Uppercase the characters in a string. 1057f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Casing is locale-dependent and context-sensitive. 1058f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The result may be longer or shorter than the original. 1059f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The source string and the destination buffer are allowed to overlap. 1060f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1061f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1062f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1063f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1064f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the result 1065f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * without writing any of the result string. 1066f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original string 1067f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1068f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param locale The locale to consider, or "" for the root locale or NULL for the default locale. 1069f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1070f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1071f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The length of the result string. It may be greater than destCapacity. In that case, 1072f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * only some of the result was written to the destination buffer. 1073f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1074f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1075f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 1076f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToUpper(UChar *dest, int32_t destCapacity, 1077f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, int32_t srcLength, 1078f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *locale, 1079f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1080f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1081f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1082f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Lowercase the characters in a string. 1083f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Casing is locale-dependent and context-sensitive. 1084f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The result may be longer or shorter than the original. 1085f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The source string and the destination buffer are allowed to overlap. 1086f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1087f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1088f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1089f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1090f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the result 1091f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * without writing any of the result string. 1092f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original string 1093f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1094f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param locale The locale to consider, or "" for the root locale or NULL for the default locale. 1095f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1096f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1097f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The length of the result string. It may be greater than destCapacity. In that case, 1098f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * only some of the result was written to the destination buffer. 1099f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1100f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1101f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 1102f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToLower(UChar *dest, int32_t destCapacity, 1103f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, int32_t srcLength, 1104f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *locale, 1105f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1106f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1107f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#if !UCONFIG_NO_BREAK_ITERATION 1108f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1109f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1110f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Titlecase a string. 1111f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Casing is locale-dependent and context-sensitive. 1112f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Titlecasing uses a break iterator to find the first characters of words 1113f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * that are to be titlecased. It titlecases those characters and lowercases 1114f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all others. 1115f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1116f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The titlecase break iterator can be provided to customize for arbitrary 1117f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * styles, using rules and dictionaries beyond the standard iterators. 1118f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * It may be more efficient to always provide an iterator to avoid 1119f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * opening and closing one for each string. 1120f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The standard titlecase iterator for the root locale implements the 1121f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * algorithm of Unicode TR 21. 1122f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1123f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function uses only the setText(), first() and next() methods of the 1124f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * provided break iterator. 1125f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1126f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The result may be longer or shorter than the original. 1127f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The source string and the destination buffer are allowed to overlap. 1128f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1129f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1130f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1131f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1132f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the result 1133f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * without writing any of the result string. 1134f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original string 1135f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1136f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param titleIter A break iterator to find the first characters of words 1137f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * that are to be titlecased. 1138f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If none is provided (NULL), then a standard titlecase 1139f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * break iterator is opened. 1140f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param locale The locale to consider, or "" for the root locale or NULL for the default locale. 1141f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1142f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1143f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The length of the result string. It may be greater than destCapacity. In that case, 1144f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * only some of the result was written to the destination buffer. 1145f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.1 1146f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1147f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 1148f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToTitle(UChar *dest, int32_t destCapacity, 1149f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, int32_t srcLength, 1150f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UBreakIterator *titleIter, 1151f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *locale, 1152f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1153f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1154f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 1155f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1156f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1157f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Case-fold the characters in a string. 1158f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Case-folding is locale-independent and not context-sensitive, 1159f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * but there is an option for whether to include or exclude mappings for dotted I 1160f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and dotless i that are marked with 'I' in CaseFolding.txt. 1161f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The result may be longer or shorter than the original. 1162f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The source string and the destination buffer are allowed to overlap. 1163f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1164f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1165f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1166f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1167f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the result 1168f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * without writing any of the result string. 1169f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original string 1170f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1171f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param options Either U_FOLD_CASE_DEFAULT or U_FOLD_CASE_EXCLUDE_SPECIAL_I 1172f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1173f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1174f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The length of the result string. It may be greater than destCapacity. In that case, 1175f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * only some of the result was written to the destination buffer. 1176f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1177f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1178f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE int32_t U_EXPORT2 1179f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFoldCase(UChar *dest, int32_t destCapacity, 1180f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, int32_t srcLength, 1181f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) uint32_t options, 1182f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1183f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1184f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#if defined(U_WCHAR_IS_UTF16) || defined(U_WCHAR_IS_UTF32) || !UCONFIG_NO_CONVERSION 1185f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1186f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-16 string to a wchar_t string. 1187f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If it is known at compile time that wchar_t strings are in UTF-16 or UTF-32, then 1188f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * this function simply calls the fast, dedicated function for that. 1189f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Otherwise, two conversions UTF-16 -> default charset -> wchar_t* are performed. 1190f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1191f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1192f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1193f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of wchar_t's). If it is 0, then 1194f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1195f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1196f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1197f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1198f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1199f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1200f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1201f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1202f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1203f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1204f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1205f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1206f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1207f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE wchar_t* U_EXPORT2 1208f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToWCS(wchar_t *dest, 1209f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1210f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1211f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 1212f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1213f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1214f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1215f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a wchar_t string to UTF-16. 1216f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If it is known at compile time that wchar_t strings are in UTF-16 or UTF-32, then 1217f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * this function simply calls the fast, dedicated function for that. 1218f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Otherwise, two conversions wchar_t* -> default charset -> UTF-16 are performed. 1219f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1220f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1221f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1222f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1223f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1224f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1225f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1226f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1227f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1228f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1229f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1230f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1231f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1232f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1233f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1234f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1235f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1236f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 1237f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromWCS(UChar *dest, 1238f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1239f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1240f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const wchar_t *src, 1241f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1242f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1243f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif /* defined(U_WCHAR_IS_UTF16) || defined(U_WCHAR_IS_UTF32) || !UCONFIG_NO_CONVERSION */ 1244f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1245f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1246f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-16 string to UTF-8. 1247f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1248f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1249f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1250f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1251f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of chars). If it is 0, then 1252f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1253f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1254f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1255f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1256f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1257f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1258f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1259f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1260f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1261f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1262f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1263f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1264f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF8WithSub 1265f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8 1266f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1267f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE char* U_EXPORT2 1268f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToUTF8(char *dest, 1269f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1270f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1271f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 1272f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1273f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1274f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1275f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1276f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-8 string to UTF-16. 1277f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1278f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1279f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1280f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1281f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1282f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1283f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1284f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1285f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1286f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1287f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1288f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1289f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1290f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1291f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1292f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1293f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1294f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8WithSub 1295f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8Lenient 1296f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1297f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 1298f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromUTF8(UChar *dest, 1299f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1300f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1301f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *src, 1302f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1303f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1304f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1305f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1306f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-16 string to UTF-8. 1307f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1308f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1309f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Same as u_strToUTF8() except for the additional subchar which is output for 1310f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * illegal input sequences, instead of stopping with the U_INVALID_CHAR_FOUND error code. 1311f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * With subchar==U_SENTINEL, this function behaves exactly like u_strToUTF8(). 1312f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1313f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1314f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1315f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of chars). If it is 0, then 1316f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1317f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1318f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1319f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1320f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1321f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1322f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1323f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1324f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subchar The substitution character to use in place of an illegal input sequence, 1325f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or U_SENTINEL if the function is to return with U_INVALID_CHAR_FOUND instead. 1326f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A substitution character can be any valid Unicode code point (up to U+10FFFF) 1327f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except for surrogate code points (U+D800..U+DFFF). 1328f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The recommended value is U+FFFD "REPLACEMENT CHARACTER". 1329f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pNumSubstitutions Output parameter receiving the number of substitutions if subchar>=0. 1330f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to 0 if no substitutions occur or subchar<0. 1331f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pNumSubstitutions can be NULL. 1332f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1333f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1334f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1335f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1336f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1337f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF8 1338f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8WithSub 1339f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 3.6 1340f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1341f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE char* U_EXPORT2 1342f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToUTF8WithSub(char *dest, 1343f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1344f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1345f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 1346f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1347f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar32 subchar, int32_t *pNumSubstitutions, 1348f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1349f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1350f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1351f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-8 string to UTF-16. 1352f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1353f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1354f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Same as u_strFromUTF8() except for the additional subchar which is output for 1355f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * illegal input sequences, instead of stopping with the U_INVALID_CHAR_FOUND error code. 1356f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * With subchar==U_SENTINEL, this function behaves exactly like u_strFromUTF8(). 1357f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1358f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1359f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1360f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1361f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1362f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1363f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1364f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1365f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1366f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1367f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1368f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1369f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subchar The substitution character to use in place of an illegal input sequence, 1370f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or U_SENTINEL if the function is to return with U_INVALID_CHAR_FOUND instead. 1371f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A substitution character can be any valid Unicode code point (up to U+10FFFF) 1372f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except for surrogate code points (U+D800..U+DFFF). 1373f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The recommended value is U+FFFD "REPLACEMENT CHARACTER". 1374f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pNumSubstitutions Output parameter receiving the number of substitutions if subchar>=0. 1375f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to 0 if no substitutions occur or subchar<0. 1376f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pNumSubstitutions can be NULL. 1377f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1378f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1379f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1380f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1381f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1382f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8 1383f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8Lenient 1384f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF8WithSub 1385f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 3.6 1386f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1387f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 1388f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromUTF8WithSub(UChar *dest, 1389f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1390f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1391f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *src, 1392f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1393f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar32 subchar, int32_t *pNumSubstitutions, 1394f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1395f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1396f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1397f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-8 string to UTF-16. 1398f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1399f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Same as u_strFromUTF8() except that this function is designed to be very fast, 1400f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which it achieves by being lenient about malformed UTF-8 sequences. 1401f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function is intended for use in environments where UTF-8 text is 1402f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * expected to be well-formed. 1403f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1404f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Its semantics are: 1405f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - Well-formed UTF-8 text is correctly converted to well-formed UTF-16 text. 1406f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - The function will not read beyond the input string, nor write beyond 1407f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the destCapacity. 1408f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - Malformed UTF-8 results in "garbage" 16-bit Unicode strings which may not 1409f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * be well-formed UTF-16. 1410f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The function will resynchronize to valid code point boundaries 1411f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * within a small number of code points after an illegal sequence. 1412f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * - Non-shortest forms are not detected and will result in "spoofing" output. 1413f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1414f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * For further performance improvement, if srcLength is given (>=0), 1415f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * then it must be destCapacity>=srcLength. 1416f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1417f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * There is no inverse u_strToUTF8Lenient() function because there is practically 1418f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * no performance gain from not checking that a UTF-16 string is well-formed. 1419f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1420f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1421f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1422f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1423f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1424f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1425f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unlike for other ICU functions, if srcLength>=0 then it 1426f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * must be destCapacity>=srcLength. 1427f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1428f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1429f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1430f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1431f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Unlike for other ICU functions, if srcLength>=0 but 1432f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * destCapacity<srcLength, then *pDestLength will be set to srcLength 1433f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (and U_BUFFER_OVERFLOW_ERROR will be set) 1434f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * regardless of the actual result length. 1435f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1436f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1437f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1438f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1439f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1440f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1441f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1442f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8 1443f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8WithSub 1444f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF8WithSub 1445f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 3.6 1446f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1447f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar * U_EXPORT2 1448f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromUTF8Lenient(UChar *dest, 1449f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1450f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1451f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *src, 1452f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1453f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1454f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1455f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1456f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-16 string to UTF-32. 1457f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1458f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1459f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1460f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1461f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChar32s). If it is 0, then 1462f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1463f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1464f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1465f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1466f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1467f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1468f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1469f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1470f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1471f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1472f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1473f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF32WithSub 1474f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF32 1475f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1476f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1477f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar32* U_EXPORT2 1478f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToUTF32(UChar32 *dest, 1479f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1480f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1481f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 1482f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1483f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1484f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1485f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1486f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-32 string to UTF-16. 1487f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1488f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1489f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1490f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1491f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1492f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1493f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1494f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1495f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1496f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1497f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1498f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1499f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1500f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Must be a valid pointer to an error code value, 1501f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * which must not indicate a failure before the function call. 1502f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1503f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF32WithSub 1504f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF32 1505f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 2.0 1506f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1507f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 1508f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromUTF32(UChar *dest, 1509f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1510f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1511f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar32 *src, 1512f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1513f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1514f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1515f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1516f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-16 string to UTF-32. 1517f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1518f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1519f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Same as u_strToUTF32() except for the additional subchar which is output for 1520f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * illegal input sequences, instead of stopping with the U_INVALID_CHAR_FOUND error code. 1521f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * With subchar==U_SENTINEL, this function behaves exactly like u_strToUTF32(). 1522f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1523f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1524f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1525f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChar32s). If it is 0, then 1526f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1527f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1528f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1529f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1530f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1531f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1532f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1533f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1534f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subchar The substitution character to use in place of an illegal input sequence, 1535f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or U_SENTINEL if the function is to return with U_INVALID_CHAR_FOUND instead. 1536f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A substitution character can be any valid Unicode code point (up to U+10FFFF) 1537f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except for surrogate code points (U+D800..U+DFFF). 1538f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The recommended value is U+FFFD "REPLACEMENT CHARACTER". 1539f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pNumSubstitutions Output parameter receiving the number of substitutions if subchar>=0. 1540f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to 0 if no substitutions occur or subchar<0. 1541f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pNumSubstitutions can be NULL. 1542f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1543f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1544f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1545f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1546f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1547f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF32 1548f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF32WithSub 1549f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 4.2 1550f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1551f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar32* U_EXPORT2 1552f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToUTF32WithSub(UChar32 *dest, 1553f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1554f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1555f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 1556f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1557f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar32 subchar, int32_t *pNumSubstitutions, 1558f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1559f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1560f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1561f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a UTF-32 string to UTF-16. 1562f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1563f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1564f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Same as u_strFromUTF32() except for the additional subchar which is output for 1565f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * illegal input sequences, instead of stopping with the U_INVALID_CHAR_FOUND error code. 1566f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * With subchar==U_SENTINEL, this function behaves exactly like u_strFromUTF32(). 1567f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1568f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1569f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1570f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1571f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1572f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1573f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1574f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1575f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1576f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1577f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1578f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1579f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subchar The substitution character to use in place of an illegal input sequence, 1580f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or U_SENTINEL if the function is to return with U_INVALID_CHAR_FOUND instead. 1581f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A substitution character can be any valid Unicode code point (up to U+10FFFF) 1582f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except for surrogate code points (U+D800..U+DFFF). 1583f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The recommended value is U+FFFD "REPLACEMENT CHARACTER". 1584f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pNumSubstitutions Output parameter receiving the number of substitutions if subchar>=0. 1585f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to 0 if no substitutions occur or subchar<0. 1586f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pNumSubstitutions can be NULL. 1587f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1588f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1589f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1590f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1591f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1592f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF32 1593f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF32WithSub 1594f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 4.2 1595f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1596f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 1597f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromUTF32WithSub(UChar *dest, 1598f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1599f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1600f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar32 *src, 1601f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1602f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar32 subchar, int32_t *pNumSubstitutions, 1603f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1604f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1605f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1606f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a 16-bit Unicode string to Java Modified UTF-8. 1607f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * See http://java.sun.com/javase/6/docs/api/java/io/DataInput.html#modified-utf-8 1608f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1609f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function behaves according to the documentation for Java DataOutput.writeUTF() 1610f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except that it does not encode the output length in the destination buffer 1611f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and does not have an output length restriction. 1612f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * See http://java.sun.com/javase/6/docs/api/java/io/DataOutput.html#writeUTF(java.lang.String) 1613f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1614f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The input string need not be well-formed UTF-16. 1615f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (Therefore there is no subchar parameter.) 1616f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1617f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1618f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1619f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of chars). If it is 0, then 1620f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1621f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1622f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1623f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1624f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1625f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1626f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1627f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1628f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1629f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1630f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1631f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1632f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1633f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 4.4 1634f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToUTF8WithSub 1635f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromJavaModifiedUTF8WithSub 1636f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1637f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE char* U_EXPORT2 1638f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToJavaModifiedUTF8( 1639f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) char *dest, 1640f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1641f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1642f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UChar *src, 1643f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1644f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1645f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1646f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 1647f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Convert a Java Modified UTF-8 string to a 16-bit Unicode string. 1648f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set. 1649f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1650f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function behaves according to the documentation for Java DataInput.readUTF() 1651f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except that it takes a length parameter rather than 1652f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * interpreting the first two input bytes as the length. 1653f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * See http://java.sun.com/javase/6/docs/api/java/io/DataInput.html#readUTF() 1654f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1655f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The output string may not be well-formed UTF-16. 1656f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 1657f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest A buffer for the result string. The result will be zero-terminated if 1658f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the buffer is large enough. 1659f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity The size of the buffer (number of UChars). If it is 0, then 1660f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * dest may be NULL and the function will only return the length of the 1661f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * result without writing any of the result string (pre-flighting). 1662f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pDestLength A pointer to receive the number of units written to the destination. If 1663f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pDestLength!=NULL then *pDestLength is always set to the 1664f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * number of output units corresponding to the transformation of 1665f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * all the input units, even in case of a buffer overflow. 1666f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src The original source string 1667f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength The length of the original string. If -1, then src must be zero-terminated. 1668f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param subchar The substitution character to use in place of an illegal input sequence, 1669f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or U_SENTINEL if the function is to return with U_INVALID_CHAR_FOUND instead. 1670f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * A substitution character can be any valid Unicode code point (up to U+10FFFF) 1671f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * except for surrogate code points (U+D800..U+DFFF). 1672f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The recommended value is U+FFFD "REPLACEMENT CHARACTER". 1673f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pNumSubstitutions Output parameter receiving the number of substitutions if subchar>=0. 1674f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Set to 0 if no substitutions occur or subchar<0. 1675f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pNumSubstitutions can be NULL. 1676f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode Pointer to a standard ICU error code. Its input value must 1677f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * pass the U_SUCCESS() test, or else the function returns 1678f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * immediately. Check for U_FAILURE() on output or use with 1679f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * function chaining. (See User Guide for details.) 1680f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return The pointer to destination buffer. 1681f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8WithSub 1682f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromUTF8Lenient 1683f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToJavaModifiedUTF8 1684f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @stable ICU 4.4 1685f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 1686f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_STABLE UChar* U_EXPORT2 1687f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromJavaModifiedUTF8WithSub( 1688f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar *dest, 1689f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t destCapacity, 1690f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t *pDestLength, 1691f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const char *src, 1692f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) int32_t srcLength, 1693f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar32 subchar, int32_t *pNumSubstitutions, 1694f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 1695f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 1696f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 1697