/*
*******************************************************************************
*
*   Copyright (C) 1999-2012, International Business Machines
*   Corporation and others.  All Rights Reserved.
*
*******************************************************************************
*   file name:  utf16.h
*   encoding:   US-ASCII
*   tab size:   8 (not used)
*   indentation:4
*
*   created on: 1999sep09
*   created by: Markus W. Scherer
*/

/**
 * \file
 * \brief C API: 16-bit Unicode handling macros
 *
 * This file defines macros to deal with 16-bit Unicode (UTF-16) code units and strings.
 *
 * For more information see utf.h and the ICU User Guide Strings chapter
 * (http://userguide.icu-project.org/strings).
 *
 * <em>Usage:</em>
 * ICU coding guidelines for if() statements should be followed when using these macros.
 * Compound statements (curly braces {}) must be used for if-else-while...
 * bodies and all macro statements should be terminated with a semicolon.
 */

#ifndef __UTF16_H__
#define __UTF16_H__

#include "unicode/umachine.h"
#ifndef __UTF_H__
#   include "unicode/utf.h"
#endif

/* single-code point definitions -------------------------------------------- */

/**
 * Does this code unit alone encode a code point (BMP, not a surrogate)?
 * @param c 16-bit code unit
 * @return TRUE or FALSE
 * @stable ICU 2.4
 */
#define U16_IS_SINGLE(c) !U_IS_SURROGATE(c)

/**
 * Is this code unit a lead surrogate (U+d800..U+dbff)?
 * @param c 16-bit code unit
 * @return TRUE or FALSE
 * @stable ICU 2.4
 */
#define U16_IS_LEAD(c) (((c)&0xfffffc00)==0xd800)

/**
 * Is this code unit a trail surrogate (U+dc00..U+dfff)?
 * @param c 16-bit code unit
 * @return TRUE or FALSE
 * @stable ICU 2.4
 */
#define U16_IS_TRAIL(c) (((c)&0xfffffc00)==0xdc00)

/**
 * Is this code unit a surrogate (U+d800..U+dfff)?
 * @param c 16-bit code unit
 * @return TRUE or FALSE
 * @stable ICU 2.4
 */
#define U16_IS_SURROGATE(c) U_IS_SURROGATE(c)

/**
 * Assuming c is a surrogate code point (U16_IS_SURROGATE(c)),
 * is it a lead surrogate?
 * @param c 16-bit code unit
 * @return TRUE or FALSE
 * @stable ICU 2.4
 */
#define U16_IS_SURROGATE_LEAD(c) (((c)&0x400)==0)

/**
 * Assuming c is a surrogate code point (U16_IS_SURROGATE(c)),
 * is it a trail surrogate?
 * @param c 16-bit code unit
 * @return TRUE or FALSE
 * @stable ICU 4.2
 */
#define U16_IS_SURROGATE_TRAIL(c) (((c)&0x400)!=0)

/**
 * Helper constant for U16_GET_SUPPLEMENTARY.
 * @internal
 */
#define U16_SURROGATE_OFFSET ((0xd800<<10UL)+0xdc00-0x10000)

/**
 * Get a supplementary code point value (U+10000..U+10ffff)
 * from its lead and trail surrogates.
 * The result is undefined if the input values are not
 * lead and trail surrogates.
 *
 * @param lead lead surrogate (U+d800..U+dbff)
 * @param trail trail surrogate (U+dc00..U+dfff)
 * @return supplementary code point (U+10000..U+10ffff)
 * @stable ICU 2.4
 */
#define U16_GET_SUPPLEMENTARY(lead, trail) \
    (((UChar32)(lead)<<10UL)+(UChar32)(trail)-U16_SURROGATE_OFFSET)


/**
 * Get the lead surrogate (0xd800..0xdbff) for a
 * supplementary code point (0x10000..0x10ffff).
 * @param supplementary 32-bit code point (U+10000..U+10ffff)
 * @return lead surrogate (U+d800..U+dbff) for supplementary
 * @stable ICU 2.4
 */
#define U16_LEAD(supplementary) (UChar)(((supplementary)>>10)+0xd7c0)

/**
 * Get the trail surrogate (0xdc00..0xdfff) for a
 * supplementary code point (0x10000..0x10ffff).
 * @param supplementary 32-bit code point (U+10000..U+10ffff)
 * @return trail surrogate (U+dc00..U+dfff) for supplementary
 * @stable ICU 2.4
 */
#define U16_TRAIL(supplementary) (UChar)(((supplementary)&0x3ff)|0xdc00)

/**
 * How many 16-bit code units are used to encode this Unicode code point? (1 or 2)
 * The result is not defined if c is not a Unicode code point (U+0000..U+10ffff).
 * @param c 32-bit code point
 * @return 1 or 2
 * @stable ICU 2.4
 */
#define U16_LENGTH(c) ((uint32_t)(c)<=0xffff ? 1 : 2)

/**
 * The maximum number of 16-bit code units per Unicode code point (U+0000..U+10ffff).
 * @return 2
 * @stable ICU 2.4
 */
#define U16_MAX_LENGTH 2

/**
 * Get a code point from a string at a random-access offset,
 * without changing the offset.
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * The offset may point to either the lead or trail surrogate unit
 * for a supplementary code point, in which case the macro will read
 * the adjacent matching surrogate as well.
 * The result is undefined if the offset points to a single, unpaired surrogate.
 * Iteration through a string is more efficient with U16_NEXT_UNSAFE or U16_NEXT.
 *
 * @param s const UChar * string
 * @param i string offset
 * @param c output UChar32 variable
 * @see U16_GET
 * @stable ICU 2.4
 */
#define U16_GET_UNSAFE(s, i, c) { \
    (c)=(s)[i]; \
    if(U16_IS_SURROGATE(c)) { \
        if(U16_IS_SURROGATE_LEAD(c)) { \
            (c)=U16_GET_SUPPLEMENTARY((c), (s)[(i)+1]); \
        } else { \
            (c)=U16_GET_SUPPLEMENTARY((s)[(i)-1], (c)); \
        } \
    } \
}

/**
 * Get a code point from a string at a random-access offset,
 * without changing the offset.
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * The offset may point to either the lead or trail surrogate unit
 * for a supplementary code point, in which case the macro will read
 * the adjacent matching surrogate as well.
 *
 * The length can be negative for a NUL-terminated string.
 *
 * If the offset points to a single, unpaired surrogate, then that itself
 * will be returned as the code point.
 * Iteration through a string is more efficient with U16_NEXT_UNSAFE or U16_NEXT.
 *
 * @param s const UChar * string
 * @param start starting string offset (usually 0)
 * @param i string offset, must be start<=i<length
 * @param length string length
 * @param c output UChar32 variable
 * @see U16_GET_UNSAFE
 * @stable ICU 2.4
 */
#define U16_GET(s, start, i, length, c) { \
    (c)=(s)[i]; \
    if(U16_IS_SURROGATE(c)) { \
        uint16_t __c2; \
        if(U16_IS_SURROGATE_LEAD(c)) { \
            if((i)+1!=(length) && U16_IS_TRAIL(__c2=(s)[(i)+1])) { \
                (c)=U16_GET_SUPPLEMENTARY((c), __c2); \
            } \
        } else { \
            if((i)>(start) && U16_IS_LEAD(__c2=(s)[(i)-1])) { \
                (c)=U16_GET_SUPPLEMENTARY(__c2, (c)); \
            } \
        } \
    } \
}

/* definitions with forward iteration --------------------------------------- */

/**
 * Get a code point from a string at a code point boundary offset,
 * and advance the offset to the next code point boundary.
 * (Post-incrementing forward iteration.)
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * The offset may point to the lead surrogate unit
 * for a supplementary code point, in which case the macro will read
 * the following trail surrogate as well.
 * If the offset points to a trail surrogate, then that itself
 * will be returned as the code point.
 * The result is undefined if the offset points to a single, unpaired lead surrogate.
 *
 * @param s const UChar * string
 * @param i string offset
 * @param c output UChar32 variable
 * @see U16_NEXT
 * @stable ICU 2.4
 */
#define U16_NEXT_UNSAFE(s, i, c) { \
    (c)=(s)[(i)++]; \
    if(U16_IS_LEAD(c)) { \
        (c)=U16_GET_SUPPLEMENTARY((c), (s)[(i)++]); \
    } \
}

/**
 * Get a code point from a string at a code point boundary offset,
 * and advance the offset to the next code point boundary.
 * (Post-incrementing forward iteration.)
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * The length can be negative for a NUL-terminated string.
 *
 * The offset may point to the lead surrogate unit
 * for a supplementary code point, in which case the macro will read
 * the following trail surrogate as well.
 * If the offset points to a trail surrogate or
 * to a single, unpaired lead surrogate, then that itself
 * will be returned as the code point.
 *
 * @param s const UChar * string
 * @param i string offset, must be i<length
 * @param length string length
 * @param c output UChar32 variable
 * @see U16_NEXT_UNSAFE
 * @stable ICU 2.4
 */
#define U16_NEXT(s, i, length, c) { \
    (c)=(s)[(i)++]; \
    if(U16_IS_LEAD(c)) { \
        uint16_t __c2; \
        if((i)!=(length) && U16_IS_TRAIL(__c2=(s)[(i)])) { \
            ++(i); \
            (c)=U16_GET_SUPPLEMENTARY((c), __c2); \
        } \
    } \
}

/**
 * Append a code point to a string, overwriting 1 or 2 code units.
 * The offset points to the current end of the string contents
 * and is advanced (post-increment).
 * "Unsafe" macro, assumes a valid code point and sufficient space in the string.
 * Otherwise, the result is undefined.
 *
 * @param s UChar * string buffer (written to, so it must not be const)
 * @param i string offset
 * @param c code point to append
 * @see U16_APPEND
 * @stable ICU 2.4
 */
#define U16_APPEND_UNSAFE(s, i, c) { \
    if((uint32_t)(c)<=0xffff) { \
        (s)[(i)++]=(uint16_t)(c); \
    } else { \
        (s)[(i)++]=(uint16_t)(((c)>>10)+0xd7c0); \
        (s)[(i)++]=(uint16_t)(((c)&0x3ff)|0xdc00); \
    } \
}

/**
 * Append a code point to a string, overwriting 1 or 2 code units.
 * The offset points to the current end of the string contents
 * and is advanced (post-increment).
 * "Safe" macro, checks for a valid code point.
 * If a surrogate pair is written, checks for sufficient space in the string.
 * If the code point is not valid or a trail surrogate does not fit,
 * then isError is set to TRUE.
 *
 * @param s UChar * string buffer (written to, so it must not be const)
 * @param i string offset, must be i<capacity
 * @param capacity size of the string buffer
 * @param c code point to append
 * @param isError output UBool set to TRUE if an error occurs, otherwise not modified
 * @see U16_APPEND_UNSAFE
 * @stable ICU 2.4
 */
#define U16_APPEND(s, i, capacity, c, isError) { \
    if((uint32_t)(c)<=0xffff) { \
        (s)[(i)++]=(uint16_t)(c); \
    } else if((uint32_t)(c)<=0x10ffff && (i)+1<(capacity)) { \
        (s)[(i)++]=(uint16_t)(((c)>>10)+0xd7c0); \
        (s)[(i)++]=(uint16_t)(((c)&0x3ff)|0xdc00); \
    } else /* c>0x10ffff or not enough space */ { \
        (isError)=TRUE; \
    } \
}

/**
 * Advance the string offset from one code point boundary to the next.
 * (Post-incrementing iteration.)
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * @param s const UChar * string
 * @param i string offset
 * @see U16_FWD_1
 * @stable ICU 2.4
 */
#define U16_FWD_1_UNSAFE(s, i) { \
    if(U16_IS_LEAD((s)[(i)++])) { \
        ++(i); \
    } \
}

/**
 * Advance the string offset from one code point boundary to the next.
 * (Post-incrementing iteration.)
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * The length can be negative for a NUL-terminated string.
 *
 * @param s const UChar * string
 * @param i string offset, must be i<length
 * @param length string length
 * @see U16_FWD_1_UNSAFE
 * @stable ICU 2.4
 */
#define U16_FWD_1(s, i, length) { \
    if(U16_IS_LEAD((s)[(i)++]) && (i)!=(length) && U16_IS_TRAIL((s)[i])) { \
        ++(i); \
    } \
}

/**
 * Advance the string offset from one code point boundary to the n-th next one,
 * i.e., move forward by n code points.
 * (Post-incrementing iteration.)
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * @param s const UChar * string
 * @param i string offset
 * @param n number of code points to skip
 * @see U16_FWD_N
 * @stable ICU 2.4
 */
#define U16_FWD_N_UNSAFE(s, i, n) { \
    int32_t __N=(n); \
    while(__N>0) { \
        U16_FWD_1_UNSAFE(s, i); \
        --__N; \
    } \
}

/**
 * Advance the string offset from one code point boundary to the n-th next one,
 * i.e., move forward by n code points.
 * (Post-incrementing iteration.)
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * The length can be negative for a NUL-terminated string.
 *
 * @param s const UChar * string
 * @param i int32_t string offset, must be i<length
 * @param length int32_t string length
 * @param n number of code points to skip
 * @see U16_FWD_N_UNSAFE
 * @stable ICU 2.4
 */
#define U16_FWD_N(s, i, length, n) { \
    int32_t __N=(n); \
    while(__N>0 && ((i)<(length) || ((length)<0 && (s)[i]!=0))) { \
        U16_FWD_1(s, i, length); \
        --__N; \
    } \
}

/**
 * Adjust a random-access offset to a code point boundary
 * at the start of a code point.
 * If the offset points to the trail surrogate of a surrogate pair,
 * then the offset is decremented.
 * Otherwise, it is not modified.
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * @param s const UChar * string
 * @param i string offset
 * @see U16_SET_CP_START
 * @stable ICU 2.4
 */
#define U16_SET_CP_START_UNSAFE(s, i) { \
    if(U16_IS_TRAIL((s)[i])) { \
        --(i); \
    } \
}

/**
 * Adjust a random-access offset to a code point boundary
 * at the start of a code point.
 * If the offset points to the trail surrogate of a surrogate pair,
 * then the offset is decremented.
 * Otherwise, it is not modified.
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * @param s const UChar * string
 * @param start starting string offset (usually 0)
 * @param i string offset, must be start<=i
 * @see U16_SET_CP_START_UNSAFE
 * @stable ICU 2.4
 */
#define U16_SET_CP_START(s, start, i) { \
    if(U16_IS_TRAIL((s)[i]) && (i)>(start) && U16_IS_LEAD((s)[(i)-1])) { \
        --(i); \
    } \
}

/* definitions with backward iteration -------------------------------------- */

/**
 * Move the string offset from one code point boundary to the previous one
 * and get the code point between them.
 * (Pre-decrementing backward iteration.)
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * The input offset may be the same as the string length.
 * If the offset is behind a trail surrogate unit
 * for a supplementary code point, then the macro will read
 * the preceding lead surrogate as well.
 * If the offset is behind a lead surrogate, then that itself
 * will be returned as the code point.
 * The result is undefined if the offset is behind a single, unpaired trail surrogate.
 *
 * @param s const UChar * string
 * @param i string offset
 * @param c output UChar32 variable
 * @see U16_PREV
 * @stable ICU 2.4
 */
#define U16_PREV_UNSAFE(s, i, c) { \
    (c)=(s)[--(i)]; \
    if(U16_IS_TRAIL(c)) { \
        (c)=U16_GET_SUPPLEMENTARY((s)[--(i)], (c)); \
    } \
}

/**
 * Move the string offset from one code point boundary to the previous one
 * and get the code point between them.
 * (Pre-decrementing backward iteration.)
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * The input offset may be the same as the string length.
 * If the offset is behind a trail surrogate unit
 * for a supplementary code point, then the macro will read
 * the preceding lead surrogate as well.
 * If the offset is behind a lead surrogate or behind a single, unpaired
 * trail surrogate, then that itself
 * will be returned as the code point.
 *
 * @param s const UChar * string
 * @param start starting string offset (usually 0)
 * @param i string offset, must be start<i
 * @param c output UChar32 variable
 * @see U16_PREV_UNSAFE
 * @stable ICU 2.4
 */
#define U16_PREV(s, start, i, c) { \
    (c)=(s)[--(i)]; \
    if(U16_IS_TRAIL(c)) { \
        uint16_t __c2; \
        if((i)>(start) && U16_IS_LEAD(__c2=(s)[(i)-1])) { \
            --(i); \
            (c)=U16_GET_SUPPLEMENTARY(__c2, (c)); \
        } \
    } \
}

/**
 * Move the string offset from one code point boundary to the previous one.
 * (Pre-decrementing backward iteration.)
 * The input offset may be the same as the string length.
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * @param s const UChar * string
 * @param i string offset
 * @see U16_BACK_1
 * @stable ICU 2.4
 */
#define U16_BACK_1_UNSAFE(s, i) { \
    if(U16_IS_TRAIL((s)[--(i)])) { \
        --(i); \
    } \
}

/**
 * Move the string offset from one code point boundary to the previous one.
 * (Pre-decrementing backward iteration.)
 * The input offset may be the same as the string length.
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * @param s const UChar * string
 * @param start starting string offset (usually 0)
 * @param i string offset, must be start<i
 * @see U16_BACK_1_UNSAFE
 * @stable ICU 2.4
 */
#define U16_BACK_1(s, start, i) { \
    if(U16_IS_TRAIL((s)[--(i)]) && (i)>(start) && U16_IS_LEAD((s)[(i)-1])) { \
        --(i); \
    } \
}

/**
 * Move the string offset from one code point boundary to the n-th one before it,
 * i.e., move backward by n code points.
 * (Pre-decrementing backward iteration.)
 * The input offset may be the same as the string length.
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * @param s const UChar * string
 * @param i string offset
 * @param n number of code points to skip
 * @see U16_BACK_N
 * @stable ICU 2.4
 */
#define U16_BACK_N_UNSAFE(s, i, n) { \
    int32_t __N=(n); \
    while(__N>0) { \
        U16_BACK_1_UNSAFE(s, i); \
        --__N; \
    } \
}

/**
 * Move the string offset from one code point boundary to the n-th one before it,
 * i.e., move backward by n code points.
 * (Pre-decrementing backward iteration.)
 * The input offset may be the same as the string length.
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * @param s const UChar * string
 * @param start start of string
 * @param i string offset, must be start<i
 * @param n number of code points to skip
 * @see U16_BACK_N_UNSAFE
 * @stable ICU 2.4
 */
#define U16_BACK_N(s, start, i, n) { \
    int32_t __N=(n); \
    while(__N>0 && (i)>(start)) { \
        U16_BACK_1(s, start, i); \
        --__N; \
    } \
}

/**
 * Adjust a random-access offset to a code point boundary after a code point.
 * If the offset is behind the lead surrogate of a surrogate pair,
 * then the offset is incremented.
 * Otherwise, it is not modified.
 * The input offset may be the same as the string length.
 * "Unsafe" macro, assumes well-formed UTF-16.
 *
 * @param s const UChar * string
 * @param i string offset
 * @see U16_SET_CP_LIMIT
 * @stable ICU 2.4
 */
#define U16_SET_CP_LIMIT_UNSAFE(s, i) { \
    if(U16_IS_LEAD((s)[(i)-1])) { \
        ++(i); \
    } \
}

/**
 * Adjust a random-access offset to a code point boundary after a code point.
 * If the offset is behind the lead surrogate of a surrogate pair,
 * then the offset is incremented.
 * Otherwise, it is not modified.
 * The input offset may be the same as the string length.
 * "Safe" macro, handles unpaired surrogates and checks for string boundaries.
 *
 * The length can be negative for a NUL-terminated string.
 *
 * @param s const UChar * string
 * @param start int32_t starting string offset (usually 0)
 * @param i int32_t string offset, start<=i<=length
 * @param length int32_t string length
 * @see U16_SET_CP_LIMIT_UNSAFE
 * @stable ICU 2.4
 */
#define U16_SET_CP_LIMIT(s, start, i, length) { \
    if((start)<(i) && ((i)<(length) || (length)<0) && U16_IS_LEAD((s)[(i)-1]) && U16_IS_TRAIL((s)[i])) { \
        ++(i); \
    } \
}

#endif
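
/*
 * Usage sketch (not part of the original header): one way the "safe"
 * forward and backward stepping macros might be used to count code points,
 * following the usage note about braces and terminating semicolons.
 * The EX_* macros and ex_* functions are hypothetical local stand-ins that
 * mirror U16_IS_LEAD, U16_IS_TRAIL, U16_FWD_1 and U16_BACK_1 so the sketch
 * compiles without ICU; real code would include "unicode/utf16.h" and call
 * the U16_* macros directly. uint16_t stands in for UChar here.
 */

```c
#include <stdint.h>

/* Hypothetical stand-ins for the ICU surrogate tests. */
#define EX_IS_LEAD(c)  (((c) & 0xfffffc00) == 0xd800)
#define EX_IS_TRAIL(c) (((c) & 0xfffffc00) == 0xdc00)

/* Same logic as U16_FWD_1: step over one code unit, or over a full pair. */
#define EX_FWD_1(s, i, length) { \
    if(EX_IS_LEAD((s)[(i)++]) && (i)!=(length) && EX_IS_TRAIL((s)[i])) { \
        ++(i); \
    } \
}

/* Same logic as U16_BACK_1: step back one code unit, or over a full pair. */
#define EX_BACK_1(s, start, i) { \
    if(EX_IS_TRAIL((s)[--(i)]) && (i)>(start) && EX_IS_LEAD((s)[(i)-1])) { \
        --(i); \
    } \
}

/* Count code points in s[0..length) by walking forward from offset 0. */
static int32_t ex_count_forward(const uint16_t *s, int32_t length) {
    int32_t i = 0, n = 0;
    while(i < length) {
        EX_FWD_1(s, i, length);
        ++n;
    }
    return n;
}

/* Count code points in s[0..length) by walking backward from the end. */
static int32_t ex_count_backward(const uint16_t *s, int32_t length) {
    int32_t i = length, n = 0;
    while(i > 0) {
        EX_BACK_1(s, 0, i);
        ++n;
    }
    return n;
}
```

/*
 * An unpaired surrogate is stepped over as a single code point in both
 * directions, so the two counts agree even on ill-formed input.
 */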