/*
***************************************************************************
* Copyright (C) 2008-2013, International Business Machines Corporation
* and others. All Rights Reserved.
***************************************************************************
*   file name:  uspoof.h
*   encoding:   US-ASCII
*   tab size:   8 (not used)
*   indentation:4
*
*   created on: 2008Feb13
*   created by: Andy Heninger
*
*   Unicode Spoof Detection
*/

#ifndef USPOOF_H
#define USPOOF_H

#include "unicode/utypes.h"
#include "unicode/uset.h"
#include "unicode/parseerr.h"
#include "unicode/localpointer.h"

#if !UCONFIG_NO_NORMALIZATION


#if U_SHOW_CPLUSPLUS_API
#include "unicode/unistr.h"
#include "unicode/uniset.h"
#endif


/**
 * \file
 * \brief Unicode Security and Spoofing Detection, C API.
 *
 * These functions are intended to check strings, typically
 * identifiers of some type, such as URLs, for the presence of
 * characters that are likely to be visually confusing -
 * for cases where the displayed form of an identifier may
 * not be what it appears to be.
 *
 * Unicode Technical Report #36, http://unicode.org/reports/tr36, and
 * Unicode Technical Standard #39, http://unicode.org/reports/tr39
 * "Unicode security considerations", give more background on
 * security and spoofing issues with Unicode identifiers.
 * The tests and checks provided by this module implement the recommendations
 * from those Unicode documents.
 *
 * The tests available on identifiers fall into two general categories:
 *   -#  Single identifier tests.  Check whether an identifier is
 *       potentially confusable with any other string, or is suspicious
 *       for other reasons.
 *   -#  Two identifier tests.  Check whether two specific identifiers are confusable.
 *       This does not consider whether either of the strings is potentially
 *       confusable with any string other than the exact one specified.
 *
 * The steps to perform confusability testing are
 *   -#  Open a USpoofChecker.
 *   -#  Configure the USpoofChecker for the desired set of tests.  The tests that will
 *       be performed are specified by a set of USpoofChecks flags.
 *   -#  Perform the checks using the pre-configured USpoofChecker.  The results indicate
 *       which (if any) of the selected tests have identified possible problems with the identifier.
 *       Results are reported as a set of USpoofChecks flags;  this mirrors the form in which
 *       the set of tests to perform was originally specified to the USpoofChecker.
 *
 * A USpoofChecker may be used repeatedly to perform checks on any number of identifiers.
 *
 * Thread Safety: The test functions for checking a single identifier, or for testing
 * whether two identifiers are possibly confusable, are thread safe.
 * They may be called concurrently, from multiple threads, using the same USpoofChecker instance.
 *
 * More generally, the standard ICU thread safety rules apply:  functions that take a
 * const USpoofChecker parameter are thread safe.  Those that take a non-const
 * USpoofChecker are not thread safe.
 *
 *
 * Descriptions of the available checks.
 *
 * When testing whether pairs of identifiers are confusable, with the uspoof_areConfusable()
 * family of functions, the relevant tests are
 *
 *   -# USPOOF_SINGLE_SCRIPT_CONFUSABLE:  All of the characters from the two identifiers are
 *      from a single script, and the two identifiers are visually confusable.
 *   -# USPOOF_MIXED_SCRIPT_CONFUSABLE:  At least one of the identifiers contains characters
 *      from more than one script, and the two identifiers are visually confusable.
 *   -# USPOOF_WHOLE_SCRIPT_CONFUSABLE: Each of the two identifiers is of a single script, but
 *      the two identifiers are from different scripts, and they are visually confusable.
 *
 * The safest approach is to enable all three of these checks as a group.
 *
 * USPOOF_ANY_CASE is a modifier for the above tests.
 * If the identifiers being checked can
 * be of mixed case and are used in a case-sensitive manner, this option should be specified.
 *
 * If the identifiers being checked are used in a case-insensitive manner, and if they are
 * displayed to users in lower-case form only, the USPOOF_ANY_CASE option should not be
 * specified.  Confusability issues involving upper case letters will not be reported.
 *
 * When performing tests on a single identifier, with the uspoof_check() family of functions,
 * the relevant tests are:
 *
 *    -# USPOOF_MIXED_SCRIPT_CONFUSABLE: the identifier contains characters from multiple
 *       scripts, and there exists an identifier of a single script that is visually confusable.
 *    -# USPOOF_WHOLE_SCRIPT_CONFUSABLE: the identifier consists of characters from a single
 *       script, and there exists a visually confusable identifier.
 *       The visually confusable identifier also consists of characters from a single script,
 *       but not the same script as the identifier being checked.
 *    -# USPOOF_ANY_CASE: modifies the mixed script and whole script confusables tests.  If
 *       specified, the checks will consider confusable characters of any case.  If this flag is not
 *       set, the test is performed assuming case folded identifiers.
 *    -# USPOOF_SINGLE_SCRIPT: check that the identifier contains only characters from a
 *       single script.  (Characters from the 'common' and 'inherited' scripts are ignored.)
 *       This is not a test for confusable identifiers.
 *    -# USPOOF_INVISIBLE: check an identifier for the presence of invisible characters,
 *       such as zero-width spaces, or character sequences that are
 *       likely not to display, such as multiple occurrences of the same
 *       non-spacing mark.  This check does not test the input string as a whole
 *       for conformance to any particular syntax for identifiers.
 *    -# USPOOF_CHAR_LIMIT: check that an identifier contains only characters from a specified set
 *       of acceptable characters.  See uspoof_setAllowedChars() and
 *       uspoof_setAllowedLocales().
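 *
 * Because check selections and check results share the USpoofChecks flag representation,
 * both can be handled with plain bit operations. The fragment below is only a sketch and
 * is not part of the ICU API: the sketch_ and SKETCH_ names are hypothetical, and the
 * constants are local copies of the USpoofChecks values so that the fragment compiles
 * without ICU.
 *
```c
#include <stdint.h>

// Local copies of USpoofChecks values (assumption: kept in sync with this
// header). Real code would include unicode/uspoof.h and use USPOOF_* names.
enum {
    SKETCH_MIXED_SCRIPT_CONFUSABLE = 2,
    SKETCH_WHOLE_SCRIPT_CONFUSABLE = 4,
    SKETCH_ANY_CASE                = 8,
    SKETCH_INVISIBLE               = 32,
    SKETCH_CHAR_LIMIT              = 64
};

// Build a selection for single-identifier checks: both confusable tests and
// the invisible-character test, plus the any-case modifier when identifiers
// are used in a case-sensitive manner.
int32_t sketch_single_id_checks(int case_sensitive) {
    int32_t checks = SKETCH_MIXED_SCRIPT_CONFUSABLE
                   | SKETCH_WHOLE_SCRIPT_CONFUSABLE
                   | SKETCH_INVISIBLE;
    if (case_sensitive) {
        checks |= SKETCH_ANY_CASE;
    }
    return checks;
}

// Return nonzero if the given test is flagged in a check result value.
int sketch_failed(int32_t result, int32_t check) {
    return (result & check) != 0;
}
```
 *
 * With the real API, a value like the one built by sketch_single_id_checks() would be
 * supplied when configuring the checker, and sketch_failed() would be applied to the
 * value returned by the uspoof_check() family of functions.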
 *
 *  Note on Scripts:
 *     Characters from the Unicode Scripts "Common" and "Inherited" are ignored when considering
 *     the script of an identifier. Common characters include digits and symbols that
 *     are normally used with text from more than one script.
 *
 *  Identifier Skeletons:  A skeleton is a transformation of an identifier, such that
 *  all identifiers that are confusable with each other have the same skeleton.
 *  Using skeletons, it is possible to build a dictionary data structure for
 *  a set of identifiers, and then quickly test whether a new identifier is
 *  confusable with an identifier already in the set.  The uspoof_getSkeleton()
 *  family of functions will produce the skeleton from an identifier.
 *
 *  Note that skeletons are not guaranteed to be stable between versions
 *  of Unicode or ICU, so an application should not rely on creating a permanent,
 *  or difficult to update, database of skeletons.
 *  Instabilities result from
 *  identifying new pairs or sequences of characters that are visually
 *  confusable, and thus must be mapped to the same skeleton character(s).
 *
 */

struct USpoofChecker;
typedef struct USpoofChecker USpoofChecker; /**< typedef for C of USpoofChecker */

/**
 * Enum for the kinds of checks that USpoofChecker can perform.
 * These enum values are used both to select the set of checks that
 * will be performed, and to report results from the check function.
 *
 * @stable ICU 4.2
 */
typedef enum USpoofChecks {
    /**   Single script confusable test.
      *   When testing whether two identifiers are confusable, report that they are if
      *   both are from the same script and they are visually confusable.
      *   Note: this test is not applicable to a check of a single identifier.
      */
    USPOOF_SINGLE_SCRIPT_CONFUSABLE =   1,

    /** Mixed script confusable test.
     *  When checking a single identifier, report a problem if
     *    the identifier contains multiple scripts, and
     *    is confusable with some other identifier in a single script.
     *  When testing whether two identifiers are confusable, report that they are if
     *    the two IDs are visually confusable,
     *    and at least one contains characters from more than one script.
     */
    USPOOF_MIXED_SCRIPT_CONFUSABLE  =   2,

    /** Whole script confusable test.
     *  When checking a single identifier, report a problem if
     *    the identifier is of a single script, and
     *    there exists a confusable identifier in another script.
     *  When testing whether two identifiers are confusable, report that they are if
     *    each is of a single script,
     *    the scripts of the two identifiers are different, and
     *    the identifiers are visually confusable.
     */
    USPOOF_WHOLE_SCRIPT_CONFUSABLE  =   4,

    /** Any Case Modifier for confusable identifier tests.
        If specified, consider all characters, of any case, when looking for confusables.
        If USPOOF_ANY_CASE is not specified, identifiers being checked are assumed to have been
        case folded.  Upper case confusable characters will not be checked.
        Selects between Lower Case Confusable and
        Any Case Confusable.   */
    USPOOF_ANY_CASE                 =   8,

    /**
      * Check that an identifier is no looser than the specified RestrictionLevel.
      * The default if uspoof_setRestrictionLevel() is not called is HIGHLY_RESTRICTIVE.
      *
      * If USPOOF_AUX_INFO is enabled the actual restriction level of the
      * identifier being tested will also be returned by uspoof_check().
      *
      * @see URestrictionLevel
      * @see uspoof_setRestrictionLevel
      * @see USPOOF_AUX_INFO
      *
      * @stable ICU 51
      */
    USPOOF_RESTRICTION_LEVEL        =  16,

#ifndef U_HIDE_DEPRECATED_API
    /** Check that an identifier contains only characters from a
      * single script (plus chars from the common and inherited scripts.)
      * Applies to checks of a single identifier only.
      * @deprecated ICU 51  Use RESTRICTION_LEVEL instead.
      */
    USPOOF_SINGLE_SCRIPT            =  USPOOF_RESTRICTION_LEVEL,
#endif  /* U_HIDE_DEPRECATED_API */

    /** Check an identifier for the presence of invisible characters,
      * such as zero-width spaces, or character sequences that are
      * likely not to display, such as multiple occurrences of the same
      * non-spacing mark.  This check does not test the input string as a whole
      * for conformance to any particular syntax for identifiers.
      */
    USPOOF_INVISIBLE                =  32,

    /** Check that an identifier contains only characters from a specified set
      * of acceptable characters.  See uspoof_setAllowedChars() and
      * uspoof_setAllowedLocales().
      */
    USPOOF_CHAR_LIMIT               =  64,

#ifndef U_HIDE_DRAFT_API
    /**
     * Check that an identifier does not include decimal digits from
     * more than one numbering system.
     *
     * @draft ICU 51
     */
    USPOOF_MIXED_NUMBERS            = 128,
#endif /* U_HIDE_DRAFT_API */

    /**
     * Enable all spoof checks.
     *
     * @stable ICU 4.6
     */
    USPOOF_ALL_CHECKS               = 0xFFFF,

#ifndef U_HIDE_DRAFT_API
    /**
      * Enable the return of auxiliary (non-error) information in the
      * upper bits of the check results value.
      *
      * If this "check" is not enabled, the results of uspoof_check() will be zero when an
      * identifier passes all of the enabled checks.
      *
      * If this "check" is enabled, (uspoof_check() & USPOOF_ALL_CHECKS) will be zero
      * when an identifier passes all checks.
      *
      * @draft ICU 51
      */
    USPOOF_AUX_INFO                 = 0x40000000
#endif  /* U_HIDE_DRAFT_API */

    } USpoofChecks;


#ifndef U_HIDE_DRAFT_API
    /**
     * Constants from UTS #39 for use in uspoof_setRestrictionLevel(), and
     * for returned identifier restriction levels in check results.
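     *
     * When USPOOF_AUX_INFO is enabled, a check result carries both the failed checks
     * (low bits) and a restriction level (upper bits). A minimal sketch of separating the
     * two follows. It is not part of the ICU API: the sketch_ and SKETCH_ names are
     * hypothetical, SKETCH_ALL_CHECKS is a local copy of USPOOF_ALL_CHECKS, and
     * SKETCH_LEVEL_MASK is an assumed mask chosen to cover the URestrictionLevel values.
     *
```c
#include <stdint.h>

// Local copy of USPOOF_ALL_CHECKS, so that the sketch compiles without ICU.
#define SKETCH_ALL_CHECKS 0xFFFF
// Assumed mask for the restriction level bits; it covers every
// URestrictionLevel value, 0x10000000 through 0x50000000.
#define SKETCH_LEVEL_MASK 0x7F000000

// Failed checks: zero when the identifier passed every enabled check.
int32_t sketch_failed_checks(int32_t result) {
    return result & SKETCH_ALL_CHECKS;
}

// Restriction level bits; meaningful only if auxiliary info was requested.
int32_t sketch_restriction_level(int32_t result) {
    return result & SKETCH_LEVEL_MASK;
}
```
     *
     * The value from sketch_restriction_level() would then be compared against the
     * URestrictionLevel constants.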
     * @draft ICU 51
     */
    typedef enum URestrictionLevel {
        /**
         * Only ASCII characters: U+0000..U+007F
         *
         * @draft ICU 51
         */
        USPOOF_ASCII = 0x10000000,
        /**
          * All characters in each identifier must be from a single script, or from the combinations: Latin + Han +
          * Hiragana + Katakana; Latin + Han + Bopomofo; or Latin + Han + Hangul. Note that this level will satisfy the
          * vast majority of Latin-script users; also that TR36 has ASCII instead of Latin.
          *
          * @draft ICU 51
          */
        USPOOF_HIGHLY_RESTRICTIVE = 0x20000000,
        /**
         * Allow Latin with other scripts except Cyrillic, Greek, Cherokee. Otherwise, the same as Highly Restrictive.
         *
         * @draft ICU 51
         */
        USPOOF_MODERATELY_RESTRICTIVE = 0x30000000,
        /**
         * Allow arbitrary mixtures of scripts. Otherwise, the same as Moderately Restrictive.
         *
         * @draft ICU 51
         */
        USPOOF_MINIMALLY_RESTRICTIVE = 0x40000000,
        /**
         * Any valid identifiers, including characters outside of the Identifier Profile.
         *
         * @draft ICU 51
         */
        USPOOF_UNRESTRICTIVE = 0x50000000
    } URestrictionLevel;
#endif /* U_HIDE_DRAFT_API */

/**
 *  Create a Unicode Spoof Checker, configured to perform all
 *  checks except for USPOOF_LOCALE_LIMIT and USPOOF_CHAR_LIMIT.
 *  Note that additional checks may be added in the future,
 *  resulting in changes to the default checking behavior.
 *
 *  @param status  The error code, set if this function encounters a problem.
 *  @return        the newly created Spoof Checker
 *  @stable ICU 4.2
 */
U_STABLE USpoofChecker * U_EXPORT2
uspoof_open(UErrorCode *status);


/**
 * Open a Spoof checker from its serialized form, stored in 32-bit-aligned memory.
 * Inverse of uspoof_serialize().
 * The memory containing the serialized data must remain valid and unchanged
 * as long as the spoof checker, or any cloned copies of the spoof checker,
 * are in use.  Ownership of the memory remains with the caller.
 * The spoof checker (and any clones) must be closed prior to deleting the
 * serialized data.
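 *
 * Since the data must be 32-bit aligned, a caller that obtains the buffer from an
 * arbitrary source may want to verify the alignment first. A minimal sketch, using
 * a hypothetical helper name that is not part of the ICU API:
 *
```c
#include <stdint.h>

// Return nonzero if p lies on a 4-byte (32-bit) boundary.
int sketch_is_aligned32(const void *p) {
    return ((uintptr_t)p & 3u) == 0;
}
```
 *
 * A buffer declared as an array of uint32_t satisfies this requirement by construction.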
 *
 * @param data a pointer to 32-bit-aligned memory containing the serialized form of spoof data
 * @param length the number of bytes available at data;
 *               can be more than necessary
 * @param pActualLength receives the actual number of bytes at data taken up by the data;
 *                      can be NULL
 * @param pErrorCode ICU error code
 * @return the spoof checker.
 *
 * @see uspoof_open
 * @see uspoof_serialize
 * @stable ICU 4.2
 */
U_STABLE USpoofChecker * U_EXPORT2
uspoof_openFromSerialized(const void *data, int32_t length, int32_t *pActualLength,
                          UErrorCode *pErrorCode);

/**
 * Open a Spoof Checker from the source form of the spoof data.
 * The two inputs correspond to the Unicode data files confusables.txt
 * and confusablesWholeScript.txt, as described in Unicode UAX #39.
 * The syntax of the source data is as described in UAX #39 for
 * these files, and the content of these files is acceptable input.
 *
 * The character encoding of the (char *) input text is UTF-8.
 *
 * @param confusables  a pointer to the confusable characters definitions,
 *                     as found in file confusables.txt from unicode.org.
 * @param confusablesLen The length of the confusables text, or -1 if the
 *                     input string is zero terminated.
 * @param confusablesWholeScript
 *                     a pointer to the whole script confusables definitions,
 *                     as found in the file confusablesWholeScript.txt from unicode.org.
 * @param confusablesWholeScriptLen The length of the whole script confusables text, or
 *                     -1 if the input string is zero terminated.
 * @param errType      In the event of an error in the input, indicates
 *                     which of the input files contains the error.
 *                     The value is one of USPOOF_SINGLE_SCRIPT_CONFUSABLE or
 *                     USPOOF_WHOLE_SCRIPT_CONFUSABLE, or
 *                     zero if no errors are found.
 * @param pe           In the event of an error in the input, receives the position
 *                     in the input text (line, offset) of the error.
 * @param status       an in/out ICU UErrorCode.  Among the possible errors is
 *                     U_PARSE_ERROR, which is used to report syntax errors
 *                     in the input.
 * @return             A spoof checker that uses the rules from the input files.
 * @stable ICU 4.2
 */
U_STABLE USpoofChecker * U_EXPORT2
uspoof_openFromSource(const char *confusables, int32_t confusablesLen,
                      const char *confusablesWholeScript, int32_t confusablesWholeScriptLen,
                      int32_t *errType, UParseError *pe, UErrorCode *status);


/**
 * Close a Spoof Checker, freeing any memory that was being held by
 * its implementation.
 * @stable ICU 4.2
 */
U_STABLE void U_EXPORT2
uspoof_close(USpoofChecker *sc);

#if U_SHOW_CPLUSPLUS_API

U_NAMESPACE_BEGIN

/**
 * \class LocalUSpoofCheckerPointer
 * "Smart pointer" class, closes a USpoofChecker via uspoof_close().
 * For most methods see the LocalPointerBase base class.
 *
 * @see LocalPointerBase
 * @see LocalPointer
 * @stable ICU 4.4
 */
U_DEFINE_LOCAL_OPEN_POINTER(LocalUSpoofCheckerPointer, USpoofChecker, uspoof_close);

U_NAMESPACE_END

#endif

/**
 * Clone a Spoof Checker.  The clone will be set to perform the same checks
 * as the original source.
 *
 * @param sc       The source USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         The newly created clone of the Spoof Checker.
 * @stable ICU 4.2
 */
U_STABLE USpoofChecker * U_EXPORT2
uspoof_clone(const USpoofChecker *sc, UErrorCode *status);


/**
 * Specify the set of checks that will be performed by the check
 * functions of this Spoof Checker.
 *
 * @param sc       The USpoofChecker
 * @param checks   The set of checks that this spoof checker will perform.
 *                 The value is a bit set, obtained by OR-ing together
 *                 values from enum USpoofChecks.
 * @param status   The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 *
 */
U_STABLE void U_EXPORT2
uspoof_setChecks(USpoofChecker *sc, int32_t checks, UErrorCode *status);

/**
 * Get the set of checks that this Spoof Checker has been configured to perform.
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         The set of checks that this spoof checker will perform.
 *                 The value is a bit set, obtained by OR-ing together
 *                 values from enum USpoofChecks.
 * @stable ICU 4.2
 *
 */
U_STABLE int32_t U_EXPORT2
uspoof_getChecks(const USpoofChecker *sc, UErrorCode *status);

#ifndef U_HIDE_DRAFT_API
/**
 * Set the loosest restriction level allowed. If this function is not
 * called, the default is USPOOF_HIGHLY_RESTRICTIVE.
 * Calling this function also enables the USPOOF_RESTRICTION_LEVEL check.
 * @param sc               The USpoofChecker
 * @param restrictionLevel The loosest restriction level allowed.
 * @see URestrictionLevel
 * @draft ICU 51
 */
U_DRAFT void U_EXPORT2
uspoof_setRestrictionLevel(USpoofChecker *sc, URestrictionLevel restrictionLevel);


/**
 * Get the Restriction Level that will be tested if the checks include RESTRICTION_LEVEL.
 *
 * @return The restriction level
 * @see URestrictionLevel
 * @draft ICU 51
 */
U_DRAFT URestrictionLevel U_EXPORT2
uspoof_getRestrictionLevel(const USpoofChecker *sc);
#endif /* U_HIDE_DRAFT_API */

/**
 * Limit characters that are acceptable in identifiers being checked to those
 * normally used with the languages associated with the specified locales.
 * Any previously specified list of locales is replaced by the new settings.
 *
 * A set of languages is determined from the locale(s), and
 * from those a set of acceptable Unicode scripts is determined.
 * Characters from this set of scripts, along with characters from
 * the "common" and "inherited" Unicode Script categories
 * will be permitted.
 *
 * Supplying an empty string removes all restrictions;
 * characters from any script will be allowed.
 *
 * The USPOOF_CHAR_LIMIT test is automatically enabled for this
 * USpoofChecker when calling this function with a non-empty list
 * of locales.
 *
 * The Unicode Set of characters that will be allowed is accessible
 * via the uspoof_getAllowedChars() function.  uspoof_setAllowedLocales()
 * will <i>replace</i> any previously applied set of allowed characters.
 *
 * Adjustments, such as additions or deletions of certain classes of characters,
 * can be made to the result of uspoof_setAllowedLocales() by
 * fetching the resulting set with uspoof_getAllowedChars(),
 * manipulating it with the Unicode Set API, then resetting the
 * spoof detector's limits with uspoof_setAllowedChars().
 *
 * @param sc           The USpoofChecker
 * @param localesList  A list of locales, from which the language
 *                     and associated script are extracted.  The locales
 *                     are comma-separated if there is more than one.
 *                     White space may not appear within an individual locale,
 *                     but is ignored otherwise.
 *                     The locales are syntactically like those from the
 *                     HTTP Accept-Language header.
 *                     If the localesList is empty, no restrictions will be placed on
 *                     the allowed characters.
 *
 * @param status       The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 */
U_STABLE void U_EXPORT2
uspoof_setAllowedLocales(USpoofChecker *sc, const char *localesList, UErrorCode *status);

/**
 * Get a list of locales for the scripts that are acceptable in strings
 * to be checked.  If no limitations on scripts have been specified,
 * an empty string will be returned.
 *
 * uspoof_setAllowedChars() will reset the list of allowed locales to be empty.
 *
 * The format of the returned list is the same as that supplied to
 * uspoof_setAllowedLocales(), but the returned list may not be identical
 * to the originally specified string; the string may be reformatted,
 * and information other than languages from
 * the originally specified locales may be omitted.
 *
 * @param sc           The USpoofChecker
 * @param status       The error code, set if this function encounters a problem.
 * @return             A string containing a list of locales corresponding
 *                     to the acceptable scripts, formatted like an
 *                     HTTP Accept-Language value.
 *
 * @stable ICU 4.2
 */
U_STABLE const char * U_EXPORT2
uspoof_getAllowedLocales(USpoofChecker *sc, UErrorCode *status);


/**
 * Limit the acceptable characters to those specified by a Unicode Set.
 * Any previously specified character limit
 * is replaced by the new settings.  This includes limits on
 * characters that were set with the uspoof_setAllowedLocales() function.
 *
 * The USPOOF_CHAR_LIMIT test is automatically enabled for this
 * USpoofChecker by this function.
 *
 * @param sc       The USpoofChecker
 * @param chars    A Unicode Set containing the list of
 *                 characters that are permitted.  Ownership of the set
 *                 remains with the caller.  The incoming set is cloned by
 *                 this function, so there are no restrictions on modifying
 *                 or deleting the USet after calling this function.
 * @param status   The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 */
U_STABLE void U_EXPORT2
uspoof_setAllowedChars(USpoofChecker *sc, const USet *chars, UErrorCode *status);


/**
 * Get a USet for the characters permitted in an identifier.
 * This corresponds to the limits imposed by the Set Allowed Characters
 * functions. Limitations imposed by other checks will not be
 * reflected in the set returned by this function.
 *
 * The returned set will be frozen, meaning that it cannot be modified
 * by the caller.
 *
 * Ownership of the returned set remains with the Spoof Detector.  The
 * returned set will become invalid if the spoof detector is closed,
 * or if a new set of allowed characters is specified.
 *
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         A USet containing the characters that are permitted by
 *                 the USPOOF_CHAR_LIMIT test.
 * @stable ICU 4.2
 */
U_STABLE const USet * U_EXPORT2
uspoof_getAllowedChars(const USpoofChecker *sc, UErrorCode *status);


#if U_SHOW_CPLUSPLUS_API
/**
 * Limit the acceptable characters to those specified by a Unicode Set.
 * Any previously specified character limit
 * is replaced by the new settings.  This includes limits on
 * characters that were set with the uspoof_setAllowedLocales() function.
 *
 * The USPOOF_CHAR_LIMIT test is automatically enabled for this
 * USpoofChecker by this function.
 *
 * @param sc       The USpoofChecker
 * @param chars    A Unicode Set containing the list of
 *                 characters that are permitted.  Ownership of the set
 *                 remains with the caller.
 *                 The incoming set is cloned by
 *                 this function, so there are no restrictions on modifying
 *                 or deleting the UnicodeSet after calling this function.
 * @param status   The error code, set if this function encounters a problem.
 * @stable ICU 4.2
 */
U_STABLE void U_EXPORT2
uspoof_setAllowedUnicodeSet(USpoofChecker *sc, const icu::UnicodeSet *chars, UErrorCode *status);


/**
 * Get a UnicodeSet for the characters permitted in an identifier.
 * This corresponds to the limits imposed by the Set Allowed Characters /
 * UnicodeSet functions. Limitations imposed by other checks will not be
 * reflected in the set returned by this function.
 *
 * The returned set will be frozen, meaning that it cannot be modified
 * by the caller.
 *
 * Ownership of the returned set remains with the Spoof Detector.
 * The returned set will become invalid if the spoof detector is closed,
 * or if a new set of allowed characters is specified.
 *
 *
 * @param sc       The USpoofChecker
 * @param status   The error code, set if this function encounters a problem.
 * @return         A UnicodeSet containing the characters that are permitted by
 *                 the USPOOF_CHAR_LIMIT test.
 * @stable ICU 4.2
 */
U_STABLE const icu::UnicodeSet * U_EXPORT2
uspoof_getAllowedUnicodeSet(const USpoofChecker *sc, UErrorCode *status);
#endif


/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * @param sc      The USpoofChecker
 * @param id      The identifier to be checked for possible security issues,
 *                in UTF-16 format.
 * @param length  the length of the string to be checked, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                zero terminated.
 * @param position  An out parameter.
 *                Originally, the index of the first string position that failed a check.
 *                Now, always returns zero.
 *                This parameter may be null.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.
 *                (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_check(const USpoofChecker *sc,
             const UChar *id, int32_t length,
             int32_t *position,
             UErrorCode *status);


/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * @param sc      The USpoofChecker
 * @param id      An identifier to be checked for possible security issues, in UTF-8 format.
 * @param length  the length of the string to be checked, or -1 if the string is
 *                zero terminated.
 * @param position An out parameter.
 *                Originally, the index of the first string position that failed a check.
 *                Now, always returns zero.
 *                This parameter may be null.
 *                Use of this parameter is deprecated as of ICU 51.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 *                If the input contains invalid UTF-8 sequences,
 *                a status of U_INVALID_CHAR_FOUND will be returned.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_checkUTF8(const USpoofChecker *sc,
                 const char *id, int32_t length,
                 int32_t *position,
                 UErrorCode *status);


#if U_SHOW_CPLUSPLUS_API
/**
 * Check the specified string for possible security issues.
 * The text to be checked will typically be an identifier of some sort.
 * The set of checks to be performed is specified with uspoof_setChecks().
 *
 * @param sc      The USpoofChecker
 * @param id      An identifier to be checked for possible security issues.
 * @param position An out parameter.
 *                Originally, the index of the first string position that failed a check.
 *                Now, always returns zero.
 *                This parameter may be null.
 *                Use of this parameter is deprecated as of ICU 51.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Spoofing or security issues detected with the input string are
 *                not reported here, but through the function's return value.
 * @return        An integer value with bits set for any potential security
 *                or spoofing issues detected.  The bits are defined by
 *                enum USpoofChecks.  (returned_value & USPOOF_ALL_CHECKS)
 *                will be zero if the input string passes all of the
 *                enabled checks.
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_checkUnicodeString(const USpoofChecker *sc,
                          const icu::UnicodeString &id,
                          int32_t *position,
                          UErrorCode *status);

#endif


/**
 * Check whether two specified strings are visually confusable.
 * The types of confusability to be tested - single script, mixed script,
 * or whole script - are determined by the check options set for the
 * USpoofChecker.
 *
 * The tests to be performed are controlled by the flags
 *   USPOOF_SINGLE_SCRIPT_CONFUSABLE
 *   USPOOF_MIXED_SCRIPT_CONFUSABLE
 *   USPOOF_WHOLE_SCRIPT_CONFUSABLE
 * At least one of these tests must be selected.
 *
 * USPOOF_ANY_CASE is a modifier for the tests.  Select it if the identifiers
 *   may be of mixed case.
 * If identifiers are case folded for comparison and
 * display to the user, do not select the USPOOF_ANY_CASE option.
 *
 *
 * @param sc      The USpoofChecker
 * @param id1     The first of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-16 format.
 * @param length1 the length of the first identifier, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                nul terminated.
 * @param id2     The second of the two identifiers to be compared for
 *                confusability.  The identifiers are in UTF-16 format.
 * @param length2 The length of the second identifier, expressed in
 *                16 bit UTF-16 code units, or -1 if the string is
 *                nul terminated.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the identifiers is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.  Zero is returned if the identifiers
 *                are not confusable.
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_areConfusable(const USpoofChecker *sc,
                     const UChar *id1, int32_t length1,
                     const UChar *id2, int32_t length2,
                     UErrorCode *status);



/**
 * Check whether two specified strings are visually confusable.
 * The types of confusability to be tested - single script, mixed script,
 * or whole script - are determined by the check options set for the
 * USpoofChecker.
 *
 * @param sc      The USpoofChecker
 * @param id1     The first of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-8 format.
 * @param length1 the length of the first identifier, in bytes, or -1
 *                if the string is nul terminated.
 * @param id2     The second of the two identifiers to be compared for
 *                confusability.  The strings are in UTF-8 format.
 * @param length2 The length of the second string in bytes, or -1
 *                if the string is nul terminated.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the strings is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.  Zero is returned if the strings
 *                are not confusable.
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_areConfusableUTF8(const USpoofChecker *sc,
                         const char *id1, int32_t length1,
                         const char *id2, int32_t length2,
                         UErrorCode *status);




#if U_SHOW_CPLUSPLUS_API
/**
 * Check whether two specified strings are visually confusable.
 * The types of confusability to be tested - single script, mixed script,
 * or whole script - are determined by the check options set for the
 * USpoofChecker.
 *
 * @param sc      The USpoofChecker
 * @param s1      The first of the two identifiers to be compared for
 *                confusability.
 * @param s2      The second of the two identifiers to be compared for
 *                confusability.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 *                Confusability of the identifiers is not reported here,
 *                but through this function's return value.
 * @return        An integer value with bit(s) set corresponding to
 *                the type of confusability found, as defined by
 *                enum USpoofChecks.  Zero is returned if the identifiers
 *                are not confusable.
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_areConfusableUnicodeString(const USpoofChecker *sc,
                                  const icu::UnicodeString &s1,
                                  const icu::UnicodeString &s2,
                                  UErrorCode *status);
#endif


/**
 *  Get the "skeleton" for an identifier.
 *  Skeletons are a transformation of the input identifier;
 *  two identifiers are confusable if their skeletons are identical.
 *  See Unicode UAX #39 for additional information.
 *
 *  Using skeletons directly makes it possible to quickly check
 *  whether an identifier is confusable with any of some large
 *  set of existing identifiers, by creating an efficiently
 *  searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker
 * @param type    The type of skeleton, corresponding to which
 *                of the Unicode confusable data tables to use.
 *                The default is Mixed-Script, Lowercase.
 *                Allowed options are USPOOF_SINGLE_SCRIPT_CONFUSABLE and
 *                USPOOF_ANY_CASE.  The two flags may be ORed.
 * @param id      The input identifier whose skeleton will be computed.
 * @param length  The length of the input identifier, expressed in 16 bit
 *                UTF-16 code units, or -1 if the string is zero terminated.
 * @param dest    The output buffer, to receive the skeleton string.
 * @param destCapacity  The length of the output buffer, in 16 bit units.
 *                The destCapacity may be zero, in which case the function will
 *                return the actual length of the skeleton.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 * @return        The length of the skeleton string.
 *                The returned length is always that of the complete skeleton,
 *                even when the supplied buffer is too small (or of zero length).
 *
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_getSkeleton(const USpoofChecker *sc,
                   uint32_t type,
                   const UChar *id,  int32_t length,
                   UChar *dest, int32_t destCapacity,
                   UErrorCode *status);

/**
 *  Get the "skeleton" for an identifier.
 *  Skeletons are a transformation of the input identifier;
 *  two identifiers are confusable if their skeletons are identical.
 *  See Unicode UAX #39 for additional information.
 *
 *  Using skeletons directly makes it possible to quickly check
 *  whether an identifier is confusable with any of some large
 *  set of existing identifiers, by creating an efficiently
 *  searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker
 * @param type    The type of skeleton, corresponding to which
 *                of the Unicode confusable data tables to use.
 *                The default is Mixed-Script, Lowercase.
 *                Allowed options are USPOOF_SINGLE_SCRIPT_CONFUSABLE and
 *                USPOOF_ANY_CASE.  The two flags may be ORed.
 * @param id      The UTF-8 format identifier whose skeleton will be computed.
 * @param length  The length of the input string, in bytes,
 *                or -1 if the string is zero terminated.
 * @param dest    The output buffer, to receive the skeleton string.
 * @param destCapacity  The length of the output buffer, in bytes.
 *                The destCapacity may be zero, in which case the function will
 *                return the actual length of the skeleton.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.  Possible errors include U_INVALID_CHAR_FOUND
 *                for invalid UTF-8 sequences, and
 *                U_BUFFER_OVERFLOW_ERROR if the destination buffer is too small
 *                to hold the complete skeleton.
 * @return        The length of the skeleton string, in bytes.  The returned length
 *                is always that of the complete skeleton, even when the
 *                supplied buffer is too small (or of zero length).
 *
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_getSkeletonUTF8(const USpoofChecker *sc,
                       uint32_t type,
                       const char *id,  int32_t length,
                       char *dest, int32_t destCapacity,
                       UErrorCode *status);

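/*
 * Usage sketch (illustrative only, not part of the ICU API). The skeleton
 * functions above return the length of the complete skeleton even when the
 * destination buffer is too small, which allows a two-call "preflight"
 * pattern: ask for the length with a zero-capacity buffer, allocate, then
 * call again. The block below is a minimal, self-contained sketch of that
 * pattern. It assumes a toy stand-in, demo_skeleton_utf8(), and hypothetical
 * DEMO_* status codes so that it builds without ICU; the stand-in's character
 * mapping is invented for the example and is not the UAX #39 data.
 *
```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical status codes standing in for UErrorCode values.
enum { DEMO_OK = 0, DEMO_BUFFER_OVERFLOW = 1 };

// Toy stand-in for a skeleton function: lowercases ASCII and folds the
// digits '1' and '0' onto 'l' and 'o'. It always returns the full skeleton
// length, and reports overflow when dest cannot hold the skeleton plus a
// terminating NUL (a simplification of the documented contract).
static int demo_skeleton_utf8(const char *id, char *dest, int destCapacity,
                              int *status) {
    int len = (int)strlen(id);
    if (destCapacity <= len) {
        *status = DEMO_BUFFER_OVERFLOW;
        return len;                 // full length, usable for preflighting
    }
    for (int i = 0; i < len; i++) {
        unsigned char c = (unsigned char)id[i];
        if (c == '1') {
            c = 'l';
        } else if (c == '0') {
            c = 'o';
        } else {
            c = (unsigned char)tolower(c);
        }
        dest[i] = (char)c;
    }
    dest[len] = '\0';
    *status = DEMO_OK;
    return len;
}

// Two-call pattern: preflight with capacity 0 to learn the length,
// allocate, then call again to fill the buffer.
// Returns a malloc'ed string the caller must free, or NULL on failure.
static char *skeleton_alloc(const char *id) {
    assert(id != NULL);
    int status = DEMO_OK;
    int needed = demo_skeleton_utf8(id, NULL, 0, &status);
    if (status != DEMO_BUFFER_OVERFLOW) {
        return NULL;                // the preflight call is expected to overflow
    }
    char *buf = malloc((size_t)needed + 1);
    if (buf == NULL) {
        return NULL;
    }
    status = DEMO_OK;               // reset the status before the second call
    demo_skeleton_utf8(id, buf, needed + 1, &status);
    if (status != DEMO_OK) {
        free(buf);
        return NULL;
    }
    return buf;
}
```
 *
 * Two identifiers are then treated as confusable when strcmp() on their
 * skeletons returns 0. With real ICU the same shape should carry over to
 * uspoof_getSkeletonUTF8(), with the UErrorCode reset to U_ZERO_ERROR
 * between the two calls.
 */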
#if U_SHOW_CPLUSPLUS_API
/**
 *  Get the "skeleton" for an identifier.
 *  Skeletons are a transformation of the input identifier;
 *  two identifiers are confusable if their skeletons are identical.
 *  See Unicode UAX #39 for additional information.
 *
 *  Using skeletons directly makes it possible to quickly check
 *  whether an identifier is confusable with any of some large
 *  set of existing identifiers, by creating an efficiently
 *  searchable collection of the skeletons.
 *
 * @param sc      The USpoofChecker.
 * @param type    The type of skeleton, corresponding to which
 *                of the Unicode confusable data tables to use.
 *                The default is Mixed-Script, Lowercase.
 *                Allowed options are USPOOF_SINGLE_SCRIPT_CONFUSABLE and
 *                USPOOF_ANY_CASE.  The two flags may be ORed.
 * @param id      The input identifier whose skeleton will be computed.
 * @param dest    The output identifier, to receive the skeleton string.
 * @param status  The error code, set if an error occurred while attempting to
 *                perform the check.
 * @return        A reference to the destination (skeleton) string.
 *
 * @stable ICU 4.2
 */
U_I18N_API icu::UnicodeString & U_EXPORT2
uspoof_getSkeletonUnicodeString(const USpoofChecker *sc,
                                uint32_t type,
                                const icu::UnicodeString &id,
                                icu::UnicodeString &dest,
                                UErrorCode *status);
#endif   /* U_SHOW_CPLUSPLUS_API */


#ifndef U_HIDE_DRAFT_API
/**
 * Get the set of Candidate Characters for Inclusion in Identifiers, as defined
 * in Unicode UAX #31, http://www.unicode.org/reports/tr31/#Table_Candidate_Characters_for_Inclusion_in_Identifiers
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @draft ICU 51
 */
U_DRAFT const USet * U_EXPORT2
uspoof_getInclusionSet(UErrorCode *status);

/**
 * Get the set of characters from Recommended Scripts for Inclusion in Identifiers, as defined
 * in Unicode UAX #31, http://www.unicode.org/reports/tr31/#Table_Recommended_Scripts
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @draft ICU 51
 */
U_DRAFT const USet * U_EXPORT2
uspoof_getRecommendedSet(UErrorCode *status);

#if U_SHOW_CPLUSPLUS_API

/**
 * Get the set of Candidate Characters for Inclusion in Identifiers, as defined
 * in Unicode UAX #31, http://www.unicode.org/reports/tr31/#Table_Candidate_Characters_for_Inclusion_in_Identifiers
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @draft ICU 51
 */
U_DRAFT const icu::UnicodeSet * U_EXPORT2
uspoof_getInclusionUnicodeSet(UErrorCode *status);

/**
 * Get the set of characters from Recommended Scripts for Inclusion in Identifiers, as defined
 * in Unicode UAX #31, http://www.unicode.org/reports/tr31/#Table_Recommended_Scripts
 *
 * The returned set is frozen. Ownership of the set remains with the ICU library; it must not
 * be deleted by the caller.
 *
 * @param status The error code, set if a problem occurs while creating the set.
 *
 * @draft ICU 51
 */
U_DRAFT const icu::UnicodeSet * U_EXPORT2
uspoof_getRecommendedUnicodeSet(UErrorCode *status);

#endif /* U_SHOW_CPLUSPLUS_API */
#endif /* U_HIDE_DRAFT_API */

/**
 * Serialize the data for a spoof detector into a chunk of memory.
 * The flattened spoof detection tables can later be used to efficiently
 * instantiate a new Spoof Detector.
 *
 * The serialized spoof checker includes only the data compiled from the
 * Unicode data tables by uspoof_openFromSource(); it does not include
 * any other state or configuration that may have been set.
 *
 * @param sc       the Spoof Detector whose data is to be serialized.
 * @param data     a pointer to 32-bit-aligned memory to be filled with the data,
 *                 can be NULL if capacity==0
 * @param capacity the number of bytes available at data,
 *                 or 0 for preflighting
 * @param status   an in/out ICU UErrorCode; possible errors include:
 *                 - U_BUFFER_OVERFLOW_ERROR if the data storage block is too small for serialization
 *                 - U_ILLEGAL_ARGUMENT_ERROR if the data or capacity parameters are bad
 * @return the number of bytes written or needed for the spoof data
 *
 * @see uspoof_openFromSerialized()
 * @stable ICU 4.2
 */
U_STABLE int32_t U_EXPORT2
uspoof_serialize(USpoofChecker *sc,
                 void *data, int32_t capacity,
                 UErrorCode *status);


#endif

#endif /* USPOOF_H */