1f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/* 2f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)******************************************************************************* 3f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 4f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* Copyright (C) 2002-2003, International Business Machines 5f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* Corporation and others. All Rights Reserved. 6f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 7f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)******************************************************************************* 8f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* file name: punycode.h 9f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* encoding: US-ASCII 10f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* tab size: 8 (not used) 11f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* indentation:4 12f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* 13f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* created on: 2002jan31 14f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)* created by: Markus W. Scherer 15f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)*/ 16f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 17f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/* This ICU code derived from: */ 18f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/* 19f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)punycode.c 0.4.0 (2001-Nov-17-Sat) 20f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)http://www.cs.berkeley.edu/~amc/idn/ 21f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)Adam M. Costello 22f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)http://www.nicemice.net/amc/ 23f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)*/ 24f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 25f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#ifndef __PUNYCODE_H__ 26f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#define __PUNYCODE_H__ 27f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 28f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#include "unicode/utypes.h" 29f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 30f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#if !UCONFIG_NO_IDNA 31f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 32f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 33f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * u_strToPunycode() converts Unicode to Punycode. 34f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 35f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The input string must not contain single, unpaired surrogates. 36f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The output will be represented as an array of ASCII code points. 37f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 38f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The output string is NUL-terminated according to normal ICU 39f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * string output rules. 40f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 41f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src Input Unicode string. 42f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This function handles a limited amount of code points 43f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * (the limit is >=64). 44f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_INDEX_OUTOFBOUNDS_ERROR is set if the limit is exceeded. 45f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength Number of UChars in src, or -1 if NUL-terminated. 46f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest Output Punycode array. 47f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity Size of dest. 48f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param caseFlags Vector of boolean values, one per input UChar, 49f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * indicating that the corresponding character is to be 50f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * marked for the decoder optionally 51f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * uppercasing (TRUE) or lowercasing (FALSE) 52f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the character. 53f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * ASCII characters are output directly in the case as marked. 54f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Flags corresponding to trail surrogates are ignored. 55f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * If caseFlags==NULL then input characters are not 56f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * case-mapped. 57f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode ICU in/out error code parameter. 58f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_INVALID_CHAR_FOUND if src contains 59f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * unmatched single surrogates. 60f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_INDEX_OUTOFBOUNDS_ERROR if src contains 61f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * too many code points. 62f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return Number of ASCII characters in puny. 63f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 64f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strFromPunycode 65f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 66f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_CFUNC int32_t 67f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strToPunycode(const UChar *src, int32_t srcLength, 68f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar *dest, int32_t destCapacity, 69f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) const UBool *caseFlags, 70f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 71f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 72f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/** 73f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * u_strFromPunycode() converts Punycode to Unicode. 74f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The Unicode string will be at most as long (in UChars) 75f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * than the Punycode string (in chars). 76f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 77f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param src Input Punycode string. 78f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param srcLength Length of puny, or -1 if NUL-terminated 79f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param dest Output Unicode string buffer. 80f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param destCapacity Size of dest in number of UChars, 81f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * and of caseFlags in numbers of UBools. 82f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param caseFlags Output array for case flags as 83f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * defined by the Punycode string. 84f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * The caller should uppercase (TRUE) or lowercase (FASLE) 85f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * the corresponding character in dest. 86f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * For supplementary characters, only the lead surrogate 87f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * is marked, and FALSE is stored for the trail surrogate. 88f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * This is redundant and not necessary for ASCII characters 89f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * because they are already in the case indicated. 90f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Can be NULL if the case flags are not needed. 91f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @param pErrorCode ICU in/out error code parameter. 92f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_INVALID_CHAR_FOUND if a non-ASCII character 93f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * precedes the last delimiter ('-'), 94f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * or if an invalid character (not a-zA-Z0-9) is found 95f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * after the last delimiter. 96f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * U_ILLEGAL_CHAR_FOUND if the delta sequence is ill-formed. 97f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @return Number of UChars written to dest. 98f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 99f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * @see u_strToPunycode 100f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 101f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)U_CFUNC int32_t 102f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)u_strFromPunycode(const UChar *src, int32_t srcLength, 103f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UChar *dest, int32_t destCapacity, 104f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UBool *caseFlags, 105f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) UErrorCode *pErrorCode); 106f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 107f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif /* #if !UCONFIG_NO_IDNA */ 108f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 109f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)#endif 110f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) 111f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles)/* 112f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Hey, Emacs, please set the following: 113f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 114f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * Local Variables: 115f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * indent-tabs-mode: nil 116f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * End: 117f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) * 118f4ed1cf5d184064c4cf0e4359c6d5d8aadb50afaTorne (Richard Coles) */ 119