#include <utf.h>
class CnvUtfConverter |
Public Member Enumerations | |
---|---|
enum | anonymous { KStateDefault } |
enum | TError { EErrorIllFormedInput } |
Public Member Functions | |
---|---|
IMPORT_C TInt | ConvertFromUnicodeToUtf7(TDes8 &, const TDesC16 &, TBool) |
TInt | ConvertFromUnicodeToUtf7(TDes8 &, const TDesC16 &, TBool, TBool) |
IMPORT_C HBufC8 * | ConvertFromUnicodeToUtf7L(const TDesC16 &, TBool) |
IMPORT_C TInt | ConvertFromUnicodeToUtf8(TDes8 &, const TDesC16 &) |
TInt | ConvertFromUnicodeToUtf8(TDes8 &, const TDesC16 &, TBool) |
IMPORT_C HBufC8 * | ConvertFromUnicodeToUtf8L(const TDesC16 &) |
IMPORT_C TInt | ConvertToUnicodeFromUtf7(TDes16 &, const TDesC8 &, TInt &) |
TInt | ConvertToUnicodeFromUtf7(TDes16 &, const TDesC8 &, TBool, TInt &) |
IMPORT_C HBufC16 * | ConvertToUnicodeFromUtf7L(const TDesC8 &) |
IMPORT_C TInt | ConvertToUnicodeFromUtf8(TDes16 &, const TDesC8 &) |
TInt | ConvertToUnicodeFromUtf8(TDes16 &, const TDesC8 &, TBool) |
TInt | ConvertToUnicodeFromUtf8(TDes16 &, const TDesC8 &, TBool, TInt &, TInt &) |
IMPORT_C HBufC16 * | ConvertToUnicodeFromUtf8L(const TDesC8 &) |
Converts text between Unicode (UCS-2) and the two Unicode transformation formats UTF-7 and UTF-8. There are no functions to convert directly between UTF-7 and UTF-8.
Objects of this class do not need to be created because all the member functions are static. The conversion functions are passed the input text in the second argument and write the resulting text to the first argument. Sixteen-bit descriptors hold text encoded in UCS-2 (i.e. normal 16-bit Unicode), and eight-bit descriptors hold text encoded in either of the transformation formats.
The conversion functions return the number of characters which were not converted because the output descriptor was not long enough to hold all of the converted text. This allows users of this class to perform partial conversions on an input descriptor, handling the case when the input descriptor is truncated midway through a multi-byte character. The caller does not have to guess how big to make the output descriptor for a given input descriptor; they can simply do the conversion in a loop using a small output descriptor. The ability to handle truncated descriptors is particularly useful if the caller is receiving information in chunks from an external source.
TError currently defines a single error value, EErrorIllFormedInput; others may be added in the future.
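The partial-conversion loop described above can be sketched in portable C++. This is only an illustration of the pattern, not the Symbian implementation: `encodeChunk` and `encodeAll` are hypothetical names, `std::string` and `std::u16string` stand in for the 8-bit and 16-bit descriptors, and surrogate pairs are deliberately not combined.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for one conversion call with a small output buffer.
// Encodes UCS-2 units of `in`, starting at `start`, into `chunk` without
// exceeding `capacity` bytes and without splitting a character.
// Returns the number of input units left unconverted.
std::size_t encodeChunk(const std::u16string& in, std::size_t start,
                        std::size_t capacity, std::string& chunk)
{
    chunk.clear();
    std::size_t i = start;
    for (; i < in.size(); ++i) {
        const char16_t c = in[i];
        const std::size_t need = c < 0x80 ? 1 : (c < 0x800 ? 2 : 3);
        if (chunk.size() + need > capacity) {
            break;  // output full: stop before this character
        }
        if (need == 1) {
            chunk += static_cast<char>(c);
        } else if (need == 2) {
            chunk += static_cast<char>(0xC0 | (c >> 6));
            chunk += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            chunk += static_cast<char>(0xE0 | (c >> 12));
            chunk += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            chunk += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return in.size() - i;
}

// Converts the whole input by calling encodeChunk in a loop, using the
// returned "unconverted" count to find where the next call should start.
std::string encodeAll(const std::u16string& in, std::size_t capacity)
{
    std::string result;
    std::string chunk;
    std::size_t remaining = in.size();
    while (remaining > 0) {
        const std::size_t left =
            encodeChunk(in, in.size() - remaining, capacity, chunk);
        if (left == remaining) {
            break;  // buffer too small for even one character: give up
        }
        result += chunk;
        remaining = left;
    }
    return result;
}
```

The guard against `left == remaining` matters in practice: an output buffer that cannot hold a single encoded character would otherwise make the loop spin forever.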
IMPORT_C TInt | ConvertFromUnicodeToUtf7 | ( | TDes8 & | aUtf7, |
const TDesC16 & | aUnicode, | |||
TBool | aEncodeOptionalDirectCharactersInBase64 | |||
) | [static] |
Converts Unicode text into UTF-7 encoding.
Parameter | Description |
---|---|
aUtf7 | On return, contains the UTF-7 encoded output string. |
aUnicode | A UCS-2 encoded input string. |
aEncodeOptionalDirectCharactersInBase64 | If ETrue then characters from UTF-7 set O (optional direct characters) are encoded in Modified Base64. If EFalse the characters are encoded directly, as their ASCII equivalents. |
Returns: The number of unconverted characters left at the end of the input descriptor, or one of the error values defined in TError.
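To illustrate what the aEncodeOptionalDirectCharactersInBase64 flag decides, the following portable C++ sketch classifies characters according to the Set D and Set O definitions in RFC 2152. The names `Utf7Class`, `classifyForUtf7` and `needsBase64` are hypothetical, and it is an assumption that the Symbian converter follows the RFC sets exactly.

```cpp
#include <cstring>

enum class Utf7Class { Direct, OptionalDirect, MustEncode };

// Classifies a UCS-2 unit using the character sets of RFC 2152.
// Note: '+' falls under MustEncode here; real UTF-7 writes it as "+-".
Utf7Class classifyForUtf7(char16_t c)
{
    if ((c >= u'A' && c <= u'Z') || (c >= u'a' && c <= u'z') ||
        (c >= u'0' && c <= u'9')) {
        return Utf7Class::Direct;  // Set D: letters and digits
    }
    if (c != 0 && c < 0x80) {
        const int ch = static_cast<int>(c);
        if (std::strchr("'(),-./:?", ch) != nullptr) {
            return Utf7Class::Direct;  // Set D: punctuation
        }
        if (std::strchr("!\"#$%&*;<=>@[]^_`{|}", ch) != nullptr) {
            return Utf7Class::OptionalDirect;  // Set O
        }
        if (c == u' ' || c == u'\t' || c == u'\r' || c == u'\n') {
            return Utf7Class::Direct;  // whitespace may be sent directly
        }
    }
    return Utf7Class::MustEncode;  // everything else, including '\\' and '~'
}

// Mirrors the meaning of the TBool parameter: Set O characters go through
// Modified Base64 only when the flag is set.
bool needsBase64(char16_t c, bool encodeOptionalDirectInBase64)
{
    switch (classifyForUtf7(c)) {
        case Utf7Class::Direct:         return false;
        case Utf7Class::OptionalDirect: return encodeOptionalDirectInBase64;
        default:                        return true;
    }
}
```

Passing ETrue is the conservative choice when the output must survive transports that mangle Set O characters, at the cost of longer output.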
IMPORT_C HBufC8 * | ConvertFromUnicodeToUtf7L | ( | const TDesC16 & | aUnicode, |
TBool | aEncodeOptionalDirectCharactersInBase64 | |||
) | [static] |
Converts Unicode text into UTF-7 encoding. The function leaves with KErrCorrupt if the input string is corrupt.
Parameter | Description |
---|---|
aUnicode | A UCS-2 encoded input string. |
aEncodeOptionalDirectCharactersInBase64 | If ETrue then characters from UTF-7 set O (optional direct characters) are encoded in Modified Base64. If EFalse the characters are encoded directly, as their ASCII equivalents. |
Returns: A descriptor containing the UTF-7 encoded output string.
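IMPORT_C TInt | ConvertFromUnicodeToUtf8 | ( | TDes8 & | aUtf8, |
const TDesC16 & | aUnicode | |||
) | [static] |
Converts Unicode text into UTF-8 encoding.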
Parameter | Description |
---|---|
aUtf8 | On return, contains the UTF-8 encoded output string. |
aUnicode | The Unicode-encoded input string. |
Returns: The number of unconverted characters left at the end of the input descriptor, or one of the error values defined in TError.
TInt | ConvertFromUnicodeToUtf8 | ( | TDes8 & | aUtf8, |
const TDesC16 & | aUnicode, | |||
TBool | aGenerateJavaConformantUtf8 | |||
) | [static] |
Converts Unicode text into UTF-8 encoding.
A surrogate pair in the input is converted to a valid 4-byte UTF-8 sequence.
The variant of UTF-8 used internally by Java differs slightly from standard UTF-8. The TBool argument controls the UTF-8 variant generated by this function.
Parameter | Description |
---|---|
aUtf8 | On return, contains the UTF-8 encoded output string. |
aUnicode | A UCS-2 encoded input string. |
aGenerateJavaConformantUtf8 | EFalse for orthodox UTF-8. ETrue for Java UTF-8. The default is EFalse. |
Returns: The number of unconverted characters left at the end of the input descriptor, or one of the error values defined in TError.
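The difference between the two variants can be sketched as follows. This portable C++ example assumes that "Java UTF-8" means the modified UTF-8 used by Java, in which U+0000 is written as the two bytes C0 80 and each half of a surrogate pair is encoded as its own 3-byte sequence; the source only says the variants differ slightly, so treat these rules as an assumption. `encodeUtf8` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Encodes UTF-16 text as UTF-8. With javaConformant == false, surrogate
// pairs are combined into one 4-byte sequence (standard UTF-8). With
// javaConformant == true, the assumed modified-UTF-8 rules apply.
std::string encodeUtf8(const std::u16string& in, bool javaConformant)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        const bool isPair = !javaConformant &&
            cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF;
        if (isPair) {
            // Combine high and low surrogate into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (javaConformant && cp == 0) {
            out += "\xC0\x80";  // modified UTF-8 never emits a raw NUL byte
        } else if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // Also reached for each surrogate half in Java mode.
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Under these assumptions, U+1F600 (the pair D83D DE00) becomes F0 9F 98 80 in standard mode and ED A0 BD ED B8 80 in Java mode.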
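IMPORT_C HBufC8 * | ConvertFromUnicodeToUtf8L | ( | const TDesC16 & | aUnicode | ) | [static] |
Converts Unicode text into UTF-8 encoding.
The variant of UTF-8 used internally by Java differs slightly from standard UTF-8. Unlike the three-argument ConvertFromUnicodeToUtf8 overload, this function has no TBool argument to select the variant. This function leaves with KErrCorrupt if the input string is corrupt.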
Parameter | Description |
---|---|
aUnicode | A UCS-2 encoded input string. |
Returns: A pointer to an HBufC8 containing the converted UTF8.
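IMPORT_C TInt | ConvertToUnicodeFromUtf7 | ( | TDes16 & | aUnicode, |
const TDesC8 & | aUtf7, | |||
TInt & | aState | |||
) | [static] |
Converts text encoded using the Unicode transformation format UTF-7 into the Unicode UCS-2 character set.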
The conversion can be performed as a series of calls to this function, each starting where the previous call stopped in the input descriptor; the state of the conversion is carried between calls in aState. Set the state variable to KStateDefault before the first call, then pass it unchanged into each subsequent call.
Parameter | Description |
---|---|
aUnicode | On return, contains the Unicode encoded output string. |
aUtf7 | The UTF-7 encoded input string. |
aState | For the first call of the function set to KStateDefault. For subsequent calls, pass in the variable unchanged. |
Returns: The number of unconverted bytes left at the end of the input descriptor, or one of the error values defined in TError.
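Why a state variable is needed becomes clearer with a small decoder. The portable C++ sketch below carries pending Base64 bits from one call to the next so that a shifted sequence may be split across input chunks. It is not the Symbian implementation: the real API packs its state into a TInt, whereas the `Utf7State` struct, `base64Value` and `decodeUtf7Chunk` are hypothetical, and ill-formed input is not reported.

```cpp
#include <cstdint>
#include <string>

// Hypothetical conversion state; default-constructed plays the role of
// KStateDefault.
struct Utf7State {
    bool inBase64 = false;     // inside a "+...-" shifted sequence
    bool justOpened = false;   // '+' seen, no Base64 character yet
    std::uint32_t bits = 0;    // Base64 bits not yet emitted
    int bitCount = 0;          // number of valid bits in `bits`
};

int base64Value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Decodes one chunk of UTF-7, appending UCS-2 units to `out`.
void decodeUtf7Chunk(const std::string& in, Utf7State& st, std::u16string& out)
{
    for (char c : in) {
        if (st.inBase64) {
            const int v = base64Value(c);
            if (v >= 0) {
                st.bits = (st.bits << 6) | static_cast<std::uint32_t>(v);
                st.bitCount += 6;
                st.justOpened = false;
                if (st.bitCount >= 16) {  // a full 16-bit unit is available
                    st.bitCount -= 16;
                    out += static_cast<char16_t>((st.bits >> st.bitCount) & 0xFFFF);
                    st.bits &= (1u << st.bitCount) - 1;
                }
                continue;
            }
            // Any non-Base64 character ends the shifted sequence.
            const bool wasJustOpened = st.justOpened;
            st = Utf7State{};
            if (c == '-') {
                if (wasJustOpened) out += u'+';  // "+-" is a literal plus
                continue;                        // the '-' itself is absorbed
            }
        }
        if (c == '+') {
            st.inBase64 = true;
            st.justOpened = true;
        } else {
            out += static_cast<char16_t>(static_cast<unsigned char>(c));
        }
    }
}
```

Feeding `"Hi +A"` and then `"Ok-!"` with the same state object yields the same result as decoding `"Hi +AOk-!"` in one call, which is the behaviour the aState parameter exists to support.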
IMPORT_C HBufC16 * | ConvertToUnicodeFromUtf7L | ( | const TDesC8 & | aUtf7 | ) | [static] |
Converts text encoded using the Unicode transformation format UTF-7 into the Unicode UCS-2 character set.
Parameter | Description |
---|---|
aUtf7 | The UTF-7 encoded input string. |
Returns: A pointer to an HBufC16 containing the converted Unicode string.
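IMPORT_C TInt | ConvertToUnicodeFromUtf8 | ( | TDes16 & | aUnicode, |
const TDesC8 & | aUtf8 | |||
) | [static] |
Converts text encoded using the Unicode transformation format UTF-8 into the Unicode UCS-2 character set.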
Parameter | Description |
---|---|
aUnicode | On return, contains the Unicode encoded output string. |
aUtf8 | The UTF-8 encoded input string. |
Returns: The number of unconverted bytes left at the end of the input descriptor, or one of the error values defined in TError.
TInt | ConvertToUnicodeFromUtf8 | ( | TDes16 & | aUnicode, |
const TDesC8 & | aUtf8, | |||
TBool | aGenerateJavaConformantUtf8 | |||
) | [static] |
Converts text encoded using the Unicode transformation format UTF-8 into the Unicode UCS-2 character set.
Parameter | Description |
---|---|
aUnicode | On return, contains the Unicode encoded output string. |
aUtf8 | The UTF-8 encoded input string. |
aGenerateJavaConformantUtf8 | EFalse for orthodox UTF-8. ETrue for Java UTF-8. |
Returns: The number of unconverted bytes left at the end of the input descriptor, or one of the error values defined in TError.
TInt | ConvertToUnicodeFromUtf8 | ( | TDes16 & | aUnicode, |
const TDesC8 & | aUtf8, | |||
TBool | aGenerateJavaConformantUtf8, | |||
TInt & | aNumberOfUnconvertibleCharacters, | |||
TInt & | aIndexOfFirstByteOfFirstUnconvertibleCharacter | |||
) | [static] |
Converts text encoded using the Unicode transformation format UTF-8 into the Unicode UCS-2 character set. Surrogate pairs can be created when a valid 4 byte UTF-8 is input.
The variant of UTF-8 used internally by Java differs slightly from standard UTF-8. The TBool argument controls which UTF-8 variant this function expects as input.
Parameter | Description |
---|---|
aUnicode | On return, contains the Unicode encoded output string. |
aUtf8 | The UTF-8 encoded input string. |
aGenerateJavaConformantUtf8 | EFalse for orthodox UTF-8. ETrue for Java UTF-8. The default is EFalse. |
aNumberOfUnconvertibleCharacters | On return, contains the number of bytes which were not converted. |
aIndexOfFirstByteOfFirstUnconvertibleCharacter | On return, the index of the first byte of the first unconvertible character. For instance, if the first character in the input descriptor (aUtf8) could not be converted, then this parameter is set to the index of the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted. |
Returns: The number of unconverted bytes left at the end of the input descriptor, or one of the error values defined in TError.
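The two reporting parameters can be illustrated with a portable C++ sketch of a UTF-8 decoder that records how many ill-formed sequences it met and where the first one started. This is an illustration only: `Utf8DecodeReport` and `decodeUtf8` are hypothetical names, substituting U+FFFD for unconvertible input is an assumption, and a sequence truncated at the end of the input is simply counted as unconvertible here, whereas the text above describes such bytes being reported through the return value.

```cpp
#include <cstddef>
#include <string>

struct Utf8DecodeReport {
    int unconvertibleCount = 0;        // ill-formed sequences encountered
    int firstUnconvertibleIndex = -1;  // negative if everything converted
};

// Decodes standard UTF-8 into UTF-16, creating surrogate pairs for
// 4-byte sequences and recording ill-formed input in `report`.
std::u16string decodeUtf8(const std::string& in, Utf8DecodeReport& report)
{
    static const char32_t minForLen[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (b0 < 0x80)              { len = 1; cp = b0; }
        else if ((b0 >> 5) == 0x06) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 >> 4) == 0x0E) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 >> 3) == 0x1E) { len = 4; cp = b0 & 0x07; }

        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            ok = (b & 0xC0) == 0x80;  // must be a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, encoded surrogates and out-of-range values.
        if (ok) {
            ok = cp >= minForLen[len] && cp <= 0x10FFFF &&
                 !(cp >= 0xD800 && cp <= 0xDFFF);
        }
        if (!ok) {
            if (report.firstUnconvertibleIndex < 0) {
                report.firstUnconvertibleIndex = static_cast<int>(i);
            }
            ++report.unconvertibleCount;
            out += static_cast<char16_t>(0xFFFD);  // assumed replacement
            ++i;                                   // resync on the next byte
            continue;
        }
        if (cp >= 0x10000) {
            cp -= 0x10000;  // split into a surrogate pair
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<char16_t>(cp);
        }
        i += len;
    }
    return out;
}
```

A caller can check `firstUnconvertibleIndex < 0` to confirm that the whole input converted cleanly, matching the negative-value convention described for aIndexOfFirstByteOfFirstUnconvertibleCharacter.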
IMPORT_C HBufC16 * | ConvertToUnicodeFromUtf8L | ( | const TDesC8 & | aUtf8 | ) | [static] |
Converts text encoded using the Unicode transformation format UTF-8 into the Unicode UCS-2 character set. This function leaves with an error code if the input string is corrupted.
Parameter | Description |
---|---|
aUtf8 | The UTF-8 encoded input string. |
Returns: A pointer to an HBufC16 with the converted Unicode string.