unicode_category_lookup, unicode_isalnum, unicode_isalpha, unicode_isblank, unicode_isdigit, unicode_isgraph, unicode_islower, unicode_ispunct, unicode_isspace, unicode_isupper - Unicode character categorization
#include <courier-unicode.h>
uint32_t unicode_category_lookup(char32_t c);
int unicode_isalnum(char32_t c);
int unicode_isalpha(char32_t c);
int unicode_isblank(char32_t c);
int unicode_isdigit(char32_t c);
int unicode_isgraph(char32_t c);
int unicode_islower(char32_t c);
int unicode_ispunct(char32_t c);
int unicode_isspace(char32_t c);
int unicode_isupper(char32_t c);
unicode_category_lookup() looks up a Unicode character's categorization and returns it as a 32-bit value. The value's UNICODE_CATEGORY_1 bits specify the first level of the character's category; the UNICODE_CATEGORY_2, UNICODE_CATEGORY_3, and UNICODE_CATEGORY_4 bits specify the second, third, and fourth levels, if given. If the bits for a level are all 0, no category is specified for this character at that level; otherwise the possible values are defined in <courier-unicode.h>.
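The level layout described above can be illustrated with a small helper that counts how many category levels a returned value specifies. This is only a sketch: category_depth() is a hypothetical name, not part of the library, and it takes the four level masks as a parameter so that it does not depend on the header's actual bit values. It also assumes that levels are filled from the first level down, which the description suggests but does not state outright.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many category levels are specified in `category`.
 *
 * `level_masks` holds the masks for levels 1 through 4, in order. With the
 * library these would be UNICODE_CATEGORY_1 ... UNICODE_CATEGORY_4.
 *
 * Assumption: levels are filled from the first level down, so counting
 * stops at the first level whose bits are all 0.
 */
static int category_depth(uint32_t category, const uint32_t level_masks[4])
{
    int depth = 0;

    for (size_t i = 0; i < 4; i++) {
        if ((category & level_masks[i]) == 0)
            break;      /* no category specified at this level */
        depth++;
    }
    return depth;
}

/*
 * Possible use with the library (not compiled here):
 *
 *   const uint32_t masks[4] = {
 *       UNICODE_CATEGORY_1, UNICODE_CATEGORY_2,
 *       UNICODE_CATEGORY_3, UNICODE_CATEGORY_4
 *   };
 *   int depth = category_depth(unicode_category_lookup(c), masks);
 */
```

To test a character against one specific category, mask the returned value with the level's bits and compare the result with the corresponding constant from <courier-unicode.h>.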
The remaining functions are Unicode counterparts of the similarly named character classification functions in the standard C library, as follows:
unicode_isalnum()
Returns non-0 for all characters for which unicode_isalpha() or unicode_isdigit() returns non-0.
unicode_isalpha()
Returns non-0 for all UNICODE_CATEGORY_1_LETTER characters.
unicode_isblank()
Returns non-0 for TAB, and for all UNICODE_CATEGORY_2_SPACE characters.
unicode_isdigit()
Returns non-0 only for characters whose category is UNICODE_CATEGORY_1_NUMBER | UNICODE_CATEGORY_2_DIGIT, with no third-level category.
unicode_isgraph()
Returns non-0 for all code points above SPACE for which unicode_isspace() returns 0.
unicode_islower()
Returns non-0 for all unicode_isalpha() characters that are equal to their own unicode_lc(3) value.
unicode_ispunct()
Returns non-0 for all UNICODE_CATEGORY_1_PUNCTUATION characters.
unicode_isspace()
Returns non-0 for all unicode_isblank() characters, and for Unicode characters with a line-breaking property of BK, CR, LF, NL, or SP.
unicode_isupper()
Returns non-0 for all unicode_isalpha() characters that are equal to their own unicode_uc(3) value.
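Because every classification function above shares the signature int f(char32_t), any of them can be passed to a generic helper. The following is a minimal sketch of that idea. count_matching(), the codepoint_predicate typedef, and ascii_digit_standin() are hypothetical names introduced for illustration; the stand-in predicate exists only so that the sketch compiles without the library.

```c
#include <stddef.h>
#include <uchar.h>      /* char32_t */

/* Signature shared by the unicode_is*() classification functions. */
typedef int (*codepoint_predicate)(char32_t);

/* Count the code points in s[0..len) for which pred() returns non-0. */
static size_t count_matching(const char32_t *s, size_t len,
                             codepoint_predicate pred)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        /* Only zero versus non-zero is meaningful, so never compare with 1. */
        if (pred(s[i]) != 0)
            n++;
    }
    return n;
}

/*
 * Hypothetical ASCII-only stand-in, so the sketch compiles without the
 * library. In real use, pass unicode_isdigit, unicode_isalpha, and so on.
 */
static int ascii_digit_standin(char32_t c)
{
    return c >= 0x30 && c <= 0x39;      /* '0' through '9' */
}
```

With <courier-unicode.h> included and the program linked against the library, a call such as count_matching(text, len, unicode_isalpha) would count the letters in a decoded string.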