unicode_grapheme_break, unicode_grapheme_break_init, unicode_grapheme_break_next, unicode_grapheme_break_deinit — unicode grapheme cluster boundary rules
#include <courier-unicode.h>
unicode_grapheme_break_info_t unicode_grapheme_break_init(void);
int unicode_grapheme_break_next(unicode_grapheme_break_info_t handle, char32_t c);
void unicode_grapheme_break_deinit(unicode_grapheme_break_info_t handle);
int unicode_grapheme_break(char32_t a, char32_t b);
These functions implement the Unicode grapheme cluster breaking algorithm.
Invoke unicode_grapheme_break_init() to initialize the algorithm; it returns
an opaque handle. Each subsequent call to unicode_grapheme_break_next()
passes this handle and the next character in a sequence of Unicode
characters, and returns a non-0 value if there is a grapheme break before
that character. unicode_grapheme_break_deinit() releases all resources used
by the grapheme breaking handle; the unicode_grapheme_break_info_t handle is
no longer valid after this call.
The first call to unicode_grapheme_break_next() always returns non-0, as per
the GB1 rule (break at the start of text).
unicode_grapheme_break() is a simplified interface that returns non-0 if
there is a grapheme break between two Unicode characters, a and b. This is
equivalent to calling unicode_grapheme_break_init(), followed by two calls
to unicode_grapheme_break_next() (first with a, then with b), and finally
unicode_grapheme_break_deinit(), then returning the result of the second
call to unicode_grapheme_break_next().