Re: G_UTF8String: Boxed Type Proposal
- From: Randall Sawyer <srandallsawyer hushmail me>
- To: Matthias Clasen <matthias clasen gmail com>
- Cc: gtk-devel-list <gtk-devel-list gnome org>
- Subject: Re: G_UTF8String: Boxed Type Proposal
- Date: Thu, 17 Mar 2016 14:18:29 -0400
On 03/17/2016 10:39 AM, Randall Sawyer wrote:
> On 03/17/2016 09:30 AM, Matthias Clasen wrote:
>> I believe that you haven't found such a proposal because most people
>> don't see much use in a separate boxed type for utf8 strings. Every
>> string we pass around in GLib and GTK+, and every char * in their APIs
>> is expected to be in utf8. The few exceptions to this rule are
>> explicitly documented.
>>
>> GLib already provides a number of utilities for dealing with utf8
>> strings in terms of characters, such as g_utf8_strlen,
>> g_utf8_substring, g_utf8_find_next/prev_char. We can certainly discuss
>> adding to that list, if there are glaring omissions.
>
> Here is the vision: Once raw string data - or gunichar value - has
> been passed and validated into the construction of a "G_UTF8String"
> structure, then contents of two-or-more of these can be easily
> combined without the need for additional measuring or validating.

Alright Matthias, after your thoughtful response, I have come to the
following conclusion. When managing dynamically allocated UTF-8
strings, there are actually two separate points to consider:
1) whether the byte sequences are valid per IETF RFC 3629, Section 4, and
2) the number of distinct characters represented in the string vs. the
total number of bytes used to represent them.
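
To make the second point concrete, here is a minimal sketch in plain C
(no GLib dependency). The function name utf8_char_count is hypothetical
and not part of any library; it assumes its input is already valid UTF-8
and counts every byte that is not a continuation byte (10xxxxxx). In
GLib code, g_utf8_strlen serves the same purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the code points in a NUL-terminated
 * string that is ASSUMED to be valid UTF-8. Each byte that is not a
 * continuation byte (bit pattern 10xxxxxx) starts a new character. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the string "h\xc3\xa9llo" ("héllo"), strlen reports 6 bytes while
this count is 5 characters, which is exactly the gap a structure with
cached lengths would avoid recomputing.
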
If someone writes a widget library or an application on top of
libraries that already ensure valid UTF-8 input (GDK key-press events
and GtkIMContext, for example), then it makes no sense to run those
strings through yet another round of validation. That addresses the
first point.
That still leaves the question of character length vs. byte length.
As for validation, given what you have told me, I will be sure to
present methods that offer validation as an option rather than as the
rule.
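
As a rough illustration of "validation as an option", the sketch below
shows one possible shape for such a type in plain C. Everything here is
an assumption made for illustration: Utf8Str, utf8_is_valid,
utf8_str_new, utf8_str_concat and utf8_str_free are hypothetical names,
not GLib API and not the proposed G_UTF8String itself. The constructor
validates only when asked, both lengths are measured once, and
concatenation simply adds the cached lengths, since joining two valid
UTF-8 strings always yields valid UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a UTF-8 string with both lengths cached. */
typedef struct {
    char  *data;    /* NUL-terminated UTF-8 bytes */
    size_t n_bytes; /* length in bytes, excluding the NUL */
    size_t n_chars; /* length in code points */
} Utf8Str;

/* Strict well-formedness check following RFC 3629, Section 4. */
bool utf8_is_valid(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    while (*p != 0) {
        unsigned char lead = *p++;
        unsigned char lo = 0x80, hi = 0xBF; /* range of next tail byte */
        int tail;
        if (lead <= 0x7F) tail = 0;
        else if (lead >= 0xC2 && lead <= 0xDF) tail = 1;
        else if (lead >= 0xE0 && lead <= 0xEF) tail = 2;
        else if (lead >= 0xF0 && lead <= 0xF4) tail = 3;
        else return false; /* stray tail byte, overlong lead, or > U+10FFFF */
        if (lead == 0xE0) lo = 0xA0; /* reject overlong 3-byte forms */
        if (lead == 0xED) hi = 0x9F; /* reject surrogates */
        if (lead == 0xF0) lo = 0x90; /* reject overlong 4-byte forms */
        if (lead == 0xF4) hi = 0x8F; /* reject values above U+10FFFF */
        for (int i = 0; i < tail; i++, p++) {
            if (*p < lo || *p > hi) return false; /* also stops at NUL */
            lo = 0x80; hi = 0xBF;
        }
    }
    return true;
}

/* Copy s and measure it once. Validation is optional: pass false when
 * the source is already known to produce valid UTF-8. Returns NULL on
 * allocation failure or when validation was requested and failed. */
Utf8Str *utf8_str_new(const char *s, bool validate)
{
    assert(s != NULL);
    if (validate && !utf8_is_valid(s))
        return NULL;
    Utf8Str *u = malloc(sizeof *u);
    if (u == NULL)
        return NULL;
    u->n_bytes = strlen(s);
    u->n_chars = 0;
    for (size_t i = 0; i < u->n_bytes; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80) /* not a tail byte */
            u->n_chars++;
    u->data = malloc(u->n_bytes + 1);
    if (u->data == NULL) { free(u); return NULL; }
    memcpy(u->data, s, u->n_bytes + 1);
    return u;
}

/* Combine two strings without re-measuring or re-validating. */
Utf8Str *utf8_str_concat(const Utf8Str *a, const Utf8Str *b)
{
    assert(a != NULL && b != NULL);
    Utf8Str *u = malloc(sizeof *u);
    if (u == NULL)
        return NULL;
    u->n_bytes = a->n_bytes + b->n_bytes;
    u->n_chars = a->n_chars + b->n_chars;
    u->data = malloc(u->n_bytes + 1);
    if (u->data == NULL) { free(u); return NULL; }
    memcpy(u->data, a->data, a->n_bytes);
    memcpy(u->data + a->n_bytes, b->data, b->n_bytes + 1); /* copies NUL */
    return u;
}

void utf8_str_free(Utf8Str *u)
{
    if (u != NULL) { free(u->data); free(u); }
}
```

A caller holding text from a trusted source would pass false to skip
the validation pass, while text read from a file or socket would pass
true. Skipping validation on malformed input leaves n_chars meaningless,
which is the trade-off of making validation optional.
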
Thank you.