Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
- From: Dan Williams <dcbw redhat com>
- To: Nathan Williams <njw google com>
- Cc: networkmanager-list gnome org
- Subject: Re: [PATCH] sms_decode_text(): Sanitize 8-bit data so that it is UTF8-clean.
- Date: Tue, 27 Sep 2011 14:19:42 -0500
On Tue, 2011-09-27 at 14:55 -0400, Nathan Williams wrote:
>
>
> On Tue, Sep 27, 2011 at 2:18 PM, Dan Williams <dcbw redhat com> wrote:
> > On Mon, 2011-09-26 at 18:29 -0400, Nathan Williams wrote:
> > > This keeps ModemManager from crashing deep in the DBus libraries
> > > when a SMS Get() or List() DBus operation finds a message that
> > > isn't valid UTF-8 and/or has embedded NUL characters.
> > >
> > > I'll be putting up a separate patch as a proposal for how to avoid
> > > this problem in the new API.
> >
> > Sounds fine; though in general we know the encoding that the message
> > comes in with, and we know we need to convert to UTF-8 for D-Bus (and
> > really, everything should be UTF-8 at the boundaries, it would be just
> > horrid to expose any charset encoding details to clients and I don't
> > think we have to). So we should be able to convert to UTF-8 without
> > any real loss of fidelity when reading the message from the modem, and
> > we should be able to convert from UTF-8 to a suitable charset
> > (whatever we've selected from CSCS) when sending messages too.
> >
> > In what cases would we want to send or receive essentially binary data
> > via SMS? AFAIK most of these cases show up as base64 or hex-string SMS
> > if they aren't intended for human consumption.
>
>
> We do do that conversion to UTF-8 when we know the transmission
> character set, GSM-7 or UCS2. The one fly in this ointment is that one
> of the possible encodings is, in fact, "8-bit data" (TP-DCS value of
> 04 or f4) with no associated character set. The particular case that
> brought this to my attention was a test SMS from a carrier that was
> supposed to contain, I believe, a polyphonic ringtone for some Nokia
> handset.
Ok, I suppose we could also expose the data as a byte array in the Get()
method call along with the 'text' argument, since it seems like we can
probably tell whether it's supposed to be a string or not.
Dan
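
For reference, a minimal sketch in C of the two ideas discussed in this
thread: telling from the TP-DCS octet whether a message is "8-bit data"
(which covers the 04 and f4 values mentioned above), and making such a
payload safe to hand to D-Bus as a string. This is not the code from the
patch. The names sms_alphabet, dcs_to_alphabet() and sanitize_8bit() are
hypothetical, only the general data coding group and the 0xFx "data
coding/message class" group of the DCS are handled, and replacing every
byte outside printable ASCII with '.' is just one possible policy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alphabet classification, not ModemManager's own type. */
typedef enum {
    SMS_ALPHABET_GSM7,
    SMS_ALPHABET_8BIT,
    SMS_ALPHABET_UCS2,
    SMS_ALPHABET_UNKNOWN
} sms_alphabet;

/* Simplified TP-DCS decoding. Compression and the message-waiting
 * groups are deliberately ignored and reported as UNKNOWN or by their
 * alphabet bits only. */
static sms_alphabet dcs_to_alphabet(uint8_t dcs)
{
    if ((dcs & 0x80) == 0x00) {
        /* General data coding: bits 3..2 select the alphabet. */
        switch ((dcs >> 2) & 0x03) {
        case 0:  return SMS_ALPHABET_GSM7;
        case 1:  return SMS_ALPHABET_8BIT;   /* e.g. 0x04 */
        case 2:  return SMS_ALPHABET_UCS2;
        default: return SMS_ALPHABET_UNKNOWN; /* reserved */
        }
    }
    if ((dcs & 0xF0) == 0xF0) {
        /* Data coding/message class: bit 2 selects 8-bit data. */
        return (dcs & 0x04) ? SMS_ALPHABET_8BIT   /* e.g. 0xF4 */
                            : SMS_ALPHABET_GSM7;
    }
    return SMS_ALPHABET_UNKNOWN;
}

/* Copy len bytes from in to out, replacing every byte that is not
 * printable ASCII, tab, LF or CR with '.'. The result is plain ASCII,
 * so it is valid UTF-8 and contains no embedded NUL. The caller must
 * provide at least len + 1 bytes in out. */
static void sanitize_8bit(const uint8_t *in, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        bool keep = (b >= 0x20 && b <= 0x7E) ||
                    b == '\n' || b == '\r' || b == '\t';
        out[i] = keep ? (char)b : '.';
    }
    out[len] = '\0';
}
```

A caller would sanitize only when dcs_to_alphabet() reports
SMS_ALPHABET_8BIT and keep the existing charset conversion for GSM-7 and
UCS2. The untouched input buffer is what a byte-array argument in Get()
could carry alongside the sanitized 'text'.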