Re: [gmime-devel] g_mime_part_get_openpgp_data () and Content-Transfer-Encoding: base64



On 12/14/2017 11:31 AM, Daniel Kahn Gillmor wrote:
On Tue 2017-12-12 10:02:51 -0500, Jeffrey Stedfast wrote:
The get_openpgp_data() function returns an enum value (or bitfield? I
can't remember atm)
it's an enum, which i think makes sense, unless you want to contemplate
the following situations:

  * if there's an encrypted blob inside a cleartext signature (usually
    OpenPGP goes the other way around, with signatures inside encryption)

  * if there are two blobs in a single message.

the enum, as currently defined, *could* be switched over to be a
bitfield if you wanted to, because it happens to only use values 0, 1,
and 2.  I'm not saying it should be switched, just observing it as a
possible subtle API change.

I've since added public/private key block enum values, although no release has been made yet, so there's still time to make them flags ;-)
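
To make the enum-versus-flags point concrete, here is a minimal Python sketch. The member names and the key-block values are illustrative assumptions, not GMime's actual definitions; only the values 0, 1, and 2 come from the discussion above.

```python
from enum import IntFlag


class OpenPGPData(IntFlag):
    # Hypothetical names; 0, 1 and 2 are the values mentioned in the thread.
    NONE = 0
    ENCRYPTED = 1
    SIGNED = 2
    # As flags, further members need power-of-two values (assumed here).
    PUBLIC_KEY = 4
    PRIVATE_KEY = 8


# A flags type can describe an encrypted blob inside a cleartext signature.
both = OpenPGPData.SIGNED | OpenPGPData.ENCRYPTED
has_signed = bool(both & OpenPGPData.SIGNED)
has_key = bool(both & OpenPGPData.PUBLIC_KEY)
```

Note that SIGNED | ENCRYPTED equals 3, so if the new key-block values had simply continued the sequence at 3 (an assumption about the numbering, not something stated in the thread), they would need renumbering before the type could be treated as flags.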


based on the OpenPGP markers that the GMimeParser found while parsing
the message (while scanning for MIME boundaries).  Since the
GMimeParser does not decode the content as it is parsing the message,
it can't peer under the obfuscation of the base64 encoded blob.
yep, that's what i thought was happening.

I thought this would be enough since in the subset of messages that I've
personally seen that use inline PGP, the text always comes through using
the 7bit (either implicitly or explicitly), 8bit, or quoted-printable
(which makes sense for signed messages) encodings which do not obfuscate
the OpenPGP markers. I hadn't considered the likelihood of (especially)
encrypted messages being encoded using base64 since there's literally no
reason to do that (armored PGP data is already 7bit clean).
fwiw, i don't care at all about OpenPGP inline signatures.  i think
they're dangerous in several ways, and i don't think any reasonable
client has a good way of dealing with them safely.

For more details, see:

    https://dkg.fifthhorseman.net/notes/inline-pgp-harmful/

So i'll focus my reply here on inline-encrypted data.

The toolchains that are b64-encoding a fully-encrypted-and-armored MIME
part are obviously faulty, but they are doing something that is within
the letter of the specifications.  As you say, there is "literally no
reason to do that".
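
A small Python sketch of why a marker scan over the undecoded content misses these parts. The armored body below is a placeholder, not real OpenPGP data, and the variable names are made up for illustration.

```python
import base64

ARMOR_MARKER = b"-----BEGIN PGP MESSAGE-----"

# Placeholder armored body; the payload line is not real OpenPGP data.
armored = ARMOR_MARKER + b"\n\nplaceholder-payload\n-----END PGP MESSAGE-----\n"

# Roughly what a base64 Content-Transfer-Encoding puts on the wire.
encoded = base64.encodebytes(armored)

# The base64 alphabet has no '-' character, so the marker cannot appear
# in the encoded form; it only reappears after decoding.
visible_on_wire = ARMOR_MARKER in encoded
visible_after_decode = ARMOR_MARKER in base64.decodebytes(encoded)
```

A scanner that only looks at the encoded bytes while searching for MIME boundaries therefore has nothing to match against.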

There's no easy way to make this work by doing it in the parser, *but* I
could possibly write a GMimeFilter that would be able to detect OpenPGP
markers.

Once I do that, I could have g_mime_part_get_openpgp_data() use said
filter if the current GMimeOpenPGPData state is NONE.

How does that sound?
I think you're saying that if the state is NONE, then any call to
get_openpgp_data() would run an additional pass over the data, with the
filter in place, rather than just returning the currently-cached value.

Yes and no. I've actually already implemented what I proposed. It still caches the value, and the parser still tries to auto-detect as it scans the content streams. But you are also correct that if the parser doesn't detect anything, the value is "UNKNOWN", and so calling g_mime_part_get_openpgp_data() will invoke the filter on the decoded content.

FWIW, I added an "UNKNOWN" state so that once we detect NONE, we don't have to re-filter.
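
The role of the extra state can be sketched in Python as follows. This is only an illustration of the caching idea, not GMime's code: the Part class, its attributes, and the scans counter are hypothetical, and the content is assumed to be base64-encoded.

```python
import base64
from enum import Enum


class OpenPGPData(Enum):
    UNKNOWN = "unknown"      # parser saw no markers; decoded content not scanned yet
    NONE = "none"            # decoded content was scanned and has no markers
    ENCRYPTED = "encrypted"


class Part:
    """Hypothetical stand-in for a MIME part with base64-encoded content."""

    def __init__(self, raw_body: bytes):
        self.raw_body = raw_body
        self.openpgp_data = OpenPGPData.UNKNOWN
        self.scans = 0  # counts how often the decoded content is scanned

    def get_openpgp_data(self) -> OpenPGPData:
        # Only an UNKNOWN result triggers the (expensive) decoded scan.
        if self.openpgp_data is OpenPGPData.UNKNOWN:
            self.scans += 1
            decoded = base64.b64decode(self.raw_body)
            found = b"-----BEGIN PGP MESSAGE-----" in decoded
            self.openpgp_data = (
                OpenPGPData.ENCRYPTED if found else OpenPGPData.NONE
            )
        return self.openpgp_data


part = Part(base64.b64encode(b"just some plain text\n"))
first = part.get_openpgp_data()
second = part.get_openpgp_data()
```

Because a negative result is stored as NONE rather than left as UNKNOWN, the second call returns the cached value without scanning again.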


That seems troubling to me from an efficiency viewpoint.  In notmuch,
i'm proposing to invoke get_openpgp_data on basically every leaf MIME
part processed, so having it do something other than return a cached
value is a steeper cost than i'd hoped for.

Agreed...


what if the filter was simply a filter that ran only when the part was
base64-encoded, and content-type text/*?  you could cache in the message
whether the extra filter had been run already or not, and when
get_openpgp_data was invoked, you could only run it in those special
cases (and you could immediately cache the fact that the filter had been
run as well).

So, something like (pseudocode):

     message.get_openpgp_data:
         if message.stored_openpgp_data == NONE &&
            message.major_content_type == "text" &&
            message.content_transfer_encoding == "base64":
               if message.decoded_openpgp_data is NULL:
                  message.decoded_openpgp_data =
                    scan_for_openpgp(b64decode(message.body))
               return message.decoded_openpgp_data
         return message.stored_openpgp_data

wdyt?  it still seems inefficient, but it would limit the overall cost
while still catching these bizarre systems.
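
For reference, a runnable Python rendering of the pseudocode above. The Message class and scan_for_openpgp() are hypothetical stand-ins that mirror the pseudocode's names, the string constants are placeholders for the enum values, and the marker check is deliberately simplistic.

```python
import base64
from dataclasses import dataclass
from typing import Optional

NONE, ENCRYPTED, SIGNED = "none", "encrypted", "signed"


def scan_for_openpgp(data: bytes) -> str:
    """Simplistic marker scan; a real scanner would track armor boundaries."""
    if b"-----BEGIN PGP MESSAGE-----" in data:
        return ENCRYPTED
    if b"-----BEGIN PGP SIGNED MESSAGE-----" in data:
        return SIGNED
    return NONE


@dataclass
class Message:
    body: bytes
    major_content_type: str
    content_transfer_encoding: str
    stored_openpgp_data: str = NONE              # what the parser found
    decoded_openpgp_data: Optional[str] = None   # None means "filter not run yet"

    def get_openpgp_data(self) -> str:
        # Only base64-encoded text/* parts with no parser result pay for the scan.
        if (self.stored_openpgp_data == NONE
                and self.major_content_type == "text"
                and self.content_transfer_encoding == "base64"):
            if self.decoded_openpgp_data is None:
                self.decoded_openpgp_data = scan_for_openpgp(
                    base64.b64decode(self.body))
            return self.decoded_openpgp_data
        return self.stored_openpgp_data


armored = b"-----BEGIN PGP MESSAGE-----\n\nplaceholder\n-----END PGP MESSAGE-----\n"
wrapped = Message(base64.b64encode(armored), "text", "base64")
plain = Message(b"hello\n", "text", "7bit")
wrapped_result = wrapped.get_openpgp_data()
plain_result = plain.get_openpgp_data()
```

The 7bit part never touches the decoder, so the common case stays a plain cached lookup.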

Yea, I think something like this could work.

Jeff

