On Tue 2017-12-12 10:02:51 -0500, Jeffrey Stedfast wrote:
The get_openpgp_data() function returns an enum value (or bitfield? I can't remember atm)
it's an enum, which i think makes sense, unless you want to contemplate the following situations:

 * if there's an encrypted blob inside a cleartext signature (usually OpenPGP goes the other way around, with signatures inside encryption)

 * if there are two blobs in a single message.

the enum, as currently defined, *could* be switched over to be a bitfield if you wanted to, because it happens to only use values 0, 1, and 2. I'm not saying it should be switched, just observing it as a possible subtle API change.
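To make the enum-vs-bitfield observation concrete, here is a minimal sketch in Python rather than GMime's C. The class names and the assignment of names to the values 0, 1 and 2 are assumptions for illustration only; they are not copied from the GMime headers.

```python
from enum import IntEnum, IntFlag


class OpenPGPData(IntEnum):
    """Hypothetical stand-in for the current enum: one value per part."""
    NONE = 0
    ENCRYPTED = 1
    SIGNED = 2


class OpenPGPDataFlags(IntFlag):
    """The same numeric values reinterpreted as bits that can be combined."""
    NONE = 0
    ENCRYPTED = 1
    SIGNED = 2


def as_flags(value: OpenPGPData) -> OpenPGPDataFlags:
    # The numeric values already line up, so existing results convert cleanly.
    return OpenPGPDataFlags(int(value))


# A part containing both kinds of blob becomes representable as 1 | 2 == 3.
both = OpenPGPDataFlags.ENCRYPTED | OpenPGPDataFlags.SIGNED
```

The catch is on the caller's side: code that compares the result for equality against a single value would stop matching once a combined value like 3 can come back, which is why this would count as a subtle API change.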
based on the OpenPGP markers that the GMimeParser found while parsing the message (while scanning for MIME boundaries). Since the GMimeParser does not decode the content as it is parsing the message, it can't peer under the obfuscation of the base64 encoded blob.
yep, that's what i thought was happening.
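A small illustration of why a scan over the undecoded content misses the markers. This is plain Python with the standard base64 module, not GMime code, and the armored body is a placeholder rather than real ciphertext.

```python
import base64

MARKER = b"-----BEGIN PGP MESSAGE-----"

# Placeholder armored blob; only the marker lines matter for this demo.
armored = (
    MARKER
    + b"\n\n(placeholder, not real ciphertext)\n"
    + b"-----END PGP MESSAGE-----\n"
)

# What a scanner sees when Content-Transfer-Encoding is base64.
encoded = base64.encodebytes(armored)


def has_marker(raw: bytes) -> bool:
    """Return True if the OpenPGP armor header appears literally in raw."""
    return MARKER in raw


print(has_marker(armored))                      # True
print(has_marker(encoded))                      # False
print(has_marker(base64.decodebytes(encoded)))  # True
```

The base64 alphabet contains no "-" character, so the marker can never appear literally in the encoded form; it only becomes visible again after decoding.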
I thought this would be enough since in the subset of messages that I've personally seen that use inline PGP, the text always comes through using the 7bit (either implicitly or explicitly), 8bit, or quoted-printable (which makes sense for signed messages) encodings which do not obfuscate the OpenPGP markers. I hadn't considered the likelihood of (especially) encrypted messages being encoded using base64 since there's literally no reason to do that (armored PGP data is already 7bit clean).
fwiw, i don't care at all about OpenPGP inline signatures. i think they're dangerous in several ways, and i don't think any reasonable client has a good way of dealing with them safely. For more details, see:

    https://dkg.fifthhorseman.net/notes/inline-pgp-harmful/

So i'll focus my reply here on inline-encrypted data.

The toolchains that are b64-encoding a fully-encrypted-and-armored MIME part are obviously faulty, but they are doing something that is within the letter of the specifications. As you say, there is "literally no reason to do that".
There's no easy way to make this work by doing it in the parser, *but* I could possibly write a GMimeFilter that would be able to detect OpenPGP markers. Once I do that, I could have g_mime_part_get_openpgp_data() use said filter if the current GMimeOpenPGPData state is NONE. How does that sound?
I think you're saying that if the state is NONE, then any call to get_openpgp_data() would run an additional pass over the data, with the filter in place, rather than just returning the currently-cached value.

That seems troubling to me from an efficiency viewpoint. In notmuch, i'm proposing to invoke get_openpgp_data on basically every leaf MIME part processed, so having it do something other than return a cached value is a steeper cost than i'd hoped for.

what if the filter ran only when the part is base64-encoded and has content-type text/*? you could cache in the message whether the extra filter had already been run, and when get_openpgp_data was invoked, you would only run it in those special cases (and you could immediately cache the fact that the filter had been run as well).

So, something like (pseudocode):

    message.get_openpgp_data:
      if message.stored_openpgp_data == NONE &&
         message.major_content_type == "text" &&
         message.content_transfer_encoding == "base64":
        if message.decoded_openpgp_data is NULL:
          message.decoded_openpgp_data = scan_for_openpgp(b64decode(message.body))
        return message.decoded_openpgp_data
      return message.stored_openpgp_data

wdyt? it still seems inefficient, but it would limit the overall cost while still catching these bizarre systems.

--dkg
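For anyone who wants to poke at the caching behaviour, here is a runnable Python rendering of the pseudocode above. It is only a sketch: the Part class, its field names, the string constants, the scans counter and the marker check in scan_for_openpgp are all hypothetical stand-ins, not GMime's API or its actual marker detection.

```python
import base64
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the enum values discussed in this thread.
NONE, ENCRYPTED, SIGNED = "none", "encrypted", "signed"


def scan_for_openpgp(data: bytes) -> str:
    """Naive marker scan; a real filter would be stricter about line starts."""
    if b"-----BEGIN PGP MESSAGE-----" in data:
        return ENCRYPTED
    if b"-----BEGIN PGP SIGNED MESSAGE-----" in data:
        return SIGNED
    return NONE


@dataclass
class Part:
    body: bytes
    major_content_type: str = "text"
    content_transfer_encoding: str = "7bit"
    stored_openpgp_data: str = NONE              # what the parser found
    decoded_openpgp_data: Optional[str] = None   # cache for the extra pass
    scans: int = 0                               # counts extra passes (demo only)

    def get_openpgp_data(self) -> str:
        if (self.stored_openpgp_data == NONE
                and self.major_content_type == "text"
                and self.content_transfer_encoding == "base64"):
            if self.decoded_openpgp_data is None:
                # Extra pass over the decoded body, done at most once.
                self.scans += 1
                self.decoded_openpgp_data = scan_for_openpgp(
                    base64.b64decode(self.body))
            return self.decoded_openpgp_data
        # Every other part just returns the parser's cached answer.
        return self.stored_openpgp_data
```

Calling get_openpgp_data() repeatedly on a base64 text part performs the decode-and-scan once and then returns the cached result, including when that result is NONE; parts with any other encoding or content type never trigger the extra pass.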