Re: Git and --signoff, UTF-8



2009/4/5 Owen Taylor <otaylor redhat com>:
> On Sat, 2009-04-04 at 18:43 +0100, Simos wrote:
>
>> 2. What is the issue regarding UTF-8 in commit messages?
>> This is mostly an issue with names of people.
>> At first the reaction would be to simply use ASCII characters.
>>
>> However, with git and git-send-e-mail, there will eventually be
>> non-ASCII person names
>> in commit messages.
>>
>> For this, it might be good to have an overall policy for UTF-8, and
>> add to the documentation
>> something like
>>
>> git config --global format.headers "Content-Type: text/plain; charset=\"utf-8\""
>>
>> Then, the policy could be either to refrain from using UTF-8 if it can
>> be avoided or use UTF-8 at will.

For git, everything (commit messages, filenames) is just a sequence
of bytes, so there is no automatic conversion from one encoding to
another. When git needs an encoding (showing a commit message, git
send-email / git format-patch, ...) UTF-8 is used by default.

This can be changed with two config options (from man git-config):

i18n.commitEncoding::
        Character encoding the commit messages are stored in; git itself
        does not care per se, but this information is necessary e.g. when
        importing commits from emails or in the gitk graphical history
        browser (and possibly at other places in the future or in other
        porcelains). See e.g. linkgit:git-mailinfo[1]. Defaults to 'utf-8'.

i18n.logOutputEncoding::
        Character encoding the commit messages are converted to when
        running 'git-log' and friends.

So, using UTF-8 for names and commit messages is straightforward.

The only problem appears when somebody uses another encoding; in that
case s/he should set i18n.commitEncoding before committing.
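
For somebody in that situation, a minimal sketch could look like the
following. The throwaway repository, the ISO-8859-1 choice, the author
identity and the commit message are only assumptions for illustration;
the two config keys are the ones quoted above.

```shell
# Throwaway repository, so no real configuration is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Declare the encoding used for commit messages in this repository,
# and ask for UTF-8 when the log is displayed.
git config i18n.commitEncoding ISO-8859-1
git config i18n.logOutputEncoding UTF-8

# Write a commit message in Latin-1 (octal 351 is e-acute) and commit it.
# The author identity here is a placeholder.
printf 'Caf\351 fix\n' > msg.txt
git -c user.name='Example Author' -c user.email='author@example.invalid' \
    commit -q --allow-empty -F msg.txt

# Read back the setting and look for an "encoding" header
# in the raw commit object.
commit_enc=$(git config i18n.commitEncoding)
enc_header=$(git cat-file commit HEAD | grep '^encoding')
echo "$enc_header"
```

If the setting took effect, the raw commit should carry an encoding
header, which is what lets other tools convert the message later.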

>
> See:
>
> http://mail.gnome.org/archives/gnome-infrastructure/2009-March/msg00037.html
>
> AFAIK format.headers only affects git send-email / git format-patch.

And git send-email / git format-patch should detect and add the
Content-Type automatically.
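
One way to check this, as a rough sketch: commit a message containing a
non-ASCII name in a scratch repository and look at the headers that
git format-patch emits. The repository, file name, identities and the
commit_as_example helper are all made up for the example, and
format.headers is deliberately left unset.

```shell
# Scratch repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical helper so the placeholder identity is not repeated.
commit_as_example() {
    git -c user.name='Example Author' \
        -c user.email='author@example.invalid' commit -q "$@"
}

# First commit: plain ASCII.
echo 'hello' > greeting.txt
git add greeting.txt
commit_as_example -m 'Add greeting'

# Second commit: the message body carries a UTF-8 name
# (octal 303 266 is o-umlaut in UTF-8).
printf 'Translate greeting\n\nSigned-off-by: J\303\266rg Example <joerg@example.invalid>\n' > msg.txt
echo 'hallo' > greeting.txt
git add greeting.txt
commit_as_example -F msg.txt

# Show only the MIME-related headers of the generated patch.
mime_headers=$(git format-patch -1 --stdout |
    grep -i -E '^(MIME-Version|Content-Type|Content-Transfer-Encoding):')
echo "$mime_headers"
```

If the Content-Type line shows up with a UTF-8 charset here, there is
no need to set format.headers by hand for this purpose.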

>
> I see no reason to avoid using UTF-8.

Me neither.

Santi

