Erlkönig: Per-User Locale Definitions

Create and use a locale without root access.

Update

2019-09-08: Thunderbird version 60.8.0 breaks all this, and even downgrading to 52.7.0 leaves dates broken, possibly due to something stuck in the ~/.mozilla-thunderbird/ dir. For more info, see https://bugzilla.mozilla.org/show_bug.cgi?id=1426907 (Basically, the Thunderbird team ditched a bunch of unmaintained code that may have included, say, Unix LC_TIME support, in favor of another approach, CLDR, that may or may not be harder to customise. Approaches are being discussed.)

Overview

Here's how a non-root user on a Linux system can create, update, and use a personal locale definition. This is useful for getting ISO-8601 time and date formats to be used for date(1) and strftime(3), and in applications like Thunderbird that don't have the sense to allow users to set a strftime-like date/time format themselves.

Changing all the month and day names is straightforward, although not shown here. Note that limitations of the localization routines block deep changes such as implementing a lunar model or the more seasonal hobbit calendar.

If the steps below are followed, using the arbitrarily-created locale name en_US@myISO, the following will display changed date formats in Thunderbird:

$ LOCPATH=~/lib/locale LC_ALL=en_US@myISO thunderbird &
$ 
[Screenshots: Thunderbird without localedef (before) and Thunderbird *with* localedef (after)]

It's easy to make a ~/bin/thunderbird-iso script:

#!/bin/sh
LOCPATH=$HOME/lib/locale LC_ALL=en_US@myISO exec thunderbird "$@"
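
The same idea can be generalized to any program. The following is only a sketch: the function name run_iso and the override variables ISO_LOCPATH and ISO_LOCALE are made up for this illustration, and the defaults simply mirror the paths used on this page. It falls back to running the command unchanged when the compiled locale directory is missing.

```shell
#!/bin/sh
# Hypothetical helper: run a command under the personal locale if it exists.
# ISO_LOCPATH and ISO_LOCALE are invented override knobs, not standard variables.
run_iso() {
    locdir="${ISO_LOCPATH:-$HOME/lib/locale}"
    locname="${ISO_LOCALE:-en_US@myISO}"
    if [ -d "$locdir/$locname" ]; then
        # env(1) sets the two variables for this one command only.
        env LOCPATH="$locdir" LC_ALL="$locname" "$@"
    else
        echo "run_iso: $locdir/$locname not found, running without it" >&2
        "$@"
    fi
}

# Example use inside a wrapper script:
#   run_iso thunderbird "$@"
```

Unlike the two-line script above, this does not exec, so the shell stays around until the program exits; that is a trade-off of keeping the fallback branch simple.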

Basic Elements

WARNING: This is an unrefined, first-pass hack at getting control of the default date text formats without being root. Use at your own risk. What risk? Making this your account default will cause programs you run to use the new format, with a faint chance of adverse side effects if one program's human-formatted dates are poorly parsed by another program. This can be especially quirky if you tend to use su without the - option and/or have locally-written scripts that attempt to parse loosely-specced dates. However, many programs expecting dates to be read by other programs will write the epoch time into their output, which isn't subject to such errors.
  1. Ensure that you have the localedef(1) utility on your system:

    $ type localedef
    localedef is hashed (/usr/bin/localedef)
    $ 
  2. Verify that your libc supports the LOCPATH environment variable:

    $ grep LOCPATH /usr/lib/libc.a
    Binary file /usr/lib/libc.a matches
    $ 
  3. Select a name for your new locale that doesn't collide with any existing ones (the output here is abbreviated):

    $ locale -a    
    C
    POSIX
    en_US.utf8
    $ 

    This example will assume your new locale is named en_US@myISO; you can easily choose a locale name other than myISO.

  4. Make a directory for your locale:

    $ mkdir -p ~/lib/locale
    $ cd ~/lib/locale
    $ 
  5. Locate some baseline to start from:

    $ cp /usr/share/i18n/locales/en_US ~/lib/locale/en_US@myISO.def
    $ 
  6. Test-compile your clone (still in ~/lib/locale). Note that localedef's output argument does need to be a pathname (a full pathname is the safe choice, since the argument must contain a slash) unless you will/can install into the system locales area.

    $ localedef ~/lib/locale/en_US@myISO < en_US@myISO.def
    $ ls -FCas en_US@myISO
    total 284
      4 ./          220 LC_CTYPE             4 LC_MONETARY    4 LC_TELEPHONE
      4 ../           4 LC_IDENTIFICATION    4 LC_NAME        4 LC_TIME
      4 LC_ADDRESS    4 LC_MEASUREMENT       4 LC_NUMERIC
     16 LC_COLLATE    4 LC_MESSAGES/         4 LC_PAPER
    $ 
  7. Test it with date(1) (the output below is abbreviated):

    $ LOCPATH=~/lib/locale LC_ALL=en_US@myISO strace date 2>&1 | grep myISO
    open("…/lib/locale/en_US@myISO/LC_IDENTIFICATION", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_MEASUREMENT", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_TELEPHONE", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_ADDRESS", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_NAME", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_PAPER", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_MESSAGES", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_MONETARY", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_COLLATE", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_TIME", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_NUMERIC", O_RDONLY) = 3
    open("…/lib/locale/en_US@myISO/LC_CTYPE", O_RDONLY) = 3
    $ 

    If you don't see the open() lines, or they show a negative return value (such as -1) on the far right, then LOCPATH or LC_ALL isn't working.

  8. Now you can begin editing, probably in the en_US@myISO.def file. Have a look at the example file at the end of this document, particularly the entry for date_fmt of "%Y-%m-%d %H:%M:%S %Z %a". Note that the locale files tend to use encoded forms of the text strings, with each character becoming its hex code point in the form <Uxxxx>. For example, a space can be written <U0020>. I noticed that this encoding can be skipped, just putting the strftime strings in literally, but it seemed easier to simply restore the encoding once the string values were tested than to figure out whether some benefit only came with the special encoding. To convert, using text as an example:

    ISO-8859-1 to Uxxxx

    $ text2u () {
       perl -e 'while(<>) {
          chomp;
          map { printf("<U%04X>", ord($_)); } split("");
          print("\n");
       }';
    }
    $ echo text | text2u
    <U0074><U0065><U0078><U0074>
    $ 

    Uxxxx to ISO-8859-1

    $ u2text () {
       perl -e 'while(<>) {
          s/[<>U]/ /g;
          map { printf("%c", hex($_)) ; } split(" ");
          print("\n");
       }';
    }
    $ echo '<U0074><U0065><U0078><U0074>' | u2text
    text
    $ 

    To find out which % codes to use, read the manpage on strftime(3). Note that slashes are not literal in the Uxxxx-coded parts: the example file sets escape_char to /, so a slash at the end of a line is simply a line continuation character.

  9. If you want to apply the results to all of your programs, rather than to particular ones through scripts like thunderbird-iso, you can. I don't really recommend this, because it seems improper to use LC_ALL instead of the more focussed LC_TIME, and this example seems to be lacking something that would allow the finer approach. Regardless, to apply this to your whole environment, add LOCPATH and LC_ALL to whichever login dotfile is appropriate for you, probably ~/.bash_profile (your ~/.bashrc should be able to rely on some parent process having already read that file for its environment variables). This can be complicated in X environments, where it can be challenging to locate the correct startup dotfile, which should itself read in ~/.bash_profile or an equivalent. Likely candidates are ~/.xsession, ~/.xprofile, etc. If your ~/.bash_profile simply reads your ~/.profile, and if Bourne shell compatibility matters, you can put this in the latter:

    LOCPATH=$HOME/lib/locale ; export LOCPATH ; LC_ALL=en_US@myISO ; export LC_ALL
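
Steps 4 through 6 above can be collapsed into one re-runnable function. This is a rough sketch, not a tested installer: the function name build_user_locale and its three optional arguments are invented here, and the defaults are just the paths used in the steps. It copies the baseline only once so that later edits to the .def file are not overwritten, and it judges success by whether a fresh LC_TIME file appears, since localedef may return a non-zero status for mere warnings.

```shell
#!/bin/sh
# Hypothetical helper covering steps 4-6.
#   $1  baseline source file  (default /usr/share/i18n/locales/en_US)
#   $2  personal locale root  (default ~/lib/locale)
#   $3  new locale name       (default en_US@myISO)
build_user_locale() {
    src="${1:-/usr/share/i18n/locales/en_US}"
    root="${2:-$HOME/lib/locale}"
    name="${3:-en_US@myISO}"
    def="$root/$name.def"

    command -v localedef >/dev/null 2>&1 || {
        echo "build_user_locale: localedef not found" >&2
        return 1
    }

    mkdir -p "$root" || return 1

    # Copy the baseline only if no .def exists yet, to preserve edits.
    if [ ! -f "$def" ]; then
        cp "$src" "$def" || return 1
    fi

    # Drop any stale result so the final check reflects this compile.
    rm -f "$root/$name/LC_TIME"

    # The output argument contains a slash, so the result lands under
    # $root rather than in the system locale area.
    localedef "$root/$name" < "$def"

    # Succeed only if the compiled time category was (re)created.
    [ -f "$root/$name/LC_TIME" ]
}
```

Running build_user_locale with no arguments after each edit to ~/lib/locale/en_US@myISO.def recompiles the locale in place; the strace check from step 7 still applies afterwards.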

The Example

I created a ~/lib/locale/en_US@myISO.def file, so that dates would appear in ISO-8601 format, which looked like the below during testing. I didn't go hardcore with the standard-compliant interstitial T, as in 2011-01-10T06:21:11-06, but the result is readable and satisfies my criterion for easy and reliable sorting, even with the dayname abbreviation included. It would be nice if date(1) had a specific option to request the combined ISO-8601 version, instead of needing to write out date +%FT%T%z (or, with this locale, +%xT%X%z).

Normally the outputs for these resemble:

+%X      06:21:01 AM
+%x      10/01/2011
+%r      06:21:06 AM
no args  Sat Jan 1 06:21:11 CST 2011

With the new locale:
$ LOCPATH=~/lib/locale LC_ALL=en_US@myISO date +%X
06:21:01
$ LOCPATH=~/lib/locale LC_ALL=en_US@myISO date +%x
2011-01-10
$ LOCPATH=~/lib/locale LC_ALL=en_US@myISO date +%r
06:21:06 AM
$ LOCPATH=~/lib/locale LC_ALL=en_US@myISO date
2011-01-10 06:21:11 CST Mon
$ 
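
The sorting claim is easy to check: because the year-first fields come before everything else, a plain byte-wise sort puts lines in chronological order, and the trailing dayname never gets a chance to matter. The timestamps below are made-up sample values in a single time zone (mixing zones would break the ordering).

```shell
#!/bin/sh
# Sample timestamps in the date_fmt layout from this page (invented values).
sorted=$(printf '%s\n' \
    '2011-01-10 06:21:11 CST Mon' \
    '2010-12-31 23:59:59 CST Fri' \
    '2011-01-09 18:00:00 CST Sun' | LC_ALL=C sort)

printf '%s\n' "$sorted"
# prints:
#   2010-12-31 23:59:59 CST Fri
#   2011-01-09 18:00:00 CST Sun
#   2011-01-10 06:21:11 CST Mon
```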

My en_US@myISO.def file follows. The LC_TIME section is where most of the mods are, in the settings for d_t_fmt, d_fmt, and date_fmt, in order to set ISO-8601 year-first date forms.

This file could certainly be shorter, since other sections can be included by reference, as seen in the LC_COLLATE section. If this were only going to be used to modify, say, the Thunderbird date/time format, then just doing the time section explicitly (with the others by reference) and running Thunderbird via a wrapper script containing something like LC_TIME=en_US@myISO thunderbird "$@" would make sense. However, attempts to use it this way failed in my experiments, with only LC_ALL=en_US@myISO working as desired.
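
To see which variable actually takes effect on a given system, a small comparison like the following may help. It is a diagnostic sketch only (the function name compare_locale_vars is invented here), and it does not explain the failure described above. One thing it does rule out is an LC_ALL that is already set in the environment, since LC_ALL takes precedence over LC_TIME; the middle case therefore unsets it first.

```shell
#!/bin/sh
# Hypothetical diagnostic: show how date(1) renders %x in three environments.
#   $1  locale name  (default en_US@myISO)
#   $2  LOCPATH dir  (default ~/lib/locale)
compare_locale_vars() {
    loc="${1:-en_US@myISO}"
    dir="${2:-$HOME/lib/locale}"
    # Whatever the current environment produces.
    printf 'current: %s\n' "$(date +%x)"
    # LC_TIME only; LC_ALL is unset inside the command substitution's
    # subshell so that it cannot override LC_TIME.
    printf 'LC_TIME: %s\n' "$(unset LC_ALL; LOCPATH="$dir" LC_TIME="$loc" date +%x)"
    # The LC_ALL form used throughout this page.
    printf 'LC_ALL:  %s\n' "$(LOCPATH="$dir" LC_ALL="$loc" date +%x)"
}

compare_locale_vars
```

If the LC_TIME line shows the year-first form, the narrower setting works on that system and the wrapper script could use it instead of LC_ALL.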

It seems weird that date(1) gets a special date_fmt entry here, since in contrast there are files like /usr/share/locale-langpack/en_US@piglatin/LC_MESSAGES/inkscape.mo. It seems like there should be an option to use something like ~/lib/locale/en_US@myISO/LC_TIME/thunderbird.mo. Of course, that would probably require Thunderbird to do something to activate it, which goes totally against the evident current spirit of that app. Besides, being able to configure strftime strings through Thunderbird's own interface would be preferable, since almost no users will ever create their own locale files.

Lastly, it would be nice to set the two values for date_fmt and d_t_fmt differently, with one being more comfortably readable and one being the fullbore ISO-format-with-a-T variant. Probably date_fmt should be the human-friendly one and d_t_fmt the standard-conformant one, but that isn't quite certain yet, so instead they both sit in a middle ground.
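
For anyone who wants to try that split, the two entries could be generated as below. This is a sketch under assumptions: text2u_sh is a pure-shell stand-in written for this example (ASCII input only), and the particular T-variant string is just one plausible choice that has not been tested against Thunderbird or anything else.

```shell
#!/bin/sh
# Hypothetical pure-shell encoder: ASCII text to <Uxxxx> form.
text2u_sh() {
    s=$1
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        # A leading quote makes printf emit the character's numeric code.
        printf '<U%04X>' "'$c"
        s=$rest
    done
    printf '\n'
}

# Standard-conformant candidate for d_t_fmt, friendlier one for date_fmt.
printf 'd_t_fmt  "%s"\n' "$(text2u_sh '%Y-%m-%dT%H:%M:%S%z')"
printf 'date_fmt "%s"\n' "$(text2u_sh '%Y-%m-%d %H:%M:%S %Z %a')"
```

The output lines are long; in the .def file they can be broken with a trailing / as in the example below, given its escape_char setting.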

Hopefully the following will be useful regardless:

escape_char /
comment_char %
% Locale for English locale in the USA with ISO-8601 LC_TIME
% Contributed by North-Keys [ erlkonig (a t) talisman.org ], 2011

% process with:  localedef <targetdir> < <deffile>

LC_IDENTIFICATION
title      "English locale for the USA with ISO-8601 LC_TIME"
source     "North-Keys"
address    ""
contact    ""
email      ""
tel        ""
fax        ""
language   "English"
territory  "USA"
revision   "1.0"
date       "2011-01-10"
%
category  "en_US:2000";LC_IDENTIFICATION
category  "en_US:2000";LC_CTYPE
category  "en_US:2000";LC_COLLATE
category  "en_US:2000";LC_TIME
category  "en_US:2000";LC_NUMERIC
category  "en_US:2000";LC_MONETARY
category  "en_US:2000";LC_MESSAGES
category  "en_US:2000";LC_PAPER
category  "en_US:2000";LC_NAME
category  "en_US:2000";LC_ADDRESS
category  "en_US:2000";LC_TELEPHONE

END LC_IDENTIFICATION

LC_CTYPE
copy "en_GB"
END LC_CTYPE

LC_COLLATE

% Copy the template from ISO/IEC 14651
copy "iso14651_t1"

END LC_COLLATE

LC_MONETARY
int_curr_symbol     "<U0055><U0053><U0044><U0020>"
currency_symbol     "<U0024>"
mon_decimal_point   "<U002E>"
mon_thousands_sep   "<U002C>"
mon_grouping        3;3
positive_sign       ""
negative_sign       "<U002D>"
int_frac_digits     2
frac_digits         2
p_cs_precedes       1
int_p_sep_by_space  1
p_sep_by_space      0
n_cs_precedes       1
int_n_sep_by_space  1
n_sep_by_space      0
p_sign_posn         1
n_sign_posn         1
%
END LC_MONETARY

LC_NUMERIC
decimal_point   "<U002E>"
thousands_sep   "<U002C>"
grouping        3;3
END LC_NUMERIC

LC_TIME
abday	"<U0053><U0075><U006E>";"<U004D><U006F><U006E>";/
	"<U0054><U0075><U0065>";"<U0057><U0065><U0064>";/
	"<U0054><U0068><U0075>";"<U0046><U0072><U0069>";/
	"<U0053><U0061><U0074>"
day	"<U0053><U0075><U006E><U0064><U0061><U0079>";/
	"<U004D><U006F><U006E><U0064><U0061><U0079>";/
	"<U0054><U0075><U0065><U0073><U0064><U0061><U0079>";/
	"<U0057><U0065><U0064><U006E><U0065><U0073><U0064><U0061><U0079>";/
	"<U0054><U0068><U0075><U0072><U0073><U0064><U0061><U0079>";/
	"<U0046><U0072><U0069><U0064><U0061><U0079>";/
	"<U0053><U0061><U0074><U0075><U0072><U0064><U0061><U0079>"

week    7;19971130;7
first_weekday	1
first_workday	2
abmon	"<U004A><U0061><U006E>";"<U0046><U0065><U0062>";/
	"<U004D><U0061><U0072>";"<U0041><U0070><U0072>";/
	"<U004D><U0061><U0079>";"<U004A><U0075><U006E>";/
	"<U004A><U0075><U006C>";"<U0041><U0075><U0067>";/
	"<U0053><U0065><U0070>";"<U004F><U0063><U0074>";/
	"<U004E><U006F><U0076>";"<U0044><U0065><U0063>"
mon	"<U004A><U0061><U006E><U0075><U0061><U0072><U0079>";/
	"<U0046><U0065><U0062><U0072><U0075><U0061><U0072><U0079>";/
	"<U004D><U0061><U0072><U0063><U0068>";/
	"<U0041><U0070><U0072><U0069><U006C>";/
	"<U004D><U0061><U0079>";/
	"<U004A><U0075><U006E><U0065>";/
	"<U004A><U0075><U006C><U0079>";/
	"<U0041><U0075><U0067><U0075><U0073><U0074>";/
	"<U0053><U0065><U0070><U0074><U0065><U006D><U0062><U0065><U0072>";/
	"<U004F><U0063><U0074><U006F><U0062><U0065><U0072>";/
	"<U004E><U006F><U0076><U0065><U006D><U0062><U0065><U0072>";/
	"<U0044><U0065><U0063><U0065><U006D><U0062><U0065><U0072>"

%                                              2011-01-10 05:13:52 CST Mon
% Appropriate date and time representation (%c) "%Y-%m-%d %H:%M:%S %Z %a"
d_t_fmt "<U0025><U0059><U002D><U0025><U006D><U002D><U0025><U0064>/
<U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/
<U0025><U005A><U0020><U0025><U0061>"

% Appropriate date representation (%x)  "%Y-%m-%d"
d_fmt   "<U0025><U0059><U002D><U0025><U006D><U002D><U0025><U0064>"

% Appropriate time representation (%X)  "%H:%M:%S"
t_fmt   "<U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053>"

% Appropriate AM/PM time representation (%r)  "%I:%M:%S %p"
t_fmt_ampm "<U0025><U0049><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/
<U0025><U0070>"

% Strings for AM/PM
am_pm	"<U0041><U004D>";"<U0050><U004D>"

%                                            2011-01-10 05:13:52 CST Mon
% Appropriate date representation (date(1))   "%Y-%m-%d %H:%M:%S %Z %a"
date_fmt "<U0025><U0059><U002D><U0025><U006D><U002D><U0025><U0064>/
<U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/
<U0025><U005A><U0020><U0025><U0061>"
END LC_TIME

LC_MESSAGES
yesexpr "<U005E><U005B><U0079><U0059><U005D><U002E><U002A>"
noexpr  "<U005E><U005B><U006E><U004E><U005D><U002E><U002A>"
yesstr  "<U0059><U0065><U0073>"
nostr   "<U004E><U006F>"
END LC_MESSAGES

LC_PAPER
height   279
width    216
END LC_PAPER

LC_NAME
name_fmt    "<U0025><U0064><U0025><U0074><U0025><U0067><U0025><U0074>/
<U0025><U006D><U0025><U0074><U0025><U0066>"
name_miss   "<U004D><U0069><U0073><U0073><U002E>"
name_mr     "<U004D><U0072><U002E>"
name_mrs    "<U004D><U0072><U0073><U002E>"
name_ms     "<U004D><U0073><U002E>"
END LC_NAME

LC_ADDRESS
postal_fmt    "<U0025><U0061><U0025><U004E><U0025><U0066><U0025><U004E>/
<U0025><U0064><U0025><U004E><U0025><U0062><U0025><U004E><U0025><U0068>/
<U0020><U0025><U0073><U0020><U0025><U0065><U0020><U0025><U0072><U0025>/
<U004E><U0025><U0054><U002C><U0020><U0025><U0053><U0020><U0025><U007A><U0025>/
<U004E><U0025><U0063><U0025><U004E>"
country_name  "<U0055><U0053><U0041>"
country_post  "<U0055><U0053><U0041>"
country_ab2   "<U0055><U0053>"
country_ab3   "<U0055><U0053><U0041>"
country_num   840
country_car   "<U0055><U0053><U0041>"
country_isbn  0
lang_name     "<U0045><U006E><U0067><U006C><U0069><U0073><U0068>"
lang_ab       "<U0065><U006E>"
lang_term     "<U0065><U006E><U0067>"
lang_lib      "<U0065><U006E><U0067>"
END LC_ADDRESS

LC_TELEPHONE
tel_int_fmt    "<U002B><U0025><U0063><U0020><U0028><U0025><U0061><U0029>/
<U0020><U0025><U006C>"
tel_dom_fmt    "<U0028><U0025><U0061><U0029><U0020><U0025><U006C>"
int_select     "<U0031><U0031>"
int_prefix     "<U0031>"
END LC_TELEPHONE

LC_MEASUREMENT
measurement    2
END LC_MEASUREMENT
alexsiodhe, alex north-keys