Michal S. wrote:
On 15/01/2008, Jochen H. wrote:
There might be pieces of text or other data in different languages.
And POSIX locale handles such situations poorly.
And you should remember that the application
is not just the interface.
Right, but human-readable text strings
will only occur in “the (user) interface”,
not in the number crunching part of the software, I assume,
just keeping to “good practices” …
It’s the mistake that the people designing POSIX locale did:
they forgot that the data has to live somewhere
before it gets to the user interface.
Right, and the user interface is the right place,
to convert the internal time value into a locale oriented string.
Just as in MVC: the View is the right place to deal with that stuff.
That’s what I like and appreciate about academic people:
they are not pragmatic, and they don’t need to be.
And it’s good that way. Seriously.
Dear Michal, you are right that software with multiple user interfaces,
all dealing with users in entirely different locales,
has to be dealt with.
BUT: back in real life … different instances of a single program can
very well deal each with a different locale – w/o conflict and
overlaps.
I suspect you misunderstand the way locale is currently handled.
It is not used by ruby, it is avoided.
Well, as long as MRI is written in C
and makes use of host libraries esp. like (g)libc,
“avoiding” is better replaced by “defaulting”,
no, it’s avoiding it. The locale was set to “C” in 1.8,
and now only LC_CTYPE is set.
Oh, and that’s not set to a value?
Matz wrote that this is done:
setlocale(LC_CTYPE, "");
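For reference, POSIX defines an empty locale string in setlocale() as "take the value from the environment", consulting LC_ALL first, then the category's own variable, then LANG. A minimal Ruby sketch of that lookup order follows. The helper name effective_locale is hypothetical, it does not call setlocale() itself, and the final fallback to "C" is an assumption, since POSIX leaves that default implementation-defined.

```ruby
# Hypothetical helper: mimics the lookup order POSIX describes for
# setlocale(category, ""). It only inspects a hash-like environment.
def effective_locale(env, category)
  ["LC_ALL", category, "LANG"].each do |name|
    value = env[name]
    # Unset and empty variables are both skipped.
    return value unless value.nil? || value.empty?
  end
  "C" # assumed fallback; POSIX leaves the default implementation-defined
end

# Show what the current environment would select for LC_CTYPE.
puts effective_locale(ENV, "LC_CTYPE")
```

Passing a plain Hash instead of ENV makes it easy to try out combinations such as LC_ALL overriding LANG.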
because there are language strings emitted by C library routines
and passed through to ruby class resp. object methods,
and it’s probably not too wrong
to assume these strings are “very similar” to en_US.
That is a locale de facto, isn’t it?!?
no, locale is about having the language specific stuff different at
different times. Before there was locale, everything was in the
implicit “C” locale, which was like en_US.
As mentioned above:
For me as a pragmatic programmer
it’s quite sufficient that an instance of a program
initially acquires a single locale and keeps that for its life-time
and that’s it.
No changes in the meantime.
The only recent change was adding the
setlocale() call in the ruby interpreter
that is required for some
extensions to work properly in different locales.
Extensions built on assumptions like “you always work with en_US”,
of course, that simplifies life tremendously,
but you don’t really want to discuss this bad idea with me, right?
No, extensions that use libraries built around the assumption that
LC_CTYPE specifies the correct character classes, which the “C” locale
does not in most cases.
After googling a while for ruby, rails, and locale,
my impression was very much different,
but I am running out of time right now
and you can do the query yourself,
and I prefer to give in here.
Just a single and last example:
$ env LC_ALL=fr_FR /usr/local/ruby1.9/bin/ruby \
    -e 't = Time.now; puts t.strftime("%A")'
Thursday
$ env LC_ALL=fr_FR date '+%A'
jeudi
The second thing is exactly what a simple UNIX programmer expects,
the first one is r***ish.
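Until the interpreter passes LC_TIME through, the day name has to be localized in the presentation layer. A minimal sketch of that workaround, assuming a hand-written lookup table: DAY_NAMES and localized_day_name are hypothetical names, not part of Ruby, and a real application would load the names from proper locale data.

```ruby
# Hypothetical lookup table, indexed by Time#wday (0 = Sunday).
DAY_NAMES = {
  "fr" => %w[dimanche lundi mardi mercredi jeudi vendredi samedi],
  "de" => %w[Sonntag Montag Dienstag Mittwoch Donnerstag Freitag Samstag]
}.freeze

# Returns the weekday name for the language part of a locale such as
# "fr_FR", falling back to the English name from Time#strftime.
def localized_day_name(time, locale)
  language = locale.to_s[/\A[a-z]+/]
  names = DAY_NAMES[language]
  names ? names[time.wday] : time.strftime("%A")
end

t = Time.utc(2008, 1, 17)            # a Thursday
puts localized_day_name(t, "fr_FR")  # jeudi
puts localized_day_name(t, "C")      # Thursday
```

The locale string could come from ENV["LC_ALL"] or similar, which keeps the internal Time value untouched and confines the translation to the output step.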
So, now that’s not twisting with environment variables, right?
I am sorry,
but this should get fixed,
and then we can proceed discussing the matter.
- passing (g)libc capabilities straight through,
where not too much speaks against it
The libc is never used directly,
because it does not operate on the same data types as ruby does.
You could make wrappers but it’s always some addition.
And that implies it’s better to reinvent the wheel
instead of using available basic middleware?
- implement nicer locale capabilities,
when there are any coding resources left
What capabilities do you need exactly?
I would be entirely satisfied,
if (g)libc’s locale capabilities would get passed through
unchanged, unfiltered, untwisted, un-… (whatsoever).
I think, I have made my point clear by now.
You seem unhappy with my simple POSIX approach
and you request “multi-threaded” capabilities.
Let’s not get confused here!
You do see, how people struggle for I18N in rails apps in a desperate
and strange way,
Pls go to http://en.wikipedia.org/wiki/I18n
and read up, what’s implied with I18N.
Not just multibyte encodings, but also date/time formats, …
That’s what I keep referring to.
Afaik they struggle with multibyte encodings
which 1.9 should make easier.
This has nothing to do with locale.
Obviously there are locales,
that enforce the availability of multibyte encodings,
but … let’s not get carried away!
just because the available locale capabilities,
which would be easiest to pass through from the interpreter’s runtime system,
are twisted suboptimally?
There are none to be twisted.
As you can see again from the date example,
something gets twisted in the inner life of MRI.
Again: it’s already there, just set it free!
What is there?
(g)libc and its locale capabilities.
Could you please point at the capabilities
in the code that are hidden?
Setting the locale to a zero-length string,
even though the environment variable might specify a different setting.
That is clearly something
that defeats the user’s intent.
I think I answered all questions patiently, seriously, and beyond …
Kind regards,
J.