Hello all,
I need some advice about how to use wcswidth on Ruby 1.9 strings.
wcswidth returns the column width of a string, i.e. how much of the
horizontal space on the screen a string takes up when printed to a
terminal. Being able to calculate column width is crucial for any
console-based program that deals with non-ASCII characters. For example,
Chinese characters typically take up two columns.
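To make the idea of column width concrete, here is a rough pure-Ruby sketch. It is not wcswidth: the range table is a small hand-picked assumption covering common CJK blocks, the names WIDE_RANGES, approx_char_width and approx_display_width are made up for illustration, and zero-width combining characters and control characters are not handled.

```ruby
# Hand-picked ranges of code points that terminals usually draw two columns
# wide. Illustrative assumption only; wcswidth(3) uses the system's own tables.
WIDE_RANGES = [
  0x1100..0x115F,   # Hangul Jamo initial consonants
  0x2E80..0x303E,   # CJK radicals, CJK symbols and punctuation
  0x3041..0x33FF,   # Hiragana, Katakana, CJK compatibility
  0x3400..0x4DBF,   # CJK Unified Ideographs Extension A
  0x4E00..0x9FFF,   # CJK Unified Ideographs
  0xAC00..0xD7A3,   # Hangul syllables
  0xF900..0xFAFF,   # CJK compatibility ideographs
  0xFF01..0xFF60,   # Fullwidth forms
  0xFFE0..0xFFE6    # Fullwidth signs
].freeze

# Width of one code point: 2 if it falls in a wide range, otherwise 1.
def approx_char_width(codepoint)
  WIDE_RANGES.any? { |range| range.cover?(codepoint) } ? 2 : 1
end

# Sum of the per-character widths, after transcoding to UTF-8 so that
# each_codepoint yields Unicode code points.
def approx_display_width(str)
  str.encode("UTF-8").each_codepoint.inject(0) do |total, cp|
    total + approx_char_width(cp)
  end
end
```

With this table, a string of two Chinese characters counts as four columns while its String#length is still 2, which is exactly the gap a console program has to account for.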
So, I’d like to write a gem that provides wcswidth to Ruby. I have a
simple version that works:
ruby-wcswidth.c · GitHub
Unfortunately, this code requires that the string’s encoding match your
LC_CTYPE, because mbstowcs relies on LC_CTYPE to decode the bytes into
wchar_t values. If the two don’t match, the conversion either fails
outright or, possibly, gives a wrong answer.
So, is there a mapping between Ruby string encodings and LC_CTYPE
values? If so, I could at least check that the conversion is done
correctly (or possibly even skip the mbstowcs step, if I know what
wchar_t is and what encoding corresponds to it; of course, what wchar_t
means is system-dependent).
Or is there a better way to do this than using mbstowcs?
Or heck, is there already a way of getting string column width in Ruby?
It seems like there should be, but I don’t see one.
Thanks!
2010/5/10 William M.:
> I need some advice about how to use wcswidth on Ruby 1.9 strings.
> Wcswidth returns the column width of a string, i.e. how much of the
> horizontal space on the screen a string takes up when printed to a
> terminal. Being able to calculate column width is crucial for any
> console-based program that deals with non-ASCII characters. For example,
> Chinese characters typically take up two columns.
I wrote such code in my terminfo binding.
> Unfortunately, this code requires that the string’s encoding match your
> LC_CTYPE, because mbstowcs relies on LC_CTYPE to produce wchar_t’s. If
> those don’t match, you’ll either get nothing, or, possibly, a wrong
> answer.
I used rb_locale_encoding() to convert to LC_CTYPE as follows:
    str = rb_str_encode(str, rb_enc_from_encoding(rb_locale_encoding()),
                        0, Qnil);
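For anyone who wants to try the same conversion without writing C, a Ruby-level sketch of that step might look like the following. Encoding.find("locale") is Ruby's name for the encoding derived from the process locale; the helper names to_locale_encoding and matches_locale? are invented here, and the replace options are an assumption about how unrepresentable characters should be handled (the C call above passes no such options).

```ruby
# Transcode a string to the encoding implied by the process locale (LC_CTYPE).
# Characters the locale cannot represent are replaced rather than raising;
# that policy is an assumption made for this sketch.
def to_locale_encoding(str)
  locale = Encoding.find("locale")
  str.encode(locale, invalid: :replace, undef: :replace)
end

# True when the string is already tagged with the locale's encoding, i.e.
# when handing its bytes to mbstowcs would at least be consistent.
def matches_locale?(str)
  str.encoding == Encoding.find("locale")
end
```

Encoding.locale_charmap returns the same information as a charmap name string, which is probably the closest thing to a mapping between LC_CTYPE and Ruby encodings.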
Reformatted excerpts from Tanaka A.'s message of 2010-05-09:
> I wrote such code in my terminfo binding.
> GitHub - akr/ruby-terminfo: terminfo binding for Ruby
Very nice. It works when I use your git version. But the latest gem
(v0.1.1) doesn’t seem to have wcswidth.
> I used rb_locale_encoding() to convert to LC_CTYPE as follows:
> str = rb_str_encode(str, rb_enc_from_encoding(rb_locale_encoding()), 0, Qnil);
Perfect, that’s what I was missing.
I need one other bit of functionality too: the ability to get a
substring of a specific display width. I assume that is beyond the scope
of your terminfo package, so I am probably going to wrap the two methods
up into a gem. But if you’re planning on adding such a thing to
ruby-terminfo, let me know, and I will just use that instead.
Thanks for your help!
Reformatted excerpts from William M.'s message of 2010-05-10:
> I need one other bit of functionality too: the ability to get a
> substring of a specific display width. I assume that is beyond the
> scope of your terminfo package, so I am probably going to wrap the two
> methods up into a gem.
I’ve finally pulled these together into a gem:
http://rubygems.org/gems/console
If anyone else is interested in writing i18n-capable console
applications, this should be pretty useful for you. Unfortunately I
haven’t quite gotten it to work with Ruby 1.8. Any help on that front
would be appreciated.
Blog post with a little more background:
http://all-thing.net/string-width