Ruby’s not ready - an in-depth essay

Err, to point out a fact:

Ruby itself does not support Unicode, in the way that Intel quad core
processors aren’t really quad core.

Those are true statements, but they are misleading.

Ruby can mimic enough Unicode to get by in the areas where you need
it. Otherwise the original Japanese author probably wouldn’t have
been able to use it quite as much, and then it wouldn’t have caught on
in Japan, and wouldn’t have moved over to the US and Europe.

Heck, the fact that you can type this into vi and the two output lines
are the same should be enough to convince anyone.
#!/usr/bin/ruby
# an example shamelessly pulled from a ruby mailing list question
a = "\xD7\x90"
b = "א"
puts a
puts b
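
The two lines match because the escape sequence and the typed character are the same two bytes, the UTF-8 encoding of U+05D0. A minimal sketch of that check, assuming Ruby 1.9 or later and a UTF-8 terminal (the variable `same_bytes` is only for illustration):

```ruby
# "\xD7\x90" spells out the two UTF-8 bytes of the Hebrew letter alef (U+05D0).
a = "\xD7\x90"
# "\u05D0" names the same character by code point.
b = "\u05D0"

# Compare the raw bytes, which does not depend on how each string's encoding is labelled.
same_bytes = a.bytes.to_a == b.bytes.to_a

puts a
puts b
puts same_bytes
```

If the terminal is not set to UTF-8 the two printed lines may look wrong, but they will still be identical to each other.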

Probably the fact that ruby’s lack of Unicode support still lets you
do this is why there hasn’t been more of a push for full blown
Unicode.

–Kyle

2008/4/7 Song Ma:

F.Y.I

Not sure if you guys have read this article, I am going to re-post it here.

Ruby's not ready | glyphobet • глыфобет • γλυφοβετ

The only good point in that whole mess is that the current state of
docs for Ruby is poor, and could use a lot of love.

  • Rob

Bill K., 08.04.2008 14:51:

Anyway, ruby 1.8 does have usable UTF-8 support “out of the box.” (See also
post #9 in that thread by Matz talking about 1.8.)

Hmm. After reading the thread, I simply tried:

puts "öäü".length

and it returns 6 if the source file is saved with UTF8, which is plain
wrong.
(it returns 3 if saved in ISO-8859-1 encoding).

String#size returns the same values.

In case your mail reader does not display the string correctly - it’s:
öäü

Thomas

TMTOWTDI is bad

In other words, I’ll be objective, so long as objective means judging
things according to my own lack of imagination.

Ok, don’t kill me for this, and I do disagree with most of the article
(I personally found Ruby more coherent and therefore simple, and I see
many things are improving (speed, regexp, etc…) so I am happy with
that).

However I did sympathise with the author’s comment on having too many
ways to do the same thing. They pretty much mirror my feelings: that
you either learn them all (in which case you lost in simplicity) or
you’ll have a very hard time reading other people’s code. I admit
though that it makes writing code easier.

I know however that the Ruby community is strongly in favour of this
“feature”, so I was wondering why.

Diego

On 08/04/2008, Thomas K. wrote:

and it returns 6 if the source file is saved with UTF8, which is plain
wrong. (it returns 3 if saved in ISO-8859-1 encoding).
String#size returns the same values.

In case your mail reader does not display the string correctly - it’s:
öäü

It’s completely correct. length in 1.8 means number_of_bytes. You can
get 3 by using regexps in utf-8 or a special extension which is
mentioned in many threads on Unicode as well.
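
Both numbers are easy to reproduce side by side. A minimal sketch, assuming Ruby 1.9 or later and a UTF-8 string; the variable names are only for illustration:

```ruby
# "öäü", written with escapes so the source file's own encoding does not matter.
s = "\u00F6\u00E4\u00FC"

byte_count   = s.bytesize           # number of bytes in the UTF-8 encoding
regexp_count = s.scan(/./mu).size   # one match per character with a UTF-8 regexp
unpack_count = s.unpack("U*").size  # one integer per Unicode code point

puts byte_count    # 6
puts regexp_count  # 3
puts unpack_count  # 3
```

On 1.9 and later `length` itself counts characters, so the byte count has to be asked for explicitly with `bytesize`.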

Thanks

Michal

Rob S. wrote:

2008/4/7 Song Ma:

F.Y.I

Not sure if you guys have read this article, I am going to re-post it here.

Ruby's not ready | glyphobet • глыфобет • γλυφοβετ

The only good point in that whole mess is that the current state of
docs for Ruby is poor, and could use a lot of love.

Well … I guess that depends on which docs you are talking about.
There’s plenty of documentation on Rails, three of the major GUIs –
Shoes, FXRuby and QtRuby – have books in “print” on them, there are two
major Ruby “cookbooks”, the documentation on Ruport and RSpec is
excellent, etc.

A week or so ago when the Ruby Mendicant was considering working on the
docs, I expressed the opinion that the documentation is the
responsibility of the creator – someone shouldn’t have to do it for
them. Now if the creator is a better coder than tech writer, perhaps the
project can take on someone. But my experience has been that it’s very
rare for someone to be an excellent coder and a poor tech writer.

I read the essay and all of the posts about it here so far, and my own
personal opinion is:

  1. Everything he said has been said before – it’s basically a rehash of
    old criticisms.

  2. My main concern is not with the documentation. My main concern is
    that both the syntax and semantics of the language seem to be more fluid
    than “pragmatic” considerations would dictate. I more or less grew up
    with FORTRAN, although I missed FORTRAN I. So I’ll use its evolution as
    an example.

Ten years into its evolution, an ANSI committee was formed to
standardize the language. Users and vendors sat around a huge table and
thrashed out what would break the least code, what was easy to
implement, what kinds of programs people wanted to write in the language
that they couldn’t, etc. The result was FORTRAN 66. 11 years later there
was FORTRAN 77, etc.

Now FORTRAN is 50 years old, there’s a FORTRAN 95 standard, and the
language is still in use (I think – I haven’t written any since 1990).
Ruby is a tad older than ten years, and I think maybe it’s time for some
standardization on the syntax and semantics.

I think there are enough “killer apps” now that we know what we can’t
take out of Ruby without breaking Rails, RSpec, Ruport, etc. And from
MRI, KRI, jRuby and Rubinius, I think we know what’s easy to implement
and what isn’t. But what I have no clues about is what programs people
want to write in Ruby that they can’t write now.


| However I did sympathise with the author’s comment on having too many
| ways to do the same thing. They pretty much mirror my feelings: that
| you either learn them all (in which case you lost in simplicity) or
| you’ll have a very hard time reading other people’s code. I admit
| though that it makes writing code easier.
|
| I know however that the Ruby community is strongly in favour of this
| “feature”, so I was wondering why.

Learning R. as my first language, I can empathize with this issue.

But: The TMTOWTDI approach helps with productivity. A lot. I switch
between the different idioms and methods as the situation requires.
Sometimes it is more natural to use #match, sometimes it is more natural
to use #scan, depending on the situation.
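
As a small illustration of that choice (the sample text and pattern are made up): `match` is the natural fit when only the first hit matters, `scan` when every hit is wanted.

```ruby
text    = "ruby 1.8.6 and ruby 1.9.0"   # made-up sample input
version = /\d+\.\d+\.\d+/

first = text.match(version)   # MatchData for the first hit, or nil if none
all   = text.scan(version)    # Array with every hit

puts first[0]      # 1.8.6
puts all.inspect   # ["1.8.6", "1.9.0"]
```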

The only problem I’ve had is the less than optimal documentation for
Ruby, so it is rather difficult to find what a method does or,
sometimes, where it comes from.

And let’s face it: Most code in the world is write-only: It’s written
once, and becomes instant legacy code. How often is code completely
revisited, and has to be re-factored in an amount of time that makes it
impossible to wrap one’s head around it again? Not much, and it doesn’t
happen often (mostly in languages that are used in the enterprise, where
refactoring tools are available, and IDEs are more useful than with Ruby
at the moment).

Besides, it helps with problem solving, I’ve noticed, the more Ruby I
pick up. It’s a matter of using the right tool for the right job within
the language itself.

However, I’ve not yet formalized my IT knowledge, so take it with a
grain of salt. My view is an opinion. Treat it as such.


Phillip G.
Twitter: twitter.com/cynicalryan

Write first in an easy-to-understand pseudo-language; then translate into
whatever language you have to use.
~ - The Elements of Programming Style (Kernighan & Plauger)

On Tue, 2008-04-08 at 23:20 +0900, Diego V. wrote:

However I did sympathise with the author’s comment on having too many
ways to do the same thing. They pretty much mirror my feelings: that
you either learn them all (in which case you lost in simplicity) or
you’ll have a very hard time reading other people’s code. I admit
though that it makes writing code easier.

I know however that the Ruby community is strongly in favour of this
“feature”, so I was wondering why.

You answered your own question here. It makes writing code a joy. And,
unlike, say, Perl (or even more extremely, K), Ruby isn’t quite a
write-only programming language. (If you use all the idiot Perlisms you
can make it that way, but almost nobody uses those thankfully!)

Ruby is by no means a perfect language. But as a former Pythonista (I
started with Python at v1.3), Ruby, despite its warts (and this includes
the UNICODE issue, the lousy performance, the bad threading model and
the whole host of other things people have ranted about for ages now)
remains my first and favourite language I reach for when I’m starting a
project. As things move along in the project I reach for other
languages to supplement or replace it (recently Erlang has grabbed my
imagination for certain key application elements), but Ruby’s my first
choice and is usually in the final product in some form or another.


M. Edward (Ed) Borasky wrote:
| Well … I guess that depends on which docs you are talking about.
| There’s plenty of documentation on Rails, three of the major GUIs –
| Shoes, FXRuby and QtRuby – have books in “print” on them, there are two
| major Ruby “cookbooks”, the documentation on Ruport and RSpec is
| excellent, etc.

I shouldn’t have to buy a book (like the PickAxe), to get decent
documentation. A book should be one option, among many, to get to the
documentation. I’ll elaborate on that in the next paragraph.

| A week or so ago when the Ruby Mendicant was considering working on the
| docs, I expressed the opinion that the documentation is the
| responsibility of the creator – someone shouldn’t have to do it for
| them. Now if the creator is a better coder than tech writer, perhaps the
| project can take on someone. But my experience has been that it’s very
| rare for someone to be an excellent coder and a poor tech writer.

While I agree that the project is responsible for its own documentation
(it’s rather obvious, is it not?), the tools we have in the Ruby
community to create documentation are limited to RDoc, a rather hacky
solution (Correct me if I’m wrong, but Dave T. said as much),
intended as a stand-in until a more useful tool comes around.

Sadly, as it is so common in the world, the temporary solution becomes
the permanent solution, with all its short-comings (see a discussion on
the mixup of Ruby Core and Ruby StdLib documentation on
RDoc Documentation). And that is a big obstacle, making it unnecessarily
difficult for newbies like me (I still am a newbie to Ruby, despite
using it for more than a year, the language is just that complex in its
simplicity, which is a major appeal for me, but that is a rant for
another time).

We, as a community, have to provide tools that make it easy and painless
to generate documentation, and generate it in different formats. In my
most humble opinion, we should take a look at Javadoc, and see what we
can steal^H^H^H^H^H implement for Ruby. I fully recognize that Javadoc
is great for Java, and less so for Ruby, but that doesn’t mean it
doesn’t have useful ideas that make it a breeze to generate
documentation.

Anecdotal evidence: When building the gem for Gondor Library I was
struggling in including the README and LICENSE files in RDoc. The files
were included in my Rake task’s FileList for gem generation, as well as
standalone doc generation.

However, the gem didn’t include the files in ./, only ./lib, much unlike
the RDoc task. I had to explicitly tell the gem, to include extra files.
I lost more than an hour to that. The outdated RubyGems documentation
didn’t really help. Fortunately, I found other Rake tasks, and could
eliminate the problem.
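
For anyone hitting the same wall: a gem only packages what its specification lists, independently of what the RDoc task is given. A minimal sketch of a specification that both ships and documents the top-level files. The gem name, version, summary and author are placeholders; `files`, `extra_rdoc_files` and `rdoc_options` are standard `Gem::Specification` attributes.

```ruby
require "rubygems"

spec = Gem::Specification.new do |s|
  s.name    = "example_library"      # placeholder name
  s.version = "0.1.0"                # placeholder version
  s.summary = "Placeholder summary"
  s.authors = ["Example Author"]     # placeholder author

  # Files that end up inside the packaged gem.
  s.files = Dir["lib/**/*.rb"] + %w[README LICENSE]

  # Extra files handed to RDoc when documentation is generated for the gem.
  s.extra_rdoc_files = %w[README LICENSE]
  s.rdoc_options     = ["--main", "README"]
end

puts spec.files.inspect
```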

But it shouldn’t be this way, and I have the feeling it is a shortcoming
of RDoc (maybe not, I haven’t looked at RDoc itself, so my assessment of
the reason may be wrong, but my point still stands).

Also, that RDoc generates frames for usage isn’t the ideal solution,
IMO, as it makes search difficult (via a browser’s search, anyway).

And AFAIK, RDoc cannot spit out PDFs, PostScript, or LaTeX, or something
other than ri. And that makes it unnecessarily difficult to generate
non-RDoc documentation without third party tools (and I don’t really
want to learn Yet Another Tool that is not directly related to
increasing my productivity in writing code; that’s what I want to do,
not write comments or documentation).

| 2. My main concern is not with the documentation. My main concern is
| that both the syntax and semantics of the language seem to be more fluid
| than “pragmatic” considerations would dictate. I more or less grew up
| with FORTRAN, although I missed FORTRAN I. So I’ll use its evolution as
| an example.
|
| Ten years into its evolution, an ANSI committee was formed to
| standardize the language. Users and vendors sat around a huge table and
| thrashed out what would break the least code, what was easy to
| implement, what kinds of programs people wanted to write in the language
| that they couldn’t, etc. The result was FORTRAN 66. 11 years later there
| was FORTRAN 77, etc.

C and C++ went the other way, with the STDLIB growing steadily, and new
features being added. Yet, C/C++ are more in use.

However, both FORTRAN and C are anecdotal evidence. The scope of the
languages is not really the same, and neither is Ruby’s.

|
| I think there are enough “killer apps” now that we know what we can’t
| take out of Ruby without breaking Rails, RSpec, Ruport, etc. And from
| MRI, KRI, jRuby and Rubinius, I think we know what’s easy to implement
| and what isn’t. But what I have no clues about is what programs people
| want to write in Ruby that they can’t write now.

For the power that Ruby gives me: I want to write everything in Ruby.
It’s good at pretty much everything I can throw at it, except
number-crunching. But I can farm that out to C or Java, or maybe .NET
once IronRuby is “production ready”.

Personally, I haven’t reached the point where I feel that Ruby isn’t up
to the task at hand, or severely limited. That’s anecdotal, though. I’m
sure that people who have some heavy lifting and data munging to do
have a different opinion on that (but more related to Ruby’s speed than
to Ruby’s syntax and expressiveness, or am I mistaken?).


Phillip G.
Twitter: twitter.com/cynicalryan

~ - You know you’ve been hacking too long when…
…you dream that your SO and yourself are icons in a GUI and you can’t get
close to each other because the window manager demands minimum space between
icons…

Now FORTRAN is 50 years old, there’s a FORTRAN 95 standard, and the
language is still in use (I think – I haven’t written any since 1990).

Most definitely. In my office we use it constantly. There’s only one
serious contender for number crunching here and it’s C, but then C is
much more of a pain to use, and you need to use the correct flags to
get the kind of optimisations that Fortran gives you out of the box.
Oh, and Fortran has now moved to Fortran 2003 standard (and they are
working on 2008). It’s a really nice language to write in… though
it’s not suitable for that many jobs.

But this is totally OT, sorry. (it’s just rare to find another person
with a good opinion of Fortran ^_^)

Diego

On Apr 8, 2008, at 10:02 AM, Phillip G. wrote:

While I agree that the project is responsible for its own documentation
(it’s rather obvious, is it not?), the tools we have in the Ruby
community to create documentation are limited to RDoc, a rather hacky
solution (Correct me if I’m wrong, but Dave T. said as much),
intended as a stand-in until a more useful tool comes around.

I said it’s a hacky implementation. I believe the concept is just
fine.

of RDoc (maybe not, I haven’t looked at RDoc itself, so my assessment of
the reason may be wrong, but my point still stands).

Sounds like a configuration issue with the gem, to me.

Also, that RDoc generates frames for usage isn’t the ideal solution,
IMO, as it makes search difficult (via a browser’s search, anyway).

The frames are just one option. Have you looked at api.rubyonrails.com?

And AFAIK, RDoc cannot spit out PDFs, PostScript, or LaTeX, or something
other than ri. And that makes it unnecessarily difficult to generate
non-RDoc documentation without third party tools (and I don’t really
want to learn Yet Another Tool that is not directly related to
increasing my productivity in writing code; that’s what I want to do,
not write comments or documentation).

Actually, I have it generating all three, as well as plain text, chm,
and our in-house PML markup.

I’m not defending RDoc. But I think that if the situation is to
improve, it should be through informed discussion.

There are a couple of underlying issues. Firstly, much of the basic
Ruby infrastructure is not well documented. Many standard libraries in
the base distribution have minimal documentation, for example. Many
gems are only minimally documented (for example, having just API-level
documentation).

Second, there’s no good place to go to see documentation.

I think we have the basis for a great set of documentation. What is
needed now is for someone with vision to take the next step. It
needn’t be a big one. In the same way that the RubyGems initiative set
down some simple packaging guidelines that we all follow, I think that
someone could drive through the same for documentation. It needn’t be
more than a few conventions. For example, we are used to seeing README
and INSTALL in the top-level of an application. Change Gems to include
these by default. If it sees a HOWTO file, include that. If it sees
GUIDE, do the same. These are just suggestions–someone needs to take
ownership of this and flesh it out properly.
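
To make the convention concrete, here is a rough sketch of what "include these by default" could look like. The helper name `conventional_doc_files` is hypothetical and is not part of RubyGems; it simply looks for the file names suggested above at the top level of a project.

```ruby
# Hypothetical helper: list the files directly under +root+ whose base name
# (ignoring any extension and letter case) is one of the conventional names.
def conventional_doc_files(root = ".", names = %w[README INSTALL HOWTO GUIDE])
  Dir.entries(root).select do |entry|
    File.file?(File.join(root, entry)) &&
      names.include?(File.basename(entry, ".*").upcase)
  end.sort
end

puts conventional_doc_files.inspect
```

A packaging tool could append this list to whatever file list the author already declares, so that nothing needs configuring in the common case.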

We have the API documentation nut cracked. We need to do the same for
the non-API documentation. And, if we set the conventions in place
now, the tools will be able to use them to build a raft of different
styles of documentation sites.

I’m looking forward to it.

Dave

On Tue, Apr 8, 2008 at 3:15 PM, Thomas K. wrote:

But Unicode/UTF8 would at least satisfy a lot more people than plain
ASCII or 8bit encodings (such as ISO-8859-x)

Would it?

It comes with a very significant hit in the speed of Regex processing,
at least with the current implementation.

Enough to, for many applications, negate the speed benefit of everything
that has been optimized from 1.8 to 1.9. This has been shown with
speed reports on this mailing list previously.

I’ll note that I work in a non-US character set, and in my experience,
UTF support in a programming language has only been in the way. So
far, what has been useful to me has always been to have strings be
lists of bytes.

I do not doubt that there are usecases where the support is useful; it
is just that so far I haven’t come across them, or the support that
has been there has been unobtrusive enough that I haven’t noticed that
it was useful (but I don’t think so - all data I have fit nicely in
ISO-Latin-1, because all I work with comes from western Europe or is
in english.)

Note that this sounds like I am against transparent UTF-8 support -
that’s not necessarily so. I just want to make sure that people are
aware that there are (many) use cases where the support isn’t just
neutral, it is actually a drawback (loss of speed, extra complexity,
no longer being sure that the result of string.length actually means
you can put string in a field of length length), so the upsides had
better be worthwhile.
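
The field-length point can be shown in a few lines. A minimal sketch, assuming Ruby 1.9 or later, where `length` counts characters and `bytesize` counts bytes; the helper `fit_to_bytes` is hypothetical and only illustrates the extra work a byte-limited field then needs.

```ruby
# Hypothetical helper: keep as many whole characters of +str+ as fit in
# +max_bytes+ bytes, never cutting a multi-byte character in half.
def fit_to_bytes(str, max_bytes)
  out = "".dup
  str.each_char do |ch|
    break if out.bytesize + ch.bytesize > max_bytes
    out << ch
  end
  out
end

name = "\u00F6\u00E4\u00FC"   # "öäü": 3 characters, 6 bytes in UTF-8

puts name.length              # 3
puts name.bytesize            # 6
puts fit_to_bytes(name, 5)    # öä
```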

Eivind.


Dave T. wrote:
|
| On Apr 8, 2008, at 10:02 AM, Phillip G. wrote:
|> While I agree that the project is responsible for its own documentation
|> (it’s rather obvious, is it not?), the tools we have in the Ruby
|> community to create documentation are limited to RDoc, a rather hacky
|> solution (Correct me if I’m wrong, but Dave T. said as much),
|> intended as a stand-in until a more useful tool comes around.
|
| I said it’s a hacky implementation. I believe the concept is just fine.

Err, yes. I like the idea of RDoc myself. I like it very much, almost to
the level of fan boy. It is very easy to generate API docs. But that is
the only thing it can do, it seems.

| Sounds like a configuration issue with the gem, to me.

Yes, it was. I had to tell the gem specifically to find these extra
documents. My chip on my shoulder is, that this shouldn’t be necessary,
especially when the RDoc task in my Rakefile actually generates the
documentation without declaring these documents explicitly.

|> Also, that RDoc generates frames for usage isn’t the ideal solution,
|> IMO, as it makes search difficult (via a browser’s search, anyway).
|
| The frames are just one option. Have you looked at api.rubyonrails.com?

Which is still a frameset. The layout’s different, but that’s it.

| Actually, I have it generating all three, as well as plain text, chm,
| and our in-house PML markup.
|
| I’m not defending RDoc. But I think that if the situation is to improve,
| it should be through informed discussion.

Yes, of course. I’m not shouting to purge RDoc from the Rubyist’s
toolkit, not at all. However, it stopped progressing at some point, and
nobody has taken over to expand it.

(However, I have found a github repository that seems to tackle the
issue. And I’m waiting for the source code to take a look at it.)

| There are a couple of underlying issues. Firstly, much of the basic Ruby
| infrastructure is not well documented. Many standard libraries in the
| base distribution have minimal documentation, for example. Many gems are
| only minimally documented (for example, having just API-level
| documentation).

The hashing functions in the STDLIB have no documentation. Having looked
at the STDLIB documentation for Ruby in the past, I’m looking at the web
first, and spend my time tweaking search-engine queries, and find some
sort of tutorial on how to use it. Which makes me sad.

And since I’m not good at reading C, or other people’s code just yet, I
can’t contribute with documentation patches, and that makes me an
unhappy camper: Not being able to contribute in correcting issues I see,
and help out a great community. :-/

| Second, there’s no good place to go to see documentation.

Indeed. If there were, the Rails API wouldn’t be on 5+ websites to look
at. Similar for Ruby’s documentation. That it isn’t very searchable is
another issue, IMO. (OTOH, neither is Javadoc from what I’ve seen. Well,
there’s no silver bullet, is there?).

| I think we have the basis for a great set of documentation. What is
| needed now is for someone with vision to take the next step. It needn’t
| be a big one. In the same way that the RubyGems initiative set down some
| simple packaging guidelines that we all follow, I think that someone
| could drive through the same for documentation. It needn’t be more than
| a few conventions. For example, we are used to seeing README and INSTALL
| in the top-level of an application. Change Gems to include these by
| default. If it sees a HOWTO file, include that. If it sees GUIDE, do the
| same. These are just suggestions–someone needs to take ownership of
| this and flesh it out properly.
|
| We have the API documentation nut cracked. We need to do the same for
| the non-API documentation. And, if we set the conventions in place now,
| the tools will be able to use them to build a raft of different styles
| of documentation sites.

To put my money where my mouth is, I’m willing to contribute to this
effort in any way I can. Be it coding, writing documentation, or
evangelizing it…

What I’d like to see, is something like RDoc, which grabs formatted
files (In Textile, Markaby, RDoc’s variant,…), and emits documentation,
from files that are in some way specified (a .document extension, maybe.
Something that is convention over configuration, to make the transition
as easy as possible).

I think I can hack up a simple tool that demonstrates what I mean (after
all, there’s RedCloth, and Rake’s FileList to accomplish what I mean).
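
Roughly, such a tool could start out like the sketch below. Everything in it is an assumption made for illustration: the method name `render_documents`, the choice of output names, and above all the paragraph wrapping, which is only a stand-in for a real markup library such as RedCloth. Plain `Dir.glob` is used here; Rake’s FileList could take its place.

```ruby
require "erb"

# Hypothetical sketch: collect every "*.document" file under +root+ and turn
# each one into a very plain HTML fragment, keyed by output file name.
def render_documents(root = ".")
  pages = {}
  Dir.glob(File.join(root, "**", "*.document")).sort.each do |path|
    # Treat blank-line-separated chunks as paragraphs (stand-in for real markup).
    paragraphs = File.read(path).split(/\n{2,}/).map(&:strip).reject(&:empty?)
    body = paragraphs.map { |para| "<p>#{ERB::Util.html_escape(para)}</p>" }.join("\n")
    pages[File.basename(path, ".document") + ".html"] = body
  end
  pages
end

render_documents.each { |name, html| puts "#{name}: #{html.bytesize} bytes" }
```

The convention does the work here: any file with the agreed extension is picked up without further configuration.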

Once I have that running, I’m perfectly happy to hand it to anyone who’s
more qualified than me to work on this tool.

| I’m looking forward to it.

So am I.

As a personal note: I did not mean to criticize you or your work, Dave.
In fact, if I could find my copy of the Pickaxe again, I’d drag it along
to have you sign it. It was a great help to me in picking up Ruby, and a
very valuable reference for the documentation. Thanks for the effort by
you and all who contributed to it. :-)

And I’d go nuts without RDoc to generate API documentation.


Phillip G.
Twitter: twitter.com/cynicalryan

~ - You know you’ve been hacking too long when…
…you “woke up” this morning and thought, “I’ll checkpoint here, snooze
a bit more and then revert to checkpoint.” A while later you go up
another consciousness notch and realize that you hadn’t checkpointed
successfully
~ - “Oh, of course. I didn’t have the keyboard.”

On Apr 8, 2008, at 11:04 AM, Jeremy McAnally wrote:

As an aside, I’m working on re-implementing RDoc using ruby_parser.
It’s in very bad shape right now (transitioning from using RDoc’s
CodeObjects stuff to ruby_parser), but you can monitor my work at
http://github.com/jeremymcanally/docr .

Could ripper be used for this? Given that 1.9 has it built in, it
would reduce dependencies (one of the goals of RDoc was to have zero
external dependencies, and that still seems like a good idea).

Hopefully when this is in decent shape, it will be a well-tested,
nicely implemented Ruby documentation tool.

Are you talking to Eric, who’s currently working on RDoc for 1.9?

Dave

I agree about the API docs. There has been talk recently about
working on them, but there are some snags we need to get around (I
don’t know whether you’ve been privy to those conversations or not, but
if not you should be :)).

As an aside, I’m working on re-implementing RDoc using ruby_parser.
It’s in very bad shape right now (transitioning from using RDoc’s
CodeObjects stuff to ruby_parser), but you can monitor my work at
http://github.com/jeremymcanally/docr .

The basic plan is to get the parser/normalization stuff in place and
tested this week. Then I’ll try to put a really lightweight bin
script/library in front of it next week. Most of the same code should
work in that part with minor modification. Sometime in there I’ll be
extracting the markup stuff from RDoc into its own gem/library so it’s
more separated from the mainline DocR stuff. I believe I along with
others have plans to hack in some additions to the markup to give it
some more powerful structures.

Hopefully when this is in decent shape, it will be a well-tested,
nicely implemented Ruby documentation tool.

–Jeremy


http://jeremymcanally.com/
http://entp.com

Read my books:
Ruby in Practice
My free Ruby e-book (http://humblelittlerubybook.com/)

Or, my blogs:

http://rubyinpractice.com

On Tue, Apr 8, 2008 at 7:40 AM, Thomas K. wrote:

Marc H., 08.04.2008 13:24:

[Unicode]

I need it.
I keep on reading people that need Unicode, and in your case it may very
well be true and for many others as well.
That’s precisely the ignorant attitude that caused the issues we currently
have with different character sets. I’m pretty sure that if computer systems
had emerged from a non-English-speaking country at the beginning we
wouldn’t still need to fight character set issues (there are still too many
applications that even have problems with 8-bit character sets).

Unfortunately Marc didn’t keep my quote which has a lot more
important context than was kept. I never argued that people don’t need
Unicode. I said that most people don’t need Unicode munging of their
text. Think operations on the strings rather than the existence of the
strings themselves.

I’m a newbie with Ruby and until I read this discussion I simply assumed it
would fully support Unicode “out of the box”, especially given the fact that
it originates from Japan. I’m actually very confused (not to say shocked)
that there is a discussion about whether Ruby needs (or supports?) Unicode.
Unicode (and a relevant encoding such as UTF8) should be the standard for
all (new) programming languages and not an exception.

No, they shouldn’t. Yes, Ruby needs Unicode support. But Unicode has a
big problem: legacy data. There’s more legacy textual data than there
is Unicode textual data at this point. That will change, yes, but it
isn’t so now. Languages that assume that their strings are Unicode
(and make it harder to deal with legacy data) are much harder to work
with for legacy data.

Also, look at the Han Unification discussions regarding Unicode and
you’ll see why the lack of Unicode support from Japan through early
this decade isn’t surprising. Unicode isn’t very friendly to Asian
texts, in terms of storage size. It’s not a big deal now that we’re
dealing with massive hard drives and our textual data is a minuscule
fraction of our overall data storage (audio, images, video).

Ruby 1.8 has limited (too limited, but there’s good historical reasons
for this) support for Unicode; Ruby 1.9 has good support for Unicode
and it’s getting better.

History is good to know. It would have prevented this blogger from
being ignorant.

-austin

On Tue, Apr 8, 2008 at 7:54 AM, Trans wrote:

Why is everyone getting so worked up? It’s a critique.

critique |krɪˈtiːk| noun
a detailed analysis and assessment of something, esp. a literary,
philosophical, or political theory.
verb ( -tiques |krɪˈtiks|, -tiqued |krɪˈtikt|, -tiquing |krɪˈtikɪŋ|) [
trans. ]
evaluate (a theory or practice) in a detailed and analytical way :
the authors critique the methods and practices used in the research.

The only thing detailed about the blog posting is the author’s
ignorance. There’s no analysis, just mindless bashing.

-austin

On Tue, Apr 8, 2008 at 12:05 AM, Phillip G. wrote:

Austin Z. wrote:
| (Yes, Virginia. Most people don’t need full-on Unicode munging in
| their code. It’s necessary when you do need it, but most people don’t
| need it.)
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – Joel on Software

I need it. Most of Europe needs it. Not to mention Arabia, Japan, and
everybody else not speaking English.

You didn’t read what I said. I said “most people don’t need full-on
Unicode munging.” This is true. There are some cases where it’s
absolutely necessary, but most people just need to know that they’re
not going to screw up things when they work with Unicode.

You can write Unicode-safe applications without needing full Unicode
string munging. Easily. Most Rails apps should probably be doing
exactly that.

And yes, I do know Unicode. Maybe not as well as Tim B., but well
enough to know what’s actually needed and what isn’t. (I just wrote an
app that deals with UTF-8 Unicode strings; I don’t modify them at all,
so I’ve got a Unicode-safe app in Ruby because I’m not mucking with
things that I don’t need to muck with.)

Joel’s article is oversimplistic on this, really. I stand by what I
said: most people don’t need Unicode munging. But when you need it,
you really need it and Ruby can fall down flat for you, pre 1.9.
(And yes, if you look above, that is what I said.)

-austin

On Tue, Apr 8, 2008 at 9:15 AM, Thomas K. wrote:

Bill K., 08.04.2008 14:51:

Unicode (and a relevant encoding such as UTF8) should be the standard
for all (new) programming languages and not an exception.
Apparently not, as One Character Encoding to Rule Them All is not
considered satisfactory to many people.
But Unicode/UTF8 would at least satisfy a lot more people than plain
ASCII or 8bit encodings (such as ISO-8859-x)

Here’s a long thread on the subject from the archives: http://tinyurl.com/ge2kp

Please read the thread that Bill pointed you to. It explains a lot
more than you’d think there would be. (And yes, that’s a thread that I
was heavily involved in.)

Ruby 1.9 has much broader support for handling multiple character
encodings.
Is there any release plan for 1.9?

When it’s ready. 1.9.0 has already been released and there’s ongoing
patches to make it better. Follow ruby-core if you want more
information about Ruby 1.9. Matz still recommends Ruby 1.8.x for
production because there may be other incompatible changes with Ruby
1.9, but most of the breakers should be fixed. But it’s running on a
(newish) VM, so it’s something to work with for exercising the
language. I think that when 1.9.1 comes out, it’ll be much closer to
production quality.

-austin

On Tue, Apr 8, 2008 at 11:02 AM, Phillip G. wrote:

M. Edward (Ed) Borasky wrote:
| Well … I guess that depends on which docs you are talking about.
| There’s plenty of documentation on Rails, three of the major GUIs –
| Shoes, FXRuby and QtRuby – have books in “print” on them, there are two
| major Ruby “cookbooks”, the documentation on Ruport and RSpec is
| excellent, etc.

I shouldn’t have to buy a book (like the PickAxe), to get decent
documentation. A book should be one option, among many, to get to the
documentation. I’ll elaborate on that in the next paragraph.

Used to be that was the ONLY way to get documentation.

-austin