UTF-8 encoding with BOM under Ruby 1.8.x (Windows)

Moin, moin!

I noticed that a Ruby program encoded in UTF-8 with a BOM (the default
under Windows) still has problems when run with Ruby 1.8.6 under
Windows, while the problem is fixed in Ruby 1.9.

Will there be a fix for future Ruby 1.8.x versions or will this problem
occur in future Ruby 1.8 releases too?

Wolfgang Nádasi-Donner

((%w(moin)*2).join ', ').capitalize+'!' # DRY principle.

Wolfgang, I wonder if it wouldn’t be too tough to create a wrapper
program that stripped a BOM (if present) and passed the rest to the
real ruby. You could then associate this with the .rb extension, and
everything should work, right?

If it worked well enough, maybe this program should be named “ruby”,
and the real ruby renamed to “_ruby.exe” or something, so that even
scripts that are invoked by the normal “ruby” name call the BOM-aware
entry point.
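
A minimal sketch of such a wrapper, written in Ruby itself rather than as a native executable. The names strip_bom and run_without_bom are made up for illustration, and the code has not been tried on 1.8.6; it only uses unpack/pack on raw bytes so that it does not depend on any encoding support.

```ruby
# Hypothetical BOM-stripping wrapper; strip_bom and run_without_bom are
# illustrative names, not part of Ruby.

BOM_BYTES = [0xEF, 0xBB, 0xBF] # the UTF-8 byte order mark

# Return the source with a leading UTF-8 BOM removed, if there is one.
# Works on raw bytes, so no encoding support is required.
def strip_bom(source)
  bytes = source.unpack('C*')
  bytes = bytes[3..-1] if bytes[0, 3] == BOM_BYTES
  bytes.pack('C*')
end

# Read a script in binary mode, drop the BOM, and evaluate the rest
# at top level, reporting errors against the original file name.
def run_without_bom(path)
  source = File.open(path, 'rb') { |f| f.read }
  eval(strip_bom(source), TOPLEVEL_BINDING, path)
end

# When invoked as "ruby bomwrap.rb script.rb args...", run the named script.
if __FILE__ == $0 && !ARGV.empty?
  script = ARGV.shift
  $0 = script # so that "if __FILE__ == $0" in the script still works
  run_without_bom(script)
end
```

Saved under an assumed name such as bomwrap.rb, this would be called as "ruby bomwrap.rb utf8.rb". Associating it with the .rb extension or renaming the real interpreter is a separate step that the sketch does not cover.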

I agree that this should be fixed for real in ruby, but I figured I’d
talk about workarounds anyway.

-rking

rking wrote:

I agree that this should be fixed for real in ruby, but I figured I’d
talk about workarounds anyway.

Workarounds are fine, but my question concerns the official Ruby
releases, which will also be used for the OneClickInstaller. I thought
a fix might not be difficult, because it already works in Ruby 1.9.

Hi,

At Wed, 15 Aug 2007 23:16:58 +0900,
Wolfgang Nádasi-Donner wrote in [ruby-talk:264721]:

I noticed that a Ruby program encoded in UTF-8 with a BOM (the default
under Windows) still has problems when run with Ruby 1.8.6 under
Windows, while the problem is fixed in Ruby 1.9.

Will there be a fix for future Ruby 1.8.x versions or will this problem
occur in future Ruby 1.8 releases too?

IIRC, no such change has been made in 1.9.

Nobuyoshi N. wrote:

IIRC, no such change has been made in 1.9.

But it works differently in Ruby 1.9:

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby -v
ruby 1.8.6 (2007-03-13 patchlevel 0) [i386-mswin32]

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby -Ku utf8.rb
utf8.rb:1: undefined method `´╗┐puts' for main:Object (NoMethodError)

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby utf8.rb
utf8.rb:1: Invalid char `\357' in expression
utf8.rb:1: Invalid char `\273' in expression
utf8.rb:1: Invalid char `\277' in expression

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby19 -v
ruby 1.9.0 (2007-05-15 patchlevel 0) [i386-mingw32]

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby19 -Ku utf8.rb
hello, world!

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby19 utf8.rb
hello, world!

C:\Dokumente und Einstellungen\wolfgang\Desktop>type utf8.rb
´╗┐puts "hello, world!"
C:\Dokumente und Einstellungen\wolfgang\Desktop>
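
The three characters Ruby 1.8.6 rejects, \357, \273 and \277 in octal, are the bytes EF BB BF of the UTF-8 BOM, which the console displays as ´╗┐ in front of puts. A small sketch for checking what a file starts with; leading_bytes_octal is a made-up helper name, not a Ruby method:

```ruby
# Hypothetical helper: show up to the first three bytes of a string in
# the same octal notation Ruby uses in its "Invalid char" messages.
def leading_bytes_octal(data)
  # unpack('C3') yields nil for missing bytes, so compact handles short input
  data.unpack('C3').compact.map { |byte| format('\\%03o', byte) }
end

# Example: inspect a script passed on the command line.
if __FILE__ == $0 && !ARGV.empty?
  head = File.open(ARGV.first, 'rb') { |f| f.read(3) } || ''
  puts leading_bytes_octal(head).join(' ')
end
```

Run against utf8.rb above, this should print the same three octal values that appear in the error messages.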

Hi,

At Thu, 16 Aug 2007 16:42:55 +0900,
Wolfgang Nádasi-Donner wrote in [ruby-talk:264857]:

Nobuyoshi N. wrote:

IIRC, no such change has been made in 1.9.

But it works differently in Ruby 1.9:

Sorry, I had forgotten that I added it 2 years ago: $KCODE is set to
'u' if the source starts with a BOM. This went in together with pragma
support.