How to send UTF-8 data to a remote computer in Ruby 1.9.2

luislavena · June 29, 2011, 11:08pm

Hi,

I have a routine that sends commands to a remote computer.

require 'net/ssh'

def issueCommandInKazBox5(okMachine="", command="", matchLog="",
                          user="admin", passwd="kazeon")
  puts " issueCommandInKazMachine=#{okMachine}"
  puts " issueCommandInKazcommand=#{command}"
  puts " issueCommandInKazstatus=#{matchLog}"
  puts " issueCommandInKazuser=#{user}"
  puts " issueCommandInKazpasswd=#{passwd}"

  # Log in with the password that was passed in (default "kazeon").
  Net::SSH.start(okMachine, user, :password => passwd,
                 :auth_methods => ["password"]) do |session|

    session.open_channel do |channel|

      # Ask for a pseudo-terminal, then start an interactive shell on it.
      channel.request_pty do |ch, success|
        raise "Error requesting pty" unless success

        ch.send_channel_request("shell") do |ch, success|
          raise "Error opening shell" unless success
        end
      end

      # Echo the remote stdout and stderr locally.
      channel.on_data do |ch, data|
        STDOUT.print data
      end

      channel.on_extended_data do |ch, type, data|
        STDOUT.print "Error: #{data}\n"
      end

      channel.send_data "#{command}\n".force_encoding('utf-8')

      channel.send_data "exit\n"

      session.loop

    end
  end

end

I call it like this to run a command on the remote machine:

issueCommandInKazBox($node1, "add user bhavesh role admin", "")

It works properly for English data.

But when the command contains UTF-8 data, it raises an error:

issueCommandInKazBox($node1, "add user âêžýáíúöóá¿ role admin", "")

The error is:

Error:
test_0001(TC_I18N):
OpenSSL::Cipher::CipherError: data not multiple of block length
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/transport/state.rb:85:in `final'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/transport/state.rb:85:in `final_cipher'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/transport/packet_stream.rb:142:in `enqueue_packet'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/transport/session.rb:223:in `enqueue_message'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:368:in `send_message'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/channel.rb:493:in `enqueue_pending_output'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/channel.rb:312:in `process'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:214:in `block in preprocess'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:214:in `each'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:214:in `preprocess'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:197:in `process'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in `block in loop'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in `loop'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in `loop'
C:/IE_AUTOMATION/kazeon/qa/watir-v1_4-TIP/Lib/KazeonCommon/BasicMethodLibrary.rb:2523:in `block (2 levels) in issueCommandInKazBox5'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/channel.rb:513:in `call'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/channel.rb:513:in `do_open_confirmation'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:535:in `channel_open_confirmation'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:456:in `dispatch_incoming_packets'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:213:in `preprocess'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:197:in `process'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in `block in loop'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in `loop'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in `loop'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:110:in `close'
C:/Ruby192/lib/ruby/gems/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh.rb:191:in `start'
C:/IE_AUTOMATION/kazeon/qa/watir-v1_4-TIP/Lib/KazeonCommon/BasicMethodLibrary.rb:2500:in `issueCommandInKazBox5'
C:/IE_AUTOMATION/kazeon/qa/watir-v1_4-TIP/KazModules/I18N/I18N_Regression.rb:71:in `test_0001'

I already have the encoding set to UTF-8 in my script.

What do I have to change to get it working, and where?

Bhavesh

bhavesh1_sharma · July 1, 2011, 3:03pm

Bhavesh S. wrote in post #1008313:

what do i have to change to get it working? and where?

You’ve got several options.

Use ruby 1.8 - seriously. Life will be sweet, move on.

If you can’t do that, then:

Try making random changes to your code using guesswork, e.g.

issueCommandInKazBox($node1, "add user âêžýáíúöóá¿ role admin".force_encoding("ASCII-8BIT"), "")

and if you stumble across a combination that works, cross your fingers
that it doesn’t break elsewhere.

Turn this into a test case for Net::SSH, submit it to the maintainer,
and hope they change the internals of Net::SSH to work in the way you
want.

I notice someone else was affected by a related issue today:
http://www.ruby-forum.com/topic/2043661

The problem is: ruby 1.9 has an ill-considered bastardised notion of
String which confuses “text” (i.e. a sequence of characters), “bytes of
text” (i.e. bytes representing a sequence of characters encoded in a
particular encoding), and “data” (i.e. things which aren’t text, such as
blocks of crypto data in Net::SSH) into the same fuzzy mess.

It then has a whole load of carefully-crafted heuristics which mean that
quite often it works. However, it also means that it hides problems when
you’re accidentally treating text as data, or vice versa. And that in
turn means that sometimes your program will crash, when you (or someone
else) forgot to identify the places where the heuristics need to be
overridden, as you just found with Net::SSH.
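
The three notions above ("text", "bytes of text" and "data") can be made concrete with a short sketch. It is not taken from the thread: the helper names string_facts and try_concat are made up for illustration, and the sample string is a shortened version of the command from the first post, written with \u escapes so the snippet does not depend on the source file's encoding.

```ruby
# Summarise how Ruby sees a string: its encoding tag, its character count
# and its byte count. (Hypothetical helper, for illustration only.)
def string_facts(str)
  { :encoding => str.encoding, :chars => str.length, :bytes => str.bytesize }
end

# Return :ok if the two strings can be concatenated, :incompatible otherwise.
def try_concat(a, b)
  a + b
  :ok
rescue Encoding::CompatibilityError
  :incompatible
end

# "Text": \u escapes always produce a UTF-8 string.
text = "add user \u00e2\u00ea role admin"

# "Data": exactly the same bytes, retagged as binary. No byte is changed.
data = text.dup.force_encoding("BINARY")

p string_facts(text)      # fewer characters than bytes
p string_facts(data)      # one "character" per byte
p try_concat(text, data)  # both contain non-ASCII bytes, so Ruby refuses to mix them
```

The same object type carries all three meanings; only the encoding tag tells them apart, which is why a library that counts characters where it should count bytes can fail only for non-ASCII input.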

Through reverse-engineering, I found about 200 behaviours of strings and
encodings in ruby 1.9:

https://github.com/candlerb/string19/blob/master/string19.rb

#!/usr/bin/env ruby19
# encoding: UTF-8
# This document is Copyright (C) Brian Candler 2009 and released under a
# Creative Commons Attribution-NonCommercial 3.0 Unported License.

############# CONTENTS ###################

# -1. PREAMBLE
#  0. INTRODUCTION
#  1. ENCODINGS
#  2. PROPERTIES OF ENCODINGS
#  3. STRING, FILE AND REGEXP ENCODINGS
#  4. VALID AND FIXED ENCODINGS
#  5. COMPATIBLE OBJECTS
#  6. STRING CONCATENATION
#  7. THE BINARY / ASCII-8BIT ENCODING
#  8. SINGLE CHARACTERS
#  9. EQUALITY AND COLLATION
# 10. HASH AND EQL?
# 11. UPPER AND LOWER CASE

but there are still plenty of open ends: in particular, all third party
libraries which either return a String or accept a String as an
argument need to document their encoding-related behaviour. Suffice to
say, I’m never touching 1.9 again.

Your other option would be to look outside Ruby. Python 3 makes a clear
distinction between “text” and “data” (where “data” includes “text
encoded as a sequence of bytes”). Unfortunately, Python 3 still doesn’t
have widespread library support, as code needs to be modified to work
with it.

Erlang also has a pretty clear distinction between lists of characters
(unicode codepoints) and binaries, which can contain text represented in
a particular encoding. It also has good ssh client and server libraries
bundled. You can wrap it with efene if you want a friendlier syntax.

HTH,

Brian.

bhavesh1_sharma · August 16, 2011, 8:47pm

Hi Brian,

Thanks for your inputs.

I tried the way you mentioned :

issueCommandInKazBox($node1, "add user âêžýáíúöóá¿ role admin".force_encoding("ASCII-8BIT"), "")

And it worked.

I am still surprised that it does not work when I use utf8 instead of
ascii-8bit, i.e.

issueCommandInKazBox($node1, "add user âêžýáíúöóá¿ role admin".force_encoding("UTF-8"), "")
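
Rather than calling force_encoding at every call site, the retagging could be done in one place just before the data is handed to the channel. The following is only a sketch: wire_bytes is a hypothetical helper, not part of Net::SSH, and it assumes the command is already valid UTF-8 text. Whether this avoids the error on a given net-ssh version is not something the thread confirms beyond the ASCII-8BIT experiment above.

```ruby
# Hypothetical helper (not part of Net::SSH): return a copy of +command+
# with the same bytes but tagged as binary, so that any code counting
# "characters" is really counting bytes. The caller's string is not modified.
def wire_bytes(command)
  command.dup.force_encoding("BINARY")
end

# Intended use inside the method from the first post (shown as a comment,
# because it needs a live SSH channel):
#
#   channel.send_data wire_bytes("#{command}\n")

sample = "add user \u00e2\u00ea role admin\n"
puts wire_bytes(sample).encoding   # the copy is binary
puts sample.encoding               # the original keeps its UTF-8 tag
```

Using dup keeps the caller's string untouched, which matters if the same command string is later printed or compared as text.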

Thanks for your input again.

Bhavesh

bhavesh1_sharma · August 16, 2011, 8:54pm

How can I use utf-8 in Ruby 1.9?
Use ruby 1.8
To me the current situation with encoding in Ruby 1.9 looks like a bad
joke.
I believe that if it is not fixed, next time the advice may
sound like “don’t use Ruby, use Scala, Node.js or something else”.

bhavesh1_sharma · August 17, 2011, 1:42pm

2011/8/16 Alexey P.:

For me the situation with encoding in ruby 1.9 looks like a hairy bug
and a bad joke.

Hi, could you please point to the exact problem you mean related to
Ruby 1.9 encoding?

bhavesh1_sharma · August 16, 2011, 9:09pm

2011/8/16 Bhavesh S.:

I am still surprised that it does not work when I use utf8 instead of
ascii-8bit

Just for future reference: the “ASCII-8BIT” encoding is the same as the
“BINARY” encoding.

Therefore setting the encoding to “ASCII-8BIT” is equivalent to “erasing”
the encoding information, and many libraries that depend on “byte” having
the same meaning as “character” (i.e., anything that uses string[n]
notation on binary data…) may then accidentally work right.

If you ever get encoding-related errors, common sense doesn’t seem to
work, and you don’t really care about the data that blows up, just
call force_encoding('binary') on everything.
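
Here is a sketch of how "byte" versus "character" could lead to the "data not multiple of block length" error from the first post. This is an assumption about the mechanism, not something checked against the net-ssh 2.1.4 source: it supposes that the packet padding is computed from a character count. The padding rule itself follows the SSH binary packet layout (4-byte length, 1-byte padding count, payload, at least 4 bytes of padding, total a multiple of the cipher block size). The function names and the 16-byte block size are illustrative.

```ruby
# Padding needed so that 4 (length field) + 1 (padding count) + payload +
# padding is a multiple of the cipher block size, with at least 4 padding bytes.
def padding_for(payload_size, block_size = 16)
  pad = block_size - ((payload_size + 5) % block_size)
  pad += block_size if pad < 4
  pad
end

# Real size in bytes of the packet when the payload is measured with
# +measure+ (:length counts characters, :bytesize counts bytes).
def packet_bytes(payload, measure, block_size = 16)
  measured = payload.send(measure)
  5 + payload.bytesize + padding_for(measured, block_size)
end

utf8   = "add user \u00e2\u00ea\n"            # two 2-byte characters
binary = utf8.dup.force_encoding("BINARY")   # same bytes, no character meaning

puts packet_bytes(utf8,   :length)   % 16    # non-zero: cipher would reject this
puts packet_bytes(utf8,   :bytesize) % 16    # zero: correctly padded
puts packet_bytes(binary, :length)   % 16    # zero: length now counts bytes
```

Under that assumption, retagging the string as binary makes length and bytesize agree, so the padding comes out right even though nothing about the bytes has changed.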

– Matma R.

bhavesh1_sharma · August 17, 2011, 5:42pm

I think the reason why Ruby has all these encodings, instead of simply
two: “UTF-8 string” and “binary data”, is that there are some problems
(data losses or something) when converting some obscure Japanese
encodings to UTF and back. And since Ruby has strong Japanese
connections, developers cared about this enough to create the
situation we have now.

(Disclaimer: I may be wrong, I just read that somewhere.)

– Matma R.

2011/8/17 Alexey P.:

bhavesh1_sharma · August 17, 2011, 4:13pm

Hi, could you please point to the exact problem you mean related to Ruby 1.9
encoding?

There is no way to set a default source encoding (supplying a
command-line option is a very inconvenient way to do it).
So you have to put this extra line in every source file: # encoding: utf-8
Actually, I even came across an article where someone created a rake task
that reads all files in lib and prepends this line, and he called it
‘progress’. Well, if this is the measure of “progress”, then
Java with its bloated code generation tools should be scored as far more
“progressive” than Ruby.
In my view it’s regress, not progress.
Every now and then strange problems with encoding pop up
(for example, there are some problems with saving utf-8 characters in
YAML) and you have to spend a couple of hours on this boring and
zero-value stuff trying to resolve them.
Why should I have to bother about encoding at all these days? The
only possible justification may be performance, but
Ruby is slow anyway, with or without any encoding optimization.

According to TIOBE, the Java user base, installation base and community
are about 30 times bigger than Ruby’s, and Java strings have only one
internal encoding, Unicode (UTF-16), which is absolutely enough to cover
all the possible use cases.

In 99.9% of cases you use UTF and you are happy! So why does Ruby, which is
positioned as a simple and beautiful language, have such a messy situation
with encoding, compared to “bloated Java”?

P.S.
Much of the time you don’t encounter these problems because Rails does some
encoding work for you, and 90% of Ruby is used for web dev with Rails.
But I believe that Ruby is not only about Rails & web dev; it
should also be convenient and applicable to other cases.

bhavesh1_sharma · August 17, 2011, 7:54pm

I believe that solving problems of 1% by introducing problems to 99% of
everyone else is not the right way to go.

Why not make the utf-8 default and other encoding possible?

bhavesh1_sharma · August 17, 2011, 9:16pm

On Aug 17, 2011, at 11:54 AM, Alexey P. wrote:

I believe that’s not the right way to solve problems of 1% by
introducing problems to 99% of everyone else.

That sounds like a good argument. Except…from where do you get your 1%
number? Japan is roughly 2% of the world population, but likely a huge
portion of Ruby users.

bhavesh1_sharma · August 17, 2011, 9:55pm

I did some googling and turns out I was right. Here’s an article I
found:

let me cite the relevant part:

Other languages, such as Java and Python, solve this problem by encoding every
String that enters the language as UTF-8 (or UTF-16). […]

However, this solution does not work very well for the Japanese community. For a
variety of complicated reasons, Japanese encodings, such as SHIFT-JIS, are not
considered to losslessly encode into UTF-8. As a result, Ruby has a policy of not
attempting to simply encode any inbound String into UTF-8.

The article itself is actually a pretty interesting read.

Also, Alexey, that 1% you pulled out of nowhere is certainly wrong.
Just look at, say, the official Ruby language issues tracker -
Issues - Ruby Issue Tracking System - and count issues written in
Japanese, or by Japanese people. There aren’t many here, since this is
an English-language list.

– Matma R.

bhavesh1_sharma · August 17, 2011, 10:03pm

In 99.9% of cases You use UTF and You are happy! So, why Ruby that
positioned as simple and beautiful language have such messy situation
with encoding, compared to “bloated Java”?

Another good read on the subject:

“Ruby multilingualization (M17N) of Ruby 1.9 uses the code set
independent model (CSI) while many other languages use the Unicode
normalization model.”

“Under the CSI model, all encodings are handled equally, which means,
Unicode is one of character sets. The most remarkable feature of the
CSI model is that the model does not require a character code
conversion since external and internal character codes are identical.
Thus, the cost for conversion can be eliminated. Besides, we can keep
away from unexpected information loss caused by the conversion,
especially by cutting bits or bytes off. Ruby uses the CSI model, so
do Solaris, Citrus, or other system based on the C library that does
not use __STDC_ISO_10646__.”

“Moreover, it is possible to handle various character sets even though
they are not based on Unicode.”
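
A small sketch of what the CSI model looks like from Ruby code. It is illustrative only: the helper names describe and same_text? are made up, and the sample is the word for "Japanese language" written in kanji via \u escapes. Each string keeps whatever encoding it arrived in, and conversion happens only when explicitly requested.

```ruby
# Report the encoding tag, character count and byte count of a string.
def describe(str)
  [str.encoding, str.length, str.bytesize]
end

# Compare two strings as text by converting both to a common encoding first.
def same_text?(a, b)
  a.encode("UTF-8") == b.encode("UTF-8")
end

utf8 = "\u65e5\u672c\u8a9e"          # three kanji, stored as UTF-8
sjis = utf8.encode("Shift_JIS")      # explicit conversion, requested by the caller

p describe(utf8)            # same 3 characters, 3 bytes each
p describe(sjis)            # same 3 characters, 2 bytes each
p utf8 == sjis              # false: Ruby never converts behind your back
p same_text?(utf8, sjis)    # true once both are put in the same encoding
```

This is the trade-off the quoted passage describes: no conversion cost or loss at the boundaries, at the price of every string carrying an encoding the programmer has to keep in mind.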

bhavesh1_sharma · August 18, 2011, 5:00am

To me the current situation with encoding in Ruby 1.9 looks like
a bad joke.

I believe that if it is not fixed, next time the advice
may sound like “don’t use Ruby, use Scala, Node.js or
something else”.

I did not make the original comment, however the Encoding situation in
1.9 is still the reason why I have not moved to 1.9.

I love ruby but I don’t know if I will use 1.9. I am too lazy to deal
with all the Encoding stuff, especially as I myself simply do not need
the Encoding at all.

I use Ruby as a replacement for PHP and Perl.

Perhaps one day I will find an easy way to carry over all the äöü umlauts
in my .rb files, but until that day comes, Ruby 1.9.x just complains
about them, and I couldn’t care less if it does. I don’t know which
other language I will move to; Ruby spoiled me.

I am only glad ruby 1.8.x still works and I will probably end up using
it as long as possible before I will either eventually switch to another
language or somehow learn to deal with the Encoding crap.

bhavesh1_sharma · August 18, 2011, 12:08pm

2011/8/18 Marc H.:

in my .rb files but until that day comes, Ruby 1.9.x just complains
about it, and I couldn’t care less if it does. I don’t know to which
other language I will move, Ruby spoiled me.

I am only glad ruby 1.8.x still works and I will probably end up using
it as long as possible before I will either eventually switch to another
language or somehow learn to deal with the Encoding crap.

Sorry, Marc, but to me it sounds like you just assumed it won’t work
without even trying. (And you’re part of the reason why everybody has
to wiggle their code around Ruby 1.8 compatibility instead of just
using new features.)

I can’t tell why Ruby barfs over your umlauts, since you didn’t bother
to write anything about it, but have you tried adding the “# coding:
utf-8” line at the top of your .rb files and actually saving them as
UTF? Have you also tried setting default internal encoding, or setting
encoding explicitly when reading/writing files, if that’s what didn’t
work for you? Really, making all your (simple) scripts compatible with
1.9 is two lines of code.
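
As a rough illustration of those two lines, here is a sketch: the magic comment on the first line of the file, and an explicit encoding whenever a file is opened. The helper name roundtrip_utf8 is made up, and the umlauts are written as \u escapes only so the snippet runs regardless of how it is saved; in a real script saved as UTF-8 with the magic comment, literal äöü would work the same way.

```ruby
# encoding: utf-8
# The line above must be the first line of the file (or the second, after a
# shebang). It tells Ruby 1.9 that string literals in this file are UTF-8.
require "tempfile"

# Write +text+ to a temporary file as UTF-8 and read it back as UTF-8,
# independent of the locale the script happens to run under.
def roundtrip_utf8(text)
  file = Tempfile.new("umlauts")
  path = file.path
  file.close
  File.open(path, "w:UTF-8") { |f| f.write(text) }   # external encoding on write
  File.open(path, "r:UTF-8") { |f| f.read }          # external encoding on read
ensure
  file.unlink if file
end

umlauts = "\u00e4\u00f6\u00fc"
result  = roundtrip_utf8(umlauts)
puts result.encoding
puts result == umlauts
```

Passing the encoding in the open mode keeps the choice local to each file operation; setting Encoding.default_external or Encoding.default_internal is the global alternative mentioned above.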

bhavesh1_sharma · August 18, 2011, 12:39am

Except…from where do you get your 1% number? Japan is roughly 2% of the world
population, but likely a huge portion of Ruby users.

Also, Alexey, that 1% you pulled out of nowhere is certainly wrong.

Good point, I agree with you, my estimate was wrong. Let’s calculate
it more precisely (using Google exact phrase search):
I googled for the “ruby language” phrase in different languages and
counted the results.

“ruby language” - 280,000
“язык программирования ruby” - 62,800
“Ruby言語” - 41,500
“Ruby é uma linguagem” - 30,000
“Ruby es un lenguaje de programación” - 28,200
“Ruby è un linguaggio” - 21,000
“Ruby est un langage de programmation” - 10,400

~ 8% (please correct me if some translations are wrong, I used Google
Translate & Wikipedia)

I’m very thankful to Matz and the Japanese contributors. But I believe we
need to make it clear: should Ruby stay a small language used
by a closed community, or should it move in the direction of being generally
applicable to lots of domains?

Don’t forget also that a lot of the attraction in recent years came from
the fame of Rails, but that can’t fuel it forever; for example, take a
look at this
picture: http://www.google.com/trends?q=ruby+language%2C+python+language

bhavesh1_sharma · August 18, 2011, 5:17pm

Bartosz Dziewoński wrote in post #1017168:

let me cite the relevant part:

Other languages, such as Java and Python, solve this problem by encoding every
String that enters the language as UTF-8 (or UTF-16). […]

However, this solution does not work very well for the Japanese community. For a
variety of complicated reasons, Japanese encodings, such as SHIFT-JIS, are not
considered to losslessly encode into UTF-8. As a result, Ruby has a policy of not
attempting to simply encode any inbound String into UTF-8.

Except that even ruby 1.9 cannot handle SHIFT-JIS (it’s a stateful
encoding)