Unicode characters are destroyed, but when?

Hi all.

I made a test case as follows. Both of the tests below fail for me on
JRuby 1.3.1 (the version we were previously on) and on JRuby 1.5.1
(the version we are on now).

I was about to submit a ticket under Embedding since I thought the
Java ScriptEngine might have been at fault, but then I thought I would
try with ScriptingContainer, and that turns out to behave the same.
So I’m no longer certain which component the ticket belongs under.

I tried setting the ScriptingContainer to 1.9 behaviour as well, and
1.9 mode also failed. Setting KCode to UTF-8 doesn’t help either,
which I thought was surprising (even though it wouldn’t help us anyway,
as we are using ScriptEngine, not ScriptingContainer).

Posting here in any case, because someone might have a suggestion of a
way to prevent it doing this.

Note that the test will not be meaningful if you are on a system which
is either UTF-8 or a Japanese locale. I’m sure it will pass on both
of those. So if you’re running the test on those you will need to add
-Dfile.encoding=windows-1252 to reproduce the problem. I suspect that
it occurs on any encoding where the sequence is not representable in
the native encoding.
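
To illustrate the suspicion above, here is a minimal sketch in plain Java (no JRuby involved) of what happens when the same five characters pass through an encoding that cannot represent them. The class name LossyEncodingDemo and the helper roundTrip are made up for this illustration; it only shows the general mechanism, not where in JRuby the conversion actually happens.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyEncodingDemo {
    // Encode the text to bytes and decode it again with the same charset.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for every character the charset cannot represent.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String orig = "\u3070\u304B\u3084\u308D\u3046";

        // The platform default is what -Dfile.encoding influences.
        System.out.println("default charset: " + Charset.defaultCharset());

        // UTF-8 can represent the characters, so the round trip is lossless.
        System.out.println("UTF-8 preserved: "
                + orig.equals(roundTrip(orig, StandardCharsets.UTF_8)));

        // windows-1252 cannot, so the original characters are lost.
        System.out.println("windows-1252 preserved: "
                + orig.equals(roundTrip(orig, Charset.forName("windows-1252"))));
    }
}
```

If any step between the Java String and the Ruby runtime (or back) performs a conversion like this with the platform default charset, the failure would depend on file.encoding in the way described.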

I think I reported something along these lines in the past, but I
can’t seem to find a trace of the mail. This is the first time I have
reproduced it reliably though.

TX

import static org.junit.Assert.assertEquals;

import java.io.StringWriter;
import java.io.Writer;

import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

import org.jruby.embed.ScriptingContainer;
import org.junit.Test;

public class TestUnicodeCharacters {
    // Five hiragana characters that windows-1252 cannot represent.
    String orig = "\u3070\u304B\u3084\u308D\u3046";
    // Ruby scriptlet that prints the string and also returns it.
    String scriptlet = "str = \"" + orig + "\" ; puts str ; str";
    Writer writer = new StringWriter();

    @Test
    public void testCharacterEncodingViaScriptEngine() throws Exception {
        ScriptEngine engine = new ScriptEngineManager().getEngineByExtension("rb");
        ScriptContext context = engine.getContext();
        context.setWriter(writer);

        String result = (String) engine.eval(scriptlet, context);
        checkValues(result);
    }

    @Test
    public void testCharacterEncodingViaScriptContainer() throws Exception {
        ScriptingContainer container = new ScriptingContainer();
        container.setWriter(writer);
        // The next lines don't help anyway, but I half expected them to...
        //container.setCompatVersion(CompatVersion.RUBY1_9);
        //container.setKCode(KCode.UTF8);

        String result = (String) container.runScriptlet(scriptlet);
        checkValues(result);
    }

    // Checks both the value returned by the script and the text it printed.
    private void checkValues(String returnedResult) {
        assertEquals("Wrong result returned", orig, returnedResult);
        assertEquals("Wrong result printed", orig, writer.toString().trim());
    }
}


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Since you have a reproducible test case, would you mind filing this as
a bug in http://bugs.jruby.org/?

Thanks,
/Nick

On Wed, Jun 16, 2010 at 11:38 PM, Trejkaz wrote:



Hi,

On Thu, Jun 17, 2010 at 12:38 AM, Trejkaz wrote:

> Posting here in any case, because someone might have a suggestion of a
> way to prevent it doing this.
>
> Note that the test will not be meaningful if you are on a system which
> is either UTF-8 or a Japanese locale. I’m sure it will pass on both
> of those. So if you’re running the test on those you will need to add
> -Dfile.encoding=windows-1252 to reproduce the problem. I suspect that
> it occurs on any encoding where the sequence is not representable in
> the native encoding.

So, you tried an encoding that is not the default one, right?
This should be a bug in RedBridge’s WriterOutputStream, since it
uses the default encoding when a StringWriter is given. So, as Nick
said, would you file this?
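
As a rough illustration of the kind of problem described here, the sketch below bridges an OutputStream to a Writer and decodes the bytes with an explicitly chosen charset. It is not RedBridge's WriterOutputStream: the class DecodingOutputStream and the helper roundTrip are hypothetical, and the sketch assumes the script side emits UTF-8 bytes. It only shows why decoding with a charset that does not match the bytes (for example, a windows-1252 platform default) garbles the text before it reaches the StringWriter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterBridgeSketch {
    // Hypothetical bridge: buffers bytes and decodes them on flush using
    // the charset passed in, rather than the platform default.
    // Simplification: a multi-byte sequence split across two flushes
    // would be decoded incorrectly; a real bridge needs a CharsetDecoder.
    static final class DecodingOutputStream extends OutputStream {
        private final Writer target;
        private final Charset charset;
        private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

        DecodingOutputStream(Writer target, Charset charset) {
            this.target = target;
            this.charset = charset;
        }

        @Override
        public void write(int b) {
            pending.write(b);
        }

        @Override
        public void flush() throws IOException {
            target.write(new String(pending.toByteArray(), charset));
            pending.reset();
            target.flush();
        }
    }

    // Writes the text as UTF-8 bytes (an assumption about the producer)
    // and returns what the Writer receives after decoding with decodeWith.
    static String roundTrip(String text, Charset decodeWith) throws IOException {
        StringWriter writer = new StringWriter();
        OutputStream out = new DecodingOutputStream(writer, decodeWith);
        out.write(text.getBytes(StandardCharsets.UTF_8));
        out.flush();
        return writer.toString();
    }

    public static void main(String[] args) throws IOException {
        String orig = "\u3070\u304B\u3084\u308D\u3046";
        System.out.println("decoded as UTF-8 matches: "
                + orig.equals(roundTrip(orig, StandardCharsets.UTF_8)));
        System.out.println("decoded as windows-1252 matches: "
                + orig.equals(roundTrip(orig, Charset.forName("windows-1252"))));
    }
}
```

The design point is simply that the bridge has to know which charset the bytes were produced in; relying on the platform default only works when the two happen to agree.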

> I think I reported something along these lines in the past, but I
> can’t seem to find a trace of the mail. This is the first time I have
> reproduced it reliably though.

I think you filed it with the JRuby Embed project on kenai.com.

-Yoko
