Encoding, "extended ansi", and unicode in 1.9

denno · June 17, 2010, 4:22pm

I have a routine for converting ANSI text with “extended” IBM characters
to HTML. It is as follows…

EXTENDED_ANSI_TABLE = {
  227.chr => "
",
  32.chr => " ",
  128.chr => "Ç", #128 C, cedilla (199)
  129.chr => "ü", #129 u, umlaut (252)
  130.chr => "é", #130 e, acute accent (233)
  131.chr => "â", #131 a, circumflex accent (226)
  132.chr => "ä", #132 a, umlaut (228)
  133.chr => "à", #133 a, grave accent (224)
  134.chr => "å", #134 a, ring (229)
  135.chr => "ç", #135 c, cedilla (231)
  136.chr => "ê", #136 e, circumflex accent (234)
  137.chr => "ë", #137 e, umlaut (235)
  138.chr => "è", #138 e, grave accent (232)
  139.chr => "ï", #139 i, umlaut (239)
  140.chr => "î", #140 i, circumflex accent (238)
  141.chr => "ì", #141 i, grave accent (236)
  #big huge list continues for pages…
}

def parse_ansi_ext(str)
  EXTENDED_ANSI_TABLE.each_pair { |color, result|
    str = str.gsub(color, result)
  }
  return str
end

This worked in 1.8, no problem.

Under 1.9, if the input contains a character above 127.chr, it bombs
with the error:

“Encoding::CompatibilityError at /
incompatible encoding regexp match (ASCII-8BIT regexp with ISO-8859-1
string)”

I’ve tried various acts of desperation to fix it, to no avail. I
don’t understand exactly what is wrong…

Thanks,

Dennis

denno · June 17, 2010, 4:22pm

On Thu, Jun 17, 2010 at 12:40 AM, Dennis N.
[email protected] wrote:

    132.chr => "ä",   #132 a, umlaut (228)
}
This worked in 1.8, no problem.

If the input contains a character above 127.chr, it now bombs with the error:

“Encoding::CompatibilityError at /
incompatible encoding regexp match (ASCII-8BIT regexp with ISO-8859-1 string)”

I’ve tried various acts of desperation to fix it, to no avail. I
don’t understand exactly what is wrong…

str has the encoding ISO-8859-1, probably inherited from your system
locale.
Convert it to ASCII-8BIT before processing it.

http://blog.grayproductions.net/articles/ruby_19s_string
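
A minimal sketch of the suggestion above, assuming "convert" means relabelling the bytes with String#force_encoding rather than transcoding them. SAMPLE_TABLE and parse_ansi_ext_binary are hypothetical names, and the ASCII-only entity values are an assumption about what the real table holds; they are not taken from the original post.

```ruby
# Hypothetical two-entry excerpt of the table. Integer#chr returns a
# one-byte ASCII-8BIT (binary) string for values above 127. The values
# are assumed to be ASCII-only HTML entities.
SAMPLE_TABLE = {
  128.chr => "&#199;", # C, cedilla
  130.chr => "&#233;", # e, acute accent
}

def parse_ansi_ext_binary(str)
  # Relabel a copy of the input as binary; the bytes themselves are unchanged.
  out = str.dup.force_encoding(Encoding::ASCII_8BIT)
  SAMPLE_TABLE.each_pair do |byte, entity|
    # Pattern and subject now share an encoding, so gsub can compare bytes.
    out = out.gsub(byte, entity)
  end
  out
end

# Simulated input: raw bytes labelled ISO-8859-1, as in the error message.
input = "caf\x82 \x80".dup.force_encoding(Encoding::ISO_8859_1)
puts parse_ansi_ext_binary(input) # prints caf&#233; &#199;
```

The returned string is still labelled ASCII-8BIT. That is harmless while only ASCII remains, but any unmapped high bytes would need their own handling before the result is mixed with UTF-8 text.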

denno · June 17, 2010, 4:26pm

On Wed, Jun 16, 2010 at 6:30 PM, Michael F.
[email protected] wrote:

str has the encoding ISO-8859-1, probably inherited from your system locale.
Convert it to ASCII-8BIT before processing it.

http://blog.grayproductions.net/articles/ruby_19s_string

Thanks, that worked. I guess we should always specify file encoding
from now on.
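
One way to specify the encoding at read time, sketched under the assumption that the input is IBM code page 437 (the table entries are consistent with that, but the thread never names the code page). The temp file only stands in for a real input file, and the entity formatting is an illustrative choice rather than the original routine.

```ruby
require "tempfile"

# Assumption: the bytes on disk are IBM code page 437.
text = Tempfile.create("ansi") do |f|
  f.binmode
  f.write([0x80, 0x81, 0x82].pack("C*")) # CP437 bytes for Ç, ü, é
  f.flush

  # "external:internal": decode the file as IBM437, hand back UTF-8.
  File.read(f.path, encoding: "IBM437:UTF-8")
end

# With real characters in hand, every non-ASCII one can become a
# numeric HTML entity without a hand-written lookup table.
html = text.gsub(/[^[:ascii:]]/) { |ch| format("&#%d;", ch.ord) }

puts html # prints &#199;&#252;&#233;
```

The entity numbers produced this way match the values in parentheses in the table comments (199, 252, 233).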

Take Care,

mark

–
“I’ve got ham but I’m not a hamster.”

-Bill Bailey

ulasmith · February 1, 2022, 8:47am

If you have ANSI file data and want to import it into a Unicode account, you can use any third-party tool that supports importing ANSI files into a Unicode account. In my opinion, you should try the free trial version of this ANSI to Unicode Converter tool, which imports single or multiple ANSI files into a Unicode account in a few simple clicks.

Visit: https://www.osttopstapp.com/ansi-to-unicode-pst.html

LindaMartin · February 6, 2022, 3:49pm

If you want to convert an ANSI file to Unicode, you can use a third-party tool that handles the conversion easily. My suggestion is to try codeprozone.