Lionel B. wrote the following on 11.09.2007 12:48 :
As I said, the hash is far larger, which is why I can’t just do it with
yourstring.split(//).map{|c| hash[c] || c}.join
Note that if your hash is only used to convert single characters to
single characters, you can use String#tr (or tr!). If you are after
performance, as you must prepare the strings used by String#tr from your
hash, you’ll have to bench it to see if it’s worth it in your use case
even if String#tr is faster in itself.
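To make the String#tr suggestion concrete, here is a minimal sketch of preparing the two tr arguments from a hash. The table below is hypothetical (the real hash is not shown in the thread), and it assumes that every key and value is a single character and that none of them is one of the characters tr treats specially (\, ^ and -).

```ruby
# Hypothetical single-character mapping; the real hash is not shown in the thread.
table = { "a" => "4", "e" => "3", "o" => "0" }

# String#tr maps the i-th character of its first argument to the i-th
# character of its second, so keys and values are joined in the same order.
from = table.keys.join
to   = table.values.join

leet = "hello world".tr(from, to)
puts leet   # h3ll0 w0rld
```

Building from and to once and reusing them is what makes this worth benchmarking against the split/map/join version.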
If you are processing UTF-8 content, String#tr is probably not safe
(there are libraries out there for fixing this, though, IIRC), but my
first answer probably is (assuming $KCODE='u'; require 'jcode'…), as
the regexp processing is UTF-8 aware, so the String#split should be
safe.
Thanks, that worked well. And no, it’s not single chars, which is the only
reason I’m doing it this way…
I have to split on whitespace (/ /) because splitting on characters would
obviously split the text I want to transform, which means it won’t match
if the characters are trailing another word. HTML special chars, for
example:
h = {"&#126;" => "~"}
'hmm &#126;'.split(/ /).map{|c| h[c] || c}.join(' ')
This outputs hmm ~, but obviously things like question marks won’t work.
Maybe I’ll have to use loops and String#tr.
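The limitation described here can be reproduced with a small sketch. The one-entry table and the convert lambda are hypothetical names, written in the spirit of the example above; the point is only that a whole-token lookup misses any token with punctuation attached.

```ruby
# Hypothetical one-entry table; the poster's real hash is far larger.
h = { "&#126;" => "~" }

# Whole-token lookup: each whitespace-separated token must equal a key exactly.
convert = lambda { |s| s.split(/ /).map { |tok| h[tok] || tok }.join(" ") }

alone    = convert.call("hmm &#126;")    # token "&#126;" is a key, so it is replaced
attached = convert.call("hmm &#126;?")   # token "&#126;?" is not a key, so it is kept

puts alone
puts attached
```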
If you’re just trying to translate numeric html entities it’s easy:
str.gsub(/&#(\d+);/){ [$1.to_i].pack('U') }
If you also want named entities, I suggest the htmlentities gem.
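As a quick check of the numeric-entity one-liner, here is a small sketch that wraps it in a method. The method name and the sample string are made up for illustration, and only decimal entities of the form &#NNN; are handled.

```ruby
# Decode decimal numeric entities (e.g. &#126;) into the characters they name.
# pack('U') turns a code point into a UTF-8 encoded string.
def decode_numeric_entities(str)
  str.gsub(/&#(\d+);/) { [$1.to_i].pack('U') }
end

decoded = decode_numeric_entities("hmm &#126; caf&#233;?")
puts decoded
```

Because the regexp matches the entity wherever it appears, it does not matter whether it is followed by a space, a question mark, or another word.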
If it’s for a more general case, how about:
rx = Regexp.new(hash.keys.map{|k|Regexp.escape(k)}.join("|"))
str.gsub(rx){ hash[$&] }
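A slightly expanded sketch of the same idea follows. The replacements table is hypothetical, Regexp.union is used as a shorthand for escaping and joining the keys, and the keys are sorted longest first as a precaution in case one key is a prefix of another.

```ruby
# Hypothetical multi-character mapping; the poster's real hash is not shown.
replacements = { "&amp;" => "&", "&lt;" => "<", "&gt;" => ">", "&#126;" => "~" }

# Longest keys first, so a shorter key that is a prefix of a longer one
# cannot win the alternation.
keys = replacements.keys.sort_by { |k| -k.length }

# Regexp.union escapes each string and joins the results with "|".
rx = Regexp.union(keys)

text   = "a &lt; b &amp;&amp; c&#126;?"
result = text.gsub(rx) { |match| replacements[match] }
puts result   # a < b && c~?
```

On Ruby 1.9 and later, text.gsub(rx, replacements) gives the same result without a block, since gsub accepts a hash as its second argument.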