Making a random string

llinklater · June 29, 2009, 5:18pm

I have been trying to generate a random string. One approach in, say,
Pascal would be something like this:

function GetRandomChar: char;
var
  r: integer;
begin
  r := random(36);
  case r of
    0..25: result := chr(ord('a') + r);
    else result := chr(ord('0') + r - 26);
  end;
end;

I know that there is something like "a".next but I need something more
like "a" + some_random_value. Even though it is more terse than the
Pascal, I am trying to avoid something time-consuming and inelegant like

s = "a"
rand(26).times { s.next! }

Any suggestions?

llinklater · June 29, 2009, 5:33pm

On Mon, Jun 29, 2009 at 5:18 PM, Lloyd L. wrote:

else result := chr(ord('0') + r - 26);
end;
end;

‘Translating’ your Pascal program I’d do something like this:

def get_random_char
  # ?a is an Integer in Ruby 1.8 but a String in 1.9+; .ord works in both
  (r = rand(36)) < 26 ? (?a.ord + r).chr : (?0.ord + r - 26).chr
end

10.times { puts get_random_char } # => 10

>> 5
>> x
>> p
>> k
>> s
>> x
>> d
>> w
>> 9
>> t

–
Il pinguino ha rubato il mio lanciafiamme.

Blog: http://citizen428.net/
Twitter: http://twitter.com/citizen428

llinklater · June 29, 2009, 5:45pm

On 29 June 2009 at 17:18, Lloyd L. wrote:

Any suggestions?

('a'..'z').to_a + ('0'..'9').to_a
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"]

Or even:

[ *'a'..'z', *'0'..'9' ]
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"]

So, if you need to compute a large number of random strings, store the
array in a constant:

CHARS = [ *'a'..'z', *'0'..'9' ]
=> ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
"n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "0",
"1", "2", "3", "4", "5", "6", "7", "8", "9"]

Then, you can use it to build your string:

def build_random_string(len)
  r = ''
  len.times { r << CHARS[rand(36)] }
  r
end
=> nil

build_random_string(10)
=> "fdf93xdwq5"
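
For readers on Ruby 1.9 or later, the same idea can be sketched with Array#sample, which picks one element at random. This is only an illustration: the thread itself indexes the array with rand(36), and the helper name sample_random_string is made up here.

```ruby
# Same alphabet as the CHARS constant in the post above.
CHARS = [*'a'..'z', *'0'..'9'].freeze

# Hypothetical helper: draw `len` characters, each chosen at random
# from CHARS, and join them into one string.
def sample_random_string(len)
  Array.new(len) { CHARS.sample }.join
end

puts sample_random_string(10)
```

Using sample avoids hard-coding the 36, so the alphabet can change without touching the method.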

Fred

llinklater · June 29, 2009, 6:01pm

On Mon, Jun 29, 2009 at 5:32 PM, Michael K. wrote:

‘Translating’ your Pascal program I’d do something like this:

Oh yeah, you said you want to build a string:

def generate_string(len)
  raise ArgumentError if len < 1
  (1..len).map { get_random_char }.join
end

generate_string(10) # => "mjaxig1w35"

or

def generate_string(len)
  raise ArgumentError if len < 1
  s = ''
  len.times { s << get_random_char }
  s
end

generate_string(10) # => "dtuou833gq"


llinklater · June 29, 2009, 9:45pm

On 29.06.2009 17:18, Lloyd L. wrote:

else result := chr(ord('0') + r - 26);
Any suggestions?

Stealing generate_id from my git repo:

https://github.com/rklemme/muppet-laboratories/blob/16dc8851554bf29cee37a0dd75a7869c99b10c7d/bin/test-gen.rb

#!/usr/local/bin/ruby19 -w

# create a sample log file to stdout

require 'optparse'

#
# types
#

TIME_FMT = '%Y-%m-%d %H:%M:%S.%3N'.freeze
DEFAULT_DURATION = 60 # 1 hour

MESSAGES = [
  'GET',
  'SET',
  <<EOS
java.io.FileNotFoundException: fred.txt
	at java.io.FileInputStream.<init>(FileInputStream.java)
	at java.io.FileInputStream.<init>(FileInputStream.java)


Kind regards

robert

llinklater · June 29, 2009, 6:45pm

That works wonderfully. Thanks for the help.

llinklater · June 29, 2009, 10:38pm

From: “Lloyd L.”

I have been trying to generate a random string.

I use:

def gen_random_string(len)
  (0...len).collect { rand(36).to_s(36) }.map { |x|
    (rand < 0.5) ? x : x.upcase }.join
end

…for short strings of length 64 or whatever. For very long strings, the
above may be a bit inefficient. (To generate a 1_000_000 character
string takes about 2.4 seconds on my system.)

Regards,

Bill

llinklater · June 29, 2009, 11:05pm

On 29.06.2009 22:38, Bill K. wrote:

above may be a bit inefficient. (To generate a 1_000_000 character
string takes about 2.4 seconds on my system.)

Benchmark time!

robert

ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
user system total real
1 generate_id 0.110000 0.000000 0.110000 ( 0.105000)
1 gen_random_string 0.015000 0.000000 0.015000 ( 0.015000)
1 g3 0.016000 0.000000 0.016000 ( 0.014000)
1 g4 0.000000 0.000000 0.000000 ( 0.007000)
2 generate_id 0.015000 0.000000 0.015000 ( 0.004000)
2 gen_random_string 0.016000 0.000000 0.016000 ( 0.021000)
2 g3 0.031000 0.000000 0.031000 ( 0.026000)
2 g4 0.016000 0.000000 0.016000 ( 0.006000)
4 generate_id 0.000000 0.000000 0.000000 ( 0.010000)
4 gen_random_string 0.031000 0.000000 0.031000 ( 0.030000)
4 g3 0.047000 0.000000 0.047000 ( 0.043000)
4 g4 0.016000 0.000000 0.016000 ( 0.011000)
8 generate_id 0.015000 0.000000 0.015000 ( 0.012000)
8 gen_random_string 0.047000 0.000000 0.047000 ( 0.051000)
8 g3 0.078000 0.000000 0.078000 ( 0.081000)
8 g4 0.032000 0.000000 0.032000 ( 0.021000)
16 generate_id 0.015000 0.000000 0.015000 ( 0.020000)
16 gen_random_string 0.078000 0.000000 0.078000 ( 0.083000)
16 g3 0.157000 0.000000 0.157000 ( 0.150000)
16 g4 0.046000 0.000000 0.046000 ( 0.044000)
32 generate_id 0.032000 0.000000 0.032000 ( 0.038000)
32 gen_random_string 0.156000 0.000000 0.156000 ( 0.160000)
32 g3 0.281000 0.000000 0.281000 ( 0.296000)
32 g4 0.094000 0.000000 0.094000 ( 0.087000)
64 generate_id 0.078000 0.000000 0.078000 ( 0.080000)
64 gen_random_string 0.313000 0.000000 0.313000 ( 0.312000)
64 g3 0.562000 0.000000 0.562000 ( 0.563000)
64 g4 0.188000 0.000000 0.188000 ( 0.177000)
128 generate_id 0.140000 0.000000 0.140000 ( 0.151000)
128 gen_random_string 0.625000 0.000000 0.625000 ( 0.638000)
128 g3 1.156000 0.000000 1.156000 ( 1.211000)
128 g4 0.360000 0.000000 0.360000 ( 0.364000)
256 generate_id 0.328000 0.000000 0.328000 ( 0.322000)
256 gen_random_string 1.172000 0.000000 1.172000 ( 1.236000)
256 g3 2.172000 0.000000 2.172000 ( 2.223000)
256 g4 0.703000 0.000000 0.703000 ( 0.781000)
512 generate_id 0.625000 0.000000 0.625000 ( 0.624000)
512 gen_random_string 2.422000 0.000000 2.422000 ( 2.502000)
512 g3 4.406000 0.000000 4.406000 ( 4.674000)
512 g4 1.406000 0.000000 1.406000 ( 1.453000)
ruby 1.9.1p129 (2009-05-12 revision 23412) [i386-cygwin]
user system total real
1 generate_id 0.000000 0.000000 0.000000 ( 0.002000)
1 gen_random_string 0.015000 0.000000 0.015000 ( 0.009000)
1 g3 0.016000 0.000000 0.016000 ( 0.016000)
1 g4 0.016000 0.000000 0.016000 ( 0.006000)
2 generate_id 0.000000 0.000000 0.000000 ( 0.002000)
2 gen_random_string 0.000000 0.000000 0.000000 ( 0.009000)
2 g3 0.031000 0.000000 0.031000 ( 0.018000)
2 g4 0.000000 0.000000 0.000000 ( 0.004000)
4 generate_id 0.000000 0.000000 0.000000 ( 0.003000)
4 gen_random_string 0.016000 0.000000 0.016000 ( 0.018000)
4 g3 0.031000 0.000000 0.031000 ( 0.022000)
4 g4 0.000000 0.000000 0.000000 ( 0.007000)
8 generate_id 0.015000 0.000000 0.015000 ( 0.005000)
8 gen_random_string 0.032000 0.000000 0.032000 ( 0.031000)
8 g3 0.031000 0.000000 0.031000 ( 0.032000)
8 g4 0.016000 0.000000 0.016000 ( 0.012000)
16 generate_id 0.015000 0.000000 0.015000 ( 0.013000)
16 gen_random_string 0.063000 0.000000 0.063000 ( 0.058000)
16 g3 0.047000 0.000000 0.047000 ( 0.054000)
16 g4 0.016000 0.000000 0.016000 ( 0.023000)
32 generate_id 0.031000 0.000000 0.031000 ( 0.029000)
32 gen_random_string 0.109000 0.000000 0.109000 ( 0.104000)
32 g3 0.110000 0.000000 0.110000 ( 0.112000)
32 g4 0.031000 0.000000 0.031000 ( 0.040000)
64 generate_id 0.063000 0.000000 0.063000 ( 0.060000)
64 gen_random_string 0.218000 0.000000 0.218000 ( 0.232000)
64 g3 0.203000 0.000000 0.203000 ( 0.209000)
64 g4 0.094000 0.000000 0.094000 ( 0.095000)
128 generate_id 0.140000 0.000000 0.140000 ( 0.151000)
128 gen_random_string 0.407000 0.000000 0.407000 ( 0.503000)
128 g3 0.375000 0.000000 0.375000 ( 0.380000)
128 g4 0.187000 0.000000 0.187000 ( 0.180000)
256 generate_id 0.313000 0.000000 0.313000 ( 0.365000)
256 gen_random_string 0.812000 0.000000 0.812000 ( 0.806000)
256 g3 0.688000 0.000000 0.688000 ( 0.708000)
256 g4 0.359000 0.000000 0.359000 ( 0.352000)
512 generate_id 0.578000 0.000000 0.578000 ( 0.599000)
512 gen_random_string 1.547000 0.000000 1.547000 ( 1.549000)
512 g3 1.328000 0.000000 1.328000 ( 1.339000)
512 g4 0.735000 0.000000 0.735000 ( 0.807000)

require 'benchmark'

def generate_id(len = 15)
  s = ''
  len.times { s << 97 + rand(26) }
  s.freeze
end

def gen_random_string(len)
  (0...len).collect { rand(36).to_s(36) }.map { |x|
    (rand < 0.5) ? x : x.upcase }.join
end

def g3(len)
  s = "." * len
  s.gsub!(/./) { (97 + rand(26)).chr }
  s
end

def g4 len
  s = "." * len
  len.times { |i| s[i] = (97 + rand(26)).chr }
  s
end

REP = 1000

Benchmark.bm 25 do |b|
  len = 1

  while len < 1_000
    b.report '%7d generate_id' % len do
      REP.times do
        generate_id len
      end
    end

    b.report '%7d gen_random_string' % len do
      REP.times do
        gen_random_string len
      end
    end

    b.report '%7d g3' % len do
      REP.times do
        g3 len
      end
    end

    b.report '%7d g4' % len do
      REP.times do
        g4 len
      end
    end

    len <<= 1
  end
end

llinklater · June 30, 2009, 10:57am

Lloyd L. wrote:

function GetRandomChar: char;
var
  r: integer;
begin
  r := random(36);
  case r of
    0..25: result := chr(ord('a') + r);
    else result := chr(ord('0') + r - 26);
  end;
end;

In this case, you can just do:

def get_random_char
rand(36).to_s(36)
end
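
To see why this one-liner covers the same 36 characters, here is a small sketch that lists every digit Integer#to_s(36) can produce. The variable name digits is just for illustration.

```ruby
# Convert each value 0..35 to its single base-36 digit.
digits = (0...36).map { |i| i.to_s(36) }.join

puts digits  # => 0123456789abcdefghijklmnopqrstuvwxyz
```

Note that the digits come first and the letters second, the reverse of the Pascal ordering, which makes no difference when the value is random.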

Since Ruby allows arbitrary bignums, you can also get strings this way
too. e.g. for an 8-digit string:

rand(36 ** 8).to_s(36)

However there’s a bug there, because numbers with one or more leading
zeros will be truncated. How to left-pad a non-decimal number with zeros
isn’t actually that obvious. Maybe someone can point out something
simpler than this:

("0"*8 + rand(36 ** 8).to_s(36))[-8..-1]

(Unfortunately, “%08s” as a format string pads with spaces not zeros)
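
A worked example of the truncation, as a sketch with fixed values instead of rand so the output is predictable. WIDTH, short, full and padded are names chosen for this illustration only.

```ruby
WIDTH = 8

# The largest value that needs only WIDTH - 1 base-36 digits...
short = 36**(WIDTH - 1) - 1
# ...and the smallest value that needs all WIDTH digits.
full = 36**(WIDTH - 1)

puts short.to_s(36)  # => zzzzzzz  (7 characters)
puts full.to_s(36)   # => 10000000 (8 characters)

# The slicing workaround from the post restores the missing digit.
padded = ("0" * WIDTH + short.to_s(36))[-WIDTH..-1]
puts padded          # => 0zzzzzzz
```

Every value below 36**7 comes out short, which is 1 in 36 of all possible draws from rand(36 ** 8).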

llinklater · June 30, 2009, 11:07am

2009/6/30 Brian C.:

end;
rand(36 ** 8).to_s(36)

Interesting idea! Does rand have enough precision to fill arbitrarily
large numbers?
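
The precision question can at least be checked empirically. The following is only a sketch: it assumes a Ruby with Integer#bit_length (2.1 or later), and the 40-character width is an arbitrary choice. It shows that rand accepts a bound far beyond 64 bits and returns values that use the extra bits. It says nothing about the statistical quality of the generator.

```ruby
# Assumption: a 40-character base-36 string, i.e. an upper bound of 36**40
# (roughly 207 bits).
LIMIT = 36**40

samples = Array.new(5) { rand(LIMIT) }

samples.each do |n|
  # bit_length shows how many bits each draw actually occupies.
  puts "#{n.bit_length} bits: #{n.to_s(36).rjust(40, '0')}"
end
```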

However there’s a bug there, because numbers with one or more leading
zeros will be truncated. How to left-pad a non-decimal number with zeros
isn’t actually that obvious. Maybe someone can point out something
simpler than this:

("0"*8 + rand(36 ** 8).to_s(36))[-8..-1]

(Unfortunately, “%08s” as a format string pads with spaces not zeros)

irb(main):012:0> "e".rjust 10, "0"
=> "000000000e"
irb(main):013:0>

Kind regards

robert

llinklater · June 30, 2009, 11:11am

Robert K. wrote:

irb(main):012:0> "e".rjust 10, "0"
=> "000000000e"
irb(main):013:0>

Neat, thanks. So:

rand(36 ** 8).to_s(36).rjust(8, "0")

Useful for random binary strings too:

rand(2 ** 8).to_s(2).rjust(8, "0")
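
Both one-liners follow the same pattern, so they can be folded into a single method. This is a sketch, not something posted in the thread: the name random_digits and the guard against non-positive lengths are assumptions made here.

```ruby
# Hypothetical helper: `base` must be in 2..36, the range that
# Integer#to_s accepts.
def random_digits(len, base = 36)
  raise ArgumentError, "len must be positive" if len < 1
  # Draw one big number, render it in the given base, then left-pad
  # with zeros so short results still have `len` characters.
  rand(base**len).to_s(base).rjust(len, "0")
end

puts random_digits(8)     # eight characters from 0-9 and a-z
puts random_digits(8, 2)  # eight characters from 0 and 1
```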

llinklater · June 30, 2009, 10:20pm

All very groovy stuff! The final version I think I shall use is a bit
of an amalgam of your input.

def randString(len)
  s = ""
  1.upto(len) { s << rand(36).to_s(36) }
  s.upcase
end

At one point I had “unless len < 1” but this functions the same way.

Thanks again, everyone!

llinklater · June 30, 2009, 11:06pm

On 30.06.2009 22:20, Lloyd L. wrote:

All very groovy stuff! The final version I think I shall use is a bit
of an amalgam of your input.

def randString(len)
s = ""
1.upto(len) { s << rand(36).to_s(36) }

I’d prefer len.times but this is just cosmetic.

s.upcase

You can probably squeeze out a bit of performance, especially for large
strings, by replacing this with “s.upcase!; s”.

end

At one point I had “unless len < 1” but this functions the same way.

Thanks again, everyone!

You’re welcome!

Kind regards

robert

llinklater · July 1, 2009, 7:57am

Robert K. wrote:

s.upcase

You can probably squeeze out a bit of performance, especially for large
strings, by replacing this with “s.upcase!; s”.

A very bad idea IMO. Why make your code larger and less readable for the
sake of perhaps one microsecond or less? If performance matters on this
microscopic scale, you should be writing in C.

llinklater · July 1, 2009, 12:38pm

On 1 Jul 2009, at 06:57, Brian C. wrote:

microscopic scale, you should be writing in C.

I agree that “s.upcase!; s” is ugly, and I really wish the bang
methods returned self on success and raised an exception on error as
that’s closer to how I use them than the current approach, but it’s
still a well-established idiom and hardly likely to confuse even a
neophyte so long as they bother to RTFM.

As to the notion that we should only write performant code in C… why
complicate a codebase by using two languages (with all the debugging
nightmare that can entail) if the language you’re already working in
is capable of doing the job anyway?

Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net

raise ArgumentError unless @reality.responds_to? :reason

llinklater · July 1, 2009, 1:31pm

Eleanor McHugh wrote:

As to the notion that we should only write performant code in C… why
complicate a codebase by using two languages (with all the debugging
nightmare that can entail) if the language you’re already working in
is capable of doing the job anyway?

“Doing the job” is the critical part of that sentence.

In my opinion, if (and only if) your existing program won’t do the job
within acceptable parameters, should you start modifying the code to
make it acceptable. Since you should have “done the simplest thing that
will possibly work” in the first place, then by definition, the modified
code will be more complex.

But more importantly: profile first, modify second. I find it highly
unlikely that in a real-world program, removing that one single hidden
string dup will make a noticeable improvement. More likely you’ll want
to change your algorithm or data structures.

Of course, if in your particular application this change does improve
performance noticeably, then by all means make the change (and add a
comment as to why it was necessary to write it in a non-obvious way, so
that somebody doesn’t simplify it back again later). But I think that’s
the point: write more complex code only if it makes a measurable
improvement, not on the off-chance that it might.
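
As a sketch of what "profile first" could look like for this particular question, the snippet below times both variants with Benchmark.realtime. The string size and repetition count are arbitrary assumptions, and the results depend on the Ruby version and machine, so no numbers are claimed here.

```ruby
require 'benchmark'

source = "a" * 100_000  # arbitrary size, chosen only for illustration
n = 200                 # arbitrary repetition count

# Variant 1: upcase returns a new string on every call.
copying = Benchmark.realtime do
  n.times { source.dup.upcase }
end

# Variant 2: upcase! modifies the receiver in place.
in_place = Benchmark.realtime do
  n.times { s = source.dup; s.upcase!; s }
end

puts format("upcase:  %.4f s", copying)
puts format("upcase!: %.4f s", in_place)
```

Both variants start from a fresh dup of the same string, so the measured difference comes from the extra copy made by upcase.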

llinklater · July 1, 2009, 1:45pm

Hi –

On Wed, 1 Jul 2009, Eleanor McHugh wrote:

I agree that “s.upcase!; s” is ugly, and I really wish the bang methods
returned self on success and raised an exception on error as that’s closer to
how I use them than the current approach, but it’s still a well-established
idiom and hardly likely to confuse even a neophyte so long as they bother to
RTFM.

I don’t think the distinction is between success and failure, though.
(str="ABC").upcase! succeeds – it just doesn’t change str. (I’m not a
huge fan of the nil returns either, by the way.)

Of course there’s always:

str = "ABC"
str.tap(&:upcase!)
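
A short sketch of the return values under discussion. The variable names are illustrative, and dup is used only so the example also works when string literals are frozen.

```ruby
# upcase! returns the receiver when it changed something...
changed = "abc".dup.upcase!
p changed    # => "ABC"

# ...and nil when there was nothing to change.
unchanged = "ABC".dup.upcase!
p unchanged  # => nil

# tap yields the receiver to the block and then returns the receiver,
# so the result is the string whether or not upcase! changed it.
tapped = "ABC".dup.tap(&:upcase!)
p tapped     # => "ABC"
```

This is why a method ending in a bare s.upcase! can return nil, and why the “s.upcase!; s” idiom or tap is needed.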

David

llinklater · July 1, 2009, 1:32pm

2009/7/1 Brian C.:

Robert K. wrote:

s.upcase

You can probably squeeze out a bit of performance, especially for large
strings, by replacing this with “s.upcase!; s”.

A very bad idea IMO. Why make your code larger and less readable for the
sake of perhaps one microsecond or less?

I do not subscribe to the “less readable” assessment of yours -
uglier, yes. Also note that a microsecond per execution can be
harmful when the method is invoked often and / or the rest of the code
is not much costlier.

If performance matters on this
microscopic scale, you should be writing in C.

I couldn’t have put it better than Ellie. Notice that object
allocation is one of the most expensive operations in Ruby. So it may
pay off to save one. Btw, this is also the reason why my solution
only works with Fixnums.

Apart from that I find unnecessary object creation ugly. You may call
that personal taste but with GC in mind there is also a quantifiable
reason to not waste objects.

Kind regards

robert

llinklater · July 1, 2009, 2:34pm

On 1 Jul 2009, at 12:31, Robert K. wrote:

I couldn’t have put it better than Ellie. Notice that object
allocation is one of the most expensive operations in Ruby. So it may
pay off to save one. Btw, this is also the reason why my solution
only works with Fixnums.

Apart from that I find unnecessary object creation ugly. You may call
that personal taste but with GC in mind there is also a quantifiable
reason to not waste objects.

I share that view. Being promiscuous with resources just because it’s
simple to do so seems like a sure-fire way to build applications with
intrinsic scalability problems. It may save me some effort today, but
experience taught me long ago that it’s a decision that will come back
to haunt me.

Ellie


llinklater · July 1, 2009, 2:29pm

On 1 Jul 2009, at 12:44, David A. Black wrote:

I don’t think the distinction is between success and failure, though.
(str="ABC").upcase! succeeds – it just doesn’t change str. (I’m not a
huge fan of the nil returns either, by the way.)

Of course there’s always:

str = "ABC"
str.tap(&:upcase!)

I’m still adjusting to these 1.9 conveniences

Ellie

