CORE - Inconsistent Handling of Uninitialized Variables

dubstep · June 15, 2011, 7:53pm

puts "\n== Testin in MAIN Context =="

local = 'local'
@instance = 'instance'
@@class = 'class'
$global = 'global'

puts "#@instance, #@@class, #$global, #{local}"

begin puts $empty_global == nil rescue puts "undefined" end
begin puts @empty_instance == nil rescue puts "undefined" end
begin puts empty_local == nil rescue puts "undefined" end
begin puts @@empty_class == nil rescue puts "undefined" end

class VarTest
  puts "\n== Testin in Class Context =="

  local = 'local'
  @instance = 'instance'
  @@class = 'class'
  $global = 'global'

  puts "#@instance, #@@class, #$global, #{local}"

  begin puts $empty_global == nil rescue puts "undefined" end
  begin puts @empty_instance == nil rescue puts "undefined" end
  begin puts empty_local == nil rescue puts "undefined" end
  begin puts @@empty_class == nil rescue puts "undefined" end

end
#OUTPUT

== Testin in MAIN Context ==
instance, class, global, local
true
true
undefined
undefined

== Testin in Class Context ==
instance, class, global, local
true
true
undefined
undefined

The inconsistency:

become nil, do not raise error:
$empty_global
@empty_instance

are undefined, raise an error:
empty_local
@@empty_class

Is this a defect or is there an explanation for this behaviour?

.

Ilias_L · June 15, 2011, 8:06pm

On Jun 15, 2011, at 1:50 PM, Ilias L. wrote:

Is this a defect or is there an explanation for this behaviour?

I can speak to local variables; class variables still break my brain a
little.

Local variables are created during parsing and initialized to nil when
they are first encountered. They are available for use at any point in
the same scope lexically after their initialization.

This is done because if you attempt to read a local variable lexically
before it has been introduced in a local context, we can be sure you
have made an error.

For global variables and instance variables, we cannot have such lexical
guarantees. While it may be obvious within the scope of a simple program
that a global or ivar has not yet been introduced, it is not a local
property. Thus, access is permitted; if the variable has not been
initialized, then it is initialized to nil.

The same occurs when Ruby sees an introduced, but uninitialized, local
variable:

def foo(y)
  if false
    x = y
  end
  p x
end
foo(10)

#=> nil

The local is seen at “x = y”, created in the local variable table, and
initialized to nil. The reference to “x” later succeeds because the
local has been created, though never initialized.
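
One way to observe this from inside Ruby is Kernel#local_variables, which lists the locals the parser has registered for the current scope. The following is only a minimal sketch; the method names foo_locals and undeclared_local are made up for this illustration.

```ruby
# Hypothetical helper, named only for this illustration.
def foo_locals(y)
  if false
    x = y          # never executed, but the parser still registers x
  end
  # local_variables reports the names known in this scope,
  # whether or not an assignment has actually run.
  [local_variables, x]
end

names, value = foo_locals(10)
p names   # includes :x even though "x = y" never ran
p value   # nil

# Hypothetical counter-example: the name is never assigned in this scope,
# so Ruby treats it as a method call and the lookup fails.
def undeclared_local
  never_introduced
rescue NameError => e
  e.class
end

p undeclared_local   # NameError
```

The second method shows the other half of the rule: a bare name with no assignment anywhere earlier in the scope is not a local at all.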

Michael E.
[email protected]
http://carboni.ca/

Ilias_L · June 15, 2011, 11:00pm

On 15 Jun, 21:05, Michael E. [email protected] wrote:

On Jun 15, 2011, at 1:50 PM, Ilias L. wrote:

Is this a defect or is there an explanation for this behaviour?

I can speak to local variables; class variables still break my brain a little.
[…] - (explanation)

You have company now, because all this “breaks my brain”, too.

I usually understand better by example, so I will focus on what I’ve
understood from the code (and your explanation):

The same occurs when Ruby sees an introduced, but uninitialized, local
variable:

The code

def foo(y)
  if false
    x = y
    # x is introduced (exists), but not yet assigned (value: nil)
  end
  p x
end
foo(10)

#=> nil

The local is seen at “x = y”, created in the local variable table, and
initialized to nil. The reference to “x” later succeeds because the
local has been created, though never initialized.

I understand this.

To simplify (and thus protect our brains), we discuss only
locals/globals.

I understand the “set_local”, and it works as expected.

def set_local(y)
  if false
    x = y
  end
  p x    #=> nil
  #p x2  #=> undefined
end
set_local(10)

def set_global(y)
  if false
    $x = y
  end
  p $x                               #=> nil
  p global_variables.include?(:$x)   #=> true
  p $xx                              #=> nil
  p global_variables.include?(:$xx)  #=> true
end
set_global(10)

I don’t understand why “p $xx” does not fail with a “not defined”
error.

Technically, the existence of the variable is observable all over the
program.

I don’t see the reason why “$xx” is created and set to nil, instead of
throwing a “not defined” error (which I would expect when accessing
an undefined var, which could be e.g. a typo of mine).

Can this be demonstrated with code?

.

Ilias_L · June 16, 2011, 12:05am

On Wednesday, June 15, 2011 01:05:39 PM Michael E. wrote:

On Jun 15, 2011, at 1:50 PM, Ilias L. wrote:

Is this a defect or is there an explanation for this behaviour?

I can speak to local variables; class variables still break my brain a
little.

I tend to think class variables are a defect in the first place. If you
need one for some reason, it’s usually much better to define it as an
instance variable on the class, rather than as a class variable. That
is, instead of:

class Foo
  @@bar = true

  def hello
    if @@bar
      puts 'Hello, world!'
    else
      puts 'Goodbye, world!'
    end
  end

  def depressed!
    @@bar = false
  end
end

Do something like this instead:

class Foo
  class << self
    attr_accessor :bar
  end

  self.bar = true

  def hello
    if self.class.bar
      puts 'Hello, world!'
    else
      puts 'Goodbye, world!'
    end
  end

  def depressed!
    self.class.bar = false
  end
end

Of course, that’s a terrible example of why you’d ever want to do such a
thing - there’s rarely a reason to have anything approaching class
variables - but if you need them, I think it makes much more sense to do
them that way. This also keeps them somewhat saner across inheritance,
in my opinion. That’s the part that breaks your brain, right? This way,
while subclasses seem to inherit class methods from the superclass, they
won’t automatically access the same values, though it’s trivial to
override them to do that:

class Other < Foo
  class << self
    def bar
      superclass.bar
    end
    def bar=(value)
      superclass.bar = value
    end
  end
end

I’d probably use the superclass value as a default until someone
overrides it on this class, so more like:

class Other < Foo
  class << self
    def bar
      @bar || superclass.bar
    end
  end
end

It’s still not as clean as I’d like it to be, but at least this
generally obeys basic concepts I’ve learned elsewhere which I actually
understand. I have never actually understood class variables.
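
To make the inheritance difference concrete, here is a small sketch contrasting the two approaches side by side. The names Base, Child, shared and own are hypothetical and chosen only for this illustration; they do not come from the posts above.

```ruby
class Base
  @@shared = :base   # class variable: one slot for the whole hierarchy
  @own = :base       # class-level instance variable: belongs to Base only

  class << self
    attr_accessor :own
  end

  def self.shared
    @@shared
  end

  def self.shared=(value)
    @@shared = value
  end
end

class Child < Base
end

Child.shared = :child   # writes the single @@shared slot held by Base
Child.own = :child      # writes an instance variable on Child itself

p Base.shared   # :child  (the subclass changed the parent's value)
p Base.own      # :base   (untouched)
p Child.own     # :child
```

With the class-level instance variable, a subclass starts with nil and never affects its parent, which is the "saner across inheritance" behaviour described above.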

Ilias_L · June 18, 2011, 4:01am

On 15 Jun, 20:49, Ilias L. [email protected] wrote:
[…]

A simplified version, using only locals and globals

def set_local(y)
  if false
    x = y
  end
  p x    #=> nil
  #p xx  #=> undefined
end
set_local(10)

def set_global(y)
  if false
    $x = y
  end
  p $x                               #=> nil
  p global_variables.include?(:$x)   #=> true
  p $xx                              #=> nil
  p global_variables.include?(:$xx)  #=> true
end
set_global(10)

I understand the “set_local”, and it works as expected.

I don’t understand why “p $xx” does not fail with a “not defined”
error.

Technically, the existence of the variable is observable all over the
program.

I don’t see the reason why “$xx” is created and set to nil, instead of
throwing a “not defined” error (which I would expect when accessing
an undefined var, which could be e.g. a typo of mine).

.

Ilias_L · June 19, 2011, 6:21pm

On 19 Jun, 18:54, Christopher D. [email protected] wrote:

On Wed, Jun 15, 2011 at 10:50 AM, Ilias L. [email protected] wrote:

Is this a defect or is there an explanation for this behaviour?

Well, I haven’t checked the draft ISO standard or RubySpec, but the
[…]

There is a simplified version, see message from 2011-06-18

.

Ilias_L · June 19, 2011, 6:33pm

On Jun 19, 2011, at 11:54 AM, Christopher D. wrote:

Class variables probably should behave like instance variables, but
then, class variables are almost never the right tool to use anyway.

This is the crux of where Ilias does have a bit of a point: class
variables always require initialization, instance variables and globals
do not. The same rationale for ivars and globals not requiring
initialization likely applies to class variables. I honestly always
assumed class variables behaved like ivars and globals until this thread
started.

I imagine the reason behind class variables requiring initialization is
because they’re hard enough to use correctly to begin with. However,
changing them to be auto-initialized to nil would be more consistent.

I don’t care either way - as Chris pointed out, class variables are
almost always wrong. But for long-term internal consistency, it might
actually be worth discussing.
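
For reference, the asymmetry can be sketched inside a single class as follows. Probe and its method names are hypothetical, used only for this illustration.

```ruby
class Probe
  # Reading an instance variable that was never assigned yields nil.
  def self.read_ivar
    @never_set
  end

  # Reading a class variable that was never assigned raises NameError.
  def self.read_cvar
    @@never_set
  rescue NameError => e
    e.class
  end
end

p Probe.read_ivar                               # nil
p Probe.read_cvar                               # NameError
p Probe.class_variable_defined?(:@@never_set)   # false
p $never_set_global                             # nil (a global behaves like the ivar)
```

Module#class_variable_defined? gives a way to check for a class variable without triggering the error.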

Michael E.
[email protected]
http://carboni.ca/

Ilias_L · June 19, 2011, 5:54pm

On Wed, Jun 15, 2011 at 10:50 AM, Ilias L. [email protected] wrote:

Is this a defect or is there an explanation for this behaviour?

Well, I haven’t checked the draft ISO standard or RubySpec, but the
behavior is certainly widely known and documented in many Ruby books,
and lots of code relies on the current behavior, and would break if it
were changed. So, I’m going to say it’s a feature rather than a defect,
whether or not it was originally an intended feature.

And I suspect that it’s pretty well thought out, with the possible
exception of class variable behavior. Globals and instance variables
are often first assigned remotely from the site where they are used, so
reading them when they haven’t been set isn’t really a sign of a logic
error - defaulting to nil makes sense. Local variables are generally
(there are some possible exceptions, I think) set directly in the local
context where they are used, so using one without it being assigned
first is a sign of an error, so not defaulting to nil makes sense.

Class variables probably should behave like instance variables, but
then, class variables are almost never the right tool to use anyway.

Ilias_L · June 20, 2011, 12:08am

On Jun 15, 2011, at 5:00 PM, Ilias L. wrote:

I don’t understand why “p $xx” does not fail with a “not defined”
error.

Because the parser ‘sees’ the variable $xx and defines it before “p $xx”
gets executed.

This behavior is different than the local variable case because global
variables can be discovered by their name alone (i.e. they start with
‘$’). This is a syntactic property that the parser can take advantage
of. That property doesn’t exist for local variables because, in some
contexts, they are indistinguishable syntactically from a zero-argument
method call. Consider these examples separately and not as part of a
single snippet of code:

a = b      # 'a' is clearly a local variable, while 'b' could be a local
           # variable or a zero-argument method call

c = d()    # 'd()' is clearly a zero-argument method call and not a
           # local variable; 'c' is a local variable

do_something_with(x, y, z)   # x, y, and z could be method calls or
                             # variables
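
A runnable sketch of that ambiguity, under the assumption that a method and a local share the same name; the method names b and demo are hypothetical and used only for this illustration.

```ruby
# A zero-argument method that happens to be called b.
def b
  :from_method
end

def demo
  first = b         # no local named b has been seen yet: this is a method call
  b = :from_local   # from here on, the parser treats b as a local variable
  second = b        # same token, now resolved as the local
  [first, second]
end

p demo   # [:from_method, :from_local]
```

The bare token b means two different things within one method body, depending only on whether an assignment to b appears earlier in the source.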

Gary W.

Ilias_L · June 19, 2011, 10:46pm

(slightly corrected, order of variable types)

puts "\n== Testin in MAIN Context =="

$global = 'global'
@instance = 'instance'
@@class = 'class'
local = 'local'

puts "#$global, #@instance, #@@class, #{local}"

begin puts $empty_global == nil rescue puts "undefined" end
begin puts @empty_instance == nil rescue puts "undefined" end
begin puts @@empty_class == nil rescue puts "undefined" end
begin puts empty_local == nil rescue puts "undefined" end

class VarTest
  puts "\n== Testin in Class Context =="

  $global = 'global'
  @instance = 'instance'
  @@class = 'class'
  local = 'local'

  puts "#$global, #@instance, #@@class, #{local}"

  begin puts $empty_global == nil rescue puts "undefined" end
  begin puts @empty_instance == nil rescue puts "undefined" end
  begin puts @@empty_class == nil rescue puts "undefined" end
  begin puts empty_local == nil rescue puts "undefined" end

end

#OUTPUT

== Testin in MAIN Context ==
global, instance, class, local
true
true
undefined
undefined

== Testin in Class Context ==
global, instance, class, local
true
true
undefined
undefined

Ilias_L · June 20, 2011, 1:21am

On Jun 19, 2011, at 15:26 , Michael E. wrote:

On Jun 19, 2011, at 6:01 PM, Gary W. wrote:

Because the parser ‘sees’ the variable $xx and defines it before “p $xx” gets
executed.

Gary,

As I summarized in an e-mail I sent on this thread earlier today, the
distinction being drawn is between global variables and class variables.
Your same argument for global variables applies equally well to class
variables, yet class variables require initialization.

Global variables are just that… global. There’s no two ways about it.

Class variables are shared amongst a tree… but WHERE in the tree is
defined by where it is initialized.

And for the record, I think that your assertion that class variables
“are almost always wrong” is false. Like all things in software design,
they can be used poorly or they can be used well. When they’re used
well, they’re perfect for the job. When they’re not, they’re horrible. I
use class variables all the time to good effect (some of these are from
Eric):

% p4 grep -le @@ //src/*/dev/lib/…
//src/IMAPCleanse/dev/lib/imap_client.rb#8
//src/Inliner/dev/lib/inliner.rb#3
//src/ParseTree/dev/lib/parse_tree_extensions.rb#3
//src/RubyInline/dev/lib/inline.rb#37
//src/Sphincter/dev/lib/sphincter/search.rb#5
//src/Sphincter/dev/lib/sphincter/tasks.rb#3
//src/ZenHacks/dev/lib/r2c_hacks.rb#6
//src/ZenHacks/dev/lib/zenoptimize.rb#6
//src/ZenTest/dev/lib/autotest.rb#123
//src/ZenTest/dev/lib/autotest/autoupdate.rb#2
//src/ZenTest/dev/lib/autotest/isolate.rb#2
//src/ZenTest/dev/lib/autotest/rcov.rb#5
//src/ZenTest/dev/lib/functional_test_matrix.rb#3
//src/ZenTest/dev/lib/zentest_mapping.rb#4
//src/ZenWeb/dev/lib/ZenWeb.rb#6
//src/ZenWeb/dev/lib/ZenWeb/MetadataRenderer.rb#2
//src/ar_mailer/dev/lib/action_mailer/ar_mailer.rb#10
//src/flay/dev/lib/flay.rb#26
//src/flog/dev/lib/flog.rb#51
//src/heckle/dev/lib/autotest/heckle.rb#1
//src/heckle/dev/lib/heckle.rb#47
//src/heckle/dev/lib/test_unit_heckler.rb#20
//src/hoe/dev/lib/hoe.rb#148
//src/hoe/dev/lib/hoe/deps.rb#5
//src/imap_processor/dev/lib/imap_processor.rb#19
//src/imap_processor/dev/lib/imap_processor/archive.rb#7
//src/minitest/dev/lib/minitest/spec.rb#23
//src/minitest/dev/lib/minitest/unit.rb#74
//src/newri/dev/lib/ri_display.rb#1
//src/png/dev/lib/png.rb#15
//src/png/dev/lib/png/font.rb#2
//src/rake-remote_task/dev/lib/rake/remote_task.rb#12
//src/rake-remote_task/dev/lib/rake/test_case.rb#1
//src/ruby_parser/dev/lib/ruby_lexer.rb#84
//src/ruby_parser/dev/lib/ruby_parser_extras.rb#52
//src/ruby_to_c/dev/lib/rewriter.rb#13
//src/ruby_to_c/dev/lib/typed_sexp.rb#3
//src/sexp_processor/dev/lib/pt_testcase.rb#1
//src/sexp_processor/dev/lib/sexp.rb#6
//src/sexp_processor/dev/lib/unique.rb#1
//src/wilson/dev/lib/wilson.rb#9
//src/zenprofile/dev/lib/memory_profiler.rb#1
//src/zenprofile/dev/lib/spy_on.rb#3
//src/zenprofile/dev/lib/zenprofiler.rb#9

Ilias_L · June 20, 2011, 1:36am

On 20 Jun, 01:01, Gary W. [email protected] wrote:

On Jun 15, 2011, at 5:00 PM, Ilias L. wrote:

I don’t understand why “p $xx” does not fail with a “not defined”
error.

Because the parser ‘sees’ the variable $xx and defines it before “p $xx” gets
executed.
[…] - (explanations, referring to why locals cannot behave this way)

I’ll reduce this further down, dealing only with global variables.

(Please, if possible, avoid comparisons to the other variable types.)

$x_void

def make_nil_global(val)
  $x_undefined = val
end

p global_variables

Behaviour (ruby 1.9.2p180):

$x_void is added to the global_variables (undefined, nil)
$x_undefined is added to the global_variables (undefined, nil)

Expected Behaviour:

$x_void is ignored (not added to global_variables, access would
raise error)
$x_undefined is added to the global_variables (undefined, nil)

I cannot see a use case, where placing $x_void into the
global_variables is necessary.

I am possibly overlooking something very fundamental, but the main
rules behind my expectations are:

variables come to existence when a value is assigned
if the value cannot be determined, “nil” is assigned
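
For what it is worth, Ruby does let a program tell "never assigned" apart from "assigned nil" through the defined? keyword, even though a plain read returns nil in both cases. The following is a minimal sketch; $x_void_demo and x_void_demo_status are hypothetical names used only for this illustration.

```ruby
# Hypothetical probe: reports whether the global has ever been assigned.
def x_void_demo_status
  defined?($x_void_demo)   # nil until something has been assigned
end

p x_void_demo_status   # nil
p $x_void_demo         # nil, and the read raises no error
p x_void_demo_status   # still nil: reading does not count as assigning

$x_void_demo = nil
p x_void_demo_status   # "global-variable", even though the value is nil
```

Running the interpreter with warnings enabled (ruby -w) may also report reads of unassigned globals on some Ruby versions, which can help with the typo case mentioned above.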

.

Ilias_L · June 20, 2011, 3:37am

On Jun 19, 2011, at 7:20 PM, Ryan D. wrote:

Global variables are just that… global. There’s no two ways about it.

Class variables are shared amongst a tree… but WHERE in the tree is defined by
where it is initialized.

This does not explain why they do not auto-initialize to nil like all
other shared variable types.

And for the record, I think that your assertion that class variables “are almost
always wrong” is false. Like all things in software design, they can be used
poorly or they can be used well. When they’re used well, they’re perfect for the
job. When they’re not, they’re horrible. I use class variables all the time to
good effect (some of these are from Eric):

I think you support my point - for an entire variable class type, I see
about 20 projects there, some of which you’ve admittedly not written
yourself. They certainly have uses; I use them in Laser for dynamically
loaded warning passes, and I see many of the projects you link use them
for plugins. “Almost always wrong” isn’t countered by fewer than 2 dozen
projects that use them once or twice.

Michael E.
[email protected]
http://carboni.ca/

Ilias_L · June 20, 2011, 12:27am

On Jun 19, 2011, at 6:01 PM, Gary W. wrote:

Because the parser ‘sees’ the variable $xx and defines it before “p $xx” gets
executed.

Gary,

As I summarized in an e-mail I sent on this thread earlier today, the
distinction being drawn is between global variables and class variables.
Your same argument for global variables applies equally well to class
variables, yet class variables require initialization.

Michael E.
[email protected]
http://carboni.ca/

Ilias_L · June 20, 2011, 3:50am

On Sun, Jun 19, 2011 at 4:35 PM, Ilias L. [email protected] wrote:

(Please, if possible, avoid comparisons to the other variable types.)

First, you specifically started this thread discussing the
“inconsistency” in the way different variable types are handled. Now
you sit here and complain when people are comparing the variable types.
Which would you like? Inconsistencies most times come from trade-offs
between the different types; a discussion of those differences can be
important to figuring out why they are “inconsistent”.

You technically might have a point here - but as with most things in
life there is always a trade-off. Obviously needing to do a lookup to
find a var is always the slowest option, so there appears to be a speed
enhancement here in that the variable is put into the global vars table
immediately, so we don’t need to do a lookup each time we hit it. This
can be accomplished because the $ notation makes it obvious what we are
looking at. An argument could be made that the global vars table might
be enhanced to denote something along the lines of “not initialized” for
variables in it that have not had a value assigned yet.

As always, however, a patch that implements something is the easiest
way to get there.

John

Ilias_L · June 20, 2011, 6:56am

On Jun 19, 2011, at 9:36 PM, Michael E. wrote:

On Jun 19, 2011, at 7:20 PM, Ryan D. wrote:

Global variables are just that… global. There’s no two ways about it.

Class variables are shared amongst a tree… but WHERE in the tree is defined
by where it is initialized.

This does not explain why they do not auto-initialize to nil like all
other shared variable types.

Just speculation on my part, but if you had implicit initialization,
then the scope of the class variable would be very dependent on the
order in which various classes were parsed. An errant reference to the
class variable in a subclass would effectively hide the same class
variable in a superclass, which is probably not what is intended. By
requiring an explicit initialization, the actual scope of the class
variable is explicitly established by the programmer.
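
A small sketch of the "where in the tree" point, showing that the class holding the variable is whichever class first assigns it. Parent, Kid, Top and Leaf are hypothetical names used only for this illustration.

```ruby
class Parent
end

class Kid < Parent
  @@setting = :kid   # first assignment happens in the subclass
end

class Top
  @@setting = :top   # first assignment happens at the top of the tree
end

class Leaf < Top
end

p Parent.class_variable_defined?(:@@setting)   # false: the variable lives in Kid only
p Kid.class_variable_defined?(:@@setting)      # true
p Top.class_variable_defined?(:@@setting)      # true
p Leaf.class_variable_defined?(:@@setting)     # true: found via the superclass
```

If a mere read in Kid created the variable there, the first pattern would arise by accident, which is the hiding scenario speculated about above.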

Gary

Ilias_L · June 20, 2011, 7:07am

On Jun 20, 2011, at 12:55 AM, Gary W. wrote:

Just speculation on my part, but if you had implicit initialization, then the
scope of the class variable would be very dependent on the order in which
various classes were parsed. An errant reference to the class variable in a
subclass would effectively hide the same class variable in a superclass, which is
probably not what is intended.

Given my most common use of class variables (shared array/hash/object
which is mutated by the inheritance hierarchy), this makes sense. The
cvar is written once at the top of the class tree, and read and mutated
multiple times by subclasses. Interestingly, @@cvar ||= … doesn’t raise
(in 1.9.2 at least) if it has not been initialized.
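
A minimal sketch of that last observation, using a hypothetical Registry class and plugins method that are not from the thread:

```ruby
class Registry
  def self.plugins
    # No NameError here, even though @@plugins has never been assigned:
    # ||= assigns the default on first use instead of raising.
    @@plugins ||= []
  end
end

Registry.plugins << :first
p Registry.plugins   # [:first]
```

This gives class variables a lazy default without requiring an explicit assignment in the class body.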

Michael E.
[email protected]
http://carboni.ca/

Ilias_L · June 20, 2011, 6:36pm

I think that your assertion that class variables
“are almost always wrong” is false. Like all things in software
design, they can be used poorly or they can be used well. When
they’re used well, they’re perfect for the job. When they’re
not, they’re horrible. I use class variables all the time to
good effect (some of these are from Eric):

I once happily used class variables. Then I spent about two
hours debugging some old code of mine, only to find out that I
had made a mess with class variables.

I then realized something else:

I did not actually need the class variables in the first place.

Since that day I have not found a use case for using class
variables again.

There may be a few instances where class variables may indeed
be useful, but so far I managed to live without class variables.

I also dislike that:

@foo

and

$foo

take only one character but

@@foo

requires two of the same. We could well have:

$$foo

too. And it would be ugly.

I myself would rather see class variables be removed from ruby.

Ilias_L · June 20, 2011, 7:22pm

On 20 Jun, 04:49, John W Higgins [email protected] wrote:

a discussion of those differences can be important to figuring out why
they are “inconsistent”

First of all, my plea was optional (“If possible”).

And then it’s just meant for this sub-thread (where I wanted to focus
just on the globals).

I cannot see a use case, where placing $x_void into the
global_variables is necessary.

I am possibly overlooking something very fundamental, but the main
rules behind my expectations are:

variables come to existence when a value is assigned

if the value cannot be determined, “nil” is assigned

You technically might have a point here - but as with most things in life
[…] - (assuming it’s for speed reasons)

The question is:

Concretely, is there any reason, by design/specification or by
implementation, that the above rules are not kept?

.