As noted in the release announcement of JRuby 1.4.0RC1, the embedding API
(aka Red Bridge) is now included in JRuby. Red Bridge has several
configuration options, each with a default value. Recently, a discussion
about the default local context type came up on this list: http://www.nabble.com/Bizarre-problem-in-1.4.0RC1-td25777485.html. I
want to figure out which default type is best for users, so let's
discuss it.
As I wrote in the wiki, http://kenai.com/projects/jruby-embed/pages/Home#Context_Instance_Type,
there are three types of context instance for holding context-local
values: singleton, thread local, and single thread. The context-local
values are the Ruby runtime, the variables shared between Java and Ruby,
key-value pairs that hold parameters, and the reader, writer, and error
writer used by Ruby scripts. All of them are stored in one of the three
types of context instance.
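As a rough illustration of the three context instance types, here is a sketch in plain Java. It does not use Red Bridge at all: LocalContext, ContextType and ContextProvider are hypothetical names made up for this example, and the behavior shown (one context per class loader, one per thread, one per provider) is my reading of the description above, not the actual implementation.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the context-local values (runtime, shared variables, attributes, I/O). */
final class LocalContext { }

/** Illustrative names only; these are not the Red Bridge constants. */
enum ContextType { SINGLETON, THREAD_LOCAL, SINGLE_THREAD }

final class ContextProvider {
    // SINGLETON: one context shared by every provider in this class loader.
    private static final LocalContext SHARED = new LocalContext();

    private final ContextType type;
    // THREAD_LOCAL: one context per thread that asks this provider.
    private final ThreadLocal<LocalContext> perThread =
            ThreadLocal.withInitial(LocalContext::new);
    // SINGLE_THREAD: one context per provider, with no thread handling at all.
    private final LocalContext own = new LocalContext();

    ContextProvider(ContextType type) {
        this.type = type;
    }

    LocalContext get() {
        switch (type) {
            case SINGLETON:
                return SHARED;
            case THREAD_LOCAL:
                return perThread.get();
            default:
                return own;
        }
    }
}

final class ContextTypeDemo {
    /** Returns the context that a freshly started thread sees through the provider. */
    static LocalContext seenFromOtherThread(ContextProvider provider) {
        AtomicReference<LocalContext> seen = new AtomicReference<>();
        Thread other = new Thread(() -> seen.set(provider.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        ContextProvider a = new ContextProvider(ContextType.SINGLETON);
        ContextProvider b = new ContextProvider(ContextType.SINGLETON);
        System.out.println("singleton shared by two providers: " + (a.get() == b.get()));

        ContextProvider local = new ContextProvider(ContextType.THREAD_LOCAL);
        System.out.println("thread local differs on another thread: "
                + (local.get() != seenFromOtherThread(local)));

        ContextProvider single = new ContextProvider(ContextType.SINGLE_THREAD);
        System.out.println("single thread is the same on another thread: "
                + (single.get() == seenFromOtherThread(single)));
    }
}
```

The point of the sketch is only the sharing pattern: which callers end up looking at the same context and which do not.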
For the first release, I chose thread local because this type is
the best for web applications. However, some people wanted singleton
to be the default for their use case. I would like to know where you
are using the embedding API, or where you plan to use Red Bridge. I
would also like to know which type you want as the default: singleton,
thread local, or single thread?
Just reiterating my opinion from the other thread.
I believe that the “principle of least surprise” would have the default
being either singleton or single-threaded. Having the default be
thread-local means that you have to be aware that there might be multiple
JRuby engines behind the scenes (one for each thread). A default of
singleton (or single-threaded) means there’s a single JRuby engine unless
you do something special to get multiple instances.
And having multiple instances is anywhere from bad to disastrous (as I
discovered) if you save an object you get back from a JRuby invocation in a
Java variable and pass it back in to another invocation.
-Mario.
–
I want to change the world but they won’t give me the source code.
 new ThreadLocalScriptingContainer()
This isn’t a bad thought. Then there’s no unexpected behavior.
I tend not to like thread local stuff when it comes to something like
JRuby for a few key reasons:
- If it happens behind the scenes, it can produce very unexpected
  results. Mario’s case is a perfect example.
- Generally, unless people want multiple runtimes, I’m inclined to
  only give them one.
- JRuby is trending toward moving runtimes into a static
  (classloader-global) location so that you don’t have to hold on to the
  reference.
Most of the facts seem to favor a singleton runtime.
Yoko: Can you explain a bit more why you chose thread local by
default? What is the typical use case you have in mind that would want
thread-local runtimes, and why would that be better than a
static/global runtime?
I’ve been thinking about this a lot and have tried to write several times
today but couldn’t make up my mind.
My problem is that the behavior is REALLY different depending on which you
pick. Vastly so. In terms of capability, safety, performance and all the
other classic tradeoffs. So I will propose the heresy that for Embed Core
this needs to be something you choose explicitly, e.g.:
new ScriptingContainer() – the singlethread “naive” version
new SingletonScriptingContainer()
new ThreadLocalScriptingContainer()
I’d say to Javadoc the differences heavily to help developers choose the
right one for their application.
JSR223 and BSF may have defaults that fit their model well. For JSR223 I
think the general expectation is that a ScriptEngine corresponds to a
language runtime, that it doesn’t preserve state between evaluations (other
than that persisted in the eval’s ScriptContext), and that, if the
threadsafe flag is set, you don’t have to worry about collisions with any
other threads doing evaluations at the same time. ThreadLocal seems to work
just fine for the JSR223 case; I dropped it in as a replacement for the old
com.sun.scripting stuff on our CMS and it worked perfectly with better
multithreading performance. I don’t have any experience idiomatically using
BSF so don’t really know the contract or expectations of that model.
For the first release, I chose thread local because this type is
the best for web applications. However, some people wanted singleton
to be the default for their use case. I would like to know where you
are using the embedding API, or where you plan to use Red Bridge. I
would also like to know which type you want as the default: singleton,
thread local, or single thread?
How does this choice affect those of us using javax.script to perform
the embedding?
As long as doing something in one ScriptEngine instance doesn’t
interfere with another ScriptEngine’s “view of the world”, I’m easy.
On Thu, Oct 15, 2009 at 1:47 AM, Charles Oliver N. wrote:
Yoko: Can you explain a bit more why you chose thread local by
default? What is the typical use case you have in mind that would want
thread-local runtimes, and why would that be better than a
static/global runtime?
The biggest reason I chose thread local as the default is to protect
global variables from concurrent processing. In many cases, people use
global variables to share local state between Java and Ruby. The
problem is that global variables are tied to the Ruby runtime. I know
that this design of global variables is semantically correct, but
implementers and users have used them for sharing variables, perhaps
because it makes programming easy. When a single Ruby runtime is
shared by multiple worker threads, every global variable is shared
by all of those threads. The JSR223 reference implementation declares
all script execution methods “synchronized” to avoid race conditions
caused by multiple worker threads. Red Bridge makes the Ruby runtime
thread local to avoid the same race conditions.
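To make the global variable problem concrete, here is a small sketch in plain Java. FakeRuntime is a hypothetical stand-in that only owns a table of global variables; it is not a JRuby class, and "$user" is just an example variable name. Two workers each store their own name, wait until both have written, and then read the variable back.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Ruby runtime: it only owns a table of global variables. */
final class FakeRuntime {
    final Map<String, String> globals = new ConcurrentHashMap<>();
}

final class GlobalVariableDemo {
    /**
     * Runs two workers that each store their own name in "$user", wait until
     * both have written, and then read the variable back. Returns how many
     * workers read a value other than the one they stored.
     */
    static int mismatches(Supplier<FakeRuntime> runtimeForCurrentThread) {
        CyclicBarrier bothWritten = new CyclicBarrier(2);
        AtomicInteger wrong = new AtomicInteger();
        Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            String name = "worker-" + i;
            workers[i] = new Thread(() -> {
                FakeRuntime runtime = runtimeForCurrentThread.get();
                runtime.globals.put("$user", name);
                try {
                    bothWritten.await();
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                if (!name.equals(runtime.globals.get("$user"))) {
                    wrong.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return wrong.get();
    }

    /** Both workers use the same runtime, so they share one "$user". */
    static int withSharedRuntime() {
        FakeRuntime shared = new FakeRuntime();
        return mismatches(() -> shared);
    }

    /** Each worker thread gets its own runtime and its own "$user". */
    static int withThreadLocalRuntime() {
        ThreadLocal<FakeRuntime> perThread = ThreadLocal.withInitial(FakeRuntime::new);
        return mismatches(perThread::get);
    }

    public static void main(String[] args) {
        System.out.println("shared runtime mismatches: " + withSharedRuntime());
        System.out.println("thread-local runtime mismatches: " + withThreadLocalRuntime());
    }
}
```

With the shared runtime, whichever worker wrote first finds the other worker's name when it reads back. With a runtime per thread, neither worker sees the other's value. Synchronizing the whole script execution, as the JSR223 reference implementation is described as doing, is the other way to get the same protection, at the cost of running one script at a time.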
When I decided on the default value, I thought Red Bridge should behave
the same way as the JSR223 implementation does, so that programs that
worked on the JSR223 reference implementation would usually work on
Red Bridge without changes. Many people don’t read a manual unless
something bad has happened, so I chose thread local so that global
variables are protected even under concurrent processing.
The typical use case in my mind is a web application. Since
instantiating a Ruby runtime is costly, I want to avoid instantiating
one for every HTTP request, but I still need to protect global
variables. If the Ruby runtime is thread local, the number of runtime
instances is limited to the number of worker threads that process HTTP
requests on the web application server. In other use cases, thread
local is not effective and may be troublesome.
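The web application case can be sketched with a fixed thread pool standing in for the server's worker threads. This is plain Java under assumptions: CostlyRuntime is a hypothetical stand-in for an expensive runtime, and the pool size and request count are arbitrary numbers chosen for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a runtime that is expensive to create. */
final class CostlyRuntime {
    CostlyRuntime(AtomicInteger created) {
        // Count every instantiation so the demo can report it.
        created.incrementAndGet();
    }

    String handle(int requestId) {
        return "handled request " + requestId;
    }
}

final class WorkerPoolDemo {
    /** Serves the requests on a fixed pool and returns how many runtimes were created. */
    static int runtimesCreated(int workerThreads, int requests) {
        AtomicInteger created = new AtomicInteger();
        // Each worker thread lazily creates its own runtime on first use.
        ThreadLocal<CostlyRuntime> perWorker =
                ThreadLocal.withInitial(() -> new CostlyRuntime(created));

        ExecutorService pool = Executors.newFixedThreadPool(workerThreads);
        for (int i = 0; i < requests; i++) {
            final int requestId = i;
            pool.execute(() -> perWorker.get().handle(requestId));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return created.get();
    }

    public static void main(String[] args) {
        int created = runtimesCreated(4, 100);
        System.out.println("runtimes created for 100 requests on 4 workers: " + created);
    }
}
```

However many requests arrive, the number of runtimes never exceeds the number of worker threads, which is the trade-off described above: no runtime per request, and no global variables shared between requests that run at the same time.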
The idea that users need to choose one seems good. If people
understand how Red Bridge works, there will be fewer problems.
We have an application with multiple ScriptEngines/InvocableObjects running
on multiple threads performing tasks in parallel that have a “view of the
world”. Classic Master/Workers design pattern.
That view consists of shared services objects for data access, retrieval and
storage, which are thread safe, combined with scripts and objects local to
scripts. The scripts written in JRuby by end-users are not threadsafe, i.e.
we don’t expect people to have the knowledge to code in a threadsafe manner,
therefore we would need to ensure that the state defined in scripts is
isolated.
We would want to define a JRuby class “Document” with some state and behaviour,
create say 5 Invocable Objects (i.e. Document instances), and hand those 5
objects to 5 separate threads with guaranteed isolation of one Document
instance from another.
As long as doing something in one ScriptEngine instance doesn’t
interfere with another ScriptEngine’s “view of the world”, I’m easy.
If you don’t have to worry about race conditions caused by
concurrent processing, you should choose the singleton or single thread
type. These are simple. Thread local is for concurrent processing, such
as in a Java web application, where its design can be leveraged for
performance.
On Fri, Oct 16, 2009 at 6:26 AM, Ijonas Kisselbach wrote:
We would want to define a JRuby class “Document” with some state and behaviour,
create say 5 Invocable Objects (i.e. Document instances), and hand those 5
objects to 5 separate threads with guaranteed isolation of one Document
instance from another.
I think “5 Invocable Objects (i.e. Document instances), and hand those
5 objects to 5 separate threads with guaranteed isolation” should be
doable on Red Bridge. To make this happen, we need to think about
whether global variables are protected or not. We also want good
performance. It depends on the use case. If it is a web application,
thread local works. If it is a standalone application on a single
thread, thread local is not the right choice.
-Yoko
Sorry for causing you trouble. As I explained in another
reply on this thread, web applications were in my mind. To protect
global variables from race conditions, I chose thread local as the
default value. If people think singleton is the best choice, I don’t
mind changing the default.
-Yoko
Hmmmm… this is interesting, since what I did was precisely drop in
RedBridge instead of the old Scripting API code…and that’s where I ran
into trouble. I am not using any global vars, all communication between Ruby
and Java (or in my case, Scala) is done via parameters and return values and
any state that needs to be saved between invocations is stored in Scala
variables. My case is also atypical in that I’m actually writing a Swing
client app. What ended up happening was that I got a value back from an
invocation in the main thread, and passed it back in on the event dispatch
thread.
I do agree that forcing the issue (forcing the selection of threading type
to be explicit) is probably the best idea; the thing is, I’m not sure if
JSR223 allows for that. Also, one way to prevent my error case would be for
each RubyObject to store a reference to its engine and verify that reference
on invocation. If an object is passed in an invocation to a different engine,
you could throw an exception. Note that you don’t need to verify it in every
single call, just when objects cross the Java-Ruby boundary.
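The ownership check suggested above could look roughly like the following sketch. Engine and EngineObject are hypothetical classes written for this example; they are not JRuby's RubyObject or runtime classes, and the check is placed only where an object is handed to an engine.

```java
/** Hypothetical engine that stands in for one JRuby runtime. */
final class Engine {
    private final String name;

    Engine(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    /** Creates a value that remembers which engine produced it. */
    EngineObject create(String payload) {
        return new EngineObject(this, payload);
    }

    /** Boundary check: accept an argument only if this engine created it. */
    String invoke(EngineObject argument) {
        if (argument.owner() != this) {
            throw new IllegalArgumentException(
                    "object created by " + argument.owner().name()
                            + " was passed to " + name);
        }
        return name + " received " + argument.payload();
    }
}

/** Hypothetical wrapper for an object that crosses the Java-Ruby boundary. */
final class EngineObject {
    private final Engine owner;
    private final String payload;

    EngineObject(Engine owner, String payload) {
        this.owner = owner;
        this.payload = payload;
    }

    Engine owner() {
        return owner;
    }

    String payload() {
        return payload;
    }
}

final class BoundaryCheckDemo {
    public static void main(String[] args) {
        // Two engines, as would exist behind the scenes with one engine per thread.
        Engine mainThreadEngine = new Engine("engine-main");
        Engine eventThreadEngine = new Engine("engine-edt");

        EngineObject value = mainThreadEngine.create("document");
        System.out.println(mainThreadEngine.invoke(value));
        try {
            eventThreadEngine.invoke(value);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check costs one reference comparison per argument at the boundary, and it turns a confusing failure somewhere inside the second engine into an immediate, explicit error.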
I’d say to Javadoc the differences heavily to help developers choose the
right one for their application.
The idea that users need to choose one seems good. If people
understand how Red Bridge works, there will be fewer problems.
Anything which prevents unexpected problems for users is always a
benefit. A second way would be to require the container to accept
this context parameter as an argument (no no-arg constructor).
I think we have discussed this enough, and singleton seems to be the best
default value. I switched the default value of LocalContextScope from
THREADSAFE to SINGLETON in rev. 9e557a2.
Rob, please specify threadsafe explicitly in multi-threaded
environments when you use the latest JRuby trunk.
I’m still thinking about two options: removing the no-arg constructor
(Tom’s plan) and three types of ScriptingContainers (Rob’s plan), so I
might change this in a future release if people think the current
behavior is unexpected.
-Yoko
On Fri, Oct 16, 2009 at 12:09 PM, Thomas E Enebo wrote:
I think we have discussed this enough, and singleton seems to be the best
default value. I switched the default value of LocalContextScope from
THREADSAFE to SINGLETON in rev. 9e557a2.
We probably want to merge this into 1.4 as well, don’t we? With a big
release note saying we switched this after RC3?
I’m still thinking about two options: removing the no-arg constructor
(Tom’s plan) and three types of ScriptingContainers (Rob’s plan), so I
might change this in a future release if people think the current
behavior is unexpected.
I like the idea of having explicit different entry-points into
embedding for the various threading setups (Rob’s plan). That fits
with other frameworks that have different “session” types that are
global, singleton, thread-local, and so on.