BackgrounDRb 1.0.4 release

Hi,

I am proud to release BackgrounDRb 1.0.4. This release contains many
fixes, as listed below:

Folks, as you are probably aware, I am working on getting
cluster/distributed job handling support into BackgrounDRb.

I am trying to make a brief overview of changes that I have currently
checked into testcase branch of github.

  1. Both the synchronous and asynchronous task invocation APIs have
    changed to accommodate clustering of BackgrounDRb servers. The new
    API is:

For synchronous tasks:

# invoke method some_task in worker hello_worker
MiddleMan.worker(:hello_worker).some_task()

# invoke method some_task, passing 'data' as an argument to the method
# and using session[:user_id] as job_key
MiddleMan.worker(:hello_worker).some_task(:arg => data,
  :job_key => session[:user_id])

# run method 'some_task' on all backgroundrb servers
MiddleMan.worker(:hello_worker).some_task(:arg => data,
  :job_key => session[:user_id], :host => :all)

# run method 'some_task' only on the locally configured server
MiddleMan.worker(:hello_worker).some_task(:arg => data,
  :job_key => session[:user_id], :host => :local)

# if you are using a bdrb cluster and not specifying the :host option,
# your job will run on a server selected in round-robin manner
MiddleMan.worker(:hello_worker).some_task(:arg => data,
  :job_key => session[:user_id])

# run the task on the specified server and return the results
MiddleMan.worker(:hello_worker).some_task(:arg => data,
  :job_key => session[:user_id], :host => "10.0.0.2:11210")
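To make the :host semantics above concrete, here is a minimal, self-contained sketch of how a client could map the :host option to a set of target servers. It is not BackgrounDRb's actual implementation: the HostSelector class, its method names, and the server addresses are all hypothetical.

```ruby
# Hypothetical illustration of the :host option semantics described above.
# This is NOT BackgrounDRb's real code; names and addresses are made up.
class HostSelector
  def initialize(servers, local)
    raise ArgumentError, "no servers configured" if servers.empty?

    @servers = servers   # all configured "ip:port" strings
    @local = local       # the locally configured server
    @next_index = 0      # round-robin cursor
  end

  # Returns the list of servers a task should be sent to.
  def targets(host = nil)
    case host
    when :all   then @servers.dup
    when :local then [@local]
    when String then [host]
    when nil    then [next_round_robin]
    else raise ArgumentError, "unsupported :host value: #{host.inspect}"
    end
  end

  private

  # Picks the next server in rotation and advances the cursor.
  def next_round_robin
    server = @servers[@next_index % @servers.size]
    @next_index += 1
    server
  end
end

selector = HostSelector.new(["10.0.0.1:11006", "10.0.0.2:11210"], "10.0.0.1:11006")
p selector.targets(:all)              # both servers
p selector.targets(:local)            # only the local server
p selector.targets("10.0.0.2:11210")  # only the named server
p selector.targets                    # round-robin: first server
p selector.targets                    # round-robin: second server
```

With no :host given, successive calls walk through the server list in order, which is the round-robin behaviour described above.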

For asynchronous tasks:

# invoke method some_task in worker hello_worker
MiddleMan.worker(:hello_worker).async_some_task()

# invoke method some_task, passing 'data' as an argument to the method
# and using session[:user_id] as job_key
MiddleMan.worker(:hello_worker).async_some_task(:arg => data,
  :job_key => session[:user_id])

# run method 'some_task' on all backgroundrb servers
MiddleMan.worker(:hello_worker).async_some_task(:arg => data,
  :job_key => session[:user_id], :host => :all)

# run method 'some_task' only on the locally configured server
MiddleMan.worker(:hello_worker).async_some_task(:arg => data,
  :job_key => session[:user_id], :host => :local)

# if you are using a bdrb cluster and not specifying the :host option,
# your job will run on a server selected in round-robin manner;
# to run the task on a specific server, pass its address as :host
MiddleMan.worker(:hello_worker).async_some_task(:arg => data,
  :job_key => session[:user_id], :host => "10.0.0.2:11210")
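The synchronous and asynchronous calls above differ only in the method-name prefix. One way a proxy object could tell them apart is Ruby's method_missing. The sketch below assumes that approach; TaskProxy and the layout of the request hash are hypothetical, and BackgrounDRb's real proxy may work differently.

```ruby
# Hypothetical proxy: turns method names into request descriptions.
# Assumption: prefix-based dispatch via method_missing; this is not taken
# from BackgrounDRb's source.
class TaskProxy
  def initialize(worker_name)
    @worker_name = worker_name
  end

  # Any unknown method is treated as a task invocation.
  def method_missing(name, options = {})
    method_name = name.to_s
    if method_name.start_with?("async_")
      build_request(:async, method_name.delete_prefix("async_"), options)
    else
      build_request(:sync, method_name, options)
    end
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  private

  # Packs everything a server would need to run the task.
  def build_request(type, task, options)
    {
      :type    => type,
      :worker  => @worker_name,
      :method  => task.to_sym,
      :arg     => options[:arg],
      :job_key => options[:job_key],
      :host    => options[:host]
    }
  end
end

proxy = TaskProxy.new(:hello_worker)
p proxy.some_task(:arg => "data", :job_key => 42)
p proxy.async_some_task(:arg => "data", :job_key => 42, :host => :all)
```

Both calls end up naming the same worker method, some_task; only the :type field differs.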

  1. Worker_key and job_key:
    Prior to this version, BackgrounDRb used job_key as worker-global
    data, and newly started workers were also identified by a ‘job_key’.
    This naming convention was somewhat confusing and hence has been
    changed.

    We now use “worker_key” to associate a worker with a unique key, in
    the same capacity in which job_key was previously used. The new
    ‘job_key’ is data associated with a single task and is used only in
    that capacity. For example, the API for starting a new worker is:

    MiddleMan.new_worker(:worker => :hello_worker,
      :worker_key => <worker_key>)

    For invoking a task on the worker, you will use:

    MiddleMan.worker(:hello_worker, <worker_key>).some_task(:arg => "lol",
      :job_key => "boy")

    Also, for the above call, your worker code should look like this:

    class HelloWorker
      def some_task(args)
        result = some_scrap_url(args)
        cache[job_key] = result
      end
    end

    register_status has been removed, and all workers have access to this
    cache object, which can be used to store data. The ‘cache’ object is
    thread safe, and hence multiple threads can write to it. Also, for
    each executed task, if a job_key was supplied while invoking the
    task, it will be available in the worker method.

    Also, job_key is a thread-local variable inside workers, and hence
    if multiple tasks are running concurrently (using the thread pool,
    for example), job_key will still resolve to the correct value.
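The two properties described above (a thread-safe cache and a per-task job_key) can be illustrated with plain Ruby. This is only a sketch under my own assumptions: SketchCache, SketchWorker and run_task are hypothetical names, and the real cache object may be implemented quite differently.

```ruby
# Hypothetical sketch: a mutex-guarded cache plus a thread-local job_key.
# Not BackgrounDRb's implementation; it only illustrates the two
# properties described above.
class SketchCache
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def []=(key, value)
    @lock.synchronize { @data[key] = value }
  end

  def [](key)
    @lock.synchronize { @data[key] }
  end
end

class SketchWorker
  attr_reader :cache

  def initialize
    @cache = SketchCache.new
  end

  # job_key is read from the current thread, so concurrent tasks do not clash.
  def job_key
    Thread.current[:job_key]
  end

  # Runs a task method in its own thread with job_key set for that thread only.
  def run_task(method_name, arg, key)
    Thread.new do
      Thread.current[:job_key] = key
      send(method_name, arg)
    end
  end

  def some_task(arg)
    sleep(rand * 0.01)            # let the threads interleave
    cache[job_key] = arg.upcase
  end
end

worker = SketchWorker.new
threads = [["lol", "boy"], ["rofl", "girl"]].map do |arg, key|
  worker.run_task(:some_task, arg, key)
end
threads.each(&:join)
p worker.cache["boy"]    # "LOL"
p worker.cache["girl"]   # "ROFL"
```

Each thread sees only its own job_key, so each result lands under the right key no matter how the threads interleave.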

  2. New thread pool:
    The following syntax for adding tasks to the thread pool has been
    deprecated:

    thread_pool.defer(args) do |arg|
      # runs in new thread
    end

    The new syntax is:

    thread_pool.defer(:some_method, args)

    Here some_method will run in the thread pool, and the argument passed
    to #defer will be passed on to that method. This was done to fix a
    memory leak associated with the huge number of blocks created at
    runtime.
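For readers curious how a method-name based defer can work, here is a minimal thread pool sketch in plain Ruby. It assumes the pool is handed the object whose methods it should call; SketchThreadPool and UrlWorker are hypothetical names and this is not BackgrounDRb's actual thread pool.

```ruby
# Hypothetical minimal thread pool that accepts a method name instead of
# a block. Assumption: the pool is given the object whose methods it
# should call; this is not BackgrounDRb's actual thread pool.
class SketchThreadPool
  def initialize(owner, size = 4)
    @owner = owner
    @queue = Queue.new
    @threads = Array.new(size) do
      Thread.new do
        # Each pool thread pulls [method_name, args] pairs until it sees :stop.
        while (job = @queue.pop) != :stop
          method_name, args = job
          @owner.send(method_name, args)
        end
      end
    end
  end

  # Only a symbol and its argument are queued, not a block.
  def defer(method_name, args = nil)
    @queue << [method_name, args]
  end

  # Sends one :stop marker per thread and waits for all of them to exit.
  def shutdown
    @threads.size.times { @queue << :stop }
    @threads.each(&:join)
  end
end

class UrlWorker
  attr_reader :results

  def initialize
    @results = Queue.new   # thread-safe collection for the demo
  end

  def fetch_url(url)
    @results << "fetched #{url}"
  end
end

worker = UrlWorker.new
pool = SketchThreadPool.new(worker, 2)
pool.defer(:fetch_url, "http://example.com/a")
pool.defer(:fetch_url, "http://example.com/b")
pool.shutdown
puts worker.results.size   # 2
```

Because defer enqueues only a symbol and an argument, no new block object has to be created for each task.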

  3. Persistent job queue:
    BackgrounDRb now has out-of-the-box support for persistent job
    queues, which are persisted to the database. The API to add a task
    to the job queue is pretty simple:

    MiddleMan.worker(:hello_worker).enq_some_task(:arg => "hello_world",
      :job_key => "boy")

    So in your hello worker:

    class HelloWorker
      def some_task(args)
        # … do some work …
        persistent_job.finish! #=> marks the job as finished. totally thread safe
      end
    end

    persistent_job is a thread-local variable that refers to the
    currently running queued task. It can be used from the thread pool
    as well. For example:

    class HelloWorker
      def some_task(args)
        thread_pool.defer(:fetch_url, args)
      end

      def fetch_url(tags)
        # … runs in thread …
        # … fetch tasks …
        persistent_job.finish!
      end
    end
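The idea of a queued job that a worker marks as finished can be modelled without a database. The sketch below uses an in-memory stand-in; SketchJob, SketchJobQueue and the set of fields on a job are my assumptions, not BackgrounDRb's real table layout or classes.

```ruby
# Hypothetical in-memory stand-in for a database-backed job queue.
# Assumption: each job records worker, method, argument, job_key and a
# finished flag; BackgrounDRb's real table layout may differ.
SketchJob = Struct.new(:worker, :method_name, :arg, :job_key, :finished) do
  def finish!
    self.finished = true
  end
end

class SketchJobQueue
  def initialize
    @jobs = []
    @lock = Mutex.new
  end

  def enqueue(worker, method_name, arg, job_key)
    job = SketchJob.new(worker, method_name, arg, job_key, false)
    @lock.synchronize { @jobs << job }
    job
  end

  # Jobs that have not been marked finished yet.
  def pending
    @lock.synchronize { @jobs.reject(&:finished) }
  end
end

class SketchQueueWorker
  # Thread-local accessor, mirroring the persistent_job idea described above.
  def persistent_job
    Thread.current[:persistent_job]
  end

  # Runs the job's method in a new thread with persistent_job set for it.
  def run(job)
    Thread.new do
      Thread.current[:persistent_job] = job
      send(job.method_name, job.arg)
    end
  end

  def some_task(_arg)
    # ... do some work ...
    persistent_job.finish!
  end
end

queue = SketchJobQueue.new
job = queue.enqueue(:hello_worker, :some_task, "hello_world", "boy")
queue.enqueue(:hello_worker, :some_task, "other", "girl")
SketchQueueWorker.new.run(job).join
puts queue.pending.map(&:job_key).inspect   # ["girl"]
```

A job that is never marked finished stays in the pending list, which is the point of persisting the queue.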

  4. The log worker no longer has the Rails environment.

  5. Result caching and retrieval:
    Each worker has a worker-local ‘cache’ object, which can be used
    to cache results for later retrieval. You can use memcached as the
    result storage; you just need to specify that in the config file.

    Also, when you fetch objects from the cache, all the specified
    servers are queried and the response is returned.

  6. Clustering and fail-safety:
    If a node in the cluster goes down, bdrb will automatically remove
    that node from load balancing. Once that node comes back up, bdrb
    will automatically add it back and the node will start running
    tasks again.
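The failover behaviour described above amounts to a round-robin that skips dead nodes. Here is a small sketch of that idea; SketchCluster, mark_down and mark_up are hypothetical names, and how bdrb actually detects a dead or recovered node is not modelled.

```ruby
# Hypothetical failover-aware round-robin; not BackgrounDRb's actual
# balancer. Detection of dead or recovered nodes is left to the caller.
class SketchCluster
  def initialize(nodes)
    @nodes = nodes   # every configured node
    @down = {}       # nodes currently considered dead
    @cursor = 0      # round-robin position
  end

  def mark_down(node)
    @down[node] = true
  end

  def mark_up(node)
    @down.delete(node)
  end

  # Next live node in rotation, or nil when every node is down.
  def next_node
    @nodes.size.times do
      node = @nodes[@cursor % @nodes.size]
      @cursor += 1
      return node unless @down[node]
    end
    nil
  end
end

cluster = SketchCluster.new(["10.0.0.1:11006", "10.0.0.2:11210"])
cluster.mark_down("10.0.0.2:11210")
p [cluster.next_node, cluster.next_node]   # both calls return 10.0.0.1:11006
cluster.mark_up("10.0.0.2:11210")
p cluster.next_node                        # "10.0.0.2:11210"
```

A node that is marked up again simply rejoins the rotation on the next pass, with no other state to rebuild.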

  7. Fixes for large size data transfer.

URLS:
http://backgroundrb.rubyforge.org/

Credits:
Ezra for letting me work on BackgrounDRb. Francis for EventMachine.


Let them talk of their oriental summer climes of everlasting
conservatories; give me the privilege of making my own summer with my
own coals.

http://gnufied.org

On Sun, Jul 13, 2008 at 8:39 PM, hemant wrote:

> Hi,
> I am trying to make a brief overview of changes that I have currently
> checked into testcase branch of github.

Ignore this part, all these changes are available in master and svn
repository as well.

Thanks.

hemant wrote:

> Ignore this part, all these changes are available in master and svn
> repository as well.

What can BackgrounDRb do that Spawn can’t do?

One is out-of-process and the other in-process.

Would a server like Mongrel hang up on a Spawn?

On Sun, Jul 13, 2008 at 8:46 PM, Phlip wrote:

> Would a server like Mongrel hang up on a Spawn?

I do not think it will hang up under moderate load, but it can create
problems when you are offloading lots of tasks to a thread running
inside mongrel. Also, there is no way to check what happened to the
task.

That's from a very brief glance at Spawn.