Best practices for accessing external jar dependencies from within a gem

Hi,

I have created a gem on my local system for text mining which relies on
external Java libraries for classification tasks. I’m trying to harness
the power of Java through JRuby here. Directory structure is as follows:

classifier/
├── classifier.gemspec
├── Gemfile
├── Gemfile.lock
├── lib
│ ├── classifier
│ │ ├── version.rb
│ │ └── sample_classifier.rb
│ ├── classifier.rb
│ └── java
│ ├── sample_lib.jar
│ ├── another_sample_lib.jar
│ └── yet_another_sample_lib.jar
└── Rakefile

I’m loading these jars in lib/classifier.rb as

Dir["java/*.jar"].each { |jar| require jar }

I have added this gem to a Git repository and referenced it in the
Gemfile of my Rails 3.0.9 project. However, my sample_classifier.rb
fails to locate any classes imported from any of the jars under
lib/java.

Things work if I copy lib/java to my Rails application lib/ directory
and add the following in sample_classifier.rb:

Dir["lib/java/*.jar"].each { |jar| require jar }

However, I don’t think it’s a good practice to spill the gem’s
dependencies in my Rails app. Is there a better way to achieve this? I
looked around for bundling jar files with a gem but I found results for
exactly the opposite i.e. bundling gems within a jar file. Can I add
these jars to Rails’ config.autoload_paths or similar to load them when
the application starts? Is it a good practice to bundle jar dependencies
with the gem?

I’m sure there’s a cleaner way to do this.

On 11/01/2012, at 4:57 PM, Michael K. wrote:

Nikhil Lanjewar:

However, I don’t think it’s a good practice to spill the gem’s
dependencies in my Rails app. Is there a better way to achieve this?

Make your new gem load the jar, then load it as you would any other Ruby
library.

I’ve just been working on a similar thing.
Take a look, guys:
neo4j-enterprise/lib/neo4j-enterprise.rb at master · neo4jrb/neo4j-enterprise · GitHub

Cheers,
Dmytrii
http://www.ApproachE.com

I really wonder whether “vendoring” the jar dependencies is the common
way to go. I guess there is a reason why gem dependencies are not (all)
“vendored”. (vendored == copied into the gem)

if I look at neo4j-enterprise and its slf4j-api-1.6.2 dependency -
what happens if another gem is using slf4j-api-1.5.8 in the same
manner? then you end up with different versions of the same jar
residing inside the same classloader. that will sooner or later
produce a lot of trouble. there is no formal way to analyse the jar
dependencies, etc.

having a single gem with vendored jars is probably OK unless you use
other gems with such “vendored” jars and have to figure out what jar
dependencies your application is using. IMO spilling your jars inside
rails is just fine. a lot of java projects do exactly this: copy all
their jar dependencies into the “lib” directory BUT do not distribute
them with the project jar unless it is a war file where you collect all
those dependencies.

regards, Kristian

PS I do it slightly differently but I want to refrain from pushing my
“solution”. I do think this is an urgent problem to be addressed,
though.
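
To make the slf4j scenario above concrete, here is a small sketch of how one might spot the same jar vendored in two versions. It is only an illustration: conflicting_jars is a hypothetical helper, the paths in the usage note are made up, and it assumes jars follow the Maven-style "artifact-version.jar" naming (jars without a version in the name are skipped).

```ruby
# Hypothetical helper: groups jar file names by artifact and returns the
# artifacts that show up in more than one version.
def conflicting_jars(jar_paths)
  versions = Hash.new { |hash, name| hash[name] = [] }

  jar_paths.each do |path|
    # Assumes "artifact-version.jar", e.g. "slf4j-api-1.6.2.jar".
    match = File.basename(path, ".jar").match(/\A(.+?)-(\d[\w.]*)\z/)
    next unless match

    versions[match[1]] << match[2]
  end

  versions.each_value(&:uniq!)
  versions.select { |_name, found| found.size > 1 }
end
```

Given a list such as ["/gems/a/lib/slf4j-api-1.6.2.jar", "/gems/b/lib/slf4j-api-1.5.8.jar"], the helper reports slf4j-api with both versions. The list itself could come from globbing for *.jar under each installed gem's directory. This only finds jars sitting on disk; it says nothing about which of them actually got loaded into the classloader.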

Nikhil Lanjewar:

However, I don’t think it’s a good practice to spill the gem’s
dependencies in my Rails app. Is there a better way to achieve this?

Make your new gem load the jar, then load it as you would any other Ruby
library.

MK

http://twitter.com/michaelklishin
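
A minimal sketch of what "make your new gem load the jar" could look like for the layout in the original post. It assumes this code lives in lib/classifier.rb with the jars in lib/java next to it; the module and method names are hypothetical. The key point is that the glob is anchored to the gem's own file rather than to the working directory, which is most likely why Dir["java/*.jar"] finds nothing once the gem is loaded from a Rails app.

```ruby
# lib/classifier.rb (sketch; names below are illustrative)
module Classifier
  # Jar directory resolved relative to this file, not to the process's
  # current working directory.
  JAR_DIR = File.expand_path("java", File.dirname(__FILE__))

  # Absolute paths of all jars in +dir+, sorted for a stable load order.
  def self.jar_paths(dir = JAR_DIR)
    Dir[File.join(dir, "*.jar")].sort
  end

  # Requires each jar. This does nothing outside JRuby, since only JRuby
  # knows how to `require` a .jar file.
  def self.load_jars(dir = JAR_DIR)
    return [] unless RUBY_PLATFORM == "java"

    jar_paths(dir).each { |jar| require jar }
  end
end

Classifier.load_jars
```

With something like this in place, the Rails app only needs the gem in its Gemfile, and lib/java does not have to be copied into the application.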

kristian:

I really wonder whether “vendoring” the jar dependencies is the common
way to go. I guess there is a reason why gem dependencies are not (all)
“vendored”. (vendored == copied into the gem)

if I look at neo4j-enterprise and its slf4j-api-1.6.2 dependency -
what happens if another gem is using slf4j-api-1.5.8 in the same
manner? then you end up with different versions of the same jar
residing inside the same classloader. that will sooner or later
produce a lot of trouble. there is no formal way to analyse the jar
dependencies, etc.

I believe there are solutions that make it possible to install Maven
artifacts as gems.
I haven’t used them personally but in theory they should introduce some
dependency management features you want.

MK

Dmytrii N.:

I’ve just been working on a similar thing.
Take a look, guys:
neo4j-enterprise/lib/neo4j-enterprise.rb at master · neo4jrb/neo4j-enterprise · GitHub

Just to throw in more examples, Hot Bunnies has been doing it since day
1. It works great in all kinds of scenarios, with Bundler and without,
in standalone apps, on Heroku (yes, it is possible to make the Java
build pack run JRuby apps) and running under JBoss AS.

https://github.com/ruby-amqp/hot_bunnies/blob/master/lib/hot_bunnies.rb#L3

MK

http://twitter.com/michaelklishin

yes, maven can solve it to some extent, but the “vendored” jars are
hidden: you never know when they are getting loaded into the
classloader and where - it may even depend on the user's interaction
with the application.

maven cannot solve this since there is no formal declaration of such
jars packed within a gem.

- Kristian

On Wed, Jan 11, 2012 at 1:42 PM, Michael K.

kristian:

yes, maven can solve it to some extent, but the “vendored” jars are
hidden: you never know when they are getting loaded into the
classloader and where - it may even depend on the user's interaction
with the application.

maven cannot solve this since there is no formal declaration of such
jars packed within a gem.

I did not imply that manually managed dependencies would magically be
free of downsides as a result.
It is not any different from manually managed .jar dependencies in Java,
Clojure and other codebases.

I would like to see an easy way to manage Maven dependencies for JRuby
similar to what Leiningen does in the Clojure world. However, marrying
it with rubygems is likely to be a really non-trivial undertaking.
That said, most apps just don’t vendor much so it is not all that bad.

MK

http://twitter.com/michaelklishin

On Wed, Jan 11, 2012 at 2:03 PM, Michael K. wrote:

I would like to see an easy way to manage Maven dependencies for JRuby similar
to what Leiningen does
in the Clojure world,

any pointer to this ?

however, marrying it with rubygems is likely to be a really
non-trivial undertaking.

rubygems is not compatible with a maven-like dependency resolution -
bundler is compatible since bundler works on a fixed set of gems and
freezes the versions of dependent gems. indeed it should be rather
easy to extend bundler, and there was already an attempt using the
maven-gem support of jruby.

That said, most apps just don’t vendor much so it is not all that bad.

that is true for the time being, but it will become a problem when
there are more gems around packing their jars. and when you come from
the java world and use embedded jruby, in case you want to use gems
with packed jars, it is even more likely that you run into duplicated
classes in your classloaders.

- Kristian

kristian:

rubygems is not compatible with a maven-like dependency resolution -
bundler is compatible since bundler works on a fixed set of gems and
freezes the versions of dependent gems. indeed it should be rather
easy to extend bundler, and there was already an attempt using the
maven-gem support of jruby.

I did not mean rubygems specifically but the gems ecosystem overall
(including rubygems.org and bundler). There are just too many subtle
differences. I would be glad to be proven wrong.

MK

http://twitter.com/michaelklishin

thanx for the extra explanations - I will have a closer look at it.

with my projects I am using the ruby-maven gem, which lets you declare
jars in a Mavenfile; they get added to the classpath before
executing jruby. it also generates binstubs as bundler does, i.e.

$ rmvn bundler install

will set up bins in target/bin, which allows you to use
$ rails . . .
$ rake . . .
$ rspec . . .
as in regular (j)ruby but with all jars added to the classpath and all
gems set up by bundler. you can even pack these with rubygems, as the
gemspec allows you to add informal requirements, i.e. misusing this for
the jar dependencies gives me all I need to handle both rubygems and
jar dependencies in a nice manner (gems done by bundler and jars by
maven).
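
A rough sketch of the "informal requirements" idea, for anyone unfamiliar with that gemspec field. Gem::Specification#requirements is a regular RubyGems attribute holding free-form strings; the "jar group:artifact, version" string format, the version numbers and the author name here are assumptions chosen for illustration and are not necessarily what ruby-maven expects. RubyGems itself does not interpret these strings, so some other tool has to read them and resolve the jars.

```ruby
require "rubygems"

# Sketch of a gemspec that records a jar dependency as an informal
# requirement. The gem metadata below is illustrative only.
spec = Gem::Specification.new do |s|
  s.name     = "classifier"
  s.version  = "0.0.1"
  s.summary  = "Text mining classifier backed by Java libraries"
  s.authors  = ["Example Author"]
  s.files    = Dir["lib/**/*.rb"]
  s.platform = "java"

  # Free-form string; the format is an assumption, not a RubyGems rule.
  s.requirements << "jar org.slf4j:slf4j-api, 1.6.2"
end
```

The upside is that the jar dependency is at least written down somewhere a tool can find it, instead of being hidden inside the gem's lib directory.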

see also
http://blog.mkristian.tk/2011/09/jruby-and-rubygems-and-javaclassloader.html

my approach differs a bit from what you suggested but the difference
could be overcome ;)

- Kristian

On Wed, Jan 11, 2012 at 2:40 PM, Michael K.

kristian:

I would like to see an easy way to manage Maven dependencies for JRuby similar
to what Leiningen does
in the Clojure world,

any pointer to this ?

Take a look at
monger/project.clj at master · michaelklishin/monger · GitHub, for
example. You will see 3
dependencies listed, 2 of them are Java artifacts (Clojure itself and
MongoDB Java driver) and one library is
a Clojure lib. They are all distributed just like Java artifacts are
with Maven.

Leiningen does not invent anything new, it just uses Maven and a piece
of Ant (which will likely go away in the future) to manage dependencies,
and it can package your projects as .jars, generate a POM file and build
standalone (“fat”) jars.

Clojure libraries are usually distributed via clojars.org which is just
a public Maven repo that’s easy to use and is on the default Leiningen
repositories list.

I would love to see something like this for JRuby, even if it will be a
separate tool from Bundler. As long as dependencies are
resolved/downloaded and Ruby load path is extended to include resulting
directories with .jars, I need nothing else. My Ruby libraries can
require jars just as I do today.
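
The "extend the load path" half of that wish is simple enough to sketch. extend_load_path is a hypothetical helper, and the directory in the usage note is a made-up example; resolving and downloading the jars is the hard part and is not shown.

```ruby
# Hypothetical helper: prepends each existing directory to Ruby's load
# path, skipping directories that are missing or already present.
def extend_load_path(jar_dirs)
  jar_dirs.map { |dir| File.expand_path(dir) }.each do |dir|
    next unless File.directory?(dir)

    $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir)
  end
  $LOAD_PATH
end
```

After a call such as extend_load_path(["vendor/jars"]), a library running on JRuby should be able to require a jar by its file name, for example require "slf4j-api-1.6.2.jar", the same way it requires any other file on the load path.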

MK

http://twitter.com/michaelklishin

On 11/01/2012, at 6:25 PM, kristian wrote:

if I look at neo4j-enterprise and its slf4j-api-1.6.2 dependency -
what happens if another gem is using slf4j-api-1.5.8 in the same
manner? then you end up with different versions of the same jar
residing inside the same classloader. that will sooner or later
produce a lot of trouble. there is no formal way to analyse the jar
dependencies, etc.

If and when it causes trouble, it can easily be repackaged.
But I doubt it has so far :)