Making Sense out of Slow Rubygems Startup

November 21st, 2009

JRuby is pretty fast these days; in most cases it is faster than MRI. JRuby is especially good at handling long-running, repetitive tasks (on the server side). But one place where JRuby is not blazingly fast is startup. Well, actually, JRuby starts up within 0.5 seconds, which is really nice for a JVM-based implementation. But as soon as rubygems comes into play, the situation gets worse.

So, I spent some time trying to understand why just loading the rubygems package eats up a second or two more. Well, it turned out that rubygems loads almost the entire standard library, which is crazy! Sure, this is a bit of an exaggeration, but still, let’s see, shall we?

First, fileutils is loaded, and 100+ ms have passed. Then rubygems tries to figure out where the system-level config file is, and for that on Windows it uses the Win32API library (!!!), which in turn requires FFI (the native interface). Ka-ching, 300+ ms for that. Then we need to read the configs and stuff, which requires a YAML parser, and in JRuby that means loading Yecht: 350+ ms more. Next, JRuby provides some extensions to rubygems functionality, like the ability to load gems from JAR files, and for that we need some classpath magic and the ‘uri’ library. To obtain JRuby’s classloader, we need to require ‘jruby’, which in turn requires ‘java’, and the java library loads a whole bunch of extra functionality (which we are not using here, btw); that adds up to 400 ms. So, we have barely loaded the rubygems library, but have already consumed 1200 ms or more.

Take a look at the following picture that shows the times to load all those libraries:

[Image: Rubygems load times]

As you can see, loading rubygems itself is not that time-consuming; most of the time is spent loading the dependencies. There is some low-hanging fruit here, like lazily loading the ‘uri’ library. I also don’t like that rubygems loads Win32API and the whole FFI stuff; removing that would also eliminate quite a lot of startup time. Loading a whole bunch of Java integration magic only to obtain the JRuby classpath also seems to be overkill. All those tweaks, if implemented, might actually cut the rubygems startup time in half. We’ll see.
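Per-library timings like the ones discussed above can be approximated with a few lines of Ruby. This is only a rough sketch, not the instrumentation used for the picture: the library list and the `time_requires` helper are illustrative assumptions, and a library that is already loaded (rubygems usually is on modern Rubies, unless started with `--disable-gems`) will report close to zero.

```ruby
# Libraries mentioned in the post; the list is only an illustration.
LIBRARIES = %w[fileutils yaml uri rubygems].freeze

# Returns [name, milliseconds] pairs. Milliseconds is nil when the
# library cannot be loaded on this Ruby.
def time_requires(libraries)
  libraries.map do |name|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    elapsed_ms =
      begin
        require name
        (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
      rescue LoadError
        nil
      end
    [name, elapsed_ms]
  end
end

time_requires(LIBRARIES).each do |name, elapsed_ms|
  if elapsed_ms
    puts format('%-10s %8.1f ms', name, elapsed_ms)
  else
    puts format('%-10s not available', name)
  end
end
```

Each number includes the time spent on whatever the library itself requires, which is exactly the effect described above: the cost sits in the dependencies.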

UPDATE: Wayne Meissner has just tweaked the loading of Win32API, and I adjusted the JRuby-specific rubygems tweaks to load ‘uri’ lazily. So, the loading of rubygems is already about 200ms faster than it was yesterday! 8-)

UPDATE#2: Here’s the log of rubygems loading timings in JRuby, in an easy-to-read text format: http://gist.github.com/240322.
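Lazy loading of this kind usually just means moving the `require` from the top of the file into the method that needs it. The sketch below shows the general pattern only; `JarLocation` is a hypothetical name and is not the actual JRuby or rubygems code.

```ruby
# JarLocation is a hypothetical module used only for illustration.
module JarLocation
  # 'uri' is required on first use, not when this file is loaded, so
  # programs that never call parse never pay for loading it.
  def self.parse(location)
    require 'uri' # cheap no-op after the first successful call
    URI.parse(location)
  end
end

puts JarLocation.parse('file:/opt/gems/example.jar').path
```

Since `require` keeps track of what is already loaded, repeated calls cost very little after the first one.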

How to build latest Ruby on Windows

November 16th, 2009

So you want to build the latest Ruby on Windows. Maybe you’re working on some new RubySpec tests or would like to see what’s coming with Ruby 1.9.2. At any rate, building Ruby on Windows is much easier than I originally thought, and you aren’t even forced to install Microsoft compilers at all! The good folks over at the RubyInstaller project have done all the hard work for you already, setting up the environment to build with the MinGW compiler. So, the recipe is as follows:

  • Clone the RubyInstaller repository: git clone git://github.com/oneclick/rubyinstaller.git
  • In that cloned directory, invoke: rake ruby19 CHECKOUT=1 TRUNK=1

And you’re done! That rake command will fetch the MinGW compiler and all the required libraries, check out the latest Ruby source from the Subversion repository, and then configure and build it.

Minimal requirements for your system before you start: Subversion and Ruby 1.8.6, installed and somewhere on your PATH. I recommend the Ruby from RubyInstaller (tested with 1.8.6 p383). That’s about it. The Ruby from RubyInstaller comes with rubygems preinstalled, so obtaining Rake is as trivial as: gem install rake.

The entire setup takes a couple of minutes. Then, the build… That takes longer; expect it to run for at least 20-30 minutes. And this is not as fast as the 30-second recompile of the entire JRuby tree. :)

P.S. Big thanks to Luis Lavena for helping out with debugging the build failures and reporting them to Ruby-core so that they are all now fixed.

Proper way to detect Windows platform in Ruby

November 3rd, 2009

I’ve seen this time and again: Windows platform detection in Ruby is typically done like this: RUBY_PLATFORM =~ /mswin/. THIS IS WRONG! First, there is also the mingw version of Ruby on Windows. So, folks started to use an updated check pattern: RUBY_PLATFORM =~ /mswin|mingw/. THIS IS WRONG TOO! There are other implementations of Ruby, like JRuby and IronRuby. In JRuby the RUBY_PLATFORM value is “java”; it is a platform-independent value expressing one of the core JRuby properties: it is built on top of the Java platform. Plus, every time a new implementation comes along, all the old checks become obsolete once again.

Now, the proper way to check for the Windows platform, which is surprisingly little known: use rbconfig!

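require 'rbconfig'

# RbConfig is the current name of this module; the older Config alias
# is deprecated and no longer available on newer Rubies.
WINDOZE = RbConfig::CONFIG['host_os'] =~ /mswin|mingw/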

This will work for 32-bit and 64-bit versions of Ruby, for mingw-based Ruby, for IronRuby, and for reasonably fresh JRuby (1.3.1, 1.4, 1.5+) as well. Please, please, update your sources!
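The check is easy to wrap in a small predicate so the regexp lives in one place. A minimal sketch, assuming a module name (`Platform`) that is purely illustrative; the optional argument exists only so the logic can be exercised with sample host_os strings.

```ruby
require 'rbconfig'

# Platform is a hypothetical helper module, not part of any library.
module Platform
  WINDOWS_HOST_OS = /mswin|mingw/

  # Defaults to the host_os of the running Ruby; pass a string to test
  # other values.
  def self.windows?(host_os = RbConfig::CONFIG['host_os'])
    !!(host_os =~ WINDOWS_HOST_OS)
  end
end

puts "RUBY_PLATFORM: #{RUBY_PLATFORM}"
puts "host_os:       #{RbConfig::CONFIG['host_os']}"
puts "windows?       #{Platform.windows?}"
```

Note that the pattern is deliberately /mswin|mingw/ and not just /win/: a looser pattern would also match “darwin”.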

Native JRuby Launcher For Windows

October 17th, 2009

I’ve been fixing some JRuby BAT launcher script errors recently, and it was tricky, as always: as soon as you fix one issue with the way the BAT file parses parameters, a new one pops up (and a couple of regressions on top of that). Spaces, brackets, quotes, & and ^ signs, you name it. We’ve been playing this game for years now, and every time it just gets uglier and uglier. So, this time I broke down and started to look for better alternatives to the BAT launcher files. And it turned out that there is, like, a hidden industry of Java launchers out there, with at least 4 different open-source projects, not to mention the commercial offerings.

So I tried Launch4J, WinRun4J, Groovy Launcher, and the Eclipse and NetBeans launchers. For more details, take a look at JRUBY-4100. And the winner was … NetBeans! The NetBeans guys have a nice, clean, simple C++-based launcher, (re)written not that long ago. It was easy to start hacking on, right in NetBeans itself. And I can say, NetBeans C/C++ support is really good now, all the more reason to use it! At any rate, writing C++ code was much, much better than struggling with batch files. :)

So I tweaked and cleaned and adjusted the NetBeans launcher to make it suitable for JRuby. And here it is, the JRuby native launcher for Windows. It has nice Java detection and the ability to launch Java in-process (so in the Task Manager one can see the jruby.exe process, not just java.exe), it handles most of the JRuby command line arguments already, it allows passing parameters to the JVM when needed (via the -J switch, like we always did), and it can handle spaces and brackets in the path, etc. There is a nice feature to enable tracing to see what’s going on (via the --trace command line option). This is an early version though, and it will be improved further to provide the very same functionality JRuby users expect (additional command line switches like --client and --server, etc.). Once all the functionality is there, the plan is to integrate it into the main JRuby repository. But if you wish to try it now, you can either build it from sources or go to the GitHub link above and grab the version from the “Downloads” section. Comments, suggestions and bug fixes are always welcome!

New and Noteworthy in JRuby 1.1.3

July 21st, 2008

JRuby 1.1.3 was released a couple of days ago, and I wanted to highlight some of the most interesting changes in this release, from my perspective.

  1. RubyGems 1.2. If you have ever installed even a single Ruby gem, you know the blues: it took way too long, consumed lots of memory and felt pretty heavyweight. In the past, we even had to increase the memory limits for JRuby up to 500Mb so that RubyGems could work without out-of-memory errors. Not anymore! RubyGems 1.2 is a fantastic release that speeds things up dramatically, and JRuby 1.1.3 comes with it by default. Just try it and you’ll be amazed, I promise! :) Not only that, but RubyGems 1.2 is much easier to customize to suit the needs of particular implementations/platforms, and we’ve taken full advantage of that, eliminating essentially all custom JRuby-specific patches to the RubyGems sources. All in all, kudos to the RubyGems team!
  2. Performance. Some of us are obviously obsessed with performance, so a lot of work was put into that area, yielding nice results. And the first measurements are pretty encouraging. Here are some of the things that were improved in this release:
    • Interpreter performance is greatly improved (Tom landed a couple of nice bombs on that one!)
    • Core IO performance and memory requirements significantly improved, memory leaks in IO eliminated.
    • Time’s methods are much faster now.
    • Some loops and block invocations improvements. Things like 10000.times {} are faster now.
    • General performance fixes to use faster methods where possible, JIT tweaks, reduced object churn.
    • Memory leaks in RubyArray are fixed.
  3. Compatibility/Conformance. There were *LOADS* of compatibility fixes, most of them driven by RubySpec tests and test failures. For StringIO alone, we fixed at least 50 failures. Then, lots of fixes for Socket, ARGF, IO, Kernel.select, for problems with various empty expressions, etc. I tell you, if you feel serious about some important functionality in Ruby and want it to be properly tested and consistent across the ever-growing list of Ruby implementations, there is no better way than to join the RubySpec revolution!
  4. Usability. New command line options like --server or --client instead of -J-server or -J-client; also, --debug and --jdb are now functional on Windows. Better out-of-memory error messages. Some 1.8.7-level improvements, like a better Tempfile (with the ability to specify the suffix/prefix for the file). Also, some classloader fixes so that JRuby works better with 3rd-party libraries: Spring, XMLDecoder, Activerecord-JDBC drivers shipped via RubyGems, JXTable, etc. now work without clunky workarounds (see JRUBY-2495 for more details).
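The Tempfile improvement from item 4 refers to passing the base name as a two-element array, so the generated file name starts with the first element and ends with the second. A minimal sketch; the `with_named_tempfile` helper and the 'report-' / '.txt' values are made up for illustration.

```ruby
require 'tempfile'

# Hypothetical helper: creates a temp file whose name starts with
# `prefix` and ends with `suffix`, yields it, then cleans up.
def with_named_tempfile(prefix, suffix)
  file = Tempfile.new([prefix, suffix])
  begin
    yield file
  ensure
    file.close
    file.unlink
  end
end

with_named_tempfile('report-', '.txt') do |file|
  file.write("hello\n")
  puts File.basename(file.path)
end
```

The middle part of the name is generated by Tempfile and differs on every run.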

I think that’s quite enough for 1.5 months of work… Enjoy!

Full Git Clone of JRuby Repository

June 21st, 2008

While I was at it with making a full git clone of Matz Ruby repository I also created a full clone of JRuby subversion repository and posted it on my GitHub account for your enjoyment:

Full Git Clone of JRuby Subversion Repository.

This repository has all the tags and most branches. As with the Matz Ruby repo, it is updated hourly. Personally, I find Git much more convenient to work with, especially when I don’t have commit rights to the repository but can still make local patch queues and local branches. See my earlier entry on the topic: “Using Git for Ruby/JRuby development”.

Full Git Clone of Matz Ruby Subversion Repository

June 19th, 2008

I’ve been wanting to have a full git clone of the Matz Ruby Subversion repository for a while. In fact, I’ve been using a private git clone for a few months already, and I really like the speed of switching between branches and the immediate history search. This all gets really handy with the current explosion of different versions of Ruby (1.8.5, 1.8.6 with more than 200 patch-levels, 1.8.7, 1.8-dev, 1.9). Using all these different versions is essential when writing new RubySpec tests.

And since now more folks than ever are writing the RubySpecs, dealing with Subversion gets painful for many, and the public Git repository of Matz Ruby is going to address that. So here it is:

Full Git Clone of Matz Ruby Subversion Repository.

For more info on how to use it, take a look at the README. In a nutshell, just clone the repository, create local tracking branches for those remote branches that are of interest to you, and keep updating your repository periodically.

And for those who’d like to know the steps in order to repeat them on other Subversion repositories, read on.

First, I fully cloned the entire Matz Ruby Subversion repo, using git svn:

   git svn clone --stdlayout http://svn.ruby-lang.org/repos/ruby

Then, I created an empty git repository on GitHub. So far, pretty standard procedure. The only tricky part was to figure out how to push remote branches from my freshly svn-cloned repository to public branches in GitHub’s repository. Without this, it would be pretty complicated to keep all the branches updated (you’d need to create local branches, update them manually, one by one, and then push them).

Luckily, git’s flexibility allows all kinds of interesting things, so it was easy to write some config entries to “re-wire” remote branches in the local repository to public branches in the GitHub one. I had to adjust the .git/config file:

   [remote "origin"]
      url = git@github.com:vvs/ruby-mirror.git
      push = refs/remotes/trunk:refs/heads/trunk
      push = refs/remotes/ruby_1_8:refs/heads/ruby_1_8
      push = refs/remotes/ruby_1_8_7:refs/heads/ruby_1_8_7
      push = refs/remotes/ruby_1_8_6:refs/heads/ruby_1_8_6
      push = refs/remotes/ruby_1_8_5:refs/heads/ruby_1_8_5
      push = refs/remotes/ruby_1_6:refs/heads/ruby_1_6
      push = refs/remotes/ruby_1_4:refs/heads/ruby_1_4
      push = refs/remotes/ruby_1_3:refs/heads/ruby_1_3
      push = refs/remotes/tags/*:refs/tags/*

With that change, every push to the “origin” repository (the GitHub one) will push remote branches in my private repository to the public branches of GitHub repository.
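All the push entries above follow one pattern, refs/remotes/NAME to refs/heads/NAME, plus a wildcard mapping for tags. When repeating this for another Subversion repository with many branches, the lines can be generated instead of typed. A small sketch; the `push_refspecs` helper is hypothetical, and the branch list simply repeats the names from the config above.

```ruby
# Branch names taken from the config shown above.
BRANCHES = %w[
  trunk ruby_1_8 ruby_1_8_7 ruby_1_8_6 ruby_1_8_5
  ruby_1_6 ruby_1_4 ruby_1_3
].freeze

# Hypothetical helper: builds one "push = ..." line per branch, mapping
# the svn-tracking ref to a public branch of the same name, followed by
# the wildcard line for tags.
def push_refspecs(branches)
  lines = branches.map do |branch|
    "push = refs/remotes/#{branch}:refs/heads/#{branch}"
  end
  lines << 'push = refs/remotes/tags/*:refs/tags/*'
end

puts '[remote "origin"]'
puts '   url = git@github.com:vvs/ruby-mirror.git'
push_refspecs(BRANCHES).each { |line| puts "   #{line}" }
```

The output is meant to be pasted into .git/config by hand; the script does not touch any repository.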

Finally, the process to keep the repository up-to-date is now straightforward:

   git svn fetch --all # fetches ALL branches from svn repo
   git push            # pushes all branches to public git repo
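For an unattended, periodic update it helps to stop at the first failing step, so that a failed fetch is never followed by a push. The sketch below is one possible wrapper, not the script actually used for the mirror; `sync_mirror` and its `runner` parameter are assumptions, and the runner is injectable only so the logic can be tried without touching a real repository.

```ruby
# The two commands from the post, as argument arrays for Kernel#system.
SYNC_STEPS = [
  %w[git svn fetch --all],
  %w[git push]
].freeze

# Hypothetical helper: runs each step in order and returns false as soon
# as one fails; returns true when all steps succeed.
def sync_mirror(steps = SYNC_STEPS, runner = method(:system))
  steps.each do |command|
    return false unless runner.call(*command)
  end
  true
end

# Usage inside the svn-cloned repository (e.g. from an hourly job):
#   exit(sync_mirror ? 0 : 1)
```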

Update: The repository has been moved from my personal GitHub account to the RubySpec GitHub account, to complement the RubySpec and MSpec repositories that already exist there.

Note #2: Nick Sieger has a JRuby’s git clone (only for the main trunk though): http://github.com/nicksieger/jruby

NetBeans 6.1 JRuby trick: Enable JRuby Console

May 5th, 2008

NetBeans 6.1 has been released recently, and the upgrade was easy and pain-free for me. I haven’t found any serious problems so far and the release looks very solid and performant. The Java stuff works just fine, and Ruby capabilities are great. For those who look for a great and intelligent Ruby editor, NetBeans is one of the great candidates.

One minor issue with NetBeans 6.1 is that it ships by default with a very basic IRB console for Ruby: no history, no pop-ups for code completion. Since I’m used to the JRuby IRB Console, which provides those advanced features, that was a bit of an inconvenience for me.

One of the NetBeans guys, Martin Krauskopf, came to the rescue! It turned out that there is a special property that enables the full JRuby IRB Console. Just add

   -J-Dirb.jruby=true

to the netbeans_default_options entry in the etc/netbeans.conf file. A word of caution though: The JIRB Console was disabled by default due to some problems it was causing in some environments. For me, on Windows, it works just great.

Response to "Subversion’s Future?"

May 1st, 2008

Ben Collins-Sussman posted an interesting blog entry “Subversion’s Future?”, where one of the main points made was that while distributed source control systems are OK for smallish/open-source projects, Subversion’s sweet spot is with huge projects. I couldn’t disagree more. And here’s my response.

I’ve been using distributed source control systems for more than a decade and have been watching other big projects use them, and it seems to me that the bigger the project, the more benefits a DVCS provides. What are the characteristics of a huge project? In most cases, it means that there is a big team working on it. A big team means a global team, spread all over the globe. This is not an “open-source thing”; this is a reality of corporate software development too. Most companies (well, at least those that are actually producing huge projects) are global companies, with offices in the U.S., Europe, India, etc. Working globally on a central repository *is* painful and slow. Ever tried to bisect a regression introduced between releases, switching between many revisions, or tried to follow the history of some code in Subversion? Doing this when the main repository is overseas is not fun.

A big team of engineers is typically organized in a hierarchy of smaller sub-teams, each focusing on a particular area of the product. Again, it’s much more natural to organize a hierarchy of workspaces matching the structure of the organization. There are many benefits to that approach: the members of such a sub-team mostly care about their own area and don’t touch or change anything in other places. And they know their code better, so they can find and fix new bugs and regressions faster. Typically, there is a special QA force for each sub-team, trained and specialized in testing that particular area. Once they have tested and OK’ed a particular state of the team’s code, it can be pushed upwards to the integration workspace. Thus, members of other teams won’t even be disrupted by local problems/regressions, since they get more stable and better-tested code. Distributed source control allows doing that beautifully and naturally. Doing this in Subversion is seriously painful.

Also, what are those mythical huge projects that nobody knows about? How about OpenSolaris or Linux? Are they “huge enough”? How about Mozilla or Ubuntu or NetBeans or JDK or MySQL? All these projects use distributed source control tools. Solaris and the Java SDK were open-sourced only recently. Before that they were, by all means, huge commercial software projects, each with many, many years of development by hundreds of people. They were developed with a distributed source control system. There *is* a reason why Linux never used CVS/Subversion and why even a commercial, non-open-source distributed system was used to develop Linux (since there was no good open DVCS at the time). And the reason, of course, is that distributed source control helps manage the overwhelming complexity of huge projects much better than a centralized one.

One other point in Ben’s blog entry was about the usability and ease of use of Subversion. Yeah, it is easier to use in simple scenarios, but once the size of a project grows, it gets harder and harder. Besides, if engineers are smart enough to develop and maintain a huge project, adjusting to distributed source control systems would be a piece of cake! :) And for those folks who are stuck with Subversion, there is the great git-svn tool, which lets you leverage the power of distributed source control while working with a centralized Subversion repository.

I’m not really saying that Subversion is “bad”. It was actually great for its time, but now there are better and smarter tools out there.

The value of the RubySpecs

April 27th, 2008

Currently, there are at least FIVE actively developed Ruby implementations out there in the wild. For an excellent overview of the current state of affairs, take a look at Charles’ article: “Promise and Peril for Alternative Ruby Impls”. Having that many implementations, despite all the good things, also brings a new set of challenges. One of the biggest challenges is compatibility. Some are even starting to use the word “balkanization”. (You know who you are!)

That’s where RubySpecs come in. The great promise of the RubySpecs is to provide a unified compatibility and conformance test suite, universally used by all Ruby implementations. The Ruby implementers are not only going to use the tests, but also contribute new specs back, increasing the value and the quality of the test suite over time. At the moment, we know that the Rubinius folks (who started the whole RubySpec thing) actively use them on a daily basis, with a couple of continuous integration bots, and use the RubySpecs as a driver to implement new features in a test-driven approach. Brian Ford is doing an outstanding job supporting mspec (the engine behind the specs) and accommodating user requests for new features and enhancements. JRuby uses the RubySpecs equally actively, running them on MacOS, Linux and partially on Windows. All the new tests that are of a compatibility/conformance nature typically go directly to the RubySpec repository. IronRuby uses the RubySpecs too. Hopefully, we’ll see some contributions from them as well, that would be really sweet! :) And last, but not least, during the first “Design Meeting” with the ruby-core folks (including Matz), the RubySpec question was raised, and it seems that even the Matz Ruby folks will start looking into it, and eventually use it.

So, if all goes well, most of the current active Ruby implementers would be using the common set of conformance tests. The RubySpecs are going to be forked off from the Rubinius repository and would live in the proper place as a main project on its own, with a bug tracker, a web site, etc. Also, some more folks would start working on the specs via Google’s Summer of Code program. Very exciting!

As an example, take a look at the BigDecimal support. At the beginning of April, there were no specs for BigDecimal. JRuby had a skeleton implementation, missing some methods or just having stubs for some others. Rubinius had no BigDecimal support at all. Three weeks and 43 commits later (by the way, five different folks contributed to the tests), there are 1772 test cases for BigDecimal in the RubySpecs, JRuby has already fixed hundreds of failures and now passes most of them (with 6 remaining tests to be fixed soon), and Rubinius has actively improving BigDecimal support too.

I guess the main thing I’m trying to say here is that now is the perfect time to look into the RubySpecs and to start contributing. That way, you’ll affect ALL the Ruby implementations, making ALL of them better! For more info, take a look here and here.
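To give a flavor of what such test cases pin down, here are a few BigDecimal behaviors written as plain Ruby checks. This is only an illustration: it does not use the mspec syntax, the checks are not copied from the RubySpec suite, and the `bigdecimal_checks` helper is a made-up name.

```ruby
require 'bigdecimal'

# Hypothetical helper: returns a description => result hash for a few
# behaviors every implementation is expected to agree on.
def bigdecimal_checks
  {
    'decimal addition is exact' =>
      (BigDecimal('1.1') + BigDecimal('2.2') == BigDecimal('3.3')),
    'Float addition is not exact' => (1.1 + 2.2 != 3.3),
    'NaN is detected' => BigDecimal('NaN').nan?,
    'positive infinity reports 1' => (BigDecimal('Infinity').infinite? == 1),
    'ordinary values are finite' => BigDecimal('42').finite?
  }
end

bigdecimal_checks.each do |description, passed|
  puts "#{passed ? 'PASS' : 'FAIL'}: #{description}"
end
```

Multiply this by every method and every edge case, and a count like 1772 test cases for a single class stops looking surprising.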