Archive for March, 2008

Using Git for Ruby/JRuby development

Monday, March 31st, 2008

To my surprise, Git is becoming one of the most popular source control systems in the Ruby community. New blog entries on how to use Git are popping up all over the place, and the amount of excitement is quite unexpected. :) Who would have thought that source control tools could be so exciting! Well, here are my recipes for using Git with the Ruby/JRuby Subversion repositories.

Personally, I started using Git about 6 months ago, since Rubinius is under Git, and I really wanted to get access to the “rubyspecs”, which are part of the Rubinius repository. At first, I really disliked Git because the command line is rather complicated, and it takes some time to get used to it and to become familiar with the basic Git concepts. After a while, I noticed that I was using Git more and more, and slowly but surely the full power of the tool started to amaze me. Switching between branches takes LESS THAN A SECOND, and there is instant access to the entire history, powerful branch merging, git grep, and the list could go on and on.

So I started to think about switching to Git as my main SCM tool, even on projects that are under Subversion. Here, git svn turned out to be a very capable and complete tool, allowing bidirectional access to a Subversion repository from Git. Basically, git svn lets you clone a Subversion repository with its ENTIRE HISTORY into a Git repository, and the changes made in that Git repository can be pushed back to the main Subversion one. On top of that, one can use Stacked Git to maintain a series of patches. This is very useful when there is no write access to the main repository: the patches are handled by Stacked Git and sent to the maintainers via email, or attached to bug reports. And that’s how I worked on JRuby until I got the commit rights:

  1. stg pull svn - get the latest sources from the main (svn) repository. This command first pops all your currently applied patches, fetches the latest sources, and re-applies the patches on top of them.
  2. stg new - create a new patch and provide its description
  3. hack, hack, hack
  4. stg refresh - record the changes into the patch
  5. stg export -p - export the patches into standalone files
  6. send the patch to the maintainers
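
The cycle above can be sketched as a small shell script. This is only a sketch under stated assumptions: the function names start_patch and finish_patch and the patch name are made up for illustration, the commands are printed rather than executed unless DRY_RUN=0 is set, and the patch is started with stg new, which is the StGit command for creating a new patch.

```shell
#!/bin/sh
# Sketch of the StGit patch cycle. By default the commands are only printed;
# set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
  echo "+ $*"
  if [ "$DRY_RUN" != 1 ]; then
    "$@"
  fi
}

# Steps 1-2: sync with the main repository, then open a new patch.
# "$1" is a hypothetical patch name chosen by you.
start_patch() {
  run stg pull svn
  run stg new "$1"
}

# Steps 4-5: record the edits into the current patch and export the series.
finish_patch() {
  run stg refresh
  run stg export -p
}
```

A session would then be: call start_patch with a patch name of your choice, hack, call finish_patch, and send the exported files to the maintainers.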

It’s sooo much easier to maintain patches that way than to have diff files flying around. First, the patches are always against the latest sources and never outdated (or at least, when somebody changes something in the main repository that conflicts with a patch, you’ll see the issue immediately during step #1 above and can correct it right away). Second, it’s trivial to push/pop patches, reorganize them, test with and without them, see which ones are applied and which are not, etc.

For those who are serious about working on an open source project that is currently under Subversion, I recommend investigating git-svn and stgit in more detail.

Now, here are a couple of examples of how to create a Git repository out of a public Subversion one.

Let’s start with Ruby itself:

git svn clone -Ttrunk -ttags -bbranches http://svn.ruby-lang.org/repos/ruby ruby.git

Warning: This would take about 12 hours to complete! For those who don’t want to wait that long, here’s the repository I’ve already created: ruby.git.tgz (md5sum: 576b5667affe040fd33478ea074c13b8). Think about it, these 60 MB contain the entire Ruby history! :) This is LESS than the size of a Subversion working copy! I measured the time to switch between the Ruby 1.8 and Ruby 1.9 branches: that’s 500,000 lines changed in 2000 files, and on my Ubuntu Linux it takes about 1.5 SECONDS. git-grep over the entire 1.9 branch takes 0.2 seconds. git-log of the entire 1.9 branch, from the very last revision back to revision #1, takes 0.3 seconds. What else can I say?
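
If you grab the prebuilt archive, it is worth checking it against the md5sum quoted above before unpacking. Here is a minimal sketch, assuming the GNU md5sum tool is available and the archive sits in the current directory; the helper name verify_md5 is made up for this example.

```shell
#!/bin/sh
# Succeeds only if the MD5 checksum of file "$1" equals "$2".
# Assumes GNU md5sum, which prints "<checksum>  <filename>".
verify_md5() {
  actual=$(md5sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

# Usage, assuming ruby.git.tgz has been downloaded into the current directory:
#   verify_md5 ruby.git.tgz 576b5667affe040fd33478ea074c13b8 && tar xzf ruby.git.tgz
```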

Now, the JRuby repository:

git svn clone -Ttrunk/jruby -ttags -bbranches http://svn.codehaus.org/jruby jruby.git

If you have commit rights to the JRuby repository, just change HTTP to HTTPS. The process takes about 2.5 hours (and I saw reports that it can be interrupted and then resumed too).
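
The HTTP/HTTPS choice can be wrapped in a tiny helper so you don’t have to edit the URL by hand. This is just a sketch: the function name jruby_clone_cmd and its "commit" argument are invented for this example, and it only prints the command. The resume hint in the comments relies on git svn fetch picking up from the revisions already fetched.

```shell
#!/bin/sh
# Print the clone command for JRuby. Pass "commit" if you have commit rights,
# which switches the URL from http to https.
jruby_clone_cmd() {
  scheme=http
  if [ "${1:-}" = "commit" ]; then
    scheme=https
  fi
  echo "git svn clone -Ttrunk/jruby -ttags -bbranches $scheme://svn.codehaus.org/jruby jruby.git"
}

# Run the printed command with:
#   eval "$(jruby_clone_cmd)"
#
# Resuming an interrupted clone:
#   cd jruby.git && git svn fetch
```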

Once you have the local Git repository, you’ll probably want to do the following (while you are in the root of the repository):

   git gc --aggressive --prune
   (echo; git svn show-ignore) >> .git/info/exclude

This compacts the repository further and sets up the proper excludes.
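
One caveat: running the exclude command a second time appends the same patterns again. Here is a sketch of a helper that only adds lines not yet present in the exclude file; the name append_excludes is made up for this example, and it works on any list of patterns fed to it on standard input.

```shell
#!/bin/sh
# Append each non-empty line from stdin to the file "$1",
# skipping lines that are already present there.
append_excludes() {
  exclude_file=$1
  touch "$exclude_file"
  while IFS= read -r pattern; do
    [ -n "$pattern" ] || continue
    # -x: match whole lines only, -F: treat the pattern as a literal string
    if ! grep -qxF -e "$pattern" "$exclude_file"; then
      printf '%s\n' "$pattern" >> "$exclude_file"
    fi
  done
}

# Usage, from the root of the repository:
#   git svn show-ignore | append_excludes .git/info/exclude
```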

This is just a very basic outline of the process, so if you’re interested, do check out the official documentation and feel free to ask questions.

Wordpress 2.5 and Security

Sunday, March 30th, 2008

I’ve learned my lesson the hard way. I’ve been running this blog on an older version of Wordpress, and one day it got hacked: hidden spam links were inserted into the blog entries. I must say that was brilliant: nobody could see the spam links, but the search engines were indexing them. Luckily, the problem was visible at least in the RSS aggregator. So I started digging, trying to figure out what was going on, and to my horror it turned out that these kinds of hacks are very common, especially against outdated versions of Wordpress:

  1. Detailed Post-Mortem of a Website Hack Through WordPress
  2. Support » Weird and Dangerous
  3. Justaddwater.dk hacked
  4. Another Day, Another WordPress Hack

I checked with the Wordpress folks on the #wordpress IRC channel and was advised to purge the compromised install and redo it from scratch, and I did…

And the main lesson here is to pay attention to new Wordpress releases and upgrade whenever a new security update is out. This entry sums it up nicely. On top of that, I decided to use the BARE MINIMUM of external plugins to minimize the risk.

Just in time, a new version of Wordpress, v2.5, has just been released, so I took the opportunity to upgrade, since quite a few of the changes specifically improve the security situation. Also, I decided to change the theme for the blog. Using the default one was getting pretty old; too damn many blogs use it!

And so far, I like the new version: the admin interface is much cleaner and more usable. But I’m not really sure that some of the “usability” improvements are useful, things like one-click plugin updates. After the security breach, I’m paranoid about this issue, and I’d like to minimize what can be done via the web interface. Yes, doing manual updates via the command line is not as much fun, but I’d prefer it to stay that way.

P.S. If you’d like to make sure your blog is not hacked, just take a look at the HTML source of your latest blog entry and make sure there aren’t hundreds of links at the end of it. :)
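
The same check can be done roughly from the command line by counting the link tags in a page. This is only a sketch: the function name count_links is made up, the URL in the usage comment is a placeholder, and it assumes a grep that supports the -o option (GNU and BSD grep do). A suspiciously high number for a short post is a hint to look at the source more closely.

```shell
#!/bin/sh
# Count opening <a ...> tags in the HTML read from stdin.
count_links() {
  grep -oi '<a[[:space:]]' | wc -l | tr -d ' '
}

# Usage (the URL is a placeholder for one of your own posts):
#   curl -s http://example.com/my-latest-post/ | count_links
```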

How to contribute to JRuby effectively

Thursday, March 27th, 2008

For the past 5 months I’ve been involved in the JRuby project, since it’s a perfect match for my interests: a great object-oriented scripting language on top of the powerful Java platform. And so far, I’m enjoying every second of my time with JRuby! So I thought that I might just post this entry and describe how to contribute to JRuby effectively and comfortably, based on my experience and what worked for me.

First major rule: Don’t be shy, and submit bug reports. It’s straightforward: just go to the Jira page for JRuby, get an account, and follow the “Create a new issue in project JRuby” wizard. Sure, some bugs have better chances of being accepted and fixed than others. A clear, precise description of what’s going on, with the environment specified (which OS, which JRuby version), is always a plus. A short, reproducible test case attached to the bug would be great! Having a patch that fixes the bug is just perfect. :) History shows that bugs with test cases and patches are resolved much faster, which is no surprise.

Second rule: Talk to the core developers when you have problems or issues with JRuby, or even if you just have some questions. The best way to do that is to hang out on the IRC channel #JRuby (more info here). This is probably the most useful way to be involved, since the core developers and many active JRuby users are always there, 24 hours a day. That heavy use of IRC was kind of unexpected for me; I was assuming that most communication happens over email and thought that IRC was an outdated way to communicate. And I was very wrong. Now, I can’t even imagine how some other open source projects get away without using IRC. Seriously, IRC is the way. But there are also mailing lists if you insist; they are just used less actively.

Third rule: Get the sources and start hacking. :) The sources are under Subversion, but many have been using git-svn successfully as well. Building JRuby is very easy: just invoke ant, and that’s it. Everything that’s needed for the build is already in the repository, so there is no need to download any 3rd party dependencies. All you really need is the JDK and Ant installed. Once you have a fix, there are a couple of ways to test it: run the unit tests via “ant test” and run the “rubyspecs” via “ant spec”. Also, most changes should be accompanied by new unit tests that verify the new behavior.
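
The build-and-test loop can be sketched in shell, with a quick check that the two required tools are on the PATH first. The helper name check_build_tools is invented for this example, and checking for javac is my shorthand for “the JDK is installed”; the ant targets are the ones named above.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Returns non-zero if any of them is missing.
check_build_tools() {
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
}

# Usage, from the root of the JRuby checkout:
#   check_build_tools javac ant && ant && ant test && ant spec
```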

Probably the easiest way to find stuff to work on is to follow the “rubyspecs” and try to make JRuby pass those spec tests that currently fail. The rubyspecs are a set of conformance/compatibility tests currently maintained as part of the Rubinius project (to be extracted into a standalone project sooner or later). I’ll probably post more about rubyspecs in the future, since for me this is an incredibly important piece of the overall Ruby compatibility story. Currently, you can easily see which rubyspec tests are failing with JRuby and hence are excluded: just invoke “ant spec-show-excludes”.

Another great way to help out not only JRuby but other Ruby implementations as well is to start writing new rubyspecs. The Rubinius folks are very open about that, and after a few good submissions they would probably offer commit rights.