I've been playing around with the Ruby/Rails cloud provider Heroku a little bit lately just to try it out. It is somewhat like Google App Engine or Microsoft Azure in the way it works since you bundle your application and push it out to the Heroku cloud for deployment. It is very easy to get things going but I ran into a few interesting items that I figured I would share.
Building Chromium and Chrome OS with EC2
When the initial cut of the Chromium OS source was released last week I decided to use the opportunity to see if it would run on my EEE PC 900 netbook (check out EEE PC 900 running Chrome OS on YouTube to see the final result). The first roadblock I hit with the build instructions was the Ubuntu requirement (I did put a little effort into getting it working on Fedora first). I don't have an Ubuntu box, so I started out trying to use VirtualBox, but that was going to take forever. I decided to move things to EC2, and what follows is the result. This isn't meant to be a replacement for the build docs since they are surely going to change; it is more of a cookbook for building Chromium (the browser) and Chromium OS using EC2 (EBS is used as well if you want to cache the source over time).
When I first started down the path of using EC2 I thought I would grab the source each time I wanted to build. I quickly ran into a snag however because it took forever to sync the source and download the Ubuntu repo. Once I had the initial sync of the source I decided I would copy it all to an EBS volume and keep that volume up to date. Using EBS to store the source feels better too since I assume Google expects people to be syncing changes only as opposed to pulling the entire source tree down every time they want to build.
I started out by finding this Ubuntu AMI for a base to work from. For the most efficient compile times I ended up using the High CPU (c1.medium) instance. I started with the default small instance but it was just too slow. With the High CPU instance you are looking at about 45 minutes to build the OS once the source has been synced for the first time, and building Chromium as well adds around 55 minutes. All told you can have a complete build in less than 2 hours even if there are some source updates needed. For EBS you need a 3 GB volume for the Chrome OS source plus the Ubuntu package repo and a 4 GB volume for the Chromium source.
Upgrade to Fedora 12 from Fedora 11
Fedora 12 was just released and it is time to upgrade again of course. I almost thought this was going to be a version to yawn at but then I saw that there was going to be a new version of Fedora based on Moblin and it seemed exciting again. Of course that isn't the only thing being upgraded in the latest version of Fedora. Some of the more notable changes in this version:
- Updated desktop environments: GNOME 2.28, KDE 4.3 and Fedora Moblin
- Delta RPM support
- i686 as the base architecture
- Lots of virtualization changes: KSM, KVM huge page support, KVM QCow2 performance improvements, KVM Stable Guest ABI, libguestfs, Virtual network management and improved virtual privileges to name a few
- An easier-to-use bug reporting interface, ABRT 1.0
- Better Webcam Support
You can find the complete list of Fedora 12 enhancements as well if you want more details.
Full Text Search with Sphinx
While developing my GeeQE iPhone application I decided I needed a way to let users search posts so I started looking around for a simple search engine that I could use with PHP. I took a look at a number of different options like MySQL Full Text search, Sphinx, Solr and others based on Lucene. After looking at what it would take to get started with each I decided to go with Sphinx. Sphinx looked like it would be the easiest and quickest to set up, didn't require a lot of resources to run in an idle state and would integrate with PHP easily.
This post goes over how I went about configuring Sphinx and gives an example of how to integrate it with PHP. I'm using MySQL as the data store filled with the Stack Overflow CC data dump although it should be easy to adapt the instructions to other data sources. To follow along just download a copy of the data dump and use my schema and loader to get the same MySQL database.
How I Used Hpricot and Mechanize in GeeQE
While building GeeQE I wanted to enhance the CC dump of Stack Overflow's data. The main reason I wanted to do this was to capture Gravatar hashes and user badges. To do this I decided to continue using Ruby as I did with the XML loading (see my previous post on XML parsing with Ruby). The easy choice was of course Hpricot to parse the HTML from the users page and Mechanize to move from one page to the next.
Fast XML parsing with Ruby
One of the first things I needed to do while building the GeeQE iPhone application was process the CC data dump from Stack Overflow. The dump contains XML files representing tables from Stack Overflow with the largest file being posts.xml weighing in at 1.2G as of September. I decided it would be pretty easy to use Ruby to parse the XML and load the data into MySQL so I went about finding the right parser for the job.
If you haven't processed large amounts of XML before, one thing to realize is that you don't want to use a DOM parser because it is going to load the entire XML structure into memory. What you want is a SAX parser that can work on the XML stream as it comes in. With this in mind I started looking around and quickly found an older benchmark post that gave me an educated guess that the LibXML library was going to be the fastest parser for Ruby. After figuring out how to use it I decided to also give a couple of other libraries a shot to see how they stacked up. The other two I looked at were REXML and Nokogiri.
RFID Reader USB Prototyping Kit
I recently won a programming contest that netted me a gift card for ThinkGeek and not knowing what else to do I strolled the site looking for something interesting to use the gift card on. Eventually I ran into the RFID Experimentation Kit they have and decided that was what I needed. I have been wanting to play around with RFID for a while and this kit turned out to be pretty nice for tinkering.
iPhone Windowed HTTP Live Streaming Using Amazon S3 and Cloudfront Proof of Concept
This post should be seen as a proof of concept. I'm working on creating a more concise and easier to use package of everything covered here but I felt like getting the knowledge out sooner rather than later would be of help to people looking for a way to do this. If you are interested keep an eye on the HTTP live video stream segmenter and distributor project page as well as the GitHub repository.
After my post on using FFmpeg and an open source segmenter to create videos for the iPhone that conform to the HTTP live streaming protocol I decided to see if I could get the same segmenter to work on a live stream. As it turns out it didn't take much modification to work.
If you are looking for something you can buy out of the box it appears that Akamai is doing iPhone video streaming now. I believe that the following solution using Amazon S3 and Cloudfront is probably as good as what Akamai can offer, but Akamai may be a better choice if you don't want to have to maintain the configuration yourself.
I put together a quick diagram of the process of transferring the video stream from source to final destination that will hopefully help people understand the full picture before jumping into the details:

Developing Adobe Air Apps with Linux
I finally found a little project I wanted to do using Adobe Air and after some searching I found out you can use Linux to develop Air applications. At first I thought I would have to use Flex Builder, which is still in alpha for Linux, but it turns out there is a better option from Aptana.
The Aptana Air plugin supports developing Adobe Air applications using HTML and JavaScript. It even supports the 2.0 release of Air that is currently in beta. Aptana uses the Eclipse framework as an editor so if you are familiar with Eclipse it will be even easier to use.
I started by downloading and installing the latest version of the Air runtime. Next I grabbed the Air SDK. The SDK doesn't come with the plugin so it is something you have to get directly from the Air developers site. After getting the SDK unpacked I installed the latest Aptana core release. Once the core is installed there is a big plugin button on the startup screen that currently has Air listed.
The install went smoothly except for a few issues. The first one I ran into was very noticeable: dialog buttons did nothing when clicked with the mouse, although they did work when I clicked them and then hit enter, or navigated to them with the keyboard. Luckily someone has already figured out that an issue between Eclipse and GTK+ is the cause (even though the post is for Ubuntu, the same problem and solution applied for me on Fedora). The fix is to set the GDK_NATIVE_WINDOWS variable before running the Aptana binary:
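A minimal sketch of that workaround, assuming the commonly cited value of 1 for the variable. APTANA_BIN is a hypothetical install path, so adjust it to wherever Aptana was unpacked:

```shell
# Work around the Eclipse/GTK+ button issue for anything started from this shell.
# The value 1 is the commonly cited setting, assumed here.
export GDK_NATIVE_WINDOWS=1

# APTANA_BIN is a hypothetical path; point it at your own Aptana install.
APTANA_BIN="${APTANA_BIN:-$HOME/aptana/AptanaStudio}"

if [ -x "$APTANA_BIN" ]; then
    # Start Aptana with the variable already present in its environment.
    "$APTANA_BIN" &
else
    echo "Aptana binary not found at $APTANA_BIN" >&2
fi
```

Putting these lines in a small wrapper script saves having to remember the export every time Aptana is started.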
The next thing I noticed was that the application.xml descriptor Aptana created wasn't generated correctly. It needs to start with the correct xmlns or the following error will be thrown on run: "invalid application descriptor: descriptor version does not match runtime version". To fix this, check the version of the Air SDK by running adt -version from the SDK, which for my download printed:
adt version "1.5.3.9120"
For the version of the Air SDK I downloaded the correct xmlns was http://ns.adobe.com/air/application/1.5 so I needed the following application tag:
<application xmlns="http://ns.adobe.com/air/application/1.5">
Once I had that working I was able to compile and execute a demo application. I was also able to create an Air application package from within Aptana using File > Export > Adobe AIR > Adobe AIR Package. Before creating the Air package I had to create a signing certificate. Creating the certificate can be done within Aptana too but because I had not yet fixed the above button issue I created a cert on the command line with the Air SDK and then imported it. To create the Air signing certificate from the command line I used the adt command from the SDK:
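A sketch of what that call can look like, using adt's -certificate option to create a self-signed certificate. Every value below (name, key type, file name, password) is a placeholder rather than what was used for the original cert:

```shell
# All values below are placeholders, not the ones used for the original cert.
CERT_NAME="SelfSigned"   # common name stored in the certificate
KEY_TYPE="1024-RSA"      # adt also accepts 2048-RSA
CERT_FILE="mycert.p12"   # PKCS#12 file to import into Aptana afterwards
CERT_PASS="changeit"     # password protecting the certificate file

if command -v adt >/dev/null 2>&1; then
    # Usage: adt -certificate -cn <name> <key_type> <pfx_file> <password>
    adt -certificate -cn "$CERT_NAME" "$KEY_TYPE" "$CERT_FILE" "$CERT_PASS"
else
    echo "adt not found; add the Air SDK bin directory to PATH" >&2
fi
```

The resulting .p12 file is what gets imported into Aptana for signing.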
Remember the password used to generate the certificate because it will have to be entered whenever a package is signed.
Finally, Adobe has a lot of information on developing Air applications on their Air devnet site. The Air Ajax section is especially important.