Partial Indexes with PostgreSQL and Rails

So what happens when you run into a situation that you really need to be paranoid about making duplicate records into the db? When dealing with financial systems, this is more often the case (You never want to have a scenario that you end up double spending). Rails model validations more often takes care of this, but it is technically possible that something slips through and God forbid it is that one million dollar transaction. Having a validation such as:

should do the trick. Upon trying to save a record with a duplicate name, the record would not save, and .save would return false. However, there are some exceptional conditions that can result in the application level validations being by passed. This could happen with race conditions such as multiple application servers hitting a database. Having suffered this situation and if you get as paranoid as I am, then you would probably also bring down the validation to the database level by making use of unique indexes.

To spice things up, the records for which we want to avoid duplicates have uniqueness being defined by multiple columns hence the neccesity of a compund index. As an example, say you have a model called Payment with the following attributes: id, amount, order_id, payment_date, description, status. order_id is a foreign key pointing to an Order item. Since it is possible to make multiple payments per order, there is a one to many relationships between orders and payments. All payments transition from a status of ‘Scheduled’ to ‘Paid’ after they are confirmed. The migration file for payments would look like this:

In our world that is filled with paranoia we would want to add a unique index to ensure that duplicates are avoided at the db level. To do this we would create a compound index.

Let’s analyze what this index will prevent. You would not be able to create a payment of the same amount being applied to the same order on the same day. Unless of course the first payment’s status flips from ‘Scheduled’ to ‘Paid’ before the second payment is made. Yes this is a simplistic example and it prevents the possibility that two legitimate consecutive payments of the same amount to can be made against the same order within a short period of time. Though theoretically possible, that scenario would be practically very rare, and in many cases might be an inconvenience worth dealing with if the upside is completely guaranteeing that a double payment does not get registered in error.

How can this index be improved further? Enter ‘Partial Indexes’. The payments table will be undoubtedly be a very huge table. However, the uniqueness we are concerned about above, only applies to ‘Scheduled’ payments. The payments table is bount to have hundreds or thousands of ‘Paid’ status payments whereas the payments with a ‘Scheduled’ status will always be a very small subset of the payments table. Using a partial index that takes into account only the status ‘Scheduled’ would significantly increase performance and efficiency by reducing the index’s size thus taking up less storage. The index is also easier to maintain and faster to scan. To know about partial indexes you can read the original white paper by Michael Stonebraker located here. Modifying the migration to cater for a partial index on the status field yield

Here is a catch though; I was trying this in rails 3.2 and it happens that partial index migrations are not supported. In Rails 4.x there is in built suppot for the above migration. To get around this for rails 3.2, we used the gem pg_power which is billed as an ActiveRecord extension to get more out of PostgreSQL.
In psql, our table would look like this

With this we now are able to avoid in-inadvertent duplicate payments which might otherwise be catastrophic.

Uninstalling Ubuntu from a Mac Book Pro

I am not an apple fan-boy, but I admit that I like apple hardware – specifically the Mac Book Pro. For this reason, I usually run Ubuntu on my Mac Book Pro. If it was not for iTunes, I probably would not even dual-boot. I can never get over the awesomeness of ‘sudo apt-get install xxxxx’ in comparison to the apple ‘walled garden’. Recently, I wanted to sell one of the Mac Book Pro’s I had so that  I could up-grade to a later model. (Apparently Mac Book Pro’s never grow old!). The person I was selling the laptop to did not want Ubuntu in it so I had the odious task of un-installing Ubuntu (12.04). Therein lies the problem. There is a lot of literature on the web about how to install Ubuntu on Mac Books since the advent of Mac Books using Intel chips, but there are not a lot of folks talking about how to remove Ubuntu. After Googling around, here are the steps that worked for me:

  1. Get an Ubuntu – Live CD. You will need it to delete the Linux partitions
  2. Boot into the Ubuntu Live CD (press the C button while restarting with the Ubuntu CD/DVD in the drive)
  3. Start Gparted the Ubuntu partitioning program. If unsure on how to do this, just go the the ‘Dash Home’ and search for Gparted
  4. Delete the Linux partition
  5. Set the swap partition to “off”. In my case, I tried deleting the swap partition and it would not let me do so until I turned it ‘off’
  6. Now delete the swap partition
  7. Gparted operations are complete at this time
  8. Important note! Do NOT re-size the Mac-OS partition while in Gparted this will be done from Mac-OS
  9. Reboot the laptop and boot into Mac OS X
  10. Start the Disk Utility application and go to ‘Partition’
  11. Re-size the Mac OD partition by dragging it to the bottom to take up the space that had been taken up by Ubuntu
  12. Re-boot and make sure those changes have taken effect
  13. Next we need to delete rEffit (I was using rEffit – but  I understand it is now legacy)
  14. Instructions for uninstalling rEffit are located at their site here
  15. In my case method 1 was the one applicable
    • Delete the efi folder
    • Delete rEffitBlesser folder in “Library/StartupItems”
  16. Reboot and that’s it!!

Hope this helps someone.

 

 

Single Table Inheritance and updating ‘gotcha’ in Rails 3

For my first blog post, I may as well start by talking about a recent oddity that I encountered working on our rails application (at Kopo Kopo, Inc). It probably may not be an oddity for some rails aficionados but well, it was a head scratcher so I decided to share.

So basically here is the setup. Say you have a model called Person and two other models that inherit from Person called Student and Teacher respectively

Let’s say you want to use only one form (view) to be able to create either a Student or Teacher and the object to be created would have to depend on the user selecting an option in a select box to determine what model will actually be created. So in essence the creation form would look somewhat like the following:-

 

Note that the object that is used to call the form_for is a ‘Person’ object so the forms fields are actually bound to a Person object. The same would go for the editing view

 

You would expect the Persons controller to have:-

 

The interesting/tricky part comes up in the update method. It so happens that since the objects were created as either a ‘Teacher’ or ‘Student’ entity but the edit form is bound to a ‘Person’ object and the update_attributes is also called on the parameters of a ‘Person’ object (the superclass), the update does not work and the fields are not updated. For the update to work, you would have to call update_attributes on the particular object being updated. Remember this case is probably arising because we want to use the same form to update both a ‘Teacher’ and ‘Student’ object. We thus have to change the update method like so:-