When YAML Bit Me


This is the story of a bug I ran into about three years ago. I had a model, Book, with a serialized attribute called info, and BooksController initially contained this search query:

Book.where('info LIKE ?', "%ref: '#{params[:ref]}'")

The bug that surfaced at the time was that some books were not returned when the search used an alphanumeric reference number (one that contains both digits and letters). Some reference numbers were pure digits, and others were a mix of digits and letters.

The main issue with that search query is that it assumes the value of ref is always stored wrapped in single quotes, i.e. ref: '123'. That holds for all-digit reference numbers, but alphanumeric reference numbers were stored in the database without the single quotes!

My initial solution was to provide two search queries, one for each type of reference number, because at the time I thought the issue was with the way the object was stored in PostgreSQL. After further investigation, however, the problem turned out to be with YAML.

As it turns out, the main issue was that YAML (or Psych in Ruby) dumps strings that look like integers differently from other strings. Check out the following code snippet and see for yourself:

require 'psych'

Psych.dump('123')   # => "--- '123'\n"

## vs.

Psych.dump('hello') # => "--- hello\n"
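
The same rule applies when the string sits inside a hash, which is closer to what a serialized column holds. Here is a minimal sketch, assuming info is a plain hash with a ref key. The hash contents and the helper name quoted_pattern_matches? are made up for illustration:

```ruby
require 'psych'

# Hypothetical stand-ins for what the serialized `info` column might hold.
NUMERIC_INFO      = Psych.dump({ 'ref' => '123' })    # "---\nref: '123'\n"
ALPHANUMERIC_INFO = Psych.dump({ 'ref' => 'AB123' })  # "---\nref: AB123\n"

# Mimics the original LIKE pattern, which expects single quotes around the value.
def quoted_pattern_matches?(yaml, ref)
  yaml.include?("ref: '#{ref}'")
end

puts quoted_pattern_matches?(NUMERIC_INFO, '123')         # true
puts quoted_pattern_matches?(ALPHANUMERIC_INFO, 'AB123')  # false
```

The second check fails for the same reason the original query missed alphanumeric references: the quotes it looks for were never written.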

This is not a bug in Psych. In fact, the same behavior is found in the Python implementation, PyYAML:

import yaml

yaml.dump('123')   # => "'123'\n"

## vs.

yaml.dump('hello') # => 'hello\n...\n'

It’s weird to me that YAML doesn’t treat all strings equally and either keep or drop the quotes altogether. This just adds to the list of weird YAML behaviors. To be fair, though, the quotes are how YAML tells an integer apart from a string of digits when the value is dumped and reloaded: an unquoted 123 would come back as an integer (I’m not waging a war on YAML, I swear).
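
A quick round trip shows what the quotes buy you. This is only a sketch, and round_trip is a hypothetical helper name:

```ruby
require 'psych'

# Dump a value to YAML, then load it back.
def round_trip(value)
  Psych.load(Psych.dump(value))
end

p round_trip('123')        # "123" (still a String, thanks to the quotes)
p round_trip(123)          # 123
p Psych.load("--- 123\n")  # 123 (an unquoted 123 is read as an Integer)
```

Without the quotes, the string '123' would be reloaded as the integer 123.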

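Eventually, to work around this issue, I had to explicitly convert params[:ref] to an integer if and only if it could be cast into one. An integer is dumped without quotes (ref: 123), just like an alphanumeric string, so as long as the value is also converted before it is saved, we can write a single search query without the wrapping single quotes for all types of reference numbers: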

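# Use the integer form when the value is purely numeric; otherwise keep the string.
ref = Integer(params[:ref]) rescue params[:ref]

# Match the unquoted form, e.g. ref: 123 or ref: AB123.
Book.where('info LIKE ?', "%ref: #{ref}")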
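
To see why the conversion makes one pattern sufficient, here is a sketch outside Rails. normalize_ref and dump_info are hypothetical helper names, and the sketch assumes the same conversion is applied to the value before the record is saved:

```ruby
require 'psych'

# Hypothetical helper: an Integer when the input is purely numeric, otherwise unchanged.
def normalize_ref(raw)
  Integer(raw)
rescue ArgumentError, TypeError
  raw
end

# Stand-in for the serialized `info` column after normalizing the reference.
def dump_info(raw_ref)
  Psych.dump({ 'ref' => normalize_ref(raw_ref) })
end

p dump_info('123')    # "---\nref: 123\n"   (Integer, dumped without quotes)
p dump_info('AB123')  # "---\nref: AB123\n" (String, dumped without quotes)
```

Both dumps now contain the unquoted ref: value form that the single query looks for.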

That’s it for now. Thanks for reading!