Production treatment of scripts

Applying production code treatment to utility scripts

Those single-file, single-purpose scripts that you occasionally create are still worthy of the treatment you give production applications: specifically, explicit dependency declarations and behavior-driven development. And we can do this without adding any additional files.

Background

A few months back, I was tasked with taking some Microsoft Word DOCX files and incorporating their contents into an internationalized, front-end application. Each DOCX file contained the verbiage to use for a particular language, and the end goal was to populate JavaScript locale files with the contents. I found a RubyGem - appropriately named “docx” - to parse DOCX files, and after fumbling around in IRB for a while, I found a sequence of commands that produced the output I desired. I then repeated this sequence of commands for all the files and incorporated the results into the locale files, which completed the task at hand. More recently, I was tasked with updating those same locale files from a new set of files, and I immediately regretted that I had not created a better system for handling this task the previous time. I resolved to write a script that would handle the bulk of this largely repetitive workload.

Utility Scripts

Processing these DOCX files is certainly not a responsibility of the application I was working on - it’s more a consequence of how we receive this data from our client - but building a utility script to automate as much as possible is still a worthwhile endeavor. When writing these kinds of utility scripts, I often turn to Ruby, as it’s a language I know well and it’s well suited for this type of task. Historically, I’ve had a couple of gripes with these scripts. First, I tend to develop them by running them over and over, with small changes between each invocation. Second, it can be tricky to ensure I have the correct version of a dependency installed when I want to use the script months, or years, after it was created. Our applications could suffer from similar problems, but there we mitigate the first with behavior-driven development and the second with dependency management tools. Why not apply the same techniques and tools when building single-file utility scripts?

Dependency Management

For dependency management in Ruby, we generally use Bundler, where our dependency declarations are all put into one file - typically named Gemfile. For a single-file script, having a separate file for its dependencies seems awkward. Luckily, Bundler has a trick up its sleeve that fits our needs perfectly: bundler/inline. It provides a gemfile method, where we can declare our dependencies using the same interface as Gemfiles.

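require 'bundler/inline'

gemfile do
  # A source tells Bundler where to look for gems that are not installed locally.
  source 'https://rubygems.org'
  gem 'rspec', '3.7', require: false
end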

Now that we’re able to declare dependencies inside the script itself, the script records exactly which gem versions it expects, so we have far less to worry about when we come back to it months, or years, later.
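To see which versions a constraint such as '3.7' actually accepts, we can ask RubyGems directly through Gem::Requirement. This is only a sketch: the accepts? helper is a hypothetical name introduced for illustration, and the '~> 3.7' constraint is shown for comparison, not because the script uses it.

```ruby
require 'rubygems' # normally loaded already; explicit here for clarity

# Hypothetical helper: does the given constraint accept the given version?
def accepts?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

accepts?('3.7', '3.7.0')    # => true  (a bare version means "exactly this version")
accepts?('3.7', '3.7.1')    # => false
accepts?('~> 3.7', '3.9.0') # => true  (pessimistic: >= 3.7 and < 4.0)
accepts?('~> 3.7', '4.0.0') # => false
```

An exact pin favors reproducibility, while a pessimistic constraint trades some of that for the ability to pick up later releases within the same major version.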

Behavior-Driven Development (BDD)

The same benefits we receive from using BDD while building our applications can be applied to building scripts. A suite of RSpec examples is usually contained in files that are separate from those of the implementation, but again, that approach is awkward for a single-file script. While I haven’t come across a part of RSpec purpose-built for our use case, we can rely on the way RSpec’s runner loads files to tell apart the two ways our script file is used. That is, we’ll compare the $PROGRAM_NAME global (also aliased as $0) with the __FILE__ keyword; if they are the same, the script itself is being executed; otherwise, it’s being loaded from another program - RSpec in our case. That allows us to keep our script code and our RSpec examples in the same file, and because the script exits before reaching them when run directly, the examples are only evaluated when the file is loaded by RSpec’s runner.

# script logic

if $PROGRAM_NAME == __FILE__
  # the script itself is being executed
  exit
end

require 'rspec'

RSpec.describe 'the script' do
  it 'is expected to do something'
end
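The $PROGRAM_NAME comparison can be checked without RSpec at all. The sketch below is an illustration only, using the standard library: guard_demo and the guarded.rb file name are hypothetical. It writes a tiny guarded script to a temporary directory, runs it directly, and then loads it from a second Ruby process.

```ruby
require 'open3'
require 'rbconfig'
require 'tmpdir'

# Hypothetical helper: returns what the guarded script prints when it is
# executed directly and when it is loaded by another program.
def guard_demo
  Dir.mktmpdir do |dir|
    script = File.join(dir, 'guarded.rb')
    File.write(script, <<~RUBY)
      if $PROGRAM_NAME == __FILE__
        puts 'executed directly'
      else
        puts 'loaded by another program'
      end
    RUBY

    ruby = RbConfig.ruby # path to the running Ruby interpreter
    direct, = Open3.capture2(ruby, script)
    # With -e, $PROGRAM_NAME is "-e", so it no longer matches __FILE__.
    loaded, = Open3.capture2(ruby, '-e', "load #{script.inspect}")
    [direct.strip, loaded.strip]
  end
end

puts guard_demo
```

RSpec's runner plays the role of the second process here: it is the program being executed, and our file is merely loaded by it.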

Final Script

Now we just need to put it all together into an actual script. For the sake of an example, we’ll use a simplified recreation of Unix’s wc utility, which counts lines, words and characters of the text passed in through standard input (STDIN):

#! /usr/bin/env ruby

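require 'bundler/inline'

gemfile do
  # A source tells Bundler where to look for gems that are not installed locally.
  source 'https://rubygems.org'
  gem 'rspec', '3.7', require: false
end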

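class Program
  def initialize(input)
    @input = input
  end

  # Formats the three counts in right-aligned, 8-character columns.
  def call
    "%8d%8d%8d" % [lines, words, characters]
  end

  def characters
    @input.size
  end

  # Counts newline characters, so trailing blank lines are included.
  def lines
    @input.count("\n")
  end

  # Splitting with no arguments ignores leading and trailing whitespace.
  def words
    @input.split.size
  end
end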

if $PROGRAM_NAME == __FILE__
  puts Program.new(STDIN.read).call
  exit
end

require 'rspec'

RSpec.describe Program do
  subject(:program) { described_class.new(input) }
  let(:input) { "line 1\nline 2\nline 3\n" }

  it { expect(program.call).to eq('       3       6      21') }
  it { expect(program.characters).to eq(21) }
  it { expect(program.lines).to eq(3) }
  it { expect(program.words).to eq(6) }
end

With that, we can run our specs:

$ rspec wc.rb
....

Finished in 0.00348 seconds (files took 0.19061 seconds to load)
4 examples, 0 failures

And we can run the script itself:

$ chmod +x wc.rb
$ echo "hello, world" | ./wc.rb
       1       2      13
$ printf "line 1\nline 2\n" | ./wc.rb
       2       4      14
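Inputs with leading whitespace or trailing blank lines are worth a few extra examples, because String#split treats them in ways that are easy to overlook. The following is a minimal sketch, separate from the script; counting_strategies is a hypothetical helper used only to put the alternatives side by side.

```ruby
# Hypothetical helper, for illustration only: compares counting strategies.
def counting_strategies(text)
  {
    # A regex separator keeps the empty field before leading whitespace.
    words_regex_split: text.split(/\s+/).size,
    # The argument-less form skips leading whitespace entirely.
    words_plain_split: text.split.size,
    # split drops trailing empty fields, so trailing blank lines vanish.
    lines_split: text.split(/\n/).size,
    # Counting newline characters includes every line, blank or not.
    lines_newline_count: text.count("\n")
  }
end

counts = counting_strategies("  hello world\n\n")
counts[:words_regex_split]   # => 3
counts[:words_plain_split]   # => 2
counts[:lines_split]         # => 1
counts[:lines_newline_count] # => 2
```

For this input, the real wc utility reports two lines and two words, which matches the argument-less split and the newline count.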

`exit 0`

Hopefully, you’ll be able to use these tricks when writing your next utility script. With them, you’ll be able to take advantage of BDD while developing the script, and you can be confident about its use of dependencies long after it was written.

Jacob Swanner Development Blog