Infinite sequences in ruby

One feature of harmonia is tasks that recur on a schedule, e.g. every Thursday, or on the 30th day of each month. For these tasks we need to know not just when they’ll next occur, but also things like the next 4 occurrences, or all occurrences this month.

To do this we’ve used a technique more common in clojure: using an infinite sequence.

Defining simple infinite sequences

Ruby 1.9 and above let us define infinite sequences using the Enumerator class. A simple example is the sequence of integers:

integers = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder.yield n
    n = n + 1
  end
end

>> integers.take(5)
=> [0, 1, 2, 3, 4]
>> integers.take(10)
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Here’s how this works. The block passed to Enumerator.new defines our sequence. It takes a yielder argument with a special method #yield, used to return elements in the sequence. Whenever #yield is called, execution of the block stops. Execution only restarts if more elements are needed, making the sequence lazy. The Enumerator class handles this stopping and starting execution — we only need to worry about how to generate each element.

Most of the code above is concerned with looping and yielding values, not generating them. We can factor this out, giving us a method that makes it trivial to define new sequences:

def sequence(&generator)
  Enumerator.new do |yielder|
    n = 0
    loop do
      yielder.yield generator.call(n)
      n = n + 1
    end
  end
end

>> integers = sequence {|n| n}
>> integers.take(10)
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

>> squares = sequence {|n| n * n}
>> squares.take(10)
=> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

When using infinite sequences laziness is extremely important, as it’s impossible to generate all members of an infinite list in anything less than infinite time. A sequence that outputs whenever a new value is calculated shows this laziness in action:

integers = sequence {|n| puts "Calculating result #{n}"; n}

>> integers.take(3)
Calculating result 0
Calculating result 1
Calculating result 2
=> [1, 2, 3]

Now we can define sequences, how can we use them? We’ve already seen that we can #take any number of elements from our sequence. We can also use #take_while to take elements until a condition is met, such as finding all square numbers under 250:

>> squares.take_while {|n| n < 250}
=> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225]

In fact all Enumerable methods are available, but we have to take care in using them. As our sequences are infinite, any method that iterates over all members has the potential to take an infinite amount of time. For example calling #any? will either return true (if a matching element exists) or never return.

Another big drawback is that when we call Enumerable methods, laziness isn’t preserved. Suppose we want the first 5 odd square numbers. We might try the following:

>> squares.select {|n| n % 2 == 1}.take(5)

Unfortunately this will never return. Even though we only want a finite set of results, the call to #select operates on the full infinite sequence before it returns. #take(5) is never called. The same problem exists with #map, #drop, #reject and more.

Preserving laziness across derived sequences

Without laziness preservation, our sequences seem of limited use. In ruby 2.0 we can use #lazy to make our sequences lazier, but in 1.9 this isn’t available to us. Thankfully we can get around this by generating lazy versions of Enumerable methods ourselves. Let’s take the previous example, finding the first 5 odd square numbers. We hit a roadblock because #select never returned. If instead of using #select we use a new Enumerator to do our selecting, we can work around this:

odd_squares = Enumerator.new do |yielder|
  squares.each do |square|
    yielder.yield square if (square % 2 == 1)
  end
end

>> odd_squares.take(5)
=> [1, 9, 25, 49, 81]
>> odd_squares.take(10)
=> [1, 9, 25, 49, 81, 121, 169, 225, 289, 361]

Our new Enumerator iterates lazily through our original sequence, yielding only odd values. We’ve chained two enumerators together to preserve laziness.

This is all a bit cumbersome as is, but we can turn this into a ‘#select’ method on a new LazyEnumerator class:

class LazyEnumerator < Enumerator
  def select(&block)
    self.class.new do |yielder|
      each do |value|
        yielder.yield value if block.call(value)
      end
    end
  end
end

def lazy_sequence(&generator)
  LazyEnumerator.new do |yielder|
    n = 0
    loop do
      yielder.yield generator.call(n)
      n = n + 1
    end
  end
end

>> lazy_squares = lazy_sequence {|n| n * n}
>> lazy_squares.select {|n| n % 2 == 1}.take(5)
=> [1, 9, 25, 49, 81]

#reject and #map can be chained in a similar way to #select:

class LazyEnumerator < Enumerator
  def reject(&block)
    self.class.new do |yielder|
      each do |value|
        yielder.yield value unless block.call(value)
      end
    end
  end

  def map(&block)
    self.class.new do |yielder|
      each do |value|
        yielder.yield block.call(value)
      end
    end
  end
end

#drop and #drop_while are slightly more complicated, but follow a similar pattern. The main difference being that they need to keep track of how much to drop:

class LazyEnumerator < Enumerator
  def drop(n)
    self.class.new do |yielder|
      dropped_enough = false
      dropped = 0
      each do |element, index|
        dropped_enough ||= dropped >= n
        yielder.yield element if dropped_enough
        dropped = dropped + 1
      end
    end
  end

  def drop_while(&block)
    self.class.new do |yielder|
      match_found = false
      each do |element|
        match_found ||= !block.call(element)
        yielder.yield element if match_found
      end
    end
  end
end

Together, these methods give us a LazyEnumerator that can be chained in a great number of ways, giving our sequences a lot of power. #take and #drop let us select which members of a sequence we’re interested, while #select, #reject and #map allow us to build new sequences from existing ones:

>> integers = lazy_sequence {|n| n}
>> squares = integers.map {|n| n * n }
>> odd_squares = squares.select {|n| n % 2 == 1}
>> odd_squares.drop(10).take(10)
=> [441, 529, 625, 729, 841, 961, 1089, 1225, 1369, 1521]

Using only a few simple methods, we’ve been able to answer a complicated (though contrived) question, what are the second ten odd square numbers? This particular answer may not be interesting, but the technique of defining and deriving infinite sequences is much more general and useful. This is only a small sample of what you can do with them.