Advent of Code 2019 Day 4

This year I have decided to try and do the code challenges on the Advent of Code website in Scala and possibly Spark if needed (or an interesting solution arises).
These are simple little coding challenges given once per day like an Advent Calendar before Christmas.

Day 4: Part 1

This challenge is relatively short so I will include the whole thing below:

— Day 4: Secure Container —

You arrive at the Venus fuel depot only to discover it’s protected by a password. The Elves had written the password on a sticky note, but someone threw it out.

However, they do remember a few key facts about the password:

  • It is a six-digit number.
  • The value is within the range given in your puzzle input.
  • Two adjacent digits are the same (like 22 in 122345).
  • Going from left to right, the digits never decrease; they only ever increase or stay the same (like 111123 or 135679).

Other than the range rule, the following are true:

  • 111111 meets these criteria (double 11, never decreases).
  • 223450 does not meet these criteria (decreasing pair of digits 50).
  • 123789 does not meet these criteria (no double).

How many different passwords within the range given in your puzzle input meet these criteria?

Your puzzle input is 136760-595730.

So we need to crack that password! Or at least work out how many combinations there are.

This is a nice and simple thing to do in Scala:

val min = 136760
val max = 595730

val fullRange = min to max

First we define the minimum and maximum and create a range between them.

Next I want to split each number in the range into its individual digits. I actually use a bit of a short-cut to do this:

def charToInt(char: Char): Int = char.toInt - '0'

This method takes a character and, assuming it is a digit character, converts it into the matching integer. Combined with a string version of a candidate password, this lets me produce a sequence of digits with ease like so:

fullRange
  .map(n => n.toString)
  .map(string => string.map(char => charToInt(char)))
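As an aside, the standard library already has a digit conversion, so a helper like charToInt is optional. A minimal sketch, assuming every character is a decimal digit (digitsOf is a hypothetical name, not part of the solution above):

```scala
// Hypothetical helper: split a non-negative number into its decimal digits.
// Char.asDigit is the standard library conversion from '7' to 7.
def digitsOf(n: Int): IndexedSeq[Int] =
  n.toString.map(char => char.asDigit)

println(digitsOf(136760).mkString(",")) // prints 1,3,6,7,6,0
```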

Now all we need to do is filter down this big collection of digits to match the criteria described:

First let’s find all the combinations with repeating digits:

def hasRepeatedDigit(number: IndexedSeq[Int]): Boolean = {
  for (index <- 0 until number.size - 1) {
    val digit = number(index)
    val nextDigit = number(index + 1)
    if (digit == nextDigit) {
      return true
    }
  }
  false
}

That’s pretty simple and easy.
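The same check can also be written without an index or an early return. This is only a sketch of an alternative style, and hasAdjacentPair is a hypothetical name:

```scala
// Pair each digit with its right-hand neighbour and look for an equal pair.
def hasAdjacentPair(digits: Seq[Int]): Boolean =
  digits.zip(digits.drop(1)).exists { case (left, right) => left == right }
```

For the puzzle examples, 111111 and 122345 pass this check while 123789 does not.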

Next let us filter to just those numbers whose digits only increase or stay the same:

def isIncrementingOrSame(number: IndexedSeq[Int]): Boolean = {
  var index: Int = 0
  while (index < number.size - 1) {
    val digit = number(index)
    for (i <- index + 1 until number.size) {
      val testDigit = number(i)
      if (testDigit < digit) {
        return false
      }
    }
    index += 1
  }
  true
}

A little more complex but not hard.
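The nested loop compares every digit with every later digit, but checking neighbours is enough: if each digit is no larger than the one after it, then it is no larger than any later digit either. A single-pass sketch under that reasoning (isNonDecreasing is a hypothetical name):

```scala
// Every digit must be less than or equal to its right-hand neighbour.
def isNonDecreasing(digits: Seq[Int]): Boolean =
  digits.zip(digits.drop(1)).forall { case (left, right) => left <= right }
```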

Putting these together like so:

val validPasswords = fullRange
  .map(n => n.toString)
  .map(string => string.map(char => charToInt(char)))
  .filter(hasRepeatedDigit)
  .filter(isIncrementingOrSame)

println(validPasswords.size)

This will print out the number of valid values asked for in part 1!

Day 4: Part 2

Now part 2 modifies one of the conditions slightly:

— Part Two —

An Elf just remembered one more important detail: the two adjacent matching digits are not part of a larger group of matching digits.

Given this additional criterion, but still ignoring the range rule, the following are now true:

  • 112233 meets these criteria because the digits never decrease and all repeated digits are exactly two digits long.
  • 123444 no longer meets the criteria (the repeated 44 is part of a larger group of 444).
  • 111122 meets the criteria (even though 1 is repeated more than twice, it still contains a double 22).

How many different passwords within the range given in your puzzle input meet all of the criteria?

Now we can add an extra filter to cover this.

There are a few ways of writing this filter. One slightly hacky way is to convert the digits back to a string and use a Regular Expression to split it into groups of repeating digits:

import java.util.regex.Pattern

val pattern = Pattern.compile("(?<=(.))(?!\\1)")
def repeatDigitsNotPartOfLargerGroup(number: IndexedSeq[Int]): Boolean = {
  val asString = number.map(digit => digit.toString).mkString
  val repeatedDigits = pattern.split(asString).toSeq
  repeatedDigits.exists(repeat => repeat.length == 2)
}

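The pattern matches the empty position after a character (captured by the positive lookbehind) wherever that same character does not follow (the negative lookahead), so splitting on it cuts the string into runs of identical digits. It is kind of hard to understand unless you use Regular Expressions a lot.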
You could do a similar thing with a Java Scanner too.
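To make the regular expression a little less mysterious, here is a small sketch that only shows what the split produces. runBoundary and runsOf are hypothetical names; the pattern itself is the one used above:

```scala
import java.util.regex.Pattern

// Zero-width match wherever the previous character is not repeated next
val runBoundary = Pattern.compile("(?<=(.))(?!\\1)")

// Cut a string of digits into runs of identical characters
def runsOf(text: String): Seq[String] = runBoundary.split(text).toSeq

println(runsOf("111122").mkString(" | ")) // prints 1111 | 22
```

A password then passes the part 2 rule when at least one of those runs has a length of exactly 2.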

But if we wanted to do this properly without converting to a string we really only need 2 nested loops to perform the same logic on the digits:

import scala.collection.mutable

def repeatDigitsNotPartOfLargerGroup(number: IndexedSeq[Int]): Boolean = {
  val groupCounts = mutable.Buffer[(Int, Int)]()

  var start = 0
  while (start < number.length - 1) {
    val digit = number(start)
    var count = 1
    var i = start + 1
    var changed = false
    while (i < number.length && !changed) {
      val nextDigit = number(i)
      if (digit != nextDigit) {
        changed = true
      } else {
        count += 1
        i += 1
      }
    }
    val group = (digit, count)
    groupCounts += group
    start += count
  }

  groupCounts.exists(group => group._2 == 2)
}
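For comparison, the grouping can also be expressed as a run-length encoding built with span. This is a sketch of an alternative, not the version I used; runLengths and hasExactPair are hypothetical names, and the recursion is not stack safe for very long inputs (fine for six digits):

```scala
// Run-length encode the digits: List(1, 1, 1, 1, 2, 2) becomes
// List((1, 4), (2, 2)).
def runLengths(digits: List[Int]): List[(Int, Int)] = digits match {
  case Nil => Nil
  case head :: _ =>
    // span splits off the leading run of digits equal to the first one
    val (run, rest) = digits.span(digit => digit == head)
    (head, run.length) :: runLengths(rest)
}

// True when at least one run is exactly two digits long
def hasExactPair(digits: Seq[Int]): Boolean =
  runLengths(digits.toList).exists { case (_, length) => length == 2 }
```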

And with either of these filters added we get our result for part 2!

Advent of Code 2019 Day 3

I did complete this challenge on the day but am only now managing to write about it!

Day 3: Part 1

Today’s challenge is again slightly more difficult than previous days.
I will again only try to paste the relevant parts of the challenge here.

We are presented with input that describes 2 wires coming out of a port and snaking around a grid.

Specifically, two wires are connected to a central port and extend outward on a grid. You trace the path each wire takes as it leaves the central port, one wire per line of text (your puzzle input).

And our job is to find the closest point they cross:

To fix the circuit, you need to find the intersection point closest to the central port. Because the wires are on a grid, use the Manhattan distance for this measurement. While the wires do technically cross right at the central port where they both start, this point does not count, nor does a wire count as crossing with itself.

Manhattan distance is a common distance metric used when dealing with grids and is also known as “Taxicab Geometry” because of the fact you measure the same way as a taxi would navigate through a city like Manhattan (a grid based city).
That means you measure your distance in each axis and add them up.
For example if you are at position (0, 1) and want to get to (10, 9) you take the x components and find the difference, 0 to 10 = 10, and do the same with the y components, 1 to 9 = 8, and add those results together, 8 + 10 = 18, and that is your Manhattan distance.
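The worked example above can be checked with a couple of lines. This is just a sketch, and taxicab is a hypothetical name:

```scala
// Sum of the absolute differences along each axis
def taxicab(from: (Int, Int), to: (Int, Int)): Int =
  math.abs(to._1 - from._1) + math.abs(to._2 - from._2)

println(taxicab((0, 1), (10, 9))) // prints 18
```

Because of the absolute values the result is the same in either direction.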

The wire paths are described using what are essentially commands:

For example, if the first wire’s path is R8,U5,L5,D3, then starting from the central port (o), it goes right 8, up 5, left 5, and finally down 3:

...........
...........
...........
....+----+.
....|....|.
....|....|.
....|....|.
.........|.
.o-------+.
...........

Then, if the second wire’s path is U7,R6,D4,L4, it goes up 7, right 6, down 4, and left 4:

...........
.+-----+...
.|.....|...
.|..+--X-+.
.|..|..|.|.
.|.-X--+.|.
.|..|....|.
.|.......|.
.o-------+.
...........

These wires cross at two locations (marked X), but the lower-left one is closer to the central port: its distance is 3 + 3 = 6.

So to begin we will need to be able to read and represent the wires in the text file.

I begin by creating some data structures to do this:

import scala.collection._

object Direction extends Enumeration {
  type Direction = Value
  val UP = Value("U")
  val RIGHT = Value("R")
  val DOWN = Value("D")
  val LEFT = Value("L")
}

import Direction.Direction

case class Command(direction: Direction, distance: Int)

type Wire = Seq[Command]

Here I have defined the direction as an enumerated type and commands as being a combination of directions and distance, with a wire simply being an ordered sequence of commands.

Now for parsing and reading I will again be using the Scala Source class and split this into several functions to make it easier to read and think about the code:

def parseCommand(command: String): Option[Command] = {
  if (command.length < 2) {
    return None
  }
  try {
    val direction = Direction.withName(command.substring(0, 1))
    val distance = command.substring(1).toInt
    Some(Command(direction, distance))
  } catch {
    case _: Exception =>
      println(s"Unhandled command: $command")
      None
  }
}

First up I think about how I want to handle parsing a single command from the file I am given. These will be in forms similar to U1, R12, D3 and L23, basically a letter denoting direction followed by an integer denoting distance.
In my Direction enumerated object I defined each direction to have a name corresponding to the letters used in the input. I take the first character of the command and attempt to match it, then take the remainder and attempt to convert it to an integer.
If something goes wrong with the parsing I return a None that I can handle later and log the bad command.
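As a side note, the same "letter followed by an integer" shape can be matched with a regular expression extractor instead of substring calls. This is only a sketch of that alternative: CommandPattern and parseMove are hypothetical names, it returns a plain (Char, Int) pair rather than a Command, and it assumes the distance fits in an Int:

```scala
// One direction letter followed by one or more digits
val CommandPattern = "([UDLR])([0-9]+)".r

// Returns None for anything that is not a direction letter plus a number.
// Assumption: the number is small enough for toInt not to overflow.
def parseMove(text: String): Option[(Char, Int)] = text match {
  case CommandPattern(direction, distance) =>
    Some((direction.head, distance.toInt))
  case _ => None
}
```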

def parseLine(line: String): Option[Wire] = {
  if (line == null || line.isEmpty) {
    return None
  }
  val commands: Seq[Option[Command]] = line.split(',')
    .map(part => parseCommand(part))

  if (commands.forall(item => item.isDefined)) {
    Some(commands.flatten)
  } else {
    println(s"Unhandled line: $line")
    None
  }
}

import scala.io.Source

def readInput(filename: String): Seq[Wire] = {
  val source = Source.fromFile(filename)
  try {
    val wires = source.getLines()
      .map(line => line.trim)
      .filter(line => line.nonEmpty)
      .map(line => parseLine(line))

    // Materialise the lazy iterator before the file is closed
    wires.flatten.toList
  } finally {
    source.close()
  }
}

The next function is for parsing a whole line.
It splits the line up on the comma separator and uses the first function to extract a command from it.
It then checks all the results of parsing and sees if there were any errors with the command parsing. If there were it returns a None and logs an error, otherwise it flattens out all the Some[Command] instances into Command instances.

Finally there is the readInput function that actually opens the file, reads it line by line and uses the parseLine method to generate whole wires.

With all that done we can now represent our wires in a way we can easily manipulate. It’s now time to consider how to determine where on a grid the wires actually live!
For this we need to represent positions somehow:

type Position = (Int, Int)

For now this simple tuple will suffice.

Now we need to take the commands that make up a wire and convert them into all the positions they occupy on a grid.
With our data structures this can be relatively simple:

def addCommand(start: Position, command: Command): Position = {
  val Command(direction: Direction, distance: Int) = command
  if (distance == 0) {
    return start
  }
  val (x, y) = start

  direction match {
    case Direction.UP => (x, y + distance)
    case Direction.DOWN => (x, y - distance)
    case Direction.RIGHT => (x + distance, y)
    case Direction.LEFT => (x - distance, y)
  }
}

The way this function works is that given a starting position and command it will determine where the command would cause the position to move to and return that as a result.

Of course if I used just this method I would only end up with positions where the wire changed direction (or got a new command), not all the points in between these positions.
For this reason I need a method of getting all the points between the starting position and the ending position of a command:

def pointsBetween(start: Position, end: Position): Seq[Position] = {
  val results = mutable.Buffer[Position]()
  val (x0, y0) = start
  val (x1, y1) = end
  val xStep = if (x0 > x1) -1 else 1
  val yStep = if (y0 > y1) -1 else 1
  for (x <- x0.to(x1, xStep)) {
    for (y <- y0.to(y1, yStep)) {
      val pos: Position = (x, y)
      // Don't add the start to the results
      if (x != x0 || y != y0) {
        results += pos
      }
    }
  }
  results
}

This method is relatively simple again; it’s basic interpolation between the two points. Because a command only ever moves along one axis, one of the two loops always runs exactly once.
I make use of a mutable Scala Buffer here to make things easier to read.
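If you would rather avoid the mutable buffer, the same interpolation can be sketched by stepping one unit at a time. This assumes the two points share an x or a y coordinate, which is always the case for a single command; walk is a hypothetical name:

```scala
// All points after start, up to and including end.
// Assumption: start and end lie on the same row or the same column.
def walk(start: (Int, Int), end: (Int, Int)): Seq[(Int, Int)] = {
  val (x0, y0) = start
  val (x1, y1) = end
  // Unit step along each axis: -1, 0 or 1
  val dx = math.signum(x1 - x0)
  val dy = math.signum(y1 - y0)
  val steps = math.max(math.abs(x1 - x0), math.abs(y1 - y0))
  (1 to steps).map(i => (x0 + i * dx, y0 + i * dy))
}
```

For example, walking from (0, 0) to (3, 0) gives (1, 0), (2, 0) and (3, 0), with the start left out just like in pointsBetween.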

Now that I can get the points between 2 points I can bring this all together to get all the points in a wire:

def getPositions(wire: Wire, origin: Position = (0, 0)): Seq[Position] = {
  val positions = mutable.Buffer[Position]()
  var lastPosition: Position = origin
  for (command <- wire) {
    val firstPosition: Position = lastPosition
    lastPosition = addCommand(firstPosition, command)
    // add all points between start (exclusive) and end (inclusive)
    positions ++= pointsBetween(firstPosition, lastPosition)
  }

  positions
}

This function starts at an origin and executes each command, using the start and end points of each, adding them all to a buffer and returning them all.

Now that we can get all the points in a wire, we need a way of finding out where wires intersect each other.
This is actually pretty simple:

def findIntersections(paths: Seq[Seq[Position]]): Seq[Position] = {
  val positionCounts: Map[Position, Int] =
    paths.flatMap(path => path.distinct)
      .groupBy(identity)
      .mapValues(_.size)

  positionCounts.filter(entry => entry._2 > 1).keys.toSeq
}

Since each path a wire takes now contains all the positions a wire can be in we just need to find where a position exists in both wires’ paths.
I have done this using some standard Scala code:

  1. First I get a distinct list of all positions in each path, that way I can avoid counting a wire crossing itself.
  2. Then I use flatMap to combine the paths into one list.
  3. Then I group them all by themselves (that’s the identity method I use) and convert the list into a Map[Position, Int] with the values being the count of occurrences of a given position.

This resulting map contains all the positions in both paths, if I then filter it down to only those that have a count greater than 1 I can find any intersections.
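When there are exactly two wires, as in this puzzle, the counting approach boils down to a set intersection. A minimal sketch of that simplification (crossingPoints is a hypothetical name, and it assumes the paths already exclude the origin):

```scala
// Converting each path to a Set removes self-crossings, and intersect keeps
// only the points that appear in both wires.
def crossingPoints(
    first: Seq[(Int, Int)],
    second: Seq[(Int, Int)]
): Set[(Int, Int)] =
  first.toSet intersect second.toSet
```

The groupBy version above has the advantage of generalising to any number of wires.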

I can use the above methods to get me this far like so:

val wires = readInput("day3.input.txt")
val wireToPositions = wires.map(wire => (wire, getPositions(wire))).toMap
val intersections = findIntersections(wireToPositions.values.toSeq)

Now I need to actually use the Manhattan distance to find out which of the intersections is the closest.
The code for the Manhattan distance in Scala is simple:

def manhattanDistance(origin: Position, other: Position): Int = {
  // The distance along each axis is the absolute difference of the coordinates
  val x: Int = math.abs(origin._1 - other._1)
  val y: Int = math.abs(origin._2 - other._2)
  x + y
}

I can then find the closest intersection like so:

def findClosestIntersection(
    origin: Position,
    intersections: Seq[Position]
    ): (Position, Int) = {
  val withDistances =
    intersections.map(pos => (pos, manhattanDistance(origin, pos)))

  withDistances.minBy(f => f._2)
}

What this does is similar to finding the intersections initially; it takes each position and gets its distance from the origin, then simply returns the one with the smallest distance.

I can then use this function like so to answer part 1:

val closest = findClosestIntersection(
  (0, 0),
  intersections
)
println(s"Closest intersection ${closest._1} distance=${closest._2}")

Day 3: Part 2

Finally onto part 2.
We now need to use a different measurement on the intersections:

To do this, calculate the number of steps each wire takes to reach each intersection; choose the intersection where the sum of both wires’ steps is lowest. If a wire visits a position on the grid multiple times, use the steps value from the first time it visits that position when calculating the total value of a specific intersection.

The number of steps a wire takes is the total number of grid squares the wire has entered to get to that location, including the intersection being considered. Again consider the example from above:

...........
.+-----+...
.|.....|...
.|..+--X-+.
.|..|..|.|.
.|.-X--+.|.
.|..|....|.
.|.......|.
.o-------+.
...........

In the above example, the intersection closest to the central port is reached after 8+5+5+2 = 20 steps by the first wire and 7+6+4+3 = 20 steps by the second wire for a total of 20+20 = 40 steps.

However, the top-right intersection is better: the first wire takes only 8+5+2 = 15 and the second wire takes only 7+6+2 = 15, a total of 15+15 = 30 steps.

With our code this is actually pretty easy.
Since we have a list of all the positions in a wire we can use it with the intersections we uncovered before to find out their distances by simply counting the steps it takes to get to them:

val wireToIntersectionDistances: Map[Wire, Map[Position, Int]] =
  wireToPositions.map(entry => {
    val wire = entry._1
    val positions = entry._2
    val positionsToDistance = intersections.map(
      // remember to +1 as we excluded the origin from our original list
      intersection => (intersection, positions.indexOf(intersection) + 1)
    ).toMap
    (wire, positionsToDistance)
  })

Then with this map of wires to their intersections and their distances we can do some more calculations to find out the total steps taken from both wires for each intersection and then find the lowest:

val intersectionsToTotalDistances: Map[Position, Int] =
  wireToIntersectionDistances.foldLeft(Map[Position, Int]())((sum, map) => {
    val otherMap = map._2
    (sum.keySet ++ otherMap.keySet).map { key: Position =>
      (key, sum.getOrElse(key, 0) + otherMap.getOrElse(key, 0))
    }.toMap
  })

val minDistanceIntersection = intersectionsToTotalDistances.minBy(f => f._2)

println(s"Min distance intersection at ${minDistanceIntersection._1} distance=${minDistanceIntersection._2}")

Now admittedly that foldLeft block of code does look quite complex but what it does is fairly simple:

  1. The first set of arguments contains the initial, empty, value for what we want to eventually return, a map of positions to the total amount of steps taken.
  2. The next set of arguments contains the function that keeps the running summarized map and the current map being processed from the wireToIntersectionDistances map entries.
  3. The rest of the code then sums up the maps’ values based on their keys, which are the positions.
  4. Finally we get the minimum entry like before.
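As a sanity check, both measurements can be verified against the small example from the puzzle text. The sketch below is a condensed, self-contained stand-in for the code above rather than the code itself: Point, trace and shared are hypothetical names, and trace assumes well-formed input (it will throw on an unknown direction letter):

```scala
type Point = (Int, Int)

// Expand a path such as "R8,U5" into every grid point visited, in order,
// excluding the starting point (0, 0).
def trace(path: String): Vector[Point] = {
  val moves = path.split(',').toVector.flatMap { command =>
    val delta: Point = command.head match {
      case 'U' => (0, 1)
      case 'D' => (0, -1)
      case 'R' => (1, 0)
      case 'L' => (-1, 0)
    }
    // One unit step per grid square moved
    Vector.fill(command.tail.toInt)(delta)
  }
  moves.scanLeft((0, 0)) { case ((x, y), (dx, dy)) => (x + dx, y + dy) }.tail
}

val first = trace("R8,U5,L5,D3")
val second = trace("U7,R6,D4,L4")
val shared = first.toSet intersect second.toSet

// Part 1: smallest Manhattan distance from the origin
val closest = shared.map { case (x, y) => math.abs(x) + math.abs(y) }.min

// Part 2: fewest combined steps. indexOf finds the first visit, and the +1
// accounts for the origin not being stored in the traced path.
val fewestSteps = shared.map { point =>
  (first.indexOf(point) + 1) + (second.indexOf(point) + 1)
}.min

println(s"closest=$closest fewestSteps=$fewestSteps") // prints closest=6 fewestSteps=30
```

These match the 6 and 30 quoted in the puzzle description.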

And that’s part 2 of Day 3 done!

Advent of Code 2019 Day 2

I am starting a day late on the 2nd of December, but this hopefully means my solutions will not spoil it for anyone else!

Day 2: Part 1

This day’s challenge is quite different to Day 1 and involves creating a simple interpreter or emulator for processing a simple input program and set of opcodes.

I have decided not to copy and paste the whole challenge here for brevity’s sake but I will refer back to parts. I encourage you to read the whole challenge before continuing.

We are tasked with building a “computer” to interpret “Intcode” programs:

An Intcode program is a list of integers separated by commas (like 1,0,0,3,99). To run one, start by looking at the first integer (called position 0). Here, you will find an opcode - either 1, 2, or 99. The opcode indicates what to do; for example, 99 means that the program is finished and should immediately halt. Encountering an unknown opcode means something went wrong.

We are provided with 3 opcodes in this part of the task, 1, 2, and 99.

Opcode 1 does the following:

Opcode 1 adds together numbers read from two positions and stores the result in a third position. The three integers immediately after the opcode tell you these three positions - the first two indicate the positions from which you should read the input values, and the third indicates the position at which the output should be stored.

And opcode 2 does:

Opcode 2 works exactly like opcode 1, except it multiplies the two inputs instead of adding them. Again, the three integers after the opcode indicate where the inputs and outputs are, not their values.

With opcode 99 halting the program.

We are also told how to move on to the next operation when done calculating the current opcode:

Once you’re done processing an opcode, move to the next one by stepping forward 4 positions.

It is useful to note that all opcodes and data in this task appear to be integers.

It is also useful to realise that, from the description of this task, we are actually implementing a very simple computer with a Von Neumann architecture: a computer where program instructions and the data they operate on are stored within the same memory. This is the basis of most common computers in use today.
An interesting side-effect of this architecture is that code can be self modifying.

As part of this task we are given some example inputs and their eventual outputs which will be useful when testing our implementation:

Here are the initial and final states of a few more small programs:

  • 1,0,0,0,99 becomes 2,0,0,0,99 (1 + 1 = 2).
  • 2,3,0,3,99 becomes 2,3,0,6,99 (3 * 2 = 6).
  • 2,4,4,5,99,0 becomes 2,4,4,5,99,9801 (99 * 99 = 9801).
  • 1,1,1,4,99,5,6,0,99 becomes 30,1,1,4,2,5,6,0,99.

Our overall task is given at the end as:

Once you have a working computer, the first step is to restore the gravity assist program (your puzzle input) to the “1202 program alarm” state it had just before the last computer caught fire. To do this, before running the program, replace position 1 with the value 12 and replace position 2 with the value 2. What value is left at position 0 after the program halts?

Implementing the Computer

We first need to read in the input program and convert it into a structure our program can use.

import scala.io.Source

val filename = "day2.input.txt"
// Open the input file
val bufferedSource = Source.fromFile(filename)

// Convert the contents into our opcodes
val originalOpcodes: Array[Int] = bufferedSource.mkString
  .trim
  .split(',')
  .map(string => string.toInt)

// Close the input file
bufferedSource.close()

This code will convert the input file into an array of integers ready for us to work with.
The mkString method will load the whole contents of the file into a string, then the trim method removes any leading and trailing whitespace (such as the final newline), with the split and map methods dividing the string up on the commas and converting the pieces to integers.

Now with our 3 given opcodes we should define some kind of structure to make calculating them easier. Since this is a quick puzzle I will opt for defining simple functions and will also use the Scala type keyword to try and make my code easier to understand.

I will be making use of Scala’s mutable indexed type mutable.IndexedSeq to store the working memory of the program that will be read and modified by each operation:

import scala.collection.mutable

type Memory = mutable.IndexedSeq[Int]
type Opcode = Int

// Simple Operation type:
// Taking in the current position in memory and memory itself and outputting
// the new position and whether this operation should halt or not.
type Operation = (Int, Memory) => (Int, Boolean)

Now I can create a simple lookup table of opcodes and their operations:

// The add operation
val addOp: Operation = (pos: Int, memory: Memory) => {
  val inputAddress1 = memory(pos + 1)
  val inputAddress2 = memory(pos + 2)
  val outputAddress = memory(pos + 3)
  memory(outputAddress) = memory(inputAddress1) + memory(inputAddress2)
  (pos + 4, false)
}

// The multiply operation
val multiplyOp: Operation = (pos: Int, memory: Memory) => {
  val inputAddress1 = memory(pos + 1)
  val inputAddress2 = memory(pos + 2)
  val outputAddress = memory(pos + 3)
  memory(outputAddress) = memory(inputAddress1) * memory(inputAddress2)
  (pos + 4, false)
}

// The simple halting operation
val haltOp: Operation = (pos: Int, memory: Memory) => {
  (pos, true)
}

// The map of opcodes to their operations
val opcodeMap = Map[Opcode, Operation](
  (1, addOp),
  (2, multiplyOp),
  (99, haltOp)
)

Now that we have a simple map of opcodes to their operations we need to write the code to execute them:

val errorOp: Operation = (pos: Int, memory: Memory) => {
  val opcode = memory(pos)
  println(s"Unknown opcode encountered at $pos: $opcode")
  (pos, true)
}

@scala.annotation.tailrec
def iterate(pos: Int, memory: Memory): Unit = {
  val opcode = memory(pos)
  val operation = opcodeMap.getOrElse(opcode, errorOp)
  val (newPos, shouldHalt) = operation(pos, memory)
  if (shouldHalt) {
    return
  }
  iterate(newPos, memory)
}

This method will take in a position in memory and the memory itself and execute opcodes on it until it reaches an operation that will cause it to halt.
I have written this recursively to make it easy to read. The @tailrec annotation makes the compiler verify that the recursive call can be turned into a loop, which avoids stack overflow issues.

I have also added an error operation that will be executed upon hitting an unknown opcode.

We can test one of the examples:

val mainMemory: Memory = mutable.IndexedSeq(2,4,4,5,99,0)

iterate(0, mainMemory)

val finalOutput = mainMemory.mkString(",")
println(finalOutput)

This will output: 2,4,4,5,99,9801
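The other examples from the puzzle can be checked in the same way. To keep the check self-contained, here is a compact sketch of an alternative design that uses an immutable Vector instead of mutable memory. It is not the implementation above; the Intcode object and its run method are hypothetical names:

```scala
import scala.annotation.tailrec

object Intcode {
  // Runs the program from the given position and returns the final memory.
  @tailrec
  def run(memory: Vector[Int], pos: Int = 0): Vector[Int] = memory(pos) match {
    case 99 => memory
    case opcode @ (1 | 2) =>
      // The parameters are addresses, not values
      val a = memory(memory(pos + 1))
      val b = memory(memory(pos + 2))
      val result = if (opcode == 1) a + b else a * b
      // updated returns a new Vector, leaving the old one untouched
      run(memory.updated(memory(pos + 3), result), pos + 4)
    case other =>
      sys.error(s"Unknown opcode encountered at $pos: $other")
  }
}

println(Intcode.run(Vector(1, 1, 1, 4, 99, 5, 6, 0, 99)).mkString(","))
// prints 30,1,1,4,2,5,6,0,99
```

The last example is a nice demonstration of self-modifying code: the first instruction overwrites the 99 at position 4 with a 2, which then runs as a multiply.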

We can run this with the file contents by copying the original code to the memory variable:

val mainMemory: Memory = mutable.IndexedSeq(originalOpcodes: _*)

Of course the task also instructs us to fix the program:

To do this, before running the program, replace position 1 with the value 12 and replace position 2 with the value 2.

mainMemory(1) = 12
mainMemory(2) = 2

Then execute it.
And we have the answer to the puzzle in index 0.

Day 2: Part 2

The second part of the day requires us to figure out the inputs to the program that will result in an expected value.

“With terminology out of the way, we’re ready to proceed. To complete the gravity assist, you need to determine what pair of inputs produces the output 19690720.”

Something important noted in the puzzle is that opcodes can move the position in memory a variable number of steps depending on what instructions there are:

The address of the current instruction is called the instruction pointer; it starts at 0. After an instruction finishes, the instruction pointer increases by the number of values in the instruction; until you add more instructions to the computer, this is always 4 (1 opcode + 3 parameters) for the add and multiply instructions. (The halt instruction would increase the instruction pointer by 1, but it halts the program instead.)

This actually means our halt instruction should technically look like:

val haltOp: Operation = (pos: Int, memory: Memory) => {
  (pos + 1, true)
}

The following extra details are provided to narrow down the search:

The inputs should still be provided to the program by replacing the values at addresses 1 and 2, just like before. In this program, the value placed in address 1 is called the noun, and the value placed in address 2 is called the verb. Each of the two input values will be between 0 and 99, inclusive.

This narrows down our search somewhat.

To repeat what we need to do is:

Find the input noun and verb that cause the program to produce the output 19690720. What is 100 * noun + verb? (For example, if noun=12 and verb=2, the answer would be 1202.)

It is also suggested that we should make sure to reset the memory to the original opcodes before each attempt.

To this end we can write a function to make it easier to test various inputs:

def decode(noun: Int, verb: Int, originalMemory: IndexedSeq[Int]): Int = {
  val mainMemory: Memory = mutable.IndexedSeq(originalMemory: _*)
  mainMemory(1) = noun
  mainMemory(2) = verb
  iterate(0, mainMemory)
  mainMemory(0)
}

This will execute for the given noun and verb pair and output the result.

We can then search for the answer to the puzzle by guessing random noun and verb pairs until one produces the expected output:

val random = scala.util.Random
var output: Int = 0
var noun: Int = 0
var verb: Int = 0
while (output != 19690720) {
  noun = random.nextInt(100)
  verb = random.nextInt(100)
  output = decode(noun, verb, originalOpcodes)
}
println(s"noun=$noun verb=$verb")
val answer = 100 * noun + verb
println(answer)

And this will output the solution to part 2!
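Random guessing may test the same pair more than once and never finishes if no pair works. Since there are only 100 * 100 = 10,000 combinations, an exhaustive search is a reasonable alternative. A minimal sketch, where findInputs is a hypothetical name and the program is passed in as a plain function so that the example stands alone:

```scala
// Try every (noun, verb) pair in order and return the first one whose output
// matches the target, or None if no pair does.
def findInputs(target: Int)(run: (Int, Int) => Int): Option[(Int, Int)] = {
  val candidates = for {
    noun <- (0 to 99).iterator
    verb <- (0 to 99).iterator
  } yield (noun, verb)
  candidates.find { case (noun, verb) => run(noun, verb) == target }
}

// Stand-in for a real program, purely for illustration
val example = findInputs(12002)((noun, verb) => noun * 1000 + verb)
println(example) // prints Some((12,2))
```

With the decode function from above this would be called as findInputs(19690720)((noun, verb) => decode(noun, verb, originalOpcodes)).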

Advent of Code 2019 Day 1

I am starting a day late on the 2nd of December, but this hopefully means my solutions will not spoil it for anyone else!

Day 1: Part 1

The first challenge is a simple one; I will copy the whole challenge below:

— Day 1: The Tyranny of the Rocket Equation —

Santa has become stranded at the edge of the Solar System while delivering presents to other planets! To accurately calculate his position in space, safely align his warp drive, and return to Earth in time to save Christmas, he needs you to bring him measurements from fifty stars.

Collect stars by solving puzzles. Two puzzles will be made available on each day in the Advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!

The Elves quickly load you into a spacecraft and prepare to launch.

At the first Go / No Go poll, every Elf is Go until the Fuel Counter-Upper. They haven’t determined the amount of fuel required yet.

Fuel required to launch a given module is based on its mass. Specifically, to find the fuel required for a module, take its mass, divide by three, round down, and subtract 2.

For example:

  • For a mass of 12, divide by 3 and round down to get 4, then subtract 2 to get 2.
  • For a mass of 14, dividing by 3 and rounding down still yields 4, so the fuel required is also 2.
  • For a mass of 1969, the fuel required is 654.
  • For a mass of 100756, the fuel required is 33583.

The Fuel Counter-Upper needs to know the total fuel requirement. To find it, individually calculate the fuel needed for the mass of each module (your puzzle input), then add together all the fuel values.

What is the sum of the fuel requirements for all of the modules on your spacecraft?

As I said this is simple enough; we need to calculate the fuel requirement based on the given formula for each module and sum them all together for our answer.

The formula given is: fuel = floor(mass / 3) - 2 (floor is just a function that rounds down the input).
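The formula can be checked against the four examples from the puzzle. A minimal sketch, where fuelForMass is a hypothetical name; integer division on a positive Long already rounds down, so no explicit floor is needed:

```scala
// fuel = floor(mass / 3) - 2, using integer division for the rounding
def fuelForMass(mass: Long): Long = mass / 3 - 2

println(Seq(12L, 14L, 1969L, 100756L).map(fuelForMass).mkString(","))
// prints 2,2,654,33583
```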

We are given a puzzle input of a text file where each line is a number denoting the mass of a single module e.g.

86870
94449
119448
53472
140668
64989
112056
88880
131335
94943

We can load this into Scala and apply the formula to each line then sum the answer using code similar to the following:

import scala.io.Source

val filename = "input.txt"
// Open the input file
val bufferedSource = Source.fromFile(filename)

// For each line:
val total = bufferedSource.getLines()
  // Convert it to a Long
  .map(line => line.toLong)
  // Apply the formula we were given; integer division on a positive Long
  // already rounds down, so no explicit floor call is needed
  .map(mass => mass / 3 - 2)
  // Sum all results together
  .sum

// Display the total
println(s"Total: $total")

// Close the resource
bufferedSource.close()

Once we have a result it’s on to part 2 of Day 1.

Day 1: Part 2

The puzzle reads as:

— Part Two —

During the second Go / No Go poll, the Elf in charge of the Rocket Equation Double-Checker stops the launch sequence. Apparently, you forgot to include additional fuel for the fuel you just added.

Fuel itself requires fuel just like a module - take its mass, divide by three, round down, and subtract 2.
However, that fuel also requires fuel, and that fuel requires fuel, and so on. Any mass that would require negative fuel should instead be treated as if it requires zero fuel; the remaining mass, if any, is instead handled by wishing really hard, which has no mass and is outside the scope of this calculation.

So, for each module mass, calculate its fuel and add it to the total. Then, treat the fuel amount you just calculated as the input mass and repeat the process, continuing until a fuel requirement is zero or negative.

For example:

  • A module of mass 14 requires 2 fuel. This fuel requires no further fuel (2 divided by 3 and rounded down is 0, which would call for a negative fuel), so the total fuel required is still just 2.
  • At first, a module of mass 1969 requires 654 fuel. Then, this fuel requires 216 more fuel (654 / 3 - 2). 216 then requires 70 more fuel, which requires 21 fuel, which requires 5 fuel, which requires no further fuel. So, the total fuel required for a module of mass 1969 is 654 + 216 + 70 + 21 + 5 = 966.
  • The fuel required by a module of mass 100756 and its fuel is: 33583 + 11192 + 3728 + 1240 + 411 + 135 + 43 + 12 + 2 = 50346.

What is the sum of the fuel requirements for all of the modules on your spacecraft when also taking into account the mass of the added fuel? (Calculate the fuel requirements for each module separately, then add them all up at the end.)

This is a little harder than Part 1.

Now, for each module, we need the fuel required not just for the module itself but also for the fuel we add, the fuel needed to carry that fuel, and so on.

Luckily the way we structure our Scala code makes this easy to do. We can replace our simple fuel calculation with a call to a more complex function before our sum function call:

def calculateFuel(mass: Long): Long = {
  // define the fuel function we will be using
  val fuelFunction = (m: Long) => (floor(m / 3) - 2).toLong

  var total: Long = 0L

  // Loop round adding fuel (first for the module, then for the fuel itself)
  // until the requirement reaches 0 or less, which counts as zero fuel
  var additional: Long = fuelFunction(mass)
  while (additional > 0) {
    total += additional
    additional = fuelFunction(additional)
  }

  // return the total
  total
}

// For each line:
val total = bufferedSource.getLines()
  // Convert it to a Long
  .map(line => line.toLong)
  // Apply the formula we were given
  .map(mass => calculateFuel(mass))
  // Sum all results together
  .sum

This gives us our result, applying that function to each of the masses we are given before totalling them up.
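The same calculation can also be written without mutable variables. The following is a sketch of one alternative using a tail-recursive helper rather than the loop above; `fuelFor` and `totalFuel` are names chosen for illustration only.

```scala
import scala.annotation.tailrec

// Hypothetical helper: fuel for a single mass (Long division rounds down)
def fuelFor(mass: Long): Long = mass / 3 - 2

// Hypothetical alternative to the loop-based version:
// keep adding fuel for the fuel until the requirement is zero or negative
def totalFuel(mass: Long): Long = {
  @tailrec
  def loop(current: Long, acc: Long): Long = {
    val fuel = fuelFor(current)
    if (fuel <= 0) acc else loop(fuel, acc + fuel)
  }
  loop(mass, 0L)
}

println(totalFuel(14L))     // 2
println(totalFuel(1969L))   // 966
println(totalFuel(100756L)) // 50346
```

The three printed values are the examples from the puzzle text, so they make a handy check before running either version against the real input.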

This completes Day 1 of Advent of Code 2019!

Creating a Date Range in Apache Spark Using Scala

Sometimes when dealing with data in Spark you may find yourself needing to join data against a large date range. I have encountered this when needing to take a sparsely populated table (in terms of the dates) and fill in any missing entries with some sensible value, be it a default value (using the na functions) or a previous date’s value (using a windowing function).

Take the following example table:

Date Stock
2019-01-01 0
2019-01-12 10
2019-01-14 9
2019-01-15 8
2019-01-20 10
2019-01-25 7
2019-01-31 5

If we wanted to fill in the gaps in the dates here we’d need a date range between the minimum and maximum dates within this table: 2019-01-01 to 2019-01-31.

Let’s represent this in some Spark Scala code to help illustrate:

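import org.apache.spark.sql.functions.{col, max, min}
import org.apache.spark.sql.types.DateType
// Needed for toDF; assumes a SparkSession named spark is in scope (as in spark-shell)
import spark.implicits._

val sparseData = spark.sparkContext.parallelize(Seq(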
  ("2019-01-01", 0),
  ("2019-01-12", 10),
  ("2019-01-14", 9),
  ("2019-01-15", 8),
  ("2019-01-20", 10),
  ("2019-01-25", 7),
  ("2019-01-31", 5)
)).toDF("date", "stock")
  .withColumn("date", col("date").cast(DateType))

val minMax = sparseData.select("date")
  .agg(min("date").as("min"), max("date").as("max"))

The minMax DataFrame will look like:

min max
2019-01-01 2019-01-31

Now we want to create a DataFrame containing all the dates between min and max, our date range.
One simple way of doing this is to create a UDF (User Defined Function) that will produce a collection of dates between 2 values and then make use of the explode function in Spark to create the rows (see the functions documentation for details).

The following Scala code will create a sequence of java.sql.Date types between 2 dates. Note that I have used the newer java.time classes here and converted between them as I am more comfortable with these classes:

import java.sql.Date
import java.time.{Duration, LocalDate}

/**
 * Create a sequence containing all dates between from and to
 * @param dateFrom The date from
 * @param dateTo The date to
 * @return A Seq containing all dates in the given range
 */
def getDateRange(
    dateFrom: Date,
    dateTo: Date
): Seq[Date] = {
  val daysBetween = Duration
    .between(
      dateFrom.toLocalDate.atStartOfDay(),
      dateTo.toLocalDate.atStartOfDay()
    )
    .toDays

  val newRows = Seq.newBuilder[Date]
  // get all intermediate dates
  for (day <- 0L to daysBetween) {
    val date = Date.valueOf(dateFrom.toLocalDate.plusDays(day))
    newRows += date
  }
  newRows.result()
}
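If you prefer to stay entirely within java.time, the same idea can be sketched without the builder loop. This is an illustrative alternative rather than the function used in the rest of this post: `localDateRange` is a made-up name, and it counts the days with ChronoUnit.DAYS.between instead of Duration.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical java.time-only variant of the date range function
def localDateRange(from: LocalDate, to: LocalDate): Seq[LocalDate] = {
  // Whole days between the two dates (negative if to is before from)
  val days = ChronoUnit.DAYS.between(from, to)
  // An inclusive range of offsets; empty when days is negative
  (0L to days).map(offset => from.plusDays(offset))
}

val january = localDateRange(LocalDate.of(2019, 1, 1), LocalDate.of(2019, 1, 31))
println(january.size) // 31
println(january.head) // 2019-01-01
println(january.last) // 2019-01-31
```

The remainder of this post carries on with getDateRange as defined above, since it already returns the java.sql.Date values that Spark expects.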

With getDateRange defined, we can wrap it in a UDF:

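import org.apache.spark.sql.functions.udf

// The typed form lets Spark infer the array-of-dates return type from Seq[Date].
// The untyped udf(f, dataType) form is deprecated from Spark 3.0 onwards.
val dateRangeUDF = udf(getDateRange _)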

What this UDF will do is create an array column containing all the dates within the given range:

val minMaxWithRange = minMax.withColumn(
  "range",
  dateRangeUDF(col("min"), col("max"))
  )

This will look something like this (I have truncated the range column to make it easier to see):

min max range
2019-01-01 2019-01-31 2019-01-01, 2019-01-02, … 2019-01-30, 2019-01-31

Each entry in the range array will also be typed correctly as a date.

With this array we can make use of the built-in Spark function explode to create rows for us:

import org.apache.spark.sql.functions.explode

val allDates = minMaxWithRange
  .withColumn("date", explode(col("range")))
  .drop("range", "min", "max")

This will produce a DataFrame that looks like the following, with all the dates between 2019-01-01 and 2019-01-31:

date
2019-01-01
2019-01-02
…
2019-01-30
2019-01-31

We can then join this with our original DataFrame:

val joined = sparseData.join(allDates, Seq("date"), "outer")
  .sort("date")
// A sort might be needed if you want your data to remain ordered by date

And proceed to do any filling logic we want for the missing fields.
For example filling them with 0s:

joined.na.fill(0, Seq("stock"))

Or filling them with previous known values, essentially stretching the data along to make a dense table:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, last, when}

val window = Window
  // Note this windows over all the data as a single partition
  .orderBy("date")
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)
val filled = joined.withColumn(
  "stock",
  when(
    col("stock").isNull,
    last("stock", ignoreNulls = true).over(window)
    )
    .otherwise(col("stock"))
)

Hopefully this post is useful to anyone curious about stretching a data set or creating a date range in Apache Spark using Scala.
There are likely alternative ways of doing this, especially in Python where you can potentially make use of external libraries like Pandas to create a range and then send it to Spark to use.
You may even be able to use native Spark functions, which would avoid any potential performance issues with using a UDF.
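As one possible native route, Spark 2.4 and later ship a sequence function that can generate an array of dates directly, which can then be exploded in the same way as the UDF output. The following is a sketch under a few assumptions: it builds its own local SparkSession and a small stand-in for the minMax DataFrame so that it is self-contained, and it has not been benchmarked against the UDF approach.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, expr, sequence, to_date}

// Assumption: a local session for illustration; in spark-shell one already exists as spark
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("date-range-sketch")
  .getOrCreate()
import spark.implicits._

// Stand-in for the minMax DataFrame built earlier in the post
val minMax = Seq(("2019-01-01", "2019-01-31")).toDF("min", "max")
  .select(to_date(col("min")).as("min"), to_date(col("max")).as("max"))

// sequence builds an array of dates from min to max in one-day steps,
// and explode turns that array into one row per date
val allDates = minMax.select(
  explode(sequence(col("min"), col("max"), expr("interval 1 day"))).as("date")
)

allDates.show(31)
```

Because everything here is a built-in column expression, Spark can plan it without calling out to user code, which is the potential advantage over a UDF.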