Recent Posts

Advent of Code 2019 Day 1

This year I have decided to try the coding challenges on the Advent of Code website in Scala, and possibly Spark if needed (or if an interesting solution arises).
These are simple little coding challenges released once per day, like an Advent calendar, in the run-up to Christmas.

Continue reading...

Creating a Date Range in Apache Spark Using Scala

Sometimes when dealing with data in Spark you may find yourself needing to join data against a large date range. I have encountered this when needing to take a table that is sparsely populated in terms of dates and fill in any missing entries with some sensible value, be it a default value (using the na functions) or a previous date's value (using a windowing function).

Continue reading...

Designing a Comment System

In my previous post I mentioned one possible solution for adding comments to my blog using the built-in support for data files in Jekyll. This approach was pioneered by Damien Guard.
In this post I hope to have a crack at designing such a system myself and implementing it.

Continue reading...

Comments on a Static Blog

I have been considering adding comments to my blog for a little while now.

Continue reading...

Extracting Published Dates from Web Pages

One objectively useful piece of information often present on news articles and blog posts online is the date of publication.
It can be used to determine how fresh and relevant an article is and, when used in conjunction with other processing, allows you to get a feel for the subject of the article, be it a company, person or event.

Continue reading...