“You can have data without information, but you cannot have information without data.” – Daniel Keys Moran

Selected Publications

Recent Posts


(best viewed on original website) Over the last year, my focus has been diverted from exploring analytics, new packages and blogging to completing my dissertation. With the dissertation now complete and only final edits remaining, I had some spare time to spend on projects that I have been curating throughout the year. One such project, which has been in the back of my mind for the last couple of months, concerns itself with faster, scalable machine learning.


H2O + AWS + purrr (Part III) This is the final installment of a three-part series that looks at how we can leverage AWS, H2O and purrr in R to build analytical pipelines. In the previous posts I looked at starting up the environment through the EC2 dashboard on AWS’ website. The other aspect we looked at, in Part II, was how we can use purrr to train models using H2O’s awesome API.


H2O + AWS + purrr (Part II) This is the second installment in a three-part series on integrating H2O, AWS and p(f)urrr. In Part II, I will showcase how we can combine purrr and h2o to train and stack ML models. In the first post we looked at starting up an AMI on AWS, which acts as the infrastructure upon which we will model. Part I of the series can be found here


Packages used in this post Disclaimer: I am no financial advisor, have never been one, and you should not take any of this analysis as investment advice. These thoughts are my own; please don't mail me about your money strategies/problems. I enjoy numbers, scraping and data analysis, and that is what this post is about. Also, do you really want to trust someone who wrote this post at 2am… no you don't


H2O + AWS + purrr (Part I) In these small tutorials, to follow over the next 3 weeks, I go through the steps of using an AWS AMI RStudio instance to run a toy machine learning example on a large AWS instance. I have to admit that you have to know a little bit about AWS to follow the next couple of steps, but be assured it doesn’t take too much googling to find your way if you get confused at any stage by some of the jargon.




A collection of analytical/helper functions that I have collected over years of coding in R. This package has now become my base library whenever I start a project. Use at your own risk ;-)


RInno makes it easy to install local shiny apps by providing an interface between R and Inno Setup, an installer for Windows programs (sorry Mac and Linux users). It is designed to be simple to use (two lines of code at a minimum), yet comprehensive.

Pareto's Playground

A blog that I helped to build and maintain while at Eighty20 as a Data Scientist. This was before the days of Hugo and blogdown.



Teaching forms an integral part of what I enjoy.

I am an assistant lecturer for the following courses at the University of Stellenbosch:


  • Introduction to Algorithm Based Typography
  • Economic and Development Problems in Sub-Sahara Africa
  • Introduction to Excel: Basic data analytics and tools for efficiency
  • Introduction to R: What is programming and data analytics
  • Reproducible Research: Integrating R, RStudio, GitHub and Markdown into your research


University of Cape Town

  • BUS4053H - Quantitative Finance Project

University of Stellenbosch