Machine Learning: I did it, and so can you.

As promised, here’s the Python notebook I used to generate my first set of predictions using machine learning: passenger survival in the Titanic data set (Kaggle). The Titanic training data set contains 891 rows (passengers), each with 12 columns, including name, sex, age, ticket number, passenger class, cabin number, port of embarkation and number of siblings/spouses aboard. It is a relatively small data set, but fun to play with as a novice without feeling overwhelmed! After some quick exploration of the data, I begin with some basic data munging. This is to prepare the data set for fitting...
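As a rough illustration of what "basic data munging" can mean for this kind of data, here is a minimal sketch. It is not the notebook's actual code. It uses only the Python standard library, the sample rows are made up (only the column names follow Kaggle's `train.csv`), and the choices of median age imputation, the numeric port codes and the hypothetical `munge` helper are all assumptions for the example.

```python
import csv
import io
from statistics import median

# Hypothetical sample rows in the column layout of Kaggle's Titanic train.csv.
# The values are invented for illustration, not taken from the real file.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T3,7.92,,
4,0,2,Passenger D,male,30,0,0,T4,13.00,,Q
"""


def munge(rows):
    """Turn raw string rows into numeric feature dicts a model could be fitted on."""
    rows = list(rows)

    # Assumption: fill missing ages with the median of the known ages.
    known_ages = [float(r["Age"]) for r in rows if r["Age"]]
    fill_age = median(known_ages)

    # Assumption: arbitrary integer codes for the three embarkation ports.
    port_codes = {"S": 0, "C": 1, "Q": 2}

    cleaned = []
    for r in rows:
        cleaned.append({
            "Pclass": int(r["Pclass"]),
            "Sex": 1 if r["Sex"] == "female" else 0,  # encode text as 0/1
            "Age": float(r["Age"]) if r["Age"] else fill_age,
            "SibSp": int(r["SibSp"]),
            # Assumption: a missing port falls back to the code for "S".
            "Embarked": port_codes.get(r["Embarked"], 0),
            "Survived": int(r["Survived"]),  # the label to predict
        })
    return cleaned


features = munge(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in features:
    print(row)
```

In a notebook this would more commonly be done with pandas. The standard-library version is used here only so the sketch runs without extra dependencies.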



Data Science: My Journey from Doctor to Noob Kaggler.

In the last couple of months, I’ve been pursuing another hobby: data science and artificial intelligence (AI). You might have followed the much-publicised match between Lee Sedol, one of the world’s best Go players, and AlphaGo, an AI system developed by Google, which spurred discussions about the future of AI. (Spoiler: AlphaGo won 4–1.) My interest has been partly triggered by my work at Holmusk, where our brilliant data science team has been winning multiple competitions and most recently took part in the Second Annual Data Science Bowl on Kaggle. This was a challenge...
