Sudeep's Blog

Disorganized Thoughts in Organized Manner

Kaggle's CauseEffect Competition
Kaggle's CauseEffect Pair Competition has ended, and I ranked 19th out of 269 teams, my second top 10% finish. Here is a link to the competition.

Here is a brief description of the contest. You are given a large number of A-B observation pairs, where each pair is itself a list of samples (a_i, b_i). Each pair carries one of three labels: A->B (A is a cause of B), B->A (B is a cause of A), or A-B (neither; the two could be independent, or both could be driven by a third common cause, and so on). The aim of the contest was to predict the same labels (A->B, B->A, A-B) for new, unseen pairs.

The competition involved a lot of feature engineering. The learning part was mostly handled by off-the-shelf algorithms; I used gradient boosting. What separated the winners was the quality of their features. For example, one of the winners derived thousands of features from each pair.

Here is a discussion of everyone's approaches. Finally, here is a comprehensive list of what worked and what did not, compiled by the hosts.
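To give a flavour of what "feature engineering on a pair" means, here is a minimal sketch in Python. It is not the feature set I used in the contest, and certainly not a winner's; the function names and the particular features (sample count, Pearson correlation, skewness of each variable, and their difference) are assumptions picked for illustration. The idea it shows is that some features are symmetric in A and B, while others change sign when the pair is swapped, and it is the second kind that can hint at a direction.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _std(xs):
    # Population standard deviation.
    m = _mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def _skew(xs):
    # Third standardized moment; 0.0 for a constant sequence.
    m, s = _mean(xs), _std(xs)
    if s == 0:
        return 0.0
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def _pearson(a, b):
    # Pearson correlation; 0.0 if either variable is constant.
    ma, mb, sa, sb = _mean(a), _mean(b), _std(a), _std(b)
    if sa == 0 or sb == 0:
        return 0.0
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (sa * sb)


def pair_features(a, b):
    """Turn one (a_i, b_i) pair into a small feature dictionary.

    Hypothetical features, for illustration only.
    """
    skew_a, skew_b = _skew(a), _skew(b)
    return {
        "n_samples": len(a),           # symmetric in A and B
        "pearson": _pearson(a, b),     # symmetric in A and B
        "skew_a": skew_a,
        "skew_b": skew_b,
        "skew_diff": skew_a - skew_b,  # flips sign when A and B are swapped
    }


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 10.0]
    b = [2.0, 1.0, 4.0, 3.0, 5.0]
    print(pair_features(a, b))
    print(pair_features(b, a))
```

In a full pipeline, one such feature vector would be computed per pair and the resulting table handed to an off-the-shelf classifier (for instance scikit-learn's `GradientBoostingClassifier`), with the three labels as the target.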