Kaggle Challenges

Making sense of feature overload to optimize business processes

Kaggle Challenge
Topic: Reliability Prediction

Dataset: Mercedes-Benz Greener Manufacturing dataset

Goal: Predict how long each car configuration takes to pass testing.

Dataset complexity: A small dataset subject to the curse of dimensionality: only 4,209 training rows against a relatively large number of features (377).

Solution: Firefly Lab was set to train up to 1,000 models to arrive at the golden ensemble. Four complementary algorithms formed the ensemble: XGBoost, Ridge Regression, Extra Trees, and Random Forest.
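
To illustrate what an ensemble of these four algorithm families can look like, here is a minimal sketch using scikit-learn. It is not Firefly Lab's actual pipeline, and it makes several assumptions: the data is synthetic (not the Mercedes-Benz dataset), the models are combined by simple averaging via VotingRegressor, the hyperparameters are arbitrary, and scikit-learn's GradientBoostingRegressor stands in for XGBoost so the example needs only one library.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): few rows relative to the feature count.
X, y = make_regression(
    n_samples=400, n_features=100, n_informative=10, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One model per algorithm family named in the text; predictions are averaged.
ensemble = VotingRegressor(
    estimators=[
        # Assumption: gradient boosting used here as a stand-in for XGBoost.
        ("boosting", GradientBoostingRegressor(random_state=0)),
        ("ridge", Ridge(alpha=1.0)),
        ("extra_trees", ExtraTreesRegressor(n_estimators=100, random_state=0)),
        ("random_forest", RandomForestRegressor(n_estimators=100, random_state=0)),
    ]
)
ensemble.fit(X_train, y_train)

predictions = ensemble.predict(X_test)
score = r2_score(y_test, predictions)
print(f"Hold-out R^2 of the averaged ensemble: {score:.3f}")
```

Mixing a linear model with tree-based models is a common design choice because their errors tend to differ, so averaging can cancel some of them out. A tool that searches many candidate models would typically also tune hyperparameters and blending weights, which this sketch leaves out.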

Firefly Lab Results: 0.55792

Rank: Exceeded the 1st-place score among 3,835 competing teams (the winner scored 0.55550).
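
For readers who want to see how such scores are computed and compared, the sketch below assumes the figures above are R² (coefficient of determination) values, which is the metric this Kaggle competition is understood to have used for its leaderboard. The helper name r_squared and the toy test times are hypothetical and chosen only for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical testing times in seconds (not taken from the dataset).
y_true = [90.0, 100.0, 110.0, 120.0]
y_pred = [95.0, 100.0, 105.0, 120.0]
toy_score = r_squared(y_true, y_pred)

# Scores quoted in the text; the margin is their difference.
firefly_score = 0.55792
winner_score = 0.55550
margin = round(firefly_score - winner_score, 5)

print(f"Toy R^2: {toy_score:.3f}")
print(f"Margin over the winning score: {margin:.5f}")
```

A score of 1.0 would mean perfect predictions, while 0.0 corresponds to always predicting the mean, so a margin of this size reflects a small difference in explained variance.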

Data Scientist Time: 30 minutes