Making sense of feature overload to optimize business processes
Dataset: Mercedes-Benz Greener Manufacturing dataset
Goal: Predict how long each car configuration takes to pass testing.
Dataset complexity: Small, high-dimensional dataset prone to the curse of dimensionality, with only 4,209 training rows and a relatively large number of features (377).
Solution: Set Firefly Lab to train up to 1,000 models to arrive at the golden ensemble. Four complementary algorithms formed the ensemble: XGBoost, Ridge Regression, Extra Trees, and Random Forest.
Firefly Lab Result: 0.55792 (R², the competition's scoring metric)
Rank: Exceeded 1st place among 3,835 competing teams (the winner scored 0.55550).
Data Scientist Time: 30 minutes
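
To make the scores above concrete, here is a minimal sketch of how predictions from several models might be blended and then scored. It assumes the reported scores are R² (coefficient of determination) and that the ensemble combines its members by a weighted average. The summary does not say how Firefly Lab actually combines models, so the `blend` and `r2_score` helpers and all numbers below are hypothetical illustrations, not Firefly Lab's method or results.

```python
from statistics import fmean


def blend(predictions, weights=None):
    """Weighted average of per-model predictions.

    `predictions` holds one list of predicted values per model.
    Averaging is an assumption; the source does not describe the
    actual combination rule.
    """
    n_models = len(predictions)
    if weights is None:
        # Default to equal weights across models.
        weights = [1.0 / n_models] * n_models
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*predictions)
    ]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical testing times (seconds) and predictions from four models.
y_true = [100.0, 90.0, 110.0, 95.0]
model_predictions = [
    [98.0, 92.0, 108.0, 97.0],   # stand-in for XGBoost
    [103.0, 88.0, 112.0, 93.0],  # stand-in for Ridge Regression
    [99.0, 91.0, 107.0, 96.0],   # stand-in for Extra Trees
    [101.0, 89.0, 111.0, 94.0],  # stand-in for Random Forest
]

ensemble = blend(model_predictions)
print(f"Ensemble R^2: {r2_score(y_true, ensemble):.5f}")
```

An R² of 1.0 means perfect predictions and 0.0 means no better than always predicting the mean, which is why differences in the third decimal place can separate leaderboard positions.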