home.social

#gradientboosting — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #gradientboosting, aggregated by home.social.

  1. 3. #Gradientboosting is VERY interpretable with the #SHAPley method. They are totally misleading by saying their Deep Neural Network is more interpretable and boosting is not interpretable. They are apparently ignorant of these important advances in interpretability, even though they are more than 5 years old now.

    4. Despite a lot of talk about class imbalance, the churn datasets are not very imbalanced, with churn rates of 10%-20%. Really imbalanced data has low single-digit churn rates.

  2. I ran a quick Gradient Boosted Trees vs Neural Nets check using scikit-learn's dev branch, which makes it more convenient to work with tabular datasets that mix numerical and categorical features (e.g. the Adult Census dataset).

    Let's start with the GBRT model. It's now possible to reproduce the SOTA number on this dataset in a few lines of code and 2 s (CV included) on my laptop.

    1/n

    #sklearn #PyData #MachineLearning #TabularData #GradientBoosting #DeepLearning #Python

  3. I'm excited to see #gradientboosting making some news! There is so much #aihype around #llms (and before that it was #deeplearning) but I think that for most #datascientists working in industry the development of #gradientboosting #machinelearning algorithms (like #xgboost and #catboost) is the real revolution and will have a much more long-lived impact on our work.

    nature.com/articles/s41598-022

  4. #MachineLearning lesson of the day: Working with a #gradientboosting model, I got no traction cross-validating hyperparameters like tree depth and number of trees, but different evaluation metrics (e.g. SMSE vs. MAE etc.) had a major impact. Have you tried this?

    IMHO #DNN get all the press because they do sexy human jobs like seeing and processing language. But in the business world of tabular data, #gradientboosting is where the real revolution is happening! #xgboost #catboost #lightgbm #DataScience
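
On the SHAP point in the first post: the idea behind Shapley-value attributions can be sketched by brute force on a tiny tree-style model. This is only an illustration under stated assumptions. The `model` function, the `background` rows and the instance `x` are made up here, and the code enumerates every feature subset, which is not how the SHAP library computes values for real boosted ensembles (it uses faster tree-specific algorithms).

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical stand-in for a boosted ensemble: a sum of three tiny trees."""
    tree1 = 2.0 if x[0] > 0.5 else 0.0
    tree2 = 1.0 if x[1] > 0.5 else -1.0
    # This tree splits on feature 0 first, then feature 2 (an interaction).
    tree3 = (3.0 if x[2] > 0.5 else 0.0) if x[0] > 0.5 else 0.0
    return tree1 + tree2 + tree3


def value(subset, x, background):
    """Mean prediction when features in `subset` are fixed to x's values
    and the remaining features are taken from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(x, background):
    """Exact Shapley values by enumerating all subsets (feasible only for few features)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(set(s) | {i}, x, background)
                without_i = value(set(s), x, background)
                phi[i] += weight * (with_i - without_i)
    return phi


# Hypothetical reference data and instance to explain.
background = [[0, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 0]]
x = [1, 1, 1]

phi = shapley_values(x, background)
base = value(set(), x, background)
for i, p in enumerate(phi):
    print(f"feature {i}: contribution {p:+.3f}")
print(f"base value {base:.3f} + contributions = {base + sum(phi):.3f}")
print(f"model prediction for x = {model(x):.3f}")
```

The per-feature contributions add up to the gap between the prediction for `x` and the average prediction over the background rows, which is the property that makes these attributions easy to read.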
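
On the last post's point about metrics: one way the choice of error measure can matter is through the loss the boosting rounds optimize. The sketch below is a toy, not any library's implementation. It assumes 1-D inputs, decision stumps as weak learners, and a plain gradient step for the absolute loss (real libraries typically refine leaf values further). The data, with a single large outlier, is made up.

```python
from statistics import mean, median


def fit_stump(xs, targets):
    """Least-squares decision stump on 1-D inputs: one threshold, two leaf means."""
    best = None
    cuts = sorted(set(xs))
    for lo, hi in zip(cuts, cuts[1:]):
        thr = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lm, rm = mean(left), mean(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def boost(xs, ys, loss, n_rounds=100, lr=0.2):
    """Gradient boosting with stumps; `loss` is "squared" or "absolute"."""
    # The best constant prediction is the mean for squared loss, the median for absolute loss.
    init = mean(ys) if loss == "squared" else median(ys)
    preds = [init] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        if loss == "squared":
            # Negative gradient of squared loss: the residual.
            grads = [y - p for y, p in zip(ys, preds)]
        else:
            # Negative gradient of absolute loss: the sign of the residual.
            grads = [1.0 if y > p else -1.0 if y < p else 0.0 for y, p in zip(ys, preds)]
        stump = fit_stump(xs, grads)
        stumps.append(stump)
        preds = [p + lr * stump(x) for x, p in zip(xs, preds)]
    return lambda x: init + lr * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    return mean([(y - model(x)) ** 2 for x, y in zip(xs, ys)])


def mae(model, xs, ys):
    return mean([abs(y - model(x)) for x, y in zip(xs, ys)])


# Made-up data: y = x, except for one large outlier at x = 10.
xs = [float(i) for i in range(20)]
ys = [100.0 if x == 10.0 else x for x in xs]

sq_model = boost(xs, ys, "squared")
abs_model = boost(xs, ys, "absolute")

# Compare both models under both metrics, on all points and on the non-outlier points.
clean_xs = [x for x in xs if x != 10.0]
clean_ys = list(clean_xs)
for name, m in [("squared loss", sq_model), ("absolute loss", abs_model)]:
    print(f"{name:>13}: MSE={mse(m, xs, ys):8.2f}  MAE={mae(m, xs, ys):6.2f}  "
          f"MAE on clean points={mae(m, clean_xs, clean_ys):6.2f}")
```

The two runs share the same stumps, learning rate and number of rounds, so any difference in the printed numbers comes from the loss alone.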