Deep Learning for Tabular Data: A Bag of Tricks | ODSC 2020





Jason McGhee, Senior Machine Learning Engineer at DataRobot, has been spending time applying deep learning and neural networks to tabular data. Although deep learning can be challenging to apply, his research shows how valuable it can be on tabular datasets. In this video (adapted from his presentation at ODSC Boston 2020), Jason shares some important techniques for implementing deep learning on heterogeneous tabular data. Learn more about Jason’s findings and ask him questions at his DataRobot Community post: https://community.datarobot.com/t5/ai-ml-general-blog/deep-learning-for-tabular-data-a-bag-of-tricks/ba-p/4593

Table of Contents

Motivation: 0:15
Impute missing values: 1:37
Prepare categoricals, text, and numerics: 2:49, 3:10, 3:31
Properly validate: 3:54
Establish a benchmark: 5:24
Start with a low capacity network: 6:10
Determine output activation and loss function for classification and regression: 7:17, 8:26
Determine hidden activation: 9:46
Choose batch size: 10:57
Build learning rate schedule: 12:02
Determine number of epochs: 14:35
Track and interpret regression predictions: 15:30
Track metric and/or loss: 16:09
Track and interpret classification predictions: 16:45
Benchmark the network: 17:11
Dealing with discontinuities: 18:16
Tuning the network: 19:31
Handling overfitting vs. underfitting: 20:41
All tricks in one place: 21:35

Music for this video: https://www.bensound.com.

Stay connected with DataRobot!
Blog: https://blog.datarobot.com/
Community: https://community.datarobot.com/
Twitter: https://twitter.com/DataRobot
LinkedIn: https://www.linkedin.com/company/datarobot/
Facebook: https://www.facebook.com/datarobotinc
Instagram: https://www.instagram.com/datarobot/




Comment List

  • DataRobot
    December 12, 2020

    I’ve also noticed a lack of emphasis on tabular data with respect to NNs. This is a great presentation and very informative. Thanks for putting it together.

  • DataRobot
    December 12, 2020

    You helped me A LOT. Amazing content and perfect presentation, keep it up.

  • DataRobot
    December 12, 2020

    Nicely done 👌

  • DataRobot
    December 12, 2020

    3Blue1Brown style 🙂

  • DataRobot
    December 12, 2020

    What about 1D sensory data collected from physical and chemical instruments? I know we can still treat it as tabular data, but what about when we have thousands of variables and only hundreds of samples, and the variables are not single identities but more like groups of features? How should the data analysis be handled?

  • DataRobot
    December 12, 2020

    This helped me a lot. Thank you 🙏😍

  • DataRobot
    December 12, 2020

    This is gold, keep it up.

  • DataRobot
    December 12, 2020

    Thank you for this. As a materials science researcher dabbling in applying ML techniques to my datasets, this is great.

  • DataRobot
    December 12, 2020

    I dig the background music
