New Python Tutorial: Diagnose data for cleaning
Here is the first video of our latest course, Cleaning Data in Python by Daniel Chen. Like and comment if you enjoyed the video!
A vital component of data science involves acquiring raw data and getting it into a form ready for analysis. In fact, it is commonly said that data scientists spend 80% of their time cleaning and manipulating data, and only 20% of their time actually analyzing it. This course will equip you with all the skills you need to clean your data in Python, from learning how to diagnose your data for problems to dealing with missing values and outliers. At the end of the course, you’ll apply all of the techniques you’ve learned to a case study in which you’ll clean a real-world Gapminder dataset!
So you’ve just got a brand-new dataset and are itching to start exploring it. But where do you begin, and how can you be sure your dataset is clean? This chapter will introduce you to the world of data cleaning in Python! You’ll learn how to explore your data with an eye for diagnosing issues such as outliers, missing values, and duplicate rows. Try the first chapter for free: https://www.datacamp.com/courses/cleaning-data-in-python
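To give a flavor of what diagnosing a dataset can look like, here is a minimal sketch using pandas. It is not taken from the course itself: the tiny DataFrame, its column names (`country`, `life_exp`), and the choice of the 1.5 × IQR rule for flagging outliers are all assumptions made purely for illustration.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration only.
df = pd.DataFrame({
    "country": ["A", "B", "B", "C", "D"],
    "life_exp": [71.2, 68.5, 68.5, None, 250.0],
})

# Missing values: count the NaN entries in each column.
missing_per_column = df.isna().sum()

# Duplicate rows: count rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())

# Outliers: flag values outside 1.5 * IQR of the quartiles.
# This is one common rule of thumb, not the only possible definition.
q1 = df["life_exp"].quantile(0.25)
q3 = df["life_exp"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["life_exp"] < q1 - 1.5 * iqr) | (df["life_exp"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

print(missing_per_column)
print("Duplicate rows:", duplicate_rows)
print(outliers)
```

On a real dataset you would typically start with `df.head()`, `df.info()`, and `df.describe()` to get an overview before running targeted checks like these.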