Best practices with pandas (video series)

May 23, 2018 · Python tutorial

At the PyCon 2018 conference, I presented a tutorial called “Using pandas for Better (and Worse) Data Science”. Through a series of exercises, I demonstrated best practices with pandas to help students become more fluent at using pandas to answer data science questions and avoid data science errors.

I split the tutorial into 10 videos. The first video introduces the tutorial and the dataset, and the other 9 videos contain the exercises we discuss. I recommend that you watch the videos in order:

  1. Introducing the dataset (19:40)
  2. Removing columns (6:27)
  3. Comparing groups (8:42)
  4. Examining relationships (8:44)
  5. Handling missing values (5:02)
  6. Using string methods (5:55)
  7. Combining dates and times (9:11)
  8. Plotting a time series (8:48)
  9. Creating useful plots (8:47)
  10. Fixing bad data (16:31)

If you want to follow along with the exercises at home, you can download the dataset and code from GitHub. The dataset was collected by the Stanford Open Policing Project, and includes a decade of traffic stop data from the state of Rhode Island.

This is an intermediate tutorial, so if you’re brand new to pandas, I recommend that you start with my other video series, Easier data analysis in Python with pandas.

Please enjoy the series, and I hope to hear from you in the comments section!

Embedded videos with descriptions

1. Introducing the dataset (19:40)

This video covers the following topics: reading a CSV file, DataFrame shape, data types, NaN, missing values, booleans.
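
As a rough sketch of these topics (not the code from the video), the snippet below reads a tiny CSV held in memory and inspects its shape, data types, and missing values. The column names and rows are made up for illustration and are not taken from the real dataset.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for the real CSV file;
# the column names and values are invented for this sketch.
csv_text = """stop_date,driver_gender,search_conducted,county
2005-01-02,M,False,
2005-01-18,F,True,
2005-01-23,,False,
"""

stops = pd.read_csv(io.StringIO(csv_text))

print(stops.shape)         # (rows, columns)
print(stops.dtypes)        # data type that pandas inferred for each column
print(stops.isna().sum())  # count of missing values (NaN) per column
```

Because `isna()` returns booleans and `True` counts as 1, summing it gives the number of missing values in each column.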

2. Removing columns (6:27)

This video covers the following topics: missing values, dropping a column, axis parameter, inplace parameter, dropna method.
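
Here is a minimal sketch of dropping a column and dropping rows with missing values. The DataFrame is a hypothetical stand-in with invented column names, and the results are reassigned to new variables, which is one alternative to passing `inplace=True`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "county" contains only missing values.
stops = pd.DataFrame({
    "driver_gender": ["M", "F", None],
    "violation": ["Speeding", "Equipment", "Speeding"],
    "county": [np.nan, np.nan, np.nan],
})

# Drop one column by name; axis="columns" says the label refers to a column.
trimmed = stops.drop("county", axis="columns")

# Alternative: drop every column in which all values are missing.
trimmed_alt = stops.dropna(axis="columns", how="all")

# Drop rows that are missing a value in a specific column.
complete = trimmed.dropna(subset=["driver_gender"])

print(trimmed.columns.tolist())
print(len(complete))
```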

3. Comparing groups (8:42)

This video covers the following topics: filtering a DataFrame, value_counts method, normalization, groupby method.
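
The sketch below shows two ways of comparing groups: filtering first and then counting, or using `groupby` to do it for every group at once. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical toy data with invented column names.
stops = pd.DataFrame({
    "driver_gender": ["M", "M", "M", "F", "F"],
    "violation": ["Speeding", "Speeding", "Equipment", "Speeding", "Speeding"],
})

# Filter to one group, then compute proportions instead of raw counts.
female = stops[stops["driver_gender"] == "F"]
female_share = female["violation"].value_counts(normalize=True)

# The same comparison for all groups in a single step.
by_gender = stops.groupby("driver_gender")["violation"].value_counts(normalize=True)

print(female_share)
print(by_gender)
```

Normalizing matters when the groups differ in size, because raw counts would mostly reflect how many rows each group has.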

4. Examining relationships (8:44)

This video covers the following topics: value_counts method, math with booleans, groupby with multiple columns, correlation versus causation.
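
To illustrate math with booleans and grouping by more than one column, here is a small sketch on invented data. The mean of a boolean column is the proportion of `True` values, so it can be read as a rate.

```python
import pandas as pd

# Hypothetical toy data with invented column names.
stops = pd.DataFrame({
    "driver_gender": ["M", "M", "F", "F", "M", "F"],
    "violation": ["Speeding", "Speeding", "Speeding",
                  "Equipment", "Equipment", "Equipment"],
    "search_conducted": [True, False, False, True, True, False],
})

# Mean of a boolean column = share of rows where it is True.
overall_rate = stops["search_conducted"].mean()

# Rate within each combination of violation and gender.
rate_by_group = (
    stops.groupby(["violation", "driver_gender"])["search_conducted"].mean()
)

print(overall_rate)
print(rate_by_group)
```

A difference between groups in a table like this shows a relationship, not a cause; other columns may explain it.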

5. Handling missing values (5:02)

This video covers the following topics: math with booleans, value_counts method, filtering a DataFrame, dropna parameter.
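
As a small illustration of the `dropna` parameter, the sketch below counts values with and without the missing entries. The Series is hypothetical and exists only to show the behavior.

```python
import numpy as np
import pandas as pd

# Hypothetical column in which some values are missing.
outcomes = pd.Series(["Arrest", np.nan, "Citation", np.nan, "Citation"])

# By default, value_counts leaves missing values out of the result.
counted = outcomes.value_counts()

# dropna=False adds a row that counts the missing values too.
counted_with_missing = outcomes.value_counts(dropna=False)

# Math with booleans: the mean of isna() is the share of missing values.
missing_share = outcomes.isna().mean()

print(counted_with_missing)
print(missing_share)
```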

6. Using string methods (5:55)

This video covers the following topics: searching strings, math with booleans, value_counts method, dropna parameter.
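
Here is a sketch of searching strings with the `.str` accessor, using a made-up column in which one cell can list several comma-separated values. Passing `na=False` is my choice for this example: it treats missing values as non-matches so that the result is purely boolean.

```python
import numpy as np
import pandas as pd

# Hypothetical column; a cell may contain more than one comma-separated value.
search_type = pd.Series([
    "Inventory",
    "Protective Frisk",
    "Incident to Arrest,Protective Frisk",
    np.nan,
])

# Substring search; na=False marks missing values as "no match".
is_frisk = search_type.str.contains("Protective Frisk", na=False)

print(is_frisk.sum())   # number of matching rows
print(is_frisk.mean())  # share of all rows that match
```

Whether missing rows belong in the denominator depends on the question being asked, so that choice is worth making deliberately.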

7. Combining dates and times (9:11)

This video covers the following topics: string slicing, string concatenation, converting to datetime format, datetime attributes, value_counts method.
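
The following sketch combines a date column and a time column, both stored as strings, into a single datetime column. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data: dates and times stored as separate strings.
stops = pd.DataFrame({
    "stop_date": ["2005-01-02", "2005-03-14"],
    "stop_time": ["01:55", "23:15"],
})

# String slicing: the first four characters of each date are the year.
year_text = stops["stop_date"].str.slice(0, 4)

# String concatenation: join date and time with a space in between.
combined = stops["stop_date"].str.cat(stops["stop_time"], sep=" ")

# Convert the combined strings to datetime format.
stops["stop_datetime"] = pd.to_datetime(combined)

# Datetime attributes are available through the .dt accessor.
hours = stops["stop_datetime"].dt.hour
months = stops["stop_datetime"].dt.month

print(stops.dtypes)
print(hours.value_counts())
```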

8. Plotting a time series (8:48)

This video covers the following topics: math with booleans, groupby method, datetime attributes, line plots.
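
As a sketch of how a time series can be prepared for plotting, the code below groups a boolean column by the year of a datetime column. The data and column names are invented, and the plotting call is left as a comment because it needs matplotlib installed.

```python
import pandas as pd

# Hypothetical toy data with invented column names.
stops = pd.DataFrame({
    "stop_datetime": pd.to_datetime([
        "2005-02-01 10:00",
        "2005-07-15 22:30",
        "2006-03-09 08:45",
        "2006-11-20 17:05",
    ]),
    "drugs_related_stop": [False, True, True, True],
})

# Group by a datetime attribute; the mean of booleans gives a yearly rate.
yearly_rate = (
    stops.groupby(stops["stop_datetime"].dt.year)["drugs_related_stop"].mean()
)

print(yearly_rate)

# With matplotlib installed, a Series like this draws as a line plot:
# yearly_rate.plot()
```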

9. Creating useful plots (8:47)

This video covers the following topics: datetime attributes, value_counts method, line plots, sorting, groupby method.

10. Fixing bad data (16:31)

This video covers the following topics: value_counts method, filtering by multiple conditions, missing values, NaN, loc accessor, SettingWithCopyWarning.
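
Here is a minimal sketch of replacing bad values with NaN by using the `loc` accessor. The data is hypothetical, and treating "1" and "2" as the bad values is an assumption made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "1" and "2" stand in for invalid entries.
stops = pd.DataFrame({
    "stop_duration": ["0-15 Min", "1", "16-30 Min", "2", "30+ Min"],
})

# Combine conditions with | (or) and & (and), wrapping each in parentheses.
is_bad = (stops["stop_duration"] == "1") | (stops["stop_duration"] == "2")

# Select rows and a column in one loc call, then assign.
# Chained indexing such as stops[is_bad]["stop_duration"] = np.nan may
# assign to a temporary copy, which is the situation that
# SettingWithCopyWarning warns about.
stops.loc[is_bad, "stop_duration"] = np.nan

print(stops["stop_duration"].value_counts(dropna=False))
```

Doing the row and column selection in a single `loc` call means the assignment goes to the original DataFrame.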

P.S. If you liked this video series, I recommend checking out my tutorial from PyCon 2019, Data science best practices with pandas!


