How do I find and remove duplicate rows in pandas?
During the data cleaning process, you will often need to figure out whether you have duplicate data, and if so, how to deal with it. In this video, I’ll demonstrate the two key methods for finding and removing duplicate rows, as well as how to modify their behavior to suit your specific needs.
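As a rough sketch of the two methods in question, `duplicated` and `drop_duplicates`, here is how they behave on a small DataFrame. The sample data and column names below are made up for illustration and are not taken from the video.

```python
import pandas as pd

# Hypothetical sample data (an assumption for illustration only)
users = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Cy", "Bob"],
    "zip_code": ["10001", "20002", "10001", "30003", "99999"],
})

# duplicated() returns a boolean Series: True for each row that is
# identical (across all columns) to a row that appeared earlier
dup_mask = users.duplicated()

# keep=False marks every member of a duplicate group, not just the repeats
dup_all = users.duplicated(keep=False)

# subset limits the comparison to the listed columns
dup_by_name = users.duplicated(subset=["name"])

# Summing the boolean mask counts the duplicate rows
n_duplicates = int(dup_mask.sum())

# drop_duplicates() returns a new DataFrame without the flagged rows;
# the original DataFrame is left unchanged
deduped = users.drop_duplicates()

# The same keep and subset options apply when dropping
deduped_by_name = users.drop_duplicates(subset=["name"], keep="first")
```

By default both methods use `keep="first"`, so the first occurrence is treated as the original and later copies as duplicates. Passing `keep="last"` reverses that, and `keep=False` treats every copy as a duplicate.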
== RESOURCES ==
GitHub repository for the series: https://github.com/justmarkham/pandas-videos
“duplicated” documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.duplicated.html
“drop_duplicates” documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html