Pandas fundamentals every data scientist needs to know | by Mısra Turp | Jan, 2021


To boost your performance and code like a pro

Mısra Turp
Photo by Sid Balachandran on Unsplash

You load your data into a data frame. A data frame consists of rows and columns. One row is one data point and one column is, well, a feature when you think about it in machine learning terms.

Columns are the stars of the Pandas data frame, but rows can also be used for actions. Pandas has built-in functions that apply to each row, and it also gives you the option to iterate through rows.
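Here is a minimal sketch of both row-wise options, assuming a small made-up data frame (the column names `height_cm` and `weight_kg` are only for illustration). `DataFrame.apply` with `axis=1` calls a function once per row, and `DataFrame.iterrows` lets you loop over the rows yourself.

```python
import pandas as pd

# Hypothetical example data, not taken from the article
df = pd.DataFrame({"height_cm": [170, 182, 165], "weight_kg": [65.0, 80.0, 55.0]})

# Built-in row-wise function: with axis=1 the lambda receives one row at a time
df["bmi"] = df.apply(
    lambda row: row["weight_kg"] / (row["height_cm"] / 100) ** 2,
    axis=1,
)

# Explicit iteration: iterrows yields (index label, row as a Series) pairs
row_labels = []
for idx, row in df.iterrows():
    row_labels.append(f"{idx}: {row['height_cm']:.0f} cm")
```

Both approaches visit the rows one by one, so they are easy to read but can get slow on large data frames.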

Image by me.

If you are coming from a programming background, like me, your first instinct would be to write loops (for or while) for everything you want to change in the data frame.
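To make that instinct concrete, here is a rough sketch with a made-up `price` column. The first version is the loop a programmer might reach for; the second expresses the same change once on the whole column. The data and the 1.2 factor are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"price": [10.0, 20.0, 30.0]})

# Loop-style: visit each row label and write the new value back
looped = df.copy()
for i in looped.index:
    looped.loc[i, "price"] = looped.loc[i, "price"] * 1.2

# Column-wise: the same change applied to the entire column in one statement
vectorized = df.copy()
vectorized["price"] = vectorized["price"] * 1.2
```

Both versions end up with the same values; the column-wise one is shorter and leaves the looping to Pandas.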

  • change the format of columns,
  • fill in missing values with a given value,
  • see only a subset of the data frame,
  • use only data points with a certain value in your analysis,
  • exclude data points with a certain combination of values, and much more.
  • a condition statement that returns True or False for each data point/row
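The items above can be sketched in a few lines. This is only an illustration on invented data (the `city`, `year` and `sales` columns are assumptions): `astype` changes a column's format, `fillna` fills missing values, and a condition on a column returns True or False for each row, which can then be used to keep or exclude rows.

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({
    "city": ["Amsterdam", "Berlin", "Amsterdam", "Paris"],
    "year": ["2019", "2020", "2021", "2020"],
    "sales": [100.0, None, 250.0, 80.0],
})

# Change the format of a column: year strings become integers
df["year"] = df["year"].astype(int)

# Fill in missing values with a given value
df["sales"] = df["sales"].fillna(0.0)

# A condition that is True or False for each row
is_amsterdam = df["city"] == "Amsterdam"

# Keep only the rows where the condition holds
amsterdam_only = df[is_amsterdam]

# Exclude rows with a certain combination of values (~ negates the condition)
without_berlin_2020 = df[~((df["city"] == "Berlin") & (df["year"] == 2020))]
```

Note the parentheses around each comparison: `&` binds more tightly than `==`, so leaving them out raises an error or gives the wrong mask.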

Sure, maybe they’re not there for you individually, but they’ve been there for others and you can use the knowledge they have accumulated. There is so much information, so many questions and so many answers that I think it’s nearly impossible to ask a question that hasn’t been resolved yet.
