Python Pandas Tutorial (Part 4): Filtering – Using Conditionals to Filter Rows and Columns





In this video, we will be learning how to filter our Pandas dataframes using conditionals.

This video is sponsored by Brilliant. Go to https://brilliant.org/cms to sign up for free. Be one of the first 200 people to sign up with this link and get 20% off your premium subscription.

In this Python Programming video, we will be learning how to write conditionals in order to filter our data within our Pandas dataframes. This is a fundamental skill to have when using Pandas because it is one of the first things most people do when starting a new Pandas project. Let’s get started…

The code for this video can be found at:
http://bit.ly/Pandas-04

StackOverflow Survey Download Page – http://bit.ly/SO-Survey-Download

✅ Support My Channel Through Patreon:
https://www.patreon.com/coreyms

✅ Become a Channel Member:
https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g/join

✅ One-Time Contribution Through PayPal:
https://goo.gl/649HFY

✅ Cryptocurrency Donations:
Bitcoin Wallet – 3MPH8oY2EAgbLVy7RBMinwcBntggi7qeG3
Ethereum Wallet – 0x151649418616068fB46C3598083817101d3bCD33
Litecoin Wallet – MPvEBY5fxGkmPQgocfJbxP6EmTo5UUXMot

✅ Corey’s Public Amazon Wishlist
http://a.co/inIyro1

✅ Equipment I Use and Books I Recommend:
https://www.amazon.com/shop/coreyschafer

▶️ You Can Find Me On:
My Website – http://coreyms.com/
My Second Channel – https://www.youtube.com/c/coreymschafer
Facebook – https://www.facebook.com/CoreyMSchafer
Twitter – https://twitter.com/CoreyMSchafer
Instagram – https://www.instagram.com/coreymschafer/

#Python #Pandas


Comment List

  • Corey Schafer
    November 14, 2020

    This is awesome. Corey is the best tutor I have ever experienced.

  • Corey Schafer
    November 14, 2020

    I am getting the error "'Series' objects are mutable, thus they cannot be hashed". What could be the solution for this?

  • Corey Schafer
    November 14, 2020

    Please please make series on scikit learn module

  • Corey Schafer
    November 14, 2020

    Thanks a lot for your superb tutorials man. Just the right amount of info for beginners

  • Corey Schafer
    November 14, 2020

    That was the best quality content on Pandas. Keep it up brother.
    ALL THE BEST.

  • Corey Schafer
    November 14, 2020

    I'm not familiar with truth tables! Would be great if you could do a video on this.

  • Corey Schafer
    November 14, 2020

    BASED

  • Corey Schafer
    November 14, 2020

    Bro you have no idea what help you are giving us.Thank You Keep doing what you do!

  • Corey Schafer
    November 14, 2020

    18:00 na=False means we are going to ignore the NaN values. Am I right?

  • Corey Schafer
    November 14, 2020

    @corey Hi, I am a complete newbie to any programming language. I must say your videos are a great help to me. Can you help me a little? I want the output you got at 3:53 in the video to be pasted into a new Excel sheet. What code should I use? I’ll be grateful to get your reply.

  • Corey Schafer
    November 14, 2020

    Your videos are excellent!!
    I was trying to filter the Age1stCode column and tried every combination, but it's not working. Could you please show the way to achieve this?

    india_df = india_df.infer_objects()

    india_df[ (india_df['Age1stCode'].astype(str))=='16']

  • Corey Schafer
    November 14, 2020

    C:\Users\HP\anaconda3\lib\site-packages\pandas\core\ops\array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison

    res_values = method(rvalues) how to handle this type of error?

  • Corey Schafer
    November 14, 2020

    Thank You!

  • Corey Schafer
    November 14, 2020

    Amazing, thanks Corey.

  • Corey Schafer
    November 14, 2020

    Thank You

  • Corey Schafer
    November 14, 2020

    Thanks for awesome videos on Pandas. I was able to automate few excel reporting at my work.. but stuck with something very complex(its complex for me!). Could you please help on some complex excel calculations using Python.?
    for ex. suppose I have data in below format.
    db_instance Hostname Disk_group disk_path disk_size disk_used header_status
    abc_cr host1 data01 dev/mapper/asm01 240 90 Member
    abc_cr host1 data01 dev/mapper/asm02 240 100 Member
    abc_cr host1 data01 dev/mapper/asm03 240 60 Member
    abc_xy host1 data01 dev/mapper/asm01 240 90 Member
    abc_xy host1 data01 dev/mapper/asm02 240 100 Member
    abc_xy host1 data01 dev/mapper/asm03 240 60 Member
    abc_cr host1 acfs01 dev/mapper/asm04 90 30 Member
    abc_cr host1 acfs01 dev/mapper/asm05 90 60 Member
    abc_xy host1 acfs01 dev/mapper/asm04 90 30 Member
    abc_xy host1 acfs01 dev/mapper/asm05 90 60 Member
    host1 unassigned dev/mapper/asm06 180 0 Candidate
    host1 unassigned dev/mapper/asm07 180 0 Former
    res_du host2 data01 dev/mapper/asm01 240 90 Member
    res_du host2 data01 dev/mapper/asm02 240 100 Member
    res_du host2 data01 dev/mapper/asm03 240 60 Member
    res_hg host2 data01 dev/mapper/asm01 240 90 Member
    res_hg host2 data01 dev/mapper/asm02 240 100 Member
    res_hg host2 data01 dev/mapper/asm03 240 60 Member
    res_pq host2 acfs01 dev/mapper/asm04 90 30 Member
    res_pq host2 acfs01 dev/mapper/asm05 90 60 Member
    res_mn host2 acfs01 dev/mapper/asm04 90 30 Member
    res_mn host2 acfs01 dev/mapper/asm05 90 60 Member
    host2 unassigned dev/mapper/asm06 180 0 Candidate
    host2 unassigned dev/mapper/asm07 180 0 Former

    As you can see, disk_path is duplicated for each host..because of multiple db_instance. (Even though you see similar disk_paths for host1 & host2, but actually they are different disks from storage end.. but admins follow similar name conventions when they configure disks at host side, resulting similar disk_paths for different hosts)
    My queries are, How
    1. to remove duplicates for disks_path for each host?(considering only two columns Hostname & disk_path, that's how I remove duplicates in excel, I am not worried for db_instance)
    2. once we remove duplicates, calculate total size of 'Member' disks… also total size of 'Candidate' and 'Former' disks combined.
    3. to add another column 'Percent used', which is the result of 'disk_used'/'disk_size'*100 for each row.

    Thanks in advance!

  • Corey Schafer
    November 14, 2020

    how to use two or more filter conditions in same time

  • Corey Schafer
    November 14, 2020

    You're the man!

  • Corey Schafer
    November 14, 2020

    filt = df['LanguageWorkedWith'].str.contains('Python', na= False)
    Can the 'contains' method take more than one value? Suppose I want to search for both Python and Java; how can I implement that using the contains method?
    Also, I request that you upload a video on string methods. Thanks!

  • Corey Schafer
    November 14, 2020

    Hey, how can we access the data you used?

  • Corey Schafer
    November 14, 2020

    why do we have sql,can't we do those "table" things using dataframe,and use the mentioned techniques to grab the required data.

  • Corey Schafer
    November 14, 2020

    Your tutorials are well structured and easy to understand. Thank you so much

  • Corey Schafer
    November 14, 2020

    Hey Corey, great tutorials btw! One thing I don't understand is the na = False. Is it when the objects are integers so you get an error?

  • Corey Schafer
    November 14, 2020

    Wow, Love it. Very useful. thanks👍👍🇵🇰🇵🇰

  • Corey Schafer
    November 14, 2020

    You're a great man Corey Schafer, This is the best channel ever on Youtube.
    Thanks form India.

  • Corey Schafer
    November 14, 2020

    your playlist is amazing.
    to be honest i have taken dozens of courses online for python on many platforms but non of those have been completed beyond 20 %. looking forward to gain the skill here at least, as its fun to hear and v understandable

  • Corey Schafer
    November 14, 2020

    i started learning python basics with you and i just cant believe how much i improved myself this year, thanks a lot!

  • Corey Schafer
    November 14, 2020

    @ 17:20, Corey, the doc for the pandas 'contains' function says it has 'regex=True' by default. I realized this when I searched for "C++" instead of "Python" and it gave me a "multiple repeat" error. In this case, it is essential to set regex to False or escape each '+' using "\".

  • Corey Schafer
    November 14, 2020

    slight problem : "Passing list-likes to .loc or [ ] with any missing labels is no longer supported" in Jupyter , but when i try again it works…

  • Corey Schafer
    November 14, 2020

    At 15:40, After filtering by 'Country' col, how can we filter by row? For example, if I only want to see 0-10 rows of this filtered df?

  • Corey Schafer
    November 14, 2020

    Is there a possibility where I can replace "df['column'].str.contains('python')" the 'python key with a regex pattern?

  • Corey Schafer
    November 14, 2020

    Thank you for your videos.

  • Corey Schafer
    November 14, 2020

    I have referred to three video tutorials/courses before this and this is by far the best course on pandas with complete coverage.
    You deserve a million likes for this series. This course is priceless.

  • Corey Schafer
    November 14, 2020

    Your in-depth knowledge in python is astonishing!
    The thing that always amazes me is, when you finish explaining a topic and I ask myself " Now what if I try to do that?" kinda questions and within no seconds I see you explaining the same question! 😳
    Do you mind read your viewers or what? 😅
    Anyways, you are a legend <3

  • Corey Schafer
    November 14, 2020

    best course seriously!!! Quality content.You could have put it on Udemy and earn more but you put it on youtube. Thanks : ).Please do video on Sklearn, Tensorflow.

  • Corey Schafer
    November 14, 2020

    So far so good explanation..I have been watching the series since its started…I came again to revise.
    just a quick suggestion from my side…please give a suitable name to filter variable so that it will look readable if someone wants to understand whats happening.

  • Corey Schafer
    November 14, 2020

    @Corey Schafer your content are really awesome. can you please tell me where can i practice all this concepts.

  • Corey Schafer
    November 14, 2020

    Corey, you're a god of teaching, you don't leave out anything unexplained. Thank you

  • Corey Schafer
    November 14, 2020

    i like his kinda southern accent lmao

  • Corey Schafer
    November 14, 2020

    THANKS! finally I get the filtering from pandas.

  • Corey Schafer
    November 14, 2020

    case:1 filt=(df[' last']=='Doe')
    df.loc[filt ] # It works
    # but when we try this…
    case:2 filt=df[' last ']
    df.loc[' filt ']
    # It doesn't work what's the difference…in both cases

  • Corey Schafer
    November 14, 2020

    Hey! at 6:30 how to get only email not the index number

  • Corey Schafer
    November 14, 2020

    Great series. I would recommend going through these tutorials once with the given dataset, and then going through it again using a dataset you're interested in. Dw, it will go alot faster the 2nd time around, and you'll get great practice actually finding insights you care about.

  • Corey Schafer
    November 14, 2020

    this brilliant.org might actually not be a bad idea..

  • Corey Schafer
    November 14, 2020

    19:22 let us know when you do a video on strings! Can't wait for that and if you need any suggestions on any of the methods in the pandas …/text.html let us know! Thanks for all your hard work Corey!
