Python Pandas Tutorial (Part 4): Filtering – Using Conditionals to Filter Rows and Columns
In this video, we will be learning how to filter our Pandas dataframes using conditionals.
This video is sponsored by Brilliant. Go to https://brilliant.org/cms to sign up for free. Be one of the first 200 people to sign up with this link and get 20% off your premium subscription.
In this Python Programming video, we will be learning how to write conditionals in order to filter our data within our Pandas dataframes. This is a fundamental skill to have when using Pandas because it is one of the first things most people do when starting a new Pandas project. Let’s get started…
The code for this video can be found at:
http://bit.ly/Pandas-04
StackOverflow Survey Download Page – http://bit.ly/SO-Survey-Download
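As a rough sketch of the idea covered in the video: a comparison on a column produces a boolean Series (a mask), and passing that mask to .loc keeps only the rows where it is True. The small dataframe below is made-up sample data with hypothetical names and addresses, not the Stack Overflow survey.

```python
import pandas as pd

# Hypothetical sample data (assumption: not the survey used in the video).
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
    "email": ["corey@example.com", "jane@example.com", "john@example.com"],
})

# Comparing a column to a value returns a boolean Series (the mask).
filt = df["last"] == "Doe"

# .loc keeps the rows where the mask is True; the second argument selects columns.
doe_emails = df.loc[filt, "email"]

# The tilde negates the mask, keeping the rows that do NOT match.
not_doe = df.loc[~filt]

print(doe_emails)
print(not_doe)
```

Giving the mask its own variable keeps the selection readable, and .loc lets you pick rows and columns in one step.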
✅ Support My Channel Through Patreon:
https://www.patreon.com/coreyms
✅ Become a Channel Member:
https://www.youtube.com/channel/UCCezIgC97PvUuR4_gbFUs5g/join
✅ One-Time Contribution Through PayPal:
https://goo.gl/649HFY
✅ Cryptocurrency Donations:
Bitcoin Wallet – 3MPH8oY2EAgbLVy7RBMinwcBntggi7qeG3
Ethereum Wallet – 0x151649418616068fB46C3598083817101d3bCD33
Litecoin Wallet – MPvEBY5fxGkmPQgocfJbxP6EmTo5UUXMot
✅ Corey’s Public Amazon Wishlist
http://a.co/inIyro1
✅ Equipment I Use and Books I Recommend:
https://www.amazon.com/shop/coreyschafer
▶️ You Can Find Me On:
My Website – http://coreyms.com/
My Second Channel – https://www.youtube.com/c/coreymschafer
Facebook – https://www.facebook.com/CoreyMSchafer
Twitter – https://twitter.com/CoreyMSchafer
Instagram – https://www.instagram.com/coreymschafer/
#Python #Pandas
This is awesome. Corey is the best tutor I have ever experienced.
I am getting the error "'Series' objects are mutable, thus they cannot be hashed". What could be the solution for this?
Please please make series on scikit learn module
Thanks a lot for your superb tutorials man. Just the right amount of info for beginners
That was the best quality content on Pandas. Keep it up brother.
ALL THE BEST.
I'm not familiar with truth tables! Would be great if you could do a video on this.
BASED
Bro you have no idea what help you are giving us.Thank You Keep doing what you do!
18:00 na=False means we are going to ignore the NaN values. Am I right?
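A small sketch of what na=False appears to do, using made-up data: rather than skipping missing values, it fills them with False in the resulting mask, so those rows are simply excluded from the filtered result. The column name matches the one used in the video; the values are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value (object dtype, as when read from CSV).
df = pd.DataFrame({
    "LanguageWorkedWith": pd.Series(["Python;Java", np.nan, "C#;SQL"], dtype=object)
})

# Without na=..., the missing row stays missing in the result,
# and such a mask cannot be used to select rows.
raw = df["LanguageWorkedWith"].str.contains("Python")

# With na=False, the missing row is treated as "does not contain".
filt = df["LanguageWorkedWith"].str.contains("Python", na=False)

python_rows = df.loc[filt]
print(filt)
print(python_rows)
```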
@corey Hi, I am a complete newbie to any programming language. I must say your videos are a great help to me. Can you help me a little? I want the output you got at 3:53 in the video to be pasted into a new Excel sheet. What code should I use? I’ll be grateful to get your reply.
Your videos are excellent!!
I was trying to filter the Age1stCode column and tried every combination, but it's not working. Could you please show the way to achieve this?
india_df = india_df.infer_objects()
india_df[ (india_df['Age1stCode'].astype(str))=='16']
C:\Users\HP\anaconda3\lib\site-packages\pandas\core\ops\array_ops.py:253: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
res_values = method(rvalues)
How do I handle this type of error?
Thank You!
Amazing, thanks Corey.
Thank You
Thanks for the awesome videos on Pandas. I was able to automate a few Excel reports at work, but I am stuck on something very complex (it's complex for me!). Could you please help with some complex Excel calculations using Python?
For example, suppose I have data in the format below.
db_instance  Hostname  Disk_group  disk_path         disk_size  disk_used  header_status
abc_cr       host1     data01      dev/mapper/asm01  240        90         Member
abc_cr       host1     data01      dev/mapper/asm02  240        100        Member
abc_cr       host1     data01      dev/mapper/asm03  240        60         Member
abc_xy       host1     data01      dev/mapper/asm01  240        90         Member
abc_xy       host1     data01      dev/mapper/asm02  240        100        Member
abc_xy       host1     data01      dev/mapper/asm03  240        60         Member
abc_cr       host1     acfs01      dev/mapper/asm04  90         30         Member
abc_cr       host1     acfs01      dev/mapper/asm05  90         60         Member
abc_xy       host1     acfs01      dev/mapper/asm04  90         30         Member
abc_xy       host1     acfs01      dev/mapper/asm05  90         60         Member
             host1     unassigned  dev/mapper/asm06  180        0          Candidate
             host1     unassigned  dev/mapper/asm07  180        0          Former
res_du       host2     data01      dev/mapper/asm01  240        90         Member
res_du       host2     data01      dev/mapper/asm02  240        100        Member
res_du       host2     data01      dev/mapper/asm03  240        60         Member
res_hg       host2     data01      dev/mapper/asm01  240        90         Member
res_hg       host2     data01      dev/mapper/asm02  240        100        Member
res_hg       host2     data01      dev/mapper/asm03  240        60         Member
res_pq       host2     acfs01      dev/mapper/asm04  90         30         Member
res_pq       host2     acfs01      dev/mapper/asm05  90         60         Member
res_mn       host2     acfs01      dev/mapper/asm04  90         30         Member
res_mn       host2     acfs01      dev/mapper/asm05  90         60         Member
             host2     unassigned  dev/mapper/asm06  180        0          Candidate
             host2     unassigned  dev/mapper/asm07  180        0          Former
As you can see, disk_path is duplicated for each host because of multiple db_instance values. (Even though you see similar disk_paths for host1 and host2, they are actually different disks on the storage end. Admins follow similar naming conventions when they configure disks on the host side, resulting in similar disk_paths for different hosts.)
My questions are how to:
1. remove duplicate disk_path values for each host (considering only the two columns Hostname and disk_path; that's how I remove duplicates in Excel, and I am not worried about db_instance)?
2. once duplicates are removed, calculate the total size of 'Member' disks, and also the total size of 'Candidate' and 'Former' disks combined?
3. add another column 'Percent used', which is the result of 'disk_used'/'disk_size'*100 for each row?
Thanks in advance!
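One possible way to approach the three questions above, sketched on a handful of rows copied from the table. This is an illustration under assumptions rather than a tested solution for the full report; the variable names (disks, unique_disks, member_total, spare_total) are made up for this example.

```python
import pandas as pd

# A few rows from the table above; None stands for the blank db_instance cells.
cols = ["db_instance", "Hostname", "Disk_group", "disk_path",
        "disk_size", "disk_used", "header_status"]
rows = [
    ("abc_cr", "host1", "data01", "dev/mapper/asm01", 240, 90, "Member"),
    ("abc_xy", "host1", "data01", "dev/mapper/asm01", 240, 90, "Member"),
    ("abc_cr", "host1", "acfs01", "dev/mapper/asm04", 90, 30, "Member"),
    (None, "host1", "unassigned", "dev/mapper/asm06", 180, 0, "Candidate"),
    (None, "host1", "unassigned", "dev/mapper/asm07", 180, 0, "Former"),
    ("res_du", "host2", "data01", "dev/mapper/asm01", 240, 90, "Member"),
]
disks = pd.DataFrame(rows, columns=cols)

# 1. Keep one row per (Hostname, disk_path) pair, ignoring db_instance.
unique_disks = disks.drop_duplicates(subset=["Hostname", "disk_path"]).copy()

# 2. Total size of 'Member' disks, and of 'Candidate' and 'Former' combined.
member_filt = unique_disks["header_status"] == "Member"
spare_filt = unique_disks["header_status"].isin(["Candidate", "Former"])
member_total = unique_disks.loc[member_filt, "disk_size"].sum()
spare_total = unique_disks.loc[spare_filt, "disk_size"].sum()

# 3. New column computed row by row from two existing columns.
unique_disks["Percent used"] = (
    unique_disks["disk_used"] / unique_disks["disk_size"] * 100
)

print(unique_disks)
print(member_total, spare_total)
```

For totals per host instead of overall, grouping by Hostname before summing would be the natural next step.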
How do you use two or more filter conditions at the same time?
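A minimal sketch of combining conditions: wrap each comparison in parentheses and join them with & (and) or | (or). The data is invented, and the ConvertedComp column name is an assumption for illustration.

```python
import pandas as pd

# Hypothetical data; column names are assumptions for this example.
df = pd.DataFrame({
    "Country": ["India", "United States", "India", "Canada"],
    "ConvertedComp": [50000, 120000, 90000, 65000],
})

# & means both conditions must hold. The parentheses are required because
# & and | bind more tightly than == and >.
india_high = (df["Country"] == "India") & (df["ConvertedComp"] > 70000)

# | means at least one condition must hold.
canada_or_high = (df["Country"] == "Canada") | (df["ConvertedComp"] > 100000)

print(df.loc[india_high])
print(df.loc[canada_or_high])
```

Python's own `and` and `or` keywords do not work here, since they cannot compare Series element by element.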
You're the man!
filt = df['LanguageWorkedWith'].str.contains('Python', na= False)
Can the 'contains' method take more than one value? Suppose I want to search for both Python and Java; how can I implement that using the contains method?
Also, request you to upload string method video. Thanks!
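One way this could be done, sketched on made-up values: since str.contains treats its pattern as a regular expression by default, "Python|Java" matches rows containing either word, and two separate masks joined with & match rows containing both.

```python
import pandas as pd

# Hypothetical values in the same semicolon-separated style as the survey column.
langs = pd.Series(
    ["Python;SQL", "Java;Kotlin", "Python;Java", "C#", None], dtype=object
)

# Either language: the | inside the pattern is regex alternation.
either = langs.str.contains("Python|Java", na=False)

# Both languages: build two masks and combine them with &.
both = langs.str.contains("Python", na=False) & langs.str.contains("Java", na=False)

print(either)
print(both)
```

Note that the plain pattern "Java" would also match "JavaScript", so a stricter pattern may be needed on real survey data.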
Hey, how can we access the data you used?
Why do we have SQL? Can't we do those "table" things using a dataframe and use the techniques mentioned here to grab the required data?
Your tutorials are well structured and easy to understand. Thank you so much
Hey Corey, great tutorials btw! One thing I don't understand is the na = False. Is it when the objects are integers so you get an error?
Wow, Love it. Very useful. thanks👍👍🇵🇰🇵🇰
You're a great man Corey Schafer, This is the best channel ever on Youtube.
Thanks from India.
your playlist is amazing.
To be honest, I have taken dozens of Python courses online on many platforms, but I have not completed any of them beyond 20%. I'm looking forward to gaining the skill here at least, as it's fun to listen to and very understandable.
i started learning python basics with you and i just cant believe how much i improved myself this year, thanks a lot!
@ 17:20, Corey, the docs for the pandas 'contains' function say it has 'regex=True' by default. I realized this when I searched for "C++" instead of "Python" and it gave me a "multiple repeat" error. In this case, it is essential to set regex to False or escape each '+' using "\".
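A short sketch of the two workarounds described in this comment, on invented values: pass regex=False so the text is matched literally, or escape the special characters (here with re.escape from the standard library).

```python
import re

import pandas as pd

# Hypothetical values; object dtype as when the column is read from a CSV.
langs = pd.Series(["C++;Python", "C;Java", "Python"], dtype=object)

# Option 1: treat the search text as a literal string, not a regex.
literal = langs.str.contains("C++", regex=False)

# Option 2: keep regex matching but escape the '+' characters.
escaped = langs.str.contains(re.escape("C++"))

print(literal)
print(escaped)
```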
slight problem : "Passing list-likes to .loc or [ ] with any missing labels is no longer supported" in Jupyter , but when i try again it works…
At 15:40, After filtering by 'Country' col, how can we filter by row? For example, if I only want to see 0-10 rows of this filtered df?
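A possible answer, sketched on invented data: after filtering, .head(10) or .iloc[0:10] takes the first ten rows by position. The Respondent column here is just a stand-in for the rest of the survey columns.

```python
import pandas as pd

# Hypothetical data: 30 rows alternating between two countries.
df = pd.DataFrame({
    "Country": ["India", "Canada"] * 15,
    "Respondent": range(1, 31),
})

filt = df["Country"] == "India"

# First ten rows of the filtered result, counted by position.
first_ten = df.loc[filt].head(10)
same_rows = df.loc[filt].iloc[0:10]

# .loc slices by index LABEL instead, so this keeps only the filtered rows
# whose original labels fall between 0 and 10 inclusive.
by_label = df.loc[filt].loc[0:10]

print(first_ten)
print(by_label)
```

The filtered dataframe keeps its original index labels, which is why the positional and label-based slices give different results.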
Is there a possibility where I can replace "df['column'].str.contains('python')" the 'python key with a regex pattern?
Thank you for your videos.
I have referred to three video tutorials/courses before this and this is by far the best course on pandas with complete coverage.
You deserve a million likes for this series. This course is priceless.
Your in-depth knowledge in python is astonishing!
The thing that always amazes me is, when you finish explaining a topic and I ask myself " Now what if I try to do that?" kinda questions and within no seconds I see you explaining the same question! 😳
Do you read your viewers' minds or what? 😅
Anyways, you are a legend <3
best course seriously!!! Quality content.You could have put it on Udemy and earn more but you put it on youtube. Thanks : ).Please do video on Sklearn, Tensorflow.
So far, such a good explanation. I have been watching the series since it started, and I came back to revise.
Just a quick suggestion from my side: please give the filter variable a suitable name so that it is readable for someone who wants to understand what's happening.
@Corey Schafer your content is really awesome. Can you please tell me where I can practice all these concepts?
Corey, you're a god of teaching, you don't leave out anything unexplained. Thank you
i like his kinda southern accent lmao
THANKS! finally I get the filtering from pandas.
case:1 filt=(df[' last']=='Doe')
df.loc[filt ] # It works
# but when we try this…
case:2 filt=df[' last ']
df.loc[' filt ']
# It doesn't work what's the difference…in both cases
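A sketch of the likely difference between the two cases, using made-up data and a column named "last" without surrounding spaces. In case 1 the variable holds a boolean mask, which .loc accepts for row selection. In case 2 the variable holds the column's strings rather than True/False values, and writing the name in quotes makes .loc look for a row labeled "filt" instead of using the variable at all.

```python
import pandas as pd

# Hypothetical sample data (assumption: column is named "last" with no spaces).
df = pd.DataFrame({
    "first": ["Corey", "Jane", "John"],
    "last": ["Schafer", "Doe", "Doe"],
})

# Case 1: the comparison yields True/False per row, so .loc can filter with it.
filt = df["last"] == "Doe"
works = df.loc[filt]

# Case 2: this is just the column of strings, not a mask.
names = df["last"]

# Quoting the variable name turns it into a plain string, which .loc
# treats as a row label. No row has that label, so a KeyError is raised.
try:
    df.loc["filt"]
except KeyError as err:
    label_error = err

print(works)
print(repr(label_error))
```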
Hey! At 6:30, how do you get only the email and not the index number?
Great series. I would recommend going through these tutorials once with the given dataset, and then going through them again using a dataset you're interested in. Don't worry, it will go a lot faster the second time around, and you'll get great practice actually finding insights you care about.
This brilliant.org might actually not be a bad idea...
19:22 let us know when you do a video on strings! Can't wait for that and if you need any suggestions on any of the methods in the pandas …/text.html let us know! Thanks for all your hard work Corey!