Groupby – Data Analysis with Python and Pandas p.3





Hello and welcome to another data analysis with Python and Pandas tutorial. In this tutorial, we’re going to change up the dataset and play with minimum wage data now.

Text-based tutorial: https://pythonprogramming.net/groupby-python3-pandas-data-analysis/

Channel membership: https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ/join
Discord: https://discord.gg/sentdex
Support the content: https://pythonprogramming.net/support-donate/
Twitter: https://twitter.com/sentdex
Facebook: https://www.facebook.com/pythonprogramming.net/
Twitch: https://www.twitch.tv/sentdex
G+: https://plus.google.com/+sentdex


Comment List

  • sentdex
    December 1, 2020

    Does groupby('State') return the column name and the values in that column? If so, that's the only way I can make sense of the for loop used at 5:48.
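
On the question above: iterating over a pandas groupby object yields (key, sub-DataFrame) pairs, which is what a loop of the form `for name, group in ...` unpacks. A minimal sketch, assuming a toy frame with made-up values in place of the tutorial's minimum wage CSV:

```python
import pandas as pd

# Toy data standing in for the minimum wage CSV (values are made up).
df = pd.DataFrame({
    "Year": [1968, 1969, 1968, 1969],
    "State": ["Alabama", "Alabama", "Alaska", "Alaska"],
    "Low.2018": [0.0, 0.0, 15.12, 14.33],
})

seen = []
for name, group in df.groupby("State"):
    # name is the group key (the state name);
    # group is a DataFrame holding only that state's rows.
    seen.append((name, len(group)))
    print(name)
    print(group)
```

So `name` is a value from the State column, not the column name, and `group` keeps all of the original columns for that state.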

  • sentdex
    December 1, 2020

    I was wearing headphones and died at 10:49

  • sentdex
    December 1, 2020

    This may sound stupid, but what are we actually doing when we do that correlation thing?

  • sentdex
    December 1, 2020

    min_wage_corr = act_min_wage.replace(0, np.nan).dropna(axis=1).corr().head()

    for problem in issue_df['State'].unique():
        if problem in min_wage_corr.columns:
            print("we are missing something here…")

    Can anyone explain what this means again? Why are we doing the for loop? Is it just to check whether there is a problem in issue_df?
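
As far as the snippet itself shows, the loop reads as a sanity check: after zeros are turned into NaN and columns containing NaN are dropped, no state that had zero values should still be a column of the correlation table. A minimal sketch with made-up numbers, where `issue_states` is a hypothetical stand-in for `issue_df['State'].unique()`:

```python
import numpy as np
import pandas as pd

# Made-up frame: one column per state, one row per year.
act_min_wage = pd.DataFrame(
    {
        "Alabama": [0.0, 0.0, 0.0],      # no data recorded
        "Alaska": [15.12, 14.33, 13.5],
        "Arizona": [3.4, 4.6, 5.0],
    },
    index=pd.Index([1968, 1969, 1970], name="Year"),
)

# Hypothetical stand-in for issue_df['State'].unique().
issue_states = ["Alabama"]

# Zeros become NaN, then any column containing NaN is dropped.
min_wage_corr = act_min_wage.replace(0, np.nan).dropna(axis=1).corr()

# States that had zeros but still survived the drop (should be none).
leftover = [s for s in issue_states if s in min_wage_corr.columns]
print(min_wage_corr.columns.tolist(), leftover)
```

If `leftover` were not empty, the cleaning step would have missed a state, which is when the print in the loop would fire.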

  • sentdex
    December 1, 2020

    At 19:35, shouldn't it be == instead of !=? If data is missing, why would we use !=?

  • sentdex
    December 1, 2020

    At 8:24, is it possible to return the highest number? And how would you do it?

  • sentdex
    December 1, 2020

    learned something new in this video, nice work!

  • sentdex
    December 1, 2020

    10:49 RAM exploding

  • sentdex
    December 1, 2020

    This is supposed to be a groupby tutorial. It’s so frantic and buzzy that viewers won’t be able to follow your quick turns and trains of thought. Think of what you are teaching and show examples of it, rather than chasing every thought that comes into your mind. The whole digression into the NAs is out of place.

  • sentdex
    December 1, 2020

    Thank god we didn't lose the power else we would have lost this beautiful tutorial

  • sentdex
    December 1, 2020

    9:08 can someone explain the code and what is going on because I am kind of just following along with no explanation. Thanks in advance!

  • sentdex
    December 1, 2020

    Maybe it is simpler to use tail() to check whether the states with 0 values have a minimum wage in the most recent years… great content, keep it up! 🙂

  • sentdex
    December 1, 2020

    I rate your stuff brother, I like the vibe

  • sentdex
    December 1, 2020

    I had already set the index to 'Year' while converting the csv file. So, when I tried group[["Low.2018"]].rename(columns={"Low.2018":name}) instead of group.set_index("Year")[["Low.2018"]].rename(columns={"Low.2018":name}), I get NaN in the whole table. Can you explain why?
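
One possible cause (an assumption, since the comment does not show how the CSV was loaded) is that the frame being grouped had a default integer index rather than Year. `DataFrame.join` aligns on index labels, so if each state's rows carry different labels, nothing lines up and the joined columns come back as NaN. A sketch with toy data, where `build` is a hypothetical helper mirroring the loop from the comment:

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [1968, 1969, 1968, 1969],
    "State": ["Alabama", "Alabama", "Alaska", "Alaska"],
    "Low.2018": [0.0, 0.0, 15.12, 14.33],
})

def build(frame, use_year_index):
    """Hypothetical helper: one column per state, joined on the index."""
    out = pd.DataFrame()
    for name, group in frame.groupby("State"):
        col = group.set_index("Year") if use_year_index else group
        col = col[["Low.2018"]].rename(columns={"Low.2018": name})
        out = col if out.empty else out.join(col)
    return out

aligned = build(df, use_year_index=True)      # rows labelled by Year
misaligned = build(df, use_year_index=False)  # rows labelled 0, 1, 2, 3

print(aligned)
print(misaligned)
```

In the misaligned case the Alaska rows have labels 2 and 3, which never match Alabama's labels 0 and 1, so the Alaska column is all NaN. Checking `df.index` before the loop shows which case applies.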

  • sentdex
    December 1, 2020

    I didn't know Frankenstein had a monster mug

  • sentdex
    December 1, 2020

    Not critical, but using encoding='unicode_escape' avoids guessing.

  • sentdex
    December 1, 2020

    all these for loops could have gotten a better explanation, otherwise, a great video

  • sentdex
    December 1, 2020

    issue_df = df[df['Low.2018']==0]
    grouped_issues = issue_df.groupby("State")
    Then, of course, grouped_issues.get_group("Alabama")["Low.2018"].sum() == 0 is always True.
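
The point in this comment can be checked on a toy frame (values made up): once the rows are filtered down to zeros, every per-state sum is zero, even for a state that has real data elsewhere.

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["Alabama", "Alabama", "Texas", "Texas"],
    "Low.2018": [0.0, 0.0, 0.0, 7.25],   # Texas has one real value
})

# Keep only the zero rows, as in the comment.
issue_df = df[df["Low.2018"] == 0]
grouped_issues = issue_df.groupby("State")

# Sum per state within the already-filtered rows.
sums = {
    state: grouped_issues.get_group(state)["Low.2018"].sum()
    for state in issue_df["State"].unique()
}
print(sums)
```

Texas shows up with a sum of zero here even though it has a non-zero value in `df`, so this check cannot tell "no data at all" apart from "some zero rows".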

  • sentdex
    December 1, 2020

    I have a quick question and hope that you respond soon. I'm working with a movie dataset right now and I have multiple values in one column.
    For instance, let's say that you have two values for index 1 at high.value.
    How would you get the mean of this specific column?

  • sentdex
    December 1, 2020

    I like how you use groupby to produce a new dataframe, act_min_wage,
    so I borrowed it.

    Unfortunately, in my case I got a KeyError: "None of ['Year'] are in the columns".
    What's wrong?
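
That KeyError is what pandas raises when `set_index("Year")` is called on a frame that has no Year column, for example because Year is already the index. Whether that is the cause here is an assumption, since the comment does not show the data. A minimal sketch on toy data:

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [1968, 1969],
    "State": ["Alabama", "Alabama"],
    "Low.2018": [0.0, 0.0],
}).set_index("Year")   # Year is now the index, not a column

try:
    df.set_index("Year")
    error_message = None
except KeyError as exc:
    error_message = str(exc)
print(error_message)

# Option 1: skip set_index, because Year is already the index.
already_indexed = df[["Low.2018"]]

# Option 2: move the index back into a column first, then proceed as usual.
restored = df.reset_index().set_index("Year")[["Low.2018"]]
```

Printing `df.columns` and `df.index.name` before the loop shows where Year lives, and if the column has a different name in your data, that name goes in `set_index` instead.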

  • sentdex
    December 1, 2020

    Regarding the last point in this video, we are actually missing information for ten states. You should sum the 'Low.2018' column in the original df instead of the issue_df.
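
The suggestion above can be sketched as a per-state sum over the unfiltered frame. The data below is made up, so the resulting list says nothing about the real dataset:

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["Alabama", "Alabama", "Alaska", "Texas", "Texas"],
    "Low.2018": [0.0, 0.0, 15.12, 0.0, 7.25],
})

# Sum Low.2018 per state over ALL rows, not only the zero rows.
state_totals = df.groupby("State")["Low.2018"].sum()

# States whose total is zero never report a non-zero value.
no_data_states = state_totals[state_totals == 0].index.tolist()
print(no_data_states)
```

A state with a mix of zero and non-zero rows, like Texas here, is not flagged.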

  • sentdex
    December 1, 2020

    Thank you so much for all the videos and tutorials, they are really helpful!

  • sentdex
    December 1, 2020

    I'm trying to use get_group() for multiple values.
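
`get_group` looks up one group key per call, so for several values one option is to concatenate the individual groups, and another is to filter the original frame with `isin`. A small sketch on toy data, where the `wanted` list is purely illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["Alabama", "Alabama", "Alaska", "Texas"],
    "Low.2018": [0.0, 0.0, 15.12, 7.25],
})

grouped = df.groupby("State")
wanted = ["Alabama", "Texas"]

# Option 1: fetch each group separately and stack the results.
combined = pd.concat([grouped.get_group(state) for state in wanted])

# Option 2: skip groupby and filter rows with a boolean mask.
filtered = df[df["State"].isin(wanted)]

print(combined)
print(filtered)
```

Both give the same rows here; the `isin` version is usually the shorter of the two when all you need is the combined rows.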

  • sentdex
    December 1, 2020

    finally you updated man.

  • sentdex
    December 1, 2020

    I am not sure whether the data has been updated, since it is half a year later, but I checked it using the sum method and only 5 of them have a 0.0 sum in Low.2018,
    and surely Texas has a minimum wage; it's some 277 or so.
