Python Pandas Tutorial 7. Group By (Split Apply Combine)





In this Python pandas tutorial, you will learn how the groupby method can be used to group your dataset by some criterion and then apply analytics to each group. This is similar to GROUP BY in SQL. In data science it is also called the split-apply-combine strategy.

Topics covered in this Python pandas video:
0:00 Introduction
2:19 Use groupby() method
3:04 groupby() representation internally
6:24 What is split apply combine?
8:20 Use describe() function in groupby
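
The split-apply-combine flow listed above can be sketched roughly as follows. The rows are made up for illustration, and the column names (day, city, temperature, windspeed) are assumptions rather than a copy of the tutorial's CSV.

```python
import pandas as pd

# Made-up sample rows; column names are assumptions, not the tutorial's file.
df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/1/2017", "1/2/2017", "1/1/2017", "1/2/2017"],
    "city": ["new york", "new york", "mumbai", "mumbai", "paris", "paris"],
    "temperature": [32, 36, 90, 85, 45, 50],
    "windspeed": [6, 7, 5, 12, 20, 13],
})

# Split: one group per distinct city.
g = df.groupby("city")

# Apply + combine: aggregate each group, results are indexed by city.
max_temp = g["temperature"].max()
mean_wind = g["windspeed"].mean()

# describe() computes summary statistics for every group in one call.
summary = g["temperature"].describe()

print(max_temp)
print(mean_wind)
print(summary)
```

Each aggregate comes back indexed by city, which is the "combine" step.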

Link for code and data used in this tutorial: https://github.com/codebasics/py/tree/master/pandas/7_group_by

To download the CSV and code for all tutorials, go to https://github.com/codebasics/py, click the green button to clone or download the entire repository, and then open the relevant folder to access the specific file.

Next Video:
Python Pandas Tutorial 8. Concat Dataframes: https://www.youtube.com/watch?v=WGOEFok1szA&list=PLeo1K3hjS3uuASpe-1LjfG5f14Bnozjwy&index=8

Popular Playlists:
Complete Python course: https://www.youtube.com/playlist?list=PLeo1K3hjS3uv5U-Lmlnucd7gqF-3ehIh0

Data science course: https://www.youtube.com/playlist?list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV

Machine learning tutorials: https://www.youtube.com/playlist?list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw

Pandas tutorials: https://www.youtube.com/playlist?list=PLeo1K3hjS3uuASpe-1LjfG5f14Bnozjwy

Git and GitHub tutorials: https://www.youtube.com/playlist?list=PLeo1K3hjS3usJuxZZUBdjAcilgfQHkRzW

Matplotlib course: https://www.youtube.com/playlist?list=PLeo1K3hjS3uu4Lr8_kro2AqaO6CFYgKOl

Data structures course: https://www.youtube.com/playlist?list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12

Website: http://codebasicshub.com/
Facebook: https://www.facebook.com/codebasicshub
Twitter: https://twitter.com/codebasicshub


Comment List

  • codebasics
    December 18, 2020

    Step by step roadmap to learn data science in 6 months: https://www.youtube.com/watch?v=H4YcqULY1-Q
    Machine learning tutorials with exercises: https://www.youtube.com/watch?v=gmvvaobm7eQ&list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw

  • codebasics
    December 18, 2020

    Nice explanation

  • codebasics
    December 18, 2020

    At 5:45, let me correct you: the statement below throws an error.
    SELECT * FROM CITY_DATE GROUP BY CITY
    When grouping, you can select only the grouping column and aggregated data, not all columns (*).

    To get exactly what you want, please write:
    SELECT * FROM CITY_DATE WHERE CITY = 'cityname'
    Thanks for making awesome tutorials!
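
For readers comparing the two SQL statements above with pandas, here is a rough sketch of the corresponding calls. The sample rows are invented, and the SQL in the comments is only an approximate analogy.

```python
import pandas as pd

# Invented sample rows for illustration.
df = pd.DataFrame({
    "city": ["new york", "new york", "mumbai", "mumbai"],
    "temperature": [32, 36, 90, 85],
})

g = df.groupby("city")

# Roughly: SELECT * FROM CITY_DATE WHERE CITY = 'mumbai'
mumbai_rows = g.get_group("mumbai")
filtered_rows = df[df["city"] == "mumbai"]

# Roughly: SELECT CITY, MAX(temperature) FROM CITY_DATE GROUP BY CITY
max_per_city = g["temperature"].max()

print(mumbai_rows)
print(max_per_city)
```

get_group returns the rows of a single group, while an aggregate such as max() collapses every group to one value.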

  • codebasics
    December 18, 2020

    Iterating over a groupby object is what I liked most! I wasn't aware of this before.

  • codebasics
    December 18, 2020

    I am not able to get any output from %matplotlib inline followed by g.plot(); it throws a "ModuleNotFoundError". What needs to be done in that case?

  • codebasics
    December 18, 2020

    Can anyone please tell me, once the data is separated into city groups, how to export each group to a separate Excel sheet?

  • codebasics
    December 18, 2020

    Great

  • codebasics
    December 18, 2020

    May God bless you. Your tutorials are better than an MIT or Harvard class.

  • codebasics
    December 18, 2020

    Your tutorials are more than excellent. Thank you for your service to the people of the world.

  • codebasics
    December 18, 2020

    helpful!!

  • codebasics
    December 18, 2020

    Please help, as I am very new to pandas. I have 4 columns: columns 1 and 2 are integers (int64), column 3 holds h:m:s values (object type), and column 4 is numeric (float64). I want to group by columns 1 and 2 and then sum columns 3 and 4. But the result contains only the groupby columns and the sum of column 4, without column 3. Can anybody help?
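
One possible approach to the question above, assuming column 3 holds durations stored as text: convert it to timedelta before summing. The column names col1 to col4 and the values are hypothetical stand-ins for the commenter's data.

```python
import pandas as pd

# Hypothetical stand-in for the four columns described in the comment.
df = pd.DataFrame({
    "col1": [1, 1, 2, 2],
    "col2": [10, 10, 20, 20],
    "col3": ["01:30:00", "00:45:15", "02:00:00", "00:10:05"],  # text, dtype object
    "col4": [1.5, 2.5, 3.0, 4.0],
})

# Convert the h:m:s strings to timedelta so they can be added up as durations.
df["col3"] = pd.to_timedelta(df["col3"])

# Group by the first two columns and sum the other two.
totals = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()

print(totals)
```

Whether a text column is dropped or concatenated by sum() has varied between pandas versions, so converting the type first is the safer route.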

  • codebasics
    December 18, 2020

    How do I find the nth highest salary within each group?
    Example data frame below:

    import pandas as pd

    Employee={'EMPNO':(111,112,114,115,223,226,228,300,333,345,356,320),'Salary':(4000,6000,2000,8000,2000,1000,3000,500,700,300,200,700),'EMPCODE':('MGF','MGR','MGR','MGR','CLERK','CLERK','CLERK','PEON','PEON','PEON','PEON','PEON')}

    Employee

    emp_df=pd.DataFrame(Employee)

    emp_df
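
A minimal sketch of one way to answer this, using a dense rank within each group so that tied salaries share a rank. The data is copied from the comment as posted (including the single 'MGF' code); the variable n and the choice of dense ranking are assumptions about what "nth highest" should mean.

```python
import pandas as pd

# Data as posted in the comment above.
Employee = {
    "EMPNO": (111, 112, 114, 115, 223, 226, 228, 300, 333, 345, 356, 320),
    "Salary": (4000, 6000, 2000, 8000, 2000, 1000, 3000, 500, 700, 300, 200, 700),
    "EMPCODE": ("MGF", "MGR", "MGR", "MGR", "CLERK", "CLERK", "CLERK",
                "PEON", "PEON", "PEON", "PEON", "PEON"),
}
emp_df = pd.DataFrame(Employee)

n = 2  # which rank to pick; 2 means second highest

# Rank salaries inside each EMPCODE group, highest first; ties share a rank.
salary_rank = emp_df.groupby("EMPCODE")["Salary"].rank(method="dense", ascending=False)

# Keep the rows whose rank within their group equals n.
nth_highest = emp_df[salary_rank == n]

print(nth_highest)
```

Groups with fewer than n distinct salaries simply contribute no row.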

  • codebasics
    December 18, 2020

    Please tell me about level=0 in groupby. I searched Google but didn't understand it clearly.
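
As a rough illustration of level=0: it tells groupby to group by the first level of the index instead of by a column. The sample frame below is invented for the purpose.

```python
import pandas as pd

# Invented frame whose index (not a column) holds the city names.
df = pd.DataFrame(
    {"temperature": [32, 36, 90, 85]},
    index=pd.Index(["new york", "new york", "mumbai", "mumbai"], name="city"),
)

# level=0 groups by the first index level.
by_level = df.groupby(level=0)["temperature"].mean()

# Equivalent result after moving the index back into a regular column.
by_column = df.reset_index().groupby("city")["temperature"].mean()

print(by_level)
```

With a MultiIndex, level=1 would refer to the second index level, and so on.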

  • codebasics
    December 18, 2020

    Very useful thanks

  • codebasics
    December 18, 2020

    I have spent thousands on courses from different vendors, and in the end I am learning from here, which is far better than anywhere else.

  • codebasics
    December 18, 2020

    Great video! Concise and to the point! Thank you 🙂

  • codebasics
    December 18, 2020

    Why is the date column completely off after running g.max()? It does not match what is shown for Paris and New York.
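
A likely explanation, sketched below with invented rows: max() is computed for each column independently, so the date shown is the largest date value in the group, not the date of the hottest day. The column name day is an assumption.

```python
import pandas as pd

# Invented rows: the hottest day in this sample is the first one.
df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017"],
    "city": ["paris", "paris", "paris"],
    "temperature": [50, 45, 48],
})

# Each column gets its own maximum; the values need not share a row.
col_max = df.groupby("city").max()

print(col_max)
```

Here the result pairs the largest day string with the largest temperature, even though they come from different rows.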

  • codebasics
    December 18, 2020

    Thank you
    May God bless you

  • codebasics
    December 18, 2020

    Thanks for the video

  • codebasics
    December 18, 2020

    Absolutely excellent!

  • codebasics
    December 18, 2020

    This was the best explanation of groupby I've come across. Thank you.

  • codebasics
    December 18, 2020

    city_df is not declared anywhere, so how does "for city, city_df in g" know that city_df refers to the corresponding data frame?
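
To illustrate the point raised above: iterating over a groupby object yields (key, sub-DataFrame) pairs, and the for statement itself creates the two names by unpacking each pair. The sample rows are invented.

```python
import pandas as pd

# Invented sample rows.
df = pd.DataFrame({
    "city": ["new york", "new york", "mumbai", "mumbai"],
    "temperature": [32, 36, 90, 85],
})

g = df.groupby("city")

rows_per_city = {}
for city, city_df in g:
    # city is the group key, city_df is the DataFrame of rows for that key.
    # Both names are assigned here by the for statement; no prior declaration is needed.
    rows_per_city[city] = len(city_df)

print(rows_per_city)
```

Any other pair of names would work equally well, since Python only unpacks the two items of each pair.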

  • codebasics
    December 18, 2020

    Fantastic

  • codebasics
    December 18, 2020

    How do I use the plot function in Spyder? Please help.

  • codebasics
    December 18, 2020

    Really, I like this so much. 👌👌👌👌

  • codebasics
    December 18, 2020

    Thank you, man. I'm using this for my doctoral research.

  • codebasics
    December 18, 2020

    Simply mind-blowing. Nice explanation, sir.

  • codebasics
    December 18, 2020

    How do I plot a three-dimensional plot with all three cities in one plot?

  • codebasics
    December 18, 2020

    Better than the Coursera lectures.

  • codebasics
    December 18, 2020

    Can't we do the same in SQL too?

  • codebasics
    December 18, 2020

    I am designing a project with PyQt5 and want to import data into a QTableWidget, but I can't do it. When I access a specific column, it gives me extra detail, for example "Name: H2, dtype: float64", which I don't want. I want to access just the data I need. Input: df["H2"], output: 2. What should I do? Please help.

  • codebasics
    December 18, 2020

    It is a great presentation! I wish you could also add things like how to select the row with the max temperature within each city.
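
One common way to do what this comment asks is idxmax(), sketched below on invented rows. The column names are assumptions.

```python
import pandas as pd

# Invented sample rows.
df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/1/2017", "1/2/2017"],
    "city": ["new york", "new york", "mumbai", "mumbai"],
    "temperature": [32, 36, 90, 85],
})

# For each city, idxmax() gives the index label of the row with the highest temperature.
hottest_idx = df.groupby("city")["temperature"].idxmax()

# Use those labels to pull the complete rows, so day and temperature stay together.
hottest_rows = df.loc[hottest_idx]

print(hottest_rows)
```

If a city has several rows tied for the maximum, idxmax() returns only the first of them.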

  • codebasics
    December 18, 2020

    What are the x axis and y axis in the plot?

  • codebasics
    December 18, 2020

    Change the speed to 1.5x and thank me later.

  • codebasics
    December 18, 2020

    Thanks for the video. Can we group by months?
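
Grouping by month is possible once the date column is parsed as datetime. A small sketch, assuming a text column named day in month/day/year format (both assumptions) and invented values:

```python
import pandas as pd

# Invented rows spanning two months.
df = pd.DataFrame({
    "day": ["1/1/2017", "1/15/2017", "2/1/2017", "2/20/2017"],
    "temperature": [32, 36, 40, 44],
})

# Parse the text dates; the format string is an assumption about the data.
df["day"] = pd.to_datetime(df["day"], format="%m/%d/%Y")

# Group by calendar month (year and month together) and average each group.
monthly = df.groupby(df["day"].dt.to_period("M"))["temperature"].mean()

print(monthly)
```

Using df["day"].dt.month instead would merge the same month across different years, which may or may not be what is wanted.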

  • codebasics
    December 18, 2020

    Very informative, thank you for this. One question: is there a way to run a linear regression model for each group separately from the others?
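
A minimal sketch of fitting a separate straight line per group, using numpy.polyfit as a simple stand-in for a regression model. The columns x and y and the data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" follows y = 2x + 1, group "b" follows y = -x + 10.
df = pd.DataFrame({
    "city": ["a", "a", "a", "b", "b", "b"],
    "x": [0, 1, 2, 0, 1, 2],
    "y": [1, 3, 5, 10, 9, 8],
})

fits = {}
for city, city_df in df.groupby("city"):
    # Fit a degree-1 polynomial (a straight line) to this group's rows only.
    slope, intercept = np.polyfit(city_df["x"].to_numpy(), city_df["y"].to_numpy(), deg=1)
    fits[city] = (slope, intercept)

print(fits)
```

The same loop structure should work with a dedicated regression library in place of polyfit.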

  • codebasics
    December 18, 2020

    The most excellent way of teaching I have seen so far for data science. Thank you so much, sir.

  • codebasics
    December 18, 2020

    Excellent tutorials. Thanks.

  • codebasics
    December 18, 2020

    Matplotlib is not working! What should I do?

  • codebasics
    December 18, 2020

    Sir, how do I find the data where windspeed > 5?
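
This kind of row filter does not need groupby; a boolean mask is usually enough. A short sketch on invented rows, assuming a column named windspeed:

```python
import pandas as pd

# Invented sample rows.
df = pd.DataFrame({
    "city": ["new york", "new york", "mumbai", "mumbai"],
    "windspeed": [6, 7, 5, 12],
})

# The comparison produces a True/False mask; indexing with it keeps the True rows.
windy = df[df["windspeed"] > 5]

print(windy)
```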

  • codebasics
    December 18, 2020

    What does the x axis mean in your plot?

  • codebasics
    December 18, 2020

    Hi, beautiful people in this section. When I use g.describe(), it shows my table in a vertical format, not the same as in the video. Do you know how to fix this problem, please? :)

  • codebasics
    December 18, 2020

    Big thanks, awesome! Can I turn it into a DataFrame?
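
Assuming the question is about turning a groupby result into a regular DataFrame, two common options are sketched below on invented rows.

```python
import pandas as pd

# Invented sample rows.
df = pd.DataFrame({
    "city": ["new york", "new york", "mumbai", "mumbai"],
    "temperature": [32, 36, 90, 85],
})

# Aggregating one column gives a Series indexed by city.
agg = df.groupby("city")["temperature"].max()

# Option 1: move the city index back into an ordinary column.
as_frame = agg.reset_index()

# Option 2: ask groupby not to use the key as the index in the first place.
direct_frame = df.groupby("city", as_index=False)["temperature"].max()

print(as_frame)
```

A single group can also be obtained as a DataFrame with g.get_group(key).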
