Getting started in scikit-learn with the famous iris dataset




Now that we’ve set up Python for machine learning, let’s get started by loading an example dataset into scikit-learn! We’ll explore the famous “iris” dataset, learn some important machine learning terminology, and discuss the four key requirements for working with data in scikit-learn.

Download the notebook: https://github.com/justmarkham/scikit-learn-videos
Iris dataset: http://archive.ics.uci.edu/ml/datasets/Iris
scikit-learn dataset loading utilities: http://scikit-learn.org/stable/datasets/
Fast Numerical Computing with NumPy (slides): https://speakerdeck.com/jakevdp/losing-your-loops-fast-numerical-computing-with-numpy-pycon-2015
Fast Numerical Computing with NumPy (video): https://www.youtube.com/watch?v=EEUXKG97YRw
Introduction to NumPy (PDF): http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf
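
The description above mentions four key requirements without listing them. As a minimal sketch, assuming they are the ones usually given for scikit-learn (features and response are separate objects, both are numeric, both are NumPy arrays, and both have specific shapes), the iris data can be loaded and checked like this. The names `X` and `y` are a common convention, not something scikit-learn requires.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the bundled iris dataset (no download needed)
iris = load_iris()

# 1) Features and response are stored as separate objects
X = iris.data      # feature matrix: one row per flower, one column per measurement
y = iris.target    # response vector: one integer class label per flower

# 2) Both should be numeric
print(X.dtype, y.dtype)

# 3) Both should be NumPy arrays
print(type(X), type(y))

# 4) Shapes should agree: X is 2-D, y is 1-D, same number of rows
print(X.shape)
print(y.shape)

# Human-readable names for the columns and for the integer labels
print(iris.feature_names)
print(iris.target_names)
```

The integer labels in `y` map to the species names in `iris.target_names` by position.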

Comment List

  • Data School
    November 16, 2020

    Note: This video was recorded using Python 2.7 and scikit-learn 0.16. Recently, I updated the code to use Python 3.6 and scikit-learn 0.19.1. You can download the updated code here: https://github.com/justmarkham/scikit-learn-videos

  • Data School
    November 16, 2020

    I want you to tell me what you did for your vocals. It sounds natural, it's so cool.

  • Data School
    November 16, 2020

    Still very relevant. There are just newer versions of the same things (IPython Notebook is now Jupyter Notebook, and the libraries have been expanded and improved), and the code will still work for a demo. Fundamentals don't change anyway, and the delivery of the tutorial remains excellent to this day. I like to access data locally from my hard drive, with no internet needed, so instead of reading the iris data from a web location I load it into a pandas DataFrame. It wouldn't hurt to read the text file into Excel and convert it to a .csv file instead.

  • Data School
    November 16, 2020

    Your voice is amazing and the explanation is also very good. You sound just like Sheldon while explaining.

  • Data School
    November 16, 2020

    Hi, Kevin, thank you for all great videos. Is there any easier way to import sklearn modules? Like from sklearn import all? Thank you!

  • Data School
    November 16, 2020

    Thanks a ton brother !! one of the best tutorials for a python library. the clarity in explanation is 10/10

  • Data School
    November 16, 2020

    Hi, is this Video still relevant today? Year 2020? I guess a lot has changed since the video was recorded (5 years ago). I am new to machine learning. Can I still work with this please?

  • Data School
    November 16, 2020

    Are response and target the same?

  • Data School
    November 16, 2020

    If I'm loading my CSV file, which has categories like gender, how do I convert it?
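
One possible way to approach the question above, as a minimal sketch using pandas: map each category to a number, or create one indicator column per category. The tiny DataFrame and its column names are made up for illustration; in practice the data would come from `pd.read_csv` on your own file.

```python
import pandas as pd

# Hypothetical data standing in for a CSV with a text category column
df = pd.DataFrame({
    "gender": ["male", "female", "female", "male"],
    "age": [22, 38, 26, 35],
})

# Option 1: map each category to an integer code (fine for two categories)
df["gender_num"] = df["gender"].map({"female": 0, "male": 1})

# Option 2: one indicator (dummy) column per category
dummies = pd.get_dummies(df["gender"], prefix="gender")

print(df)
print(dummies)
```

Dummy columns are usually the safer choice for a feature with more than two unordered categories, since integer codes would suggest an ordering that does not exist.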

  • Data School
    November 16, 2020

    Dear Kevin, I really love all your content! I'd like to know if I can translate your material into Portuguese to spread more knowledge. Of course, I'll state that it is a translation and credit your name and your original work as well. Do I have your authorization?

  • Data School
    November 16, 2020

    These videos are so useful. Thanks.

  • Data School
    November 16, 2020

    Clear and concise explanations. You've covered things other tutorials I've seen have missed. Thanks!

  • Data School
    November 16, 2020

    A smooth and calm voice. It is easy to absorb.

  • Data School
    November 16, 2020

    iris.target_names prints the encoding for the iris dataset, but it is not working for other datasets like MNIST. Why? Is it not a predefined keyword for all datasets?

  • Data School
    November 16, 2020

    This video helped me a lot. Thanks.

  • Data School
    November 16, 2020

    I love your video series. Can you make a video teaching the above using a .csv file and not a database? Also, when measuring the accuracy of a model, is there any method to create training data and testing data? If it's not stated which column is my target, what is the approach to determine the accuracy? Please make a lecture on this.
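
On the train/test part of the question above, here is a minimal sketch, assuming scikit-learn 0.18 or later (where `train_test_split` lives in `sklearn.model_selection`). The choice of K-nearest neighbors, `test_size=0.25`, and `random_state=0` is arbitrary and only for illustration. Accuracy can only be computed when the target column is known, because predictions have to be compared against true labels.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Features and known response for the iris flowers
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; the seed makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training rows only
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Score on rows the model has not seen
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)
```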

  • Data School
    November 16, 2020

    Can you also do a tutorial on how to create your own dataset, that would be amazing 😀

  • Data School
    November 16, 2020

    Thank you! But I do not get Out[2] like in the video; instead it downloads a file to my computer.

  • Data School
    November 16, 2020

    Thank you for making some really structured and good content. I have learned to use pandas in python by watching your videos. The way you explain is clear and structured. Please, keep making more videos so that most people like me will be able to learn even from halfway across the world. Keep up the good work!

  • Data School
    November 16, 2020

    I have a question: when we do a normal train/test split on a dataset, our X and y are pandas objects and not NumPy arrays, but they still work. Please elaborate on this point.
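
A short sketch related to the question above: scikit-learn estimators generally accept array-like input, so a pandas DataFrame and Series can be passed to `fit` and `predict` and are converted to NumPy arrays internally. The column name `species` and the choice of `n_neighbors=1` are illustrative only.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Wrap the NumPy arrays in pandas objects
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name="species")

# The estimator accepts the pandas objects directly
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X, y)

# Predictions come back as a NumPy array
pred = knn.predict(X.iloc[:3])
print(type(pred), pred)

# The underlying NumPy array is still available when needed
print(X.to_numpy().shape)
```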

  • Data School
    November 16, 2020

    BTW, I really enjoy this speed for education purposes. Rarely do I get an opportunity to actually digest each sentence properly.

  • Data School
    November 16, 2020

    Can we use pandas in place of NumPy?

  • Data School
    November 16, 2020

    You remind me of Sheldon Cooper. Nice video btw.

  • Data School
    November 16, 2020

    Thank you!

  • Data School
    November 16, 2020

    Watch this at 1.25x, that's much better. Thank you, sir.

  • Data School
    November 16, 2020

    At 1.75x speed it's nice to watch xD

  • Data School
    November 16, 2020

    Awesome video

  • Data School
    November 16, 2020

    This is "REALLY AWESOME"…..Great work…Thanks a lot for these awesome tutorials..

  • Data School
    November 16, 2020

    Very helpful videos- clear and precise. Thanks!

  • Data School
    November 16, 2020

    Shouldn't I transform the target into dummies, thus having a target with shape 150,3? To me it is weird to have the multiclass target with values 0,1,2 since it gives them relative value. Thanks in advance.
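
A hedged illustration for the question above: scikit-learn classifiers treat the values in the response as class labels rather than as quantities, so 0, 1, 2 are not given a relative value during classification. The sketch below fits the same classifier once with integer labels and once with the species names, then checks that the predictions agree. K-nearest neighbors is an arbitrary choice here, and this says nothing about regression, where the numeric values do matter.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y_int = iris.data, iris.target

# Same labels expressed as species names instead of integers
y_str = iris.target_names[y_int]

# Fit one model per labelling and predict on the same rows
pred_int = KNeighborsClassifier(n_neighbors=5).fit(X, y_int).predict(X)
pred_str = KNeighborsClassifier(n_neighbors=5).fit(X, y_str).predict(X)

# Translate the integer predictions to names and compare
same = bool((iris.target_names[pred_int] == pred_str).all())
print(same)
```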

  • Data School
    November 16, 2020

    amazing =)

  • Data School
    November 16, 2020

    Very useful and easy to understand. Thank you

  • Data School
    November 16, 2020

    Awesome videos!!! Thank you so much

  • Data School
    November 16, 2020

    How can I make my own dataset in scikit-learn? Do I have to make an Excel file first and then download it as a CSV?

  • Data School
    November 16, 2020

    Are you also going to create a few on unsupervised learning and dimensionality reduction concepts?

  • Data School
    November 16, 2020

    YOUR VIDEOS are awesome

  • Data School
    November 16, 2020

    Hello I have an error when I typed print type(iris.target).
    The error code is "File "<ipython-input-23-b76c9574403a>", line 2
    print type(iris.data)
    ^
    SyntaxError: invalid syntax"

    How can I solve this problem?
    Thank you
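
The error above is what Python 3 reports when it meets the Python 2 `print` statement. In Python 3, `print` is a function and needs parentheses. A minimal sketch of the corrected lines:

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Python 3: print is a function, so its argument goes in parentheses
print(type(iris.data))
print(type(iris.target))
```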
