Let’s Write a Pipeline – Machine Learning Recipes #4




In this episode, we’ll write a basic pipeline for supervised learning with just 12 lines of code. Along the way, we’ll talk about training and testing data. Then, we’ll work on our intuition for what it means to “learn” from data.
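
For reference, a minimal sketch of the kind of pipeline the episode builds is below; the iris dataset and the decision tree classifier are assumptions standing in for the episode's exact code.

    # A minimal sketch of the pipeline described above. The iris dataset and
    # the decision tree classifier are assumptions, not the episode's exact code.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Features use a capital X (a 2-D matrix); labels use a lower-case y (a 1-D vector).
    X, y = load_iris(return_X_y=True)

    # Hold out half of the data for testing.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

    # Train the classifier on the training set only.
    classifier = DecisionTreeClassifier()
    classifier.fit(X_train, y_train)

    # Predict labels for the unseen test examples and score the result.
    predictions = classifier.predict(X_test)
    print(accuracy_score(y_test, predictions))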

Check out TensorFlow Playground: http://goo.gl/cv7Dq5

New! Follow @random_forests for updates on new episodes.

Subscribe to the Google Developers channel: http://goo.gl/mQyv5L
Subscribe to the brand new Firebase Channel: https://goo.gl/9giPHG
And here’s our playlist: https://goo.gl/KewA03


Comment List

  • Google Developers
    December 6, 2020

    Why capital X but lower-case y?

  • Google Developers
    December 6, 2020

    Good stuff!!! I love the boring stuff!!! Nice teacher, nice channel, and a nice topic as well.

    Wow!!!

  • Google Developers
    December 6, 2020

    Why is the title so misleading? I was looking for a pipeline tutorial.

  • Google Developers
    December 6, 2020

    But where's the pipeline?!

  • Google Developers
    December 6, 2020

    Hi there, is anyone else also getting accuracy lower than 50%?

  • Google Developers
    December 6, 2020

    Is there any place to find the code? Thanks.

  • Google Developers
    December 6, 2020

    Thanks, but I prefer R for these tasks.

  • Google Developers
    December 6, 2020

    And where is the pipeline? I did not get it.

  • Google Developers
    December 6, 2020

    Wonderful ML series! Can someone recommend another one, please?

  • Google Developers
    December 6, 2020

    2:08 correction: by default (when test_size is not specified by the user), 75% of the data goes into the training set and 25% into the test set.

  • Google Developers
    December 6, 2020

    This has nothing to do with pipelines

  • Google Developers
    December 6, 2020

    Hello

  • Google Developers
    December 6, 2020

    Thanks

  • Google Developers
    December 6, 2020

    He did not even import the pipeline library, let alone use it.
    The title of the video is misleading.

  • Google Developers
    December 6, 2020

    In Python 3, print predictions should be print(predictions).

  • Google Developers
    December 6, 2020

    Did I miss something? Where is the pipeline?

  • Google Developers
    December 6, 2020

    I thought this video was about Pipeline in sklearn. WTF?

  • Google Developers
    December 6, 2020

    I am getting this error while doing the accuracy check: accuracy_score() missing 1 required positional argument: 'y_pred'.

    Could someone help me sort it out?
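
    That error typically means accuracy_score was called with only one argument; it takes the true labels first and the predicted labels second. A minimal sketch with made-up labels:

        from sklearn.metrics import accuracy_score

        y_true = [0, 1, 1, 0]       # true labels (e.g. your y_test)
        y_pred = [0, 1, 0, 0]       # model output (e.g. your predictions)
        print(accuracy_score(y_true, y_pred))  # prints 0.75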

  • Google Developers
    December 6, 2020

    Thanks, excellent series, good work 🙂

  • Google Developers
    December 6, 2020

    The Android 5.0 questions are definitely spam.

  • Google Developers
    December 6, 2020

    Extremely helpful and simple (y) (y)

  • Google Developers
    December 6, 2020

    from sklearn.model_selection import train_test_split

  • Google Developers
    December 6, 2020

    The content is great, but it needs an update.

  • Google Developers
    December 6, 2020

    Great, great videos!!

  • Google Developers
    December 6, 2020

    Are you an ML?

  • Google Developers
    December 6, 2020

    Why is "X" capital and not "y"?

  • Google Developers
    December 6, 2020

    Josh, you are not only knowledgeable about all of this ML, but also an outstanding instructor. You simplified all these complicated methods. Can't thank you enough.

  • Google Developers
    December 6, 2020

    Here is a quick summary of the video:

    – scikit-learn has a handy function for splitting a data set into a training set and a testing set
    – it's sklearn.model_selection.train_test_split(features, labels, test_size=fraction)
    – this function returns 1) training features, 2) testing features, 3) training labels, and 4) testing labels
    – i.e. it returns a tuple of 4 elements
    – note that test_size specifies the fraction of the data you want to use for testing (it must be passed as a keyword argument)
    – so if you pass test_size=0.5, half the data is used for testing (and the other half for training, obviously)

    – recall that the .predict() method returns a list of predictions for the list of examples you pass it
    – you can use sklearn.metrics.accuracy_score(test_labels, predicted_labels) to compare the two lists of labels

    – supervised learning is also known as function approximation, because ultimately what you are doing is finding a function that matches your training examples well
    – you start with some general form of the function (e.g. y = mx + b) and then tune the parameters so that it best describes your training examples (i.e. change m and b until you get a line that best splits your data)

    Key takeaway from the video:
    Supervised learning is just function approximation. You start with a general function and then tweak its parameters based on your training examples until the function describes the training data well (a small sketch of this idea follows below).
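
    As a concrete sketch of that function-approximation idea (illustrative code, not from the video), a least-squares fit tunes m and b of y = mx + b to a few training examples:

        import numpy as np

        # Training examples that roughly follow y = 2x + 1.
        x = np.array([0.0, 1.0, 2.0, 3.0])
        y = np.array([1.1, 2.9, 5.2, 6.8])

        # "Learning" here is just tuning m and b until the line fits the data.
        m, b = np.polyfit(x, y, deg=1)  # least-squares fit of a line
        print(m, b)                     # approximately 2 and 1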
