Splitting Training and Test Data for Machine Learning Using Python and Scikit Learn tutorial





Welcome to the video series Introduction to Machine Learning with Scikit Learn and Python. This is Chapter 7, and in this chapter we will talk about how to judge the performance of a machine learning algorithm.

This video series is a Scikit Learn tutorial. In it, I talk about using Scikit Learn to implement machine learning.

Selecting a machine learning algorithm involves a catch-22: the data you have is used for training, yet you need unseen (new) data to test the algorithm, and that data only becomes available in production.

To get around this and understand the performance of the selected machine learning algorithm, we need to generate a test dataset from the available dataset.

We can do this by splitting the available dataset into a training dataset and a testing dataset. Scikit Learn provides a utility function called train_test_split that helps us achieve this goal.

This video explains how to use the train_test_split function to generate training and testing datasets.
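
As a rough illustration of the split described above, here is a minimal sketch. The toy arrays X and y, the 25% test size and the random_state value are assumptions made up for this example and are not taken from the video; train_test_split itself lives in sklearn.model_selection.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, invented for illustration: 20 samples with 2 features each
X = np.arange(40).reshape(20, 2)
# Toy labels, alternating 0 and 1
y = np.arange(20) % 2

# Hold out 25% of the rows as the test set.
# random_state is fixed only so the shuffle is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

print(X_train.shape, X_test.shape)  # (15, 2) (5, 2)
print(y_train.shape, y_test.shape)  # (15,) (5,)
```

The model would then be fitted on X_train and y_train only, and judged on X_test and y_test, which it has not seen during training.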

#python #Machinelearning #scikitlearn #ArtificialIntelligence #softwaredevelopment #programming #pandas #datascience #dataanalytics

Hi, I am Deepak K Gupta (nickname Daksh, which I prefer). This channel is for budding as well as experienced software developers who want to explore the awesome world of programming.

Subscribe to my YouTube channel here: https://bit.ly/Sub_CodesBay

Here is a brief list of the things you can find on my YouTube channel:

1. C++ programming (latest specifications, C++17 and C++20): create high-performance system applications.
2. Create microservices designed for multiple CPU cores using my Golang tutorial.
3. Create web applications as well as backend applications using my JavaScript and Node.js tutorials.
4. Create cross-platform mobile apps using my Flutter tutorial.
5. Learn Python programming, a language in demand, and effective ways of doing data science and machine learning. My Python tutorials include, but are not limited to, supervised and unsupervised learning, logistic regression and gradient descent. You will also be able to create neural networks using my PyTorch tutorial.
6. Learn source control with my Git tutorial; Git is one of the most widely used distributed version control systems. Learn how to create a branch using git branch, merge changes using git merge, check out a branch using git checkout and commit your changes using git commit.
7. Learn about persistent NoSQL databases like MongoDB using my MongoDB tutorial, as well as in-memory NoSQL databases like Redis using my Redis tutorial. You'll also learn about using Redis with Node.js.
8. Understand the concepts of handling large data using my big data tutorial and tools like Apache Spark.
9. Learn about graph theory and graph databases, and how to make use of graph databases like Neo4j.


Comment List

  • Code Sports
    December 29, 2020

    Thanks a lot for the clear explanation.

  • Code Sports
    December 29, 2020

    x_train is not defined why

  • Code Sports
    December 29, 2020

    How to split own image dataset with xtrain n ytrain

  • Code Sports
    December 29, 2020

    Come to the point dude

  • Code Sports
    December 29, 2020

    how can i split train and test dataset for speech signal (other language)?

  • Code Sports
    December 29, 2020

    Hello thank you for the video It was very clear to understand!!

    I need a little help. I've created a recommender algorithm and I don't know how to evaluate it. I've seen many people evaluate their models using predefined recommender algorithms from libraries, but not with their own algorithms. I'd appreciate it if you can help. Thank you!

  • Code Sports
    December 29, 2020

    May I have ur mail-id sir

  • Code Sports
    December 29, 2020

    When i train model then how to test with different dataset. I have 2 dataset one for train and other for test.

  • Code Sports
    December 29, 2020

    Sir, How to split the data in loan prediction datasets
    What parameters we can use?
    In train dataset the parameters are- Loan Id, Gender, Married, Dependents, Education, Self Employed, ApplicantIncome, CoapplicantIncome, LoanAmount, LoanAmountTerm, Credit History, Property Area, Loan Status

    X_train, X_test, y_train, y_test = train_test_split()

    Plz help me

  • Code Sports
    December 29, 2020

    Hello sir, can you please tell me how to evaluate final data on test dataset?

  • Code Sports
    December 29, 2020

    sir, if we have images data then?
    how can we split it
