The purpose of this section is to help students who are completely new to data science get warmed up and ready to go. We will cover various topics including how to set up your computer to do data science, using GitHub and Kaggle to share your work, using the command line in your OS, etc. Are you ready?
Introduction to SQL
This is an introduction to the Structured Query Language (SQL), a domain-specific language designed for managing data held in a relational database management system (RDBMS). It is particularly useful for handling structured data, i.e. data that incorporates relations among entities and variables. SQL lets you create, update, delete, and query information in databases. It is an essential skill for data scientists because relational databases are among the most common data sources in data science work.
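To make the four operations mentioned above concrete, here is a minimal sketch that runs SQL from Python using the standard-library sqlite3 module and an in-memory database. The table name, column names, and sample rows are made up for illustration, and a production RDBMS may differ from SQLite in some details.

```python
import sqlite3


def run_demo():
    """Create, update, delete, and query rows in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    cur = conn.cursor()

    # Create: define a table and insert a few hypothetical rows.
    cur.execute(
        "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, score REAL)"
    )
    cur.executemany(
        "INSERT INTO students (name, score) VALUES (?, ?)",
        [("Ada", 91.0), ("Grace", 85.5), ("Alan", 78.0)],
    )

    # Update: change one student's score.
    cur.execute("UPDATE students SET score = ? WHERE name = ?", (88.0, "Grace"))

    # Delete: remove every row with a score below 80.
    cur.execute("DELETE FROM students WHERE score < ?", (80.0,))
    conn.commit()

    # Query: read back what is left, highest score first.
    rows = cur.execute(
        "SELECT name, score FROM students ORDER BY score DESC"
    ).fetchall()
    conn.close()
    return rows


rows = run_demo()
print(rows)  # [('Ada', 91.0), ('Grace', 88.0)]
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid quoting mistakes when values come from outside the program.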
Introduction to Python
Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects. Python is one of the most important skills for a data scientist to master, since many of the most popular data science libraries are written for Python. In this module, you will learn the basics of Python programming and build a solid foundation for later modules, where you will explore the Python data science packages.
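As a small illustration of significant whitespace, the sketch below uses indentation alone to mark which statements belong to the function and to the if branch. The function name classify is hypothetical and chosen only for this example.

```python
def classify(n):
    """Return "even" or "odd" for an integer n."""
    # The indented lines below form the body of the function.
    if n % 2 == 0:
        # One more level of indentation: this line runs only when the test is true.
        return "even"
    # Back at the function's indentation level: reached when the test is false.
    return "odd"


labels = [classify(n) for n in range(4)]
print(labels)  # ['even', 'odd', 'even', 'odd']
```

There are no braces or end keywords; changing the indentation changes which block a line belongs to.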
Once you are familiar with the basic concepts of Python programming, it is time to take your skills to the next level! As a data scientist, mastering intermediate-level Python is extremely beneficial, since it allows you to work on more complicated problems and better leverage the power of Python.
Introduction to NumPy
NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy provides Python with a powerful array processing library and an elegant syntax that is well suited to expressing computational algorithms clearly and efficiently. We'll introduce basic array syntax and array indexing, review some of the available mathematical functions in NumPy, and discuss how to write your own routines. Along the way, we'll learn just enough about Matplotlib to visualize results from our examples.
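As a preview of the topics listed above, here is a minimal sketch of array creation, indexing, a few built-in mathematical functions, and a small routine of our own. It assumes NumPy is installed; the helper name rescale and the sample array are made up for illustration.

```python
import numpy as np

# A 3x4 array holding the integers 0..11, filled row by row.
a = np.arange(12).reshape(3, 4)

# Basic indexing and slicing.
first_row = a[0]        # the row [0, 1, 2, 3]
last_col = a[:, -1]     # the last column: [3, 7, 11]
evens = a[a % 2 == 0]   # boolean mask: only the even entries

# Built-in mathematical functions operate on whole arrays at once.
col_means = a.mean(axis=0)  # mean of each column: [4., 5., 6., 7.]
roots = np.sqrt(a)          # element-wise square root


def rescale(x):
    """Hypothetical routine: linearly rescale values to the range [0, 1].

    Assumes x contains at least two distinct values; a constant array
    would lead to a division by zero.
    """
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())


scaled = rescale(a)
print(last_col, col_means, scaled.min(), scaled.max())
```

Note that rescale contains no explicit loop: the arithmetic is applied to every element of the array, which is the style NumPy encourages.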
Introduction to Pandas
The pandas library is a data scientist's best friend and arguably one of the most essential tools in a data science toolbox. Its name is derived from ‘panel data’, an econometrics term, and is also a play on ‘Python data analysis’. It is an open-source Python library that lets users explore, manipulate, and visualize tabular data efficiently. It is often compared to Microsoft Excel, but because it is driven by code, analyses are repeatable and can handle datasets far larger than a spreadsheet comfortably holds.
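To give a feel for exploring and manipulating data with pandas, here is a minimal sketch that builds a small table, filters it, aggregates it, and adds a derived column. It assumes pandas is installed; the column names and figures are invented for illustration.

```python
import pandas as pd

# A small hypothetical table: one row per city and year.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Lima", "Lima"],
        "year": [2022, 2023, 2022, 2023],
        "sales": [100, 120, 80, 95],
    }
)

# Explore: keep only the rows for 2023.
recent = df[df["year"] == 2023]

# Manipulate: total sales per city, similar to a spreadsheet pivot table.
totals = df.groupby("city")["sales"].sum()

# Derive a new column: each row's share of overall sales.
df["share"] = df["sales"] / df["sales"].sum()

print(recent)
print(totals)
```

Each step is a single expression on whole columns, so the same script can be rerun unchanged when the underlying data is updated.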
Learn the Basics of Machine Learning
Machine learning, the field of computer science that gives computer systems the ability to learn from data, is one of the hottest topics in data science. Machine learning is transforming the world: from spam filtering in social networks to computer vision for self-driving cars, the potential applications of machine learning are vast.
This section covers the foundational machine learning concepts and tools that will help you advance in your career. Whether you’re trying to analyze a dataset using machine learning, or you’re a data analyst trying to upgrade your skills, this is a good place to start.
Deep Learning for Beginners
Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example. Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign, or to distinguish a pedestrian from a lamppost. It is the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. Deep learning is getting lots of attention lately and for good reason. It’s achieving results that were not possible before.
Natural Language Processing
Natural Language Processing, or NLP, is a field of Artificial Intelligence that gives machines the ability to read, understand, and derive meaning from human languages. It is a discipline that focuses on the interaction between data science and human language, and it is being adopted across many industries. Today NLP is booming thanks to huge improvements in access to data and increases in computational power, which allow practitioners to achieve meaningful results in areas like healthcare, media, finance, and human resources, among others.