The purpose of this section is to help students who are completely new to data science get warmed up and ready to go. We will cover various topics including how to set up your computer to do data science, using GitHub and Kaggle to share your work, using the command line in your OS, etc. Are you ready?

Introduction to SQL

This is an introduction to the Structured Query Language (SQL), a domain-specific language designed for managing data held in a relational database management system (RDBMS). It is particularly useful for handling structured data, i.e. data that incorporates relations among entities and variables. SQL is a powerful tool for creating, updating, deleting, and requesting information from databases. It is an essential skill for any data scientist because relational databases are one of the most important data sources in any data science process.
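To make the four operations mentioned above a little more concrete, here is a minimal sketch that runs SQL from Python using the built-in sqlite3 module. The students table, its columns, and the sample rows are made up for illustration only, and SQLite is just one convenient RDBMS to practice with; other database systems may differ in small details of syntax.

```python
import sqlite3

# Use an in-memory SQLite database so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a (hypothetical) table and add a few rows.
cur.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, score REAL)")
cur.executemany(
    "INSERT INTO students (name, score) VALUES (?, ?)",
    [("Ada", 91.0), ("Grace", 88.5), ("Alan", 79.0)],
)

# Update: change one student's score.
cur.execute("UPDATE students SET score = ? WHERE name = ?", (82.0, "Alan"))

# Delete: remove one student.
cur.execute("DELETE FROM students WHERE name = ?", ("Grace",))

# Request: query the remaining rows, highest score first.
cur.execute("SELECT name, score FROM students ORDER BY score DESC")
rows = cur.fetchall()
print(rows)

conn.close()
```

The question marks are placeholders that sqlite3 fills in from the tuple passed alongside the statement, which is generally safer than building SQL strings by hand.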

Introduction to Python

Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects. Python is arguably the most important skill for a data scientist to master, since many of the most popular data science libraries are written for Python. In this module, you will learn the basics of Python programming and build a solid foundation for later modules, where you will learn all the cool things about the Python data science packages.
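As a small illustration of what "significant whitespace" means, the sketch below uses indentation alone to mark which lines belong to the function, the loop, and each branch. The function name describe and the sample input are hypothetical and chosen only for this example.

```python
def describe(numbers):
    """Label each number as even or odd."""
    labels = []
    for n in numbers:
        # These indented lines form the body of the for loop.
        if n % 2 == 0:
            labels.append(f"{n} is even")
        else:
            labels.append(f"{n} is odd")
    # Returning to this indentation level ends the loop body.
    return labels


result = describe([1, 2, 3])
print(result)
```

There are no braces or end keywords here. Moving the return line one level deeper would place it inside the loop and change what the function does.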

Intermediate Python

Once you are familiar with the basic concepts of Python programming, it is time to take your skills to the next level! As a data scientist, mastering intermediate-level Python is extremely beneficial, since it allows you to work on more complicated problems and better leverage the power of Python.

Introduction to NumPy

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy provides Python with a powerful array processing library and an elegant syntax that is well suited to expressing computational algorithms clearly and efficiently. We'll introduce basic array syntax and array indexing, review some of the available mathematical functions in NumPy, and discuss how to write your own routines. Along the way, we'll learn just enough about Matplotlib to visualize results from our examples.
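Here is a short sketch of the kind of array syntax, indexing, and mathematical functions this module covers. It assumes NumPy is installed, and the names used (grid, normalize, and so on) are hypothetical, picked only for this example.

```python
import numpy as np

# Build a 3x4 array holding the integers 0 through 11.
grid = np.arange(12).reshape(3, 4)

# Basic indexing and slicing.
first_row = grid[0]          # the first row
last_column = grid[:, -1]    # the last entry of every row

# Boolean indexing: keep only the even values.
evens = grid[grid % 2 == 0]

# Built-in mathematical functions operate on whole arrays at once.
column_means = grid.mean(axis=0)
roots = np.sqrt(grid)


def normalize(a):
    """A small routine of our own: rescale values to the range [0, 1]."""
    return (a - a.min()) / (a.max() - a.min())


scaled = normalize(grid)
print(last_column, column_means, scaled.max())
```

Note that none of these operations needs an explicit Python loop; expressing the computation on whole arrays is what keeps NumPy code both short and fast.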

Learn the Basics of Machine Learning

Machine learning, the field of computer science that gives computer systems the ability to learn from data, is one of the hottest topics in data science. Machine learning is transforming the world: from spam filtering in social networks to computer vision for self-driving cars, the potential applications of machine learning are vast. This section covers the foundational machine learning concepts and tools that will help you advance in your career. Whether you’re trying to analyze a dataset using machine learning, or you’re a data analyst trying to upgrade your skills, this is the best place to start.

Deep Learning for Beginners

Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example. Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign, or to distinguish a pedestrian from a lamppost. It is the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. Deep learning is getting lots of attention lately and for good reason. It’s achieving results that were not possible before.

Natural Language Processing

Natural Language Processing (NLP) is a field of Artificial Intelligence that gives machines the ability to read, understand, and derive meaning from human languages. It is a discipline that focuses on the interaction between data science and human language, and it is spreading to many industries. Today NLP is booming thanks to huge improvements in access to data and increases in computational power, which allow practitioners to achieve meaningful results in areas like healthcare, media, finance, and human resources, among others.

[Optional] Recommended Reading – Introduction to SQL: Mastering the Relational Database Language

If you feel like diving deeper into the world of relational databases and SQL, here is a book you might want to consider. Please note that this book is recommended but completely optional; it is not a required part of our data science curriculum.

Introduction to SQL: Mastering the Relational Database Language, 4th Edition

SQL was, is and always will be the database language for relational database systems such as Oracle, DB2, MySQL, and Microsoft SQL Server. Introduction to SQL describes in depth the full capacity of SQL as it is implemented by all the major commercial databases, without neglecting the most recent changes to the standard. Unique in the extent of its coverage, this book takes you from the beginning to the end of SQL, the concepts to the practice, the apprentice to the master. You don’t just learn how to do something, but you learn why SQL works the way it does. Van der Lans illustrates each aspect of SQL with many practical examples, and provides exercises in each chapter to make sure you have grasped the concepts. Previous editions have been well received, and have sold well. This edition brings it up to date for the most recent SQL standard, as well as with the most recent versions of Microsoft SQL Server, Oracle, DB2, and MySQL. The author is an insider on the SQL standard committee.