QTM 151 - Introduction to Statistical Computing II

Lecture 01 - Introduction and Course Overview

Danilo Freire

Department of Quantitative Theory and Methods
Emory University

Welcome to QTM 151 - Introduction to Statistical Computing II! 🥳🎉

Lecture Overview

  • Introduction: Hello everyone! 🙋🏻
  • Motivation: Why learn Python and SQL?
  • Class Logistics: Course objectives, grading, and late policy
  • Computing Set up: Anaconda, Jupyter, VSCode

Course Materials

Course repository: https://github.com/danilofreire/qtm151-summer

Course website: https://danilofreire.github.io/qtm151-summer

  • This course is hosted on GitHub, where you will find lecture materials, code samples, our discussion space, assignments, and final project instructions.
  • Canvas will be used for course management, including assignment submissions, grades, and announcements. Please familiarise yourself with both platforms, and reach out if you have any questions.

Note

Please remember to check the course website regularly for updates and announcements 😉

Nice to meet you! 👋

Instructor

A bit about me

Visiting Assistant Professor in the Department of Quantitative Theory and Methods (QTM)

MA from the Graduate Institute Geneva, PhD from King’s College London, Postdoc at Brown University, Senior Lecturer at the University of Lincoln, UK.

Research interests: computational social science, experimental methods, policy evaluation, political violence, organised crime.

My teaching philosophy

What you can expect from me


  • I love teaching and aim to make learning fun! 😄
  • Classes where students participate are the best
  • Hands-on activities help you learn better
  • I am always available to help and answer questions. And I mean it!
  • Feel free to ask questions during class, office hours, or via email 😊

Office hours: What for and what not for


  • What office hours are meant for:
    • Applying tools in practice
    • Discussion of issues related to the assignments
    • Boosting your knowledge of data science
  • What these sessions are not meant for:
    • Solving the assignments for you
    • Replacing the practice you need to develop your own coding skills

Class etiquette

  • Coding can be tough and push you out of your comfort zone. If the course pace is too fast, let us know. I expect your commitment, but I do not want anyone to fail
  • You are all keen on data science, but your backgrounds vary. That is great! Some sessions might be more engaging than others. If you are bored, help others or explore new data science areas
  • Always be respectful to each other
  • Ask questions whenever you need to!

Why Python and SQL? 🐍🗄️

Why Python

Great community and easy to learn

There are thousands of Python user groups worldwide

The Python community is very active and welcoming!

  • Java:
public class Welcome {
    public static void main(String[] args) {
        System.out.println("Welcome to QTM151!");
    }
}
  • Python:
print("Welcome to QTM151!")

Why Python

Salaries are good!

Why SQL

  • SQL is the standard language for relational database management systems
  • Easy to learn
  • Standardised
  • Manage huge amounts of data
  • Widely used in industry
  • Great for data analysis

Why SQL

Salaries are good too!

Course Logistics 📚

Course Objectives

  1. Perform basic operations and write functions in Python
  2. Conduct data wrangling and manipulate data using Python libraries such as Pandas
  3. Merge and manage databases using SQL
  4. Create visualisations to effectively communicate data insights
  5. Implement linear models and understand the principles of time series analysis
  6. Use Jupyter Notebooks for reproducible research
  7. Develop problem-solving skills relevant to data analysis and statistical computing

Grades and Late Policy

  • Assignments (x5): 50%
    • Practice class concepts
  • Quizzes (x3): 50%
    • Questions are based on the lecture notes and assignments
    • Quizzes are open-book and open-notes (including the internet)
  • All materials are available on the course website and GitHub repository
  • Late assignments will automatically be graded for half-credit
  • Watch out for the assignments that ask you to install software. You will need this software to use the lecture notes

Computing Set Up 🖥️

Our Class in a Nutshell

Installing Python using Anaconda

Anaconda has (almost) all the libraries we need


  • Follow the instructions on our GitHub website
  • We are using Anaconda virtual environments for this class (I will cover this in more detail soon)
  • For now: Anaconda comes with a full Python installation with everything you need to follow along with the course
  • Questions?

Installing VSCode and Connecting Anaconda

  • Follow the instructions in “Installing Visual Studio Code and Connecting it with Anaconda” on GitHub
  • For now: know that “base” is the Anaconda virtual environment that comes by default with the installation
  • The next step is to check if the connection between VSCode and Anaconda worked
    • In VSCode, click on “View” and then “Command Palette”
    • Type “Python: Select Interpreter” and select the Anaconda environment you just installed
    • I recommend using the “base” environment for this course
  • Next step: we will create a new folder for the QTM151 course, download our course folder from GitHub, and open it in VSCode
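
One quick way to confirm which interpreter VSCode is actually using is to run a few lines of Python. This is only a sketch: the describe_interpreter helper is a name made up for this example, and the exact paths will differ from computer to computer. If the Anaconda environment was selected, the printed path will typically (though not always) contain a folder such as anaconda3 or miniconda3.

```python
import sys


def describe_interpreter():
    """Collect basic facts about the Python interpreter running this code."""
    return {
        "executable": sys.executable,       # full path to the Python program
        "version": sys.version.split()[0],  # version number only, e.g. 3.x.y
        "prefix": sys.prefix,               # root folder of the environment
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the path points somewhere unexpected, go back to “Python: Select Interpreter” and choose the Anaconda environment again.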

Git and GitHub 🐙

Git and GitHub

  • Git is a version control system that allows you to track changes in your code and collaborate with others
  • Imagine something like Google Docs plus Microsoft Word track changes, but for code (and much more powerful!)
  • GitHub is a web-based platform that uses Git for version control and collaboration
  • GitHub is a great way to share your code with others and collaborate on projects
  • GitHub is also a social network for developers. You can follow other developers, star their projects, and contribute to open-source projects
  • Create a free student account on GitHub here: https://github.com/education/students

Jupyter Notebooks 📘

Jupyter Notebooks

  • Jupyter Notebooks are a great way to combine code, text, and visualisations!
  • If you have used R Markdown, you will find Jupyter Notebooks very similar
  • They are widely used in data science and machine learning, and are a great way to share your work with others
  • Please install the Jupyter extension for VSCode

  • If you are using Anaconda, Jupyter Notebooks should be installed by default. If not, you can install them using the Anaconda Navigator or the command line
conda install jupyter
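
To check whether the installation worked, the sketch below asks Python if it can find a few packages without importing them. The check_packages helper and the list of package names are assumptions chosen for illustration (jupyter_core and ipykernel are components that a Jupyter installation normally provides, and pandas is a library we will use later); adjust the list to whatever the course instructions ask for.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


# Package names are illustrative assumptions, not an official course list
status = check_packages(["jupyter_core", "ipykernel", "pandas"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Anything reported as missing can usually be added with conda install followed by the package name.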

Jupyter Notebooks

  • We will use Jupyter Notebooks a lot for our classes!
  • Quizzes and assignments will be in this format
  • Our website has a tutorial on how to use them too
  • There are both Jupyter Notebooks and lecture notes for each class in the course repository
  • So you can choose which one you prefer to use, or use both!
  • Lecture notes are designed to be followed along, and there will be many “try it yourself” exercises throughout the lectures!
  • In case you have any trouble with the installation, you can also use the online version of Jupyter Notebooks available on our website: https://danilofreire.github.io/qtm151-summer/
  • Just click on “Jupyter Lite” and it will open a Jupyter Notebook in your browser

Next Class

  • Let me know if you have any questions about the course, the material, or how to set up your computer
  • We will start with the basics of Jupyter Notebooks
  • We will also cover the basics of Anaconda and VSCode
  • Please remember to check the course repository and the website at https://github.com/danilofreire/qtm151-summer and https://danilofreire.github.io/qtm151-summer
  • And please do not forget:
    • Coding ability can be developed
    • Academic skills and abilities are acquired through hard work, mistakes, and perseverance. Coding is no different
    • My only goal here is that you learn the material. Please ask me questions! 😊

Questions?

Thank you very much for your attention! 😃 🥳