Python Tools for Management Research

0a: Python, Notebooks, and the Course Workflow

Jason T. Kiley

The key

Python and the tools we use let us turn our workflows into research evidence.

We are learning a carefully curated subset of Python so that we can use modern research tools to get reproducible results, verifiable procedures, and shareable artifacts.

Course path

Day 0

Python foundations

Notebooks, Markdown, basic syntax, and the first working habits.

Day 1

Polars

Read, inspect, transform, and save research data.

Day 2

Projects

Git, GitHub, Codespaces, and environment management.

Day 3

Quarto outputs

Documents, manuscripts, slides, and code-generated results.

Why Python?

  • Approachability: Python is a well-designed modern language that handles a lot for us.
  • Features: many useful tools are already built, so we often glue pieces together.
  • Learning resources: wide use gives us extensive documentation, examples, and help.
  • Scalability: similar workflows can move from a laptop, to the cloud, to a computing cluster.

What do you really need to know?

Basics

Core focus: Python interpreter, notebooks, variables, strings, numbers, lists, dictionaries, packages, and documentation.
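As a rough preview of what those basics look like together, here is a minimal sketch. The variable names and values are made up for illustration and are not taken from the course notebooks.

```python
# A string and a number stored in variables (illustrative values only)
course = "Python Tools for Management Research"
days = 4

# A list keeps items in order
topics = ["Python foundations", "Polars", "Projects", "Quarto outputs"]

# A dictionary maps keys to values
schedule = {
    "Day 0": "Python foundations",
    "Day 1": "Polars",
    "Day 2": "Projects",
    "Day 3": "Quarto outputs",
}

# Look things up by position (lists) or by key (dictionaries)
first_topic = topics[0]
day_one = schedule["Day 1"]

print(f"{course} runs {days} days; Day 1 covers {day_one}.")
```

Lists answer "what is in position n?" and dictionaries answer "what goes with this key?". Most of what we do later builds on those two questions.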

Data Preparation

Core focus: reading data, inspecting structure, transforming columns, merging, querying, and writing results.
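The read, inspect, transform, query, and write steps can be sketched in a few lines. The course does this with Polars on Day 1; the sketch below deliberately swaps in the standard library's csv module so it runs anywhere, and the firm data is invented for illustration.

```python
import csv
import io

# Hypothetical data; in practice this would be a file on disk
raw_text = "firm,year,revenue\nAcme,2021,120\nAcme,2022,150\nBolt,2022,90\n"

# Read: each row becomes a dictionary keyed by column name
rows = list(csv.DictReader(io.StringIO(raw_text)))

# Inspect: what columns do we have, and how many rows?
columns = list(rows[0].keys())
n_rows = len(rows)

# Transform: CSV values arrive as text, so convert the numeric columns
for row in rows:
    row["year"] = int(row["year"])
    row["revenue"] = float(row["revenue"])

# Query: keep only the 2022 observations
rows_2022 = [row for row in rows if row["year"] == 2022]

# Write: save the result (to an in-memory buffer here, a file in practice)
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=columns)
writer.writeheader()
writer.writerows(rows_2022)
result_text = out.getvalue()

print(columns, n_rows, len(rows_2022))
```

A dataframe library makes each of these steps shorter and faster, but the shape of the workflow stays the same.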

Good-enough Programming

Supporting focus: functions, loops, reusable patterns, readable code, and handling errors without panic.
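A small sketch of what "good enough" can mean in practice: one short function, one loop, and one error handled on purpose. The function name parse_revenue and the sample values are hypothetical.

```python
def parse_revenue(raw):
    """Convert text like '1,250' to a float; return None if it cannot be parsed."""
    try:
        return float(raw.replace(",", ""))
    except ValueError:
        # A value like 'n/a' is not a number, so we record it as missing
        return None


raw_values = ["1,250", "980", "n/a"]

# Reuse the same function for every value instead of copy-pasting the logic
cleaned = []
for value in raw_values:
    cleaned.append(parse_revenue(value))

print(cleaned)
```

The function is readable, reusable, and fails in a way we chose. For most research code, that is enough.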

Software Engineering

Warning sign: useful, but lower ROI for most research workflows.

Examples: classes and inheritance, package development, testing and continuous integration, cross-version support, open-source contributions.

Compared with familiar tools

Excel

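  • Can silently corrupt data, such as long numeric identifiers like Twitter IDs, which lose digits beyond Excel's 15 significant digits of numeric precision.
  • Slows considerably even with fairly small datasets.
  • Leaves no reliable audit trail of changes.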
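The identifier problem can be approximated in Python. This is a sketch, not a reproduction of Excel's behavior: it uses a Python float as a stand-in for a spreadsheet's numeric cell, and the ID is a made-up 19-digit number.

```python
# Hypothetical 19-digit identifier, similar in length to a Twitter ID
tweet_id = 1234567890123456789

# Storing it as a floating-point number loses the low-order digits
as_float = float(tweet_id)
round_trip = int(as_float)
id_preserved_as_number = round_trip == tweet_id

# Storing it as text keeps every digit
as_text = str(tweet_id)
id_preserved_as_text = int(as_text) == tweet_id

print("original:      ", tweet_id)
print("after float:   ", round_trip)
print("kept as number?", id_preserved_as_number)
print("kept as text?  ", id_preserved_as_text)
```

The practical habit is to treat identifiers as text, not as numbers, from the moment they are read in.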

Stats packages and R

  • Stata (for example) is great at modeling and not much else.
  • R is also great at modeling, though less versatile than Python.
  • Python shines as general-purpose glue around data, automation, and outputs.

Why environments matter

Packages are tools. Research projects need the right tools, in compatible versions, kept together with the project.

We use devcontainers and environments as recipes that we can ship with the project, so we can reproduce the computing environment later, on another computer, or for a coauthor.
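To make "compatible versions" concrete, here is a minimal sketch that reports which Python and package versions are active right now. It does not replace a devcontainer or environment file; the helper name describe_environment and the package names passed to it are assumptions chosen for illustration.

```python
import platform
from importlib import metadata


def describe_environment(package_names):
    """Return a dictionary of the Python version and selected package versions."""
    info = {"python": platform.python_version()}
    for name in package_names:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment
            info[name] = "not installed"
    return info


# Hypothetical package list; swap in whatever the project depends on
report = describe_environment(["polars", "pip"])
for name, version in report.items():
    print(f"{name}: {version}")
```

If two computers print different reports, results may differ for reasons that have nothing to do with the research. An environment recipe shipped with the project is what keeps those reports the same.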

The course environment

GitHub Codespaces gives everyone the same working environment in the browser.

The devcontainer defines the Python version, Quarto, VS Code extensions, and course packages as project infrastructure rather than one-off setup steps.

That matters for research because things change: a coauthor starts writing code or verifying results, you move to a different or new computer, the operating system gets upgraded, and so on.

Why notebooks?

What they hold

  • Reference
  • Narrative
  • Procedure
  • Output
  • Good-enough sharing

Why that helps

The explanation, code, and results live near each other. That makes the notebook useful for planning, decision-making, doing, outputting, and sharing.

Errors are information

Programming languages are very literal. They attempt to evaluate exactly what we give them.

The best part is that they do exactly what we say. The worst part is that they do exactly what we say.

NameError: name 'polarz' is not defined

It’s fine. They make us resolve ambiguity, and they tell us exactly what is wrong. We can read the error, fix it, and move on. And repeat. A lot.
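The error above can be reproduced on purpose. In this sketch, polarz is a deliberate misspelling that is never defined, and the try/except is only there so we can capture and read the message instead of stopping.

```python
try:
    # 'polarz' was never imported or defined, so Python cannot find the name
    polarz.DataFrame()
except NameError as err:
    message = str(err)

print(message)
```

The message names the exact thing Python could not find. Reading it closely is usually the fastest way to the fix, which in this case is correcting the spelling and importing the package.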

Hands-on

Open notebooks/0a_intro.ipynb.