Python Tools for Management Research

1b: Polars II

Jason T. Kiley

Polars II

“Wow, the data assembly was so much of the analytical work” - Me, as a doc student.

This segment is about combining component datasets into something closer to analysis-ready data.

Today

Aggregate

Summarize article-level data to firm-year.

Join

Add identifiers, stock prices, and article summaries to firm-year data.

Check

Inspect row counts, duplicate keys, missing matches, and anti joins.

Polars-native move

When we have record-level data that need to become firm-year data, we often:

  1. summarize the component data once;
  2. join the summary onto the base data;
  3. check whether the join behaved.

That is usually clearer, faster, and easier to audit than repeating the same query one row at a time.

Hands-on

Open notebooks/1b_polars.ipynb.