Skip to content

Latest commit

 

History

History
35 lines (28 loc) · 3.4 KB

README.md

File metadata and controls

35 lines (28 loc) · 3.4 KB

JSON Processing in the Database

Getting, processing, and reshaping JSON data using PostgreSQL 9.4

This repository contains materials for my talk at the Data Wranglers DC meetup on January 7, 2015.

I've also given prior talks on SQL to the same group:

The Talk

The talk consists of two major pieces:

  • Introductions to the following (via slide deck and presentation):
  • A few examples (via PostgreSQL 9.4 flavored SQL code and demonstration):
    • Some demonstration code that shows how to use the JSON operators and JSON processing functions in PostgreSQL 9.4. Data used in the demo code was generated with Mockaroo.
    • An example that pulls job opening data from the DigitalGov Jobs API and processes it into tabular data.
    • An example that pulls weather data from the OpenWeatherMap API and processes it into tabular data.

Note that the focus of this talk is on data preprocessing tasks (also called wrangling or munging), and not on analysis tasks. The goal of this work is to demonstrate some ways to use PostgreSQL's capabilities in version 9.4 (released December 18, 2014) to process JSON formatted data into tabular datasets amenable to future analysis.

Contents

Folders are as follows:

  • A slide deck (./slides) in Apple Keynote, PDF and HTML formats
  • A set of PostgreSQL 9.4 flavor SQL scripts (./code) that create the local PostgreSQL database objects demonstrating working with the json and jsonb types, including:
    • Some demonstration code that shows how to use the JSON operators and JSON processing functions in PostgreSQL 9.4. Mockaroo settings and mock data are also included.
    • An example that pulls job opening data from the DigitalGov Jobs API and processes it into tabular data.
    • An example that pulls weather data from the OpenWeatherMap API and processes it into tabular data.

Where do I start?

I recommend that anyone wishing to understand what I've done and presented here should tackle these pieces in order, starting with the slide deck. The code is best followed in the order presented: demo code, followed by Jobs API code, followed by OpenWeatherMap API code.

Disclaimer

This work and the opinions expressed here are my own, and do not purport to represent the views of my current or former employers.