PowerSocrata enables users to connect to and access data from the Socrata open data portals, including the Socrata Open Data API (SoDA). With PowerSocrata, users can programmatically access a wealth of open data resources from governments, non-profits, and NGOs around the world.
PowerSocrata is a series of M language functions that provide a way for Power BI Desktop and Excel 2016 users to easily access, analyze, and visualize these open data resources. PowerSocrata aims to provide the same SoDA functionality found in some of the more popular programming languages[1][2][3]. The M language is used by the Power Query user experience found in Power BI Desktop and Excel 2016.
- Create a new Blank Query
Excel Users: Data > Get & Transform Data > Get Data > From Other Sources > Blank Query
Power BI Users: Home > External Data > Get Data > Blank Query
- Open the Advanced Query Editor dialog and paste the following code in its entirety:
let
Source =
Expression.Evaluate(
Text.FromBinary(
Web.Contents(
"https://raw.githubusercontent.com/tonmcg/powersocrata/master/M/Socrata.ReadData.pq"
)
),
#shared
)
in
Source
-
Click Done.
-
Power Query will ask you to specify how to connect to this file. Click Edit Credentials.
-
In the Access Web Content dialog, choose to connect anonymously. Click Connect.
-
Rename this query to
ReadSocrata
.
This code snippet returns the first 1,000 records from the San Francisco Police Department Calls for Service dataset:
let
data = ReadSocrata("https://data.sfgov.org/resource/fjjd-jecq.json", null, null)
in
data
We can also ask the Socrata API to return a subset, summary, or specific sorting of the data by utilizing its distinctive query language, Socrata Query Language or SoQL. SoQL clauses are parameters that define the criteria by which the dataset provider will filter, summarize, or sort our desired result.
For example, the following query returns the first 100K calls since 2016 categorized as "Homeless Complaint" from the same dataset:
let
data = ReadSocrata("https://data.sfgov.org/resource/fjjd-jecq.json?$where=original_crimetype_name='Homeless+Complaint'+AND+call_dttm>'2016-01-01T00:00:00.000'", <APP TOKEN>, 100000)
in
data
In the example above, we supplied a SoQL $where
clause within the first parameter of the ReadSocrata
function, which asked the dataset provider to filter both the original_crimetype_name
and call_dttm
columns to our defined criteria. We also defined "100000" as the third parameter in the ReadSocrata
function, which further limited the results to the first 100K records.
By the way, did you notice the APP TOKEN
parameter in the function above? Any PowerSocrata query that returns more than 1,000 records requires the use of a unique Socrata Open Data API application token (app token). For more information on obtaining an app token, consult the Application Tokens page on the Socrata API Developer site.
How do we use the app token? We supply it to our query in one of two ways:
- As the second parameter in the
ReadSocrata
function like we did above - As a
$$app_token
parameter within your request URL string, as shown below
In this example, we supply our app token within the SoQL $$app_token
clause in the URL string. This should return the same dataset as above:
let
data = ReadSocrata("https://data.sfgov.org/resource/fjjd-jecq.json?$where=original_crimetype_name='Homeless+Complaint'+AND+call_dttm>'2016-01-01T00:00:00.000'&$$app_token=<APP TOKEN>", null, 1000000)
in
data
- Open Data Network
- Socrata Open Data API Developer Documentation
- Power Query M Documentation
- Power Query M Language Specification
- Power BI Developer Center
1 See the R client here.
2 See the Python client here.
3 See the Ruby client here. ↩