For more information on how to access this dataset via the Socrata Open Data API, read on. Additional information on query functionality, code samples, and libraries can be accessed using the navigation above.
Taxi trips reported to the City of Chicago in its role as a regulatory agency. To protect privacy but allow for aggregate analyses, the Taxi ID is consistent for any given taxi medallion number but does not show the number, Census Tracts are suppressed in some cases, and times are rounded to the nearest 15 minutes.
Due to the data reporting process, not all trips are reported but the City believes that most are.
See http://digital.cityofchicago.org/index.php/chicago-taxi-data-released for more information about this dataset and how it was created.
See http://dev.cityofchicago.org/open%20data/data%20portal/2019/04/12/tnp-taxi-privacy.html for further discussion of the approach to privacy in this dataset.
See http://dev.cityofchicago.org/open%20data/data%20portal/2017/10/24/taxi-data-behind.html for a note on data freshness.
All communication with the API is done through HTTPS, and errors are communicated through HTTP response codes. Available response types include JSON, XML, and CSV, which are selectable by the "extension" (.json, etc.) on the API endpoint or through content negotiation with the HTTP Accept header.
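As a rough illustration of the two ways to pick a response type, the sketch below builds (but does not send) two requests with Python's standard library. The helper names by_extension and by_accept_header are made up for this example; the endpoint is the one used in the samples further down.

```python
from urllib.request import Request

# Endpoint without an extension, as used in the PowerShell sample below.
BASE = "https://data.cityofchicago.org/resource/wrvz-psew"


def by_extension(fmt: str) -> Request:
    """Select the response type with an extension on the endpoint (hypothetical helper)."""
    return Request(f"{BASE}.{fmt}")


def by_accept_header(mime_type: str) -> Request:
    """Select the response type through the HTTP Accept header (hypothetical helper)."""
    return Request(BASE, headers={"Accept": mime_type})


csv_request = by_extension("csv")
json_request = by_accept_header("application/json")

# Nothing is sent here; pass either object to urllib.request.urlopen to run it.
print(csv_request.full_url)
print(json_request.get_header("Accept"))
```

Either request should return the same rows; only the serialization differs.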
This documentation also includes inline, runnable examples. Click on any link that contains a gear symbol next to it to run that example live against the Taxi Trips API. If you just want to grab the API endpoint and go, you'll find it below.
All requests should include an app token that identifies your application, and each application should have its own unique app token. A limited number of requests can be made without an app token, but they are subject to much lower throttling limits than requests that do include one. With an app token, your application is guaranteed access to its own pool of requests. If you don't have an app token yet, click the button to the right to sign up for one.
Once you have an app token, you can include it with your request either by using the X-App-Token HTTP header, or by passing it via the $$app_token parameter on your URL.
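The two ways of passing the token might look like the following in Python. This is a minimal sketch using only the standard library; the helper names are hypothetical and YOURAPPTOKENHERE is the same placeholder used in the samples below. No request is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"
APP_TOKEN = "YOURAPPTOKENHERE"  # placeholder, replace with your own token


def with_token_header(token: str) -> Request:
    """Send the token in the X-App-Token HTTP header (hypothetical helper)."""
    return Request(ENDPOINT, headers={"X-App-Token": token})


def with_token_param(token: str) -> Request:
    """Send the token in the $$app_token URL parameter (hypothetical helper)."""
    # urlencode percent-encodes "$" as "%24", as the Stata sample below also does.
    return Request(f"{ENDPOINT}?{urlencode({'$$app_token': token})}")


header_request = with_token_header(APP_TOKEN)
param_request = with_token_param(APP_TOKEN)

# urllib stores header names with normalized capitalization ("X-app-token");
# HTTP header names are case-insensitive, so this is equivalent on the wire.
print(header_request.header_items())
print(param_request.full_url)
```

The header form keeps the token out of URLs that may end up in logs, which is one reason to prefer it when your HTTP client makes headers easy to set.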
Each column in the dataset is represented by a single field in its SODA API. Using filters and SoQL queries, you can search for records, limit your results, and change the way the data is output. For example, you could filter this dataset by its trip_id field by adding a trip_id parameter, set to the value you want to match, to the endpoint URL.
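A simple equality filter on trip_id can be sketched as a URL built in Python. The filter_url helper and the EXAMPLE_TRIP_ID value are placeholders invented for this illustration, not a real trip identifier.

```python
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def filter_url(field: str, value: str) -> str:
    """Build a URL that keeps only rows where `field` equals `value` (hypothetical helper)."""
    return f"{ENDPOINT}?{urlencode({field: value})}"


# EXAMPLE_TRIP_ID is a placeholder; substitute a trip_id taken from the dataset.
url = filter_url("trip_id", "EXAMPLE_TRIP_ID")
print(url)
```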
For richer filtering, you can combine filters together by stacking parameters on your URL or by using SoQL queries. Learn more about each of the fields in this dataset by clicking the headers below, or read more about the SODA API using the navigation at the top of the page.
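Both styles of combining filters can be sketched as URL builders. The helper names are hypothetical, and the filter values ("Cash", community area 8, the mileage and fare thresholds) are assumptions chosen only to show the shape of the query.

```python
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def stacked_filters(**filters) -> str:
    """Stack several equality filters as separate URL parameters (hypothetical helper)."""
    return f"{ENDPOINT}?{urlencode(filters)}"


def soql_query(where: str, limit: int) -> str:
    """Express the filter as a SoQL $where clause plus a $limit (hypothetical helper)."""
    params = {"$where": where, "$limit": limit}
    # safe="$" keeps the leading "$" of SoQL parameter names readable in the URL.
    return f"{ENDPOINT}?{urlencode(params, safe='$')}"


# Assumed example values; check the dataset for the values that actually occur.
stacked_url = stacked_filters(payment_type="Cash", pickup_community_area=8)
soql_url = soql_query("trip_miles > 10 AND fare < 50", limit=100)

print(stacked_url)
print(soql_url)
```

Stacked parameters only express equality, so comparisons such as "greater than" need the $where form.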
trip_id A unique identifier for the trip.
The trip_id column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
taxi_id A unique identifier for the taxi.
The taxi_id column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
trip_start_timestamp When the trip started, rounded to the nearest 15 minutes.
The trip_start_timestamp column is of the floating_timestamp datatype.
For more details on what query options you have for this column, see the detailed docs on the floating_timestamp datatype.
trip_end_timestamp When the trip ended, rounded to the nearest 15 minutes.
The trip_end_timestamp column is of the floating_timestamp datatype.
For more details on what query options you have for this column, see the detailed docs on the floating_timestamp datatype.
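Floating timestamps carry no time zone and are compared as ISO 8601 strings in SoQL. The sketch below builds a $where clause selecting trips that started within a half-open time range; the helper name and the dates are made up for illustration.

```python
from datetime import datetime
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def started_between(start: datetime, end: datetime) -> str:
    """Build a SoQL clause for start <= trip_start_timestamp < end (hypothetical helper)."""
    # Naive datetimes format as e.g. 2019-01-01T00:00:00, with no time zone suffix.
    return (
        f"trip_start_timestamp >= '{start.isoformat()}' "
        f"AND trip_start_timestamp < '{end.isoformat()}'"
    )


# Illustrative range: all of 1 January 2019.
clause = started_between(datetime(2019, 1, 1), datetime(2019, 1, 2))
url = f"{ENDPOINT}?{urlencode({'$where': clause})}"

print(clause)
print(url)
```

Because times are rounded to the nearest 15 minutes, range boundaries that fall between quarter hours select the same rows as the next quarter-hour boundary.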
trip_seconds Time of the trip in seconds.
The trip_seconds column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
trip_miles Distance of the trip in miles.
The trip_miles column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
pickup_census_tract The Census Tract where the trip began. For privacy, this Census Tract is not shown for some trips. This column often will be blank for locations outside Chicago.
The pickup_census_tract column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
dropoff_census_tract The Census Tract where the trip ended. For privacy, this Census Tract is not shown for some trips. This column often will be blank for locations outside Chicago.
The dropoff_census_tract column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
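Since Census Tracts are suppressed or blank for some trips, client code should not assume the field is present. The sketch below assumes rows arrive as dictionaries (as in the JSON output) and that a blank tract shows up either as a missing key or as an empty string; the sample rows are invented for illustration.

```python
def split_by_pickup_tract(rows):
    """Separate rows with a pickup Census Tract from rows without one (hypothetical helper)."""
    with_tract = [row for row in rows if row.get("pickup_census_tract")]
    without_tract = [row for row in rows if not row.get("pickup_census_tract")]
    return with_tract, without_tract


# Made-up rows: one with a tract, one with the key absent, one with an empty value.
sample_rows = [
    {"trip_id": "a", "pickup_census_tract": "17031839100"},
    {"trip_id": "b"},
    {"trip_id": "c", "pickup_census_tract": ""},
]

with_tract, without_tract = split_by_pickup_tract(sample_rows)
print(len(with_tract), "rows with a tract;", len(without_tract), "rows without")
```

The same pattern applies to dropoff_census_tract and to the community area and centroid columns, which are also blank for some locations.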
pickup_community_area The Community Area where the trip began. This column will be blank for locations outside Chicago.
The pickup_community_area column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
dropoff_community_area The Community Area where the trip ended. This column will be blank for locations outside Chicago.
The dropoff_community_area column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
fare The fare for the trip.
The fare column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
tips The tip for the trip. Cash tips generally will not be recorded.
The tips column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
tolls The tolls for the trip.
The tolls column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
extras Extra charges for the trip.
The extras column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
trip_total Total cost of the trip, the total of the previous columns.
The trip_total column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
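To make the relationship between the cost columns concrete, the sketch below recomputes a total from one row. It assumes "the previous columns" means fare, tips, tolls, and extras, that values arrive as strings, and that a missing column counts as zero; the sample row is invented.

```python
from decimal import Decimal

# Assumed reading of "the previous columns" for trip_total.
COMPONENT_COLUMNS = ("fare", "tips", "tolls", "extras")


def components_total(row: dict) -> Decimal:
    """Sum the cost components of one row, treating missing values as 0."""
    return sum((Decimal(row.get(col, "0")) for col in COMPONENT_COLUMNS), Decimal("0"))


# Made-up row, with numbers represented as strings.
sample_row = {
    "fare": "10.25",
    "tips": "2.00",
    "tolls": "0",
    "extras": "1.50",
    "trip_total": "13.75",
}

computed = components_total(sample_row)
print(computed, computed == Decimal(sample_row["trip_total"]))
```

Decimal is used rather than float so that currency amounts add up exactly.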
payment_type Type of payment for the trip.
The payment_type column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
company The taxi company.
The company column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
pickup_centroid_latitude The latitude of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The pickup_centroid_latitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
pickup_centroid_longitude The longitude of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The pickup_centroid_longitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
pickup_centroid_location The location of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The pickup_centroid_location column is of the point datatype.
For more details on what query options you have for this column, see the detailed docs on the point datatype.
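Point columns can be filtered geographically with the SoQL within_circle function, which takes the column, a latitude, a longitude, and a radius in meters. The sketch below builds such a clause; the helper name is hypothetical and the coordinates are an approximate downtown Chicago location chosen only for illustration.

```python
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def pickups_within(lat: float, lon: float, radius_m: int) -> str:
    """Build a SoQL clause for pickup centroids within radius_m meters of (lat, lon)."""
    return f"within_circle(pickup_centroid_location, {lat}, {lon}, {radius_m})"


# Illustrative coordinates only, not taken from the dataset.
clause = pickups_within(41.88, -87.63, 1000)
url = f"{ENDPOINT}?{urlencode({'$where': clause, '$limit': 100})}"

print(clause)
print(url)
```

Keep in mind that the point is the centroid of a census tract or community area, not the actual pickup location, so a small radius can exclude trips that began nearby.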
dropoff_centroid_latitude The latitude of the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The dropoff_centroid_latitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
dropoff_centroid_longitude The longitude of the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The dropoff_centroid_longitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
dropoff_centroid_location The location of the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The dropoff_centroid_location column is of the point datatype.
For more details on what query options you have for this column, see the detailed docs on the point datatype.
The following are grab-and-go code samples you can use with popular programming languages and data science tools.
jQuery makes it super simple to fetch and parse JSON from an API endpoint.
$.ajax({
  url: "https://data.cityofchicago.org/resource/wrvz-psew.json",
  type: "GET",
  data: {
    "$limit" : 5000,
    "$$app_token" : "YOURAPPTOKENHERE"
  }
}).done(function(data) {
  alert("Retrieved " + data.length + " records from the dataset!");
  console.log(data);
});
Python package using Pandas to easily work with JSON data
#!/usr/bin/env python
# make sure to install these packages before running:
# pip install pandas
# pip install sodapy
import pandas as pd
from sodapy import Socrata
# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.cityofchicago.org", None)
# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.cityofchicago.org",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")
# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("wrvz-psew", limit=2000)
# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
PowerShell to extract data from SODA
$url = "https://data.cityofchicago.org/resource/wrvz-psew"
$apptoken = "YOURAPPTOKENHERE"
# Set header to accept JSON
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Accept","application/json")
$headers.Add("X-App-Token",$apptoken)
$results = Invoke-RestMethod -Uri $url -Method get -Headers $headers
The City of Chicago and the community maintain a great RSocrata package on GitHub.
## Install the required package with:
## install.packages("RSocrata")
library("RSocrata")
df <- read.socrata(
  "https://data.cityofchicago.org/resource/wrvz-psew.json",
  app_token = "YOURAPPTOKENHERE",
  email     = "user@example.com",
  password  = "fakepassword"
)
SAS is a tried and true application suite for data analysis and visualization. The following snippet brings Socrata data into SAS.
filename datain url 'http://data.cityofchicago.org/resource/wrvz-psew.csv?$limit=5000&$$app_token=YOURAPPTOKENHERE';
proc import datafile=datain out=dataout dbms=csv replace;
getnames=yes;
run;
The soda-ruby gem is a simple wrapper around the SODA APIs that makes usage with Ruby more natural.
#!/usr/bin/env ruby
require 'soda/client'
client = SODA::Client.new({:domain => "data.cityofchicago.org", :app_token => "YOURAPPTOKENHERE"})
results = client.get("wrvz-psew", :$limit => 5000)
puts "Got #{results.count} results. Dumping first results:"
results.first.each do |key, value|
  puts "#{key}: #{value}"
end
SODA.NET is a Socrata Open Data API client library for .NET
using System;
using System.Collections.Generic;
using System.Linq;
// Install the package from NuGet first:
// PM> Install-Package CSM.SodaDotNet
using SODA;
var client = new SodaClient("https://data.cityofchicago.org", "YOURAPPTOKENHERE");
// Get a reference to the resource itself
// The result (a Resource object) is a generic type
// The type parameter represents the underlying rows of the resource
// and can be any JSON-serializable class
var dataset = client.GetResource<Dictionary<string, object>>("wrvz-psew");
// Resource objects read their own data
var rows = dataset.GetRows(limit: 5000);
Console.WriteLine("Got {0} results. Dumping first results:", rows.Count());
foreach (var keyValue in rows.First())
{
    Console.WriteLine(keyValue);
}
Copy and paste the following to import this dataset into Stata
clear
. import delimited "https://data.cityofchicago.org/resource/wrvz-psew.csv?%24limit=5000&%24%24app_token=YOURAPPTOKENHERE"