For more information on how to access this dataset via the Socrata Open Data API, read on. Additional information on query functionality, code samples, and libraries can be accessed using the navigation above.
Taxi trips reported to the City of Chicago in its role as a regulatory agency. To protect privacy but allow for aggregate analyses, the Taxi ID is consistent for any given taxi medallion number but does not show the number, Census Tracts are suppressed in some cases, and times are rounded to the nearest 15 minutes.
Due to the data reporting process, not all trips are reported but the City believes that most are.
See http://digital.cityofchicago.org/index.php/chicago-taxi-data-released for more information about this dataset and how it was created.
See http://dev.cityofchicago.org/open%20data/data%20portal/2019/04/12/tnp-taxi-privacy.html for further discussion of the approach to privacy in this dataset.
See http://dev.cityofchicago.org/open%20data/data%20portal/2017/10/24/taxi-data-behind.html for a note on data freshness.
All communication with the API is done through HTTPS, and errors are communicated through HTTP response codes. Available response types include JSON, XML, and CSV, which are selectable by the "extension" (.json, etc.) on the API endpoint or through content negotiation with HTTP Accept headers.
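The two ways of selecting a response format described above can be sketched in Python with the standard library alone. The helper name build_request is an assumption made for this illustration, and the endpoint is the one used by the code samples on this page. Only the JSON and CSV media types are mapped, because the page does not state the media type used for XML. The requests are constructed but not sent.

```python
from urllib.request import Request

# Endpoint used by the code samples on this page, without an extension.
BASE_URL = "https://data.cityofchicago.org/resource/wrvz-psew"

# Media types for content negotiation (XML omitted; see the note above).
MEDIA_TYPES = {
    "json": "application/json",
    "csv": "text/csv",
}


def build_request(fmt, use_extension=True):
    """Return a Request asking for `fmt` via URL extension or Accept header."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unsupported format: {fmt}")
    if use_extension:
        # Option 1: pick the format with the extension on the endpoint.
        return Request(f"{BASE_URL}.{fmt}")
    # Option 2: pick the format with an HTTP Accept header.
    return Request(BASE_URL, headers={"Accept": MEDIA_TYPES[fmt]})


csv_request = build_request("csv")
json_request = build_request("json", use_extension=False)
print(csv_request.full_url)
print(json_request.get_header("Accept"))
```

Passing either object to urllib.request.urlopen would perform the actual call.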
This documentation also includes inline, runnable examples. Click on any link that has a gear symbol next to it to run that example live against the Taxi Trips API. If you just want to grab the API endpoint and go, you'll find it below.
All requests should include an app token that identifies your application, and each application should have its own unique app token. A limited number of requests can be made without an app token, but they are subject to much lower throttling limits than requests that do include one. With an app token, your application is guaranteed access to its own pool of requests. If you don't have an app token yet, click the button to the right to sign up for one.
Once you have an app token, you can include it with your request either by using the X-App-Token HTTP header, or by passing it via the $$app_token parameter on your URL.
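As a rough sketch of the two options, the Python below attaches a token either as the X-App-Token header or as the $$app_token URL parameter. The function names are assumptions made for this example, the token is the same placeholder used by the samples on this page, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"
APP_TOKEN = "YOURAPPTOKENHERE"  # placeholder, replace with your own token


def request_with_header(token):
    """Attach the token with the X-App-Token HTTP header."""
    return Request(ENDPOINT, headers={"X-App-Token": token})


def request_with_parameter(token):
    """Attach the token with the $$app_token URL parameter."""
    # urlencode percent-encodes each "$" as %24.
    query = urlencode({"$$app_token": token})
    return Request(f"{ENDPOINT}?{query}")


print(request_with_header(APP_TOKEN).header_items())
print(request_with_parameter(APP_TOKEN).full_url)
```

The header form keeps the token out of the URL, which can be convenient when URLs end up in logs.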
Each column in the dataset is represented by a single field in its SODA API. Using filters and SoQL queries, you can search for records, limit your results, and change the way the data is output. For example, you could filter this dataset by its trip_id field by adding a trip_id parameter to the endpoint URL.
For richer filtering, you can combine filters by stacking parameters on your URL or by using SoQL queries. Learn more about each of the fields in this dataset by clicking the headers below, or read more about the SODA API using the navigation at the top of the page.
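The following Python sketch shows both styles of filtering: stacked field=value parameters and a SoQL $where clause. The helper names and the filter values are hypothetical and serve only to show the shape of the resulting URLs.

```python
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def simple_filter_url(**filters):
    """Stack equality filters as plain URL parameters (field=value)."""
    return f"{ENDPOINT}?{urlencode(filters)}"


def soql_url(where, limit=100):
    """Express a richer filter with the SoQL $where and $limit parameters."""
    return f"{ENDPOINT}?{urlencode({'$where': where, '$limit': limit})}"


# Hypothetical values, chosen only for illustration.
print(simple_filter_url(payment_type="Cash", company="Example Cab Co"))
print(soql_url("trip_miles > 10 AND fare < 50", limit=10))
```

Stacked parameters are combined with a logical AND, so the first URL asks for trips matching both values.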
trip_id
A unique identifier for the trip.
The trip_id column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
taxi_id
A unique identifier for the taxi.
The taxi_id column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
trip_start_timestamp
When the trip started, rounded to the nearest 15 minutes.
The trip_start_timestamp column is of the floating_timestamp datatype.
For more details on what query options you have for this column, see the detailed docs on the floating_timestamp datatype.
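As an illustration of working with a floating_timestamp column, the sketch below builds a SoQL $where range over trip_start_timestamp and parses a returned value. It assumes values are ISO 8601 strings without a time zone; the helper names and the sample value are hypothetical.

```python
from datetime import datetime
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def trips_between_url(start, end, limit=100):
    """Build a URL for trips whose start time falls in [start, end)."""
    where = (
        f"trip_start_timestamp >= '{start.isoformat()}' "
        f"AND trip_start_timestamp < '{end.isoformat()}'"
    )
    return f"{ENDPOINT}?{urlencode({'$where': where, '$limit': limit})}"


def parse_floating_timestamp(value):
    """Parse a floating timestamp string into a naive datetime."""
    return datetime.fromisoformat(value)


# Hypothetical sample value in the assumed ISO 8601 shape.
started = parse_floating_timestamp("2019-01-01T00:15:00.000")
# Times in this dataset are rounded to the nearest 15 minutes.
assert started.minute % 15 == 0

print(trips_between_url(datetime(2019, 1, 1), datetime(2019, 2, 1), limit=5))
```

Because the timestamps carry no time zone, the parsed values are naive datetime objects.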
trip_end_timestamp
When the trip ended, rounded to the nearest 15 minutes.
The trip_end_timestamp column is of the floating_timestamp datatype.
For more details on what query options you have for this column, see the detailed docs on the floating_timestamp datatype.
trip_seconds
Time of the trip in seconds.
The trip_seconds column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
trip_miles
Distance of the trip in miles.
The trip_miles column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
pickup_census_tract
The Census Tract where the trip began. For privacy, this Census Tract is not shown for some trips. This column often will be blank for locations outside Chicago.
The pickup_census_tract column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
dropoff_census_tract
The Census Tract where the trip ended. For privacy, this Census Tract is not shown for some trips. This column often will be blank for locations outside Chicago.
The dropoff_census_tract column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
pickup_community_area
The Community Area where the trip began. This column will be blank for locations outside Chicago.
The pickup_community_area column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
dropoff_community_area
The Community Area where the trip ended. This column will be blank for locations outside Chicago.
The dropoff_community_area column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
fare
The fare for the trip.
The fare column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
tips
The tip for the trip. Cash tips generally will not be recorded.
The tips column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
tolls
The tolls for the trip.
The tolls column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
extras
Extra charges for the trip.
The extras column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
trip_total
Total cost of the trip: the sum of the previous columns (fare, tips, tolls, and extras).
The trip_total column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
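The relationship between trip_total and the four preceding cost columns can be checked with a few lines of Python. This is only a sketch: the record is hypothetical, its numeric values are written as strings (an assumption about how the JSON response is read), and treating a missing component as zero is a choice made for this example.

```python
from decimal import Decimal

# The cost columns that precede trip_total in this dataset.
COMPONENTS = ("fare", "tips", "tolls", "extras")


def total_matches(record):
    """Return True if trip_total equals the sum of the component columns."""
    # Missing components count as zero (an assumption of this sketch).
    parts = sum(Decimal(record.get(name, "0")) for name in COMPONENTS)
    return parts == Decimal(record["trip_total"])


# Hypothetical record used only for illustration.
sample = {
    "fare": "10.25",
    "tips": "2.00",
    "tolls": "0",
    "extras": "1.50",
    "trip_total": "13.75",
}
print(total_matches(sample))
```

Decimal is used instead of float so that currency amounts compare exactly.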
payment_type
Type of payment for the trip.
The payment_type column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
company
The taxi company.
The company column is of the text datatype.
For more details on what query options you have for this column, see the detailed docs on the text datatype.
pickup_centroid_latitude
The latitude of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The pickup_centroid_latitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
pickup_centroid_longitude
The longitude of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The pickup_centroid_longitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
pickup_centroid_location
The location of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The pickup_centroid_location column is of the point datatype.
For more details on what query options you have for this column, see the detailed docs on the point datatype.
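For a point column such as pickup_centroid_location, a geographic filter can be sketched with the SoQL within_circle function, which takes a latitude, a longitude, and a radius in meters. The helper names and coordinates below are hypothetical, and the second helper assumes the point is returned as a GeoJSON Point object.

```python
from urllib.parse import urlencode

ENDPOINT = "https://data.cityofchicago.org/resource/wrvz-psew.json"


def pickups_near_url(latitude, longitude, radius_m, limit=100):
    """Build a URL for trips picked up within radius_m meters of a point."""
    where = (
        "within_circle(pickup_centroid_location, "
        f"{latitude}, {longitude}, {radius_m})"
    )
    return f"{ENDPOINT}?{urlencode({'$where': where, '$limit': limit})}"


def lat_lon(point):
    """Extract (latitude, longitude) from a GeoJSON Point value."""
    # GeoJSON stores coordinates in longitude, latitude order.
    longitude, latitude = point["coordinates"]
    return latitude, longitude


# Hypothetical coordinates, chosen only for illustration.
print(pickups_near_url(41.88, -87.63, 500, limit=10))
print(lat_lon({"type": "Point", "coordinates": [-87.63, 41.88]}))
```

Because the centroid is that of a census tract or community area, the filter selects areas near the point rather than exact pickup positions.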
dropoff_centroid_latitude
The latitude of the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The dropoff_centroid_latitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
dropoff_centroid_longitude
The longitude of the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The dropoff_centroid_longitude column is of the number datatype.
For more details on what query options you have for this column, see the detailed docs on the number datatype.
dropoff_centroid_location
The location of the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. This column often will be blank for locations outside Chicago.
The dropoff_centroid_location column is of the point datatype.
For more details on what query options you have for this column, see the detailed docs on the point datatype.
The following are grab-and-go code samples you can use with popular programming languages and data science tools.
jQuery makes it super simple to fetch and parse JSON from an API endpoint.
$.ajax({
    url: "https://data.cityofchicago.org/resource/wrvz-psew.json",
    type: "GET",
    data: {
        "$limit" : 5000,
        "$$app_token" : "YOURAPPTOKENHERE"
    }
}).done(function(data) {
    alert("Retrieved " + data.length + " records from the dataset!");
    console.log(data);
});
The sodapy Python package, combined with pandas, makes it easy to work with JSON data.
#!/usr/bin/env python
# make sure to install these packages before running:
# pip install pandas
# pip install sodapy
import pandas as pd
from sodapy import Socrata
# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.cityofchicago.org", None)
# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.cityofchicago.org",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")
# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("wrvz-psew", limit=2000)
# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
PowerShell to extract data from SODA
$url = "https://data.cityofchicago.org/resource/wrvz-psew"
$apptoken = "YOURAPPTOKENHERE"
# Set header to accept JSON
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Accept","application/json")
$headers.Add("X-App-Token",$apptoken)
$results = Invoke-RestMethod -Uri $url -Method get -Headers $headers
The City of Chicago and its community maintain a great RSocrata package on GitHub.
## Install the required package with:
## install.packages("RSocrata")
library("RSocrata")
df <- read.socrata(
  "https://data.cityofchicago.org/resource/wrvz-psew.json",
  app_token = "YOURAPPTOKENHERE",
  email     = "user@example.com",
  password  = "fakepassword"
)
SAS is a tried-and-true application suite for data analysis and visualization. The following snippet brings Socrata data into SAS.
filename datain url 'http://data.cityofchicago.org/resource/wrvz-psew.csv?$limit=5000&$$app_token=YOURAPPTOKENHERE';
proc import datafile=datain out=dataout dbms=csv replace;
getnames=yes;
run;
The soda-ruby gem is a simple wrapper around the SODA APIs that makes usage with Ruby more natural.
#!/usr/bin/env ruby
require 'soda/client'
client = SODA::Client.new({:domain => "data.cityofchicago.org", :app_token => "YOURAPPTOKENHERE"})
results = client.get("wrvz-psew", :$limit => 5000)
puts "Got #{results.count} results. Dumping first results:"
results.first.each do |k, v|
  puts "#{k}: #{v}"
end
SODA.NET is a Socrata Open Data API client library for .NET.
using System;
using System.Collections.Generic;
using System.Linq;
// Install the package from Nuget first:
// PM> Install-Package CSM.SodaDotNet
using SODA;
var client = new SodaClient("https://data.cityofchicago.org", "YOURAPPTOKENHERE");
// Get a reference to the resource itself
// The result (a Resource object) is a generic type
// The type parameter represents the underlying rows of the resource
// and can be any JSON-serializable class
var dataset = client.GetResource<Dictionary<string, object>>("wrvz-psew");
// Resource objects read their own data
var rows = dataset.GetRows(limit: 5000);
Console.WriteLine("Got {0} results. Dumping first results:", rows.Count());
foreach (var keyValue in rows.First())
{
    Console.WriteLine(keyValue);
}
Copy and paste the following to import this dataset into Stata
clear
. import delimited "https://data.cityofchicago.org/resource/wrvz-psew.csv?%24limit=5000&%24%24app_token=YOURAPPTOKENHERE"