Google Spreadsheets and Python

This post is inspired by Patrick McKenzie’s reminder that sometimes you don’t need a database:

In this tutorial, we’ll use the gspread Python package to read, write, and delete data from a Google Spreadsheet with just a few lines of code. 

Google Drive API and Service Accounts

At the risk of being Captain Obvious, you’re going to need a spreadsheet if you want to follow along with this post. If you don’t have one on hand that’s full of juicy data, might I suggest you make a copy of this spreadsheet with contact information for all United States legislators(Side note: Ian Webster uses this data in conjunction with Twilio to make it easy for citizens to call congress).

 

To programmatically access your spreadsheet, you’ll need to create a service account and OAuth2 credentials from the Google API Console. If you’ve been traumatized by OAuth2 development before, don’t worry; service accounts are way easier to use.

Follow along with the steps and GIF below. You’ll be in and out of the console in 60 seconds (much like Nic Cage in your favorite Nic Cage movie).

  1. Go to the Google APIs Console.
  2. Create a new project.
  3. Click Enable API. Search for and enable the Google Drive API.
  4. Create credentials for a Web Server to access Application Data.
  5. Name the service account and grant it a Project Role of Editor.
  6. Download the JSON file.
  7. Copy the JSON file to your code directory and rename it to client_secret.json

 

There is one last required step to authorize your app, and it’s easy to miss!

Find the  client_email inside client_secret.json. Back in your spreadsheet, click the Share button in the top right, and paste the client email into the People field to give it edit rights. Hit Send.

If you skip this step, you’ll get a gspread.exceptions.SpreadsheetNotFound error when you try to access the spreadsheet from Python.

We’re done with the boring part! Now onto the code.

Read Data from a Spreadsheet with Python

With credentials in place (you did copy them to your code directory, right?) accessing a Google Spreadsheet in Python requires just two packages:

  1. oauth2client – to authorize with the Google Drive API using OAuth 2.0
  2. gspread – to interact with Google Spreadsheets

Install these packages with:

Then paste this code into a new file called spreadsheet.py:

Run python spreadsheet.py and marvel at the glorious, well-formatted data.

 

Insert, Update, and Delete from a Spreadsheet with Python

We’ve just scratched the surface of gspreads’ well documented and comprehensive functionality. 

For instance, we extracted the data into a list of hashes, but you can get a list of lists if you’d prefer:

Or you could just pull the data from a single row, column, or cell:

You can write to the spreadsheet by changing a specific cell:

Or you can insert a row in the spreadsheet:

You can also delete a row from the spreadsheet:

And find out the total number of rows:

Check the gspread API reference for the full details on these functions along with a few dozen others. 

Using Google Spreadsheets with Python opens possibilities like building Flask apps with a spreadsheet as the persistence layer, or importing a data from a Google spreadsheet into Jupyter Notebooks and doing analysis in Pandas. But, I’m sure you’ve got better ideas than that. 

If you build something cool, please let me know. You can find me at gb@twilio.com or @greggyb. And if this post was helpful, please share it with someone else who might dig it. 

  • Jed

    Keep an eye on https://github.com/nithinmurali/pygsheets which supports v4 of the sheets API. Might already help speed up inserts into large sheets, for example.

  • pizzapanther

    The big downside I’ve found using this a lot in the past is that Google has no SLA for sheets. Sure its Google so it has high availability but things can quite often fail with no support from Google. So just be careful of building anything mission critical with this. The API can be wonky every once in a while. At one point it even took them like 6 months to fix an issue I had with the Apps Script API. If my business relied on that functionality, I would have been screwed.

    The Jupyter Notebook use case sounds great but if you move to something that is more critical use a real data store.

    • ZoubIWah

      same. you need to handle a lot of random error conditions in your scripts because google can go from basically cutting you off (“connection lost”) to random 500 errors and even 404 for the exact same request.

      And of course you then need to pool until the API works again.

      In addition to that some of the queries are extremely slow.

      Basically it feels like a 3rd class citizen at best. Now then again I fully use it for years and do not intend to stop.