Today I would like to talk about how to implement materialized views in PostgreSQL, utilizing them in a Ruby on Rails application.

PostgreSQL, known commonly as Postgres, is an multi-platform, open source SQL database system. It is one of the most popular database systems and second-most used open source database in the world. The development for Postgres is being done under the PostgreSQL Global Development Group. If you’ve never used Postgres I’d recommend trying it on your next project, it’s really cool.

The tutorial is split into three parts as follows:

  1. What is a Database View.
  2. What is a Materialized View.
  3. How to add a Materialized View to a Rails application.

What is a Database View?

A view is basically a stored set of queries against the underlying tables. Whenever a view is invoked, it’ll execute a series of pre-stored queries against the underlying data, returning the resultant data. Unlike a table, it doesn’t take physical space, but the schema of a view is stored and accessed from memory instead.

When the underlying data on which the view is constructed changes, it is reflected in subsequent invocations the view.

Some of the advantages a view provides are:
1. Data abstraction – A view can be used to abstract a set of data from the underlying tables.
2. Optimization – A view, just like any query, goes through an optimizer which improves it as much as possible based on the operation performed on the result of the view.
3. Simplicity – As mentioned above, a view can abstract away the data while at the same time simplifying querying multiple tables. A view is often comprised of one or more joins between tables, making the data of these joins look like a simple table.

Almost all the major database providers have a view feature, and it is the single way of querying certain NoSQL databases.

What is a Materialized View?

A materialized view, also referred to as a “matview”, is a form of database view that was first introduced in Oracle. It has since been adapted by many major databases.

From Wikipedia,

A materialized view is a database object that contains the results of a query.

Basically, a materialized view is similar to a database view except it is stored physically on disk and updated manually. A matview stores the results of a query into a table-like structure of its own, from which the data can be queried. It is not possible to add or delete rows, but the rest of the time it behaves just like an actual table.

Also, when the data behind a materialize view changes, the matview must be manually refreshed before the new data is included in the stored matview. This is both a positive and a negative. Storing the results of the query that makes the materialized view allows it to work like a table. Queries are faster, and the matview itself can be indexed. However, since the data that comprises the results is basically a snapshot, it can be stale and you have to know when to refresh the data.

At the time of this writing, matviews are natively available in Oracle DB, PostgreSQL, Sybase, IBM DB2, and Microsoft SQL Server. MySQL doesn’t provide native support for matviews, unfortunately, but there are open source alternatives to it.

Matviews in PostgreSQL

Materialized views were introduced in Postgres version 9.3. In version 9.4 an option to refresh the matview concurrently (meaning, without locking the view) was introduced.

The syntax for creating a matview in Postgres in similar to creating a table:

CREATE MATERIALIZED VIEW ex_matview AS SELECT col1, col2 FROM mytable WHERE col3 = condition;

Adding a Materialized View to a Rails app

For our simple example, I’ve created a Rails app and generated 1M rows in the database used by the application. The code is available on Github. The app has three tables called channel, item, and sales that mimic the sales data for a retailer. The schema for the sample app is:

schema

Let’s say we have a dashboard page that provides sales information for the past year, along with the following details:

1) Amount of sales a channel has done on a per-item basis, we can query it like so:

Sale.joins(:channel).joins(:item).where('channels.name = ?', 'web').group('items.name').sum(:amount)

2) Amount of sales each channel has done overall:

Sale.joins(:channel).group('channels.name').sum(:amount)

3) Amount of sales on per day basis:

Sale.group("DATE_TRUNC('day', sold_date)").sum(:amount)

These queries aren’t complicated, but they are expensive, even after many optimizations. Also, the data will likely grow more and more. This is a good candidate for a materialized view, since the data needn’t be real time and we’re abstracting only a subset of data.

To get started, let’s create a migration:

rails g migration CreateSalesMatview

Add the following lines to the migration file:

I am just picking up the necessary data that are needed for the reports.

After saving the file, run bin/rake db:migrate and voila! the matview is created. The changes for this migration won’t be available in the schema file, by the way, since it’s not a table.

Creating a Model

One awesome thing by using a materialized view with Rails/ActiveRecord is we can create models for it, just like any other table in the schema. The model in our case will be in a filed called sales_matview.rb. As such, the model name is SalesMatview. You’re welcome to use any name you want, but it’s advised to follow the naming conventions of Rails.

Let’s create a file called app/models/sales_matview.rb, and add the following lines to it:

You’ll notice that the model is bound to the underlying matview using self.table_name. Also, we’ve created two methods:

  • readonly? which locks the model from writing data making, it a read only table.
  • self.refresh which will be called to refresh the materialized view when needed.

We’re all set to use the materialized view we’ve just created. You can query it exactly how you would in a normal ActiveRecord query.

Now, let’s look at our previous queries:

  1. Amount of sales a channel has done on per item basis:

    SalesMatview.where('channel = ?', 'web').group(:item).sum(:amount)
  2. Amount of sales each channel has done overall:

    SalesMatview.group(:channel).sum(:amount)
  3. Amount of sales on a per day basis:

    SalesMatview.group(:date).sum(:amount)

The queries are similar, except without those joins, but you’ll notice a big difference on how these perform:

Channel per item
       user     system      total        real
AR   (251.5ms)  0.040000   0.070000   0.110000 (  0.412950)
MV   (2.3ms)    0.000000   0.010000   0.010000 (  0.012010)

Channel overall
       user     system      total        real
AR   (374.4ms)  0.000000   0.000000   0.000000 (  0.376352)
MV   (3.3ms)    0.000000   0.000000   0.000000 (  0.006813)

Sales per day
       user     system      total        real
AR   (976.4ms)  0.020000   0.000000   0.020000 (  0.990258)
MV   (6.2ms)    0.020000   0.000000   0.020000 (  0.026783)

That is a HUGE savings, to say the least! Over 100 times improved! Give it a try and see for yourself.

Refreshing the Materialized Views

Now we need to tackle refreshing the materialized view. We can create a rake task to refresh the matview to update the data regularly. Add the below line to the lib/tasks/sales.rb file and it’s done.

desc 'Update the Sales materialized view'
task :update_sales_view => :environment do
  SalesMatview.refresh
end

For larger datasets where the refreshes may take too much time, there is an option in Postgres to perform the refresh concurrently. Just change the statement in the refresh method to:

REFRESH MATERIALIZED VIEW CONCURRENTLY sales_matview

Keep in mind, a unique key will be required in the matview to perform concurrent refreshes.

Also, since the materialized view is read only if you want to change the schema it is constructed upon you have to drop the matview and recreate it. Just add the following line to your migration before everything else to drop the matview:

DROP MATERIALIZED VIEW IF EXISTS sales_matview

Conclusion

All the code that is used in the examples are available here –Github. It also has a seed file for generating 1 million rows for the mentioned schema.

I hope this has been helpful and informative for you and I thank you for reading. Let me know your suggestions in comments.

Credits to:
http://www.postgresql.org/docs/9.4/static/rules-materializedviews.html
http://en.wikipedia.org/wiki/Materialized_view

Vinoth is a Server Administrator turned Full stack web developer. He works at Real Image media technologies on movie related products. He loves to try his hands on multiple programming languages but his primary programming language of choice is Ruby. He also codes in Python and Javascript on a daily basis. In his free time you can find him coding or playing CS.

Get your free chapter of Level Up Your Web Apps with Go

Get a free chapter of Level Up Your Web Apps with Go, plus updates and exclusive offers from SitePoint.


  • http://careersreport.com mai.morrison

    This is how it is possible to get eighty-five dollars /hour… After being unemployed for six-months , I started making cash over this website and today I possibly can not be more happy. After 3 months doing this my income is around $5000month -Check internet-website Learn more by visiting my profile

  • http://wrburgess.com/ Randy Burgess

    So, is there a good way to test Material Views with rspec? I’ve tried a couple of times, but the testing environment doesn’t seem to know what a Material View is.

    • https://github.com/Fire-Dragon-DoL Francesco Belladonna

      Unfortunately, testing anything related to a database is “complicated”, means the only way is to build an actual dataset and compare it to an expected resulting dataset. Couldn’t find a different way.

      • http://wrburgess.com/ Randy Burgess

        I’m used to needing to do that…my issue is that I cannot seem to get the Mat Views to be recognized as existing, even if I populate data in the test db. I think it has to do with the aspect of views that don’t represent actual tables.

        • https://github.com/Fire-Dragon-DoL Francesco Belladonna

          Are you using schema.sql instead of schema.rb?

          • http://wrburgess.com/ Randy Burgess

            nope, using schema.rb

          • Court Simas

            For our project, we pulled the sql out into a separate .sql file, and the migration runs that – but what that allowed us to do, was run those sql files individually. So we have some methods that load up the sql and create the views before the test suite is ran (rspec).

          • https://github.com/Fire-Dragon-DoL Francesco Belladonna

            You should try using an sql schema instead (the rb don’t support views, can’t work at all with them!)

        • http://about.me/tamouse Tamara Temple

          I have a thought, completely untested or tried (maybe when I get some time), to put the view table creation and initial load into an initializer.

  • http://nruth.tumblr.com Nick Rutherford

    I’d like to see more discussion of the schema dumping problem. It’s prevented me from using this so far (after seeing a few other blogs on it, and implementing a feature using them in a topic branch).

    “The changes for this migration won’t be available in the schema file, by the way, since it’s not a table.”
    This is a bigger problem than it’s presented as. Your schema is no longer the canonical representation of the database your application requires. How are you going to run tests against this now? How do you fix the database after db:schema:load, which will leave these view-backed-models complaining that their table does not exist?

    Relying on running migrations to recreate the database may work, but isn’t how the tests are run, and leads to debugging migrations if you have done any data manipulation in your migrations through your application models (which of course change later in the software’s life). It’s not the recommended usage of migrations.

    Switching to SQL schema dumping seems to be the suggested solution, and this is what I’d like to see discussed more. It warns that you will no longer be able to switch to other databases, but there seems to be more to it than that, as the rails/rake commands change, and there seems to be less automation and more room for error. That and nobody seems to be doing it.

    SchemaPlus is another option, but it hasn’t implemented Materialized Views yet, only normal ones.

Related books & courses
Rails Deep Dive

Available on SitePoint Premium. Check it out now!