Using a reverse proxy to integrate WordPress into Rails

Sometimes you need to get two separate applications to play nicely together under the same domain.

If one of these applications is especially greedy about the routes/URLs it wants to control, this can become a tricky project. Using subdomains can often help, as the subdomain can explicitly address the other application. However, if you need content from both applications under the same domain, and you don’t have access to the server’s routing or don’t want to fumble with it (e.g. you use Heroku), then a different approach might be required.

A reverse proxy sounds complicated but it’s a very simple idea: a browser requests something from your server (Server 1). When the request arrives, instead of serving the resource directly, Server 1 makes a second request to Server 2. Once Server 1 receives the data from Server 2, it returns it to the browser. From the perspective of the browser, this second request is invisible: it looks as though the content was living on Server 1 all along.
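The flow above can be sketched in a few lines of plain Ruby. This is only an illustration, not the code used later in this post: the class name TinyReverseProxy, the host server2.example and the injectable fetcher argument are all made up for the example. The object follows the Rack calling convention (an env hash in, a status/headers/body triple out) but does not need the rack gem to run.

```ruby
require 'net/http'
require 'uri'

# "Server 1": receives a request and answers it with content from "Server 2".
class TinyReverseProxy
  # upstream_base: origin of Server 2 (hypothetical, for illustration).
  # fetcher: anything that responds to call(uri) and returns the body as a
  # String. It defaults to a real HTTP GET and can be swapped out in tests.
  def initialize(upstream_base, fetcher: ->(uri) { Net::HTTP.get(uri) })
    @upstream_base = upstream_base
    @fetcher = fetcher
  end

  # Rack-style entry point: env hash in, [status, headers, body] out.
  def call(env)
    # The second request: same path, different server.
    upstream_uri = URI.join(@upstream_base, env['PATH_INFO'])
    body = @fetcher.call(upstream_uri)

    # Hand Server 2's content back as if it were our own.
    [200, { 'content-type' => 'text/html' }, [body]]
  end
end
```

The browser only ever talks to the object’s call method. Which server actually produced the body is hidden behind the fetcher.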

This approach works well for websites which run on Ruby on Rails, but want to use a third-party hosted WordPress instance to serve and manage their blog. A subdomain is probably off the cards because of the SEO benefits of having the blog under the apex domain. We want the blog to live at mydomain.com/blog.

Most recently, I tried using rails-reverse-proxy to solve this problem. It looked simple enough, but I got bogged down with endless OpenSSL::SSL::SSLError - SSL_connect returned=1 errno=0 state=error: sslv3 alert handshake failure errors in development and production. However, when I made the requests manually via cURL or the openssl command line, there was no issue. I started modifying the gem code to debug headers, cipher suites and parameters and then stopped… it shouldn’t be this hard!

So I added some routes to catch WordPress page requests. Fortunately, these can all be nested under one folder (e.g. /blog), making routing much simpler.

Here are the new routes in routes.rb:

  # Intercept requests to WordPress pages
  match 'blog' => 'wordpress#index', via: [:get, :post, :put, :patch, :delete]
  match 'blog/*path' => 'wordpress#index', via: [:get, :post, :put, :patch, :delete]

And here is the controller to handle the requests:

# This action catches any request to blog/* and reverse proxies it.
# Essentially, it fetches the page HTML from the WordPress instance
# and serves the resulting HTML as if it were coming from this server.
#
require 'net/http'

class WordpressController < ApplicationController
  WP_URL = 'mywordpressblog.dream.press'.freeze
  COMPANY_URL = 'mydomain.com'.freeze

  def index
    if params['path'].present?
      request_url = "https://#{WP_URL}/blog/#{params['path']}/" # Trailing slash required by WP
    else
      request_url = "https://#{WP_URL}/blog/"
    end

    # Make the request to the WordPress blog. Net::HTTP.get always issues a
    # GET, whatever HTTP method the incoming request used.
    blog_html = Net::HTTP.get(URI(request_url))
    render html: rewrite_links(blog_html)
  end

  private

  # Update all page links to our official blog URL so that users don't
  # end up back on the original WordPress URL
  def rewrite_links(html)
    parsed_html = html.gsub("#{WP_URL}/blog", "#{COMPANY_URL}/blog")
    parsed_html.html_safe
  end
end
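One thing to be aware of: Net::HTTP.get returns only the response body, so a 404 page from WordPress would be rendered by Rails with a 200 status. A possible refinement, sketched here as an assumption rather than something we ran in production, is to use Net::HTTP.get_response and pass the upstream status through. The helper name fetch_upstream and its http: argument are hypothetical. The argument exists only so the method can be exercised without a network.

```ruby
require 'net/http'
require 'uri'

# Fetch a page from the upstream blog and return [status, body].
# `http` is anything that responds to get_response(uri); it defaults to
# Net::HTTP and can be replaced with a stub in tests.
def fetch_upstream(url, http: Net::HTTP)
  response = http.get_response(URI(url))
  [response.code.to_i, response.body.to_s]
end

# In the controller action this might be used as:
#   status, blog_html = fetch_upstream(request_url)
#   render html: rewrite_links(blog_html), status: status
```

Note that get_response does not follow redirects, so a 301 or 302 from WordPress would be passed on with its status code but without its Location header unless you copy that across as well.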

Notes

This approach requires you to add /blog to the end of the Site Address (URL) in the WordPress settings, so that everything lives under /blog. You can do this in the WordPress Admin.
rewrite_links is important: it replaces links to the original WordPress instance URL with our preferred URL. Otherwise, as soon as someone clicks a link, they’ll be back at the WordPress URL. This keeps Google bots on track too.
This logic is only required for pages. Assets and such, located in /wp-content, can still be loaded via their original URL. You might hit some CORS issues: these are usually simple to fix with the right .htaccess rules.
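To make the last two notes concrete, here is a small standalone sketch of the same substitution that rewrite_links performs, run on a made-up snippet of HTML. The post slug hello-world and the image cat.png are hypothetical, and html_safe is left out because this runs as plain Ruby without Rails.

```ruby
WP_URL = 'mywordpressblog.dream.press'
COMPANY_URL = 'mydomain.com'

# Same gsub as in the controller, minus the Rails-only html_safe call.
def rewrite_links(html)
  html.gsub("#{WP_URL}/blog", "#{COMPANY_URL}/blog")
end

# Hypothetical page fragment: one link to a post and one asset.
html = '<a href="https://mywordpressblog.dream.press/blog/hello-world/">Hello</a>' \
       '<img src="https://mywordpressblog.dream.press/wp-content/uploads/cat.png">'

puts rewrite_links(html)
```

The link to the post now points at mydomain.com/blog/hello-world/, while the image under /wp-content keeps its original WordPress URL because it does not contain the /blog prefix being replaced.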

Issues

After posting a comment, a user lands back on the original blog URL! This wasn’t a major issue for us at the time, but it’s one you might want to investigate.
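I haven’t tracked down the cause, so treat the following as one avenue to investigate rather than a fix. If it turns out that the user is sent back by a redirect whose Location header points at the WordPress host, the proxy could rewrite that header before passing it on. The helper below is a hypothetical sketch of that rewrite only. It assumes you already forward POST requests and have the upstream Location value in hand, which the controller above does not do.

```ruby
require 'uri'

WP_HOST = 'mywordpressblog.dream.press'
COMPANY_HOST = 'mydomain.com'

# Given the Location header of an upstream redirect, return the URL the
# browser should be sent to instead. Relative URLs and URLs on other hosts
# are returned unchanged.
def rewrite_location(location)
  uri = URI.parse(location)
  return location unless uri.host == WP_HOST

  uri.host = COMPANY_HOST
  uri.to_s
end
```

For example, a redirect to a comment anchor on the WordPress host would come back with the host swapped and the path and fragment preserved.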