[RUBY] Multi-tenant with Rails using PostgreSQL's Row Level Security Policy

I am making a SaaS called Clipkit. It is a multi-tenant system with a multi-schema method, but since the number of tenants is approaching 1,000 and it is becoming difficult, we are verifying other methods.

LD;TR

I tried a multi-tenant implementation using PostgreSQL's Row Level Security Policy.

I was able to implement it without any problems and it seemed to work well. However, in the end, I forgot to hire him this time. The RDB multi-tenant method has advantages and disadvantages and is difficult.

Personally, I thought it would be better to start with the multi-schema method instead of RLS.

Introduction

In SaaS-type Web services, there is a method in which independent applications for each customer coexist in one system. This is called multi-tenancy.

RDB multi-tenant method

First of all, when you think about it normally, you can think of a design that mixes data from multiple tenants in a table. However, if there is a bug in the program, it may cause a very big security problem such as the data of other tenants being visible.

Therefore, it is necessary to think of a method to firmly separate data for each tenant so that crosstalk never occurs.

There are roughly three ways to achieve multi-tenancy with RDB.

Multi-instance (silo)

Use an independent DB instance (virtual machine, etc.) for each tenant. High independence but small cost and maintainability benefits.

Single instance multi-schema (bridge)

Prepare a schema for each tenant in a single DB instance. Since each tenant has an independent table, managing the table definition is complicated.

Single schema (pool)

Mix data from all tenants in a table in a single schema. It is the most resource efficient, but if there is a bug in the program, there is a big risk that the data of other tenants will be mixed.

Multi-schema method can be easily realized

Rails has a gem called Apartment. With this, simultaneous migration to all tenants will be done without permission.

Disadvantages of multi-schema method

By separating the schemas for each tenant, when changing the structure of the table, it is necessary to perform the migration for all schemas in the same way. Even if the migration process takes 2 to 3 seconds, it will become more difficult when the number of tenants exceeds several thousand. There is also a high management cost to ensure that the migration is complete for all tenants.

Therefore, if access control can be ensured, I feel that the single schema method is ideal.

Single schema method using Row Level Security Policy

Overview

PostgeeSQL 9.5 and later has a feature called Row Level Security Policy (RLS). This is a feature that allows access only to rows with pre-specified conditions depending on the user's role and run-time parameters.

Setting method

Specifically, set as follows.

Example) I want to make the tenant_id column of the users table invisible except for records with a specific value.

Set RLS. (This is a setting that controls according to run-time parameters)

ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_isolation_policy ON users FOR ALL USING (tenant_id = current_setting('tenant.id')::BIGINT);

After that, if you set the runtime parameters as follows, you will not be able to access anything other than the record with tenant_id = 999.

SET tenant.id = 999;

Implementation in Rails (draft)

Create a tenants table (Tenant model) that manages tenants. (* Since this is an explanation example, the table definition is omitted.)

Implement so that tenants can be switched with the Tenant # switch method. It is also useful to be able to get the current tenant with Tenant.current.

class Tenant < ApplicationRecord
  def switch
    ActiveRecord::Base.connection.execute("SET tenant.id = #{id}")
  end
  def self.current
    find(ActiveRecord::Base.connection.execute('SHOW tenant.id').getvalue(0, 0))
  end
end

ʻThe before_action of ApplicationController` causes the tenant to switch according to the domain of the request.

class ApplicationController < ActionController::API
  before_action :switch_tenant

  def switch_tenant
    Tenant.find_by(domain: request.host).switch
  end
end

Now you can only touch the data of your tenant.

However, when adding data, you have to enter the tenant_id yourself. Since this is troublesome, implement it in the base class (ʻApplicationRecord`) of Model so that it will be entered automatically.

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  after_initialize :set_tenant_id

  def set_tenant_id
    if new_record?
      if has_attribute?(:tenant_id)
        self.tenant_id = Tenant.current.id
      end
    end
  end
end

Now you can access data transparently without being aware of the tenant.

important point

RLS works only for general users

The RLS restriction is invalid for users who have CREATE TABLE and SUPERUSER. So, migration is executed by SUPERUSER, and the application is started by a general user. And so on.

Let's give general users the necessary privileges as follows.

GRANT SELECT, UPDATE, INSERT, DELETE ON ALL TABLES IN SCHEMA public TO PUBLIC;
GRANT SELECT, UPDATE ON ALL SEQUENCES IN SCHEMA public TO PUBLIC;

Note UNIQUE constraints

The UNIQUE constraint must be a composite index with tenant_id. (Since it is not visible from the application, there is no need to make a compound condition for validation)

Precautions for INSERT

SELECT only sees transparently constrained records, but you have to set tenant_id yourself when doing INSERT. (In the above implementation plan, it was automatically entered using the base class ʻApplicationRecord` of Model)

Migration notes

The CREATE POLICY required when adding a table will be wanted to be done by migration, but in that case it will not be reflected in schema.rb, so db: reset / db: setup cannot be used. (Db: migrate: reset is ok)

Demerit

This time, I didn't actually operate it with this implementation, but I will raise the possible disadvantages.

The table grows

RDB becomes difficult to handle when the number of records becomes huge. Even if the index is set, it does not get on the memory and suddenly becomes heavy.

Therefore, we will consider using the partitioning (table partitioning) function. There is a method called list partition that divides according to the value of the column, so you will probably use that.

Partition the table for each tenant_id. I first come up with the strategy, but in general it seems that it is not expected to create a child table that exceeds 100 in partitioning, and there are reports that performance problems occur (actually try it). I don't have one). This approach doesn't seem very realistic.

It seems that the strategy will be to divide it manually as the data increases. It's annoying.

Troublesome to delete tenants

It seems to be difficult because I have to erase the records of all the tables. With the multi-schema method, it was easy because you just deleted the schema.

Difficult to migrate data from other environments

Although it is a SaaS type service, it is also provided on-premises. In such a case, it seems to be difficult when data migration from on-premises to SaaS is required. This is because the id of each table changes. In the case of the multi-schema method, all you have to do is dump and restore.

RLS decided to forgo

After all the bloated table seems to be painful. I couldn't get rid of my concern.

The difficulty of migrating with the multi-schema method is a problem only when deploying, and I feel that it is much better than worrying about performance on a daily basis.

There is almost no problem with the migration if the number of tenants is about several hundreds, so depending on the number of tenants, the multi-schema method may be used at the startup stage. I thought.

Apartment has a function to change the DB server according to the tenant, so this is more secure in terms of performance.

Other solutions

Citus https://www.citusdata.com/

A PostgreSQL extension that makes multi-tenancy feel good.

Since it is OSS, it can be installed on EC2, but it cannot be used on RDS ...

It seems that there was a service called Citus Cloud that provides managed services on AWS from 2016 onwards.

However, Microsoft acquired Citus in 2019. Citus Cloud is closed. It seems that it can be used in Azure instead. Ah"~

AWS is a must, so it's tough ...

Apartment development is stagnant

Alright, I will continue to go to the Apartment! I tried to use it in a new project (Rails 6), but it didn't work.

I think it's a fairly major Gem, but at this point (July 2020) it wasn't yet compatible with Rails 6.

There was a Fork version that was actively maintained, so it seems okay to use this for the time being.

Recommended Posts

Multi-tenant with Rails using PostgreSQL's Row Level Security Policy
Try using view_component with rails
Japaneseize using i18n with Rails
Notes on using FCM with Ruby on Rails
Using Material Design for Bootstrap with Rails 5.2