Dirigo's Approach to Multitenancy and Scalability08/02/2016
We're having lots of discussions on multitenancy and scalability so I thought I would lay-down some of our inside-thinking in a blog post. For now this is a fragment or unfinished work. Prompting this post is a prototype that we've built in ASP.NET 4.5 & MS SQL 2016 without much thought for multitenancy and a thin time-line to turn it into a multitenant scalable application. We're going to outline our approach from another of our projects which has had almost 2 solid years of development.
Multitenancy and scalability are pretty tightly tied together. We want to have applications that can host multiple Clients (e.g. Parent Company) so that each client can have multiple Sites (e.g. Location1, Location2, Location3), and we want the ability to spread Client and Sites out across multiple databases -- and ideally multiple database servers. In this example or scenario, ClientID represents Parent Company and SiteID represents Site such that there is a one-to-many relationship between Client and Site, and between Site and various other objects (e.g. Products).
The database of Client login information (that allows a Client into the administration area) should be separate from the database where site-specific data is stored, because in the case of multiple Database instances (e.g. Database_1 and Database_2) we want a single User (particularly a Dirigo employee) to have access to multiple database instances. So we want the ASP.NET Identity tables in a separate database.
The tables containing End-User login info or data (e.g. the customers the Current Company Site should be stored either in a separate database or in each individual Database instance; there are pros and cons to each approach. But they shouldn’t be combined with the client login data, because there’s really no reason to store them together, and it presents a potential security concern. One of the challenges is that ASP.NET Identity expects email address to be unique across the entire database of Users, and in the multitenancy scenario we want the email address to be unique only within a particular Site, or a particular Client (I’m not sure which: perhaps one or the other depending upon settings?)
In order to make the customer email address unique only within a particular Site or Client, we’ll change the default ASP.NET Identity code. We’ll also modify the database schema so that there’s a relationship between the User and the corresponding Client or Site.
In our Database on STAGING we want the identity columns to have a seed value of 2^23+1. Then we’ll create Check Constraints on each ID column, to make sure that data never gets stored in the wrong database. The constraint is something like this:
ALTER TABLE [dbo].[Sites] WITH CHECK ADD CONSTRAINT [CK_Site_SiteId] CHECK (([SiteId]>=(2147483649.) AND [SiteId]<=(4294967295.))) ALTER TABLE [dbo].[Sites] CHECK CONSTRAINT [CK_Site_SiteId]
We’ll also name our database to be Database_1 or something like that, so that it’s programmatically easier to target a specific database based on its range of ID values (i.e. “Database_” + (int)(idvalue / (2^32))). The way to create a new database instance number is to script an existing database, then do a simple search and replace to find all instances of (2^32*oldInstanceNumber)+1 (e.g. 2147483649) and replace it with (2^32*newInstanceNumber)+1 (e.g. 4294967297 for database instance 2); and also replace all instances of (2^32*(oldInstanceNumber+1)-1 (e.g. 4294967295) with (2^32*(newInstanceNumber+1)-1 (e.g. 6442450943 for database instance 2).
We might not need to create multiple database instances in Production any time soon, but we recommend creating databases with different ID ranges for the different environments (Dev, QA, Staging, & Production) just to make sure that no data for one environment ever ends up in another, and also so that in extraordinary circumstances an entire database from one environment could be used in another environment (e.g. in the QA environment we could host both Database_1 and Database_2, even though Database_1 originated in the Production environment).
Once it’s time to modify the code to actually support multiple database instances in a single environment (coming shortly), we’ll migrate away from simply instantiating a DbContext object using the default connection string, and instead use a factory class with a method that accepts any ID value and then returns the appropriate DbContext value; it will do this by taking the ID value, dividing by 2^32, converting to an integer, grabbing the appropriate connection string (so there would be separate connection strings for Database_1 and Database_2), and instantiating a new DbContext using a constructor that accepts that connection string. It is easy to implement this logic from the outset, but it’s certainly possible to add it later.
There might be other tables, in addition to the client-facing ASP.NET Identity ones, that should be centralized rather than instance-specific. Any tables that are in those shared databases don’t need to use 64-bit integers, since there’s only one copy of that data; so regular 32-bit integers with ID values starting at 1 would be fine.
I'll continue adding to this post as we dig deeper into this subject over the coming weeks.