Let’s say you’re implementing the requirement that rows should never be deleted from particular tables in a relational database. Perhaps this is a requirement from regulators. Or maybe the schema declares foreign key constraints (and — yes! — it should, at least in dev environments) but, for reasons such as performance and safety, without cascading deletes.
You’re not trying to use the RDBMS to store an audit trail of every modification, which is usually a bad idea, but you don’t want data to completely disappear, either.
What’s the best way to virtually delete (or deactivate) rows in a RDBMS?
The obvious solution is to add a column that holds a deactivated flag:
This is not the best way
deactivated TINYINT NOT NULL DEFAULT 0
Now consider adding a secondary unique index to the table. If you deactivate a row and later attempt to insert a row with the same natural key, the unique index won’t allow it.
This sounds like a good thing for most scenarios, but for pure relationship tables, it can be desirous to insert new rows instead of reactivating them. Please, reactivating rows is cumbersome and forces your developers to change their coding patterns.
There’s a better way: reverse the flag’s meaning and use NULL to indicate whether the row has been deactivated:
This is better!
active TINYINT NULL DEFAULT 1
NULL means the row is inactive, instead of 0.
The trick: Add the active column to unique indexes
When active is NULL, the index will ignore the entire row. Multiple rows therefore can have the same natural key, so long as active is not-NULL in at most one row.
Store the deactivation time
Also consider adding a column to hold a timestamp when the row was deactivated. This column will be useful in case you need to audit the database.
Implement a view layer to prevent bugs
Create one view per table that excludes rows in which active is NULL. By selecting from the views, you will never have to worry about a developer forgetting to add “active is NOT NULL” to every WHERE and JOIN clause.