Arrival
or what follows a departure?
This all started when a frontend engineer at WeTransfer exclaimed:
“Don’t touch the transfers table”
The background of my colleagues’ request is quite honest: At WeTransfer we have some tables that
- have a lot of concurrent reads/writes
- broke (past tense) the whole site during maintenance
My initial reaction when engineers fear a part of the system: Change it more often.
In this case that means: change a database table, even if some of us are afraid of it. Ask the help of engineers experienced in the field.
The models of our site are ActiveRecord models, from the Rails framework. Changing the schema of the models is as easy as creating and running a migration.
The downside of migrations
While changing the schema can done with code, that doesn’t mean the change can be applied in the blink of an eye.
The problem with tables that multiple Central problem, solved long ago: Migrations take long, causing hot tables (and their associations) to blow up a running app.
1. Migrations (specifically `ALTER TABLE`)
1. What is a migration?
1. DDL (Data Definition Language)
A data definition or data description language (DDL) is a syntax similar to a computer programming language for defining data structures, especially database schemas. DDL statements create and modify database objects such as tables, indexes, and users. Common DDL statements are `CREATE`, `ALTER`, and `DROP`
source: [ddl]
1. DML (Data Manipulation Language)
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting, and modifying (updating) data in a database.
source: [dml]
1. What is so special about `ALTER TABLE`?
1. ~Algorithms : `COPY` and `INPLACE` see [alter table rtfm]~
1. What are the different locks that play a role? (see [locks])
1. diagram for (aqcuiring) a write lock
1. diagram for (aqcuiring) a read lock
1. diagram for (acquiring) an exclusive lock
1. TODO: What is a meta data lock?
1. How does a regular `ALTER TABLE` migration run in MySQL?
1. Is a regular alter table sugar coated by ActiveRecord?
1. Is this good or bad? Can we do better?
## Solution 1
1. Alternatives to a regular migration
1. LHM
1. operational graphic
1. pros/cons
1. `departure` and `pt-osc`
1. operational graphic
1. departure and pt-osc:
1. pros
- not 1 long, but 2 short moments of read locks (meta data lock?)
- `INSERT`s and `UPDATE`s while migrating
1. cons
- requires more disk space
- triggers (they execute as part of your transaction)
- Story: When a relational database is forced to choose between earning a million but having to be inconsitent vs. being consitent but having to work for the rest of its live, it will choose the latter.
- Now you have 2 tables - one should be dropped.
## Solution 2
1. Alternative 3: `gh-ost`
1. Graph
1. Roll your own active record library (and open source it / aka arrival)
1. `arrival` and `gh-ost`
1. pros
- trigger free
- when paused, it really adds no strain on the master db
1. cons
- preferred mode of operation is based on a replica
- if all tables are replicated, it takes 4 times as much disk space (double on main, double on (all) replicas).
- Now you have 2 tables on the main and **all** replicas The old one should be dropped.
## The new problem
Dropping a large table (with many buffered pages) doesn't go unnotices (aka delays/ downtime)
*or "If a tree falls in a forest and no one is around to hear it, does it make a sound?"*
1. Delays / outage for dropping a big table
1. **Cheap attempt 1**: Wait till it has no buffered pages open anymore
- 🥊 fail
1.
### resources
[alter table rtfm]: https://dev.mysql.com/doc/refman/5.7/en/alter-table.html
[ddl]: https://en.wikipedia.org/wiki/Data_definition_language
[dml]: https://en.wikipedia.org/wiki/Data_manipulation_language
meta data lock: [meta data lock]: https://dev.mysql.com/doc/refman/5.7/en/metadata-locking.html
23/12/2019 22:15pt-
locking issues: https://dev.mysql.com/doc/refman/5.7/en/locking-issues.html
internal locking: https://dev.mysql.com/doc/refman/5.7/en/internal-locking.html
transactions don't work with ALTER TABLE: https://stackoverflow.com/questions/55563562/mysql-ddl-query-stuck-in-waiting-for-table-metadata-lock-when-wrapped-in-transac
> From that article:
> Bottom line: you cannot rely on transactions to handle concurrency when running an ALTER TABLE statement. As a consequence, you dont want to ALTER a table while it is busy on some other transaction: this is not safe. Before running such statement, you want to make sure that no lock is set on the table. Yes, that surely incurs painful constraints when dealing with production-like databases, but it looks like it's just how MySQL does it...
triggers live inside a transaction https://dev.mysql.com/doc/refman/5.7/en/trigger-syntax.html:
> For transactional tables, failure of a statement should cause rollback of all changes performed by the statement. Failure of a trigger causes the statement to fail, so trigger failure also causes rollback. For nontransactional tables, such rollback cannot be done, so although the statement fails, any changes performed prior to the point of the error remain in effect. </code>