Add Safe Ecto Migration guides #720
### Good
**Strategy 1**
Should we call these **Option 1** and so on, for consistency with the rest of the guide?
Alternatively, rename Option 1 to Strategy. :)
Here's how we'll manage the backfill:
1. Create a "temporary" table. In this example, we're creating a real table that we'll drop at the end of the data migration. In Postgres, there are [actual temporary tables](https://www.postgresql.org/docs/12/sql-createtable.html) that are discarded after the session is over; we're not using those because we need resiliency in case the data migration encounters an error. The error would cause the session to be over, and therefore the temporary table tracking progress would be lost. Real tables don't have this problem. Likewise, we don't want to store IDs in application memory during the migration for the same reason.
Instead of a temporary table, couldn't we use a temporary column? Or is the issue that removing the column later would be expensive?
That assumes the backfill touches only one column. If so, it could be another option, but it doesn't have to be: the user might be applying multiple fixes and writing to multiple columns.
The point is that we need to store a list of records that a generic operation needs to run on.
I think Jose means you add a column to say "fix applied: yes or no". Then when you iterate through your table doing the updates you use that column to prevent yourself from applying the fix more than once in case of restarts. And whatever filter you used to populate your temporary table can be used to stop the iterating.
Ah, I understand now. Yeah, that could be a good approach, but we'd also have to consider the same gotchas that come with adding columns with defaults on large existing tables.
I find a separate temporary table safer and less coupled. It's less likely to interfere with application logic.
Thank you @dbernheisel! This looks excellent and I do have some quick feedback:
Absolutely. I'll go through feedback today.
The gotchas are important, however, for each flavor:

- The MyXQL adapter uses advisory locks, so there are fewer issues with transactions.
- The Postgres adapter does not default to advisory locks (maybe it should, so that it can avoid some gotchas?).
- With MSSQL I have less experience, so more verification is needed.
- Sqlite3 is different enough that I don't suspect these same gotchas apply, but again, more verification is needed.

I think the recipes could be formatted better to have callouts per adapter? The guide does cover MySQL and Postgres, but is silent on the others.
Yeah, I was not asking to remove those, those are really great! It was mostly the debugging locks bits and the reference material, so I pushed my changes already. We could have a separate document on debugging PG locks but, as I said, that is easier to do in a future PR. :) From my point of view, nothing else needs to be removed. There are still some PG-specific commands, but it is very easy for someone to consult those for other databases.
Sorry I left it stale. I took care of some feedback. Let me know what else could be improved.