xyzzy_plugh 18 hours ago

This is exactly backwards. You should have declarative schemas but inferring migrations is crazy. Only pain will follow.

Instead, I am a fan of doing both: either committing the resulting schema of a migration, or hand-writing it alongside the migration. Then have tests to ensure the database schema matches the expected schema after a migration.

Generating these artifacts is fine, but in TFA's case there is no chance I wouldn't inspect and possibly modify the generated "diff" migration. It's significantly easier to go the other way: write the migration and show me the resulting schema diff.

  • evanelias 16 hours ago

    As a concept, declarative schema management isn't crazy at all. Several thousand companies use this approach, including some huge names like Google, Meta, and GitHub, but many smaller companies too.

    When implemented well, with appropriate guardrails and linters, it's perfectly safe. And it has many benefits over imperative schema migrations, such as a substantially better Git versioning flow, and ability to synchronize environments / solve schema drift natively.

    The only major conceptual downsides are the inability to handle row data migrations, and the inability to handle renames. These can be major hurdles for some smaller companies, but are largely irrelevant at companies with large DB infrastructure, who require special-case handling for those operations anyway. In other words, if you have large tables, row data migrations are already substantially more complex than running a single UPDATE statement, and you can't do them using a traditional imperative migration tool anyway.

    > there is no chance I wouldn't inspect and possibly modify the generated "diff" migration

    Of course, you're generally supposed to do that with these tools. Database changes should always be reviewed carefully. Same with non-database-related infrastructure-as-code tools.

    • kiwicopple 16 hours ago

      > there is no chance I wouldn't inspect and possibly modify the generated "diff" migration

      > Of course, you're generally supposed to do that with these tools

      this seems to be the crux of the comments - we'll try to make it much clearer that declarative schemas don't skip any migrations/review process, they just provide another way of generating the migrations (note: you can also generate them directly from the database using the CLI)

  • williamdclt 16 hours ago

    Completely agree.

    When writing a migration, the resulting schema is usually much, much less important than the characteristics of the migration itself. When I review a migration, my first question isn’t “is this the right schema” but “is this migration going to cause downtime”. I’d much rather have a smooth migration to an incorrect schema than a critical table locked for minutes or hours.

    I think that updates of stateful components should be imperative (explicit migrations), not declarative (implicit migrations). For example, I don’t think Terraform is a great tool to manage RDS: it doesn’t tell you the consequences of changing an attribute (database restart or other downtime-inducing stuff). I’d much rather have to explicitly say how to get from state A to state B.

    Similarly, I don’t think SQL migrations are perfect: they’re still declarative, and you still need implicit knowledge to know whether a migration will take a lock and what the consequences will be. I’d much rather have to code “take explicit lock; alter table xxx;”.
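
    A sketch of what that explicitness could look like in plain Postgres (the table name and timeout are made up, and this isn't any particular tool's output):

      -- Fail fast instead of queueing behind long-running queries
      -- (everything queued behind this lock would otherwise be blocked too)
      SET lock_timeout = '5s';

      BEGIN;
      -- Make the exclusive lock explicit rather than implied by the ALTER
      LOCK TABLE orders IN ACCESS EXCLUSIVE MODE;
      ALTER TABLE orders ADD COLUMN note text;
      COMMIT;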

    This tool probably allows editing migrations, but I don’t think it’s a step in the right direction. Maybe it’s a step towards databases being much better at migrations (so that we can rely on never having downtime), but even then I think it’ll get worse before it gets better

    • sweatybridge 16 hours ago

      > it doesn’t tell you the consequences of changing an attribute (database restart or other downtime-inducing stuff)

      Modern diff tools are designed to provide better guardrails in these situations. For example, pg-schema-diff [0] tries to generate zero-downtime migrations by using lock-free operations, and warns you about potentially hazardous migrations.

      I think it's a good direction to bake these best practices into the tooling itself, rather than relying purely on the experience of engineers.

      [0] https://github.com/stripe/pg-schema-diff
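
      As a rough illustration of what "lock-free" means here (not pg-schema-diff's literal output; the index and table names are invented): a naive diff could emit a blocking index build, while a careful tool prefers the concurrent variant.

        -- Naive: blocks writes to the table for the whole index build
        CREATE INDEX orders_customer_id_idx ON orders (customer_id);

        -- Lock-free variant: builds without blocking writes, at the cost of
        -- running outside a transaction and a slower, two-pass build
        CREATE INDEX CONCURRENTLY orders_customer_id_idx ON orders (customer_id);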

      • evanelias 16 hours ago

        This is exactly the key point. Declarative schema management is typically better at catching hazardous operations, because declarative schema management tools inherently require the ability to introspect your desired CREATE definitions and also introspect the current database state.

        Once a tool has those abilities, adding linters and destructive-action guardrails is much easier. Especially when compared to a traditional migration tool, which often has no underlying understanding of the ALTER TABLE operations being requested.

  • layer8 16 hours ago

    Yes. In particular, migrations expressed as SQL statements are strictly more powerful than a diff of two schemas, so there are situations where you can’t infer the former from the latter (but you can always infer the latter from the former).
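
    A rename is the classic example (hypothetical users table): both of the following end in the same schema, so a diff of two snapshots alone can't tell which was intended.

      -- Old snapshot: users(fullname)    New snapshot: users(display_name)
      -- Both migrations produce the new snapshot, but only one keeps the data:
      ALTER TABLE users RENAME COLUMN fullname TO display_name;
      -- vs.
      ALTER TABLE users DROP COLUMN fullname;
      ALTER TABLE users ADD COLUMN display_name text;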

    I also object to the use of “declarative” here. Either we are talking about the schema as present in the database, in which case it’s neither declarative nor non-declarative, it’s just whatever is in the database. Or we are talking about a schema definition, in which case I really don’t know what a non-declarative schema definition would look like, in contrast to what is called “declarative” here. Thirdly, the traditional “declarative” SQL schema definition is really a series of imperative SQL statements, so arguably not declarative.

    What they seem to mean is a minimal sequence of statements that results in the desired schema, as opposed to a longer history of schema-altering statements. However, the minimal version is technically still a series of schema-altering statements, starting from (presumably) an empty schema.

    • evanelias 16 hours ago

      "Declarative" is the correct word, as the input to the tool is a desired-state set of CREATE statements, and the tool figures out what DDL is actually necessary to make the target DB reach that desired state.

      In other words, the engineer is declaring "make my schema look like this" and the tool makes that happen. That's the key definition of a declarative system, consistent with how this is described in CS classes.

      Meanwhile, traditional migration tools are "imperative", as the engineer must tell the tool exactly which operations (e.g. ALTER) to run.
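
      A minimal sketch of that difference, with invented table and column names:

        -- Declarative input: the desired end state, kept in the repo
        CREATE TABLE accounts (
          id         bigint PRIMARY KEY,
          email      text NOT NULL,
          created_at timestamptz NOT NULL DEFAULT now()
        );

        -- Imperative equivalent: the DDL you'd have to write (or a diff tool
        -- would derive) if the live table currently lacks created_at
        ALTER TABLE accounts
          ADD COLUMN created_at timestamptz NOT NULL DEFAULT now();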

  • skybrian 18 hours ago

    It seems like generating the diffs from the schema's version history is equivalent to doing it the opposite way, provided that each diff is tested to make sure the database upgrade works. Not all diffs will correspond to feasible database upgrades, so some patches would have to be rejected.

    • bootsmann 16 hours ago

      Not all migrations are as simple as adding and removing columns.

    • pounderstanding 17 hours ago

      Migrations give more control.

          -- add the column as nullable, backfill it, then enforce NOT NULL
          ALTER TABLE foo ADD COLUMN bar text;
          UPDATE foo SET bar = baz;
          ALTER TABLE foo ALTER COLUMN bar SET NOT NULL;
      • skybrian 16 hours ago

        True. Perhaps annotating the schema would help for simple things?

  • perrygeo 14 hours ago

    > You should have declarative schemas but inferring migrations is crazy. Only pain will follow.

    Inferring migrations isn't crazy. Automagically applying those migrations without review is crazy.

webster451 16 hours ago

I think we are getting close to the peak of "declarative"—or rather, I hope we are near the peak.

In my experience, declarative APIs are very powerful abstractions for specific cases where finding the path to the declared state is better left to a machine. This is seldom the case - in most cases, offering the programmer control over the changes leads to better behaviors.

Kubernetes and IaC tools led the way to a declarative state of infrastructure, and they add a ton of value. But they were also incredibly hard to build - it took many years before Kubernetes eventing and the control-loop abstractions were rock solid. Most CRD-backed implementations suffer from tons and tons of bugs, and most CRDs are not declarative - they abstract away an imperative operation! I guess this is nothing new - "anything in excess is bad".

Anyways, I think an imperative approach offers much higher control and predictability at a lower cost. The world inherently is imperative.

  • carlmr 16 hours ago

    >In my experience, declarative APIs are very powerful abstractions for specific cases where finding the path to the declared state is better left to a machine. This is seldom the case

    I've been waiting for this top comment for longer than you can imagine. The declarative madness has always bothered me. Sometimes it's easier to maintain when you see the process. And harder to declare the final state. It might look neat, but maintainability beats neatness every day.

  • Onavo 16 hours ago

    I mean they just reinvented Prisma and Django

    • evanelias 15 hours ago

      Yes, although the article isn't claiming to have invented declarative schema management. They're just saying it is now available as a feature in Supabase. (Personally I think that's great!)

      Regarding prior art: Django migrations are indeed declarative, and were very early in this space. But they're tied to the Python model definitions in the ORM, which is a bit more of a special case than the native SQL CREATE-based approach described here.

      As for Prisma Migrate, they directly copied several innovations from my tool Skeema [1] which has been available since 2016, so they can be equally accused of "reinventing" things :)

      Not that I invented pure-SQL declarative schema management either, by any stretch. I was largely inspired by the workflow at Facebook, who adopted declarative table management company-wide back in ~2012. FB found that having a declarative reconciliation loop is an operational necessity with a massively sharded system, given some hardware just dies every day. And centralized SQL-based schema management is advantageous when applications are written in many different programming languages.

      [1] https://github.com/skeema/skeema

bartvk 18 hours ago

So to summarize.

In the old situation, you write a CREATE TABLE statement at the start of the project. And when you add a feature, you have to write an ALTER TABLE script.

In this new situation, you just change the CREATE TABLE script. And Supabase uses migra to figure out the difference and it automatically alters the table.

What's interesting is that in your SQL code, there's no longer any difference between creating a new database, and updating an existing database.

  • infogulch 17 hours ago

    I've used this for Microsoft SQL Server and SQL Database Projects. It's basically as you say: write it as if creating a new database, then deploy it in CI where it does a diff on the live database to come up with the actual migration strategy on the fly. If you're clever you add a manual review stage in the pipeline and have the db engineers approve the generated migration script before deployment is completed automatically. https://learn.microsoft.com/en-us/sql/tools/sql-database-pro...

    I helped set this up in a fortune 500 company a few years ago. They were using a team of db engineers to execute manually written change scripts, with manual reviews, control processes, and deployment schedules. You'd be lucky if you got your schema change to prod in a month. When they started integrating this tool on some trial candidates they found SO many inconsistencies between environments: server settings differences, extraneous or missing indexes, vestigial "temp" tables created during previous migrations, enum tables that should be static with extra or missing rows, etc, etc. All the environment differences meant that deployments had to be manual in the past. Once they got through the initial pain of syncing up the environments the whole department got way more efficient.

    • pounderstanding 16 hours ago

      > they found SO many inconsistencies between environments

      This implies somebody with admin rights makes alterations in an ad-hoc way without first doing it in a test env.

      If they continue with ad-hoc stuff, then it means auto-generated migrations will be different in test vs prod. (I prefer to test exactly the same thing that will be used in prod.)

      • evanelias 16 hours ago

        > This implies somebody with admin rights makes alterations in ad-hoc way without first doing it in test env.

        Not necessarily. With a large team/org using the same database schema, it can just mean multiple people were trying to make changes to an environment around the same time, e.g. the migrations were applied in a different order in staging vs prod.

        Some migration tools provide extra checks for strict ordering, but many do not. There's often no guarantee that the migration file naming scheme ordering, Git commit ordering, and actual DB apply ordering line up -- that's 3 different possible sources of truth, or more since the DB state varies by environment (dev/stage/prod etc).

        Late-night hot-fixes (to solve an emergency outage) can be another source of inconsistencies / drift.

        > If they continue with ad-hoc stuff, then it means auto-generated migrations will be different in test vs prod

        That depends on the declarative tool and whether it fully syncs the schema each time, or just generates migrations which are frozen into a plan which is executed as-is in all environments. Not that full-sync is bad, but yes in that case it will generate different things in each env. Although the end result is that it will solve the drift, and give you the same end state in all environments. And that's likely what you want to happen: after running the tool, the database state will match the desired state which was expressed by the CREATE statements in your schema repo.

        That said, the declarative tooling should have sufficient safety checks to ensure it doesn't do anything destructive in prod without very loudly telling you and requiring manual confirmation. That way, you won't be harmed when trying to synchronize an environment that had unexpected out-of-band changes.

    • sweatybridge 17 hours ago

      > When they started integrating this tool on some trial candidates they found SO many inconsistencies between environments: server settings differences, extraneous or missing indexes, vestigial "temp" tables created during previous migrations, enum tables that should be static with extra or missing rows, etc, etc. All the environment differences meant that deployments had to be manual in the past. Once they got through the initial pain of syncing up the environments the whole department got way more efficient.

      That was exactly our experience too.

      Perhaps we didn't highlight enough in the blog post that schema diff is not meant to replace manual review. It simply provides a good starting point for us to iterate on the migration, which often boosts efficiency.

    • rjbwork 17 hours ago

      >If you're clever you add a manual review stage in the pipeline and have the db engineers approve the generated migration script before deployment is completed automatically.

      This is how I've set it up at my current employer. It works well. We modeled it after the Terraform plan/apply steps: we double-check that the script generated by the "apply" step matches the script generated by the "plan" step, since these can occur at a significant temporal distance, and fail the deployment if they don't, just so we can be sure that what we've read and approved matches what gets executed.

      • kiwicopple 17 hours ago

        > db engineers approve the generated migration script

        yeah - this is definitely the intended flow here. We won't be recommending that anyone blindly apply generated migrations.

        As you mention, it is expected that you generate & review on your local development machine, check into source control, push & merge. We've also been using this internally for ~2 years now and it works great

        • pounderstanding 16 hours ago

          Do you keep the history of applied migrations? (Just in case subtle issue need to be investigated later)

          • kiwicopple 16 hours ago

            yeah, migrations are generated from the declarative files. For example, the steps for adding a new column are as follows (rough sketch after the list):

            1/ Add a new column to the declarative file

            2/ Generate a new migration: `supabase db diff -f my_new_migration`

            3/ Review/edit the generated migration, check it into git

            4/ Apply migration to database
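
            As a rough sketch of steps 1 and 2 (file names and the exact generated SQL are illustrative, not literal CLI output):

              -- supabase/schemas/employees.sql  (declarative file, after step 1)
              create table employees (
                id    bigint primary key,
                name  text not null,
                email text  -- the newly added column
              );

              -- supabase/migrations/<timestamp>_my_new_migration.sql
              -- produced by `supabase db diff -f my_new_migration` (step 2),
              -- then reviewed/edited and checked into git (steps 3 and 4)
              alter table employees add column email text;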

  • prmph 17 hours ago

    But if you have data to migrate, it is not always possible to infer the diff, no?

    Say I have a users table with a name column. Then I alter the table and split the name column into two new columns: first name and last name.

    How is it possible to infer this change, just from seeing the new and old columns?

    • efxhoy 17 hours ago

      The best-practice way to swap fullname for firstname/lastname would be to (rough SQL for the schema steps follows the list):

        1. Migration that adds firstname and lastname columns with all nulls
        2. Deploy application code change to start populating firstname and lastname alongside fullname, still reading fullname in the code.
        3. backfill the firstname and lastname values with a script/command/migration
        4. change app code to read firstname and lastname and stop writing fullname
        5. drop the fullname column
      
      I don't think there's a safe way to do all that in a single migration unless all your app code also lives in the database so it can be atomically deployed. If you have multiple app servers and do rolling deploys with no downtime I think it has to be done in these 5 steps.
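
      Roughly what the SQL-only parts of those steps might look like in Postgres (steps 2 and 4 are application deploys; the naive space-split is just for illustration):

        -- Step 1: additive, safe while old code is still live
        ALTER TABLE users ADD COLUMN firstname text;
        ALTER TABLE users ADD COLUMN lastname text;

        -- Step 3: backfill (in batches on a big table); naive split, for illustration only
        UPDATE users
        SET firstname = split_part(fullname, ' ', 1),
            lastname  = split_part(fullname, ' ', 2)
        WHERE firstname IS NULL;

        -- Step 5: only once nothing reads or writes fullname any more
        ALTER TABLE users DROP COLUMN fullname;
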
      • MrMcCall 16 hours ago

          6. ensure there are no nulls in firstname and lastname
          7. alter the columns to be NOT NULL
        
        Because no non-statistician uses nullable columns, right?

        Of course, some dbs (SQLServer?) infer NULL from the empty string, or am I misremembering?

        Always having the columns be NOT NULL is a fundamental cheat, after always having a PK, or is that too old school for 2025?

        • efxhoy 14 hours ago

          There's nothing wrong with nullable fields when it's appropriate. When kids are born they don't have names. Not all users want to tell you their names. A null value is data too.

          • MrMcCall 14 hours ago

            > when it's appropriate

            Yes, it just requires extra care when querying and handling the rows.

            It's always just easier, if you can, to make it NOT NULL after prepopulating all rows' columns to the empty string (or real data).

            Sometimes NULL is truly different than the empty string, but that, like you said, is just a kind of data.

    • evanelias 16 hours ago

      With large tables, you can't safely make data changes using migration tools either. If you run a single UPDATE against a large table, you end up with a very long transaction with substantial MVCC implications (old row versions that the db needs to clean up), and you can easily break production.
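
      For example, a common pattern is to drive the backfill from a script that runs many small batches, each in its own transaction. A rough Postgres sketch with invented table and column names:

        -- One batch; a driver script repeats this, committing between runs,
        -- until zero rows are updated, so there's never one huge transaction
        WITH batch AS (
          SELECT id
          FROM accounts
          WHERE status IS NULL
          ORDER BY id
          LIMIT 1000
          FOR UPDATE SKIP LOCKED
        )
        UPDATE accounts a
        SET status = 'active'
        FROM batch
        WHERE a.id = batch.id;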

      Side note, but why do folks always bring up this "firstname lastname" example? It is not ever possible to implement that correctly in an automated fashion: some people have spaces in their first names, and some people have spaces in their last names. (I don't mean to single you out, as this exact example comes up multiple times in every declarative vs imperative discussion!)

      • prmph 16 hours ago

        Oh, I am aware of the problems with this scheme for names. I just used it as a simple example. I wouldn't design my users table this way.

        If I can, I even avoid storing actual names of users at all. If I have to, I use a simple free form "display_name" column; what is put there is up to the user.

neutralino1 17 hours ago

It seems to me Rails has been doing this but better for years. It definitely keeps atomic and historical migrations, but also maintains a schema.sql file that can be loaded as a one-off (e.g. for mock DBs in tests).

  • kiwicopple 17 hours ago

    > doing this but better

    I'm curious what makes it better than the approach in the blog?

    If it's this:

    > also maintains a schema.sql file that can be loaded as a one-off

    That's exactly what the declarative files provide

joshAg 17 hours ago

We built something similar for the managed DB we use, but I think it's a mistake to autogenerate the migration scripts instead of autogenerating the schema from the migration. Things like changing an enum, adding a non-null column that shouldn't have a default to an existing table that already has data in it, and migrating data from one representation to another (e.g., 'oh hey, we definitely shouldn't have made our users table have an fname and an lname field. let's change to full_name and preferred_name') are easily done in a migration script but hard, if not impossible, to infer from just schema changes.
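
For instance, shrinking a Postgres enum is something a pure schema diff can't safely infer; hand-written, the steps are explicit (type and table names made up):

    -- Postgres can ADD enum values in place, but removing one means
    -- rebuilding the type and converting the column by hand:
    ALTER TYPE order_status RENAME TO order_status_old;
    CREATE TYPE order_status AS ENUM ('pending', 'shipped');  -- 'cancelled' removed
    -- (any rows still holding the removed value must be updated first)
    ALTER TABLE orders
      ALTER COLUMN status TYPE order_status
      USING status::text::order_status;
    DROP TYPE order_status_old;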

Guillaume86 14 hours ago

Could this be moved into a standalone CLI tool? Is there anything supabase specific about it? I've always wanted SSDT SQL projects for postgres (SSDT is MS declarative schema management solution for SQL Server).

SSDT can also sync db projects (nicely organized DDL .sql files representing the schema) and databases (one way or the other). With the IDE support you can do stuff like "find all references" on a column or any other DB object, and build the project to check for errors. Linting the schema becomes possible, etc. I have a hard time when I have to go back to imperative schema management...

ucarion 18 hours ago

Sorry if asked and answered: can you hand-edit the generated migrations? Like, what if I want to do a create index concurrently or something?

geocar 17 hours ago

You're going to regret this.

The thing you need to be doing is testing your migrations, and some dumbass on your team is going to generate the migration during CI and load it into your database as a merge step, and you won't realise what a mistake this was until possibly years later.

The good news is you might be able to pay someone an enormous amount of money to unfuck things. Not good for you, I mean, obviously. Good for whoever you just bought a car.

  • mylons 17 hours ago

    we at tahoe labs dot io are here for you in that case.

xaas 17 hours ago

I use ZenStack. In future releases they are moving away from supabase…

mylons 17 hours ago

am i missing something? what does this offer over raw sql? it honestly looks very similar

  • kiwicopple 17 hours ago

    It is raw sql. The update here is more of a "workflow" change.

    In the past we only offered tools to create a sql migration to make a change to your database. Now you can write the state of your database into files, then the migration is generated for you.

    This is very similar to something like Rails/Phoenix/whatever, where you write your models and then it generates a change. The difference is that here you write your models in raw SQL, rather than in an ORM.