6 commits

Author SHA1 Message Date
4d24a9577f Only run public_data:pull if there are no pending migrations
Oh this was a fun little dev environment bug: I ran `public_data:pull`
on my laptop before migrating my database, so the `items` table was pulled
as the latest production version, which already included the migrations'
changes, but the migrations hadn't been marked as "run" locally yet.

So Rails was still telling me I needed to run them, but the migrations
themselves were crashing, with stuff like "there's already a column
with this name!"

This change ensures that `public_data:pull` won't run until migrations
are done, to prevent silly accidents like that.
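
A minimal sketch of the guard described above, in plain Ruby rather than the task's actual code. It assumes the usual Rails model, where applied migration versions are recorded in the `schema_migrations` table and a migration is pending when its version exists on disk but not in that table. The helper names `pending_versions` and `ensure_no_pending_migrations!` and the sample version numbers are hypothetical.

```ruby
# Hypothetical sketch: refuse to pull data while migrations are pending.
# `on_disk` stands in for the versions of the files in db/migrate;
# `applied` stands in for the versions recorded in schema_migrations.
def pending_versions(on_disk, applied)
  (on_disk - applied).sort
end

def ensure_no_pending_migrations!(on_disk, applied)
  pending = pending_versions(on_disk, applied)
  return if pending.empty?

  raise "Run migrations first; pending versions: #{pending.join(', ')}"
end

# Made-up sample versions: the newest migration has not been run yet.
on_disk = %w[20240101000000 20240215000000 20240601000000]
applied = %w[20240101000000 20240215000000]

begin
  ensure_no_pending_migrations!(on_disk, applied)
  puts "No pending migrations; safe to pull."
rescue RuntimeError => e
  puts e.message
end
```

Raising before any data is loaded keeps the local schema and the migration bookkeeping from drifting apart in the way described above.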
2024-06-18 14:52:54 -07:00
d3d0cda81f Oops, fix symlink for /public-data/latest.sql.gz
Oh whoops, I was symlinking to the *full* path of the latest dump,
which includes the site version directory in it. This meant that, if 5
new versions of the app were deployed since the most recent public
data commit (and so that version is deleted), the symlink fails.

In this change, we just symlink to the filename, which behaves as a
relative path and should be completely resilient to deploys changing
where these files ostensibly live!!
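
A small sketch of why a filename-only symlink target holds up better, under assumptions: the directory names `release-1` and `release-2`, the dump filename, and the helper `symlink_survives_move?` are all hypothetical, and moving a directory stands in for deploys changing the path the files appear under.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical layout: a versioned directory holding the dumps, which later
# shows up under a different path (standing in for newer deploys).
def symlink_survives_move?(relative:)
  Dir.mktmpdir do |root|
    old_dir = File.join(root, "release-1")
    new_dir = File.join(root, "release-2")
    FileUtils.mkdir_p(old_dir)

    dump_name = "2024-05-29.sql.gz"
    dump_path = File.join(old_dir, dump_name)
    File.write(dump_path, "dump contents")

    # The link target is either just the filename (resolved relative to the
    # link's own directory) or the full path including the versioned directory.
    target = relative ? dump_name : dump_path
    File.symlink(target, File.join(old_dir, "latest.sql.gz"))

    # Simulate the files living at a different path after deploys.
    FileUtils.mv(old_dir, new_dir)

    # File.exist? follows the link, so it is false once the target is gone.
    File.exist?(File.join(new_dir, "latest.sql.gz"))
  end
end

puts "filename-only link still resolves: #{symlink_survives_move?(relative: true)}"
puts "full-path link still resolves: #{symlink_survives_move?(relative: false)}"
```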
2024-05-29 19:01:23 -07:00
c751173c52 Fix public_data:commit's symlinking on some platforms
Huh, curious, I think what I'm seeing is: on my development machine,
`File.exist?` returns true for symlinks, but, on our production
machine, `File.exist?` returns false for symlinks.

I imagine this is a difference in the implementation of the underlying
system calls? Curious!

This new check should work more reliably across platforms. I considered
checking both `exist?` and `symlink?`, but decided that, in the
unexpected case that `latest.sql.gz` exists but is an actual file
instead of a symlink like we expect, it's probably best to avoid
overwriting it anyway, and a crash on the `symlink` attempt is a
reasonable way to do that.
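
A short sketch of how the two checks can disagree. In Ruby, `File.exist?` follows a symlink to its target, while `File.symlink?` inspects the link itself, so a link whose target is missing reports `false` for the first and `true` for the second. Whether that was the situation on the production machine is an assumption; the message above only notes the difference between machines. The helper names and filenames are hypothetical.

```ruby
require "tmpdir"

# Reports what each check says about a symlink whose target is missing.
def checks_on_dangling_symlink
  Dir.mktmpdir do |dir|
    link = File.join(dir, "latest.sql.gz")
    File.symlink("missing-dump.sql.gz", link) # the target is never created

    {
      exist: File.exist?(link),     # follows the link to the missing target
      symlink: File.symlink?(link)  # looks at the link itself
    }
  end
end

# Shows the "crash on the symlink attempt" when a regular file is in the way.
def symlink_over_regular_file_raises?
  Dir.mktmpdir do |dir|
    path = File.join(dir, "latest.sql.gz")
    File.write(path, "a real file, not a link")
    begin
      File.symlink("some-dump.sql.gz", path)
      false
    rescue Errno::EEXIST
      true
    end
  end
end

p checks_on_dangling_symlink
p symlink_over_regular_file_raises?
```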
2024-05-02 13:10:30 -07:00
7c09b76b5e Require fewer db privileges to run public_data:commit
In newer versions of MySQL, `mysqldump`'s default behavior requires
accessing some privileged `INFORMATION_SCHEMA` tables, which requires
the global `PROCESS` permission.

Rather than require that, we can just skip this step, by adding the
`--no-tablespaces` argument. This was the guidance I found when looking
up this issue! https://dba.stackexchange.com/a/274460/289961
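
For illustration, a sketch of passing the flag from Ruby. `--no-tablespaces` is the `mysqldump` option named above; the helper `mysqldump_command` and the database name `example_development` are placeholders, not the task's real code or configuration.

```ruby
# Hypothetical helper: build the argument list for a mysqldump call.
# Passing arguments as an array avoids shell quoting issues.
def mysqldump_command(database, tables)
  ["mysqldump", "--no-tablespaces", database, *tables]
end

cmd = mysqldump_command("example_development", %w[items])
puts cmd.join(" ")

# To actually run it (needs a MySQL client and credentials), something like:
#   system(*cmd, out: "dump.sql")
```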
2024-05-02 13:06:27 -07:00
98dd9ec782 Create rails public_data:pull task, to load up the latest public data
Yay, it works! Easy peasy! Love this way of integrating shell and Ruby,
it's cute!
2024-03-01 13:18:58 -08:00
8dc11f9940 Create rails public_data:commit task, to share public data dumps
I'm starting to port over the functionality that was previously just me
running `yarn db:export:public-data` in `impress-2020` and committing
the result to Git LFS every time.

My immediate motivation is that the `impress-2020` git repository is
getting weirdly large?? Idk how these 40MB files have blown up to a
solid 16GB of Git LFS data (we don't have THAT many!!!), but I guess
there's something about Git LFS's architecture and disk usage that I'm
not understanding.

So, let's move to a simpler system in which we don't bind the public
data to the codebase, but instead just regularly dump it in production
and make it available for download.

This change adds the `rails public_data:commit` task, which, when run in
production, will make the latest dump available at
`https://impress.openneo.net/public-data/latest.sql.gz`, and will also
store a running log of previous dumps, viewable at
`https://impress.openneo.net/public-data/`.

Things left to do:
1. Create a `rails public_data:pull` task, to download `latest.sql.gz`
   and import it into the local development database.
2. Set up a cron job to dump this out regularly, idk maybe weekly? That
   will grow, but not very fast (about 2GB per year), and we can add
   logic to rotate out old ones if it starts to grow too far. (If we
   wanted to get really intricate, we could do like, daily for the past
   week, then weekly for the past 3 months, then monthly for the past
   year, idk. There must be tools that do this!)
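
The tiered rotation floated in item 2 could be sketched like this. It is only one reading of "daily for the past week, then weekly for the past 3 months, then monthly for the past year": the cutoffs of 7, 90, and 365 days, the choice to keep the newest dump in each week or month, and the helper name `dumps_to_keep` are all assumptions.

```ruby
require "date"

# Hypothetical retention policy: which dump dates to keep as of `today`.
def dumps_to_keep(dates, today)
  past = dates.select { |d| d <= today }
  recent, older = past.partition { |d| (today - d) < 7 }
  weekly_range, rest = older.partition { |d| (today - d) <= 90 }
  monthly_range = rest.select { |d| (today - d) <= 365 }

  # Within each bucket (ISO week or calendar month), keep the newest dump.
  weekly = weekly_range.group_by { |d| [d.cwyear, d.cweek] }.values.map(&:max)
  monthly = monthly_range.group_by { |d| [d.year, d.month] }.values.map(&:max)

  (recent + weekly + monthly).sort
end

# Example: pretend a dump was taken every day for the last 400 days.
today = Date.new(2024, 6, 18)
dates = (0...400).map { |i| today - i }
kept = dumps_to_keep(dates, today)
puts "kept #{kept.size} of #{dates.size} dumps"
```

Anything older than the last cutoff is dropped, so the set of kept dumps stays bounded no matter how long the cron job runs.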
2024-02-29 14:30:33 -08:00