I'm starting to port over the functionality that was previously just,
me running `yarn db:export:public-data` in `impress-2020` and
committing it to Git LFS every time.
My immediate motivation is that the `impress-2020` git repository is
getting weirdly large?? Idk how these 40MB files have blown up to a
solid 16GB of Git LFS data (we don't have THAT many!!!), but I guess
there's something about Git LFS's architecture and disk usage that I'm
not understanding.
So, let's move to a simpler system in which we don't bind the public
data to the codebase, but instead just regularly dump it in production
and make it available for download.
This change adds the `rails public_data:commit` task, which when run in
production will make the latest available at
`https://impress.openneo.net/public-data/latest.sql.gz`, and will also
store a running log of previous dumps, viewable at
`https://impress.openneo.net/public-data/`.
Things left to do:
1. Create a `rails public_data:pull` task, to download `latest.sql.gz`
and import it into the local development database.
2. Set up a cron job to dump this out regularly, idk maybe weekly? That
will grow, but not very fast (about 2GB per year), and we can add
logic to rotate out old ones if it starts to grow too far. (If we
wanted to get really intricate, we could do like, daily for the past
week, then weekly for the past 3 months, then monthly for the past
year, idk. There must be tools that do this!)
I hadn't realized for a while that we weren't already doing this lol, I
had noticed that `bundle install` in production was slower than I
expected when adding new stuff, but it was when we did this big recent
`bundle update` that I really noticed the difference.
Fixed now, I think! Though the real test will come when we actually
have a new gem to install, since this was a no-op case.
As the comment in `deploy.yml` explains, this was a multi-step process,
but it went very smoothly as planned, hooray!!
I noticed again while making this change that Bundler doesn't seem to
be availing itself of the checked-in dependencies in `vendor/cache`. I
think I know the fix for this, I'll toss it into an upcoming change and
see if it works!
Looking back at this now I'm just like. Oh right, of course, we don't have passwordless access to *become root*, so of course Ansible's strategy of becoming root and then running the playbook step was failing!
I did some refactoring while here too, of pulling the deploy scripts out of `package.json` and into `bin`, to be a bit more canonically Rails-y. (idk how canonical the colon thing is but, probably fine??)
Okay, this is much simpler than the impress-2020 version where we symlinked node_modules and stuff - Bundler is just a lot better at this lol
Right now, the app is failing to start because we don't install Node—I wasn't sure whether we'd need to and whether I was gonna precompile the assets etc
Though now that I say that out loud, I guess part of the issue might be that I'm not sure the app is running in RAILS_ENV=production, I wonder if it still wants Node in that case?? I'll flip that switch in the service file now, then commit to save my place for the day, then try again with starting the app sometime and see what it says!