Lol ok, as I had kinda predicted, the memory bounds I set last time
were not tight enough, and it stalled out again! (It was at 75% and
fully just not working.)
Let's try this tighter bound instead!
So, Dependabot correctly reported that this version of puma is
vulernable, which I fixed in the main app already—but I didn't notice we
also use that version in this cute tiny placeholder app we use early in
the deployment process.
There's not a real security need to upgrade this, as this placeholder
app has no access to useful data when it is run, but I think it's better
to resolve this by fixing it than by silencing Dependabot! May as well!
I just restarted the impress app in production! First I logged in to see
why it wasn't responding, and I saw that there was almost no free RAM
left, and that the Rails app had grown to eat it all up!
So in this change, we set a memory limit: if the impress app is taking
up more than 75% of the machine's RAM, systemctl will try to shrink it
down; if it can't, then it will kill the app at 80%.
I'm not totally sure whether these bounds are tight enough? I didn't
look closely enough at the numbers to see what the app's actual usage
was according to systemctl at the time (`sudo systemctl status
impress`), so my hope is this is enough. But if we run into a memory
leak crash like that again, because it turns out even existing at 75%
RAM freezes the machine when running alongside its other processes, we
can decrease these numbers!
I also don't know the nature of the memory leak, and that could be worth
investigating—the app pretty cleanly fits into ~500–600MB when it starts
up, but then does seem to slowly but steadily grow. If it could be kept
at that size, it's possible we could downgrade the server and save some
costs—but that's a question for another day, since making sure we handle
memory leaks when they *do* happen is a more important robustness fix!
Rails already creates little pre-gzipped `.gz` copies of all our assets
in the `public/assets` directory when we build. This configures nginx to
send those when available!
We weren't doing *any* gzip stuff before, so this helps a lot with those
bigger JS files, like the `wardrobe-2020` stuff. It's now at ~.5MB with
compression, which is still a bit big, but nowhere near as offensive as
the 4.5MB pre-anything, or 1.5MB post-minification, lol.
We're now all-in on impress.openneo.net for this box!
One little wrinkle is that certbot was initially upset that I had
already uploaded the copy-pasted certs from the other box to here, at
the file path it expected to get to manage. So, I moved those to
`/srv/impress/shared/temp-certs`, and changed the nginx config
accordingly; and then deleted the original and let certbot control it!
The usual stuff! Installed the new gem and its new deps, ran
`bin/rails app:update` and did my best to manually merge the dev/prod
config files with the new canonical defaults, deleted some migrations I
don't think are relevant to us, and yeah!
Also, Rails 7.1 seems to need `libyaml-dev` installed, so I added that
to the `deploy/setup.yml` playbook!
One thing to note is that, while I was here, I turned on some settings
relating to our use of SSL that technically weren't on before. This
should be fine and helpful? But if stuff breaks, well, check those!
Now, if I run `sudo -i -u impress` on the production server, it opens a
login bash shell, with all of the app's environment variables exported,
straight to `/srv/impress`.
This will let me quickly `cd current; bin/rails console` to start poking
at whatever needs poked!
I'm not sure why this was causing problems? especially why *now*? But I was seeing errors in systemctl of it trying to parse this comment as an environment variable soooo ok!
Could just be an intermittent thing where like, a byte got dropped last time we transferred this file or something? but whatever, this has fixed it and also is reasonable comment placement!
Looking back at this now I'm just like. Oh right, of course, we don't have passwordless access to *become root*, so of course Ansible's strategy of becoming root and then running the playbook step was failing!
Uhhh I think I must have made a mistake here where like… I must have left this in the service file for a while then accidentally deleted it from the Ansible playbook but not the live server? I had tested with this, then tested again without it and thought it wasn't necessary, but it turns out to have been necessary I guess? Ok!
This instructs Rails's ExecJS library to not bother looking for Node or something similar, because the app doesn't actually need to run any JS, even though the `react-rails` library (?) seems to be pretty eager about the possibility that we'll need to server-side-render stuff. (We should consider whether we want to though tbh? But… idk that would be a pretty different arch than what we've done with `jsbundling-rails` so like. idk whatever)
Mm, something in Rails was getting upset when working with session cookies because the `Host` header was `127.0.0.1:3000` instead of `beta.impress.openneo.net`. I only saw this log entry on important actions like login, so my hope is that this is why login is failing??
I was intentionally omitting these to start, because I didn't understand them well and didn't want to add things I didn't understand. But now I've checked in on them more and they seem standard and reasonable. Ok!
```
HTTP Origin header (https://beta.impress.openneo.net) didn't match request.base_url (http://127.0.0.1:3000)
```
Source: https://stackoverflow.com/a/73198861/107415
I did some refactoring while here too, of pulling the deploy scripts out of `package.json` and into `bin`, to be a bit more canonically Rails-y. (idk how canonical the colon thing is but, probably fine??)
Okay, this is much simpler than the impress-2020 version where we symlinked node_modules and stuff - Bundler is just a lot better at this lol
Right now, the app is failing to start because we don't install Node—I wasn't sure whether we'd need to and whether I was gonna precompile the assets etc
Though now that I say that out loud, I guess part of the issue might be that I'm not sure the app is running in RAILS_ENV=production, I wonder if it still wants Node in that case?? I'll flip that switch in the service file now, then commit to save my place for the day, then try again with starting the app sometime and see what it says!
Yay it's working! We set up the box, install Ruby, upload a placeholder app, set it up as a service, and get it hooked up to nginx!
Next, we'll add the script to upload the latest version of the site. We just need to slot it into `/srv/impress/current`, run `bundle install`, and that should basically be that! (Oh, and we need to compile production assets—I wonder if it's useful to do that on the dev machine instead of on the target? That might save us from needing to install Node. Or maybe we'll have to anyway!)