I used the new profiler tools on this page, and noticed a lot of
allocations in the Globalize library, which we use for translating
database records. I realized that we were loading all of the fields of
not just all of the items on the page, but all of their translation
records in all locales! We used to scrape data for lots of languages, so
that can be quite a lot!
Unfortunately, Rails's `includes` method to efficiently preload related
records always loads all fields, and simply can't be overridden.
So, in this change we write manual preloading code, to identify the
records we need, load them in big bulk queries, and assign them back to
the appropriate associations. Basically just what `includes` does, but
written out a bit more, to give us the chance to specify SELECT and
WHERE clauses!
It shows up in development always, and if you're logged in as Me
Specifically in production!
I'm using this to poke at memory usage for pages that seem suspicious.
I don't know why our app reliably grows so large in RAM, but my hunch is
that maybe there are some pages that just use a truly large amount to
begin with - and I've learned Ruby doesn't release memory back after
it's GC'd, it just grows the process and keeps the free space to itself
in its own heap!
So I'm just eyeing pages that I know *can* have a lot going on, and
seeing what I find!
Well, sitting at the `MemoryHigh` limit still grinds the app to a halt
anyway, lmao. I guess it's a feature designed for well-behaved processes
and not for outright leaking ones?
Let's try just having systemd basically reset the app regularly when the
RAM hits a certain threshold. I think that's what this config will do,
we'll find out!
Lol ok, as I had kinda predicted, the memory bounds I set last time
were not tight enough, and it stalled out again! (It was at 75% and
fully just not working.)
Let's try this tighter bound instead!
So, Dependabot correctly reported that this version of puma is
vulernable, which I fixed in the main app already—but I didn't notice we
also use that version in this cute tiny placeholder app we use early in
the deployment process.
There's not a real security need to upgrade this, as this placeholder
app has no access to useful data when it is run, but I think it's better
to resolve this by fixing it than by silencing Dependabot! May as well!
Oh yeah right, let's `yarn install` and build as part of the `post-create.sh` script!
I didn't have a command to do a one-time dev build of the assets yet (just one-time prod, or dev with reloading), so I added `yarn build:dev`!
Okay, so I've kept `database.yml` using the new username and password
`impress_dev`, cuz I like that it helps clarify that it's dev stuff and
is impossible to confuse with prod. (I updated my local setup to match!)
But hardcoding `host: db` into here breaks my local setup where the
database _isn't_ at the hostname `db`. So I add a way for new optional
database URL environment variables to get merged in with these settings,
and then configured the dev container to use that—and just in the most
limited override possible, to avoid duplicating stuff we don't need to.
(I could've just used the same names `DATABASE_URL_{PRIMARY,OPENNEO_ID}`
for this, but idk, I think it's confusing to have the same one for both
dev and prod, even though Rails _does_ basically do this; see below.)
Normally, the environment variable `DATABASE_URL` just _does_ this, and
you don't need to include a `url` key at all to get this behavior. But
since we've got the legacy two-database thing going on, we do this
instead! If we were to merge the `openneo_id.users` table into the
primary database, we could simplify this!
I just restarted the impress app in production! First I logged in to see
why it wasn't responding, and I saw that there was almost no free RAM
left, and that the Rails app had grown to eat it all up!
So in this change, we set a memory limit: if the impress app is taking
up more than 75% of the machine's RAM, systemctl will try to shrink it
down; if it can't, then it will kill the app at 80%.
I'm not totally sure whether these bounds are tight enough? I didn't
look closely enough at the numbers to see what the app's actual usage
was according to systemctl at the time (`sudo systemctl status
impress`), so my hope is this is enough. But if we run into a memory
leak crash like that again, because it turns out even existing at 75%
RAM freezes the machine when running alongside its other processes, we
can decrease these numbers!
I also don't know the nature of the memory leak, and that could be worth
investigating—the app pretty cleanly fits into ~500–600MB when it starts
up, but then does seem to slowly but steadily grow. If it could be kept
at that size, it's possible we could downgrade the server and save some
costs—but that's a question for another day, since making sure we handle
memory leaks when they *do* happen is a more important robustness fix!
I missed two things:
1) `rake` wasn't available in the path (surprising but ok!), so I replaced it with the more modern and more portable `bin/rails` invocation.
2) Without specifying a `host`, Rails was trying to connect with the database over a socket instead of over a port. Here, we tell it where to actually connect!
I think I'll need to make some tweaks here, since this isn't compatible with my _own_ local setup—maybe I'll revert `database.yml`, and have the dev container use the `DATABASE_URL` environment variable to override it?
But whatever, this is working for now, and that's exciting to me!
Note: On my machine, the install step hangs for a loooong time, to the point where I usually give up. On Codespaces, it also took a while at the same step (`Installing devise-encrpytable`), but eventually got through it. Maybe on my machine it would work if I'm more patient? Idk! But it's good to see it working on something!!
I cleared the references to the Neopia server (old thing responsible
for pet loading without blocking the Rails app) out of here a while ago,
but didn't clear out this config value!
This also clears the `tmp` directory out of the initial repository's
state! Rails should auto-generate it when it wants to make temporary
stuff.
`touch tmp/restart.txt` is the way to restart a Rails app deployed with
Phusion Passenger. I'm, uhh, not sure how a copy of it ever made its way
into the codebase, maybe I was confused about how to restart my local
app?
Look, I'll be real, I have literally not run these automated tests in
probably like a whole decade. Most of these files are empty, the ones
that aren't seem basically trivial, and I bet half of it would fail
anyway.
If I wanted to do real automated testing, I would basically want to
start from scratch anyway, and apply coverage I can trust to the areas
I actually care about.
Until then, I feel like these gems and files are mostly just clutter,
and I don't like them being One More Barrier To Entry. Goodbye, unused
complexity!
Oh right, we intentionally fail if there's no SECRET_TOKEN provided, but
that's not really useful for development!
Here, we add a SECRET_TOKEN only used in development - which doesn't
need to be secret, because it doesn't guard actual user sessions!
In production, the behavior is unchanged.
Rails already creates little pre-gzipped `.gz` copies of all our assets
in the `public/assets` directory when we build. This configures nginx to
send those when available!
We weren't doing *any* gzip stuff before, so this helps a lot with those
bigger JS files, like the `wardrobe-2020` stuff. It's now at ~.5MB with
compression, which is still a bit big, but nowhere near as offensive as
the 4.5MB pre-anything, or 1.5MB post-minification, lol.
We're now all-in on impress.openneo.net for this box!
One little wrinkle is that certbot was initially upset that I had
already uploaded the copy-pasted certs from the other box to here, at
the file path it expected to get to manage. So, I moved those to
`/srv/impress/shared/temp-certs`, and changed the nginx config
accordingly; and then deleted the original and let certbot control it!
The usual stuff! Installed the new gem and its new deps, ran
`bin/rails app:update` and did my best to manually merge the dev/prod
config files with the new canonical defaults, deleted some migrations I
don't think are relevant to us, and yeah!
Also, Rails 7.1 seems to need `libyaml-dev` installed, so I added that
to the `deploy/setup.yml` playbook!
One thing to note is that, while I was here, I turned on some settings
relating to our use of SSL that technically weren't on before. This
should be fine and helpful? But if stuff breaks, well, check those!
We used to do this for weird clever caching tricks that I don't think
were actually very effective. We stopped using this a few months ago,
and now I'm finally cleaning up this supporting code!
Huh, Arel can *sometimes* handle just having an attribute stand in as
"X is true" in a condition, but sometimes gets upset about it. I guess
this changed in Rails since we recently wrote this?
Specifically, item search would crash on "is:nc" (but *not* "is:np"),
saying:
```
undefined method `fetch_attribute' for #<struct Arel::Attributes::Attribute relation=#<Arel::Table:0x0000000109a67110 @name="items", @klass=Item(…), @type_caster=#<ActiveRecord::TypeCaster::Map:0x0000000109a66e90 @klass=Item(…)>, @table_alias=nil>, name="is_manually_nc">
```
The traceback was a bit misleading (it happened at the part where we
merge all the scopes together), but that hinted to me that it working
with an attribute in a place where it expected a conditional. So I
converted the attribute in the `is_nc` scope to a conditional, and made
the matching change in `is_np`, and that fixed it! Ok phew!
Idk why, but unlike my previous experience with Rails devcontainers, this time the setup process is running so wildly slowly?
Might just be a transient issue on my machine, maybe something that would be improved with a restart and trying again another time? Or could be something about the MySQL image that doesn't run great in this context?
In any case, I'm just gonna set this down for now!
The URL anchors were getting like. double-encoded? The `closet[]` part
was encoding as `closet%255B%255D`. Maybe a thing in Rails, where you
need to mark them `html_safe` to insert them in a URL like that?
Well anyway, those URLs are redundant now, I just have it link straight
to the same outfit page as the big link!
Now, like in DTI 2020, opening an outfit will go straight to the editor.
I'm not 100% on whether this is actually like. the superior behavior?
But I think it's good enough, and it's what the wardrobe-2020 code
expects, so let's just roll with it for now!
Ohh I see, I made a mistake converting this from Next.js routing. It's
not that we had a URL search parameter named `outfitId`; it's that if
you were coming from the `/outfits/:outfitId` route, it would use that!
I still haven't gotten the rest of the site to point that route to this
page, but I'll do that in a later change.
Notable things:
- We used to have the parameters in the hash (`#`) part of the URL.
- We used to use the key `outfit=123` instead of `outfitId=123`.
In this change, we add backwards-compatibility for these things, while
still keeping the latest behavior too, with no change to the URLs we
generate!
Looks like the version of Prettier I just installed is v3, whereas our
last run in the impress-2020 repo was with v2. I don't think we had any
special config in that project, I think these are just changes to
Prettier's defaults, and I'm comfortable accepting them! (Mostly seems
like a lot of trailing commas.)
I'm trying out a new editor setup, and it noticed that Prettier isn't
obviously installed! I think it makes sense to put it in dev deps, even
if there's not a direct hook calling it—though tbh maybe I should add it
to `yarn dev` somehow?
Now, if I run `sudo -i -u impress` on the production server, it opens a
login bash shell, with all of the app's environment variables exported,
straight to `/srv/impress`.
This will let me quickly `cd current; bin/rails console` to start poking
at whatever needs poked!
Idk if this used to be different or what, but it looks like the current
behavior is: if you delete a closet list, it'll leave the hangers
present, but Classic DTI would not show them anywhere; but Impress 2020
(until recently) would crash about it.
Now, we use `dependent: :destroy` to delete the hangers when you delete
the list (which I think makes sense, and is different than what I
decided in the past but that's ok, and is what the current behavior
*looks* like to people!), and we add a migration that deletes orphaned
hangers.
The migration also outputs the deleted hangers as JSON, for us to hold
onto in case we made a mistake! I'm also backing up the database in
advance of running this migration, just in case we gotta roll back HARD!
This is an important workflow for people doing art stuff, I'm told! They used to use the Classic DTI broken image UI for this, but now that that's uhh Fully Gone, let's add this more explicitly!
Ah right, the CSS reset only applies in the ScopedCSSReset container, which doesn't work for elements portaled out with the <Portal> component (which a LOT of Chakra components use for things like tooltips etc).
Here, we take advantage of <Portal> having a hardcoded classname .chakra-portal, and applying it to them too!
This was used by the Neopia server to send us the modeling data it requested out-of-band. But now we do all our modeling requests back in-app again, so we don't need this!
Okay, this is a process that idk if it's even been working for a while anyway, I don't think Neopets translates item names anymore?
And it's crashing when I try to model stuff now, so like. yeah ok I'm fine with just skipping this, it's a shame to lose out on potential data going forward but *I think there just isn't data to get anyway*