Oh yeah right, let's `yarn install` and build as part of the `post-create.sh` script!
I didn't have a command to do a one-time dev build of the assets yet (just one-time prod, or dev with reloading), so I added `yarn build:dev`!
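Roughly speaking, the new part of the script looks like this (a sketch, not copied from the real file):
```
# Install JS dependencies and do a one-time dev build of the assets.
yarn install
yarn build:dev   # one-time dev build, without watching for changes
```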
Okay, so I've kept `database.yml` using the new username and password
`impress_dev`, cuz I like that it helps clarify that it's dev stuff and
is impossible to confuse with prod. (I updated my local setup to match!)
But hardcoding `host: db` into here breaks my local setup, where the
database _isn't_ at the hostname `db`. So I added a way for new,
optional database URL environment variables to get merged in with these
settings, then configured the dev container to use that, with the most
limited override possible, to avoid duplicating settings we don't need
to.
(I could've just used the same names `DATABASE_URL_{PRIMARY,OPENNEO_ID}`
for this, but idk, I think it's confusing to have the same one for both
dev and prod, even though Rails _does_ basically do this; see below.)
Normally, the environment variable `DATABASE_URL` just _does_ this, and
you don't need to include a `url` key at all to get this behavior. But
since we've got the legacy two-database thing going on, we do this
instead! If we were to merge the `openneo_id.users` table into the
primary database, we could simplify this!
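Just to show the shape of the idea, here's a minimal sketch; the
variable and database names here are made up for illustration, not the
real ones:
```
development:
  primary:
    database: openneo_impress
    username: impress_dev
    password: impress_dev
    # Optional override, e.g. for the dev container, where the DB isn't
    # on localhost. (The variable name here is hypothetical.)
    <% if ENV["DEV_DATABASE_URL_PRIMARY"] %>
    url: <%= ENV["DEV_DATABASE_URL_PRIMARY"] %>
    <% end %>
```
Rails merges the parts of the `url` with the other keys in the same
config, so the dev container only has to specify the bits that differ.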
I just restarted the impress app in production! First I logged in to see
why it wasn't responding, and I saw that there was almost no free RAM
left, and that the Rails app had grown to eat it all up!
So in this change, we set a memory limit: if the impress app is taking
up more than 75% of the machine's RAM, systemd will try to shrink it
down; if it can't, it will kill the app once it passes 80%.
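Concretely, this is a couple of directives in the impress service's
systemd unit, roughly like this (a sketch, not the literal file):
```
[Service]
# Above this, systemd starts reclaiming memory from the service:
MemoryHigh=75%
# Above this, the service gets killed outright:
MemoryMax=80%
```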
I'm not totally sure whether these bounds are tight enough? I didn't
look closely at the numbers to see what the app's actual usage was
according to `sudo systemctl status impress` at the time, so I'm hoping
this is enough. But if we run into a memory-leak crash like that again,
because it turns out that even sitting at 75% RAM freezes the machine
when running alongside its other processes, we can decrease these
numbers!
I also don't know the nature of the memory leak, and that could be worth
investigating—the app pretty cleanly fits into ~500–600MB when it starts
up, but then does seem to slowly but steadily grow. If it could be kept
at that size, it's possible we could downgrade the server and save some
costs—but that's a question for another day, since making sure we handle
memory leaks when they *do* happen is a more important robustness fix!
I missed two things:
1) `rake` wasn't available in the path (surprising but ok!), so I replaced it with the more modern and more portable `bin/rails` invocation.
2) Without specifying a `host`, Rails was trying to connect with the database over a socket instead of over a port. Here, we tell it where to actually connect!
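For (2), the `database.yml` addition is roughly this, assuming the dev container's MySQL service is reachable at the hostname `db` (the exact keys may differ):
```
development:
  primary:
    host: db
    port: 3306
```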
I think I'll need to make some tweaks here, since this isn't compatible with my _own_ local setup—maybe I'll revert `database.yml`, and have the dev container use the `DATABASE_URL` environment variable to override it?
But whatever, this is working for now, and that's exciting to me!
Note: On my machine, the install step hangs for a loooong time, to the point where I usually give up. On Codespaces, it also took a while at the same step (`Installing devise-encryptable`), but eventually got through it. Maybe on my machine it would work if I'm more patient? Idk! But it's good to see it working on something!!
I cleared the references to the Neopia server (old thing responsible
for pet loading without blocking the Rails app) out of here a while ago,
but didn't clear out this config value!
This also clears the `tmp` directory out of the initial repository's
state! Rails should auto-generate it when it wants to make temporary
stuff.
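To keep it from creeping back into the repo, the usual Rails-style
ignore rules look something like this (I'm not certain ours are worded
exactly this way):
```
/tmp/*
!/tmp/.keep
```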
`touch tmp/restart.txt` is the way to restart a Rails app deployed with
Phusion Passenger. I'm, uhh, not sure how a copy of it ever made its way
into the codebase, maybe I was confused about how to restart my local
app?
Look, I'll be real, I have literally not run these automated tests in
probably like a whole decade. Most of these files are empty, the ones
that aren't seem basically trivial, and I bet half of it would fail
anyway.
If I wanted to do real automated testing, I would basically want to
start from scratch anyway, and apply coverage I can trust to the areas
I actually care about.
Until then, I feel like these gems and files are mostly just clutter,
and I don't like them being One More Barrier To Entry. Goodbye, unused
complexity!
Oh right, we intentionally fail if there's no SECRET_TOKEN provided, but
that's not really useful for development!
Here, we add a SECRET_TOKEN that's only used in development, which
doesn't need to be secret because it doesn't guard actual user sessions!
In production, the behavior is unchanged.
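The shape of it is something like this sketch; which config knob it
feeds depends on the Rails version, and the fallback string is just an
example:
```
secret = ENV["SECRET_TOKEN"]
if secret.blank? && Rails.env.development?
  # Doesn't need to be secret: development sessions don't guard real users.
  secret = "dev-only-not-actually-secret-" + ("0" * 64)
end
raise "SECRET_TOKEN must be set" if secret.blank?
Rails.application.config.secret_key_base = secret
```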
Rails already creates little pre-gzipped `.gz` copies of all our assets
in the `public/assets` directory when we build. This configures nginx to
send those when available!
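On the nginx side, this is basically a single directive, roughly like
this (a sketch; the real config has more going on around it, and the
`location` block here is an assumption):
```
location /assets/ {
  # If public/assets/foo.js.gz exists and the client accepts gzip, send
  # that file directly instead of the uncompressed one.
  gzip_static on;
}
```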
We weren't doing *any* gzip stuff before, so this helps a lot with those
bigger JS files, like the `wardrobe-2020` stuff. It's now at ~0.5MB with
compression, which is still a bit big, but nowhere near as offensive as
the 4.5MB pre-anything, or 1.5MB post-minification, lol.
We're now all-in on impress.openneo.net for this box!
One little wrinkle is that certbot was initially upset that I had
already uploaded the copy-pasted certs from the other box to here, at
the file path it expected to get to manage. So, I moved those to
`/srv/impress/shared/temp-certs`, and changed the nginx config
accordingly; and then deleted the original and let certbot control it!
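For reference, with certbot in charge, the nginx config ends up
pointing at the standard certbot-managed paths, roughly like this (I'm
assuming the standard layout rather than quoting our actual config):
```
ssl_certificate     /etc/letsencrypt/live/impress.openneo.net/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/impress.openneo.net/privkey.pem;
```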
The usual stuff! Installed the new gem and its new deps, ran
`bin/rails app:update` and did my best to manually merge the dev/prod
config files with the new canonical defaults, deleted some migrations I
don't think are relevant to us, and yeah!
Also, Rails 7.1 seems to need `libyaml-dev` installed, so I added that
to the `deploy/setup.yml` playbook!
One thing to note is that, while I was here, I turned on some settings
relating to our use of SSL that technically weren't on before. This
should be fine and helpful? But if stuff breaks, well, check those!
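I believe the settings in question are along these lines; this is a
sketch, and I'm not certain these are the exact ones that changed:
```
# config/environments/production.rb
config.force_ssl = true   # redirect HTTP to HTTPS, and mark cookies as secure
config.assume_ssl = true  # trust that the nginx proxy already terminated SSL (new in 7.1)
```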
We used to do this for weird clever caching tricks that I don't think
were actually very effective. We stopped using this a few months ago,
and now I'm finally cleaning up this supporting code!
Huh, Arel can *sometimes* handle just having an attribute stand in as
"X is true" in a condition, but sometimes gets upset about it. I guess
this changed in Rails sometime after we wrote this (pretty recently)?
Specifically, item search would crash on "is:nc" (but *not* "is:np"),
saying:
```
undefined method `fetch_attribute' for #<struct Arel::Attributes::Attribute relation=#<Arel::Table:0x0000000109a67110 @name="items", @klass=Item(…), @type_caster=#<ActiveRecord::TypeCaster::Map:0x0000000109a66e90 @klass=Item(…)>, @table_alias=nil>, name="is_manually_nc">
```
The traceback was a bit misleading (it happened at the part where we
merge all the scopes together), but it hinted to me that we were
passing an attribute in a place where Arel expected a conditional. So I
converted the attribute in the `is_nc` scope to a conditional, made the
matching change in `is_np`, and that fixed it! Ok phew!
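Roughly, the change was along these lines (the scope and column names
come from the error above; the real code might differ a bit):
```
# Before: passing the bare attribute where a condition is expected.
scope :is_nc, -> { where(arel_table[:is_manually_nc]) }

# After: an explicit conditional, which current Arel is happy to merge.
scope :is_nc, -> { where(arel_table[:is_manually_nc].eq(true)) }
```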
Idk why, but unlike my previous experience with Rails devcontainers, this time the setup process is running so wildly slowly?
Might just be a transient issue on my machine, maybe something that would be improved with a restart and trying again another time? Or could be something about the MySQL image that doesn't run great in this context?
In any case, I'm just gonna set this down for now!
The URL anchors were getting like. double-encoded? The `closet[]` part
was encoding as `closet%255B%255D`. Maybe a thing in Rails, where you
need to mark them `html_safe` to insert them in a URL like that?
Well anyway, those URLs are redundant now, I just have it link straight
to the same outfit page as the big link!
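The replacement is basically just the standard link helper pointing at
the outfit page, something like this (the helper and variable names
here are assumptions):
```
<%# Link straight to the outfit page, instead of rebuilding the old
    closet[]-style anchor params: %>
<%= link_to outfit.name, outfit_path(outfit) %>
```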
Now, like in DTI 2020, opening an outfit will go straight to the editor.
I'm not 100% on whether this is actually like. the superior behavior?
But I think it's good enough, and it's what the wardrobe-2020 code
expects, so let's just roll with it for now!
Ohh I see, I made a mistake converting this from Next.js routing. It's
not that we had a URL search parameter named `outfitId`; it's that if
you were coming from the `/outfits/:outfitId` route, it would use that!
I still haven't gotten the rest of the site to point that route to this
page, but I'll do that in a later change.
Notable things:
- We used to have the parameters in the hash (`#`) part of the URL.
- We used to use the key `outfit=123` instead of `outfitId=123`.
In this change, we add backwards-compatibility for these things, while
still keeping the latest behavior too, with no change to the URLs we
generate!
Looks like the version of Prettier I just installed is v3, whereas our
last run in the impress-2020 repo was with v2. I don't think we had any
special config in that project, I think these are just changes to
Prettier's defaults, and I'm comfortable accepting them! (Mostly seems
like a lot of trailing commas.)
I'm trying out a new editor setup, and it noticed that Prettier isn't
obviously installed! I think it makes sense to put it in dev deps, even
if there's not a direct hook calling it—though tbh maybe I should add it
to `yarn dev` somehow?
Now, if I run `sudo -i -u impress` on the production server, it opens a
login bash shell, with all of the app's environment variables exported,
straight to `/srv/impress`.
This will let me quickly `cd current; bin/rails console` to start poking
at whatever needs poked!
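The mechanism is just that the impress user's home directory is
`/srv/impress`, plus a login-shell profile that exports the app's
environment; something like this sketch (the env file path here is a
guess, not the real one):
```
# ~/.profile for the impress user (sketch)
set -a                        # auto-export everything sourced below
. /srv/impress/shared/.env    # hypothetical path to the app's env vars
set +a
```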
Idk if this used to be different or what, but it looks like the current
behavior is: if you delete a closet list, it'll leave the hangers
present; Classic DTI wouldn't show them anywhere, but Impress 2020
(until recently) would crash about it.
Now, we use `dependent: :destroy` to delete the hangers when you delete
the list (which I think makes sense; it's different from what I decided
in the past, but that's okay, and it matches what the current behavior
*looks* like to people!), and we add a migration that deletes orphaned
hangers.
The migration also outputs the deleted hangers as JSON, for us to hold
onto in case we made a mistake! I'm also backing up the database in
advance of running this migration, just in case we gotta roll back HARD!
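In code terms, the association change is the usual one-liner, and the
migration is shaped roughly like this sketch (model, association, and
column names are my guesses, not copied from the real files, and the
migration version tag is a guess too):
```
# app/models/closet_list.rb
has_many :hangers, class_name: "ClosetHanger", dependent: :destroy

# The cleanup migration, roughly:
class DeleteOrphanedClosetHangers < ActiveRecord::Migration[7.1]
  def up
    orphaned = ClosetHanger.where.not(list_id: nil).where.missing(:list)
    puts orphaned.to_json # a copy to hold onto, in case this was a mistake
    orphaned.delete_all
  end

  def down
    # Restoring would mean re-importing the JSON output above by hand.
    raise ActiveRecord::IrreversibleMigration
  end
end
```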
This is an important workflow for people doing art stuff, I'm told! They used to use the Classic DTI broken image UI for this, but now that that's uhh Fully Gone, let's add this more explicitly!
Ah right, the CSS reset only applies in the ScopedCSSReset container, which doesn't work for elements portaled out with the <Portal> component (which a LOT of Chakra components use for things like tooltips etc).
Here, we take advantage of <Portal> having a hardcoded classname, .chakra-portal, and apply the reset to it too!
This was used by the Neopia server to send us the modeling data it requested out-of-band. But now we do all our modeling requests back in-app again, so we don't need this!
Okay, this is a process that, idk, might not even have been working for a while anyway; I don't think Neopets translates item names anymore?
And it's crashing when I try to model stuff now, so like. yeah ok, I'm fine with just skipping this; it's a shame to lose out on potential data going forward, but *I think there just isn't data to get anyway*.
I think we used this for both conversion to image, and also for CORS stuff when rendering Flash-based previews… let's trash it, I don't want to be growing our hard drive with files I don't think we use anymore!
If I'm wrong and it turns out we do use them for something, then like. hey, I'm sure we'll find out soon enough, and it's a very recoverable operation.
I hope this doesn't cause problems! But yeah, with Puma doing threading, and maybe switching to Falcon someday to get even better concurrency properties, I feel like this will probably be fine?
And it makes the UX a loootttt better, to be back in the world where all these forms just work, whew.
Oh okay, I was misinterpreting the error: it was that our NEOPETS_URL_ORIGIN secret value isn't the real Neopets.com IP address anymore, so amfphp requests were just plain *always* failing in production. Oops!
I've removed that environment variable from our production config, and now modeling is working in the bulk thing!
Also I'm noticing that we're using puma these days, which does good threading stuff. I think there might be merit to switching over to Falcon because of just how async-y our stuff is, but having 5 threads going is honestly probably good enough that I don't need to worry too much about mutual blocking, and could probably just write stuff to get Neopia out of the picture like *right now*. Neat!
Okay so… I'm worried about this because of Rails' whole single-threaded situation, which doesn't really let it handle blocking on external network requests very well.
Ultimately I think we're gonna have to do a clever thing but idk quite what?
I should look into whether like, puma + the new async stuff can enable Rails to be more tolerable about this, and handle a few requests at once, instead of having to have the Neopia server doing it. (Right now, the Neopia server isn't really doing its job quite right, because it depends on the Rails app being *local* to send stuff to it.)
But for now, let's just extend the timeout, cuz it's basically always getting hit in production—because there's currently no other way to do modeling, oops lol
I'm not sure why this was causing problems? especially why *now*? But I was seeing errors in systemctl of it trying to parse this comment as an environment variable soooo ok!
Could just be an intermittent thing where like, a byte got dropped last time we transferred this file or something? But whatever, this has fixed it, and the comment is in a more reasonable place now anyway!
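For future reference, systemd's EnvironmentFile parsing only treats a line as a comment when it *starts* with `#` (or `;`), so the safe shape is something like this (the value is a placeholder):
```
# The app's secret token (comment on its own line, above the value):
SECRET_TOKEN=placeholder-not-a-real-value
```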