Well, holding onto the browser instance seems to be a source of bugs (my previous fix didn't seem to fix it), and I'm not _sure_ what the perf characteristics are, so let's just try a fresh instance each time!
I don't actually know what a browser "instance" in Playwright really is, I'm not sure it even necessarily creates a new process, I just don't know
and I saw some Vercel example code take this approach, which is definitely simpler, and I guess must not be _overtly_ bad perf if it's idiomatic?
So, like, ok, cool, let's see if this stops 500ing us with "Browser closed"! 😅
Whoops, `Promise.race` isn't quite what I wanted here. This meant that, if the image promise _fails_ before the movie _succeeds_, the outfit would crash even though it doesn't need to. (And this was happening too often, due to a bug in /api/assetImage!)
Now, we accept whichever _successful_ result loads first, or reject if they _both_ fail.
I tested this by having /api/assetImage always throw, and confirmed that it crashed the outfit before this change, and no longer does after this change!
Been seeing this in testing in prod, the first few images worked great, but then eventually they all started saying the browser was disconnected.
Here, we add a check to reconnect if it goes missing. This is actually kinda hard to test in dev, because the dev server creates a new process every time the function runs, so fingers crossed!
I also added explicit logic to close the page when we're done with it, I'm worried we crashed the browser by exceeding the RAM limit by leaving pages open. Not sure quite how their model works and whether things eventually get flushed out on their own!
We update /api/assetImage to accept size as a parameter (I make it mandatory to push people into HTTP caching happy paths), and we update the GraphQL thing to use it in those cases too!
This also means that, if these images seem to go well, we could swap Classic DTI over to them… I want to turn off those RAM-heavy image converters on the VPS lol
In this change, we start using our new API endpoint for movie image URLs, instead of the Classic DTI image.
This should make the little fade-in phase for certain movies a little bit less jarring (the part where we preload the image before the movie loads), though I suppose that won't necessarily load as fast until it gets into the cache the first time lol. (A good reason to maybe put a more long-lived cache like Fastly in front of this stuff long-term?)
Not doing it for the smaller image sizes yet, I'm a bit worried that I don't 100% know how to teach /api/assetImage to resize without tipping over the function limit…
…oh! I should have the webpage render at different sizes! Yeah that's a great idea lol
Okay cool, this one worked! We use this special Chrome package with AWS Lambda support, and then we use normal Playright in dev, and then we exclude `playwright` from the deployment (even though it got auto-detected by `require("playwright")`) to just barely sneak in under the 50MB limit for this function. Phew!
The preview deploys for this seem to be, actually working? So that's exciting!
So, just using normal playwright was crashing with this error: https://github.com/microsoft/playwright/issues/5862
I didn't understand why everyone was using playwright-core until I read the comments more carefully, and saw that it was because folks were using playwright-aws-lambda, because that's where Vercel functions run. (It has some special compat stuff.)
So I'm figuring that maybe the special case in Vercel's builder that fixes this for playwright-core maybe doesn't apply to normal playwright? But that people don't actually run into that issue in practice, because they're all using playwright-core for playwright-aws-lambda instead?
Idk, let's see how it goes! My hope is that this both fixes the immediate crasher about browsers.json being missing, _and_ fixes a problem we were _gonna_ have down the line about normal playwright not working in an AWS Lambda setting.
Marking this glitch on the Yellow Lutari head today, and oops there isn't UI copy for it yet! Added!
Also fixed some bugs in here, like old text about the position of the pose picker relative to the glitch badge, and I noticed while debugging that `layerUsesHTML5` returns a truthy string instead of a boolean which seems error-prone!
Hm, okay, so the documented way to not instrument anything doesn't actually stop them from patching Module._load. But this undocumented option sure does! So, woo, let's try it! lol
Huh, well, I can't figure out what in our production env stopped working with Honeycomb's automatic instrumentation… so, oh well! Let's try disabling it for now and see if it works.
This means our Honeycomb logs will no longer include _super helpful_ visualizations of how HTTP requests and MySQL queries create a request dependency waterfall… but I haven't opened Honeycomb in a while, and this bug is blocking all of prod, so if this fixes the site then I'm okay with that as a stopgap!
Btw the error message was:
```
Unhandled rejection: TypeError: Cannot read property 'id' of undefined at exports.instrumentLoad (/var/task/node_modules/honeycomb-beeline/lib/instrumentation.js:80:14) at Function._load (/var/task/node_modules/honeycomb-beeline/lib/instrumentation.js:164:16) at ModuleWrap.<anonymous> (internal/modules/esm/translators.js:199:29) at ModuleJob.run (internal/modules/esm/module_job.js:169:25) at Loader.import (internal/modules/esm/loader.js:177:24)
```
Oh also, this is the first time eslint has looked at scripts/build-cached-data.js I guess, so I fixed some lint errors in there.
I tried a "Redeploy" from the Vercel console, but it didn't seem to fix
the bug. That could mean that my theory about Node version as the
culprit is wrong? Or it could mean that we need fresh code for that
config change to take effect. So, this commit should simulate fresh
code! lol 😅
In production we're suddenly getting errors in module wrapping in honeycomb-beeline. I wonder if it's like, an incompatibility with Vercel's version of Node?
Well, this new version seems to still be playing nice on dev, so hopefully that's all it is and this fixes it! I give it like a 35% chance lol :p
Hmm, the item page in prod is slower than it is in dev? In dev, most items are satisfied by the preloading in ItemPagePreview, but in prod, those same items need to send a separate OutfitItemsAppearance query _way_ after (which, I think just due to queueing, waits for all the items to wait too).
There's an obvious issue in the case of all the Maraquan items lately, because we just don't do the clever cache lookups for non-standard colors at all. But I don't understand why even standard items like the 17th Birthday Party Hat are struggling!
These are just some simple debug statements, hopefully they'll tell us something about the basics of what's happening!
I didn't want to use the word "basic", since "basic colors" generally means like Blue, Red, Green, Yellow… but it was the only one that fit in the space lol
I tried a lot of stuff with "Fits standard pets" and stuff and couldn't get it to work well
Just a little display bug on the homepage. For an item like the "Evil Coconut Half Mask", which was specifically drawn for the standard _and_ major special colors, our previous logic would have said "Baby only" or "Maraquan only" or whatever special color it happened to find first.
Now, we only show the case "Baby only" if it _doesn't_ fit standard pets too.
Note that the Maraquan case is tricky, because the Blue Mynci can also wear Maraquan items lol! For this reason, we check for two standard bodies before declaring that it's meant for standard pets.
I wasn't sure how to fill the space for items that are fully modeled, then realized some basic at-a-glance "who does this fit" would help!
The load time isn't great, I think I need to break out that dependent subquery, but maybe the stale-while-revalidate will cover it well enough at first.
Add a skeleton stripe for the modeling data! Won't show up in most cases because we load fast, but it helps things a lot when it does. (Also, will we keep loading fast with the cache changes on this query?)
To make this fast, I had to tweak the GraphQL resolver a bit to run a filtered version of the query for `newestItems` instead of scanning the full database! But yeah, looking good!
I think I'm gonna want to swap out "Fully modeled" for some insight about who it fits
Okay cool, I noticed that "A Warm Winters Night Background" sometimes animates when other things are playing, but the animations aren't _detected_. (Huh, I actually thought we just didn't schedule ticks in that case? But maybe I'm missing something.)
Anyway, some movies don't use the built-in frames construct to animate, and instead use tweens that hook into the timeline and mutate the stage. Okay! Now we detect those.
This _did_ enable the Play/Pause button on some items that don't actually animate in practice, like the "#1 Fan Room Background", which seems to have an animated string of lights in the corner that got layered incorrectly. Maybe we should add a new glitch type, to flag movies that don't actually animate?
Doing this for two reasons! One is that I want the movie layer component to be a bit thinner in general - I think we might even want to move the fallback image logic out, too.
The second is that I want the onError for something else soon!
Huh, weird. So I reversed the manifest, because you want to get the *last* movie. And I figured that semantic probably extended to PNGs and SVGs too?
But actually, PNGs sometimes have *other* PNGs in the manifest that aren't the relevant asset at all, and are just reference art.
Again, I'm really not sure what the underlying semantic is here? Does the Neopets customizer just display them all, and for the items with this problem, they happen to layer in a way that's not broken?? I would really like to not do that, and I would really like to know the real semantic, but I can't find it >.>
So um, I'm going ahead and using the best semantic that solves the problems I know about? Which is, use the last movie, and use the first PNG. Fingers crossed lol!
I also didn't test this change extensively, because I'm on a train lol
I'm just trusting that this push will be better than what we had before. I tested it on the Dandan MME, which has two JS files, and it took the latter; and the Pathway of Petals Background, which has two PNG files, and it took the former. Success? 😬🤞
Oops, my inbox was getting full of uncaught promise rejections of `loadImage`!
I'm pretty sure they're caused when multiple images in a movie fail to load (e.g. network problems), but we fail to cancel them. So, the first failure would be caught as a part of `Promise.all` in `loadMovieLibrary`, but then subsequent failures wouldn't be caught by anything, and would propagate up to the console and to Sentry as uncaught errors.
In this change, we make a number of improvements to cancellation. The most relevant change for this bug is that `loadMovieLibrary` will now automatically cancel all resource promises when it throws an error! But this improved robustness also enabled us to finally offer a simple `cancel()` method on movie library promises, which we now available ourselves of at call sites, too.
Experiment! Let's see if them being more prominent like this is helpful or annoying 😅
I think this is clunkier in the HTML5 Green Happy Path, but worth it for bringing attention in the error cases.
But I feel like we might tweak this over time!
Huh, so it turns out sometimes the manifest will include old broken conversion attempts!
This fixed the "MiniMME18-S2c: Holomorphic Foliage and Dandan Set", the "Electric Dress" on various species (incl. Aisha), and yeah!
What an interesting discovery 😂
This is because I want to try adding a search footer to the two-column layout, like in Classic DTI—and so I want more screens that _can_ support two-column layout to use it.
Right, cool, yes, this is the thing about partial data; you need to define the loading condition as "relevant data is missing, _and_ loading is still happening".
Ah right okay, when the `ItemSearchResultV2` doesn't have an `id`, Apollo Cache isn't quite so strong about caching conflicting-y fields, like the different parameterizations of `items`.
With this change, we give the search result object an ID, which helps Apollo cache more confidently!
It's just a serialization of the relevant search fields 😅
That's the last itemSearch call site! I'll probably keep it up for other clients for a while though, esp since it doesn't depend on any additional loaders or anything, it's pretty small overall
Updated the comments to reflect this, and also remembered to make them real docstrings lol!
Now, when you click Prev/Next, we show the page number while the items load, rather than blink it in and out!
This is because we're using itemSearchV2, which makes `numTotalItems` cacheable separately from the paginated `items`. Apollo Cache pretty much does this with zero config, we just have to ask for `returnPartialData`!
The main change is that we restructure the query, so that only the parts that are actually affected by pagination depend on those variables!
This will enable the Apollo Cache to trivially cache and show `numTotalItems` while waiting for other pages to load.
I'm gonna make this a bit more powerful later, but just for now, the text "Page 1 of 27" shows up!
I also don't like that the page number has to blink out while we load the new stuff; there are multiple solutions, but tbh I think the Apollo Cache should be the one to handle this, and that we can do it by refactoring the query structure a bit!
I'm seeing uncaught promise rejections in `loadImage`? It's hard to know exactly where it's actually coming from, those _should_ be caught?
My guess is that it's coming from canceled images, which are throwing errors even after loading? I don't totally understand how, because looking back, I don't think the `cancel` method was actually called???
Anyway, I fixed it so cancel actually _is_ called, and that we don't throw errors when the canceled image _correctly_ fails to load.
This should be more robust either way, but hopefully it also stops the flow of errors?
Right, when there are zero layers, we shouldn't say we're loading!
This is a consequence of the HACK below. If we didn't short-circuit the effect when length == 0, then we would go through and successfully load 0 layers.