Commit graph

11 commits

Author SHA1 Message Date
19f1ec092e Turn on Honeycomb instrumentation again
Well, instrumentation seems to be working fine again! The bug we ran into during commit e5081dab7e is gone. Cool!

I want to be able to see what's making the new box slow. My hypothesis was (and it seems to be right) that communication with the database on the Classic DTI server is slow.

But now that they're on the same Linode account and region, I think I can set up a private VLAN to make them muuuch faster. We'll try it out!
2021-11-26 23:41:22 -08:00
5039371a1d Simplify the page pool
Yeah ok, let's just run one browser instance and one pool.

I feel like I observed that, when I killed chromium in prod, pm2 noticed the abrupt loss of a child process and restarted the whole app process? which is rad? so maybe let's just trying relying on that and see how it goes
2021-11-13 02:27:24 -08:00
0c2939dfe4 Use Puppeteer instead of Playwright
We used Playwright in the first place to try to work around a Vercel deploy issue, and I'm not sure it really ended up mattering lol :p

But yeah, I'm putting the new Puppeteer code through the same prod stress test, and it just doesn't seem to be getting into the same broken state that Playwright was. I'm guessing it's just that Puppeteer has more investment in edge-case handling? (There's also the fact that we're no longer running things as root, which could have been a fucky problem, too?)
2021-11-13 02:16:58 -08:00
20fff261ef Switch to a small page pool
Hmm I am really not managing to keep the processes under control… maybe I'll try Puppeteer and see if it's just a bit more reliable??
2021-11-13 01:45:27 -08:00
18bc3df6f4 Use browser pooling for /api/assetImage
I tried running a pressure test against assetImage on prod with the open-source tool `wrk`:

```
wrk -t12 -c20 -d20s --timeout 20s 'https://impress-2020-box.openneo.net/api/assetImage?libraryUrl=https%3A%2F%2Fimages.neopets.com%2Fcp%2Fitems%2Fdata%2F000%2F000%2F522%2F522756_2bde0443ae%2F522756.js&size=600'
```

I found that, unsurprisingly, we run a lot of concurrent requests, which fill up memory with a lot of Chromium instances!

In this change, we declare a small pool of 2 browser contexts, to allow a bit of concurrency but still very strictly limit how many browser instances can actually get created. We might tune this number depending on the actual performance characteristics!
2021-11-12 23:35:30 -08:00
3ec0ae7557 Use localhost in /api/assetImage
Just another VERCEL_URL removal!
2021-11-12 22:08:06 -08:00
9753cbe173 /api/assetImage fixes in production
Now that we're not on Vercel's AWS Lambda deployment, we can switch to something a bit more standard!

I also tweaked up our version of Playwright, because, hey, why not?

Getting the package list was a bit tricky, but we got there! Left a comment to explain where it's from.
2021-11-12 21:39:35 -08:00
5470a49651 Use utf8 in API error messages
I noticed this when Playwright was trying to draw cute ASCII art and it wasn't showing up right! Not a big deal, but it's a bit more correct to do this, so let's do it!
2021-11-12 21:17:20 -08:00
171558a64f Move assetImage and outfitImage back into Nextjs
This should let us actually start working with them locally and in new prod!
2021-11-03 17:07:25 -07:00
f45ae20471 Fix /api/assetImage for real :p 2021-11-02 01:21:32 -07:00
8dab442929 [WIP] API routes working for Next.js
Things seemed to mostly work at first glance! I haven't tested outfitPageSSR because we'll need to redo it though, and also the outfit image routes aren't working anymore (vercel.json isn't how next.js works)
2021-11-01 22:25:43 -07:00
Renamed from api/assetImage.js (Browse further)