I want to start running this on a regular cron, and making the script faster (no more redundant queries) and clearer (it reports the number of manifests actually updated) is super useful for that!
Originally, this was sorta a cache warmup script: we wanted to fill in manifests that we hadn't checked for yet.
But now, I _also_ want to re-check previous cache misses, which we stored in the db as an empty string. Maybe they've been converted by now!
I was finding the script too slow on my local machine, because the SQL round trips were slow, and with one connection they were essentially a serial bottleneck that didn't take much advantage of our concurrency.
Here, I instead add a `--dump` option, which outputs SQL to stdout. I then uploaded the resulting SQL to the DTI box and ran it up there. That way, the network part runs fast on my machine, and the SQL part runs fast on the cloud machine!
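For a rough idea of what dump mode does, here's a minimal sketch, not the script's actual code. The table and column names (`swf_assets`, `manifest`, `id`) and all three function names are assumptions for illustration. The escaping is a simplified version that assumes MySQL's default string-literal rules.

```javascript
// Hypothetical sketch: `swf_assets`, `manifest`, and `id` are assumed names,
// not necessarily the real schema.

// Escape a string for use inside a single-quoted SQL string literal.
// Simplified: doubles backslashes and single quotes only.
function escapeSqlString(value) {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "''");
}

// Build one UPDATE statement for an asset.
// A null manifest (a cache miss) is stored as an empty string.
function buildUpdateStatement(id, manifest) {
  if (!Number.isInteger(id)) {
    throw new Error(`Expected an integer id, got: ${id}`);
  }
  const text = manifest === null ? "" : JSON.stringify(manifest);
  return `UPDATE swf_assets SET manifest = '${escapeSqlString(text)}' WHERE id = ${id};`;
}

// In dump mode, write each statement out instead of sending it over the
// database connection. By default, that means printing to stdout.
function dumpStatements(rows, write = (line) => console.log(line)) {
  for (const { id, manifest } of rows) {
    write(buildUpdateStatement(id, manifest));
  }
}
```

Redirecting stdout to a file then gives a plain SQL file that can be copied to the other machine and run there, so no statement pays a round trip from the local machine.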
I first considered uploading this script to the cloud machine, but it's an old Ubuntu and I couldn't figure out how to install a recent NodeJS onto it 🙃
Just gonna bulk load all those manifests into the db, and then that should make most loads notably faster by removing the net request! 🤞
We'll still load manifests inline sometimes, but only the first time anyone pulls up the layer in impress-2020. After that, it should be cached forever!