At this point, I've gone through all the assets, and the only ones
without manifests are:
1. The ones that truly have no manifest yet (that we know of)
2. The ones where execution happened to time out
I think the 5-second timeout is a very reasonable default for starting
the backfill, in a way that prioritizes moving forward; but now that we
have most things, I'd rather be able to re-run it with a more generous
timeout. So here we are!
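
A minimal sketch of what an overridable timeout could look like, assuming a hypothetical `MANIFEST_TIMEOUT` environment variable and hypothetical helper names (none of these are from the actual task):

```ruby
require "timeout"

# Assumption: 5 seconds stays the default, matching the original backfill.
DEFAULT_TIMEOUT_SECONDS = 5

# Hypothetical helper: read the timeout (in seconds) from the environment,
# falling back to the default when the variable isn't set.
def manifest_timeout(env = ENV)
  Float(env.fetch("MANIFEST_TIMEOUT", DEFAULT_TIMEOUT_SECONDS))
end

# Hypothetical helper: run the block under the timeout, and return nil
# instead of raising if it runs out of time, so the caller can retry later.
def with_manifest_timeout(env = ENV)
  Timeout.timeout(manifest_timeout(env)) { yield }
rescue Timeout::Error
  nil
end
```

With something like this, a re-run with a more generous limit is just a matter of setting the variable for that one invocation.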
I tried to port the Rainbow Pool ones forward, but ran into issues with the
service that uses browser-specific stuff to check that traffic is valid :/
Incidentally, those were the only places we were using `rest-client`.
Goodbye!
I noticed an asset that, I think, referenced an item that doesn't
exist, which caused an error in the `body_specific?` validation step.
Tbh that validation step needs to be fixed up in a number of ways, but
I'm scared to, since it's hard to know what will break modeling lol.
But in any case, more graceful handling is nice! If something happens,
I'd rather leave it null and try again later than have the job crash!
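
Roughly the shape of "leave it null and try again later", as a sketch. The `Asset` and `Item` structs and both method names here are hypothetical stand-ins, not the app's real models or validation code:

```ruby
# Hypothetical stand-ins for the real models.
Asset = Struct.new(:id, :item)
Item = Struct.new(:body_specific)

# Stand-in for the real check: raises NoMethodError when the asset's
# item is missing (asset.item is nil).
def compute_body_specific(asset)
  asset.item.body_specific
end

# Wrap the check so one bad asset logs a warning and yields nil,
# rather than crashing the whole job.
def body_specific_or_nil(asset)
  compute_body_specific(asset)
rescue StandardError => error
  warn "Skipping asset #{asset.id}: #{error.class}: #{error.message}"
  nil
end
```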
It's not just that none of them were 200 OK, it's that they were all 404.
In the event that something returns not-200 and not-404, we immediately
abort, so we shouldn't get to this case unless they were all 404!
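
The control flow described above can be sketched like this. The error class, method name, and the `fetch_status` callable (anything that takes a URL and returns an integer status) are assumptions for illustration:

```ruby
# Hypothetical error for any response that is neither 200 nor 404.
class UnexpectedStatusError < StandardError; end

# Try each candidate URL in order. Return the first one that responds
# 200 OK; abort immediately on anything other than 200 or 404.
def first_existing_url(candidate_urls, fetch_status)
  candidate_urls.each do |url|
    status = fetch_status.call(url)
    return url if status == 200
    raise UnexpectedStatusError, "#{url} returned #{status}" unless status == 404
  end

  # We can only get here if every single candidate was a 404.
  nil
end
```

Because the loop raises on the first unexpected status, falling out of the bottom really does mean "all 404", not merely "none were 200".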
Okay, I've simplified the migration to *just* add the column, and
instead added a task to find assets without manifest URLs and backfill
them.
Performance is a lot better now, using the `async-http` library, which
as I understand it supports both persistent connections when invoked
like this, and maybe also HTTP/2 multiplexing?? (Though I'm not
actually sure images.neopets.com does lol)
I'm not sure about the number of concurrent tasks I picked here, 100
seems okay for an internet thing and for such small requests, but I
worry that the CDN is gonna get annoyed or something. Well, we'll see!
This task is very resumable if it turns out we get frozen out or
something.
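
For a feel of the "bounded concurrency, safe to resume" shape, here's a sketch using plain stdlib threads and a `Queue`. To be clear, this is a swapped-in technique and not `async-http`; the method name and the hash-shaped assets are hypothetical, and the block stands in for the actual HTTP lookup:

```ruby
# Backfill only the assets that still have no manifest URL, with at most
# `concurrency` lookups in flight. Re-running after an interruption just
# picks up whatever is still nil.
def backfill_missing_manifests(assets, concurrency: 100)
  pending = Queue.new
  assets.each { |asset| pending << asset if asset[:manifest_url].nil? }
  pending.close # once drained, `pop` returns nil and the workers stop

  workers = Array.new(concurrency) do
    Thread.new do
      while (asset = pending.pop)
        # The block does the lookup; a nil result leaves the asset
        # eligible for the next run.
        asset[:manifest_url] = yield(asset)
      end
    end
  end

  workers.each(&:join)
  assets
end
```

Lowering `concurrency` is the obvious knob to turn if the CDN does start pushing back.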
I don't think these work anymore, and our volunteers get new items into the db fast anyway; Impress 2020 is doing better spidering these days. And then we get to remove the `whenever` gem we used for the cron job!
Yay, we've deleted all our background tasks!
We'll probably want to replace some of the basic functionality, like certain caching? But we can deal with that as we run into it.
The direct motivation here was a seeming version conflict between Rails 4.2's rack dependency and latest Resque's rack dependency... but this is just nice complexity elimination regardless, we want this anyway :3
NOTE: This doesn't boot yet! Something changed in the `devise` API that we'll need to fix!
```
/vagrant/config/initializers/devise.rb:46:in `block in <top (required)>': undefined method `encryptor=' for Devise:Module (NoMethodError)
```
But yeah, we navigated the gem upgrades, and also I ran `rake rails:update` and hand-processed the suggestions it had for our config files.
This includes allowing null on some item fields, and putting the swf_assets
type and id index in an actual migration; otherwise this commit would have
removed it upon migrating.