Again I'm just not convinced of the perf on this, and it enables us to delete some whole infra over it, we can improve it another time if it's useful to!
Just removing some caching and the expiration of it! There's still more superfluous(?) caching on the item page to audit, but these seem a bit more sensible about avoiding loading extra data.
In the interest of clearing out Resque, I'm just gonna remove a lot of our more complex caching stuff, and we can do a perf pass for things like big item list pages once everything's upgraded. (I'm hopeful that the upgrades themselves improve perf; and if not, that some improved sensibilities 10 years later can find simpler approaches.)
We uninstalled Flex, our Elasticsearch gem, to replace item search with direct DB queries; but I forgot these calls, oops!
I also kinda want to see about deleting the resque tasks altogether, since I'm not sure how to get Resque installed on latest rails bc there seems to be a conflict over the version of Rack? And it'd be nice to get rid of the complexity if we can.
Not doing the tricks with `is_positive` anymore, instead just calling different functions altogether at the call site.
Also, instead of classes, I feel like this is a lot more concise to just write as class methods that create certain instances of a trivial `Filter` data class. Without the tricks of `is_positive` in play, the value of classes goes way down imo.
The "fits:8-bit-chomby" search filter was being read as color=8, species=bit.
Now, we split from the right-hand side of the filter instead.
Still a problem for anyone who explicitly types the Spanish/Portuguese
ordering of "fits:chomby-8-bits", but I'm okay with this cheap fix, since
I bet literally nobody has done that in the past month, if ever :P
We used get_multi when preparing the proxies to decide which to
load from the database, but then sent multiple get requests to
Memcache to re-fetch the same data from that get_multi. Silly!
Use the data that's already stored on the proxy anyway.
Some lame benchmarking on my box, dev, cache classes, many items:
No proxies:
Fresh JSON: 175, 90, 90, 93, 82, 88, 158, 150, 85, 167 = 117.8
Cached JSON: (none)
Fresh HTML: 371, 327, 355, 328, 322, 346 = 341.5
Cached HTML: 173, 123, 175, 187, 171, 179 = 168
Proxies:
Fresh JSON: 175, 183, 269, 219, 195, 178 = 203.17
Cached JSON: 88, 70, 89, 162, 80, 77 = 94.3
Fresh HTML: 494, 381, 350, 334, 451, 372 = 397
Cached HTML: 176, 170, 104, 101, 111, 116 = 129.7
So, overhead is significant, but the gains when cached (and that should be
all the time, since we currently have 0 evictions) are definitely worth
it. Worth pushing, and probably putting some future effort into reducing
overhead.
On production (again, lame), items#index was consistently averaging
73-74ms when super healthy, and 82ms when pets#index was being louder
than usual. For reference is all. This will probably perform
significantly worse at first (in JSON, anyway, since HTML is already
mostly cached), so it might be worth briefly warming the cache after
pushing.
That is, once we get our list of IDs from the search engine, only
fetch records whose JSON we don't already have cached.
It's simpler here to use as_json, but it'd probably be even faster
if I figure out how to serve a plain JSON string from a Rails
controller. In the meantime, requests of entirely cached items
are coming in at about 85ms on average on my box (dev, cache
classes, many items), about 10ms better than the last
iteration.
We originally had a regression on name-matching, where, among
other issues, `straw hat` returned items containing both "straw"
and "hat", which isn't really helpful behavior since we're sorting
alphabetically. Now, `straw hat` behaves as expected.
Additionally, "phrases like these" behave as expected, too.
Confirmed features:
* Output (retrieval, sorting, etc.)
* Name (positive and negative, but new behavior)
* Flags (positive and negative)
Planned features:
* users:owns, user:wants
Known issues:
* Sets are broken
* Don't render properly
* Shouldn't actually be done as joined sets, anyway, since
we actually want (set1_zone1 OR set1_zone2) AND
(set2_zone1 OR set2_zone2), which will require breaking
it into multiple terms queries.
* Name has regressed: ignores phrases, doesn't require *all*
words. While we're breaking sets into multiple queries,
maybe we'll do something similar for name. In fact, we
really kinda have to if we're gonna keep sorting by name,
since "straw hat" returns all hats. Eww.