Compare commits

..

6 commits

Author SHA1 Message Date
2e5b1c7350 Remove unused Item.per_page attribute
I feel like this was part of `will_paginate` back before the Rails
community had itself figured out about what belongs in a model?

But yeah, a default per-page value for search results does not belong
here. And I don't think anything references it anymore, because we pass
`per_page` to the `paginate` call in `ItemsController` explicitly! So,
goodbye!
2024-02-25 16:16:43 -08:00
fb2bdd6ea5 Fix crash when dealing with 404'd manifests
First off, I think our code has converged on a convention of gracefully
returning `nil` for manifest-less situations, so we can do that instead
of raise! And then that lets us just simplify this check to whether
`manifest` is present, instead of `manifest_url`, so we stop crashing
in cases where we get to this point in the code and there's a manifest
URL but not a manifest.
2024-02-25 16:05:43 -08:00
2cac048158 Save manifest load info when preloading them, too
This was a bit tricky! When I initially turned it on, running
`rails swf_assets:manifests:load` would trigger database errors of "oh
no we can't get a connection from the pool!", because too many records
were trying to concurrently save at once.

So now, we give ourselves the ability to say `save_changes: false`, and
then save them all in one batch after! That way, we're still saving by
default in the edge cases where we're downloading and saving a manifest
on the fly, but batching them in cases where we're likely to be dealing
with a lot of them!
2024-02-25 16:02:36 -08:00
cc33ce1d6e Skip loading manifest if we recently failed
This helps speed up some item search result pages a lot in the new API
endpoint I'm building!
2024-02-25 15:46:50 -08:00
a684c915a9 Track when manifest was last loaded, and what status it returned
Now we're *really* duplicating with Impress 2020's system lol, but I
need a way to not keep trying to load manifests that are actually 404,
which are surprisingly plentiful!

This doesn't actually stop us from loading anything yet, it just tracks
the timestamps and the HTTP status! But next I'll add logic to skip
when it was 4xx recently.
2024-02-25 15:35:04 -08:00
067cee2d41 Oops, fix bug reading manifest assets when .svg is not present
Ah right, I make this mistake a lot when doing `group_by` stuff: if
there's no SVG, then `assets_by_ext[:svg]` is `nil`, not `[]`. Oh well!
2024-02-25 15:02:35 -08:00
5 changed files with 86 additions and 25 deletions

View file

@ -18,9 +18,6 @@ class Item < ApplicationRecord
SPECIAL_COLOR_DESCRIPTION_REGEX = SPECIAL_COLOR_DESCRIPTION_REGEX =
/This item is only wearable by [a-zA-Z]+ painted ([a-zA-Z]+)\.|WARNING: This [a-zA-Z]+ can be worn by ([a-zA-Z]+) [a-zA-Z]+ ONLY!|If your Neopet is not painted ([a-zA-Z]+), it will not be able to wear this item\./ /This item is only wearable by [a-zA-Z]+ painted ([a-zA-Z]+)\.|WARNING: This [a-zA-Z]+ can be worn by ([a-zA-Z]+) [a-zA-Z]+ ONLY!|If your Neopet is not painted ([a-zA-Z]+), it will not be able to wear this item\./
cattr_reader :per_page
@@per_page = 30
scope :newest, -> { scope :newest, -> {
order(arel_table[:created_at].desc) if arel_table[:created_at] order(arel_table[:created_at].desc) if arel_table[:created_at]
} }

View file

@ -51,18 +51,65 @@ class SwfAsset < ApplicationRecord
end end
def manifest def manifest
raise "manifest_url is blank" if manifest_url.blank? @manifest ||= load_manifest
@manifest ||= NeopetsMediaArchive.load_json(manifest_url)
end end
def preload_manifest def preload_manifest(save_changes: true)
raise "manifest_url is blank" if manifest_url.blank? load_manifest(return_content: false, save_changes:)
NeopetsMediaArchive.preload_file(manifest_url) end
def load_manifest(return_content: true, save_changes: true)
return nil if manifest_url.blank?
# If we recently tried loading the manifest and got a 4xx HTTP status code
# (e.g. a 404, there's a surprising amount of these!), don't try again. But
# after enough time passes, if this is called again, we will!
#
# (We always retry 5xx errors, on the assumption that they probably
# represent intermittent failures, whereas 4xx errors are not likely to
# succeed just by retrying.)
if manifest_loaded_at.present?
last_try_was_4xx =(400...500).include?(manifest_status_code)
last_try_was_recent = (Time.now - manifest_loaded_at) <= 1.day
if last_try_was_4xx and last_try_was_recent
Rails.logger.debug "Skipping loading manifest for asset #{id}: " +
"last try was status #{manifest_status_code} at #{manifest_loaded_at}"
return nil
end
end
begin
NeopetsMediaArchive.load_file(manifest_url, return_content:) =>
{content:, source:}
rescue NeopetsMediaArchive::ResponseNotOK => error
Rails.logger.warn "Failed to load manifest for asset #{id}: " +
error.message
self.manifest_loaded_at = Time.now
self.manifest_status_code = error.status
save! if save_changes
return nil
end
if source == "network" || manifest_loaded_at.blank?
self.manifest_loaded_at = Time.now
self.manifest_status_code = 200
save! if save_changes
end
return nil unless return_content
begin
JSON.parse(content)
rescue JSON::ParserError => error
Rails.logger.warn "Failed to parse manifest for asset #{id}: " +
error.message
return nil
end
end end
MANIFEST_BASE_URL = Addressable::URI.parse("https://images.neopets.com") MANIFEST_BASE_URL = Addressable::URI.parse("https://images.neopets.com")
def manifest_asset_urls def manifest_asset_urls
return {} if manifest_url.nil? return {} unless manifest.present?
begin begin
# Organize the asset URLs by file extension, convert them from paths to # Organize the asset URLs by file extension, convert them from paths to
@ -88,7 +135,10 @@ class SwfAsset < ApplicationRecord
# (There's probably only one of each! I'm just going by the same logic # (There's probably only one of each! I'm just going by the same logic
# we've seen in the JS library case, that later entries are more likely # we've seen in the JS library case, that later entries are more likely
# to be correct.) # to be correct.)
{ png: assets_by_ext[:png].last, svg: assets_by_ext[:svg].last } {
png: assets_by_ext.fetch(:png, []).last,
svg: assets_by_ext.fetch(:svg, []).last,
}
end end
rescue StandardError => error rescue StandardError => error
Rails.logger.error "Could not read URLs from manifest: #{error.full_message}" Rails.logger.error "Could not read URLs from manifest: #{error.full_message}"
@ -259,7 +309,9 @@ class SwfAsset < ApplicationRecord
swf_assets.map do |swf_asset| swf_assets.map do |swf_asset|
semaphore.async do semaphore.async do
begin begin
swf_asset.preload_manifest # Don't save changes in this big async situation; we'll do it all
# in one batch after, to avoid too much database concurrency!
swf_asset.preload_manifest(save_changes: false)
rescue StandardError => error rescue StandardError => error
Rails.logger.error "Could not preload manifest for asset " + Rails.logger.error "Could not preload manifest for asset " +
"#{swf_asset.id} (#{swf_asset.manifest_url}): #{error.message}" "#{swf_asset.id} (#{swf_asset.manifest_url}): #{error.message}"
@ -272,6 +324,10 @@ class SwfAsset < ApplicationRecord
ensure ensure
barrier.stop # If something goes wrong, clean up all tasks. barrier.stop # If something goes wrong, clean up all tasks.
end end
SwfAsset.transaction do
swf_assets.each(&:save!)
end
end end
before_save do before_save do

View file

@ -17,11 +17,6 @@ module NeopetsMediaArchive
ROOT_PATH = Pathname.new(Rails.configuration.neopets_media_archive_root) ROOT_PATH = Pathname.new(Rails.configuration.neopets_media_archive_root)
# Load the file from the given `images.neopets.com` URI, as JSON.
def self.load_json(uri)
JSON.parse(load_file(uri))
end
# Load the file from the given `images.neopets.com` URI. # Load the file from the given `images.neopets.com` URI.
def self.load_file(uri, return_content: true) def self.load_file(uri, return_content: true)
local_path = local_file_path(uri) local_path = local_file_path(uri)
@ -31,7 +26,7 @@ module NeopetsMediaArchive
begin begin
content = File.read(local_path) content = File.read(local_path)
debug "Loaded source file from filesystem: #{local_path}" debug "Loaded source file from filesystem: #{local_path}"
return content return {content: content, source: "filesystem"}
rescue Errno::ENOENT rescue Errno::ENOENT
# If it doesn't exist, that's fine: just move on and download it. # If it doesn't exist, that's fine: just move on and download it.
end end
@ -42,7 +37,7 @@ module NeopetsMediaArchive
# we're not ready to use yet.) # we're not ready to use yet.)
if File.exist?(local_path) if File.exist?(local_path)
debug "Source file is already loaded, skipping: #{local_path}" debug "Source file is already loaded, skipping: #{local_path}"
return return {content: nil, source: "filesystem"}
end end
end end
@ -53,7 +48,7 @@ module NeopetsMediaArchive
File.write(local_path, content) File.write(local_path, content)
info "Wrote source file to filesystem: #{local_path}" info "Wrote source file to filesystem: #{local_path}"
return_content ? content : nil {content: return_content ? content : nil, source: "network"}
end end
# Load the file from the given `images.neopets.com` URI, but don't return its # Load the file from the given `images.neopets.com` URI, but don't return its
@ -78,10 +73,9 @@ module NeopetsMediaArchive
# requests in parallel! # requests in parallel!
Sync do Sync do
response = INTERNET.get(uri) response = INTERNET.get(uri)
if response.status == 404 if response.status != 200
raise NotFound, "origin server returned 404: #{uri}" raise ResponseNotOK.new(response.status),
elsif response.status != 200 "expected status 200 but got #{response.status} (#{uri})"
raise "expected status 200 but got #{response.status} (#{uri})"
end end
response.body.read response.body.read
end end
@ -106,7 +100,13 @@ module NeopetsMediaArchive
ROOT_PATH + path_within_archive(uri) ROOT_PATH + path_within_archive(uri)
end end
class NotFound < StandardError; end class ResponseNotOK < StandardError
attr_reader :status
def initialize(status)
super
@status = status
end
end
private private

View file

@ -0,0 +1,6 @@
class AddManifestLoadedAtAndManifestStatusCodeToSwfAssets < ActiveRecord::Migration[7.1]
def change
add_column :swf_assets, :manifest_loaded_at, :datetime
add_column :swf_assets, :manifest_status_code, :integer
end
end

View file

@ -10,7 +10,7 @@
# #
# It's strongly recommended that you check this file into your version control system. # It's strongly recommended that you check this file into your version control system.
ActiveRecord::Schema[7.1].define(version: 2024_02_21_005949) do ActiveRecord::Schema[7.1].define(version: 2024_02_25_231346) do
create_table "alt_styles", charset: "utf8mb4", collation: "utf8mb4_unicode_ci", force: :cascade do |t| create_table "alt_styles", charset: "utf8mb4", collation: "utf8mb4_unicode_ci", force: :cascade do |t|
t.integer "species_id", null: false t.integer "species_id", null: false
t.integer "color_id", null: false t.integer "color_id", null: false
@ -245,6 +245,8 @@ ActiveRecord::Schema[7.1].define(version: 2024_02_21_005949) do
t.timestamp "manifest_cached_at" t.timestamp "manifest_cached_at"
t.string "known_glitches", limit: 128, default: "" t.string "known_glitches", limit: 128, default: ""
t.string "manifest_url" t.string "manifest_url"
t.datetime "manifest_loaded_at"
t.integer "manifest_status_code"
t.index ["body_id"], name: "swf_assets_body_id_and_object_id" t.index ["body_id"], name: "swf_assets_body_id_and_object_id"
t.index ["type", "remote_id"], name: "swf_assets_type_and_id" t.index ["type", "remote_id"], name: "swf_assets_type_and_id"
t.index ["zone_id"], name: "idx_swf_assets_zone_id" t.index ["zone_id"], name: "idx_swf_assets_zone_id"