forked from OpenNeo/impress

Add handlers for requests that were stopped during the reboot process

According to our GlitchTip error tracker, every time we deploy, a
couple of instances of `Async::Stop` and `Async::Container::Terminate`
come in, presumably because:

1. systemd sends a stop signal (SIGTERM by default) to the
   `falcon host` process.
2. `falcon host` gives the in-progress requests some time to finish up.
3. Sometimes some requests take too long, and so something happens
   (either a timer in Falcon or a KILL signal from systemd, not sure!)
   that leads the ongoing requests to finally be terminated by raising
   an `Async::Stop` or `Async::Container::Terminate`. (I'm not sure
   when each happens; maybe they happen at different points in the
   process? Maybe one happens for the actual long-running requests,
   while the other happens if more requests come in during the meantime
   and get caught in the spin-down process?)
4. Rails bubbles up the errors, our Sentry library notices them and
   sends them to GlitchTip, the user presumably receives the generic
   500 error, and the app can finally close down gracefully.
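
To make steps 2 and 3 concrete, here's a rough sketch in plain Ruby of
what I think is going on. None of this is Falcon's real code:
`RequestStopped` and `drain` are made-up names. The one real detail I'm
leaning on is that `Async::Stop` (as far as I can tell) inherits from
`Exception` rather than `StandardError`, which would explain why an
ordinary `rescue => e` never swallows it.

```ruby
# Stand-in for the stop exceptions; NOT the real Async classes.
# Assumption: like Async::Stop, it inherits from Exception rather than
# StandardError, so a bare `rescue => e` would not catch it.
class RequestStopped < Exception; end

# Hypothetical drain step: requests that fit inside the grace period
# finish normally; the rest are interrupted with RequestStopped.
def drain(requests, grace_seconds:)
  requests.map do |name, seconds_needed|
    begin
      raise RequestStopped, "#{name} stopped" if seconds_needed > grace_seconds
      [name, :finished]
    rescue StandardError
      # Never reached for RequestStopped: it is not a StandardError.
      [name, :swallowed]
    rescue RequestStopped
      [name, :stopped]
    end
  end.to_h
end

RESULT = drain({ "fast" => 1, "slow" => 30 }, grace_seconds: 10)
```

Here `RESULT` comes out as `{"fast" => :finished, "slow" => :stopped}`:
the slow request skips right past the `StandardError` handler, which is
the same way I suspect the real exceptions bubble all the way up to
Rails.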

It's hard for me to validate that this is *exactly* what's happening
here, or that my mitigation makes sense. But my logic is basically: if
these exceptions are bubbling up as "uncaught exceptions" and spamming
up our error log, then the best solution would be to catch them!

So in this change, we add an error handler for these two error classes,
which hopefully will 1) give users a better experience when this
happens, and 2) stop these errors from being sent to our error
tracker 🤞️
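
For anyone unfamiliar with how `rescue_from` gets us there, here's a
tiny imitation of the idea in plain Ruby. It's only a sketch:
`MiniController`, `FakeStop`, and `FakeTerminate` are hypothetical
stand-ins, and the real logic lives in Rails (ActiveSupport::Rescuable)
and does a lot more than this.

```ruby
# Stand-ins for Async::Stop and Async::Container::Terminate (hypothetical).
class FakeStop < Exception; end
class FakeTerminate < Exception; end

# Minimal, hypothetical imitation of the `rescue_from` idea; the real
# implementation in Rails is considerably more involved.
class MiniController
  HANDLERS = {}

  # Register one handler method name for each of the given error classes.
  def self.rescue_from(*error_classes, with:)
    error_classes.each { |klass| HANDLERS[klass] = with }
  end

  rescue_from FakeStop, FakeTerminate, with: :on_request_stopped

  # Runs an action block. If it raises a registered error, return the
  # handler's response instead; anything else is re-raised untouched.
  def process
    yield
  rescue Exception => error
    handler = HANDLERS.find { |klass, _| error.is_a?(klass) }&.last
    raise unless handler
    send(handler)
  end

  def on_request_stopped
    [500, "Oops, caught in a reboot!"]
  end
end
```

The point is that once an error class has a registered handler, it is
no longer "uncaught" from the app's perspective: the user gets the
handler's response, and nothing is left to bubble up to the error
reporter. Unregistered errors still propagate as before.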

That strange phenomenon where the best way to get a noisy bug out of
your logs is to fix it lmao
Emi Matchu 2024-02-28 13:50:13 -08:00
parent 522287ed53
commit bc8d265672
2 changed files with 72 additions and 4 deletions


@@ -1,3 +1,5 @@
+require 'async'
+require 'async/container'
 require 'ipaddr'
 class ApplicationController < ActionController::Base
@@ -20,6 +22,11 @@ class ApplicationController < ActionController::Base
     end
   end
+  class AccessDenied < StandardError; end
+  rescue_from AccessDenied, with: :on_access_denied
+  rescue_from Async::Stop, Async::Container::Terminate,
+    with: :on_request_stopped
   def authenticate_user!
     redirect_to(new_auth_user_session_path) unless user_signed_in?
   end
@@ -52,14 +59,15 @@ class ApplicationController < ActionController::Base
     raise ActionController::RoutingError.new("#{record_name} not found")
   end
-  class AccessDenied < StandardError;end
-  rescue_from AccessDenied, :with => :on_access_denied
   def on_access_denied
     render file: 'public/403.html', layout: false, status: :forbidden
   end
+  def on_request_stopped
+    render file: 'public/stopped.html', layout: false,
+      status: :internal_server_error
+  end
   def redirect_back!(default=:back)
     redirect_to(params[:return_to] || default)
   end

public/stopped.html Normal file

@@ -0,0 +1,60 @@
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1" />
+    <title>Dress to Impress: Oops, caught in a reboot!</title>
+    <style type="text/css">
+      body {
+        background-color: #fff;
+        color: #666;
+        font-family: arial, sans-serif;
+        padding: 2em 1em;
+      }
+      main {
+        border: 1px solid #ccc;
+        margin-inline: auto;
+        padding: 1em;
+        max-width: 600px;
+        display: grid;
+        grid-template-areas: "illustration body";
+        grid-template-columns: auto 1fr;
+        column-gap: 1em;
+      }
+      h1 {
+        font-size: 1.5em;
+        margin: 0;
+        margin-bottom: 0.5em;
+      }
+      p {
+        margin-bottom: 0.5em;
+      }
+    </style>
+  </head>
+  <body>
+    <main>
+      <img
+        width="100"
+        height="100"
+        alt="Distressed Grundo programmer"
+        src="/images/error-grundo.png"
+      />
+      <div>
+        <h1>Oops, caught in a reboot!</h1>
+        <p>
+          Oh wow, hi! We're deploying a new version of DTI
+          <em>right now</em>, and your pageload got caught in the
+          middle of the restart process, sorry about this!
+        </p>
+        <p>
+          Reloading the page now should hopefully get you back where
+          you were going! 🤞
+        </p>
+      </div>
+    </main>
+  </body>
+</html>