From bc8d2656723760a7e365f314d4caf6f0e12c3100 Mon Sep 17 00:00:00 2001
From: Emi Matchu <emi@matchu.dev>
Date: Wed, 28 Feb 2024 13:50:13 -0800
Subject: [PATCH] Add handlers for requests that were stopped during the reboot
 process
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

According to our GlitchTip error tracker, every time we deploy, a
couple instances of `Async::Stop` and `Async::Container::Terminate`
come in, presumably because:

1. systemd sends a stop signal (SIGTERM by default) to the
   `falcon host` process.
2. `falcon host` gives the in-progress requests some time to finish up.
3. Sometimes some requests take too long, and so something happens
   (either a timer in Falcon or a KILL signal from systemd, not sure!)
   that leads the ongoing requests to finally be terminated by raising
   an `Async::Stop` or `Async::Container::Terminate`. (I'm not sure
   when each happens, and maybe they happen at different points in the
   process? Maybe one happens for the actual long-running ones, vs the
   other happens if more requests come in during the meantime but get
   caught in the spin-down process?)
4. Rails bubbles up the errors, our Sentry library notices them and
   sends them to GlitchTip, the user presumably receives the generic
   500 error, and the app can finally close down gracefully.

It's hard for me to validate that this is *exactly* what's happening
here or that my mitigation makes sense, but my logic here is basically:
if these exceptions are bubbling up as "uncaught exceptions" and
spamming up our error log, then the best solution would be to catch
them!

So in this change, we add an error handler for these two error classes,
which hopefully will 1) give users a better experience when this
happens, and 2) stop sending these errors to GlitchTip 🤞❗️
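
For intuition, here's a rough sketch of the catch-and-render idea in
plain Ruby, with no Rails or Falcon involved. `FakeStop`,
`FakeTerminate`, and `MiniController` are made-up stand-ins, and I'm
assuming the real Async errors don't inherit from `StandardError`
(hence the broad `rescue Exception`). This is not how Rails implements
`rescue_from` internally, just the general shape of it:

```ruby
# Made-up stand-ins for Async::Stop and Async::Container::Terminate, so
# this runs without the async gems. Assumption: like these stand-ins,
# the real classes are not StandardError subclasses.
class FakeStop < Exception; end
class FakeTerminate < Exception; end

# Hypothetical miniature of a controller with rescue_from-style handlers.
class MiniController
  HANDLERS = {
    FakeStop => :on_request_stopped,
    FakeTerminate => :on_request_stopped,
  }.freeze

  # Runs the action. If it raises a registered error, call the handler
  # instead of letting the error bubble up as an uncaught exception.
  def process
    yield
  rescue Exception => error
    _, handler = HANDLERS.find { |klass, _| error.is_a?(klass) }
    raise unless handler # unregistered errors still bubble up
    send(handler)
  end

  # Mirrors the new handler: a friendly page, still with a 500 status.
  def on_request_stopped
    { file: "public/stopped.html", status: 500 }
  end
end

controller = MiniController.new
result = controller.process { raise FakeStop }
```

Anything not in the handler table is re-raised, so other errors would
still reach the error tracker as before.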

That strange phenomenon where the best way to get a noisy bug out of
your logs is to fix it lmao
---
 app/controllers/application_controller.rb | 16 ++++--
 public/stopped.html                       | 60 +++++++++++++++++++++++
 2 files changed, 72 insertions(+), 4 deletions(-)
 create mode 100644 public/stopped.html
diff --git a/app/controllers/application_controller.rb b/app/controllers/application_controller.rb
index c8df5d4b..f6628297 100644
--- a/app/controllers/application_controller.rb
+++ b/app/controllers/application_controller.rb
@@ -1,3 +1,5 @@
+require 'async'
+require 'async/container'
 require 'ipaddr'
 
 class ApplicationController < ActionController::Base
@@ -20,6 +22,11 @@ class ApplicationController < ActionController::Base
     end
   end
 
+  class AccessDenied < StandardError; end
+  rescue_from AccessDenied, with: :on_access_denied
+  rescue_from Async::Stop, Async::Container::Terminate,
+    with: :on_request_stopped
+
   def authenticate_user!
     redirect_to(new_auth_user_session_path) unless user_signed_in?
   end
@@ -52,14 +59,15 @@ class ApplicationController < ActionController::Base
     raise ActionController::RoutingError.new("#{record_name} not found")
   end
 
-  class AccessDenied < StandardError;end
-
-  rescue_from AccessDenied, :with => :on_access_denied
-
   def on_access_denied
     render file: 'public/403.html', layout: false, status: :forbidden
   end
 
+  def on_request_stopped
+    render file: 'public/stopped.html', layout: false,
+      status: :internal_server_error
+  end
+
   def redirect_back!(default=:back)
     redirect_to(params[:return_to] || default)
   end
diff --git a/public/stopped.html b/public/stopped.html
new file mode 100644
index 00000000..7daeca1e
--- /dev/null
+++ b/public/stopped.html
@@ -0,0 +1,60 @@
+<!doctype html>
+<html lang="en">
+	<head>
+		<meta charset="utf-8" />
+		<meta name="viewport" content="width=device-width, initial-scale=1" />
+		<title>Dress to Impress: Oops, caught in a reboot!</title>
+		<style type="text/css">
+			body {
+				background-color: #fff;
+				color: #666;
+				font-family: arial, sans-serif;
+				padding: 2em 1em;
+			}
+
+			main {
+				border: 1px solid #ccc;
+				margin-inline: auto;
+				padding: 1em;
+				max-width: 600px;
+
+				display: grid;
+				grid-template-areas: "illustration body";
+				grid-template-columns: auto 1fr;
+				column-gap: 1em;
+			}
+
+			h1 {
+				font-size: 1.5em;
+				margin: 0;
+				margin-bottom: 0.5em;
+			}
+
+			p {
+				margin-bottom: 0.5em;
+			}
+		</style>
+	</head>
+	<body>
+		<main>
+			<img
+				width="100"
+				height="100"
+				alt="Distressed Grundo programmer"
+				src="/images/error-grundo.png"
+			/>
+			<div>
+				<h1>Oops, caught in a reboot!</h1>
+				<p>
+					Oh wow, hi! We're deploying a new version of DTI
+					<em>right now</em>, and your pageload got caught in the
+					middle of the restart process, sorry about this!
+				</p>
+				<p>
+					Reloading the page now should hopefully get you back where
+					you were going! 🤞
+				</p>
+			</div>
+		</main>
+	</body>
+</html>