UPDATE 07/25 10:00AM:
Support is scheduling a window for their maintenance. I've asked for late afternoon/early evening today, with a couple of hours' advance notice so I can post an outage notice.
===========
UPDATE 12:00AM:
Diagnostics did in fact come back with a CPU fault. I've asked them to schedule the downtime with me, but technically they can proceed whenever they want, so there's a good chance of an hour or so of downtime whenever they get to my server. I'll post some advance notice if I'm able to.
===========
As I mentioned in the previous post, we appear to have a hardware fault on the server running Lemmy.tf. My provider needs full hardware diagnostics before they can take any action, which requires the machine to be powered down and rebooted into diagnostics mode. This should be fairly quick (~15-20 minutes ideally), and since it is required to determine the issue, it needs to be done ASAP.
I will be taking everything down at 11:00PM EST tonight to run diagnostics and will reboot into normal mode as soon as I've got a support pack. If the diagnostics pinpoint a hardware fault, follow-up maintenance will need to be scheduled immediately, ideally overnight, but the exact time is up to their engineers.
I'm also prioritizing prep work to get the instance migrated over to a better server. This has been in the works for a few weeks, but first I'll need to migrate the DB over to a new Postgres cluster and kick frontend traffic through a load balancer to prevent outages from DNS propagation whenever I finally cut over to the new server. I'd also like to get Pict-rs moved up to S3, but this will likely be a separate change down the road.
I was asleep, love it when this stuff happens overnight