ticoombs

joined 1 year ago
[–] [email protected] 2 points 3 months ago (3 children)

We do have one! That is where the admin team chats.

See you there!

[–] [email protected] 2 points 3 months ago

Yep, the underlying bug was finally fixed in -rc.11.

Basically, our logo died every time the container restarted, as it was never properly saved.

I'll fix our logo backup soon.

[–] [email protected] 2 points 3 months ago (2 children)

Are you still getting the long load times? We have updated multiple times since then and I haven’t found anything obvious which would be causing issues. Cheers

[–] [email protected] 3 points 3 months ago (2 children)

Are you still getting the long load times? We have updated multiple times since then and I haven't found anything obvious which would be causing issues. Cheers

[–] [email protected] 1 points 3 months ago (3 children)

Are you still getting the issues? I haven't found a reason yet, but we have updated 4 times since then.

[–] [email protected] 1 points 3 months ago

Possibly fixed: I tweaked the caching mechanism to distinguish between HTML and JSON responses. At least that's what it should do!
If you manage to see it again, let me know.
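
For anyone curious, the general idea (a minimal sketch of the concept only, not our actual nginx config) is to include the response type in the cache key, so a cached JSON response can never be served where HTML is expected:

```python
# Toy illustration of a cache keyed by (path, content type).
# Names here are made up for the example; the real fix lives in our nginx config.
from typing import Callable, Dict, Tuple

cache: Dict[Tuple[str, str], bytes] = {}

def cache_key(path: str, accept_header: str) -> Tuple[str, str]:
    """Key on both the path and the negotiated representation."""
    kind = "json" if "application/json" in accept_header else "html"
    return (path, kind)

def get_or_fetch(path: str, accept_header: str, fetch: Callable[[str, str], bytes]) -> bytes:
    key = cache_key(path, accept_header)
    if key not in cache:
        cache[key] = fetch(path, accept_header)  # only hit the backend on a miss
    return cache[key]

# The same path is cached twice, once per representation.
fake_backend = lambda p, a: f"{a} body for {p}".encode()
print(get_or_fetch("/post/123", "text/html", fake_backend))
print(get_or_fetch("/post/123", "application/json", fake_backend))
```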

[–] [email protected] 3 points 3 months ago

Thanks for letting me know!

An interesting issue indeed! I'll have to check it out. I have a sneaking suspicion it might be due to the nginx caching... 🤔

[–] [email protected] 4 points 3 months ago (1 children)

I have. I've got a couple of accounts on some merch stores that will allow everyone to buy a few things, such as stickers, mugs, t-shirts, etc. But we need some good graphics. The Reddthat logo by itself only works for the stickers.

My ideas were:

  • Sticker: logo
  • Mug: logo + text & BIG text
  • Tshirt: big graphic on front, or lots of random text on the front with "I've readthat" on the back
  • Hoodie: small text on front + big logo/graphic on back

But I still need that big graphic design and some time to create all the other logos.

[–] [email protected] 4 points 3 months ago

You're not wrong.
I remember back when I deposited money into a random bank account and some internet person sent me internet money.

The internet really is an amazing place

[–] [email protected] 3 points 3 months ago (1 children)

Congratulations! 👏 Happy B-day, and here's to many more, to my across-the-river friends 🎉

Ps. The video works and is great!

 

Basically, I'm sick of these network problems, and I'm sure you are too. We'll be migrating everything (pictrs, frontends & backends, database & webservers) to a single server in OVH.

First it was a CPU issue, so we worked around that by ensuring pictrs was on another server, with just enough CPU to keep us all okay. Everything was fine until the spammers attacked. Then we couldn't process the activities fast enough, and now we can't catch up.

We were having constant network dropouts/lag spikes where all the network connections got "pooled", with CPU steal at 15%. So we bought more vCPU and threw resources at the problem. That temporarily fixed it, but our "NVMe" VPS, which housed our database and Lemmy applications, was still showing an IOWait of 10-20% half the time. Unbeknownst to me, it was not IO related but network related.

So we moved the database off to another server, but unfortunately that caused another issue (the unintended side effects of cheap hosting?). Now we had 1 main server accepting all network traffic, which then had to contact the NVMe DB server and the pict-rs server as well, then send all that information back to the users. This was part of the network problem.
Adding backend & frontend Lemmy containers to the pict-rs server helped alleviate this, and that is what you are seeing at the time of this post. Now a good 50% of the required database and web traffic is split across two servers, which keeps our servers from being completely saturated with requests.

On top of the recent nonsense, it looks like we are limited to 100Mb/s, which is roughly 12MB/s. So downloading a 20MB video via pictrs would go through the following flow (in this example):

  • User requests image via cloudflare
  • (it's not already cached, so we request it from our servers)
  • Cloudflare proxies the request to our server (app1).
  • Our app1 server connects to the pictrs server.
  • Our app1 server downloads the file from pictrs at a maximum of 100Mb/s,
  • At the same time, the app1 server is uploading the file via cloudflare to you at a maximum of 100Mb/s.
  • During this point in time our connection is completely saturated and no other network queries could be handled.

This is of course an example of the network issue I found out we had after moving to the multi-server system; it is not a problem when you have everything on one beefy server.
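
To put rough numbers on that flow (back-of-the-envelope only, assuming the full file crosses the link once coming in from pict-rs and once going out to Cloudflare):

```python
# Back-of-the-envelope maths for the example above; figures come from the post.
LINK_MBIT_PER_S = 100                     # advertised link speed
LINK_MBYTE_PER_S = LINK_MBIT_PER_S / 8    # ~12.5 MB/s
FILE_MB = 20                              # the example video

pull_from_pictrs = FILE_MB / LINK_MBYTE_PER_S    # app1 downloading from pict-rs
push_to_cloudflare = FILE_MB / LINK_MBYTE_PER_S  # app1 uploading to Cloudflare

print(f"pull: ~{pull_from_pictrs:.1f}s, push: ~{push_to_cloudflare:.1f}s")
# For those ~1.6 second windows the link is saturated and every other request queues behind it.
```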


Those are the broad strokes of the problems.

Thus we are completely ripping everything out and migrating to a HUGE OVH box. I say huge in capital letters because the OVH server is $108/m and has 8 vCPU, 32GB RAM, & 160GB of NVMe. This amount of RAM allows the whole database to fit into memory. If this doesn't help, then I'd be at a loss as to what will.
Currently (assuming we kept paying for the standalone postgres server) our monthly costs would have been around $90/m. ($60/m (main) + $9/m (pictrs) + $22/m (db))

Migration plan:

The biggest downtime will be the database migration: to ensure consistency we need to take the database offline, which is just simpler.

DB:

  • stop everything
  • start postgres
  • take a backup (20-25 mins)
  • send that backup to the new server (5-6 mins, limited to 12MB/s)
  • restore (10-15 mins)

pictrs

  • syncing the file store across to the new server

app(s)

  • regular deployment

This is the same process I recently did here, so I have the steps already cemented in my brain. As you can see, taking a backup ends up taking longer than restoring. That's because, after testing the restore process on our OVH box, we were nowhere near any IO/CPU limits and it was, to my amazement, seriously fast. Now we'll have heaps of room to grow, with a stable donation goal for the next 12 months.
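
For those interested in the mechanics, the DB step looks roughly like the sketch below. The hostnames, database name and paths are placeholders for this example, not our real ones:

```python
# Rough sketch of the database migration steps described above (placeholders throughout).
# Assumes an empty target database already exists on the new box.
import subprocess

DB_NAME = "lemmy"               # assumed database name
NEW_SERVER = "new-ovh-box"      # hypothetical SSH alias for the new server
DUMP_FILE = "/tmp/lemmy.dump"

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Everything except postgres is already stopped.
# 2. Take a compressed custom-format backup (~20-25 mins for us).
run(["pg_dump", "-Fc", "-f", DUMP_FILE, DB_NAME])

# 3. Ship it to the new server (~5-6 mins at ~12MB/s).
run(["rsync", "-av", DUMP_FILE, f"{NEW_SERVER}:{DUMP_FILE}"])

# 4. Restore on the new box with a few parallel jobs (~10-15 mins).
run(["ssh", NEW_SERVER, "pg_restore", "-d", DB_NAME, "-j", "4", DUMP_FILE])
```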

See you on the other side.

Tiff

17
Forcing federation (reddthat.com)
submitted 5 months ago* (last edited 5 months ago) by [email protected] to c/[email protected]
 

In the past 24-48 hours we've done a lot of work as you may have noticed, but you might be wondering why there are posts and comments with 0 upvotes.

Normally this would be because it is from a mastodon-like fediverse instance, but for now it's because of us and is on purpose.

Lemmy has an API where we can tell it to resolve content. This forces one of the most expensive parts of federation (for reddthat), the resolving of post metadata, to no longer be blocking. So when new posts get federated we can answer more quickly, saying "we already know about that, tell us something we don't know"!
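
For the curious, the call involved is roughly the sketch below. It assumes Lemmy's /api/v3/resolve_object endpoint and a bearer token; this is illustrative, not the actual script we run:

```python
# Pre-resolve a remote object so normal federation no longer blocks on fetching it.
# Endpoint and auth style are assumptions based on Lemmy's v3 API; the example URL is made up.
import requests

INSTANCE = "https://reddthat.com"
TOKEN = "..."  # an account JWT, elided

def resolve(remote_url: str) -> None:
    """Ask our instance to fetch and store a remote post/comment by its URL."""
    resp = requests.get(
        f"{INSTANCE}/api/v3/resolve_object",
        params={"q": remote_url},
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()

# Example (hypothetical URL): warm up a remote post before its activity arrives.
resolve("https://lemmy.world/post/123456")
```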

This makes the current activity graphs not tell the whole story anymore as we are going out-of-band. So while the graphs say we are behind, in reality we are closer than ever (for posts and comments).

Shout out to the amazing @[email protected] who thought up this crazy idea.

Tldr, new posts are coming in hot and fast, and we are up-to-date with LW now for both posts and comments!

 

There was an outage of about an hour while we migrated our database to its own server in the same data centre.

This server has faster CPUs than our current one, 3.5GHz v 3.0GHz, and if all goes well we will migrate the extra CPUs we bought to the DB server.

The outage was prolonged due to a PEBKAC issue: I mistyped an IP in our new postgres authentication config, and as such I kept getting a "permission error".

Migrating the DB to its own server was always going to happen, but it was pushed forward by the recent donations we got from Guest on our Ko-Fi page! Thank you again!

If you see any issues relating to ungodly load times please feel free to let us know!

Cheers,
Tiff

 

Our server rebooted and unfortunately I had not saved our new firewall rules. This blocked all Lemmy services from accessing our databases.

This has now been fixed up! Sorry to everyone for the outage!

 

Edited: this post is now the Lemmy.World federation issue post.

We are now ready to upgrade to postgres 16. When this post is 30 mins old, the maintenance will start.


Updated & pinned a comment in this thread explaining my complete investigation and ideas on how the Lemmy app will & could move forward.

43
submitted 6 months ago* (last edited 6 months ago) by [email protected] to c/[email protected]
 

Recently Beehaw has been hung out to dry: the Open Collective Foundation, which hosted their collective, is dissolving. Basically, the company that was holding onto all of Beehaw's money will no longer accept donations and will close. You can read more in their alarmingly titled post: EMERGENCY ANNOUNCEMENT: Open Collective Foundation is dissolving, and Beehaw needs your help
While this does not affect us, it certainly could have been us. I hope the Beehaw admins are doing okay and manage to get their money back. From what I've seen it should be easy to zero-out your collective, but "closure" and "no longer taking donations" make for a stressful time.

Again, Reddthat isn't affected by this, but it does point out that we are currently at the mercy of one company (person?) not winning the powerball or just running away with the donations.

Fees

Upon investigating our financials, it turns out we are actually getting stung on fees quite often. Forgive me if I go on a rant here; skip below for the donation links if you don't want to read about the ins and outs.

Open Collective uses Stripe, which takes 3% + $0.30 per transaction (occasionally 3% + $0.25, depending on the payment provider Stripe uses internally). That's pretty much an industry standard. What I didn't realise is that Open Collective also pre-fills the donation form with a tip, which defaults to 15.00%. So in reality, when I have set the donation to be $10, you end up paying $11.50. $1.50 goes to Open Collective, and then Stripe takes $0.48 in transaction fees. Then, when I get reimbursed for the server payments, Stripe takes another payment fee, because our Hosted Collective also has to pay a transfer fee. (Between you and me, I'm not sure why that is when we are both in Australia... and inter-bank transactions are free.)

So of that $11.50 out of your pocket, $1.50 goes to Open Collective and Stripe takes $0.48; then at the end of the month I lose $0.56 per expense! We have 11 donators and 3 expenses per month, which works out to be another $0.15 per donator. So at the end of the day, that $11.50 becomes $9.37. A total difference of $2.13 per donator per month.
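
For anyone who wants to check my maths, here is the same arithmetic spelled out, using only the figures from the paragraph above:

```python
# The fee arithmetic from the rant above; every figure comes from the post itself.
donation_set = 10.00      # what I set the donation amount to
oc_tip = 1.50             # Open Collective's pre-filled 15% "tip"
paid_by_donator = donation_set + oc_tip          # $11.50 out of pocket

stripe_fee = 0.48         # Stripe's cut on the incoming donation
payout_fee = 0.56         # fee per expense when I get reimbursed
expenses = 3              # server, extra RAM, object storage
donators = 11
payout_fee_per_donator = payout_fee * expenses / donators   # ~$0.15

received = paid_by_donator - oc_tip - stripe_fee - payout_fee_per_donator
print(f"${paid_by_donator:.2f} paid -> ${received:.2f} received")              # $11.50 -> ~$9.37
print(f"difference: ${paid_by_donator - received:.2f} per donator per month")  # ~$2.13
```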

As I was being completely transparent, I broke these down into 3 different transactions: the server, the extra RAM, and our object storage. Clearly I can save $1.12/m by bundling all the transactions, but that is not ideal. In the past 3 months we have paid $26.83 in payment transaction fees.

After learning this information, anyone who has recurring donations set up should check whether they would like to continue giving 15% to Open Collective, or pass that extra 15% straight to Reddthat instead!

So to help with this, we are going to start diversifying and promoting alternatives to Open Collective, as well as possibly publishing a dedicated page which I can keep updated, like /donate. But for the moment we will update our sidebar and main funding page.

Donation Links

From now on we will be using the following services.

I'm looking into Liberapay as well, but it looks like I will need to set up Stripe myself; if I can do that in a safe way (for me and you) then I'll add that too.

Next Steps

From now on I'll be bundling all expenses for the month into one "Expense" on Open Collective to minimise fees, as that is where most of the current funding is. I'll also do my best to do a quarterly budget report with expenditures and a sum of anything we have on Open Collective/Ko-Fi/Crypto/etc.

Thank you all for sticking around!

Tiff

 

Now that is a Gaming Router

 

Just got the server back up into a working state, still investigating why it decided to kill itself.

Unfortunately I've been dealing with $job all day today, with things randomly breaking like this. I'm guessing leap year shenanigans, but I have no way of knowing that yet.

Just wanted to let you all know it's back up!

103
submitted 7 months ago* (last edited 7 months ago) by [email protected] to c/[email protected]
 

Video link for those on clients who don't show links when they are videos: https://i.imgur.com/5jtvxPQ.mp4
