Opinions, Context & Ideas from the TPM Editors TPM Editor's Blog
The following is from the latest update I posted (yesterday) in The Hive ...
First, after a very successful launch of the new comment system, the system became sort of too successful, by which I mean that people started using it a lot faster and in larger numbers than we'd anticipated. There were also parts of the configuration of Discourse that weren't optimized for our system and put way too much strain on our servers. So, a bigger crowd than anticipated and server issues that made individual users put more pressure on the servers than they should have.
The site never went down but this is what led to the pervasive site slowness that was at its worst during peak hours of mid-day. That was also behind the random weirdnesses and glitches that started to crop up in The Hive and in public comments. So mid-week our tech team was triaging between diagnosing the causes of the problems, restructuring the server array that runs TPM and keeping the site online. And not all of those are easily done at the same time.
So yesterday morning I authorized taking public comments temporarily offline to make sure that the site itself was fast and available and to give the team a chance to get ahead of the curve on fixing the underlying problems. I've been on a business trip to DC since Wednesday morning, so when I authorized that I didn't realize that we would end up taking The Hive temporarily offline too. When the tech team looked at it, they found that severing comments from The Hive would have meant too much additional delay. If I'd been available when they had to make the decision I would have approved it.
Anyway, this is what you saw as a window of about 10 hours when all Discourse stuff was offline. Once we had things stabilized we temporarily brought everything back but only for logged-in users - whether they're Prime or just people with regular commenting accounts. That got things down to a manageable load while we continue working on the servers. We anticipate that the new setup will be completed over the weekend - at which point we'll put everything back online.
As many of you have noticed, a window of Discourse activity disappeared while this happened. None of that stuff is gone. It came down while we were swapping out databases in the middle of all this. We'll bring it back online once we get the server reconfiguring I mentioned above done, which should mean it'll be back mid-next week.