September 9, 2017 at 7:29 am #529
Hey guys, you might already know, and I know it’s quite early in the morning over there in the UK, but the stats and maybe the mining servers are down.
Getting a 503 server error on a lot of the JS as well as not being able to log into the platform.
Noticed it was down approximately 2.5 hours ago from posting this message.

September 9, 2017 at 8:02 am #530
Yep, have been getting:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<title>503 Service Unavailable</title>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
on the JS for a couple of hours now – and login not possible.

September 9, 2017 at 9:34 am #532
Thanks for getting in touch. Small server issue this morning, but all back up and running now.
Thanks again.

September 9, 2017 at 9:48 am #533
Confirmed we’re all back up and running again.

September 9, 2017 at 3:14 pm #535
Just to give some more details: the Apache log files grew too big and caused it to crash. My own fault, I should have seen it coming. Seems like a lot of the scaling problems we have had are related to Apache, so I’m hoping to move the web server across to nginx ASAP.

September 9, 2017 at 7:51 pm #537
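If oversized logs were the culprit, a logrotate rule along these lines would keep them bounded. A minimal sketch only: the paths, schedule, and the apachectl call are assumptions, not your actual setup.

```
# /etc/logrotate.d/apache2 (path is an assumption; adjust for your distro)
/var/log/apache2/*.log {
    # rotate daily and keep a week of compressed history
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    # ask Apache to reopen its log files after rotation
    postrotate
        /usr/sbin/apachectl graceful > /dev/null 2>&1 || true
    endscript
}
```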
nginx also has scaling problems. Keep an eye on those log files on both servers.

September 10, 2017 at 7:26 am #539
Hey Gemino, thanks for the heads up. Maybe it is better the devil you know. I had a chance to calm down a little and turned off the access log on Apache, which should help a bit. Tomorrow we are going to be moving the servers about and trying to distribute the load better. Still in two minds about whether to switch across to nginx or stick with Apache. Probably we are changing too much already, so we will stick with Apache for the time being.

September 10, 2017 at 9:07 am #547
You really do need to optimize any server to suit your needs; out of the box, every server requires heavy tuning to handle sustained traffic for long periods, especially heavy loads.
Good luck with this project.

September 10, 2017 at 12:23 pm #548
I suspect another AWS instance or two behind the load balancer will help.
Appears that you’re down again, or you’re rolling out the changes.

September 10, 2017 at 6:31 pm #553
Yep, we had an issue with the CPU this time: 6500 concurrent connections proving too much to handle. We are learning fast, but not fast enough. Tomorrow we will be rolling out the update, so the servers will be down around 1pm GMT. We should be able to separate the publisher demand from the platform then, which will be a big help, or at least we can break multiple machines at the same time.

September 10, 2017 at 8:32 pm #562
The servers still aren’t coping well, so we are going to move the update forwards to 9am GMT (4am CST).
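In case anyone wants to keep an eye on the connection count with us, here is a rough sketch (assuming netstat is available on the box; this is an illustration, not our exact tooling):

```shell
#!/bin/sh
# Rough count of established TCP connections on the box.
# netstat -tn lists numeric TCP sockets; grep -c counts the ESTABLISHED rows.
count=$(netstat -tn 2>/dev/null | grep -c ESTABLISHED)
echo "established connections: $count"
```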