Today’s bot outages (the techy post)

Some bots went offline today, which led to web account outages. The cause was one of our bot servers, which died because of a hard drive failure. The affected bots were quickly moved to a nearby server… which then crashed under the extra load, too. This was unexpected.

The good news is that while our server team was trying to revive these machines, the development team built a system that allows SmartBots to quickly detach (or, say, temporarily forget) any set of broken bots. This will also help during SL rolling restarts, when large random sets of bots go offline.
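
For illustration, the detach idea can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the actual SmartBots code: the `BotRegistry` class, its `detach` and `reattach` methods, and the bot names are all hypothetical.

```python
class BotRegistry:
    """Hypothetical registry of bots; not the real SmartBots implementation."""

    def __init__(self, bot_ids):
        self.active = set(bot_ids)   # bots the system currently tracks
        self.detached = set()        # bots that are temporarily forgotten

    def detach(self, broken_ids):
        """Temporarily forget the given bots; return the ones actually detached."""
        moved = self.active & set(broken_ids)  # unknown IDs are ignored
        self.active -= moved
        self.detached |= moved
        return moved

    def reattach(self, recovered_ids):
        """Bring previously detached bots back; return the ones reattached."""
        moved = self.detached & set(recovered_ids)
        self.detached -= moved
        self.active |= moved
        return moved


registry = BotRegistry(["bot-a", "bot-b", "bot-c"])
registry.detach(["bot-b", "bot-c"])   # e.g. their server just died
registry.reattach(["bot-b"])          # bot-b is back on a healthy server
```

Keeping detached bots in their own set, rather than deleting them, is what would make the "temporarily forget" part cheap to undo once a server recovers or a rolling restart finishes.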

We are sorry if your bots were offline today. Everything is running smoothly now!

P.S. We are going to completely replace the second, unreliable server tomorrow. All bots are expected to stay online during the migration.

The issue with some personal bots is fixed

We were updating one of the bot servers (the 2nd of the 5 we have) and ran into a minor issue that turned into a real problem: an ISP network outage during the update caused the server to lock up.

Personal bots living on that server were affected, and it took a few hours to relaunch them (we maintain a live backup of all running bots, but this system update was related to the backup itself).

We are deeply sorry for that! We are adding 3 days to all affected bots and hope this will compensate for the inconvenience.