
Inside ArenaNet: Live Game Outage Analysis



Thanks for sharing Robert!!  I love this look behind the scenes.  I work in IT just down the road from you, in Bothell, and we've had our share of those Mondays.  Great teams make all the difference!  Maybe I'll shoot my resume your way.  😉

Edited by dippy.8961
  • Like 1

tl;dr: Karka invaded storage and held the drivers hostage. With their massive evade frames, they managed to evade the server teams for several hours. Server team now has alerts in place to detail if any karka find their way close to storage again.

  • Haha 8

As a fellow AWS developer, reading about buckets, instances and microservices was a fun read. I've fought similar incidents, so I hope you don't have 1,700,000 objects in the same bucket "folder" like we do; it's hell pulling data from that one 🤣. Also, kudos for high-availability databases with one always on standby. It really can save your kitten unless something critical happens (like it did in the rollback).

 

I am looking forward to more cool blog posts.

  • Like 3

  • ArenaNet Staff

 Hi All,

 

Many thanks for the feedback about this post and communication in general! I really enjoyed preparing this content - I hope my passion for this game and community showed through! Thanks for reading!

 

21 minutes ago, Razor.9872 said:

tl;dr: Karka invaded storage and held the drivers hostage. With their massive evade frames, they managed to evade the server teams for several hours. Server team now has alerts in place to detail if any karka find their way close to storage again.

 

Also there was a Skritt insider. I've made some new graphs to watch for it 😄

 

-R

  • Like 22
  • Thanks 1
  • Haha 4

I'm glad most seem to have enjoyed the post. I'll be honest, I didn't really care about it. It reminded me of that rollback and all the frustration when some of my friends got a shiny mount for doing nothing, not losing anything, while I didn't get it.

 

I'm not an IT guy and I couldn't finish the post. I'm happy for those who were interested, but I hope it's not the kind of thing Anet is talking about when they want to "communicate more with their playerbase".

 

I am expecting more concrete, shorter, straight to the point posts about the future of the game, your vision and the direction we're heading in. More answers to our feedback, etc.

 

Tl;dr: I'm ok with this kind of post, but I hope it's not the only type we'll get for the "more and better communication".

  • Confused 5
  • Sad 2

Very interesting read, thanks. Great to hear communication about behind-the-scenes stuff; it helps us remember the devs are human too.

Also, can't help but notice the irony of getting a code 7:11:3:202:101 error while reading the blog post on my second monitor...

  • Like 2

I don't think I have ever logged in to post something, but I felt the need to do so just to tell you that I loved reading this! I often browse r/talesfromtechsupport and this was just extra fun to read since I lived through the blip! I had also never thought about how truly impressive the uptime is on GW2. Great job with the infrastructure!

Thank you so much for posting this, Robert.

Edited by Lanaos.9672
  • Like 1

What an insight! The way you simplified the problem was really easy to read and understand. As someone who has recently been digging into IT stuff for work, this was a really nice story from someone experienced (the line "with that came a difference in how the servers interact with connected devices" reminds me heavily of a similar problem we currently face in our web development). Every time I have login problems now, I imagine Robert winking at his monitor, sipping coffee, whispering "Yes, it's me 😏". And I will smile.

 

If I could wish for a future IT-related topic, I'd be interested in the megaserver functionality. You had a big blog post back then, "Introducing the Megaserver System", in which Colin and Samuel explained how the system would work for us players, but I am now curious what challenges you faced during the switch from the old system to the new one and how the megaserver system itself has changed over time. An example question that sometimes occurs to us during the game is: "Why does the system think this map should close? We are in the middle of a meta event?"

Edited by JotGeh.8047
  • Like 2

That was a great blog post, detailed and all. Yes, I read it all. It was long (and the text size was too small), but I enjoy behind-the-scenes posts, especially when they involve something I follow.

I remember that time, when many were questioning Arena Net's effort and time put into the game; I told them this couldn't be a minor issue. I once had an SOG gaming mouse that went crazy when I installed the official driver from the site. I thought it was W10's fault, so I tested it on a W7 PC and it stayed the same. After two days of searching the web for info and testing, I found nothing. Then I guessed that the Windows peripheral "Neutral Mouse (Hidden)" was responsible, so I turned on Mouse Keys, using only the keyboard, in W7 and ta-da: the mouse worked fine. I went back to W10 to check again with Mouse Keys, and yes, it was Windows 10 and Windows 7 interfering with the driver 🤣😅 So a driver can sometimes produce faulty output, and you can't know it because there isn't any way of finding out.

Even so, I have a question that keeps coming to mind each time I go to a meta map: why does the map instance contribution change occur? And why is it forced to change to that map instance? Funny story, really: I was doing map completion with a necro character, and a map change pop-up appeared at the same moment I viewed a vista, so the pop-up disappeared. I continued map completion but felt that the map had gone empty, so I knew there had been a map change. Five minutes later a loading screen appeared, and I landed exactly on top of a champion with no minions at all (nice death). So why the forced map change if it only says "Contribution"?

Thanks for all the events and content, Arena Net. Keep up the good work and stay safe 😄

Edited by Achraf Hidouri.8029

Gotta say this was pretty neat. Even if I don't understand all the points, I still enjoy when the gaming companies divulge a little of these "behind the scenes" type details to get a better understanding of how they work and problem solve. Probably helps to remind the rest of us that it's not all magic fairies working in the background of the game to keep it functioning since that denies the hard work that's actually put into it all.

  • Like 3

Thanks for sharing this. I don't think I understood it all but it was still interesting to get an insight into what goes on behind the scenes. I've sometimes said I don't think Anet does enough to draw attention to the fact that the game doesn't normally need to go offline for maintenance and rarely goes offline at all.

 

After I quit Ultima Online in the early 2000s, the only MMOs I played for years were GW1 and GW2, and it came as a bit of a shock after that to find out that shutting the game down for maintenance wasn't a thing of the past and was still not only accepted but expected by fans of other games. I've had arguments with people who insisted that if GW2 never goes offline for maintenance, that can only mean it has no maintenance of any kind, ever, because there's no other way to do it, so it's nice to have something official explaining the situation.

 

(Also I remember that day and how weird it was to realise the problem wasn't with my PC or internet connection and GW2 really was offline.)

  • Like 2

As someone who is an operations engineer at a big webhost, I can feel the pain. 

I not only understand it, but feel it - I've been there when databases go out of sync, and I know the effort it can take to fix them.

And how obscure the reason can be... often, just where you won't ever look for it.

 

Amazing job, very fun to read (I've shared it with my entire team), keep it up!

Absolutely loving the communication! 😄

  • Like 1

I'm all for more posts like this one.  It doesn't matter what facet of the game is focused on - the more the better.  And it doesn't matter what time period is covered; I'm up for hearing things from the early days of GW2 or even from the original GW.  In general, I don't think you can have too much communication.  Thanks for all of your hard work and effort!  You all have put together an amazing game.

  • Like 3

This topic is now closed to further replies.