The problem with your analysis is that it assumes multiple things are part of the same “system”.
There are really four pieces in this puzzle.
First, you have the multiplayer matchmaking. Say you, Alice and Bob want to play. Alice creates a game, and you and Bob try to join. Because Demigod is peer-to-peer (and not client-server), all three of you have to be able to connect to each other for it to work.
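To make the "everyone has to connect to everyone" point concrete, here is a minimal sketch in Python. It is not Demigod's code; `mesh_links`, `lobby_works` and the `can_connect` callback are made-up names for illustration only.

```python
from itertools import combinations

def mesh_links(players):
    """Return every pair of players that needs a working connection."""
    return list(combinations(players, 2))

def lobby_works(players, can_connect):
    """In a full peer-to-peer mesh, a single failed pair breaks the game.

    can_connect is a hypothetical callback: can_connect(a, b) -> bool.
    """
    return all(can_connect(a, b) for a, b in mesh_links(players))

players = ["You", "Alice", "Bob"]
links = mesh_links(players)  # (You, Alice), (You, Bob), (Alice, Bob)
```

With three players that is three links; with ten players it is already 45, and every one of them has to succeed.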
The originally shipped version of Demigod had a pretty elegant third-party NAT server system we licensed and adapted for our uses. It worked pretty well when only a few hundred people were online, but it quickly fell apart when there were tons of people. It also completely failed to work with ADSL and a few other types of network connection that are uncommon in the US but common internationally.
So over the past two weeks, the Impulse team was assigned to build something new from scratch. Before, everyone had to connect to everyone just to get into the lobby, which meant one failed NAT and nobody even got into a game. Now there is a new direct code mode, so anyone who port forwards and such should be able to get in rapidly. If that fails, it falls back to the third-party NAT stuff.
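The "try direct first, then fall back to NAT" order can be sketched roughly like this. It is only an illustration of the idea, not the Impulse implementation; `connect`, `try_direct` and `try_nat` are hypothetical names.

```python
def connect(peer, try_direct, try_nat):
    """Try the fast direct path first, then fall back to NAT facilitation.

    try_direct and try_nat are hypothetical callbacks that take a peer
    and return True if that method managed to reach it.
    """
    if try_direct(peer):
        return "direct"
    if try_nat(peer):
        return "nat"
    return None  # neither method worked for this peer
```

Someone who has forwarded their ports never touches the NAT path at all, which is why they should get in quickly.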
However, the beta has a bug. Even once you get into the lobby, if someone fails to connect to someone else, for whatever reason, they are disconnected from the NAT facilitator. It’s one of the side effects of working 108 hours (see the crunch time thread on Qt3). That will get addressed tomorrow.
Second, when Demigod shipped, even after players had connected to each other and been sent into the lobby, a new socket was created to reach each additional person, requiring yet another port. As a result, connecting took exponentially longer for each person you tried to put into the lobby (i.e. many minutes). Moreover, many routers and ISPs do not allow multiple ports for P2P, so it would keep more people from getting in.
However, over the last two weeks, the Impulse team developed an internal proxy socket system (I don’t really know what that means, but everyone tells me it’s insanely cool). So where Demigod might have used 20 ports before, it will now use only one, and it was done without requiring GPG to change a line of code.
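How the real proxy socket system works isn't described here, so take the following purely as a guess at the general technique: one common way to get many logical connections through a single port is to tag each packet with a small channel id and sort the packets out on arrival. Everything in this sketch (`ProxySocket`, the 2-byte header, `wrap`, `unwrap`) is an assumption made up for illustration.

```python
import struct

# Hypothetical framing: a 2-byte channel id in front of every payload.
HEADER = struct.Struct("!H")

def wrap(channel_id, payload):
    """Prefix a payload with the id of the logical connection it belongs to."""
    return HEADER.pack(channel_id) + payload

def unwrap(packet):
    """Split a packet back into (channel_id, payload)."""
    (channel_id,) = HEADER.unpack_from(packet)
    return channel_id, packet[HEADER.size:]

class ProxySocket:
    """Routes packets for many logical peers arriving on one shared port."""

    def __init__(self):
        self.inboxes = {}  # channel id -> list of received payloads

    def open_channel(self, channel_id):
        self.inboxes[channel_id] = []

    def receive(self, packet):
        """Deliver a packet to its channel; return False if the channel is unknown."""
        channel_id, payload = unwrap(packet)
        if channel_id not in self.inboxes:
            return False
        self.inboxes[channel_id].append(payload)
        return True
```

The appeal of a design like this is that the game can keep acting as if it had one socket per player, while the router only ever sees traffic on a single port.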
Third, once people are connected, it’s all on Demigod. So if one person quitting knocks everyone out, that is a completely unrelated issue that GPG is looking into. Frankly, I don’t understand it.
However, anyone who experiences that should email me (bwardell@stardock.com) their Demigod log and their ImpulseReactor log, which are located in their my documents\my games\gpg\demigod\ directory, and I’ll forward the info to the right people.
And last, you then have the stats posting and all the usual cheese stuff that has to be dealt with at the same time. Favor points, disconnects, etc.
Now, as an outsider who’s on the inside (so to speak), I’m very frustrated with what has happened. Ultimately, as the CEO of the publisher, it’s my fault. I understand why the US multiplayer launch was so problematic, and I’ve thought of plenty of things that would have reduced the problems (an open MP beta would have been very useful).
But I can say, having looked at it pretty closely, that the new connectivity system is pretty robust as an architecture, and the in-game disconnect issue is unrelated and probably fairly easy to address (GPG is aware of it).