There are some really interesting neural network tricks involved in training, where you train the system with all the neurons, then selectively shut off some neurons and force the network to work without them, rotating which ones are off. This is called dropout, and it's used to avoid "overfitting" your solution to the problem: sometimes you can come up with a very precise fit for the training data, but given new data the solution breaks because it was, in effect, too precise.
This article says it better than I can:
Dropout is a technique where randomly selected neurons are ignored during training. They are "dropped out" randomly. This means that their contribution to the activation of downstream neurons is temporarily removed on the forward pass and any weight updates are not applied to the neuron on the backward pass.
As a neural network learns, neuron weights settle into their context within the network. Weights of neurons are tuned for specific features, providing some specialization. Neighboring neurons come to rely on this specialization, which if taken too far can result in a fragile model too specialized to the training data. This reliance on context for a neuron during training is referred to as complex co-adaptation.
You can imagine that if neurons are randomly dropped out of the network during training, other neurons will have to step in and handle the representation required to make predictions for the missing neurons. This is believed to result in multiple independent internal representations being learned by the network.
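To make that concrete, here's a rough sketch of what (inverted) dropout looks like on one layer's activations, in plain NumPy. The layer size, batch size, and 50% drop rate are placeholders I picked for illustration, not anything from the article:

    import numpy as np

    rng = np.random.default_rng(0)

    def dropout_forward(activations, drop_prob=0.5, training=True):
        # During training, zero out each neuron's activation with
        # probability drop_prob, and scale the survivors by
        # 1 / (1 - drop_prob) so the expected activation stays the same.
        # At test time, do nothing: every neuron participates.
        if not training or drop_prob == 0.0:
            return activations
        keep_prob = 1.0 - drop_prob
        mask = rng.random(activations.shape) < keep_prob  # which neurons survive this pass
        return activations * mask / keep_prob

    # A batch of 4 samples through a hypothetical 10-unit hidden layer.
    hidden = rng.standard_normal((4, 10))
    print(dropout_forward(hidden, drop_prob=0.5, training=True))  # roughly half the entries zeroed
    print(dropout_forward(hidden, training=False))                # unchanged at test time

A different random mask is drawn on every forward pass, which is what forces the other neurons to "step in" the way the article describes.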
I do wonder if we as humans tend to overfit way too much when we solve problems. I do a bit of machine learning (though I'm no expert, and would not insult one by saying I could hold a candle to them), and I often need to remind my boss/peers that while the hypothesis we proved looks good with THIS data set, we need to look at many data sets and make sure we didn't just get lucky.
AlphaGo has probably looked at far more data than all of humanity to train its best practices, without the premature optimization traps we fall into.