Google reveals details about its datacenters

  • The key takeaway for algorithms research seems to be that "[w]e don’t know how to build big networks that deliver lots of bandwidth". This is exactly what S. Borkar argued in his IPDPS'13 keynote [1]. An exa-scale cluster can't be cost-efficient unless its bisection bandwidth grows highly sublinearly with the cluster's computing power.

    We need new algorithms that:

    - require communication volume and latency significantly sublinear in the local input size (ideally polylogarithmic)
    - don't depend on randomly distributed input data (most older work does)

    It's really too bad that many in the theoretical computer science community think that distributed algorithms were solved in the 90s. They weren't.

    [1] http://www.ipdps.org/ipdps2013/SBorkar_IPDPS_May_2013.pdf
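    A toy sketch of the polylogarithmic communication pattern being asked for (the function and setup are illustrative, not from the keynote): a binary-tree reduction in which each simulated node sends O(1) words per round, and the number of rounds grows with log p rather than with the local input size.

```python
def tree_reduce(values):
    """Simulate a binary-tree reduction across 'nodes' (one partial sum each).

    Each round halves the number of active nodes, so the total latency is
    O(log p) rounds and each node sends O(1) words per round -- communication
    cost independent of how much data each node aggregated locally.
    """
    rounds = 0
    while len(values) > 1:
        # Pair up neighbors; each pair exchanges one word and combines.
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:          # odd node out advances unchanged
            paired.append(values[-1])
        values = paired
        rounds += 1
    return values[0], rounds

# 1024 nodes, each contributing a locally pre-aggregated partial sum of 1
total, rounds = tree_reduce([1] * 1024)
print(total, rounds)  # 1024 10
```

    The same shape (local aggregation, then a logarithmic-depth combine tree) is what keeps communication sublinear in the input; an all-to-all exchange would instead scale with the data volume itself.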

  • One of the biggest things I've had to unlearn as an SRE leaving Google is this: RPC traffic is not free, fast, and reliable. Inside Google it is (so long as you don't go cross-datacenter); for most companies it is expensive and slow. Facebook's networks are still designed for early-2000s-era topologies, and their newer topologies won't fix that; they've still got way too little top-of-rack bandwidth to the other racks nearby.

    Microsoft hasn't even caught on yet, and is still designing for bigger and bigger monolithic servers. I can't tell what Amazon is doing, but they seem to get the idea with ELBs at multiple layers.

  • I once heard the co-founder of Cloudera say that Google exists in a time warp 5-10 years in the future, and every now and then it gives the rest of us a glimpse of what the future looks like.

    It felt exaggerated at the time, but it often seems to be the truth.

  • So every cluster machine has 40 Gbit Ethernet (?) - does anyone else do that?

    Looking at Table 2 http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183....

  • Ironically, if you look at the data center as a computer, this looks very much like scaling up, not scaling out.

    I wonder if one day we will find that sending all data to a data center for processing doesn't scale. I think that's already a given for some realtime-ish types of applications, and it could become more important.

    Obviously, the success of decentralised computing depends a lot on the kinds of connected devices and whether or not data makes sense without combining it with data from other devices and users.

    With small mobile devices you always have battery issues. With cars, factory equipment or buildings, not so much. But management issues could still make everyone prefer centralisation.

  • > The amount of bandwidth that needs to be delivered to Google’s servers is outpacing Moore’s Law.

    Which means, roughly, that compute and storage continue to track with Moore's Law but bandwidth doesn't. I keep wondering if this isn't some sort of universal limitation on this reality that will force high decentralization.
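    The compounding effect is easy to see with a quick sketch (the doubling periods here are assumptions for illustration, not figures from the article): if compute doubles every 24 months but bandwidth demand doubles every 12, the gap itself keeps doubling.

```python
def growth(doubling_months, years):
    """Growth factor after 'years', given a fixed doubling period in months."""
    return 2 ** (years * 12 / doubling_months)

# Illustrative rates only: compute on a 24-month doubling,
# bandwidth demand on a 12-month doubling, over 6 years.
years = 6
compute = growth(24, years)  # 2^3 = 8x
demand = growth(12, years)   # 2^6 = 64x
print(compute, demand, demand / compute)  # 8.0 64.0 8.0
```

    Under those assumed rates, the shortfall grows by the same 8x factor as compute itself, which is the kind of divergence that forces either new network designs or decentralization.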

  • > The I/O gap is huge. Amin says it has to get solved, if it doesn’t then we’ll stop innovating.

    I can imagine you can solve the throughput problem with relative ease, but the speed of light limits latency at a fundamental level, so proximity will always win there.

    I tend to think that storage speed/density tech rather than networking is where the true innovations will eventually need to happen for datacenters. You can treat a datacenter as a computer, but you can't ignore the fact that light takes longer to travel from one end of a DC to another than it would from one end of a microchip.
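    The scale gap is easy to quantify with a back-of-the-envelope calculation (assuming a ~2/3 c signal velocity as in typical fiber, a 300 m datacenter span, and a 1 cm chip; the exact distances are illustrative):

```python
C = 299_792_458  # speed of light in vacuum, m/s

def one_way_ns(distance_m, velocity=C * 2 / 3):
    """One-way, straight-line propagation delay in nanoseconds at ~2/3 c."""
    return distance_m / velocity * 1e9

print(f"across a 1 cm chip: {one_way_ns(0.01):.4f} ns")
print(f"across a 300 m DC:  {one_way_ns(300):.1f} ns")
```

    Roughly 0.05 ns across the chip versus roughly 1.5 µs across the datacenter: four to five orders of magnitude, before counting switching and serialization delays, which is why no networking advance can make the DC-as-a-computer latency picture look like a chip's.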

  • > The end of Moore’s Law means how programs are built is changing.

    I feel like this is just not true. Yes, we have slowed down significantly over the past couple of years, but I believe that once something big happens (my bet is the replacement of silicon in chips, or much stronger batteries), Moore's Law will pick up right where it left off.

  • Shameless plug, but if you'd like to build a network like Google's, at any scale, let us know; we'd love to help: https://cumulusnetworks.com/

    - Nolan, cofounder/CTO, Cumulus Networks