Google reveals details about its datacenters
- The key takeaway for algorithms research seems to be that "[w]e don’t know how to build big networks that deliver lots of bandwidth". This is exactly what S. Borkar argued in his IPDPS'13 keynote [1]. An exa-scale cluster can't be cost-efficient unless the bisection bandwidth is highly sublinear in the cluster's computing power. - We need new algorithms that - require communication volume and latency significantly sublinear in the local input size (ideally polylogarithmic) - don't depend on randomly distributed input data (most older work does) - It's really too bad that many in the theoretical computer science community think that distributed algorithms were solved in the 90s. They weren't. - [1] http://www.ipdps.org/ipdps2013/SBorkar_IPDPS_May_2013.pdf
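To put rough numbers on the bisection-bandwidth point, here is a minimal sketch that compares textbook bisection widths for two idealized topologies. The choice of topologies (a square 2D mesh and a hypercube) and the function names are my own assumptions for illustration; nothing here comes from the keynote or from Google's paper.

```python
import math

def mesh2d_bisection_links(n_nodes: int) -> int:
    """Bisection width of a k x k mesh (k even): halving it severs k links."""
    k = math.isqrt(n_nodes)
    if k * k != n_nodes or k % 2 != 0:
        raise ValueError("n_nodes must be the square of an even number")
    return k

def hypercube_bisection_links(n_nodes: int) -> int:
    """Bisection width of a hypercube with 2**d nodes: n/2 links cross the cut."""
    if n_nodes < 2 or n_nodes & (n_nodes - 1):
        raise ValueError("n_nodes must be a power of two, at least 2")
    return n_nodes // 2

if __name__ == "__main__":
    # Illustrative sizes only; each is both an even square and a power of two.
    for n in (64, 1024, 16384):
        mesh = mesh2d_bisection_links(n)
        cube = hypercube_bisection_links(n)
        print(f"N={n:6d}  mesh: {mesh:5d} links ({mesh / n:.4f} per node)  "
              f"hypercube: {cube:5d} links ({cube / n:.4f} per node)")
```

The hypercube keeps a constant half link per node across the cut, which is the linear scaling that gets expensive; the mesh's per-node share shrinks as 1/sqrt(N), which is the kind of sublinear budget the algorithms would have to live within.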
- One of the biggest things I've had to unlearn as an SRE leaving Google is the assumption that RPC traffic is free, fast, and reliable (so long as you don't go cross-datacenter). For most companies it is expensive and slow. Facebook's networks are still designed for early-2000s-era topologies, and their newer topologies won't fix that; they've still got way too little top-of-rack bandwidth to the other racks nearby. - Microsoft hasn't even caught on yet, and is still designing for bigger and bigger monolithic servers. I can't tell what Amazon is doing, but they seem to have the idea with ELBs at multiple layers.
- I once saw the co-founder of Cloudera say that Google exists in a time warp 5-10 years in the future, and every now and then it gives the rest of us a glimpse of what the future looks like. - It felt exaggerated at the time, but it often seems like the truth.
- Previously: https://news.ycombinator.com/item?id=9977414 
- So every cluster machine has 40 Gbit Ethernet (?) - does anyone else do that? - Looking at Table 2 http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183....
- Ironically, if you look at the data center as a computer, this looks very much like scaling up, not scaling out. - I wonder if one day we will find that sending all data to a data center for processing doesn't scale. I think that's already a given for some realtime'ish types of applications and it could become more important. - Obviously, the success of decentralised computing depends a lot on the kinds of connected devices and whether or not data makes sense without combining it with data from other devices and users. - With small mobile devices you always have battery issues. With cars, factory equipment or buildings, not so much. But management issues could still make everyone prefer centralisation. 
- > The amount of bandwidth that needs to be delivered to Google’s servers is outpacing Moore’s Law. - Which means, roughly, that compute and storage continue to track with Moore's Law but bandwidth doesn't. I keep wondering if this isn't some sort of universal limitation on this reality that will force high decentralization. 
- > The I/O gap is huge. Amin says it has to get solved, if it doesn’t then we’ll stop innovating. - I can imagine the throughput problem being solved with relative ease, but the speed of light limits latency at a fundamental level, so proximity will always win there. - I tend to think that storage speed/density tech, rather than networking, is where the true innovations will eventually need to happen for datacenters. You can treat a datacenter as a computer, but you can't ignore the fact that light takes longer to travel from one end of a DC to the other than from one end of a microchip to the other.
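For a sense of scale on the speed-of-light argument, here is a back-of-the-envelope sketch. The two distances (a 300 m datacenter hall and a 2 cm chip) are assumptions picked purely for illustration, and the figures are hard lower bounds: real signals in fiber or copper are slower than light in vacuum, and switching adds more on top.

```python
# Lower bound on one-way signal delay: nothing travels faster than light in vacuum.
C_M_PER_S = 299_792_458  # speed of light in vacuum (exact, by SI definition)

def min_one_way_delay_ns(distance_m: float) -> float:
    """Return the physical lower bound on one-way delay, in nanoseconds."""
    return distance_m / C_M_PER_S * 1e9

# Hypothetical distances, chosen only for illustration.
DATACENTER_SPAN_M = 300.0  # assumed end-to-end length of a datacenter hall
CHIP_SPAN_M = 0.02         # assumed edge length of a large chip (2 cm)

if __name__ == "__main__":
    dc = min_one_way_delay_ns(DATACENTER_SPAN_M)
    chip = min_one_way_delay_ns(CHIP_SPAN_M)
    print(f"datacenter: >= {dc:.1f} ns one way")
    print(f"chip:       >= {chip:.4f} ns one way")
    print(f"ratio:      {dc / chip:.0f}x")
```

Under these assumed distances the floor is about a microsecond across the hall versus well under a nanosecond across the chip. The ratio is just the ratio of the distances, so no amount of extra bandwidth changes it.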
- > The end of Moore’s Law means how programs are built is changing. - I feel like this is just not true. Yes, we have slowed down significantly over the past couple of years. But I do believe that once something big happens (my bet is the replacement of silicon in chips, or much stronger batteries), Moore's Law will pick up right where it left off.
- Shameless plug, but if you'd like to build a network like Google's, at any scale, let us know, we'd love to help: https://cumulusnetworks.com/ - nolan, cofounder and CTO, Cumulus Networks