Google Cloud NEXT '17 - News and Updates

Cloud networking solutions that support hybrid cloud workloads (Google Cloud Next ’17)

(Video Transcript)
JOHN VEIZADES: So my name is John Veizades. I'm a product manager at Google, and I work a lot on how enterprise customers connect to our cloud infrastructure. And today's talk is going to be exactly about that. What we're going to be covering is a little bit about– and you're going to hear this throughout the whole week– I'm going to talk a little bit about the Google infrastructure, what we– how you connect to us today and then talk a little bit about how you connect to us tomorrow or soon. Talk a little bit about architectures, deployment architectures and how you can use our cloud networking to build enterprise networks that extend out to the cloud. Something about how we monitor these systems, what we do around monitoring. Some discussion about how you can get better application performance when you come to the cloud. And finally, a little bit about the pricing of all this stuff. So infrastructure, and if you go to about half of the sessions in this conference, you're going to see a picture something like this.

And what this is trying to point out is where are all the Cloud data centers? When you build an enterprise network, you have Cloud data centers as well. Or I should say you have data centers. This is saying where Google's data centers are. And the ones in green are showing the ones that are available today and all the ones in blue are data centers that are in the process of being built out. The numbers tell you how many zones each one of those data centers has. And as you can see, what we're trying to do is cover the globe with data centers that are close to customers. Because millisecond closeness means application performance. So it's really important to get that level of connectivity. The other thing that we have, and if I take away all the Cloud data centers, what pops out is all these little lines, this is Google's backbone. This is what we have been spending years and years on building. Predominantly it was to service the applications that Google is known for– search and YouTube and Gmail and all those.

What we're doing with our Cloud networking infrastructure is giving you access to all of this. Now you don't see it because it's hidden under the covers, but your application data runs over this infrastructure. And it's an infrastructure that Google continues to build on. So, this is what it looks like today. If you see this little– there's this thing that says 2018, that's a fiber cable that we're dropping from somewhere in Portland, Oregon area to somewhere in Japan. And we're going to drop in a little over 100 terabytes of infrastructure there to give you more capacity to– originally it was to give Google more capacity but in short, it's also to give our customers more capacity as you grow. So, we're expanding– in summary, Google Cloud is an expanding platform. It's expanding to cover– give you a global footprint, a footprint that's close to your customers, that gives you high availability, local availability, the ability to take VMs and get them closer to your customers and continue to drive the scale that you guys are driving us to build.

So today, there are a number of ways to connect to this infrastructure. The first one is something we call direct peering. This is where you have a private ASN– sorry, a public ASN and you're connecting to our infrastructure at one of our points of presence. I'll go into where our points of presence are in just a second. This is public peering. So when you do that, you get all of Google's routing table and you expose your routing table to us. Another way to do that is through a carrier interconnect. And that's where you go through carriers that you know and love and connect to us through their infrastructure. And finally, if you can't do either one of these two or you have a place where you don't have a lot of capacity, you don't need a lot of capacity, you can connect to us using a VPN tunnel. So I'll go into some detail about that as well. So our Cloud interconnects. First of all, when you go through those direct peering links, you do get a dedicated link.

That means you have– how do you say it– I wouldn't say enhanced security but you're just going over a cross-connect in a colocation facility. So it's in a protected– physically protected environment. And from a performance standpoint, you're getting carrier level SLAs, you're getting locations in 70 locations and 33 countries, so it's a place where you can get closer to us and get better performance out of it. And finally, a lot of customers use these methods for the last reason which is there's lower egress costs when you come and meet us versus when we have to carry your traffic to you. And finally with those systems, you get access to the whole set of Google properties. So if you're an enterprise customer who's rolling out G Suite and you want to have a close connection that doesn't go over the internet, these are great solutions for you. Unfortunately, a lot of the customers, what you really need is a pri– extend your private network into Cloud.

And today the way you do that– sorry– is with our IPsec VPN solution. So our IPsec VPN solution is a virtualised solution that runs in the Cloud. It's a scale out VPN so each tunnel starts at about three gigabits per second but you can scale that out based on deploying more infrastructure on top of that. And what it does is it lets you take your RFC1918 space, your private address space and extend it to Cloud. And we do that with a couple entities here. There's a cloud router and a VPN gateway. VPN gateway [INAUDIBLE] terminates your IPsec sessions and the Cloud router gives you the BGP reachability of everything in your Cloud. I'm going to make one comment on Cloud Router because I'm going to use this throughout the rest of the presentation. I want you guys to understand what does the Cloud Router do? Yes, it is a BGP endpoint but that's all it is. It is the control plane of our routing infrastructure. So when you terminate a BGP session on the Cloud Router, what it does is it programs the routes into our SDN infrastructure.

If you– I don't know if you've ever read about something called Andromeda. Andromeda is Google's SDN network architecture. There are some great presentations on it. I'm not going to cover that here. There is actually some presentations online that go into a lot of detail there. But Cloud Router is not the typical Juniper or Cisco router that all the packets go through it. It just programs the– it's the control plane that programs the data plane. So packets don't go through that so it doesn't– we don't need a scale out router. It's a scale out network. So I just wanted to cover that before I use it in other contexts and you guys understand what does the Cloud Router do. When you scale out VPN too, if you need more than three gigabits per second, you can use multiple tunnels and use ECMP across those tunnels to get greater capacity. And we have customers doing that today to drive tens of gigs of traffic into their Cloud infrastructure. But that seems a little bit yesterday.
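The scale-out idea above (several tunnels of roughly three gigabits per second each, with ECMP spreading traffic across them) can be sketched as follows. This is a minimal illustration, not how Google's VPN gateway is implemented: the function names are hypothetical, the per-tunnel figure is the approximate one quoted in the talk, and the hash is just one common way to keep a flow pinned to a single path.

```python
import hashlib
import math

# Approximate per-tunnel capacity quoted in the talk (an assumption for sizing).
TUNNEL_GBPS = 3.0


def tunnels_needed(target_gbps: float, per_tunnel_gbps: float = TUNNEL_GBPS) -> int:
    """Number of parallel tunnels required to reach the target capacity."""
    return math.ceil(target_gbps / per_tunnel_gbps)


def pick_tunnel(flow: tuple, tunnel_count: int) -> int:
    """ECMP-style choice: hash the flow's 5-tuple so that every packet of
    one flow uses the same tunnel, while different flows spread out."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % tunnel_count


# Hypothetical flow: (src IP, dst IP, protocol, src port, dst port).
flow = ("10.0.0.5", "10.128.0.9", "tcp", 40312, 443)
count = tunnels_needed(10)          # tunnels to carry about 10 Gbps
index = pick_tunnel(flow, count)    # stable for this flow
```

Because the choice is per flow, a single large TCP connection still tops out at one tunnel's capacity; the aggregate gain comes from having many flows.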

So what we're doing today is announcing some products that are coming out later on this year, actually later on this quarter, that will allow you to extend that using more traditional networking architectures other than VPN. So specifically, we're announcing two products. One is called Private Interconnect. Another is called Partner Interconnect. Both of these techniques allow you to extend your RFC1918 address space from your premises to the Cloud. With the Private Interconnect, this is where you meet us in one of our colocation facilities and you extend a cross-connect from your routed infrastructure to our edge. And Partner Interconnect is where you go using a carrier. And they can offer you a sub rate interface or actually a full 10 gig interface to our Cloud infrastructure. So, Private Interconnect, this is where an enterprise customer comes to Google. And Partner Interconnect is where an enterprise customer goes to a carrier and then comes to Google. Again, the goal here is to extend your network to our network using your private address space into Google's– in your private address space in the Google Cloud.
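For readers less familiar with the term, RFC 1918 space is the set of three private IPv4 ranges. A minimal sketch, using Python's standard `ipaddress` module, of checking whether a prefix you plan to extend to the cloud sits inside that space. The function name is hypothetical and is not part of any Google tooling.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(prefix: str) -> bool:
    """Return True if the whole prefix lies inside one RFC 1918 block."""
    net = ipaddress.ip_network(prefix, strict=False)
    if net.version != 4:
        return False  # RFC 1918 only covers IPv4
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)
```

A prefix such as `172.32.0.0/16` fails the check because the 172.16.0.0/12 block ends at 172.31.255.255.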

And it doesn't require the use of any VPN devices and uses standard routing, as I'll go into in just a second. So, in a more pictorial view, with Private Interconnect, there's the Google peering edge and there's your peering router. And you're connecting to us in a colocation facility. That means your network is connecting to your compute instances in the Cloud. And with Partner Interconnect, you're coming through a service provider from your router to their infrastructure into the Google Cloud. This, as in the previous diagram, these are all layer two connections. And I'll talk about how you do the layer three extensions in just a second. There are carriers that offer you a service like a Metro E kind of a service where they offer you an ethernet connection and you don't have to worry about the routing infrastructure. So there are two methods that carriers offer, one where you have to deal with BGP and another one where you don't have to deal with BGP.

So where do you connect to us? Shortly, when we come out with the product, you'll be able to connect to us where all these green areas are. And over the course of 2017, we'll be lighting up a lot of these areas in blue. These are points of presence that Google has across the globe. So what is a point of presence? For instance in Chicago, we're actually in two locations in Chicago. We're in CoreSite and in Equinix. In Frankfurt, we're in four locations in Frankfurt. So when we say that you're– we have a metro availability, what we're doing is actually offering you in a number of connections in that metro. And in each one of those connections, we offer multiple no shared fate paths to Google infrastructure. So if for instance, if you come into CoreSite in Chicago, what you'll see is there are two locations that you can connect to us in Chicago in CoreSite, what we call a zone 1 and a zone 2. And there is no shared fate between the two of them. So if you're building a highly reliable network, you probably want to connect in zone 1 and zone 2.

I'll go into high availability and architectures for high availability later on in this slide deck. So in general, it doesn't matter where you connect to us in that metro, you're connecting into the Google infrastructure but in multiple sites in any one metro. So a little– some comments, this is all well and good, but what do I have to do? Couple of things, this is all single mode fiber. We support LACP for link bonding. So we offer 10 gig interfaces. You can link bond across multiple 10 gig interfaces to go up from 10 gig up to 80 and even more. If you go for more, please come and talk to us because there's better ways of doing that. You have to support a link local address. So if you're doing the private solution, you have to have a link local address on your router. You have to have a private ASN that you exchange with us. 802.1Q VLAN tagging, and that's how we'll do traffic separation. I'll go into the details about traffic separation in just a bit. And this is all using IPv4 link local addresses.
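The requirements listed above (an IPv4 link-local address on your router, a private ASN, and an 802.1Q VLAN tag) can be sanity-checked before you order a connection. The sketch below is a hypothetical pre-flight check, not a Google tool; the numeric ranges are the standard ones (169.254.0.0/16 for IPv4 link-local, the RFC 6996 private ASN ranges, and VLAN IDs 1 to 4094).

```python
import ipaddress


def is_ipv4_link_local(addr: str) -> bool:
    """True for addresses in 169.254.0.0/16."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip.is_link_local


def is_private_asn(asn: int) -> bool:
    """True for the 16-bit and 32-bit private-use ASN ranges (RFC 6996)."""
    return 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294


def is_valid_vlan_id(vlan_id: int) -> bool:
    """802.1Q carries a 12-bit VLAN ID; 0 and 4095 are reserved."""
    return 1 <= vlan_id <= 4094


def check_attachment(router_ip: str, asn: int, vlan_id: int) -> list:
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    if not is_ipv4_link_local(router_ip):
        problems.append(f"{router_ip} is not an IPv4 link-local address")
    if not is_private_asn(asn):
        problems.append(f"ASN {asn} is not in a private range")
    if not is_valid_vlan_id(vlan_id):
        problems.append(f"VLAN ID {vlan_id} is outside 1-4094")
    return problems
```

For example, `check_attachment("169.254.10.1", 65001, 100)` returns an empty list, while a public ASN or an RFC 1918 router address would each add an entry.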

So, a little bit more detail. So link local address between your router. And the router you connect to is actually the Cloud Router that we talked about earlier. This is where your virtual connection ends up being is on your Cloud router in your project. So in that project, you can then connect to VM instances and you're taking your private address space, you're going over a layer 2 connection to the Google Cloud infrastructure and to your project instances running in the Google Cloud. So, if you're in multiple regions, you'd have a Cloud Router in one region, a Cloud Router in another region, and then you'd have VLAN tags going between both of them. So this shows one VLAN tag going to one region and another VLAN tag would go to another region. All those instances can talk to each other over the Google Cloud backbone. That's all well and good, but let's get better. So we're announcing a feature to Cloud Router called global routing. What global routing does is why should I– if I use BGP, I should be able to advertise region 2 out of region 1 so now I only have to deploy one Cloud Router.

That means I only have to peer with Google in one location and now you have access to our whole infrastructure. Remember– and this is something that maybe we don't drill into you guys enough– is that the network that you have in a Cloud project is a network that spans the globe. And this is what I'm showing here, is that even though you're connecting in one region, you have connectivity to all the regions that you have projects– your project has VMs in. So that means you get access to that infrastructure as shown in earlier slides. And if you're a multinational company that wants to, for instance, do web serving out to people all over the globe, you only really need one connection to us and that gives you access to infrastructure across the globe. Now there are customers that have– actually most customers have multiple projects. So you might have a dev project and you might have your release project. You don't want the two networks to meet. That would be a very bad thing.

It gives you the chance of cross-pollination of information and hackers can get through one network into the other. And most customers want to keep their production network separate from their development network. So what you do is you can have a project with a Cloud Router here and another project with a Cloud Router here. There's VLAN tagging that separates the traffic between those two projects. And on your side, you would do VRFs to separate the traffic on your enterprise network. So this gives you one link into Google but multiple projects using that link and having traffic separation across those multiple projects. The other thing– actually, there's a talk this afternoon at 4:00 that's going to go into a lot of detail on XPN, which is our cross project networking solution, also known as a shared VPC. So, what is a shared VPC in Google's world? What happens is that you have a host project and you have service projects. Your networking for those service projects ends up in the host project.

So this is where you virtually connect to the Google Cloud in that project. This is where all your network resources are, this is where your firewall policies can live. And what that lets you do is have a separation of concerns between the folks that are worried about security and networking and those that are worried about development. So this lets you share a resource like a network connection to multiple projects without those projects really needing to know about the physical infrastructure of building a network. So you can monitor– you can have one set of people monitoring your network infrastructure and another set of people doing all the production deployments of your VMs and all of that. So you have that separation of concerns, you have that separation of management responsibilities, and that extends to the IAM roles that are associated with these host and service projects. Again, this afternoon, there's going to be a presentation that goes into a lot more technical detail about a shared VPC and how you can use it in your projects.

So as I said, we were going to talk a little bit about high availability. When you deploy network infrastructure anywhere, you're going to want to do multiple links into our infrastructure. One of the reasons is that we can't support an SLA for one single connection. We do maintenance on all this infrastructure and when we do maintenance, we would take your network out, which is not a nice thing. So, what we suggest doing is in one colocation facility, have multiple links. That gives you the ability to have redundancy and we give you the benefit of having no shared fate between those two links. But in addition to that, and a lot of customers want to go this route, is that you want to do more than that. You want to have multiple connections in multiple regions. And the reason you want to do that is because you want to deal with failover, metro level failover. If your infrastructure in a metro fails, you can failover to another metro area. So, now you have two routers. What we've also done in these two routers is we've built the ability to use BGP the way it's meant to be used.

So you can have different route preferences in each one of those routers. So, let's say this facility is in London and this facility is in the Bay Area, in San Francisco. If you get a cable cut here in London, your traffic off your network would end up in San Francisco because the BGP route preferences would tell you that's what you want to do. You'd see that there was a disconnect connectivity to your London facility and you'd route your traffic to another facility. So this gives you– what we're doing is starting to extend the Cloud Router, the network infrastructure to look more like a traditional enterprise network. With the same level of redundancy, the same level of failover, the same way of doing all of these things that you do in a large multi-region, multi-country enterprise network. Oh one other– one last thing on this. Also when you think about– I used to do a lot of work with telcos. And telcos are all about, it has to be five nines and they're starting to think about six nines.
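Two ideas from this part of the talk lend themselves to a short sketch: preference-based failover between metros, and the way independent links multiply up the "nines". Everything here is illustrative. The `Path` class, the preference values, and the 99.9% per-link figure are assumptions made for the example, not Cloud Router settings or a Google SLA, and the availability formula only holds when the links truly share no fate.

```python
from dataclasses import dataclass


@dataclass
class Path:
    metro: str
    preference: int  # higher wins, in the spirit of a BGP route preference
    up: bool = True


def best_path(paths):
    """Pick the live path with the highest preference, or None if all are down."""
    live = [p for p in paths if p.up]
    if not live:
        return None
    return max(live, key=lambda p: p.preference)


def combined_availability(per_link: float, links: int) -> float:
    """Availability of N independent links when any single one is enough.

    The service is down only if every link is down at the same time, so
    the failure probabilities multiply.
    """
    return 1 - (1 - per_link) ** links


# Hypothetical deployment: London preferred, San Francisco as backup.
paths = [Path("london", preference=200), Path("san-francisco", preference=100)]
primary = best_path(paths)       # London while its link is up
paths[0].up = False              # simulate the cable cut in London
fallback = best_path(paths)      # traffic now lands in San Francisco

# Assumed 99.9% per link; each extra independent link shrinks the downtime.
two_links = combined_availability(0.999, 2)
four_links = combined_availability(0.999, 4)
```

Under these assumptions, each additional no-shared-fate link multiplies the unavailability by the per-link failure probability, which is why the recommended design pairs redundant links within a metro with a second metro.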

So with this kind of an infrastructure, you start building that level of failover, that level of SLA by multiplying the number of connections you have. So with a description, if you were to build a network that looks something like this, you'd be able to build a network that had a five nines reliability. A number of customers have come to us and said, yeah that private address space is really good. But what I really want to do is I have some public address space and I would like to back up that public address space with infrastructure in the Cloud. And there's a way to do that in GCP. The first thing you would do is at the VM level, you'd attach an IP address to each one of those VMs. So you're taking that public IP address and attaching it at the operating system level of the VM. The second thing you'd do is you'd– with another feature that's coming out shortly– is you'd be able to change the route that is being advertised. Today, the Cloud Router advertises in a very static way what routes are available.

We're adding a feature called flexible routing that will allow you to specify which routes you would advertise. A reason you'd want to do this is for instance, you may have a production network that you're not ready to go and expose to the whole world, that you want to have it all ready. So you want the routing infrastructure ready. You'll be able to flip a bit on that network and then it'll start to be advertised off the Cloud Router. It's a way to predictably release routes out to your on premise infrastructure. So that's a feature that's coming out. So, in Cloud Router, you'd add the prefixes for those public IPs that you want to advertise over BGP. Then you'd add a static route from the private IP. Because these VM instances will have a Google private IP and you need to know how to get the reverse traffic back. So you'd put a static route, taking that private IP to the public IP of the VM. Sorry, from the public IP to the private IP of the VM.

And finally, you'd set up your router so that it advertises those to the rest of the world. And that is a method for getting public IPs into the Google infrastructure that are your own public IPs. Because today, the only way you can use a public IP is if you take something out of our range. So, for monitoring, so now we've given you a link to our infrastructure and you've decided to provision 10 gig and you have no idea how much of that link you're using. So there'll be– this is a mock up of the user interface that you'd see in the Cloud Console. And we'll do things like link utilization, drops, items like that. We'll also give you– suggest that you get a ping endpoint so you can measure round trip times and monitor that as well. And all of these statistics will be available in Stackdriver so you can actually extract them and put them into your own console. So you don't have to go to the Google Console to get all this data. You can actually get it out and use it on your own.
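Once byte counters for the link have been exported, utilization is simple arithmetic. The sketch below assumes two hypothetical counter samples taken a known number of seconds apart; it does not call any Stackdriver API, and the 10 gig default simply mirrors the port size mentioned in the talk.

```python
def link_utilization(bytes_start: int, bytes_end: int,
                     interval_s: float, capacity_gbps: float = 10.0) -> float:
    """Fraction of link capacity used between two byte-counter samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    bits_sent = (bytes_end - bytes_start) * 8
    capacity_bits = interval_s * capacity_gbps * 1e9
    return bits_sent / capacity_bits


# Hypothetical samples: 37.5 GB transferred during a 60 second window.
utilization = link_utilization(0, 37_500_000_000, 60)
```

Tracking this value over time is what tells you whether a provisioned 10 gig port is oversized or close to needing a second bonded link.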

Another thing we're changing– we've done some fundamental changes in how you interconnect to the Google infrastructure. And doing fundamental changes means you probably fundamentally want to change the way we describe that in the Cloud Console. So today, we're interconnected, as you see, there's something called VPN. That's going to change so it would be interconnect. And that will subsume VPN and private and partner interconnect. So you'd have one place to go in our user interface to control those interconnect methods into the Cloud. So, now you've moved your applications to the Cloud. You've connected to us in a variety of areas. It's still different. It's not your data center, right? It's not the rack of machines that's down the hall. And what we found is that customers have a problem, their applications have problems with this. Because we've changed– fundamentally, we've changed some of the network, some of the basic networking parameters that applications are used to.

And the big change we've done is we've changed the latency between your front end, the application, and the back end. Whether it be a database server or whether it be some other REST API that that application is using. And those changes mean that you may want to think about changing the way the application behaves. So I'm going to go into some details about how you can change the application to make sure that you're getting the most of our network. And it really has to do with something called the bandwidth delay product. And let me go into a little bit about why– what this is. So if you look at typical Cloud latencies, and I'm throwing– showing you three Google regions that we have– and the average latency to each one of those regions from our PoP locations. So if you have an instance running in our US Central region and you connect to us at the closest place you can connect to that US Central region, you're still about 10 milliseconds away from those VMs.

In US East, it's about 10 to 25 milliseconds. And finally in Europe, because it's a smaller place, it's about five to seven milliseconds. That's different. And the difference is profound because what you'll see, and now I'm going to go into a little bit of details about how TCP works, if you think about it, what TCP does is it sends a bunch of data down the pipe. And it sends a window's worth of data down the pipe. When you're in the same data center, your application and the back end are running in the same data center. The time it takes to send that is fairly quick. Well, it's the time to take– regardless of where you're at, the time it takes to send it is still about the same. What's very different is how long it takes to acknowledge that data. Because your data goes down the pipe, has to get to its destination before an acknowledgement can come back. And on a network that is less than a millisecond in delay, this time is very short. But as you get to even short delays like seven milliseconds, if you use the default settings in a TCP kernel, you're going to get really bad performance.
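The effect described above can be put into numbers: a sender can have at most one window of unacknowledged data in flight per round trip, so a single connection's throughput is capped at window size divided by round-trip time. The sketch below assumes a 65,535-byte window (the largest possible without TCP window scaling) as a stand-in for "default settings"; real kernel defaults vary.

```python
# Assumption: largest TCP window without the window scaling option.
UNSCALED_WINDOW_BYTES = 65535


def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one TCP connection: one window per round trip."""
    bits_per_round_trip = window_bytes * 8
    return bits_per_round_trip / (rtt_ms / 1000) / 1e6


# Same window, three latencies: in the data center, Europe, US Central.
same_dc = max_throughput_mbps(UNSCALED_WINDOW_BYTES, 0.5)
europe = max_throughput_mbps(UNSCALED_WINDOW_BYTES, 7)
us_central = max_throughput_mbps(UNSCALED_WINDOW_BYTES, 10)
```

With that window, the ceiling drops from roughly a gigabit per second at half a millisecond to well under 100 megabits per second at 7 to 10 milliseconds, no matter how large the physical link is.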

And I get a lot of calls from customers saying, I connected to you. I have these big pipes to you and I still get really lousy performance. Why? And they're forgetting one thing, is I need to change the TCP behavior to take advantage of the– well not take advantage, to remediate the issues you have with this milliseconds of latency that you've added to your application. And the way you do that is by increasing the advertised window in your TCP session. So rather than advertising in the default window, you want to extend that window so that you can send more data to the remote site. Now this doesn't help applications that– especially HTTP applications that are doing small transactions all the time, but it really affects applications that are trying to move lots of data. And that's where more often than not, you complain about slowness. You're doing an upload into a GCS Cloud Bucket. You're taking the movie you just finished rendering and downloading it. Doing those kinds of things can take a long time if you don't take into account the delay bandwidth product.
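The window needed to keep a link full is the bandwidth delay product: bandwidth multiplied by round-trip time. The sketch below computes it and then asks the operating system for socket buffers of that size using the standard `SO_RCVBUF` and `SO_SNDBUF` options. The helper names are hypothetical, and the OS may clamp or adjust the requested value according to its own limits, so treat the returned sizes as what you actually got.

```python
import socket


def bdp_bytes(bandwidth_gbps: float, rtt_ms: float) -> int:
    """Bandwidth delay product: bytes that must be in flight to fill the pipe."""
    bits_in_flight = bandwidth_gbps * 1e9 * (rtt_ms / 1000)
    return round(bits_in_flight / 8)


def request_buffers(sock: socket.socket, size_bytes: int) -> tuple:
    """Ask for send and receive buffers of the given size.

    Returns the (receive, send) sizes the OS reports afterwards, which
    can differ from the request.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    )


# A 10 gig link with a 10 ms round trip needs about 12.5 MB in flight.
window = bdp_bytes(10, 10)
```

Buffers should be set before the connection is established so that an appropriate window scale can be negotiated during the handshake.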

And the way to do that is by increasing the window size. And when you increase the window size, what happens is that you send more data and you acknowledge more data at the same time, meaning you can keep the pipe full. So if you're getting 10 gigabit pipes to us, you really want to keep the 10 gigabits full, especially if you're doing these large data transfers. So, just some notes on– one last note on this subject is that many of the TCP kernels are already set up this way. And the reason I bring this up is because if you do see that, it's a good way to go check this. And if you do a Wikipedia search on delay bandwidth product, there's tons of articles that go into how you'd go and look at these features and how you calculate all these things. So let's talk briefly about pricing of the solution. Because everyone wants to know how much they're going to spend. So, with our public item, right now it's– when you're doing direct peering with us, we don't charge you to direct peer with us and we give you some benefits on the cost of egress.

So you get a benefit in price on the cost of egress. With VPN, you're paying for the VPN gateway, and that's roughly about $5 a month for every three– roughly three gigabits of large packet data. With the new solution that we're coming up with, Private Interconnect, it's a $2,500 charge for the port that we give you, and there's egress costs on top of that. With a Partner Connection, there's a lot of charges, depending on how you connect with a partner. We start at the partner access pricing, so whatever they charge you to connect to us. And then there's a sliding scale depending on how much data you're pumping into Google on our end as well. So this gives you a sense of what the pricing is of those solutions. So, I'm almost finished. So how do you learn more? If you're really interested in this, please talk to your sales folks, whoever your sales contact is at Google. So that they can help you get into our team, possibly become part of the alpha, and be aware of when the beta of this solution comes out.

The other area where you can help a lot is somewhere in this– oh, there she is. There's one of my colleagues in an orange t-shirt back there. And they're running a UX lab down on the first floor. So if you're interested in helping us help you with a great user interface, they are more than happy and more than willing to listen to you guys and get you involved in studies about how we do UX development at Google so that we can give you great user interfaces and get great solutions back. And finally, there are a number of sessions that cover a number of aspects of networking. Unfortunately, one of them is actually happening right now that's two doors down. But I want to bring this up, all these sessions are recorded. So if you miss a session here, you'll be able to see it at a later time. So I want to highlight some of these sessions that are happening throughout the course of the next three days. And finally, I end up with one on Friday where we'll be talking about third party solutions to networking.

Today Check Point, for instance, announced their vSEC firewall, so next generation of firewalls for GCP. We'll be talking about that and we'll be talking about other SSL VPNs, about load balancers, and other solutions for enterprise networking in the Cloud.



As enterprises extend their network edge into the cloud, they can use a number of cloud networking features that enable the seamless connection of an enterprise data center to the cloud. In this video, you’ll learn about Google Cloud Platform (GCP) features that can help you enhance your data center workloads with cloud based solutions. John Veizades covers physical connections to the cloud, options and selection criteria for connection to the cloud and cloud features that facilitate extending your data center to the cloud.

