Google Cloud NEXT '17 - News and Updates

Managing encryption of data in the cloud (Google Cloud Next ’17)

(Video Transcript)
[MUSIC PLAYING] MAYA KACZOROWSKI: My name is Maya Kaczorowski. I'm a product manager here at Google. I focus on encryption at rest and encryption key management. And that's what we'll be talking about today. So I'm a crypto nerd, but you might not all be, so I'll cover some primitives, as well, before jumping in. We'll talk about how we encrypt data at rest at Google, the options that you have as a developer, and some partners that you can work with in order to have more options. I also apologize, I'm kind of losing my voice, so you can decide at the end if this was more painful for you or for me. So first let's talk about what kinds of encryption we have, in general. So you can think about encryption in three different states– encryption at rest, encryption in transit, or encryption in use. Encryption at rest is what you use to protect your data from exfiltration, in case your device is stolen, et cetera– it's how it's protected at the base layer, on hardware, on a device.

This is the focus of the talk that we'll have today. The second point, encryption in transit, is how we protect data when it's moving between two locations. So how we first authenticate the endpoints, encrypt the data, validate that transaction, and decrypt that data once it's been moved around. So Google uses RSA 2048-bit certificates on our front end. We use perfect forward secrecy. We were the first major provider to use perfect forward secrecy on our cloud. And all of that's, again, by default. You don't have to configure any of that for your data in transit. The last point here, data encryption in use, is how you can protect your data while it's being used by a server. This is a newer concept in the crypto community. And you can think of something like homomorphic encryption, which you might have heard of, as being a topic in this space. It's something that's much newer. It's not as practically applicable yet. Google does do research in this space, but it's way beyond the scope of this talk, so we're not going to talk about that today.

So let's talk about how you might be doing encryption right now if you're a company. So if you're a smaller company, a startup, if you have something on-prem, you might not be encrypting your data at all, or you might be relying on your cloud provider to protect your data. So you're kind of like this Service A situation here– I have some data, it's there, I don't know if it's encrypted or not. I don't have the infrastructure to manage that. If you're a larger enterprise company, you probably have some way of encrypting your data and managing keys. If you're in your own data center, you actually might not be encrypting your data at rest in your own data center. But you might also be more sophisticated, and have a custom cryptographic library, and a central Key Management Service, or a KMS. So that's what you see in Service B here. So I have some data in my application, and that data is encrypted with a key. And that key sits centrally in a Key Management Service.

If you're a much more sophisticated enterprise company, you probably have a more complex key hierarchy. So we'll get into what that means in a second. But you're not just using one key to protect data, you're using multiple layers of keys. And that's because when you start encrypting things at really large scale, it becomes really hard to manage a massive amount of keys. You might also be using something like a Hardware Security Module, or an HSM, to protect your data. So in application C over here, Service C, you have some data that's encrypted with a key, and that key's encrypted with another key, a key-encryption key, that could sit in that Hardware Security Module. So it's a more complex key hierarchy. So what do we offer at Google? So you have a continuum of options before, if you're doing things on-prem right now, in terms of maturity. The same idea at Google. Depending on where you lie on this spectrum, you're going to pick a different thing.

And it's not about picking an option that's specific for your company, it's about picking an option that's specific for your workload. So the first option is default Google encryption. Google encrypts customer content stored at rest by default, with no action required from the customer, using one or more encryption mechanisms. And I'll be talking about this more in a second. We're the only cloud that does this. You don't have to go check a box to turn on encryption– it's just there. This is no-op security. For most of your workloads, that's probably good enough. The second one– customer-managed encryption keys in Cloud KMS. So Urs mentioned yesterday morning that Cloud KMS is in general availability. I'm super excited. What Cloud KMS lets you do is generate encryption keys, use them, rotate them, destroy them. They're AES-256 symmetric keys. And you can use them right now if you're doing application layer encryption, secret management, et cetera, et cetera.

If you have a requirement to manage your own encryption keys, that's the option for you. The last option here is customer-supplied encryption keys, where Google does this a bit differently than our competitors. You can provide us with some bits that we use to protect your data. I'll talk a bit more about what I mean by that in a second. But we don't actually write those bits to disk. We only keep them in memory. So we're not recording what you've given us anywhere at Google. We're the only cloud that has this exact setup, and that's pretty exciting. So this is already available for data objects in Google Cloud Storage, and for disk instances and images in Compute Engine. And we have new partner integrations that I'll be talking about today. If you have a requirement to manage your keys on-premise, use multiple parties to manage your keys, or a requirement to generate your own key material, then that's the option for you. So recapping, you have an array of options here.

If you don't have any specific requirements, don't worry about it. You can walk out right now and your data is encrypted, it's great. If you have a requirement to manage your own keys, Cloud KMS. If you have a requirement to generate your own keys, keep your keys on-prem, or keep your keys on a secondary provider, then CSEK is the way to go. So I'll be delving into what each of these offers. So first, default Google encryption. How does Google encrypt everything at scale? Because it's massive. So to understand how we do that, you have to understand how we store data at Google. When an object is uploaded, for example to Google Cloud Storage, we shard or chunk that object into smaller pieces. And then each of those chunks gets a locally-generated data encryption key, or a DEK. That key is generated on that machine that writes that thing. I'm talking about storage layer encryption here. Each chunk will have a unique data encryption key, even if it's part of the same Google Cloud Storage object as another chunk, even if it belongs to the same customer as another chunk, even if it's sitting on the same machine as another chunk.

We have a super small, low-level isolation– cryptographic isolation– for how data is stored. Access to those chunks is controlled through access control lists at a per-chunk level. So the storage system controls who can actually access those chunks and those keys at that level. Those chunks are distributed across Google's infrastructure for latency, backup, storage, et cetera. And we also actually back up data encrypted. We don't decrypt it and re-encrypt it in the new backup system. We just take it and move it somewhere else, and it's backed up that way. So if someone were to try to walk into Google and steal– steal hardware, steal something physical– and take your data, they would first have to know where all the chunks are that your object has been split into, which is a really non-trivial problem at Google's scale, have access to all those chunks, and have access to all the keys that encrypt those chunks. So where are those keys stored? Well actually, you can think of them as being stored as, like, a header to that data chunk.

But of course, we're not going to write data encryption keys in plain text– no, no, no. We're going to encrypt those data encryption keys. So we encrypt them with another key called the key encryption key. This is the hierarchy that I was mentioning earlier. So we have data encrypted by data encryption keys, and data encryption keys encrypted by key encryption keys. Those key encryption keys sit inside Google's internal Key Management Service, or KMS, and KMS was built for that purpose. So KMS does not allow keys to be exported. Rather, the data encryption keys go to KMS to be encrypted and re-encrypted, and sent back. That allows us to have auditability over how those key encryption keys are used. That also allows us to have those tight ACLs that we want, again, on how the usage of those key encryption keys by services and users is done. KMS is highly scalable, and it has to be, again, to operate things at Google scale. So it's incredibly low latency, high availability, and globally available.

And what that actually means, in practice, is that our KMS runs on tens of thousands of production machines in Google data centers globally. So when you actually are pinging something and getting a piece of data back, there is an encryption process happening. And it's hitting a local instance of KMS in order to decrypt that data and get it back to you. That's what allows us to have encryption at scale at Google. All right, so you guys are like, OK, I got– you know, I'm skimming a slide. So first, how does that work in practice? So you can think about this as being three different actors here. There's a service that's using the data. There is a storage system that's storing the encrypted data. And there's a KMS that's storing the encryption keys– the key encryption keys. So what happens in practice is the service asks the storage system for some object. The storage system verifies that the service has the right to access that object, figures out all the chunks in which that data is stored, checks those ACLs.

Then it pulls the data encryption keys that are sitting with those chunks of data, and passes those data encryption keys, encrypted, to the KMS. The KMS then verifies that the storage system– again, another ACL check– has the right to access those key encryption keys, decrypts those data encryption keys in memory, and sends them back to the storage system. And people often ask me– that connection here between the storage system and the KMS, that's an encrypted RPC. So you're not sending around plaintext keys on our network. Then from the storage system back to the service, the storage system decrypts the data, and sends back the plaintext data to the service, in most cases. In some cases, the service decrypts it directly. Again, that's a general example. For something like SSDs, it's actually a locally-managed key for the SSD at the DEK level. But this, in general, is how storage works in our systems– sorry, encryption and storage work in our systems. So you're like, OK, we got data, data encryption keys, key encryption keys– it's turtles all the way down, right?
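The flow described above (chunk the object, give each chunk its own DEK, wrap each DEK with a KEK that never leaves the KMS, and check ACLs on the way back) can be sketched roughly as follows. This is a minimal illustration and not Google's implementation. The class names, the 16-byte chunk size, and the object-level reader list (instead of per-chunk ACLs) are all assumptions made for the example, and `toy_cipher` is a standard-library stand-in for AES that must not be used to protect real data.

```python
import hashlib
import hmac
import secrets

CHUNK_SIZE = 16  # tiny on purpose, so the demo object splits into several chunks


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 keystream. Stand-in for AES; NOT for real use.

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = nonce + counter.to_bytes(8, "big")
        stream += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class Kms:
    """Holds key encryption keys (KEKs). Keys are used here, never exported."""

    def __init__(self):
        self._keks = {}  # KEK name -> key bytes
        self._acl = {}   # KEK name -> callers allowed to use that KEK

    def create_kek(self, name, allowed_callers):
        self._keks[name] = secrets.token_bytes(32)
        self._acl[name] = set(allowed_callers)

    def _check(self, name, caller):
        if caller not in self._acl[name]:
            raise PermissionError(f"{caller} may not use {name}")

    def wrap(self, name, caller, dek):
        self._check(name, caller)
        nonce = secrets.token_bytes(16)
        return nonce + toy_cipher(self._keks[name], nonce, dek)

    def unwrap(self, name, caller, wrapped):
        self._check(name, caller)
        return toy_cipher(self._keks[name], wrapped[:16], wrapped[16:])


class Storage:
    """Stores encrypted chunks, each with its wrapped DEK kept alongside it."""

    def __init__(self, kms, kek_name):
        self.kms = kms
        self.kek_name = kek_name
        self.objects = {}  # object name -> (allowed readers, chunk records)

    def write(self, name, data, readers):
        chunks = []
        for i in range(0, len(data), CHUNK_SIZE):
            dek = secrets.token_bytes(32)  # a unique DEK for every chunk
            nonce = secrets.token_bytes(16)
            chunks.append({
                "wrapped_dek": self.kms.wrap(self.kek_name, "storage", dek),
                "nonce": nonce,
                "ciphertext": toy_cipher(dek, nonce, data[i:i + CHUNK_SIZE]),
            })
        self.objects[name] = (set(readers), chunks)

    def read(self, name, service):
        readers, chunks = self.objects[name]
        if service not in readers:  # first ACL check: may this service read?
            raise PermissionError(f"{service} may not read {name}")
        plaintext = b""
        for chunk in chunks:
            # second ACL check happens inside the KMS when unwrapping the DEK
            dek = self.kms.unwrap(self.kek_name, "storage", chunk["wrapped_dek"])
            plaintext += toy_cipher(dek, chunk["nonce"], chunk["ciphertext"])
        return plaintext


kms = Kms()
kms.create_kek("storage-kek", allowed_callers={"storage"})
storage = Storage(kms, "storage-kek")
message = b"customer content that spans several chunks"
storage.write("report.txt", message, readers={"billing-service"})
assert storage.read("report.txt", "billing-service") == message
```

Note that the plaintext DEK only ever exists in memory inside `write` and `read`; what is persisted next to each chunk is the wrapped DEK, which is the "header" idea from the talk.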

Kind of. So what happens next? This is the fun part. So we said earlier, we have some blue– data, the yellow– data encryption key, green– key encryption key. And that key encryption key is stored inside our KMS. All right, but we're not going to write keys to disk, right? We said that earlier. That's not going to happen, plain text keys to disk. So when KMS is not running in memory, where are those keys stored? Well of course, they're stored encrypted on disk. So they are wrapped with a KMS master key. And that KMS master key sits in another KMS that we call root KMS. And root KMS is much smaller, has maybe a dozen keys that are really critical to Google's infrastructure, and runs in special security machines in our data centers. Root KMS is, of course, itself protected with something called the root KMS master key. So the root KMS master key is held in distributed memory, in a peer-to-peer infrastructure that we'll call the root KMS master key distributor. That's the fun part about taking internal code names and trying to explain them externally– I end up with root KMS master key distributor.

So the root KMS master key distributor is kept in memory, and basically what happens is a new job spins up and says, I'm the root KMS master key distributor, give me the root KMS master key. And it has a list of all the other hosts that are running the root KMS master key distributor. Asks it for the master key, and they pass it back and forth, and you have another instance up and running. I think it's explained in much better detail in our security design white paper. But the reason this can work in practice is that every job that spins up at Google has a unique identity. And so when something spins up and asks for the root KMS master key distributor, we know that that's actually what it is– it's not faking and potentially another rogue piece of code or something. So what that means in practice is that your data is stored locally to wherever it is, with the data encryption keys. The key encryption keys are stored on prod machines in KMS. Root KMS is run on security machines in our data centers.

And as long as at least one instance of the root KMS master key distributor is running in memory globally, in theory, we're fine, right? OK, now what if everything at Google goes down? What if every single Google data center loses power simultaneously? We have nothing running in memory anymore. I love working at Google. People think about problems like this. What that means in practice is that we've written that key to disk, to security hardware kept in physical safes, in at least two globally-distributed Google locations that fewer than 20 people have access to. And so if everything at Google were to completely go down, someone would go open a safe, get our root KMS master key, put it back into production as we turn back up our data centers. Your data will still be there, and it will still be encrypted. So that's how– that's kind of cool. So again, a quick summary– that's many, many layers of the key hierarchy there. But we can do this in order to encrypt your data at scale at Google, by default. The sexy line that you get to say is, "Google Cloud Platform encrypts customer content stored at rest, without any action required from the customer, using one or more encryption mechanisms. #GoogleNext17"

So there's a white paper on our website that explains this in detail. All the same diagrams and things. Everything I'm saying right now is public. And there are a few exceptions around logs and that type of thing, but those are all explained in the white paper. We're not hiding anything here. So that's what you get by default. And once you explain this to your CSO, and they're like, no, no, no, I still need to manage my own keys. You're like, great. Google has this other thing, called Cloud KMS. Cloud KMS is now in general availability. It lets you generate, use, rotate, and destroy keys, and has a couple features that I want to just highlight quickly. One is automatic rotation. What that means in practice is that we can generate a new key version for you, that you can use to encrypt data on an automatic basis. So you can say, for example, as a rule, every 60 days I want a new key version created. And we don't re-encrypt the old data, but you can therefore limit the amount of data encrypted with any single key.

The second item here is a 24-hour delay for key destruction. So I'm very terrified about a customer potentially destroying an encryption key without realizing that it encrypts data, and therefore, losing access to that data, for all intents and purposes. So we built in a purposeful 24-hour delay, so that when you say, destroy key, it goes into a kind of a temporary suspended state for 24 hours, and you can restore that key– again, to prevent error, insiders, any kind of other risk that could happen there. IAM and logging: we're integrated with Cloud IAM, and have predefined roles to allow for separation of duties, that I'll talk about in a second. And logging: we're integrated with Cloud Audit Logging. Cloud Audit Logging has two kinds of logs. Admin activity logs– which are on by default– which are things like change IAM permission, change key rotation policy, et cetera. And it has data access logs– which are not on by default, but you can turn on– that will say things like, was this key used to encrypt?

Lastly, KMS is available via our gcloud CLI, client libraries in seven languages, including Go, Java, Python, and then through the Cloud Console UI. I do often get a question, so I'll just point out really quickly– just like our KMS internally, keys in Cloud KMS are not exportable. So somebody will be like, hey, how come I can't see my encryption key on the Cloud Console? That's by design. You can't see the encryption key. The only way for you to interact with that encryption key is to send data via an API. You'll get back encrypted or decrypted data, depending on the request that you were making. Let's talk about a couple of concepts in Cloud KMS, so that we're all on the same page. Object hierarchy– so Cloud KMS has the concept of CryptoKeys. CryptoKeys are objects that sit inside KeyRings. KeyRings are just groups of CryptoKeys. CryptoKeys are like a named object that has a set of CryptoKeyVersions. The CryptoKeyVersions are the actual cryptographic material.
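One way to picture the object hierarchy is through resource names, where each level nests inside the one above it. The helpers below are a small sketch that follows the Cloud KMS resource naming pattern as I understand it. The function names are made up for this example, and the project, KeyRing, and CryptoKey names are hypothetical placeholders.

```python
def key_ring_name(project: str, location: str, key_ring: str) -> str:
    """A KeyRing lives in one project and one location and groups CryptoKeys."""
    return f"projects/{project}/locations/{location}/keyRings/{key_ring}"


def crypto_key_name(project: str, location: str, key_ring: str,
                    crypto_key: str) -> str:
    """A CryptoKey is a named object inside a KeyRing."""
    parent = key_ring_name(project, location, key_ring)
    return f"{parent}/cryptoKeys/{crypto_key}"


def crypto_key_version_name(project: str, location: str, key_ring: str,
                            crypto_key: str, version: int) -> str:
    """A CryptoKeyVersion is the actual key material under a CryptoKey."""
    parent = crypto_key_name(project, location, key_ring, crypto_key)
    return f"{parent}/cryptoKeyVersions/{version}"


# Hypothetical names, for illustration only.
print(crypto_key_version_name("example-project", "global", "app-keys", "orders", 3))
```

The nesting mirrors the talk: a version belongs to exactly one CryptoKey, and a CryptoKey belongs to exactly one KeyRing in one location.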

So when you're talking about a new Version, we're talking about a new set of cryptographic material that we're going to use to encrypt or decrypt data. The CryptoKey itself has nothing, and no particular material is tied to it. Key states– CryptoKeyVersions can be in multiple states. They're listed here– enabled, disabled, scheduled for destruction, destroyed. Enabled CryptoKeyVersions can be used to encrypt and decrypt data. Disabled Versions cannot be used to encrypt and decrypt data. So far, so good. There's another important concept to grok here, which is that of a primary key version. A primary version is the version that's used to encrypt your data. So now that we understand that, when we were talking about key rotation earlier, rotating a key– what we actually do is generate a new version and make that the primary version. So we're not talking about re-encrypting old data, but anything else that you want to encrypt with that key will use the new version.
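The version states, the primary version, and the 24-hour destruction delay mentioned earlier can be modeled as a small state machine. This is a toy model under stated assumptions, not the Cloud KMS implementation: the class and method names are invented for the example, and restoring a version to the disabled state (rather than enabled) is a choice made here for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import List, Optional

DESTROY_DELAY = timedelta(hours=24)  # the purposeful delay described in the talk


class State(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"
    DESTROY_SCHEDULED = "scheduled for destruction"
    DESTROYED = "destroyed"


@dataclass
class Version:
    number: int
    state: State = State.ENABLED
    destroy_at: Optional[datetime] = None


@dataclass
class CryptoKey:
    versions: List[Version] = field(default_factory=list)
    primary: int = 0  # number of the primary version; 0 means none yet

    def rotate(self) -> Version:
        """Create a new version and make it primary. Old data is not re-encrypted."""
        version = Version(number=len(self.versions) + 1)
        self.versions.append(version)
        self.primary = version.number
        return version

    def get(self, number: int) -> Version:
        return self.versions[number - 1]

    def tick(self, now: datetime) -> None:
        """Finalize any scheduled destruction whose delay has elapsed."""
        for v in self.versions:
            if v.state is State.DESTROY_SCHEDULED and now >= v.destroy_at:
                v.state = State.DESTROYED

    def schedule_destroy(self, number: int, now: datetime) -> None:
        v = self.get(number)
        v.state = State.DESTROY_SCHEDULED
        v.destroy_at = now + DESTROY_DELAY

    def restore(self, number: int, now: datetime) -> None:
        self.tick(now)
        v = self.get(number)
        if v.state is not State.DESTROY_SCHEDULED:
            raise ValueError("only versions scheduled for destruction can be restored")
        v.state = State.DISABLED  # assumption: restored versions come back disabled
        v.destroy_at = None


start = datetime(2017, 3, 9, 12, 0)
key = CryptoKey()
key.rotate()                                         # version 1 is primary
key.rotate()                                         # version 2 becomes primary
key.schedule_destroy(1, now=start)
key.restore(1, now=start + timedelta(hours=23))      # still inside the window
key.schedule_destroy(1, now=start + timedelta(hours=23))
key.tick(now=start + timedelta(hours=48))            # window elapsed, now destroyed
```

After the last call, version 1 can no longer be restored, while version 2 remains enabled and primary.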

Scheduled for destruction and destroyed– what happens in practice, when I was talking about destruction earlier, is that when you say, destroy this key, we move it into the state scheduled for destruction, and then after 24 hours it becomes destroyed. If it's still in scheduled for destruction, you can restore it, and get back to something like enabled or disabled. In terms of key rotation, I already explained what I meant by having a new CryptoKeyVersion created and making that primary. You can do that automatically, so like I said, set a policy– every 30 days, every two days, whenever you want– and create a new CryptoKeyVersion. And you can also do that manually, via an API call, or on the UI. And lastly, separation of duties. I mentioned that we have some predefined roles for IAM. What that means in practice is you might want to have an organizational control that says, the people who use the encryption keys aren't the same people who manage the encryption keys. So we set that up for you already.

We have two roles. You can give Alice the cloudkms.admin role and Bob the cloudkms.cryptoKeyEncrypterDecrypter role. And that means that Alice can manage who has access to the keys, and rotations, all of that, and Bob can set up something to encrypt data using those keys. What happens in practice is that Bob will probably be a service, so you'll probably want to give something like a service account that role. We also have encrypt-only roles and decrypt-only roles, if that's something that you're looking for. What makes Cloud KMS special? Things that your SRE is going to love– scalability, global availability, and latency. Scalability– you can have as many KeyVersions as you want. So I had a customer– they're like, can I have 1,000 key versions? I'm like, of course– do you want a million? You want 10 million? Just hit us. The reasoning behind this is that we shouldn't have to limit how you can encrypt data at scale, just because you can't have as many versions as you want.
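The separation of duties between Alice and Bob described above can be written down as IAM policy bindings and checked mechanically. The sketch below uses a plain dictionary in the general shape of an IAM policy. The member addresses are hypothetical, and `duty_violations` is a helper invented for this example rather than part of any Google library.

```python
ADMIN_ROLE = "roles/cloudkms.admin"
KEY_USER_ROLES = {
    "roles/cloudkms.cryptoKeyEncrypterDecrypter",
    "roles/cloudkms.cryptoKeyEncrypter",   # encrypt-only
    "roles/cloudkms.cryptoKeyDecrypter",   # decrypt-only
}

# Hypothetical members: Alice manages keys, a service account ("Bob") uses them.
policy = {
    "bindings": [
        {"role": ADMIN_ROLE,
         "members": ["user:alice@example.com"]},
        {"role": "roles/cloudkms.cryptoKeyEncrypterDecrypter",
         "members": ["serviceAccount:bob-app@example-project.iam.gserviceaccount.com"]},
    ]
}


def duty_violations(policy: dict) -> list:
    """Return members that can both manage keys and use them."""
    admins, users = set(), set()
    for binding in policy["bindings"]:
        if binding["role"] == ADMIN_ROLE:
            admins.update(binding["members"])
        elif binding["role"] in KEY_USER_ROLES:
            users.update(binding["members"])
    return sorted(admins & users)


print(duty_violations(policy))  # an empty list means the duties are separated
```

A check like this could run in a review step, so that nobody is quietly granted both the managing role and a key-using role.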

You should have as many versions as you want. This also enables a use case which I'll call cryptographic isolation. You might want each of your customers' data to be encrypted with a different key. Or even every row in a table that corresponds to a particular customer to be encrypted with a separate key. That's possible with Cloud KMS. You can implement that, there's nothing stopping you. So go forth and encrypt. The second one here, global availability– Cloud KMS is a regional service. So what that means in practice is that you pick a region in which to create your resources, CryptoKeys and KeyRings. CryptoKeys and KeyRings can't be moved, or renamed, or actually, in Cloud KMS's case, deleted. You can destroy all the crypto materials associated to them, but you can't actually delete that resource. And that's for logging, auditing, historical purposes. So something like global availability– you can create a key in, like, us-east1, but you can still access it in Europe.

There's nothing stopping you from using that key globally. So you don't have to have a different key in each region. We have optimized the keys in those regions for latency, but that's not preventing you from using it elsewhere. And last one here– latency. Cloud KMS is very fast in encrypt and decrypt operations, on the order of tens of milliseconds. So you should be able to use it in the serving path of your application. You shouldn't avoid encrypting something because you're worried about your application starting up. So this should not be a limit. Again, all things your SREs are going to love. And part of that is also with our general availability announcement. We posted an SLA yesterday morning around availability, around uptime. So you can see that on our website, and that comes as part of standard Cloud KMS. OK, I'll take some water. So we've been talking about secret management– sorry, encryption keys. What did I do? Yes, please– please help me with my notes, please.

We've been talking about encryption at rest. So one of the first uses of Cloud KMS might be encryption at scale– bulk data encryption. So you might want to implement something like a multiple-key hierarchy– like I just described that Google does– to encrypt your data. That lets you have encryption at scale. A second use case is around secret management. So you might have some small pieces of data that you want to keep secret in your application. One such example would be something like API keys, OAuth tokens, some super-special credentials that you have, something like that. And typically, developers that I've talked to now are keeping those in a variety of places. But they include things like directly in the code, in the metadata of a VM, in a deployment manager, in a configuration file, in a bunch of different places. And that's all fine, and– I think we have multiple slides happening. OK, cool. We'll figure it out. Cool. So you can store secrets in a variety of places right now.

What we're also going to be publishing today– publishing yesterday– on our website is a set of questions for you to think about secret management, and an option for how you could implement that at GCP with what's currently available. So what I mean by that is that there's a variety of questions you might be having around secret management. You probably want to optimize for a few things around security and functionality. Strong security– something like authorization, like, who should actually have access to the secret? How do I manage access to that secret? Version management– how do I know that all my applications have the same version of the secret right now, and that they're all in sync? Encryption– are my secrets encrypted at rest? These types of questions. I'm not going to click anything. Give me one second. So what we're hosting on our website today is not a be-all and end-all. You can think– there's probably a couple of ways you can improve on just hard coding secrets in code.

The first one would be just to encrypt those things that are hard coded in code, so that they are more protected: instead of any of your developers having access, only some of your developers have access. A secondary improvement could be something like putting all of your secrets in a bucket, and encrypting that bucket– like a Google Cloud Storage bucket– and putting the right IAM, the right logging information, all that, all around it. Another option might be to use a third-party secret management solution. And I think, again, this is another array of options and complexity, like, what are you looking for out of this? If you're just looking to limit access a little bit more, you're looking for additional auditability, or you're looking for smarter features, like secret rotation, you might pick something more complex. So what we published is a bit more explanation as to how we do it, with a Cloud Storage bucket, Cloud KMS, IAM, and Cloud Audit Logging.

It's not the best solution for your environment, necessarily. But it's one way you can think about managing secrets in a central way, securely. Awesome, thank you. The last option that I talked about earlier– so we had default encryption, we had Cloud KMS, and we had customer-supplied encryption keys. And customer-supplied encryption keys– so what happens here in practice is that you provide us with a 256-bit string. We assume it's probably an AES-256 key. We combine that with a random cryptographic nonce– so that the randomness of that is sure to be at least at a certain level– and keep that in memory, and use that to encrypt your data. So we talked earlier about data encryption keys and key encryption keys. What this means in practice is, for data objects in GCS, images, instances, and VMs in GCE, we can use what you provided us plus that cryptographic nonce as the key encryption key. The more complex thing here is that you might not have a really easy way of managing those keys locally, because we need you to pass us that key as part of an API call.
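To make "pass us that key as part of an API call" concrete, here is a small sketch of what supplying a 256-bit key to Cloud Storage can look like at the HTTP level. The header names follow the Cloud Storage customer-supplied encryption key convention as I understand it (the base64-encoded key plus a base64-encoded SHA-256 hash of the key). The helper name `csek_headers` is an assumption for this example, and no request is actually sent.

```python
import base64
import hashlib
import secrets


def csek_headers(key: bytes) -> dict:
    """Build the per-request headers that carry a customer-supplied key."""
    if len(key) != 32:
        raise ValueError("a customer-supplied key must be exactly 256 bits")
    digest = hashlib.sha256(key).digest()
    return {
        "x-goog-encryption-algorithm": "AES256",
        "x-goog-encryption-key": base64.b64encode(key).decode("ascii"),
        "x-goog-encryption-key-sha256": base64.b64encode(digest).decode("ascii"),
    }


# Generate key material locally; you are responsible for keeping it safe.
my_key = secrets.token_bytes(32)
headers = csek_headers(my_key)
print(sorted(headers))
```

Because the key is not kept by the provider, the same headers would have to accompany every later read or write of that object, which is the key management burden the partner integrations are meant to ease.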

Since we don't store that key on disk, only in memory, and only for the amount of time required for that transaction, every time that you would need to access that object, or write that object, or spin up that instance, or whatever, you need to re-provide us that key in that API call. So what we're announcing today is a couple of partner integrations to make that significantly easier for you. Yes. So if you're looking to use customer-supplied encryption keys, you basically need to have a KMS on premise. We have a variety of partners who are going to help us do this for you. For customers without existing KMS solutions, you can look to work with them. So using KeyNexus products– KeyNexus Enterprise On-Prem, and KeyNexus Cloud– you can provision a key both for GCE via gcloud, or for GCS via gsutil. Using Thales nShield HSMs– which are FIPS 140-2 Level 3 compliant, so if that's a requirement that you have around FIPS compliance– you can supply a key for use with GCE. Using Virtru, you can supply a Virtru-generated key for use with GCE.

Using Ionic's products– Ionic Protect for cloud storage– you can supply a key for use with GCS. And using Gemalto's products– SafeNet KeySecure– you can provide, again, a key for GCS. So KeyNexus right now is supporting GCE and GCS; Thales and Virtru, GCE; and Ionic and Gemalto, GCS. And we're going to expand that in the future, so you have even more options for how you can do key management with our partners. We hope that you find these much simpler than managing your keys on-prem, and again, love to hear any feedback on that functionality. So again, how does this work in practice for all of these options? We have a storage object. That storage object is split into chunks. The chunks are encrypted with data encryption keys. The data encryption keys are encrypted with key encryption keys. And the key encryption key will come from a variety of places, depending on what you're trying to do, right? It'll come from default Google encryption, using our internal KMS. Right now you can implement a similar model using application layer encryption with Cloud KMS.

Or you can use a customer-supplied encryption key, where you provide us with some bits– either directly or via one of our five partners– and we'll add a cryptographic nonce, and then use that in our system. So that's, I think, all I was actually going to talk about. So I went really fast. If you go to this link, there will be information on all of the options that I've talked about today, additional information on the white paper I mentioned. We have a couple of code labs that you can do using Cloud KMS, so there's actually one if you go to the first floor, downstairs. I'll also be standing outside of the security area, if you have any questions. And then there will be a blog post next week, announcing the partners and some new functionality for secret management and all that kind of stuff. Thank you. [MUSIC PLAYING]


Read the video

Can management of encryption keys be easier in the cloud than on-premise? During this video, Maya Kaczorowski discusses the continuum of encryption options available, from encryption of data at rest by default, to Cloud Key Management Service, to Customer-Supplied Encryption Keys. You’ll learn how our encryption tools allow management of your own keys, including generation, rotation and destruction of those keys. She also shares best practices for managing and securing secrets.

