Google Cloud NEXT '17 - News and Updates

Introduction to container development in Google Cloud Platform (Google Cloud Next ’17)

(Video Transcript)
[MUSIC PLAYING] DAN PAIK: Hi everyone. My name is Dan Paik. I'm a product manager on the Google Container Engine team, so I work with Container Engine as well as Kubernetes. Today I'd like to give you a quick introduction to container development in the Google Cloud Platform. I want to show you how easy it is to write and deploy code using containers with the various tools that are available to you on the Google Cloud Platform. We'll go over tools such as Google Cloud Shell and the Google Container Registry, and of course we'll also go over Container Engine. The main thing is that the Google Cloud Platform has all the tools you need to get started with container development. We'll go over Google Container Engine and where it fits within the overall landscape of product offerings across the Google Cloud Platform. Google Container Engine runs open-source Kubernetes, which was started by Google to manage and orchestrate containers. It's a very powerful tool.

And I hope to showcase a lot of that power to you today. I'll also run through an example and demo of writing a simple application and containerizing it. We'll push the image over to the Container Registry, and then we'll start running and scaling the application with Google Container Engine. After manually building a container and deploying it, which is really simple, I'll show you a quick demo of how to use Helm, the Kubernetes package manager, to even more quickly and easily deploy containers for some of the most common applications, such as WordPress, MySQL, Redis, and many more. And finally, as you continue your journey into container development, hopefully you'll be building more and more complex applications. As you start building more of these complex applications, they almost always require persistent volumes to manage, allocate, and bind storage, and this needs to be done the Kubernetes way. So I'll go through that as well. Now, if we look at the various product offerings within the Google Cloud Platform, this is really where Container Engine fits in.

On the one side, with Compute Engine, you have infrastructure as a service. This is where you use VMs, you configure your disks, you configure networking. In many ways, Compute Engine is what many people think about when they first talk about using cloud technology. Container Engine is containers as a service. As I mentioned earlier, it runs open-source Kubernetes, and it manages and orchestrates your containers using Kubernetes. All you really do as an engineer or developer running on Container Engine is create a cluster, manage nodes, and manage and replicate your pods. But you don't have to manage the underlying VMs, the underlying networking, or the underlying storage, as you normally would with Compute Engine. All of that Compute Engine stuff is what Container Engine takes care of for you. And if we go even further, there's App Engine. If many of you are familiar with App Engine, it's a further level of abstraction, and it's really convenient, because as an engineer and a software developer, all you have to think about is writing the code and deploying the code.

And everything else is really taken care of for you, including all the scaling. So that's the range of options that we have. In addition, we also have products such as Cloud Functions, which are serverless, and that goes even beyond App Engine in terms of abstraction. So let's go through coding the container way. We're going to get started by writing a simple application. We're going to containerize it. We're going to run it in Google Container Engine. I'll walk through this first, and then I'll show you a demo of all of this actually working, right? So let's take a look here. First, we have a simple server.js file. Hopefully those of you who are software engineers can see what this is doing. All we're really doing is listening on port 8080, and when a request comes in on port 8080 we write "hello world" in the response. It's a basic Node.js application. Normally you'd write code like this, and you would type in node server.js, test your code, make sure everything looks good, right?
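The slide itself is not part of the transcript, so the following is only a minimal sketch of a server that matches the description: it listens on port 8080 and answers every request with a hello-world message. The handler name and the exact response text are assumptions. The file is written from the shell so that it can be created in the same Cloud Shell session.

```shell
# Write a hello-world server like the one described in the talk to server.js.
# The handler name and response text are assumptions, not the speaker's code.
cat > server.js <<'EOF'
// Minimal Node.js HTTP server: answers every request with "Hello World!".
const http = require('http');

const handleRequest = (request, response) => {
  response.writeHead(200);
  response.end('Hello World!');
};

// Listen on port 8080, the port named in the talk.
http.createServer(handleRequest).listen(8080);
EOF

# To try it locally (requires Node.js): node server.js
```

Running `node server.js` and then requesting `http://localhost:8080` from a second terminal is the pre-container workflow the talk refers to.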

But that's really what we would normally do before containers, right? Let's say we want to containerize this. In this example, we'll use the Docker container runtime, and for the Docker container runtime we need a Dockerfile. All we've really done here is say that we want to use Node version 4.0, we want to expose port 8080, the source file is the server.js that we talked about before, and the command is just node server.js. So that's a simple Dockerfile. The next step is to build the container image. To build the container image around this we'll use these docker commands, and then we'll use docker commands and the gcloud command to push this image to the Google Container Registry. Now, the Google Container Registry is a managed, private container registry that's run and operated in the Google Cloud Platform.
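As a rough sketch of those two steps, the snippet below writes a Dockerfile with the four elements described (base image, port, source file, command) and wraps the build and push in a function. Several details are assumptions: the `node:4` tag stands in for "Node version 4.0", `my-project` is a hypothetical project ID, `hello-node` is an assumed image name, and `gcloud docker -- push` is the push form from the era of the talk (newer SDK versions configure Docker credentials with `gcloud auth configure-docker` and then use a plain `docker push`).

```shell
# PROJECT_ID is a placeholder; "my-project" is a hypothetical default.
PROJECT_ID="${PROJECT_ID:-my-project}"

# Dockerfile matching the description: Node 4 base image, port 8080,
# copy server.js into the image, start it with "node server.js".
cat > Dockerfile <<'EOF'
FROM node:4
EXPOSE 8080
COPY server.js .
CMD ["node", "server.js"]
EOF

# Build the image and push it to Container Registry.
# Usage: build_and_push v1
build_and_push() {
  tag="$1"
  image="gcr.io/${PROJECT_ID}/hello-node:${tag}"
  docker build -t "$image" . &&
    gcloud docker -- push "$image"
}
```

Nothing is built until `build_and_push v1` is called, which keeps the snippet safe to paste into a shell that has no Docker daemon available.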

So you can use that. There are other managed, private registries out there as well, or you can even set up your own private registry just for the containers that you use within your organization. So let's actually demo this, and I can show you this working in person, right? In the back, let's move over to... I think this is number two of the... [INAUDIBLE] my port here. OK. So here I've just logged into the Google Cloud Platform. This is our console. And yeah, I guess I have some meetings going on. What we see here is that I've simply logged in. I'm using Google Cloud Shell, which you can get to simply by clicking that button there. This is a shell that's running within the Google Cloud Platform. It's pretty convenient. This may not be the way that you would do development on a regular basis. On a regular basis, you'll probably use your own laptop, and you'll run the Kubernetes and gcloud commands directly.

But for this demo I'll use Cloud Shell. It's very convenient because if you don't happen to have your laptop with you, or if you want to have a quick environment up and running, this is always there for you, right? So I created a directory here called hello-node. If we take a look at this directory, I'll take a look at server.js. This is the file that we looked at earlier. It's the same file. And as I did over there, if I type in node server.js, now the file is actually running. Well, it's supposed to be running. Whenever we have these live demos we always end up having these kinds of issues. Let's take a look at server.js. Make sure it looks good. It's always fun. AUDIENCE: Socket's already in use. DAN PAIK: Oh, the socket's already in use. OK. Let's keep going. So I'm going to build this as a Docker container next, which is basically this command. So we'll take a look at that docker command.

And I'll actually build this and run this. This is the Dockerfile that we also looked at earlier. Now the docker build has run, and we've built the image from the Dockerfile, so I'm going to run this as a Docker container. What you see here is that the first thing we did was run this as a container. The second thing I did was call it with curl, and we see the Hello World coming up. So it's actually running as a Docker container. We can do some things here. If you're familiar with Docker, we can run docker ps to see what's running. We see that we have hello_tutorial running here. And if we run docker stop hello_tutorial, it will stop that container. So we have a couple of things like that that we can do here with Docker, right? So there you have it. What we've done at this point is we've taken a program that we wrote, the server.js file, the simple Node.js file.
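The exact commands typed in the demo are not captured in the transcript. A hedged sketch of the local run, check, and stop steps might look like this, assuming the image was tagged `gcr.io/<project>/hello-node:v1`, the container is named `hello_tutorial`, and `my-project` is a hypothetical project ID.

```shell
# IMAGE is an assumed tag; "my-project" is a hypothetical project ID.
PROJECT_ID="${PROJECT_ID:-my-project}"
IMAGE="gcr.io/${PROJECT_ID}/hello-node:v1"

# Start the container in the background, mapping host port 8080
# to port 8080 inside the container.
run_local() {
  docker run -d -p 8080:8080 --name hello_tutorial "$IMAGE"
}

# Request the page; the server should answer with the hello-world text.
check_local() {
  curl http://localhost:8080
}

# List running containers, then stop the demo container.
stop_local() {
  docker ps
  docker stop hello_tutorial
}
```

Calling `run_local`, `check_local`, and `stop_local` in that order reproduces the sequence narrated in the demo.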

We've containerized it, and we ran it in Docker. We ran the curl command, so now we know that our code works, and the containerized version of our code works. So pretty good there. Running a container and building a container with Docker is pretty easy to do. It's pretty convenient, and Docker gives you a lot of pretty cool tools. But the next step is, if you want to run this type of container in a production environment, now we have to think about a lot of other things. How would you scale this? How do you update this? How do we handle rolling updates? How do we plug this into a CI/CD type of framework and manage our containers in that sense? This is where Kubernetes comes in. Kubernetes can manage and orchestrate our containers for us, right? So what I'm going to do now is create a pod.

It's the same image running, except the difference this time is that I'm going to run it in Kubernetes and have it managed by Kubernetes, right? So that's all I'm doing here: I'm running this as a pod that I'm creating in Kubernetes with that image. What we see here is that the container is being created right now. And if we keep checking, we'll see that it's running. Now we have a single pod running a single container. We have this hello-node pod, and you can see that its name includes the numbers 4222. That's just part of the unique name that's been assigned to our pod, right? The other thing I want to do is test this using a browser, so it needs to be externally available. So I'm going to expose this. What this command here does (I'll copy all of these) is expose our hello-node deployment through Kubernetes to the outside world.

The other thing I wanted to do here was add a little bit of redundancy, a little bit of replication, to make sure that our pods stay up and running for a long time. So I'm going to set four replicas here. Instead of having a single pod running, we have four. So now we've scaled this application a little bit. We're running in four pods. We see here that, out of the four, there was one running from before, and we have three more that are running now. And if we do a kubectl get pods, we'll see that all four of them are running, right? So now we have this set up with four pods running. Let me go check to see if our IP address is available yet. This is the hello-node service that I created with that other command I showed up here. That expose command says, hey, I want to expose my hello-node to the outside world, so give me an external IP address.

And so it's just pending. Now that it's no longer pending, we have an external IP address here. So I'm going to go to that external IP address in another window. We'll hit port 8080, and hopefully we see our Hello World coming up, right? So now we have the container that we wrote, that simple Node.js Hello World application. We've containerized it, and we have Kubernetes managing these containers now, with four of them running. So far so good. It looks like everything is working pretty well. It's replicated, so if for whatever reason a pod were to go down, Kubernetes is going to keep trying to get four running. If you notice, in that command that I ran earlier, by setting replicas to four, I didn't mess with a config file or anything like that. I just said, hey, I want four replicas running. And Kubernetes is smart enough to make sure that there are always four running.
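The create, expose, and scale steps from this part of the demo can be sketched with kubectl as follows. This is an approximation, not the speaker's exact commands: the talk-era `kubectl run` created a Deployment, while the sketch uses `kubectl create deployment`, which does the same on current versions. The image path and the `my-project` project ID are assumptions.

```shell
# PROJECT_ID is a placeholder; "my-project" is a hypothetical default.
PROJECT_ID="${PROJECT_ID:-my-project}"

deploy_hello_node() {
  # Create a Deployment that runs the v1 image in a single pod.
  kubectl create deployment hello-node \
    --image="gcr.io/${PROJECT_ID}/hello-node:v1"
  # Ask for an external load balancer in front of port 8080.
  kubectl expose deployment hello-node --type=LoadBalancer --port=8080
  # Scale from one pod to four replicas.
  kubectl scale deployment hello-node --replicas=4
}

show_status() {
  kubectl get pods
  # EXTERNAL-IP shows <pending> until the load balancer is ready.
  kubectl get service hello-node
}
```

Note that scaling is a single declarative command: the Deployment records the desired count of four, and Kubernetes replaces any pod that goes down to get back to that count.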

We actually have a demo upstairs called Whack-A-Pod, where we have nine of these pods running. Every time you whack one of the moles there, it brings a pod down, but Kubernetes keeps trying to keep all nine up and running. So you can see how fast Kubernetes is at continually making sure that your pods are up and running. So here we have four running. But let's say I want to make a change, right? All of us in software development know that code doesn't stay stagnant. So we're going to go into server.js again. This time we want to say, hey, I don't really like Hello World. I want to change this to something else, right? So let's change this to Hello Party People. We're all party people here, I hope. And now I've changed the code. So what happens in this case, right? This is really where Kubernetes starts to excel.

We've updated our code. What we want to do now is rebuild our container image, push the image to the Container Registry, and then update our hello-node deployment to the new version of the code. So I'll do that right now. A few quick commands can do that. Right here we have a docker build command, and then we have the push, and then I'll run all of that. So first we build the image, then we push the image up. This next command, right here, is kubectl set image on the hello-node deployment, which sets it to the new v2 version of the image that we just pushed to the Container Registry. So that's essentially what we're doing there. If I get the deployment, it will say that there are four up and running. If I do a get pods, we see that the 422 ones were the older pods that we had earlier. Now it's created these new pods that start with 968, and slowly it's going to start terminating all of the old ones.
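A minimal sketch of that rebuild, push, and rolling-update sequence is shown below. It assumes the container inside the Deployment is named `hello-node`, that the new image is tagged `v2`, and that `my-project` is a hypothetical project ID. The final `kubectl rollout status` call is an addition for convenience and is not mentioned in the talk.

```shell
# PROJECT_ID is a placeholder; "my-project" is a hypothetical default.
PROJECT_ID="${PROJECT_ID:-my-project}"

roll_out_v2() {
  image="gcr.io/${PROJECT_ID}/hello-node:v2"
  # Rebuild from the edited server.js, push, then point the Deployment
  # at the new image; Kubernetes replaces the old pods gradually.
  docker build -t "$image" . &&
    gcloud docker -- push "$image" &&
    kubectl set image deployment/hello-node hello-node="$image" &&
    kubectl rollout status deployment/hello-node
}
```

While the rollout is in progress, `kubectl get pods` shows old and new pods side by side, which is the mix of pod name prefixes described in the demo.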

And all the traffic will start moving to these new 968 pods. If I continue with that, we see that within probably about 30 seconds or so it's done updating. And if I go back here again, I should see Party People if this demo works correctly, and we do. So you can see how, using containers and container development, especially with Container Engine, it's a lot quicker and easier to manage your code in this way, right? Think about how you normally go through a CI/CD type of process, where maybe you have to build larger files, and those larger files have to get pushed through the system. Containers, by contrast, are fairly small. They're portable, and you're able to get them to the system fairly quickly. So let's go back to the slides right now, and I'll show you another demo in a moment as well. OK. So the next thing. What we did just now was actually pretty easy, right? It wasn't too hard to do. We were able to use a lot of existing tools to create a container image.

We pushed it up to the registry, and we were able to get this stuff to work. But what about a lot of the other common applications that are out there, right? Let's say you wanted to run WordPress, or Redis, or MySQL, or NGINX. You shouldn't have to go through the kind of work that we just did to do that, right? There's an open community of people, and Kubernetes is an open-source project. So one of the things we definitely rely upon is our open-source community working together to solve these types of problems. Think back to the days when you were installing software on a typical Linux box. Maybe at first you had to take a tarball. You had to un-tar it. You had to try to make it. And sometimes you ran into errors, like library dependency errors and things like that. It was really annoying. And what really helped with that was package managers, right?

Things like yum and apt-get came out, and they made the installation of software a lot easier on those boxes. Now, in Kubernetes, the container stuff that we walked through earlier was, hey, this is how you build a container from scratch. But we wanted to build a package manager as well. So Helm is the package manager for Kubernetes. It really makes it easy for you to start up and run the containers that are most common in the marketplace, right? So this is an overview of Helm. Helm works with a set of files called Helm charts, which contain the application definitions that are necessary for the Helm install. As I mentioned earlier, Helm is the Kubernetes package manager, and it makes it very easy to install many of the most common applications. We have over 50 generally available Helm charts today, and those are the stable charts. We also have a lot of incubator charts that are still in progress. So it's a growing list.

And specifically, there's one project that we've been working on that's getting ready for an alpha release right now, and this is Spark. Until today, users have had to run big data workloads on standalone Spark servers, or run Spark on YARN clusters, or run these on Mesos clusters, right? The community has been hard at work on Spark integration with Kubernetes, giving users the option to run their big data jobs in Kubernetes. We had Spark running in standalone mode in Kubernetes last year. You could manually launch a Spark master and launch Spark workers, and do these by hand. But that architecture didn't fully integrate with Kubernetes for all of the resource scheduling. The alpha that's getting ready to be released right now integrates directly with Kubernetes. It launches executors via the Spark driver, so it really scales up, and it takes advantage of all the orchestration and management that Kubernetes is so good at doing.

So this is a great first step to enable big data Spark workloads on Kubernetes, and to let users of big data move off of the isolated clusters that you may have onto a shared infrastructure with the other applications that you run in Kubernetes, right? Ultimately, we want to see new things enabled by having big data closely integrated with modern microservices-based applications, all of these things running in Kubernetes. So the community is hard at work. And the community that's largely been responsible for Spark on Kubernetes includes developers from a wide range of companies: Palantir, Google, Pepperdata, Intel, Red Hat, Huawei, and many others. Like I said, the alpha will be ready very soon. If you're interested in trying this out, if you're a company that uses a lot of Spark and you want to try running Spark on Kubernetes, go to that link up there, download it, and play with it. This is really the way that we start getting applications running within Kubernetes and in containers. They eventually become Helm charts, and then the whole community benefits from that.

So why should you use Helm, right? Helm is easy to use because much of the work in getting the container ready is already done for you. You should use Helm because it's really fast, it's convenient, and it's reliable. There's no reason to reinvent the wheel when someone else in the community has already done a lot of that work for you, right? And because we are a community, if you don't see a Helm chart for a piece of software that's fairly commonplace, go build one. Contribute to the community, and now everyone else will have that Helm chart available. We've been emphasizing this whole community aspect of Helm a lot. Just, I think, three months ago there were only about 10 stable charts out there, and now there are over 50. So it's really worked, in terms of bringing this community together to help get containerized applications up and running. So let's take a quick look. We're going to do a quick demo of Helm.

And I'll show you how quickly and easily we can get something up and running with Helm, all right? Let me get the command up. So that's all Helm. I'm going to install Redis by using Helm. So we'll clear that up. This is basically the only command that you need to run. Believe it or not, this one command right here, helm install --name my-redis stable/redis, is really all we need to do. And Redis is installed right now. To prove that, I'll do a kubectl get pods. We see our old hello-node pods still running up there, but we also have this my-redis pod that's being created right now, right? If we wait a little bit, that will come up. The instructions that it came up with here say, to get your password, do that. It says Redis can be accessed via that port, and that's your cluster there. To connect, copy and paste these things. So I'll get my Redis password. Since it tells me to run this, I'll run that.

So I'll run that. And there. I guess that gave me my Redis password. Yeah. The pod itself is now running, so it took less than a minute to get that pod up and running. To connect to your Redis server, it says to run this kubectl command. So I'll do that. That is actually connecting me to the Redis that's running. And then to connect to Redis through the command line, it says to do this. So I'll just follow the instructions and do that, right? So we're actually in a Redis instance right now. To prove that, for anyone who is fairly well-versed in Redis, I'll do a couple of quick commands here. So I'll set temp to... is that not coming up? OK. There it is. So I'll set temp to GCP Next. Redis, for those who aren't aware, is a simple NoSQL database of name-value pairs. So I set up a name of temp to have the value GCP Next. If I get the value of temp, it pulls GCP Next. It's really wonderful.

That's all it really does. If I set a variable like x to 10, then I can increment it, and keep incrementing it. And if I get x, well, I get stuff like that. If I exit out of this, we still have Redis up and running. So essentially what I was trying to prove to you is that we have an actual instance of Redis up and running. And instead of going through a lot of the work that we did earlier with building the container, all I did here was the helm install. That was really all it took to get up and running. So if you want to get started with container development, yes, take your existing applications, containerize them, and run them in Container Engine. But if you want to build a WordPress instance, for example, you can quickly just do that in Helm. The various charts are available. We can just do a helm search, and this list here is all of the currently stable Helm charts. So you can see that a lot of the typical applications that you might want are in there.
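The Helm portion of the demo can be approximated with the sketch below. It uses the Helm 2 command syntax of the time; in Helm 3 the release name is positional and `--name` no longer exists. The release name `my-redis`, the `REDIS_HOST` and `REDIS_PASSWORD` variables, and the use of `redis-cli` flags in place of the chart's printed connection instructions are all assumptions, since those instructions are not reproduced in the transcript.

```shell
install_redis() {
  # Helm 2 syntax, as in the era of the talk. Helm 3 equivalent:
  #   helm install my-redis stable/redis
  helm install --name my-redis stable/redis
}

find_charts() {
  helm search          # list charts in the configured repositories
  helm search redis    # only charts matching the keyword "redis"
}

# Redis commands from the demo, issued one at a time through redis-cli.
# REDIS_HOST and REDIS_PASSWORD are hypothetical variables that would be
# filled in from the instructions the chart prints after installation.
redis_demo() {
  redis-cli -h "$REDIS_HOST" -a "$REDIS_PASSWORD" SET temp "GCP Next"
  redis-cli -h "$REDIS_HOST" -a "$REDIS_PASSWORD" GET temp
  redis-cli -h "$REDIS_HOST" -a "$REDIS_PASSWORD" SET x 10
  redis-cli -h "$REDIS_HOST" -a "$REDIS_PASSWORD" INCR x   # x becomes 11
}
```

The contrast with the earlier hand-built container is the point of the demo: one `helm install` stands in for writing a Dockerfile, building, pushing, and creating the deployment.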

Because the list is getting pretty big, you can also do a search and say, hey, I want anything that has the keyword Redis in it, and it'll give me the stable charts for Redis. So that's essentially how we use Helm charts. I think we can move back to the actual presentation, the slides, again. So that was Helm. Our first hello world application was pretty easy, right? And using Helm charts was pretty easy. It was pretty easy to build a container. We pushed it up to the Container Registry. We ran it in Container Engine. Using Helm, we were able to deploy a lot of the common applications in containers. But as you start writing more and more applications and using containers, your applications will likely start getting a lot more complex than the examples I gave today, right? Inevitably you'll come across a need to start sharing data, to mount volumes across multiple pods. This is where persistent volumes come in. So I did want to talk through how we use persistent volumes in Kubernetes.

So let's start by identifying the problem that we're trying to solve. First, we have this basic unit of scheduling in Kubernetes, which is a pod, right? Looking at Kubernetes concepts for a second, we had the containers that we built earlier, and containers in Kubernetes are organized in pods. A pod consists of one or more containers, right? And typically a container doesn't share its file system with anybody else. So containers normally don't share their file systems, but you can have a pod that has multiple containers in it. And when you start organizing your pods, one of the questions I do get sometimes is, well, which containers should be inside a pod? It kind of depends on you. If you want them to share things, like the same network, and you want to guarantee that they're going to be running on the same node, then yes, put them together. Some examples might be if you have an NGINX container, and you're running a PHP application, so you have a PHP container.

You might want to combine those two because you're going to scale them up together. But if you had a database container and a front-end container, you may not, because you might want to scale your databases differently than your front end, right? So what we've looked at here is that pods are composed of these multiple containers. But it would be really nice if these different containers within a pod could still share the same volume, for speed and for a lot of different purposes, right? The second thing is that containers are ephemeral. We saw that when we had four replicas over there: whenever I updated the code, all it really did was bring up four new pods and kill the old ones, right? So these pods are short-lived units of work. Now imagine there's a piece of storage attached to one of them. If that container crashes or the machine is restarted, you end up losing all of your data, right?

With the Hello World app that we had, we didn't really use data, so it didn't really matter. But for a more complex application that passes data back and forth, this can be a big problem. If you imagine trying to run a stateful type of application, whether it be a database or some other larger application like that, having your data disappear is just a nonstarter, right? It's not going to work for what we want to do. So those are the problems, and this is where Kubernetes volumes come in. At its core, a volume is just a directory, possibly pre-populated with some data, that's accessible to all the different containers within a pod, right? How that directory comes to be, the medium that backs it, its contents, and what happens after a pod is terminated are all implementation details that we'll talk about. It depends on the type of volume.

And you can configure all of that. If we take a closer look, the Kubernetes volume plugins can be largely categorized into two types. The first type is persistent volumes. There's obviously support for a lot of different persistent volumes out there, right? These are storage that outlives the lifecycle of a pod. It could be anything from an NFS share, which you see up there, to a cloud provider's virtual disks. The lifecycle of the storage is independent from the lifecycle of the pod, so if the pod dies the storage is still alive. And that's great. When a pod referencing one of these volumes is created, the storage is made available inside it: it's attached, it's mounted. Then, when the pod is terminated, the volumes are unmounted and may be detached, but the data is still there, right? So the key is that the data persists. If another pod comes along and references the same storage volume, with the same mount point, then that storage will still have the data there.

So it's pretty convenient to be able to have a persistent data store, right? The ephemeral side is what we've been talking about till now, and it's what's used in pods and in a lot of volumes today, where as soon as the pod is created, you create this volume, but when the pod is deleted the volume just disappears, right? It's usually a scratch space, a shared space that can be shared between the containers in a pod. But it's going to go away, so the type of data that you would put in there is usually temporary data. Let's walk through an example to help clarify a lot of this, right? Here we have a pod. A pod, as we said earlier, is the basic unit of scheduling in Kubernetes, and it contains one or more containers. In this case, we have a single container. We called it Sleepy Container here, and that's the Google Container Registry image there. It's just a container that sleeps; that's really all it does.

But if we look at how to set up storage for this container, we configure storage there, right? In this case we used a Google Compute Engine persistent disk. We called it Panda Disk. What this file assumes is that the disk has already been provisioned. This disk already exists someplace, and the pod is going to reference it directly, right? The pod also defines where we should mount that disk. In this example, all we did was say, hey, we want to mount that volume that we see up there, that Panda Disk, at /data. It seems pretty trivial, but there's actually a lot going on behind the scenes. One of the concepts in Kubernetes is that you have these nodes that are running pods. When your pod is assigned to a node, Kubernetes knows to make an API call to say, hey, where is this persistent disk? It attaches that persistent disk to that node, and then makes it available to your pod.
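The pod definition on the slide is not included in the transcript. A manifest along the lines described, with a single sleeping container, a pre-existing Compute Engine persistent disk referenced directly, and a mount at /data, might look like the sketch below. The object names, the `busybox` image (the talk used an image hosted on Container Registry), the sleep duration, and the file name are all assumptions.

```shell
# Write a pod manifest that references an existing GCE persistent disk
# directly. All names here are illustrative, not taken from the slide.
cat > sleepy-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: sleepypod
spec:
  volumes:
    - name: data
      gcePersistentDisk:
        pdName: panda-disk      # disk must already exist
        fsType: ext4
  containers:
    - name: sleepycontainer
      image: busybox            # assumed stand-in for the talk's image
      command: ["sleep", "6000"]
      volumeMounts:
        - name: data
          mountPath: /data      # where the disk appears in the container
EOF

# With a cluster and the disk in place: kubectl create -f sleepy-pod.yaml
```

The `volumes` entry says which storage to use and the `volumeMounts` entry says where it appears inside the container; the two are linked by the shared volume name `data`.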

Now, one of the things that Kubernetes can also do very well, and it's a strength of Kubernetes, is that if a node is having issues, it will create a new node, drain the failing node, and move your pod to another node, right? It automatically handles all that for you. So typically most Kubernetes developers don't really care where their containers are running, as long as they're running somewhere. In that sense, your pod might be moved to a new node. When that happens, Kubernetes is smart enough to automatically detect the change, detach the disk from the first node that was having a problem, and attach it to the other node. Any time your pod moves nodes, your volumes will follow it, and Kubernetes always makes sure that it tracks the volume and attaches it to the right node. So there's a lot going on behind the scenes in there, and Kubernetes takes care of it. This is pretty cool, and it works, but we can actually do a lot better than this.

And that's really what persistent volumes are. Persistent volumes are how Kubernetes manages volumes, and it's a little bit better than what we just looked at. A persistent volume is an abstract storage layer that is used and managed by Kubernetes. The way to describe this is that we have two different roles, which could be the same person within your organization. The first role is the cluster administrator, who creates a number of persistent volumes that are available. So the admin provisions them. Then we have users that claim the volumes that they want to use. In some ways, it's similar to how an admin might create a cluster with a bunch of nodes, and then users say, I want replicas and pods running, and they automatically get assigned to these nodes. In the same sense, a user says, I need, say, 100 gigabytes. And if there's a volume with 100 gigabytes available, then the claim will get bound to it, right? So static provisioning is a fairly simple use case.

In that case, if you make a claim and there's a PV available, it'll attach it to that. Dynamic provisioning, which I'm going to go into a little more detail later, works through a thing called StorageClass. But that's actually pretty cool because, hey, we all sort of work and live in this world of cloud now, right? Does it really make a lot of sense to pre-provision storage when, ideally, a user just says, I need 100 gigs? We don't say, hey, is there a 100-gig volume available for me to attach to this user? Instead you say, well, OK, I'm going to go out and get 100 gigs for you now. Find the storage for you somewhere in the cloud. And then attach it, right? So that's where dynamic provisioning comes in. And it's a new sort of unique feature that's within Kubernetes. You know, we alpha launched it in 1.2. We beta launched it in 1.4. We're ready to make this generally available with 1.6, which is ready to launch in a few weeks.

At the end of March is when the 1.6 release date is. And that's when dynamic provisioning will be generally available as well. So let's walk through this a little bit. So this is a simple case of static provision and release. So we start here with the cluster administrator. This is what a yaml file looks like for the persistent volumes. You know, we see here that there are two that they have set up. One is a 10-gig volume there. And then we have another one that's 100 gigs. So the cluster administrator said, hey, these are the persistent volumes that I have available. There's two for you, right? And if you run that kubectl create command on that yaml file, and then you run a kubectl get pv, it says, hey, these two both have a status of Available. So they're not actually assigned to anybody. So you know, pretty straightforward. Someone basically said, hey, I have these two volumes available for anyone to use, right? So in other words, the cluster administrator creates persistent volumes and makes them available for use.
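The slide itself is not reproduced in this transcript, so the following is only a minimal sketch of what such a file might look like, assuming two pre-created GCE persistent disks. The volume names and disk names are hypothetical; only the two sizes (10 and 100 gigs) come from the talk.

```yaml
# Sketch of two statically provisioned persistent volumes.
# Names (pv-10g, pv-100g) and pdName values are hypothetical placeholders.
# Assumed usage: kubectl create -f pv.yaml, then kubectl get pv
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-10g
spec:
  capacity:
    storage: 10Gi            # the 10-gig volume mentioned in the talk
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: disk-10g         # assumes this GCE disk already exists
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-100g
spec:
  capacity:
    storage: 100Gi           # the 100-gig volume mentioned in the talk
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: disk-100g        # assumes this GCE disk already exists
    fsType: ext4
```

Until a claim is bound to them, both volumes would be expected to show up as Available in the kubectl get pv listing.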

Now we have this user that comes along. And the user creates what we call a persistent volume claim, right? This user is saying, hey, I have a persistent volume claim that I'd like to have. I need 100 gigs of storage. So this is the claim that they're making. And so we have a couple of things here. We have the persistent volumes that were defined in that yaml. Here we have a persistent volume claim that's being made against them, right? So the user creates a persistent volume claim. And now if we take a step back and look at what happened, well, first we created the persistent volumes, the two volumes that we saw. Next we created the persistent volume claim. And now if we do a kubectl get pv we see that, hey, you know the claim was for 100 gigs. So we see that the 100-gig volume that was created is now bound, and it's claimed by that claim, which is the one that shows mypvc. So here we've statically bound the claim and the volume together. So if another user comes along with another claim for 100 gigs this volume is not available to that user at that point, right?
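As a rough illustration of the claim being described, a persistent volume claim for 100 gigs might look like the sketch below. The claim name follows the "my PVC" name used in the talk; the access mode is an assumption.

```yaml
# Sketch of a persistent volume claim asking for 100 gigs.
# The name follows the talk; accessModes is an assumed value.
# Assumed usage: kubectl create -f pvc.yaml, then kubectl get pv
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi         # the binder looks for a volume that can satisfy this
```

Note that the claim says nothing about which disk backs it. It only states how much storage is needed and how it will be accessed.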

So the binder works to actually make that match. And you know, it's fairly straightforward in this case. They needed 100 gigs, and there happened to be 100 gigs available. So it was made available there. So now we end up sort of with this happy state where the user then creates a pod, and in the pod they say which claim they want. And I think I have an example of that yaml file as well. And this is what that looks like. So the pod that we had before, if you remember, it had that GCE persistent disk, that Panda disk. So at that point we had to make this assumption that Panda disk already exists. And that it's already been procured. And it's already been set up. We've abstracted this now to say, well, we don't have to make that assumption anymore. All we're saying is that we have this PV claim that we want, and so we know that in this other yaml file we had identified the PV claim to be 100 gigs. And so all we're saying here is I need that. I reference a claim.
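A pod that references the claim rather than a specific disk could be sketched roughly as follows. The pod name, container image, and mount path are hypothetical placeholders; only the idea of pointing at the claim by name comes from the talk.

```yaml
# Sketch of a pod that mounts storage through a claim instead of a named disk.
# Pod name, image, and mountPath are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: claim-demo
spec:
  containers:
    - name: app
      image: nginx             # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data     # where the volume appears inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mypvc       # the claim, not the underlying disk
```

The design point is that nothing in this file names a GCE disk, so the same pod definition should work wherever a claim with that name exists.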

So this pod is now referencing the claim itself. And it's no longer referencing the actual sort of mounted piece of persistent disk that we needed. So this is a lot better than what we had before, because now we've sort of abstracted this out. You can take this file, move it to any other server, and as long as you reference the claim then it'll just find a disk. If Panda disk were to go away and get deleted, all your pods that are referencing it won't start breaking, right? In this case. Because the claim can get moved from one PV to another PV fairly easily. So what we kind of saw here was we have a pod that referenced that claim. The claim is also tied to the persistent volume. And so the example here is, let's say the pod writes a piece of data. So now we have that little star in that folder up there because we have a specific piece of data that was written by that pod through the persistent volume claim to the persistent volume, right? But now let's get a little bit interesting here where the user says, well, I want to delete the pod.

I don't want the pod anymore. So if the user goes out and actually deletes this pod, well, what happens, right? Well, really, I mean, nothing really happens in a sense, because, as I mentioned earlier, we have pods, and we have persistent volume claims, and then we have persistent volumes. The persistent volume claim and the persistent volume are still bound together, right? The pod referenced the claim. But if the pod is gone, then fine. There's no reference to the claim. But the claim itself is still tied to the volume. So the data doesn't go away. And the claim is still sitting there. We just don't have any pods that are using that claim, right? So that's really what this illustrates: the pod may go away, but the persistent volume claim is still sitting there. It's still bound to the same persistent volume that we had before. And the data is not lost, right? And then let's say a user creates a new pod again. It could be another pod, or it could be the same pod that they bring up again.

As long as it references that same mypvc, which is the PV claim, then it will attach to that same piece of persistent volume again. So now we're sort of back to our happy state again where we were before, where the pods were restarted, new pods came up, yet the data has never been lost. So that's pretty cool. All right. And that's really the state that we have here, again, right? But let's get a little bit more fancy now. And let's delete the pod. So let's say the user deletes the pod. And let's also say the user deleted the claim. Now we have an issue here, right? Because, well, the claim itself is gone, so now it's no longer bound. So if we did a kubectl get pv, that 100-gig volume that was bound before is no longer bound anymore. So the data itself at this point is actually deleted. So there are a couple of ways that we can reclaim the volume here. The default way is we recycle it. I mean, essentially, behind the scenes, it does an rm -rf of all the data in there.
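What happens to a volume after its claim is deleted is controlled per volume by the persistentVolumeReclaimPolicy field. The sketch below sets it explicitly to the recycle behavior described here. Recycling is, as far as I know, only supported by some volume types, so this example assumes an NFS-backed volume; the server and path are hypothetical, and which policy applies when the field is omitted may depend on the Kubernetes version and on how the volume was created.

```yaml
# Sketch of a persistent volume with an explicit reclaim policy.
# The NFS server and path are hypothetical; NFS is assumed because
# recycling is not supported by every volume type.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-recycled
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle   # other accepted values: Retain, Delete
  nfs:
    server: nfs.example.internal           # hypothetical server
    path: /exports/data                    # hypothetical export path
```

Setting the policy to Retain instead would leave the data in place for an administrator to deal with by hand.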

So the data is gone, right? From that perspective. And then it's going to be made available to other claims that decide to use that, right? So that's essentially what happens to that volume, in that sense. So that's really kind of the static case that we had with static provisioning of persistent volumes and persistent volume claims. Now, I did mention earlier that, you know, we have dynamic provisioning, right? So it's pretty cool. It's late-breaking type stuff. I mean, this is stuff that's coming out in Kubernetes. It's generally available, you know, in a few weeks. And you know, like I said earlier, living in the world of cloud, we shouldn't have to pre-provision volumes anymore, right? I mean, that's something that maybe we had to do years ago, where, you know, you had someone that was setting up these persistent volumes saying, somebody might come along, and they might need 10, 100, 500 gigs, and I'm just going to allocate a bunch of them to be available for my users to use.

Well, rather than having that paradigm, we should be able to go through, and it should be all on demand now, right? If a user says whatever they need, you know, you should be able to make API calls to a cloud storage type of provider, and dynamically, you know, build a volume, and then attach it for that person. And this isn't only cloud. It can work within an organization that uses NFS mounts. You might have storage available, so you can create the right amount of storage and then mount it for them. That's fine, too, even if you're not within Cloud, right? But you shouldn't have to pre-provision everything just on the chance that someone's going to ask for it. So that's really what dynamic provisioning is here to address. Now, the key concept with dynamic provisioning is that it uses what's called the storage class, right? So in the example here we set up two different storage classes. We set up a storage class for slow, which is essentially a sort of regular disk.

It's pd-standard. And in the case here, with the provisioner, we set it to be kubernetes.io/gce-pd. So it's a Google Compute Engine Persistent Disk. The parameters section in this file is essentially just an opaque blob of name-value pairs that we just pass on to whoever the provisioner is. So essentially, the reason why that's important is because whether that's a Google Compute Engine Persistent Disk, or any other provider, in that parameters section you can put whatever you want, depending on whatever the provider is expecting. So if the provider decides to update their parameters with some new stuff, you can just add it in there. And it will automatically be updated. There's no need to update Kubernetes, or anything like that. And in this case, we set up a slower one for that. And then we have an SSD on the bottom, which is fast. You know, so essentially the cluster administrator, and like I said earlier, the cluster administrator and the user might be the same person.
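The two storage classes described here, slow on standard disks and fast on SSD, can be sketched as follows. The provisioner and the disk types come from the talk; the apiVersion shown is an assumption based on the 1.6 release the talk refers to, and earlier releases used a beta API group.

```yaml
# Sketch of the "slow" and "fast" storage classes described in the talk.
# apiVersion assumes Kubernetes 1.6 or later.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/gce-pd   # Google Compute Engine Persistent Disk
parameters:
  type: pd-standard                 # passed through to the provisioner as-is
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd                      # SSD-backed persistent disk
```

Everything under parameters is interpreted by the provisioner rather than by Kubernetes itself, which is why a provider can accept new keys without a Kubernetes change.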

There's not necessarily a need for them to be different. But in the example here they're different roles. So in this case they've set up a slow and a fast. And to the users they're saying, hey, if you want storage and you want to dynamically provision it, tell me if you want slow storage or fast storage, right? And it depends on what it is you need, right? Maybe if it's some logging of all the traffic that's coming in, slow is fine, right? But if you want something that is going to react a lot faster, where performance is much more important, maybe you want something that's fast. But you leave it up to the developer to choose that, right? So that's what that looks like. And with dynamic provisioning the users now consume storage the same way that they did before. But instead of the claim just saying how big it is, we have this new thing added that says storageClassName. And here we said fast, right? So if we don't have that storageClassName, then it'll just look for any existing statically assignable persistent volumes.
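A claim that asks for dynamically provisioned fast storage might be sketched like this. The claim name, size, and access mode are hypothetical; the storage class name fast is the one used in the talk.

```yaml
# Sketch of a claim that requests dynamic provisioning from the "fast" class.
# Claim name, size, and access mode are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-claim
  # Before 1.6 the class was requested with an annotation instead:
  #   annotations:
  #     volume.beta.kubernetes.io/storage-class: fast
spec:
  storageClassName: fast       # must match the name of a storage class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

A pod consumes this claim exactly as it would a statically bound one, by referencing the claim name.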

But if we have that storageClassName in there, then it's going to look for that name among the storage classes that we looked at earlier, to make sure that's there. There was a change made between Kubernetes version 1.5 and 1.6. And that's why it looks a little bit different there. But you know, 1.6 is the new version. And that's where we made that change to say storageClassName, instead of that volume.beta.kubernetes.io/storage-class annotation that we had before. So that's really how dynamic provisioning works. It's really kind of the next step in what we're doing with volumes. So if this stuff was pretty interesting to you, and I hope it really was, please get involved in the community. The Slack channel is up there. The GitHub code repository is over there. It's open source. So anybody can go into that code repository. You can download Kubernetes. You can try it out. You can also be a contributor as well. So you know, the Slack channel especially is full of people all day long talking about Kubernetes.

So you know, if that's your interest, there are a lot of like-minded folks over there. So it's a pretty interesting group. Now, also, there it is. So there are a couple other talks here. Hopefully you're signed up for some of them. Right after this talk, across the hall, Kelsey Hightower's basically taking this talk to the next level. You know, we talked about creating a container, building it, running it. He's talking about, well, what would you do if you wanted to run that container in a production environment? So he talks a lot about securing that container. Some of the namespaces that we want to use. Liveness checks to make sure your containers are always up and running. Things like auto-scaling your pods, so that if you get a sudden spike in traffic, you know, middle of the night, you're not getting paged. The system's auto-scaling everything for you, right? So he's really talking about how to take your containers from development, which we talked about, into actual production.

There are also some more talks about Container Engine today, as well as tomorrow. I think I talked about this a little bit earlier. But upstairs on the third floor we have Whack-A-Pod. We also have Meet the Experts up there. If you have any further questions, you can also talk to me at any time as well. Again, my name is Dan Paik. And I'm one of the product managers on the team. And Whack-A-Pod's a pretty fun game you can play up there, where we have nine pods running. As you start whacking each one the pods go down. We basically delete the pod. But Kubernetes will keep bringing them back up again. So you can see that, you know, it's possible if you're really fast to bring your application down. But it's usually not down for more than a second or so as it keeps coming up. And the Node.js app that we went over earlier today in the first part of this, the link is over there if you want to take a look at that. If you wanted to run that yourself at home, and play with it in Container Engine, you can always do that.

It's free to do if you wanted to play around with that. So that's really kind of what I wanted to talk to you all about today. I wanted to make sure I left about 15 minutes or so for questions. So if there are any questions I'm always happy to take them. Hopefully I can answer them. But yeah, so there are two mics out there if anyone has questions. Or if not, you can always feel free to walk up and talk to me as well. Thank you very much. [MUSIC PLAYING]



In this video, you’ll learn how to launch your app in minutes with Helm Charts and services in Google Container Engine.

