Cloud Run is a platform on Google Cloud that lets you build scalable and reliable web-based applications. As a developer, you can get very close to being able to just write your code and push it, and then let the platform deploy, run, and scale your application for you.
The public cloud has given developers and businesses the opportunity to turn physical servers and data centers into virtual ones, greatly decreasing lead time and turning big, up-front investments in hardware into ongoing operational expenses. For most businesses, this is already a great step forward.
However, virtual machines and virtual networks are still a relatively low-level abstraction. You can take an even bigger leap if you design your application to take full advantage of the modern cloud platform. Cloud Run provides a higher level of abstraction over the actual server infrastructure and allows you to focus on code, not infrastructure.
Using the higher-level abstraction that Cloud Run provides doesn’t mean you tie yourself to Google Cloud forever. First, Cloud Run requires your application to be packaged in a container—a portable way to deploy and run your application. If your container runs on Cloud Run, you can also run it on your own server, using Docker, for instance. Second, the Cloud Run platform is based on the open Knative specification, which means you can migrate your applications to another vendor or your own hardware with limited effort.
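To make that portability concrete, here is a minimal sketch of a Dockerfile for a Go application (the file layout, base images, and port are assumptions for illustration, not something Cloud Run prescribes):

```dockerfile
# Build stage: compile a static binary (assumes a main package in the repo root).
FROM golang:1.21 AS build
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /server .

# Runtime stage: a minimal base image containing only the binary.
FROM gcr.io/distroless/static
COPY --from=build /server /server
ENTRYPOINT ["/server"]
```

The same image you deploy to Cloud Run runs anywhere containers run; locally, for example, with docker build and docker run.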
You probably bought this book because you are interested in building a serverless application. It makes sense to be specific about what application means, because it is a very broad term; your phone runs applications, and so does a server. This book is about web-based applications, which receive requests (or events) over HTTPS and respond to them.
Examples of web-based applications include the sites you interact with using a web browser and APIs you can program against. These are the two primary use cases I focus on in this book. You can also build event processing pipelines and workflow automation with Cloud Run.
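If that definition sounds abstract, here is a minimal sketch of such a web-based application in Go. On Cloud Run, the container is expected to listen on the port passed in the PORT environment variable:

```go
// A minimal web-based application: an HTTP server that receives
// requests and responds to them.
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// Cloud Run passes the port to listen on in $PORT;
	// 8080 is a sensible default for running locally.
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, world!")
	})
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```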
One thing I want to emphasize is that when I say HTTP, I refer to the entire family of HTTP protocols, including HTTP/2 (the more advanced and performant version). If you are interested in reading more about the evolution of HTTP, I suggest you read this well-written overview at MDN.
Now that I have scoped down what “application” means in the context of this book, let’s take a look at serverless. If you use serverless components to build your application, your application is serverless. But what does serverless mean? It’s an abstract and overloaded term that means different things to different people.
When trying to understand serverless, you shouldn’t focus too much on the “no servers” part—it’s more than that. In general, this is what I think people mean when they call something serverless and why they are excited about it:

- No infrastructure management: you write code, and the platform takes care of deploying, running, and scaling it.
- Automatic scalability: capacity closely tracks demand, up and down, without manual intervention.
- Pay-per-use: you pay for actual usage, not for the preallocation of capacity.
In the next sections, I’ll explore these three aspects of serverless in more depth.
Eliminating infrastructure management means you can focus on writing your code and have someone else worry about deploying, running, and scaling your application. The platform will take care of all the important and seemingly simple details that are surprisingly hard to get right, like autoscaling, fault tolerance, logging, monitoring, upgrades, deployment, and failover.
Infrastructure management is the one thing you specifically don’t have to do in the serverless context: the platform offers an abstraction layer over the servers, and that is the primary reason we call it serverless.
When you are running a small system, infrastructure management might not seem like a big deal, but readers who manage more than 10 servers know that this can be a significant responsibility that takes a lot of work to get right. Here is an incomplete list of tasks you no longer need to perform when you run your application logic on a serverless platform:

- Provisioning and configuring servers
- Patching and upgrading operating systems
- Configuring networks, firewalls, and load balancers
- Planning capacity and adding or removing servers
- Replacing failed hardware
And that’s just about servers! Most businesses have higher and higher expectations for system availability. More than 30 minutes of downtime per month is generally unacceptable. To reach these levels of availability, you will need to automate your way out of every failure mode—there is not enough time for manual troubleshooting. As you can imagine, this is a lot of work and leads to more complexity in your infrastructure. If you build software in an enterprise environment, you’ll have an easier time getting approvals from security and operations teams because a lot of their responsibilities shift to the vendor.
Availability is also related to software deployments now that it is more common to deploy new software versions on a daily basis instead of monthly. When you deploy your application, you don’t want to experience downtime, even when the deployment fails.
Serverless technology helps you focus on solving your business problems and building a great product while someone else worries about the fundamentals of running your app. This sounds very convenient, but you shouldn’t take this to mean that all your responsibilities disappear. Most importantly, you still need to write and patch your code and make sure it is secure and correct. There is still some configuration you need to manage, too, like setting resource requirements, adding scaling boundaries, and configuring access policies.
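As an illustration of the configuration that does remain, here is a sketch using the gcloud CLI; the service name, image, and values are placeholders, not recommendations:

```shell
# Deploy a container with explicit resource requirements, a scaling
# boundary, and a restrictive access policy (placeholder names).
gcloud run deploy my-service \
  --image gcr.io/my-project/my-app \
  --memory 512Mi \
  --max-instances 10 \
  --no-allow-unauthenticated
```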
Serverless products are built to increase and decrease their capacity automatically, closely tracking demand. The scale of the cloud ensures that your application will be able to handle almost any load you throw at it. A key feature of serverless is that it shows stable and consistent performance, regardless of scale.
One of our clients runs a fairly popular soccer site in the Netherlands. The site has a section that shows live scores, which means they experience peak loads during matches. When a popular match comes up, they provision more servers and add them to the instance pool. Then, when the match is over, they remove the virtual machines again to save costs. This has generally worked well for them, and they saw little reason to change things.
However, they were not prepared when one of our national clubs suddenly did very well in the UEFA Champions League. Contrary to all expectations, this club reached the semifinals. While all soccer fans in the Netherlands were uniting in support of the team, our client experienced several outages, which couldn’t be solved by adding more servers.
The point is that, while you might not feel the drawbacks of a serverful system right now, you might need the scalability benefits of serverless in the future when you need to handle unforeseen circumstances. Most systems have the tendency to scale just fine until they hit a bottleneck and performance suddenly degrades. The architecture of Cloud Run provides you with guardrails that help you avoid common mistakes and build more scalable applications by default.
The cost model of serverless is different: you pay for actual usage only, not for the preallocation of capacity. When you are not handling requests on a serverless platform, you pay nothing. On Cloud Run, you pay for the system resources you use to handle a request with a granularity of 100 ms and a small fee for every million requests. Pay-per-use can also apply to other types of products. With a serverless database, you pay for every query and for the data you store.
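To show what the 100 ms granularity means in practice, here is a small sketch in Go that rounds a request’s duration up to the next billable increment (the rounding rule follows from the granularity described above; prices are deliberately omitted):

```go
package main

import (
	"fmt"
	"time"
)

// billableDuration rounds d up to the next 100 ms increment, the
// granularity at which Cloud Run bills request-handling time.
func billableDuration(d time.Duration) time.Duration {
	unit := 100 * time.Millisecond
	return ((d + unit - 1) / unit) * unit
}

func main() {
	for _, d := range []time.Duration{
		30 * time.Millisecond,  // billed as 100ms
		130 * time.Millisecond, // billed as 200ms
		time.Second,            // billed as 1s (already a multiple)
	} {
		fmt.Printf("actual %v -> billed %v\n", d, billableDuration(d))
	}
}
```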
I present this with a caveat: I am not claiming that serverless is cheap. While most early adopters seem to experience a cost reduction, in some cases, serverless might turn out to be more expensive. One of these cases is when you currently manage to utilize close to 100% of your server capacity all of the time. I think this is very rare; utilization rates of 20 to 40% are much more common. That’s a lot of idle servers that you are paying for.
The serverless cost model provides the vendor with an incentive to make sure your application scales fast and is always available. They have skin in the game.
This is how that works: you pay for the resources you actually use, which means your vendor wants to make sure your application handles every request that comes in. As soon as your vendor drops a request, they potentially fail to monetize their server resources.
People often associate serverless with functions as a service (FaaS) products such as Cloud Functions or AWS Lambda. With FaaS, you typically use a function as “glue code” to connect and extend existing Google Cloud services. Functions use a runtime framework: you deploy a small snippet of code, not a container. In the snippet of code, you implement only one function or override a method, which handles an incoming request or an event. You’re not starting an HTTP server yourself.
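To make the contrast concrete, here is a minimal sketch of what such a snippet can look like in Go (the package and function names are placeholders): you implement a single handler, and the runtime framework, not your code, starts the HTTP server.

```go
// Package hello contains a FaaS-style HTTP function. Note that there is
// no main function and no http.ListenAndServe call; the platform's
// runtime framework receives requests and invokes the handler.
package hello

import (
	"fmt"
	"net/http"
)

// HelloHTTP handles one incoming HTTP request.
func HelloHTTP(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "Hello from a function!")
}
```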
FaaS is serverless because it has a simple developer experience—you don’t need to worry about the runtime of your code (other than configuring the programming language) or about creating and managing the HTTPS endpoint. Scaling is built in, and you pay a small fee per one million requests.
As you will discover in this book, Cloud Run is serverless, but it has more capabilities than a FaaS platform. Serverless is also not limited to handling HTTPS requests. The other primitives you use to build your application can be serverless as well. Before I give an overview of the other serverless products on Google Cloud, it’s now time to introduce Google Cloud itself.
Google Cloud started in 2008 when Google released App Engine, a serverless application platform. App Engine was serverless before we started using the word serverless. However, back then, the App Engine runtime had a lot of limitations, which in practice meant that it was only suitable for new projects. Some people loved it, some didn’t; notable customer success stories include Snapchat and Spotify, but overall App Engine got limited traction in the market.
Prompted by this lukewarm reaction to App Engine and a huge market demand for virtual server infrastructure, Google Cloud released Compute Engine in 2012. (That’s a solid six years after Amazon launched EC2, the product that runs virtual machines on AWS.) This leads me to believe that the Google mindset has always been serverless.
Here’s another perspective: a few years ago, Google published a paper about Borg, the global container infrastructure on which they run most of their software, including Google Search, Gmail, and Compute Engine (that’s how you run virtual machines). Here’s how they describe it (emphasis mine):
Borg provides three main benefits: it (1) hides the details of resource management and failure handling so its users can focus on application development instead; (2) operates with very high reliability and availability, and supports applications that do the same; and (3) lets us run workloads across tens of thousands of machines effectively.
Let this sink in for a bit: Google has been working on planet-scale container infrastructure since at least 2005, based on the few public accounts on Borg. A primary design goal of the Borg system is to allow developers to focus on application development instead of the operational details of running software and to scale to tens of thousands of machines. That’s not just scalable, it’s hyper-scalable.
If you were developing and running software at the scale that Google does, would you want to be bothered with maintaining a serverful infrastructure, let alone worry about basic building blocks like IP addresses, networks, and servers?
By now it should be clear that I like to work with Google Cloud. That’s why I’m writing a book about building serverless applications with Google Cloud Run. I will not be comparing it with other cloud vendors, such as Amazon and Microsoft, because I lack the deep expertise in other cloud platforms that would be required to make such a comparison worth reading. However, rest assured, the general application design principles you will pick up in this book do translate well to other platforms.
It’s worth noting that Google has a particular advantage when it comes to environmental sustainability. I’m writing this book in 2020, a year with a global pandemic, record-breaking heat waves all over the world, and a wildfire season that never seems to stop. Hurricanes, lightning storms, floods, and snowstorms are getting more extreme and are undeniably linked to human activity and the way we produce and consume energy. Data center infrastructure consumes a lot of energy to run servers—and to keep them cool.
Google Cloud has been carbon neutral since 2007 and has matched its energy use with 100% renewable energy since 2017. Their aim is to be completely carbon free by 2030.
Carbon neutrality is achieved by buying carbon offsets, and critics say it’s not clear that carbon offsets actually reduce carbon emissions. A renewable energy match means you still draw energy from fossil fuels at times but match that usage with clean energy purchases (which does reduce carbon emissions). Carbon free means you use only carbon-free energy at every point in time.
According to a 2019 Wired report, neither Microsoft Azure nor Amazon is close to achieving 100% renewable energy match, and Amazon is being criticized for not even providing a clear timeline toward that goal.
Take a look at Table 1-1 for an overview of Google Cloud serverless products that relate to application development. I’ve also noted each product’s open source compatibility, to indicate whether there are open source alternatives you can host with a different provider. This is important, because vendor lock-in is the number-one concern with serverless. Cloud Run does well on this aspect because it is API compatible with an open source product.