12 Jul 2016

Six Reasons Why Your Cloud Strategy Must Include a Plan for Change

We have seen the same story play out over the years. A new disruptive technology appears and companies dive headfirst into learning and then implementing it. However, very little attention is paid to the human impacts.

In 2016, here we go again. Companies are embracing cloud computing along with DevOps, infrastructure as code, and possibly even microservices and containers, yet once again we see people and process issues slowing down broad adoption. When will we learn?

Why is it so important to focus on organizational change when adopting cloud computing within a large organization? Let’s look at a few examples of how the cloud changes the way we operate.

To meet the business demand of getting new services to market faster, many enterprises are embracing the DevOps movement. Enterprises that successfully move toward an agile delivery model are able to deploy smaller change sets to production more frequently. Traditionally, enterprises subscribed to quarterly or biannual release cycles. These legacy deployment models accommodated large testing windows, numerous manual review gates and many planning sessions.

This mindset change, from deploying large monolithic applications only a few times a year to deploying small services weekly or biweekly, drastically changes the operating model required to manage and run the applications. Testing windows are heavily compressed, which drives the need for more automated testing. There is no longer time for several manual gates to review architecture, security, quality, etc.

Rapid deployments drive the need for high levels of automation, self-service provisioning, continuous security monitoring in production, proactive monitoring and many other modern-day practices. Handoffs between silos must give way to a more collaborative approach to building and deploying software.

Many enterprises think that implementing continuous integration (CI) and continuous delivery (CD) will solve all their problems, and focus only within their development silo. Although CI/CD may greatly reduce the time it takes to build software, it does not address the processes that occur before and after the build, which are typically loaded with waste. DevOps is about the end-to-end software development lifecycle (SDLC), not just CI/CD. Enterprises must factor in the process and people changes required to make the entire SDLC flow optimally from left to right.

As companies shift from delivering large monoliths to delivering services, a major mindset shift is required across the company. With a product, once development delivers it, the developers are done with it and another team typically takes over the maintenance and support of the application. When delivering a service, development is never done as long as that service is being consumed by customers. At a Puppet conference two years back, Mike Stahnke of Puppet delivered this classic quote about developing services: “you are never done until your last user is dead.”

With services, the days of throwing code over the wall to be someone else’s problem are over. The developers need to autonomously manage and maintain their services, which means they need application performance monitoring (APM) and logging tools to get fast feedback and alerts. Monitoring becomes a proactive practice rather than a reactive one. Since many different applications and users may rely on a service, it must always be up. Operations teams still watch over the infrastructure, but developers watch over their services.

This new approach radically changes the silo-based operating models of the past and requires closer collaboration between developers and operations. Team structures align to products and services, with small teams that include a broad spectrum of domain expertise and skillsets. Functional silos don’t work well in a service-oriented world.

Traditional vertical architectures are made up of two or three tiers (web, app, database), and capacity planning is required in order to allocate enough compute and storage resources to scale adequately. Scaling vertically means adding hardware or hardware components to the existing infrastructure. This task is performed by hardware experts. Applications are built with the expectation that the hardware will always be there, and an army of people is charged with making sure that assumption is a reality.

In the new model, infrastructure is code and architectures are built to scale horizontally. Hardware is treated as a commodity and it is expected to fail. Architecting in the cloud means building hardware-agnostic software that can automatically recover as compute nodes go offline and new nodes come online. This approach is often associated with immutable infrastructure, in which nodes are replaced rather than repaired or modified in place.
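
To make the recovery idea concrete, here is a minimal sketch of a reconciliation loop that replaces failed nodes instead of repairing them. The `NodePool` class, its method names and the `node-N` naming scheme are all hypothetical and exist only for illustration; a real platform would call its cloud provider's provisioning API where this sketch merely records an ID.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class NodePool:
    """Hypothetical pool that keeps a desired number of healthy nodes."""
    desired: int
    healthy: set[str] = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def launch(self) -> str:
        # Stand-in for a provisioning call: a brand-new node is created
        # from a known blueprint rather than an old one being patched.
        node_id = f"node-{next(self._ids)}"
        self.healthy.add(node_id)
        return node_id

    def mark_failed(self, node_id: str) -> None:
        # A failed node is dropped from the pool, never repaired in place.
        self.healthy.discard(node_id)

    def reconcile(self) -> list[str]:
        # Launch replacements until the pool is back at the desired size.
        launched = []
        while len(self.healthy) < self.desired:
            launched.append(self.launch())
        return launched


pool = NodePool(desired=3)
initial = pool.reconcile()        # fills an empty pool
pool.mark_failed("node-2")        # simulate a hardware failure
replacements = pool.reconcile()   # a fresh node takes its place
print(initial, replacements, sorted(pool.healthy))
```

The application never depends on a specific machine surviving; it only depends on the pool being brought back to its desired size.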

Not only is this a drastic change to how architects and developers must approach software development, it is even more drastic a change to how applications are monitored, managed, secured and audited. Focusing on the technology alone and ignoring the political and social aspects of this change is a recipe for disaster. The shift to immutable infrastructure and distributed architectures disrupts traditional organizational structures and responsibilities.

Highly regulated enterprises frequently deploy very rigid controls to protect the company from the risk of security threats and vulnerabilities. Many of these controls are implemented with a series of processes, often manual, that drastically slow down the SDLC. In a world where we deliver once every three to six months, developers can work within these constraints. In the new world where we want frequent releases, the implementation of these controls must be reevaluated.

I can hear the security and compliance people screaming at that last statement. To be clear, I am not challenging why an enterprise requires the controls and policies it has in place. What I do challenge is how it has implemented those controls. Very often, the implementation of the controls and policies only takes into account the security and compliance stakeholders. In the new world of frequent releases, developers must be considered key stakeholders as well.

There should be a balance between control and agility. An enterprise can be both secure and agile if it allows itself to be. For example, many enterprises deploy a manual security review gate for all their applications. When deploying only a few times a year, this methodology can work. When multiple application teams deploy many times a month, it simply does not scale. You could not hire enough people to manually review the number of deployments that happen in a mature cloud environment.

The solution relies on automation and trust. Many security controls can be baked into the underlying infrastructure blueprints. As developers consume the approved hardened images, many of the security controls that formerly required a manual review are already in place. The build process should run a security code scan that allows the security experts to enforce policies and find potential security holes. The build can then be configured to fail if the scan score is not high enough to meet the established security standard. This is yet another opportunity to eliminate a manual review gate.
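
As a rough illustration of failing a build on a scan score, the sketch below turns a score into a pass/fail exit code. Everything in it is an assumption made for the example: the `ScanResult` type, the 0-100 scale where higher is better, and the default threshold of 80. A real pipeline would parse the score from whatever report its scanner produces and use the threshold set by the security team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    """Hypothetical summary of a security code scan."""
    score: int  # assumed scale: 0-100, higher is better


def security_gate(result: ScanResult, min_score: int = 80) -> int:
    """Return an exit code for the build step: 0 passes, 1 fails the build."""
    if result.score < min_score:
        return 1
    return 0


# The build step would exit with this code, so a low score stops the
# pipeline without anyone performing a manual review.
print(security_gate(ScanResult(score=72)))  # below the assumed threshold
print(security_gate(ScanResult(score=91)))  # meets the assumed threshold
```

The security experts own the threshold and the scan rules, while developers get an immediate answer on every build.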

From a technology standpoint, this is very easy to implement. The challenge is getting the security and compliance stakeholders to buy into the approach.

A fool with a tool is still a fool. Too often I see people who are married to a tool or a vendor demand that their on-prem tool of choice be used in the cloud. Many of these tools were never built to run outside the firewall or on immutable infrastructure. These tools not only provide low value in the cloud, but often delay delivery by creating a huge amount of unnecessary integration work.

Enterprises must reevaluate their tool choices as they move to the cloud. Pick the best tool for the job, not the tool that everyone is comfortable with. Another challenge in this area is that developers require more visibility into monitoring and logging data than ever before. Too often developers are forced to use tools that operators are comfortable with instead of tools that make developers more efficient in their job.

Once again, the technology here is easy. Changing the hearts and minds of people is where the difficulty lies.

A common theme in the five previous items is that building and running applications and services in the cloud changes roles and responsibilities for many people across many different teams. Security and operations must get more involved earlier in the SDLC. Development must assume more ownership of running their code. The end-to-end SDLC should go through a value stream mapping exercise to discover and remove wasteful processes. Ownership of certain tasks may move from one team to another. Some roles may go away, while new roles may be required.

For example, if a person is responsible for managing disk arrays and the company decides to move some or all of its storage to the cloud, that role drastically changes. Now instead of managing the infrastructure, a person in this role must ensure the storage API (e.g., S3) is meeting its SLAs and the data is secure and compliant. This will require retraining, or a different person may be needed to fill the role.
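
The SLA side of that new role can be sketched as a simple availability calculation over observed requests. The function names and the 99.9% target below are assumptions for illustration only; the actual target comes from the provider's published SLA, and the request counts would come from the team's own monitoring data.

```python
def availability(successes: int, total: int) -> float:
    """Fraction of storage API requests that succeeded in the window."""
    if total == 0:
        return 1.0  # no traffic in the window, so nothing was missed
    return successes / total


def meets_sla(successes: int, total: int, target: float = 0.999) -> bool:
    """True if observed availability is at or above the assumed target."""
    return availability(successes, total) >= target


# 9,995 of 10,000 requests succeeded: 99.95%, above the assumed target.
print(meets_sla(9_995, 10_000))
# 9,980 of 10,000 requests succeeded: 99.80%, below the assumed target.
print(meets_sla(9_980, 10_000))
```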

With testing, security and operational tasks “shifting left”, these skillsets must be involved earlier in the SDLC. The roles and responsibilities of these resources change accordingly. The same holds true for architect and developer resources, who now have to understand network and compute technologies in more depth than ever before.

New roles are common when enterprises start adopting the cloud. Many enterprises create a cloud platform team whose focus is building guardrails around their cloud provider’s services to ensure that those services are consumed in a manner that still meets the security and compliance standards required to run applications. Another commonly added role is the build pipeline role. Many enterprises start seeing multiple CI/CD pipelines spring up across the application teams throughout the company, each with a different set of processes and tool choices. Often these enterprises create a team that focuses on establishing a standard way of creating build pipelines that is integrated with security and operations needs.

Cloud computing is transformational. Focusing solely on the technology without dedicating a significant amount of time and resources to organizational change and process reengineering can lead to suboptimal results. The cloud is very powerful and small teams can produce some amazing results in a short amount of time. But when adopting cloud at the enterprise level, replicating what these small teams have built simply does not scale. There are many other stakeholders involved at scale. Security, compliance, legal, procurement, human resources, finance, service desk, operations, disaster recovery and others all have a part to play. Adopting cloud without a focus on people and process will lead to failure of epic proportions. Don’t let this happen to your organization.

This article originally appeared on Cloud Technology Partners under the title Six Reasons Why Your Cloud Strategy Must Include a Plan for Change.