Combine and align on DevSecFinOps to build and sustain a winning cloud strategy and culture

Taylor Lewick, AWS Sales Engineer

Before we start, please join me for a quick stroll down memory lane. 

In the earlier days of the cloud, I recall being amazed at how quickly and easily we could create new computing environments, complete with infrastructure like storage, networking, and load balancers.

Also, more advanced capabilities like auto-scaling and geographical domain name system (DNS) routing just worked. I’d click a few buttons, spend half a day or two configuring and testing things, and have a new compute environment up and running.

Getting a signed purchase order (PO) for new hardware, waiting on shipping, racking and stacking servers, installing the OS, configuring, and testing would easily have taken us at least six weeks. Now it was a two- or three-day project.

Flash-forward a couple of years, and most large production cloud environments were still administered inefficiently, with processes and tools borrowed from managing on-prem data centers. Tools like Terraform and CloudFormation were either non-existent or in their infancy and hadn’t achieved widespread adoption.

The Rise of DevOps

Most of us still treated our cloud instances and servers as hand-managed pets. More importantly, a DevOps culture, while well beyond its infancy, hadn’t been widely adopted at most organizations. Frustrations grew as we realized that launching a new environment quickly adds little value unless you can also manage that environment effectively.

I can also recall being delighted and amazed at how quickly our tools and, just as importantly, our mindsets evolved. I remember hearing discussions about books like The Phoenix Project and The DevOps Handbook.

Also, I learned about practices like lean and agile management and all the cool things companies like Netflix were doing to manage extremely large-scale, mission-critical cloud environments.

I can still remember—with childlike delight—the first time I deployed infrastructure in Amazon Web Services (AWS) using Terraform.

“Wait—I write a few dozen lines of code, and everything is just up and running?!”

Shifting Left on Security

At roughly the same time, leading companies working in the cloud were developing best practices for security and site reliability engineering. We increasingly heard about “shifting left” on security, and the phrase DevSecOps came up more and more often.

Discussions around concepts like least-privilege access, role-based access control (RBAC), low or zero trust, and “no more click-ops” were commonplace. The value of “shifting left” on security became apparent quite quickly.

FinOps Is Born

While the tech folks were busy geeking out learning and applying these new tools and DevOps principles to their cloud (and on-prem) environments, I noticed the finance folks were often frowny-faced or just plain irate when they discussed cloud projects.

Their DevOps moment, their tool sets, hadn’t really arrived yet.

It was an all-too-common experience to listen to a CFO or product/service owner saying things like, “I thought the cloud was going to save us money?”

Another rapid cloud evolution followed as the market pivoted to meet demand, creating financial management offerings and developing the beginnings of FinOps practices.

Cloud providers were suddenly advising you to tag everything and use savings plans, and a host of cloud spending and oversight tools sprang up seemingly overnight.

Breaking Down Silos

At their core, these IT/cloud-based (r)evolutions focused on breaking down the silos that existed in most organizations. This is similar to how Kaizen and continuous improvement concepts, which originated with Japanese automakers, empowered anyone working on a manufacturing line to pull the Andon cord and stop the line when an issue was discovered.

Think about that for a second. How revolutionary that practice alone was and still is.

No longer was it only a manager or director who could stop a manufacturing line, but any employee working in the plant.

A silo was removed. A culture was created where no one feared calling out a mistake, and as soon as an error was discovered, they all worked together until it was fixed.

It wasn’t just a band-aid. It was truly fixed.

Ongoing Business Value

This one concept created tremendous ongoing business value because it embraced ideologies and values such as worker empowerment, transparency, and honesty. It eliminated the tendency to blame and shame messengers of bad news. The automakers were shifting left, that is, moving responsibility for the overall health of operations into the hands of the people doing day-to-day operations.

Long-term performance was valued over short-term results. Everyone truly felt okay stopping the manufacturing process because they were confident the issue would be fixed. That led to lower long-term defect rates and higher customer satisfaction, which paid the company back several times over and more than offset the profit hit from a short-term shutdown. This required buy-in from leaders (highly strategic), managers (a mix of strategic and tactical), and line workers (highly tactical).

The Marriage of DevOps, SecOps, and FinOps

Today's goal should be for all IT organizations to marry the principles of DevOps, SecOps, and FinOps. They should be combined into DevSecFinOps. The best practices should be democratized, and everyone in the organization should feel empowered to suggest improvements.

Developers and infrastructure (network, security, database, etc.) engineers should work together to ensure their applications and architectures are as efficient, resilient, and performant as possible. Everyone should feel safe and respected enough to speak up and say “Stop!” when they see security issues. As much as possible, cloud costs should be shared to foster an environment of transparency and shared responsibility.

Strategy Defines Culture

The above practices—Kaizen and continuous improvement for manufacturing or DevSecFinOps for IT—work because they bring together strategic and tactical best practices and remove organizational silos. Strategy and tactics must form a harmonious marriage, and organizations need to learn to stop favoring one over the other.

I often hear things like, “That person doesn’t think strategically,” or “All they care about is operations,” or “The executives and directors only think big picture; they won’t get it.”

An organization’s strategy will help to define its culture. An organization’s tactics will allow it to live that culture. You need both to be successful.

Let's look at a real-world cloud-based example to tie this all together.

I worked with a large healthcare provider a couple of years ago. This company was multi-cloud: predominantly AWS-based, but with a presence in Microsoft Azure and Google Cloud Platform (GCP).

At their scale, they had hundreds of AWS accounts, so they had well-established practices for multi-account setups, landing zones, and AWS Organizations. I was impressed with what they had built, and it was obvious they had taken a very mindful approach, incorporating many DevSecFinOps and cloud best practices into their environment. They told me they had recently stopped allowing any “click-ops.”

No human was allowed to make manual changes in their upper (production, user acceptance testing, and test) environments. Only the developer sandbox accounts allowed manual changes. In all other environments, changes had to be made via Terraform and deployed with DevOps pipelines.

A small set of admins had access to a “break-glass” type of identity and access management (IAM) role, which they had to request. A manager or another admin would have to approve the request, and the access had a time limit.

Drift Detection

I asked them how they were doing drift detection, and they said they didn’t have a good method other than comparing Terraform plan outputs between runs. But Terraform will only detect drift on resources it knows about. If a human created a new instance or load balancer via the console, a Terraform plan wouldn’t show this.
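The gap described above can be sketched in Python. This is a minimal illustration, assuming the Terraform state has been exported with `terraform show -json` and that a list of live resource IDs has already been collected some other way, for example from the cloud provider's APIs. The function names `managed_ids` and `unmanaged` are hypothetical, and the sketch assumes each managed resource exposes an `id` attribute, which is common but not guaranteed for every provider.

```python
def managed_ids(module):
    """Collect the `id` of every managed resource in a module tree.

    `module` is a dict shaped like the `values.root_module` object in the
    JSON produced by `terraform show -json`.
    """
    ids = set()
    for resource in module.get("resources", []):
        values = resource.get("values", {})
        if resource.get("mode") == "managed" and "id" in values:
            ids.add(values["id"])
    # Resources declared inside modules are nested under child_modules.
    for child in module.get("child_modules", []):
        ids |= managed_ids(child)
    return ids


def unmanaged(state, live_ids):
    """Return live resource IDs that do not appear in the Terraform state."""
    root = state.get("values", {}).get("root_module", {})
    return sorted(set(live_ids) - managed_ids(root))
```

Anything returned by `unmanaged` exists in the account but is invisible to `terraform plan`, which is exactly the case of an instance or load balancer created by hand in the console.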

They reasoned that with their IAM roles and policies, they should be well covered. I agreed, but I pointed out that there were still some scenarios where it would be helpful for them to get alerts.

For example, what if a new admin was added to the team and they weren’t fully aware or up to speed on their policies? Or if an established admin had to assume that break-glass role and make some corrective changes manually in the middle of the night?

It would be nice to be notified of the changes so the team could update the Terraform files the next day. I pointed out that for maybe a few dollars a year, they could use a Lambda function that would detect and alert on any drift.

Continuous Improvement

I knew that AWS CloudTrail records API activity in an account, and that each event record carries a Boolean read-only flag marking whether the call was a read-only operation. I said it would be easy to create a Lambda function that would be invoked whenever a new CloudTrail log file is written to S3.

It can look for any non-read-only API actions and then check the IAM identity that made each call. If the identity doesn’t match any of their approved service roles, the function can send a drift detection alert to their team’s Slack channel. The only state their team would need to maintain was a single flat file or DynamoDB entry with a list of the approved service/deployment and admin roles.
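A rough Python sketch of such a function follows. It is not the customer's actual implementation. It assumes gzipped CloudTrail log files delivered to S3, an S3 event notification as the Lambda trigger, and a Slack incoming webhook whose URL is stored in a hypothetical `SLACK_WEBHOOK_URL` environment variable. The role names in `APPROVED_ROLES` and all function names are made up for illustration; in practice the allow-list would be loaded from the flat file or DynamoDB entry.

```python
import gzip
import json

# Hypothetical allow-list of approved role names (illustration only).
APPROVED_ROLES = {"terraform-deploy", "ci-pipeline"}


def parse_cloudtrail_file(raw_bytes):
    """Decompress a gzipped CloudTrail log file and return its event records."""
    return json.loads(gzip.decompress(raw_bytes)).get("Records", [])


def principal_name(record):
    """Return the role name for assumed-role sessions, else the identity ARN."""
    identity = record.get("userIdentity", {})
    issuer = identity.get("sessionContext", {}).get("sessionIssuer", {})
    return issuer.get("userName") or identity.get("arn", "unknown")


def find_drift(records, approved_roles=APPROVED_ROLES):
    """Return non-read-only events made by principals outside the allow-list."""
    hits = []
    for record in records:
        if record.get("readOnly", False):
            continue  # read-only calls cannot change infrastructure
        who = principal_name(record)
        if who not in approved_roles:
            hits.append({
                "who": who,
                "action": record.get("eventName"),
                "service": record.get("eventSource"),
                "time": record.get("eventTime"),
            })
    return hits


def format_alert(hit):
    """Build the one-line message posted to Slack."""
    return (f"Possible drift: {hit['who']} called {hit['action']} "
            f"on {hit['service']} at {hit['time']}")


def lambda_handler(event, context):
    """Entry point, triggered by S3 when a new CloudTrail log file arrives."""
    import os
    import urllib.parse
    import urllib.request
    import boto3  # provided by the AWS Lambda Python runtime

    s3 = boto3.client("s3")
    for item in event["Records"]:
        bucket = item["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(item["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        for hit in find_drift(parse_cloudtrail_file(body)):
            payload = json.dumps({"text": format_alert(hit)}).encode()
            request = urllib.request.Request(
                os.environ["SLACK_WEBHOOK_URL"],  # hypothetical variable name
                data=payload,
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(request)
```

The filtering logic is kept in plain functions, separate from the S3 and Slack calls, so it can be tested locally against sample records. A record with no read-only flag is treated as a possible write, which errs on the side of alerting.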

They thought this was an excellent idea.

My small, tactical suggestion was well received and easily implemented because this company had already worked hard to build a culture of continuous IT improvement. They had removed previous silos, defined their strategic goals, and were well on their way to building out tactical processes that made day-to-day operation of their environment more manageable and scalable.

The suggestion wasn’t met with defensiveness or hesitancy. Because it was just another iterative improvement, the team quickly approved and implemented it.

I genuinely enjoy getting to know my customers and their environments, and I enjoy diving deep so I can bring my years of experience to help add value. At SoftwareOne, we have many cloud engineers and architects who are just as passionate about helping their clients.

We enjoy meeting clients where they are, from starting their cloud journey to assisting them in migrating or deploying critical infrastructure, modernizing, and evolving to become more cloud native.

Helping You with Your Cloud Journey

SoftwareOne has a lot of tools that we can use to help you assess where your organization is in terms of your cloud journey, ranging from Optimization and Licensing Assessments (OLAs) to Well-Architected Reviews to Migration Readiness Assessments to DevOps and Security assessments to FinOps reviews.

We can help you put out tactical fires and help you with long-term strategic planning. We know how to have conversations that will help shine a light on where you are and where you are trying to go, and we have the people and skills to help you get there.


Optimize what you have & build what you don’t

Moving to the cloud is key to transforming business, unlocking innovation, and sharpening your competitive edge. Learn how an independent partner like SoftwareOne sees the big picture and works side-by-side with you to deliver on the promise of cloud.
