
Joint-ownership between DevOps and Software Development teams

A successful joint-ownership model between DevOps and Software Development teams requires a clear division of responsibilities that promotes collaboration, accountability, and efficient incident resolution.

Guiding Principles for Joint Ownership

Joint ownership means shared goals and continuous communication, with specific responsibilities still outlined for each team.

For both teams, consider the following shared principles:

Shared Responsibility for Service Health: Both teams are invested in the reliability, performance, and availability of the service.

Blameless Postmortems: Focus on process and system improvements rather than individual blame during incidents.

Automation First: Prioritize automating repetitive tasks and manual toil.

Continuous Improvement: Regularly review processes, tools, and team performance to identify areas for enhancement.

SLO-Driven Development & Operations: Introduce and track Service Level Objectives (SLOs) to define acceptable service performance and guide priorities.
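
To make the SLO principle concrete, here is a minimal sketch of an error-budget calculation. The names (SloWindow, error_budget_remaining) and the request-count model are my own assumptions for illustration, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Request counts observed over one SLO evaluation window (hypothetical model)."""
    total_requests: int
    failed_requests: int


def error_budget_remaining(window: SloWindow, slo_target: float) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    slo_target is the desired success ratio, e.g. 0.999.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window.total_requests == 0:
        return 1.0  # no traffic in the window, so nothing was spent
    allowed_failures = (1.0 - slo_target) * window.total_requests
    return 1.0 - window.failed_requests / allowed_failures


if __name__ == "__main__":
    week = SloWindow(total_requests=1_000_000, failed_requests=250)
    print(f"budget left: {error_budget_remaining(week, 0.999):.0%}")
```

A shrinking budget is a signal both teams can act on: for example, slowing feature releases in favor of reliability work when it runs low.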

DevOps Team Responsibilities

The DevOps team will primarily focus on the infrastructure, deployment pipelines, observability, and overall operational health of the service.

Infrastructure Management

  • Provisioning, configuring, and maintaining the underlying infrastructure (e.g., VMs, containers, cloud services, networking) where the service runs.
  • Managing infrastructure as code (IaC) templates and ensuring their consistency.
  • Implementing and maintaining disaster recovery and backup strategies.
  • Capacity planning and scaling infrastructure to meet demand.

Deployment Pipeline Ownership & Automation

  • Designing, building, and maintaining robust CI/CD pipelines for the service.
  • Ensuring automated testing integration within the pipeline.
  • Implementing blue/green deployments, canary releases, or other advanced deployment strategies.
  • Managing deployment tools and platforms.
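
As an illustration of the canary idea, the sketch below compares a canary's error rate against the current baseline and returns a verdict. The function name, the verdict strings, and the default tolerances are assumptions chosen for the example; real pipelines usually weigh more signals than error rate.

```python
def canary_verdict(
    baseline_errors: int,
    baseline_total: int,
    canary_errors: int,
    canary_total: int,
    max_extra_error_rate: float = 0.01,  # assumed tolerance, tune per service
    min_requests: int = 100,             # assumed minimum sample size
) -> str:
    """Return 'wait', 'rollback', or 'promote' for a canary release."""
    if canary_total < min_requests:
        return "wait"  # not enough canary traffic to judge yet
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total
    if canary_rate - baseline_rate > max_extra_error_rate:
        return "rollback"
    return "promote"


if __name__ == "__main__":
    print(canary_verdict(10, 10_000, 50, 1_000))
```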

Observability & Monitoring

  • Setting up and maintaining comprehensive monitoring, logging, and alerting systems for the service.
  • Defining key metrics (e.g., latency, error rate, throughput, saturation) in collaboration with the Software Development team.
  • Managing observability platforms (e.g., Prometheus, Grafana, ELK stack, Datadog).
  • Establishing alert thresholds and notification mechanisms.
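
A threshold table like the one the two teams would agree on can be sketched in a few lines. The metric names and limits below are hypothetical, and the sketch assumes higher values are worse, so a metric like throughput (where a drop is the problem) would need the comparison inverted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float


# Hypothetical limits; real values would come from the teams' SLO discussions.
THRESHOLDS = {
    "p99_latency_ms": Threshold(warning=300.0, critical=800.0),
    "error_rate": Threshold(warning=0.01, critical=0.05),
    "cpu_saturation": Threshold(warning=0.75, critical=0.90),
}


def classify(
    samples: dict[str, float],
    thresholds: dict[str, Threshold] = THRESHOLDS,
) -> dict[str, str]:
    """Map every configured metric to 'ok', 'warning', 'critical', or 'missing'."""
    levels = {}
    for name, limit in thresholds.items():
        value = samples.get(name)
        if value is None:
            levels[name] = "missing"  # a silent metric deserves attention too
        elif value >= limit.critical:
            levels[name] = "critical"
        elif value >= limit.warning:
            levels[name] = "warning"
        else:
            levels[name] = "ok"
    return levels
```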

On-Call & Incident Management (Primary)

  • Being the primary on-call responders for service-related incidents.
  • Initial triage, investigation, and diagnosis of incidents.
  • Escalating to the Software Development team when code-level expertise is required.
  • Implementing temporary mitigations and workarounds during incidents.
  • Documenting incident timelines and actions taken.
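
The primary/secondary split can be pictured as a small routing step that also records the timeline. This is only a sketch: the Incident fields, the Responder names, and the single needs_code_expertise flag are simplifying assumptions, since in practice that judgment is made by the person doing triage.

```python
from dataclasses import dataclass, field
from enum import Enum


class Responder(Enum):
    DEVOPS_PRIMARY = "devops-primary"
    DEVELOPMENT_SECONDARY = "development-secondary"


@dataclass
class Incident:
    summary: str
    needs_code_expertise: bool = False  # set during triage (assumed flag)
    timeline: list[str] = field(default_factory=list)

    def note(self, entry: str) -> None:
        """Append an entry to the incident timeline."""
        self.timeline.append(entry)


def triage(incident: Incident) -> Responder:
    """DevOps triages first and escalates only when code-level expertise is needed."""
    incident.note("devops-primary: triage started")
    if incident.needs_code_expertise:
        incident.note("escalated to development-secondary: code-level expertise required")
        return Responder.DEVELOPMENT_SECONDARY
    incident.note("devops-primary: handling without escalation")
    return Responder.DEVOPS_PRIMARY
```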

Security & Compliance (Infrastructure Level)

  • Implementing security best practices for the infrastructure.
  • Managing access controls and credentials.
  • Ensuring infrastructure compliance with organizational policies and regulations.

Tooling & Platform Management

  • Evaluating, selecting, and maintaining tools used for operations, monitoring, and deployment.
  • Providing support and expertise on these tools to the Software Development team.

Service Level Objective (SLO) Definition & Tracking (Operational Aspects)

  • Collaborating with the Software Development team to define and track SLOs related to operational performance (e.g., uptime, response time of the infrastructure).
  • Reporting on SLO adherence from an infrastructure perspective.

Software Development Team Responsibilities

The Software Development team will focus on the application logic, code quality, functional correctness, and performance of the service.

Code Ownership & Quality

  • Writing, testing, reviewing, and submitting high-quality code for the service.
  • Ensuring unit, integration, and end-to-end tests are comprehensive and effective.
  • Adhering to coding standards and best practices.
  • Maintaining code documentation.

Application Architecture & Design

  • Designing the application’s architecture to be scalable, resilient, and maintainable.
  • Making technology stack decisions for the application.

Application Performance & Optimization

  • Optimizing application code for performance and resource efficiency.
  • Identifying and resolving performance bottlenecks within the application.
  • Conducting load and stress testing on the application.

Feature Development & Bug Fixing

  • Developing new features and functionalities for the service.
  • Prioritizing and fixing application-level bugs.

Application Logging, Metrics & Tracing

  • Implementing comprehensive logging within the application, providing relevant context for debugging.
  • Emitting application-specific metrics that are crucial for understanding service health (e.g., business metrics, internal queue sizes, API call counts).
  • Implementing distributed tracing within the application to aid in understanding request flows.
  • Ensuring logs and metrics are easily consumable by the observability stack.
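
One common way to make logs easy for an observability stack to consume is to emit them as JSON with the debugging context attached. The sketch below uses only Python's standard logging module; the JsonFormatter and build_logger names and the "context" convention are my own assumptions, not a standard.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"context": {...}}.
        # Context keys are merged last, so they can overwrite the fields above.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str, stream) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger("orders", buffer)
    log.info("order placed", extra={"context": {"order_id": "A1", "queue_depth": 3}})
    print(buffer.getvalue(), end="")
```

The same context fields (an order ID, a queue depth) double as hooks for the application-specific metrics mentioned above.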

On-Call & Incident Management (Escalation & Deep Dive)

  • Being secondary on-call responders, available for escalation from the DevOps team when incidents require deep application-level expertise or code changes.
  • Performing root cause analysis for application-related issues.
  • Implementing immediate code fixes or workarounds during incidents.
  • Participating in blameless postmortems and contributing to action items related to the application.

Service Level Objective (SLO) Definition & Tracking (Application Aspects)

  • Collaborating with the DevOps team to define and track SLOs related to the user experience and application functionality (e.g., login success rate, transaction completion time).
  • Reporting on SLO adherence from an application perspective.

Security & Compliance (Application Level)

  • Implementing secure coding practices.
  • Addressing security vulnerabilities identified within the application code.
  • Ensuring application compliance with data privacy and security regulations.

Joint Responsibilities

Together, both teams share and collaborate on the following responsibilities:

  • Service Level Objective (SLO) Definition & Review: Both teams must jointly define, review, and agree upon SLOs for the service, encompassing both operational and application aspects. These SLOs should drive priorities for both teams.
  • Release Planning & Management: Collaborative planning of releases, including understanding dependencies, potential risks, and rollback strategies.
  • Postmortem & Incident Review: Joint participation in blameless postmortems to identify systemic issues and collaborate on preventative measures and improvements.
  • Documentation: Both teams are responsible for contributing to and maintaining comprehensive service documentation, runbooks, and architectural diagrams.
  • Knowledge Sharing & Training: Regular sessions to share knowledge, best practices, and new technologies. DevOps can train developers on operational tools, and developers can train DevOps on application internals.
  • Tooling Integration: Ensuring seamless integration between development tools (IDEs, SCM) and operational tools (CI/CD, monitoring).
  • Cost Management: Joint responsibility for optimizing cloud resource usage and managing service costs.

A joint-ownership model that clearly defines each team’s responsibilities, as well as the shared ones, while emphasizing collaboration and shared goals can foster a more resilient, efficient, and accountable approach to service management within the organization.

My avatar background

The avatar that I use online is one I’ve had since 2008, when it was created for me by a marketing firm as part of the VMworld 2008 design and branding.

I think personal branding in the public sphere is important, so when given the opportunity I always use this avatar (directly or via Gravatar) as my profile picture. The exceptions are more professional platforms like LinkedIn, where I use a professional photo, and more personal platforms like Facebook.

I like the simplicity of it – the simple solid colors, basic expression. It’s instantly recognizable, and if someone has seen it before, they remember to associate it with me.

The avatar was made as a cartoon version of myself captured from a video where I’m being interviewed about VMware VDM (which eventually became View, and then Horizon View). The interview segment starts with me as the cartoon and then morphs into video footage of me speaking about VDI.

My coworker and I were invited to a private customer beta session at the VMware campus on Hillview Ave in Palo Alto in July 2008 to learn about, use, and give feedback on what would eventually become VMware View, and later, Horizon View. After the beta session, a few others and I participated in the video interview about how we were using VDI.

At the time, I worked for a healthcare organization that was at the forefront of virtual desktops, having deployed them in 2007 as part of a move to a new hospital campus, including a datacenter relocation. Famously, no PCs were purchased for the majority of the building; instead, thin terminals were placed and we went all-in on VDI on VMware.

2008 was already a busy year for press about the organization’s use of VDI – a VMware press release in February discussed the hospital move and drive towards VDI.

An interview and photo shoot put us on the cover of the May 2008 issue of Network World magazine.

Finally, I presented details of our VDI deployment at a well-attended West Michigan VMware User Group meeting held in a large conference room at the hospital in November 2008. At the time, much of the automation was developed in-house and worked quite well, so we were wary of switching to a commercial product.

The video interview at VMware ultimately aired as part of the VMworld 2008 conference kickoff at The Venetian Hotel in Las Vegas in September 2008. VMware View was announced at the conference and released in December 2008.

I could not attend in person, but a review of photos taken at the event shows the same style used throughout the conference, with many different faces rendered in this simple cartoon format.

The look was created by Emotive Brand, which developed the strategy, messaging, and design of the VMworld experience for several years, including 2008.

Another version was also developed, featuring a different shirt and background color.

I alternated between versions for a time before settling on the yellow-background version.

It’s quite convenient to have a go-to avatar for use when needed. I always enjoy finding new places to use it!