DevOps Fundamentals
Introduction
DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality.
Core Values:
- Culture of collaboration
- Automation of processes
- Continuous improvement
- Customer-centric action
- End-to-end responsibility
- Fast feedback loops
Core Principles
The Three Ways
First Way: Flow
- Left-to-right flow of work
- Small batch sizes
- Reduced work in progress
- Eliminated constraints
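One way to see why reduced work in progress improves flow is Little's Law, which says that in a stable system the average lead time equals the average work in progress divided by the average throughput. The sketch below is illustrative only: the function name and the numbers are assumptions, not taken from any particular team or tool.

```python
def average_lead_time(wip_items: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time = average WIP / average throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_day


# Hypothetical team that finishes 4 work items per day.
before = average_lead_time(wip_items=40, throughput_per_day=4)
after = average_lead_time(wip_items=12, throughput_per_day=4)

print(f"Lead time with 40 items in progress: {before} days")
print(f"Lead time with 12 items in progress: {after} days")
```

With throughput held constant, cutting work in progress from 40 to 12 items shortens the average lead time from 10 days to 3 days.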
Second Way: Feedback
- Short feedback loops
- Problem detection
- Quality at source
- Understanding and response
Third Way: Learning
- Continuous experimentation
- Risk taking and learning
- Practice and repetition
- Organizational improvement
Key Practices
Continuous Integration
# Example GitHub Actions workflow
name: CI Pipeline
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so later steps can see the code
      - uses: actions/checkout@v4
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      # npm ci installs exactly what package-lock.json specifies
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Build
        run: npm run build
      - name: Run linter
        run: npm run lint
Continuous Deployment
# Example deployment workflow
name: CD Pipeline
on:
  # Runs after the workflow named "CI Pipeline" finishes on main
  workflow_run:
    workflows: ["CI Pipeline"]
    branches: [main]
    types: [completed]
jobs:
  deploy:
    # Deploy only if the triggering CI run succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to production
        # Placeholder: replace with a real action in owner/repo@ref form
        uses: some-deploy-action@v1
        with:
          api_token: ${{ secrets.DEPLOY_TOKEN }}
          environment: production
Essential Tools
Version Control:
- Git
- GitHub/GitLab/Bitbucket
CI/CD:
- Jenkins
- GitHub Actions
- CircleCI
- GitLab CI
Configuration Management:
- Ansible
- Puppet
- Chef
Containerization:
- Docker
- Kubernetes
Monitoring:
- Prometheus
- Grafana
- ELK Stack
Automation
Infrastructure as Code
# Terraform example
provider "aws" {
  region = "us-west-2"
}

resource "aws_instance" "web" {
  count         = 3
  # AMI IDs are region-specific; replace with one valid in your region
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  # Attach the security group below so the HTTP rule applies to these instances
  vpc_security_group_ids = [aws_security_group.allow_http.id]

  tags = {
    Name        = "web-server-${count.index}"
    Environment = "production"
  }
}

resource "aws_security_group" "allow_http" {
  name        = "allow_http"
  description = "Allow HTTP inbound traffic"

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
Configuration Management
# Ansible playbook example
---
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
        update_cache: true  # refresh the package index before installing
    - name: Start nginx service
      service:
        name: nginx
        state: started
        enabled: true
    - name: Copy website files
      copy:
        src: files/website/
        dest: /var/www/html/
        mode: '0644'
Metrics & KPIs
Key Performance Indicators
Deployment Metrics:
- Deployment Frequency
- Lead Time for Changes
- Change Failure Rate
- Mean Time to Recovery
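The four deployment metrics above can be computed from a simple log of deployments. The following is a minimal sketch, assuming each record carries a commit time, a deploy time, a failure flag, and (for failures) a restore time. The `Deployment` class, its field names, and the sample data are hypothetical and not tied to any specific tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did it cause a production failure?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def deployment_metrics(deployments: List[Deployment], period_days: int) -> dict:
    """Compute the four deployment metrics over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]

    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(t.total_seconds() for t in lead_times) / 3600,
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery_hours": (
            mean(r.total_seconds() for r in recoveries) / 3600 if recoveries else None
        ),
    }


# Made-up sample data: four deployments in one week, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
sample = [
    Deployment(t0, t0 + 2 * hour),
    Deployment(t0 + day, t0 + day + 4 * hour, failed=True,
               restored_at=t0 + day + 5 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 6 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 4 * hour),
]

metrics = deployment_metrics(sample, period_days=7)
for name, value in metrics.items():
    print(name, value)
```

For this sample the average lead time is 4 hours, the change failure rate is 0.25, and the mean time to recovery is 1 hour. In practice the records would come from the CI/CD system and the incident tracker rather than from a hand-written list.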
Operational Metrics:
- System Availability
- Error Rate
- Response Time
- Resource Utilization
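Availability and error rate reduce to simple ratios. The sketch below shows one common way to compute them, plus the downtime a given availability target allows. The function names and the 30-day period are assumptions chosen for illustration.

```python
def availability(uptime_seconds: float, total_seconds: float) -> float:
    """Fraction of the period during which the system was up."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return uptime_seconds / total_seconds


def error_rate(failed_requests: int, total_requests: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return failed_requests / total_requests if total_requests else 0.0


def allowed_downtime_minutes(target: float, period_days: int = 30) -> float:
    """Downtime budget implied by an availability target over a period."""
    return (1 - target) * period_days * 24 * 60


# Illustrative numbers only.
print(round(allowed_downtime_minutes(0.999), 1))  # budget for a 99.9% target
print(error_rate(failed_requests=50, total_requests=1000))
print(round(availability(uptime_seconds=2_591_000, total_seconds=2_592_000), 5))
```

A 99.9% target over 30 days leaves roughly 43 minutes of downtime, which shows how quickly the budget shrinks as targets get stricter.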
Quality Metrics:
- Test Coverage
- Bug Resolution Time
- Technical Debt
- Code Quality Score
# Prometheus monitoring example
groups:
  - name: deployment_metrics
    rules:
      - record: deployment_frequency
        expr: count(deployment_timestamp) by (environment)
      - record: lead_time
        expr: deployment_timestamp - code_commit_timestamp
      - alert: HighErrorRate
        expr: error_rate > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: High error rate detected
          description: Error rate is above 5% for 5 minutes