DevOps Fundamentals

Introduction

DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality.

Core Values:

  • Culture of collaboration
  • Automation of processes
  • Continuous improvement
  • Customer-centric action
  • End-to-end responsibility
  • Fast feedback loops

Core Principles

The Three Ways

First Way: Flow

  • Left-to-right flow of work
  • Small batch sizes
  • Reduced work in progress
  • Eliminated constraints
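
The link between reduced work in progress and faster flow can be illustrated with Little's Law (average lead time = average work in progress / average throughput). The sketch below is only an illustration: the function name and the team figures are hypothetical, and the law holds for long-run averages in a stable system.

```python
def average_lead_time(wip_items: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time = average WIP / average throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_day


# Hypothetical team that finishes 4 work items per day on average.
before = average_lead_time(wip_items=40, throughput_per_day=4)
after = average_lead_time(wip_items=12, throughput_per_day=4)

print(f"Lead time with 40 items in progress: {before:.1f} days")  # 10.0 days
print(f"Lead time with 12 items in progress: {after:.1f} days")   # 3.0 days
```

With throughput unchanged, cutting work in progress from 40 to 12 items shortens the average lead time from 10 days to 3.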

Second Way: Feedback

  • Short feedback loops
  • Problem detection
  • Quality at source
  • Understanding and response

Third Way: Learning

  • Continuous experimentation
  • Risk taking and learning
  • Practice and repetition
  • Organizational improvement

Key Practices

Continuous Integration

# Example GitHub Actions workflow
name: CI Pipeline

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    
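    steps:
    # Check out the repository so later steps can see the code
    - uses: actions/checkout@v4

    - name: Set up Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '20'

    - name: Install dependencies
      run: npm ci

    # Cheap checks run first so failures surface quickly.
    # Assumes package.json defines "lint", "test" and "build" scripts.
    - name: Run linter
      run: npm run lint

    - name: Run tests
      run: npm test

    - name: Build
      run: npm run build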

Continuous Deployment

# Example deployment workflow
name: CD Pipeline

on:
  workflow_run:
    workflows: ["CI Pipeline"]
    branches: [main]
    types: [completed]

jobs:
  deploy:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    
    steps:
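    - name: Deploy to production
      # Placeholder: "uses" must have the form owner/repo@ref.
      # Replace example-org/some-deploy-action with the action for your platform.
      uses: example-org/some-deploy-action@v1
      with:
        api_token: ${{ secrets.DEPLOY_TOKEN }}
        environment: production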

Essential Tools

Version Control:

  • Git
  • GitHub/GitLab/Bitbucket

CI/CD:

  • Jenkins
  • GitHub Actions
  • CircleCI
  • GitLab CI

Configuration Management:

  • Ansible
  • Puppet
  • Chef

Containerization:

  • Docker
  • Kubernetes

Monitoring:

  • Prometheus
  • Grafana
  • ELK Stack

Automation

Infrastructure as Code

# Terraform example
provider "aws" {
  region = "us-west-2"
}

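resource "aws_instance" "web" {
  count         = 3
  ami           = "ami-0c55b159cbfafe1f0" # AMI IDs are region-specific; use one valid for your region
  instance_type = "t2.micro"

  # Attach the security group defined below; otherwise the HTTP rule never applies to these instances
  vpc_security_group_ids = [aws_security_group.allow_http.id]

  tags = {
    Name        = "web-server-${count.index}"
    Environment = "production"
  }
}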

resource "aws_security_group" "allow_http" {
  name        = "allow_http"
  description = "Allow HTTP inbound traffic"

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Configuration Management

# Ansible playbook example
---
- name: Configure web servers
  hosts: webservers
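  become: true

  tasks:
    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true  # refresh the package index before installing

    - name: Start nginx service
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

    - name: Copy website files
      ansible.builtin.copy:
        src: files/website/  # trailing slash copies the directory contents, not the directory itself
        dest: /var/www/html/
        mode: '0644'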

Metrics & KPIs

Key Performance Indicators

Deployment Metrics:

  • Deployment Frequency
  • Lead Time for Changes
  • Change Failure Rate
  • Mean Time to Recovery
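
As a rough illustration of how these four metrics might be derived from deployment records, here is a minimal sketch. The Deployment record, its field names, and the sample data are all hypothetical; real pipelines would pull this information from their CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def deployment_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize the four deployment metrics over a reporting period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Hypothetical week with four deployments, two of which failed.
start = datetime(2024, 1, 1)
history = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=4),
               failed=True, restored_at=start + timedelta(days=1, hours=4, minutes=30)),
    Deployment(start + timedelta(days=2), start + timedelta(days=2, hours=6)),
    Deployment(start + timedelta(days=3), start + timedelta(days=3, hours=8),
               failed=True, restored_at=start + timedelta(days=3, hours=9, minutes=30)),
]
metrics = deployment_metrics(history, period_days=7)
print(metrics["median_lead_time"])       # 5:00:00
print(metrics["change_failure_rate"])    # 0.5
print(metrics["mean_time_to_recovery"])  # 1:00:00
```

The median is used for lead time here because a single slow change would skew a mean; that is a design choice of this sketch, not a requirement of the metric.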

Operational Metrics:

  • System Availability
  • Error Rate
  • Response Time
  • Resource Utilization
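
The first three operational metrics reduce to simple arithmetic. The sketch below shows one possible way to compute them; the function names and sample figures are hypothetical, and response time is summarized here as a 95th percentile, which is a common but not universal choice.

```python
from statistics import quantiles


def availability(total_seconds: float, downtime_seconds: float) -> float:
    """Fraction of the period during which the system was up."""
    return (total_seconds - downtime_seconds) / total_seconds


def error_rate(failed_requests: int, total_requests: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return failed_requests / total_requests if total_requests else 0.0


def p95_response_time(latencies_ms: list[float]) -> float:
    """95th percentile latency; quantiles(n=100) returns 99 cut points."""
    return quantiles(latencies_ms, n=100)[94]


# Hypothetical 30-day month with 2,592 seconds (about 43 minutes) of downtime.
month_seconds = 30 * 24 * 3600
print(f"Availability: {availability(month_seconds, 2592):.3%}")  # 99.900%
print(f"Error rate:   {error_rate(50, 10_000):.2%}")             # 0.50%
print(f"p95 latency:  {p95_response_time(list(range(1, 101))):.1f} ms")
```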

Quality Metrics:

  • Test Coverage
  • Bug Resolution Time
  • Technical Debt
  • Code Quality Score

# Prometheus monitoring example
groups:
- name: deployment_metrics
  rules:
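  # deployment_timestamp, code_commit_timestamp and error_rate are not built-in
  # Prometheus metrics; this example assumes your pipeline exports them.
  # count() returns the number of matching series per environment, so it only
  # reflects deployments if each deployment is exported as its own series.
  - record: deployment_frequency
    expr: count(deployment_timestamp) by (environment)

  # The subtraction only yields results where both metrics have identical label sets.
  - record: lead_time
    expr: deployment_timestamp - code_commit_timestamp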

  - alert: HighErrorRate
    expr: error_rate > 0.05
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: High error rate detected
      description: Error rate is above 5% for 5 minutes
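
The "for: 5m" clause means the alert fires only after the expression has stayed true across every evaluation for five minutes; a short spike leaves it pending and then resets it. The following is a simplified model of that behavior, not Prometheus code: the function name, sample data, and one-minute evaluation interval are assumptions.

```python
from typing import Optional


def firing_time(
    samples: list[tuple[int, float]],
    threshold: float = 0.05,
    hold_seconds: int = 300,
) -> Optional[int]:
    """Return the timestamp at which the alert would start firing, or None.

    samples holds (timestamp_seconds, value) pairs in time order,
    one pair per rule evaluation.
    """
    active_since: Optional[int] = None
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp  # condition just became true: pending
            if timestamp - active_since >= hold_seconds:
                return timestamp          # true for the whole hold period: firing
        else:
            active_since = None           # condition cleared: reset the timer
    return None


# Evaluations every 60 seconds (an assumed interval).
brief_spike = [(0, 0.08), (60, 0.09), (120, 0.02), (180, 0.07), (240, 0.06), (300, 0.01)]
sustained = [(0, 0.02), (60, 0.06), (120, 0.07), (180, 0.08),
             (240, 0.06), (300, 0.09), (360, 0.07), (420, 0.06)]

print(firing_time(brief_spike))  # None
print(firing_time(sustained))    # 360
```

In the sustained case the error rate first exceeds the threshold at 60 seconds, so the alert fires at the 360-second evaluation, five minutes later.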