Madhosh Yagnik | DevOps Engineer
Production infrastructure across AWS and Azure - IaC migrations, CI/CD pipelines, and automation that replaces slow manual work. I care about systems that are reliable, cost-efficient, and easy for the next person to understand.
01 AWS Disaster Recovery & Infrastructure Automation | AWS
Designed and delivered multiple DR and infrastructure automation solutions for a production banking environment. Every decision was weighed against cost.
- Migrated Bastion servers from CentOS 7 (EOL) to Ubuntu 24.04 LTS with no service disruption. Rewrote the setup guide to cover both Debian-based and YUM-based distros.
- Built a cost-effective DR system: EventBridge + Lambda for automated RDS snapshots, cross-region copy, and health-monitored auto-recovery if the primary fails.
- Developed a parallel EC2 DR solution with AMI rotation, cross-region copy, auto-launch on health check failure, and SNS alerting.
- Evaluated AWS Read Replicas and DMS; excluded both on cost grounds after a full POC and documented the decision with supporting analysis.
- Resolved urgent Bitbucket pipeline failures and stabilised CI/CD long-term.
- Recovered access to a Windows EC2 instance after the private key was lost.
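The "auto-launch on health check failure" behaviour above can be illustrated with a minimal sketch. This is not the production Lambda: the names `FailoverState`, `evaluate_health`, `launch_recovery`, and `send_alert`, and the idea of a consecutive-failure threshold, are assumptions made for illustration. The callbacks stand in for whatever actually launches the DR instance and publishes to SNS.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FailoverState:
    """Tracks consecutive failed health checks between scheduled runs."""
    consecutive_failures: int = 0
    recovery_launched: bool = False


def evaluate_health(
    state: FailoverState,
    primary_healthy: bool,
    failure_threshold: int,
    launch_recovery: Callable[[], str],
    send_alert: Callable[[str], None],
) -> Optional[str]:
    """Process one health-check result.

    Returns the identifier from launch_recovery if a recovery was started
    during this call, otherwise None.
    """
    if primary_healthy:
        # Any healthy check resets the failure streak.
        state.consecutive_failures = 0
        return None

    state.consecutive_failures += 1
    if state.recovery_launched or state.consecutive_failures < failure_threshold:
        # Either still below the threshold or a recovery is already running.
        return None

    # Threshold reached and nothing launched yet: start recovery exactly once.
    instance_id = launch_recovery()
    state.recovery_launched = True
    send_alert(
        f"Primary failed {state.consecutive_failures} checks; launched {instance_id}"
    )
    return instance_id


if __name__ == "__main__":
    state = FailoverState()
    for healthy in (True, False, False):
        evaluate_health(state, healthy, 2, lambda: "i-dr-example", print)
```

In a scheduled Lambda the state would have to be persisted between invocations (for example in a parameter store or a small table); that part is left out here. The sketch also never resets `recovery_launched`, on the assumption that failing back is a manual decision.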
02 DevOps Backlog - Docker, CMake, Makefile, Git, Linux | Linux
Cleared a backlog of 72 DevOps tasks across Docker, Linux, Git, Makefile, and CMake - 64 accepted on first review.
- Picked up CMake and Makefile tasks while learning both tools in parallel - no blocked work or delays.
- Reported progress openly in daily stand-ups, with clear tracking throughout.
03 Full Azure Migration & CI/CD Pipeline | Azure
Took a project running entirely on local setups and brought it to a production-ready Azure deployment in one month.
- Migrated full stack to Azure Cloud, resolving routing and network configuration issues along the way.
- Iterated CI/CD across three approaches - GitHub Actions, Azure DevOps via SSH, and finally an Azure agent-based pipeline with client-approved security controls.
- Optimised the frontend Dockerfile to serve static files - load time dropped from seconds to milliseconds.
- Dockerized all services; created Docker Compose stacks for consistent local and cloud environments.
- Added systemd services and cron jobs for self-starting apps at VM boot.
04 Terraform IaC Migration & Cost Optimisation | AWS
Inherited a production AWS environment with no IaC, no state management, and known security gaps. Left it fully Terraform-managed, secured, and cheaper to run.
- Migrated all AWS resources to Terraform with Terraform Cloud for remote state and environment isolation. Negligible downtime during migration.
- Partnered with the security team to audit past incidents and implement preventive IAM controls.
- Reduced monthly AWS spend by ~$95-100 via right-sizing and cleanup.
- Moved the on-prem chatbot server from the office to the server room - eliminated recurring accidental disconnections.
- Delivered a complete handover; incoming engineer appreciated the thoroughness.
05 Lightweight RDS Backup Automation | AWS
Client was paying for daily automated RDS backups on a staging environment that did not need them. Replaced the default behaviour with a purpose-built, native solution.
- Disabled built-in backups; implemented a monthly snapshot Lambda and a quarterly cleanup Lambda that retains the latest snapshot.
- Scheduled via EventBridge with SNS alerts. Wrote complete manual recovery documentation.
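The quarterly cleanup rule described above (delete older snapshots, keep the latest) could look roughly like the sketch below. It is an illustration rather than the deployed Lambda: the function name `snapshots_to_delete` is hypothetical, and the dictionary keys are assumed to mirror the `DBSnapshotIdentifier` and `SnapshotCreateTime` fields that boto3's `describe_db_snapshots` returns for completed snapshots.

```python
from datetime import datetime, timezone
from typing import Dict, List


def snapshots_to_delete(snapshots: List[Dict]) -> List[str]:
    """Return the identifiers of every snapshot except the most recent one.

    Each item is assumed to carry 'DBSnapshotIdentifier' (str) and
    'SnapshotCreateTime' (datetime), i.e. a completed manual snapshot.
    """
    if len(snapshots) <= 1:
        # Never delete the only remaining snapshot.
        return []
    newest_first = sorted(
        snapshots, key=lambda s: s["SnapshotCreateTime"], reverse=True
    )
    # Skip index 0 (the latest snapshot); everything older is eligible.
    return [s["DBSnapshotIdentifier"] for s in newest_first[1:]]


if __name__ == "__main__":
    example = [
        {"DBSnapshotIdentifier": "staging-2024-01",
         "SnapshotCreateTime": datetime(2024, 1, 1, tzinfo=timezone.utc)},
        {"DBSnapshotIdentifier": "staging-2024-02",
         "SnapshotCreateTime": datetime(2024, 2, 1, tzinfo=timezone.utc)},
    ]
    print(snapshots_to_delete(example))
```

Keeping the selection logic separate from the AWS calls makes it easy to test without an account; the actual deletion and the SNS notification would wrap around this function.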
06 Production Server Management & Deployment Security | Linux
Ongoing management of production and staging for two separate products. Minimal setup, stable operations.
- Manage deployments and NGINX configurations; coordinate with hosting provider for system-level updates.
- Resolved CORS and React routing issues from misconfigured NGINX paths.
- Replaced Git token-based deployment with SSH deploy keys - documented in a one-page team guide, adopted on a separate Azure project for consistency.
07 SSL, DNS Recovery & Chatbot Stabilisation | AWS + DNS
Picked up a production chatbot platform mid-incident - expired SSL, broken auto-renewal, and a domain blocked by a major social platform.
- Diagnosed and fixed the failed auto-renewal mechanism; renewed certificates.
- Resolved a domain blockage that had disrupted chatbot operations.
- Set up a temporary subdomain for business continuity; decommissioned cleanly after the main domain stabilised.
- Managed full domain transition to a new subdomain, updating backend and dependent service configurations.
- Upcoming: scoping migration of chatbot logic from AWS to client's own GCP VM.
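One way to catch a silently failing auto-renewal, like the one behind the expired certificate above, is a small expiry check. The sketch below is an assumption about how such a check might be written, not the fix that was applied: the function names and the 30-day window are hypothetical. It uses the standard library's `ssl.cert_time_to_seconds` to parse a certificate's `notAfter` string.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative if expired).

    not_after uses the certificate date format, e.g. "Jan  5 09:34:43 2018 GMT".
    now is seconds since the epoch; defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def needs_renewal(
    not_after: str, renew_within_days: float = 30, now: Optional[float] = None
) -> bool:
    """True when the certificate expires within the given window."""
    return days_until_expiry(not_after, now) <= renew_within_days


if __name__ == "__main__":
    # This date is long past, so the check reports that renewal is needed.
    print(needs_renewal("Jan  5 09:34:43 2018 GMT"))
```

Run on a schedule and wired to an alert, a check like this flags a broken renewal job well before the certificate lapses.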
08 OpenShift CI/CD & Lab Automation | Confidential
Working across two internal projects - a completed OpenShift plugin platform (Backstage) and an ongoing lab provisioning system used by around 1000 engineers.
- Led end-to-end DevOps delivery for the OpenShift plugin platform (Backstage), taking it from local development to production. This effort led to the decommissioning of the legacy resource hub and development of the cluster platform to replace it.
- Implemented GitLab CI pipelines for linting, SonarQube scanning, and container image build/release automation. Supported plugin releases from v0.0.2 to v0.0.24 - stable with minimal maintenance since.
- Contributed to a monorepo-based solution enabling consistent developer deployments across internal teams.
- Actively migrating RHEL7 Lab Controllers to RHEL9 as part of a vulnerability remediation effort.
- Automated VPN connection setup - removed manual OTP and credential steps, now single-click.
- Automated recovery for 'Broken' lab machines, an operation that had been manual for over a decade. The first version was a parallel Bash/xargs script that cut lookup time from ~1 hour to minutes; it was later refactored to Python for team consistency and now runs on GitLab pipeline schedules for ongoing automated maintenance.
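The parallel lookup behind that speed-up can be sketched in Python with a thread pool, which plays the role `xargs -P` plays in a shell script. This is a simplified illustration, not the internal tool: `find_broken`, `get_status`, and the worker count are hypothetical, and the status source is passed in as a callable because the real lab API is not described here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def find_broken(
    hosts: Iterable[str],
    get_status: Callable[[str], str],
    max_workers: int = 16,
) -> List[str]:
    """Query each host's status concurrently; return hosts reported as 'Broken'."""
    host_list = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so statuses line up with host_list.
        statuses = list(pool.map(get_status, host_list))
    return [host for host, status in zip(host_list, statuses) if status == "Broken"]


if __name__ == "__main__":
    # Stand-in for a slow per-host lookup (network call, CLI query, ...).
    fake_inventory = {"lab-01": "Automated", "lab-02": "Broken", "lab-03": "Broken"}
    print(find_broken(fake_inventory, fake_inventory.__getitem__))
```

Threads suit this kind of work because each lookup mostly waits on I/O. If a lookup raises, `pool.map` re-raises the exception when its result is consumed, so a production version would likely catch and log per-host failures instead.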