Abdul Rehman

Senior Cloud & DevOps Engineer

Specializing in scalable infrastructure, platform reliability, and secure software delivery across AWS and Azure. Kubernetes, GitOps (ArgoCD, FluxCD), CI/CD, and Web3/blockchain infrastructure.

About

DevSecOps engineer with hands-on experience designing, securing, and operating multi-cloud infrastructure across AWS, Azure, and high-performance Linux environments. I focus on systems that are reliable, auditable, and cost-effective—not on buzzwords.

I build end-to-end CI/CD pipelines (GitHub Actions, Jenkins, AWS CodePipeline), deploy production workloads on Kubernetes, Docker, and Docker Swarm, and harden environments with IAM, network security, and access control. At Funavry I design scalable Kubernetes infrastructure on AWS, GitOps workflows with ArgoCD and FluxCD, and Web3/blockchain infrastructure. Previously, at IKONIC, I ran Laravel on Docker Swarm and React/Python AI applications on EKS/AKS; at Forbmax I architected DevSecOps pipelines with SonarQube, Trivy, OWASP, and ArgoCD.

I care about ownership: from design and automation to monitoring and incident response. The goal is always the same—enable fast, confident releases on platforms that stay up and stay secure.

Platforms & experience

Clouds, DevOps, AI/ML Ops, blockchain infra, and CI/CD—high-level experience across domains.

Clouds

Platforms I design and operate on.

  • AWS
  • Azure
  • DigitalOcean
  • GCP
  • Hetzner
  • Vultr
  • cPanel

DevOps & Orchestration

Production-grade infrastructure patterns.

  • Kubernetes & Docker at scale
  • Helm, Ingress (Nginx, Traefik)
  • Prometheus, Grafana, CloudWatch
  • ELK Stack, Zabbix

AI/ML Ops

Model deployment and inference infrastructure.

  • GPU workloads on EKS & AKS
  • Python AI apps & model serving
  • Containerized inference pipelines
  • ML observability & scaling

Blockchain Infrastructure

Nodes, RPC, and data pipelines.

  • Node & RPC high availability
  • Backup & secure key handling
  • API gateways & rate limiting
  • Indexing & sync pipelines

CI/CD & Security

Pipelines and security in the loop.

  • Jenkins, GitHub Actions, GitLab CI
  • Azure DevOps, CodePipeline
  • GitOps with ArgoCD and FluxCD
  • SonarQube, Trivy, OWASP

Focus & proficiency

High-level expertise across cloud, automation, and platform engineering.

Cloud & Infrastructure: 5/5
CI/CD & DevOps: 5/5
Kubernetes & Containers: 5/5
Security & IAM: 4/5
IaC (Terraform, etc.): 5/5
AI/ML Ops & GPU workloads: 4/5
Blockchain / Web3 infra: 4/5

Experience

From Linux system administration to senior DevOps: platform reliability, security, and automation across production environments.

Senior DevOps Engineer

Funavry Technologies · Jan 2026 – Present · Islamabad, Pakistan

IT services company delivering software development, blockchain, and digital solutions for global clients.

  • Designed and managed scalable Kubernetes infrastructure on AWS (ECR, ECS, EC2, RDS) with high availability and performance.
  • Implemented GitOps-based deployment workflows using ArgoCD and FluxCD for automated, reliable delivery across multiple environments.
  • Optimized cluster scalability with Karpenter and KEDA for dynamic node provisioning and event-driven workload scaling.
  • Built and maintained secure CI/CD pipelines with GitHub Actions and AWS CodePipeline (build, test, deployment automation).
  • Implemented service mesh with Istio and Kiali for traffic observability and service monitoring within Kubernetes clusters.
  • Deployed and supported Web3 / blockchain-based infrastructure for reliable operation of distributed workloads in cloud.
  • Strengthened cloud security: IAM policies, RBAC, network security across AWS; Defguard VPN for secure remote access.
  • Developed automation and operational workflows using AWS Lambda for infrastructure management and scaling.

DevOps Engineer

IKONIC · Mar 2025 – Jan 2026 · Islamabad, Pakistan

Custom web and mobile software for scalable digital products.

  • Designed, deployed, and maintained Laravel/PHP applications on Docker Swarm with high availability and scalable orchestration.
  • Built and supported React/Python AI applications and GPU-based AI bots on AWS EKS and Azure AKS.
  • Implemented CI/CD pipelines with GitHub Actions (automated build, security checks, multi-environment deployments).
  • Managed and secured cloud infrastructure across AWS and Azure (ECS, EKS, RDS, Amplify); network security, firewalls, access controls.
  • Optimized containerized workloads for performance, cost efficiency, and operational reliability in production.

DevOps Engineer

Forbmax · Aug 2023 – Mar 2025 · Islamabad, Pakistan

Cloud infrastructure and DevOps-focused technology company.

  • Architected end-to-end Jenkins pipelines: checkout → build → unit tests → SonarQube → Trivy/OWASP → artifact → image → integration tests → deployment.
  • Implemented DevSecOps at every pipeline stage; GitOps with ArgoCD for declarative Kubernetes deployments.
  • Deployed and automated containerized applications on Kubernetes with Docker and Helm for scalable delivery.
  • Enhanced observability with Prometheus and Grafana for real-time monitoring and alerting.
  • Configured Ingress (Nginx, Traefik), autoscaling, resource tuning; managed EC2, ECR, IAM, and AWS security best practices.

Linux System Administrator

Onyx Tech · 2022 – 2023 · Lahore

Enterprise Linux and infrastructure operations.

  • Administered RHEL 7/8: system configurations, security patches, updates (YUM/RPM); LVM, partitioning, RAID for storage.
  • Configured and maintained Apache, DNS, NFS, FTP in production; backup and recovery with rsync and tape backups.
  • Monitored and tuned system performance (top, htop, iostat, netstat, pidstat, custom scripts); managed special permissions (SUID, SGID, sticky bits).
  • Network configuration: IP addressing, routing, firewall rules (iptables).
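
The special permission bits mentioned in the list above (SUID, SGID, sticky) can be audited with a short script. The following is a minimal sketch using Python's standard `os` and `stat` modules; the function names `special_bits` and `find_special` are illustrative and not taken from any script used at Onyx Tech.

```python
import os
import stat


def special_bits(mode):
    """Return the names of the special permission bits set in a st_mode value."""
    names = []
    if mode & stat.S_ISUID:
        names.append("suid")
    if mode & stat.S_ISGID:
        names.append("sgid")
    if mode & stat.S_ISVTX:
        names.append("sticky")
    return names


def find_special(root):
    """Yield (path, bits) for entries under root that have any special bit set."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are inspected, not followed
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            bits = special_bits(mode)
            if bits:
                yield path, bits
```

For example, `special_bits(0o4755)` reports the SUID bit, and `list(find_special("/usr/bin"))` would list binaries that carry any of the three bits on a given host.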

Projects

Problem, solution, and impact—architecture-focused.

Forbmax

Multi-cloud CI/CD & DevSecOps pipeline

Jenkins, SonarQube, Trivy, OWASP, ArgoCD, Kubernetes, Helm

Problem: Manual deployments and inconsistent quality/security gates.

Solution: End-to-end pipeline: checkout → build → tests → SonarQube → Trivy/OWASP → image → integration tests → deployment; GitOps with ArgoCD.

Impact: Automated, repeatable releases with security and quality at every stage.
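
The gating behaviour of such a pipeline can be sketched in a few lines: stages run in the order listed in the Solution, and the first failing gate stops everything after it. This is an illustrative model only, not the Jenkins pipeline itself; `run_pipeline` and the stage identifiers are hypothetical names.

```python
# Stage names mirror the order in the Solution line; the identifiers are illustrative.
STAGES = [
    "checkout",
    "build",
    "tests",
    "sonarqube",
    "trivy_owasp",
    "image",
    "integration_tests",
    "deployment",
]


def run_pipeline(stages, runner):
    """Run stages in order and stop at the first failing gate.

    `runner` is any callable that takes a stage name and returns True on success.
    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name in stages:
        if not runner(name):
            return completed, name  # later stages, including deployment, never run
        completed.append(name)
    return completed, None
```

With a runner that fails the security scan, `run_pipeline(STAGES, lambda s: s != "trivy_owasp")` completes the first four stages and reports `trivy_owasp` as the failed gate, so no image is built or deployed.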

IKONIC

GPU AI workloads on EKS & AKS

AWS EKS, Azure AKS, Docker, GitHub Actions

Problem: AI bots and Python models needed reliable, scalable deployment.

Solution: Containerized apps and GPU-based bots on managed Kubernetes; GPU node pools, resource limits, multi-environment promotion.

Impact: Inference and training run reliably in production with clear rollout and scaling.
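
As a rough illustration of GPU scheduling with resource limits, the sketch below builds a Kubernetes Pod manifest as a plain Python dict. `nvidia.com/gpu` is the extended resource name exposed by the NVIDIA device plugin; the taint key on the GPU node pool, the default CPU and memory values, and the function name `gpu_pod_manifest` are assumptions made for the example.

```python
import json


def gpu_pod_manifest(name, image, gpus=1, cpu="4", memory="16Gi"):
    """Build a Pod manifest that requests GPUs and tolerates a GPU node taint."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumed taint on the GPU node pool: nvidia.com/gpu with effect NoSchedule
            "tolerations": [
                {"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}
            ],
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # GPUs are requested through limits
                        "limits": {"nvidia.com/gpu": gpus, "cpu": cpu, "memory": memory}
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical image name, for illustration only
    print(json.dumps(gpu_pod_manifest("ai-bot", "example.registry/ai-bot:latest"), indent=2))
```

The printed JSON can be applied with `kubectl apply -f -`, since Kubernetes accepts JSON manifests as well as YAML.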

IKONIC

Laravel + Docker Swarm production stack

Docker Swarm, Laravel, PHP, AWS/Azure

Problem: Laravel apps required high availability and easy updates.

Solution: Docker Swarm orchestration with health checks, rolling updates, and cloud networking/security groups.

Impact: Stable, scalable web stack with minimal downtime on releases.
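
A simplified model of how a rolling update proceeds: Swarm updates tasks in batches whose size is set by the update parallelism (`--update-parallelism` on `docker service update`), so part of the service keeps serving while each batch restarts. The helper below only models the batching; it ignores update delay, health-check monitoring, and failure actions, and `rolling_batches` is a hypothetical name.

```python
def rolling_batches(replicas, parallelism):
    """Split task slots 1..replicas into the batches a rolling update would touch."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    if parallelism < 1:
        # Swarm treats 0 as "update everything at once"; this sketch does not model that
        raise ValueError("parallelism must be at least 1")
    slots = list(range(1, replicas + 1))
    return [slots[i:i + parallelism] for i in range(0, replicas, parallelism)]
```

For a service with 5 replicas and a parallelism of 2, the update touches slots `[1, 2]`, then `[3, 4]`, then `[5]`, so at least 3 replicas stay on the previous version while any one batch is being replaced.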

Forbmax

Observability & alerting stack

Prometheus, Grafana, Kubernetes

Problem: Limited visibility into cluster and application health.

Solution: Prometheus metrics, Grafana dashboards, and alerting for SLOs and failures.

Impact: Real-time visibility; issues caught before they affect users.
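
One common way to alert on SLOs is to compute an error-budget burn rate: the observed error ratio divided by the ratio the SLO allows. The sketch below shows the arithmetic only; the paging threshold of 14.4 is an assumed example value, not a setting taken from this project, and both function names are illustrative.

```python
def burn_rate(errors, total, slo_target):
    """Return how fast the error budget is being consumed (1.0 = exactly on budget)."""
    if total <= 0:
        return 0.0  # no traffic, nothing burned
    budget = 1.0 - slo_target  # allowed error ratio, e.g. 0.001 for a 99.9% SLO
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return (errors / total) / budget


def should_page(errors, total, slo_target, threshold=14.4):
    """Page when the burn rate exceeds the (assumed) threshold."""
    return burn_rate(errors, total, slo_target) > threshold
```

With a 99.9% SLO, 200 failed requests out of 10,000 give a burn rate of about 20, which would page under the assumed threshold; 10 failures out of 10,000 give a burn rate of about 1, which would not.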

Infrastructure & DevOps Projects

Terraform-driven platforms: modules, remote state, env separation, and security by default.

Secure Enterprise CI/CD Platform

Terraform, Jenkins, HashiCorp Vault
Problem

CI/CD ran on shared runners with no secret isolation, inconsistent env parity, and no audit trail for pipeline config changes.

Architecture

Terraform-provisioned Jenkins controllers per env (dev/stage/prod) in private subnets; Vault for dynamic secrets (DB, API keys). State in S3 + DynamoDB; Jenkins config and job DSL versioned and applied via Terraform null_resource + local-exec where needed.

Terraform

Modules: vpc, jenkins-controller, vault-auth, s3-backend. Remote state (S3 + DynamoDB lock). Providers: aws, hashicorp/vault. Env separation via workspaces + tfvars per env. IAM roles for Jenkins nodes (least-privilege); no long-lived creds in config.
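
Environment separation with workspaces and per-env tfvars usually comes down to two CLI calls per run. The helper below is a sketch that builds those command lines; `terraform workspace select` and the `-var-file` flag are standard Terraform CLI, while the `envs/<env>.tfvars` layout and the name `terraform_commands` are assumptions for illustration.

```python
ENVS = ("dev", "stage", "prod")


def terraform_commands(env, action="plan", tfvars_dir="envs"):
    """Return the argv lists that select a workspace and run plan/apply for one env."""
    if env not in ENVS:
        raise ValueError(f"unknown environment: {env}")
    if action not in ("plan", "apply"):
        raise ValueError(f"unsupported action: {action}")
    return [
        ["terraform", "workspace", "select", env],
        # Assumed layout: one tfvars file per environment under tfvars_dir
        ["terraform", action, f"-var-file={tfvars_dir}/{env}.tfvars"],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order; rejecting unknown environment names up front avoids planning against the wrong workspace by typo.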

Security

Secrets in Vault only; Jenkins nodes assume IAM roles. Network: private subnets; security groups allow only required egress. Pipeline logs and config changes are recorded in CloudTrail/AWS Config. RBAC in Jenkins and Vault.

Impact

Single pipeline definition, three isolated envs, zero secrets in code. Rollbacks and audits are traceable; releases are repeatable and compliant.

Multi-Cloud Kubernetes Platform

Terraform, AWS EKS, GKE
Problem

Teams needed a single pattern to run workloads on AWS and GCP without duplicating ops playbooks or losing env parity.

Architecture

Terraform modules abstract EKS and GKE: node pools, add-ons (CoreDNS, metrics), IRSA / workload identity. Shared modules for VPC (AWS), VPC (GCP), and app-level Helm releases. State per cloud + per env in S3 and GCS with locking.

Terraform

Modules: eks-cluster, eks-nodegroup, gke-cluster, gke-nodepool, vpc-aws, vpc-gcp, helm-release. Backend: S3 + DynamoDB lock (AWS), GCS bucket with built-in state locking (GCP). Providers: aws, google. Env separation via directories (envs/dev, envs/prod) and tfvars. IAM roles and service accounts scoped per cluster.

Security

Private endpoints where possible; node groups in private subnets. Pod-level IAM (IRSA / workload identity). Encryption at rest and in transit; security groups and firewall rules from Terraform. No broad wildcards.
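
A "no broad wildcards" rule can be checked mechanically before a policy is applied. The sketch below scans an AWS IAM policy document (standard `Statement` / `Effect` / `Action` / `Resource` structure) and reports Allow statements that use `*`. What counts as "broad" here (an action of `*` or `service:*`, or a resource of `*`) is an assumption for the example, and `broad_wildcards` is a hypothetical helper.

```python
def _as_list(value):
    """IAM allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def broad_wildcards(policy):
    """Return (statement_index, field, value) for each broad wildcard in Allow statements."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue  # Deny statements with wildcards are not a risk here
        for action in _as_list(stmt.get("Action")):
            if action == "*" or action.endswith(":*"):
                findings.append((index, "Action", action))
        for resource in _as_list(stmt.get("Resource")):
            if resource == "*":
                findings.append((index, "Resource", resource))
    return findings
```

A check like this could run in CI against rendered policy JSON and fail the build when the returned list is non-empty.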

Impact

Same Terraform patterns for EKS and GKE; new envs are a copy of tfvars and a plan/apply. Reduced drift and consistent networking and RBAC across clouds.

AI Infrastructure & MLOps Platform

Terraform, GPU node pools, model serving
Problem

ML teams needed dedicated GPU capacity, reproducible envs for training and inference, and a clear path from experiment to production without manual infra tickets.

Architecture

Terraform-managed GPU node pools (EKS/GKE) with taints and tolerations; separate namespaces for training vs serving. S3/GCS buckets for artifacts and models; optional Kubeflow or custom operators applied via Helm. Inference deployed as K8s services with HPA and resource limits.
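
For the HPA part, the replica count follows the formula in the Kubernetes documentation: desired = ceil(current × currentMetric / targetMetric), skipped when the ratio is within a tolerance of 1.0 and clamped to the configured bounds. The sketch below reproduces that arithmetic; the 0.1 tolerance matches the documented default, and the function name and argument names are illustrative.

```python
import math


def desired_replicas(current, metric, target, min_replicas, max_replicas, tolerance=0.1):
    """Compute the replica count an HPA would aim for, given one metric."""
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = metric / target
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to target; no scaling
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))
```

For example, 4 inference pods averaging 90% utilisation against a 60% target scale to 6; at 63% the ratio is within tolerance and the count stays at 4.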

Terraform

Modules: gpu-nodegroup, artifact-bucket, namespace-ml, helm-ml-stack. Remote state S3/GCS + lock. Providers: aws, google, helm, kubernetes. Env separation: dev/stage/prod via tfvars and workspace or directory. IAM and RBAC for data access and node identity.

Security

GPU nodes in private subnets; no direct internet. Model artifacts in encrypted buckets; access via IAM/service account. Network policies restrict pod-to-pod; secrets for model registry and API keys in Vault or K8s secrets (encrypted).

Impact

ML teams get GPU capacity and namespaces on demand; training and serving envs are reproducible. Model promotion is a config change and apply, not a ticket queue.

Blockchain Infrastructure Platform

Terraform, nodes, RPC infra
Problem

Node operators needed repeatable, secure deployment for full nodes and RPC endpoints with high availability and clear backup/restore.

Architecture

Terraform provisions VMs or K8s workloads per chain; dedicated storage and networking. RPC layer (load balancer + health checks) in front of node group; optional read replicas. State and backups in object storage with lifecycle rules.
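
A health check for an RPC layer often has to consider sync status, not just reachability: a node that answers but lags behind the chain tip should not receive traffic. The sketch below is one possible rule, assuming each node reports its latest block height; the lag threshold of 3 blocks and the name `healthy_nodes` are assumptions, not values from this platform.

```python
def healthy_nodes(heights, max_lag=3):
    """Return the sorted names of nodes within max_lag blocks of the highest seen height.

    `heights` maps node name -> latest block height, or None if the node is unreachable.
    """
    reachable = {name: h for name, h in heights.items() if h is not None}
    if not reachable:
        return []
    tip = max(reachable.values())  # best known chain tip among our own nodes
    return sorted(name for name, h in reachable.items() if tip - h <= max_lag)
```

Given heights of 100, 99, and 90 plus one unreachable node, only the first two nodes stay in rotation. Note that this compares nodes against each other, so it cannot detect the case where every node is stalled at the same height.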

Terraform

Modules: node-instance, rpc-lb, storage-volume, backup-bucket. Remote state S3/GCS + DynamoDB/GCS lock. Providers: aws or google, optional helm for K8s-based nodes. Env separation via tfvars (mainnet vs testnet). IAM/service accounts for node and backup access only.

Security

Nodes in private subnets; RPC exposed via LB with rate limiting and optional WAF. Key material in Vault or KMS; no keys in Terraform state. Backups encrypted; access logged. Network segmentation between node and RPC layers.
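
Rate limiting in front of RPC endpoints is commonly implemented as a token bucket. The class below is a minimal, single-process sketch of that technique with an injectable clock so it can be tested deterministically; in practice the limit would more likely be enforced by the load balancer, gateway, or WAF, and the class and parameter names here are illustrative.

```python
class TokenBucket:
    """Token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.updated = now

    def allow(self, now, cost=1.0):
        """Return True and spend `cost` tokens if the request fits, else False."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In a live service the caller would pass `time.monotonic()` as `now` and keep one bucket per API key or client IP.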

Impact

New chains or regions are a module instantiation and apply. Node and RPC availability improved; recovery from backup is documented and tested.

Chaos Engineering & Resilience Platform

Terraform, observability, fault injection
Problem

Outages were found in production; there was no systematic way to test failure modes or validate SLOs under load and faults.

Architecture

Terraform defines the observability stack (Prometheus, Grafana, alerting) and the chaos namespace and RBAC. Fault injection runs as scheduled or on-demand jobs (e.g. Litmus, Chaos Mesh, or custom pods). Dashboards and alerts are code; runbooks referenced in Terraform outputs.

Terraform

Modules: prometheus-stack, grafana-dashboards, alertmanager-config, chaos-namespace, chaos-rbac. Remote state S3 + DynamoDB. Providers: aws, kubernetes, helm. Env separation: dev/stage/prod with separate state and tfvars. IAM and K8s RBAC limit who can run chaos and what they can target.

Security

Chaos scope limited by RBAC and network policies; blast radius contained to non-prod or approved targets. Observability data access controlled; no PII in metrics. Audit log of who ran which experiment and when.
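
The blast-radius and audit requirements can be pictured as a small guard that runs before any experiment is launched. This sketch is illustrative only: real enforcement would sit in Kubernetes RBAC and network policies as described above, and the allow-list structure, the `prod_approved` flag, and the function name are hypothetical.

```python
from datetime import datetime, timezone


def authorize_experiment(user, env, namespace, allowed, prod_approved=False, now=None):
    """Decide whether an experiment may run and return an audit record for it.

    `allowed` maps environment name -> set of namespaces that may be targeted.
    Production targets additionally require explicit approval.
    """
    in_scope = namespace in allowed.get(env, set())
    permitted = in_scope and (env != "prod" or prod_approved)
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "user": user,
        "env": env,
        "namespace": namespace,
        "permitted": permitted,
        "at": timestamp.isoformat(),
    }
```

Every call produces a record whether or not the run is permitted, so denied attempts show up in the audit log alongside executed experiments.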

Impact

Failure modes are exercised before production; SLOs and runbooks are validated. Incidents decreased and MTTR improved because failure paths are known and documented.