pmady/README.md

Hi there, I'm Pavan 👋

LinkedIn · Blog · Medium · Dev.to · ResearchGate

About Me

Senior Cloud Platform Engineer at W.W. Grainger, Inc. and CNCF Golden Kubestronaut. Deep expertise in cloud-native GPU/AI infrastructure, Kubernetes ecosystems, and platform engineering. Building open-source tools for GPU workload autoscaling, observability, and topology-aware incident response.

📊 GitHub Stats

Stats updated on 2026-04-26 01:18 UTC

๐Ÿ† Certifications

Kubestronaut

Kubestronaut - One of the elite professionals who have achieved all five Kubernetes certifications from the CNCF:

  • KCNA (Kubernetes and Cloud Native Associate)
  • CKA (Certified Kubernetes Administrator)
  • CKAD (Certified Kubernetes Application Developer)
  • CKS (Certified Kubernetes Security Specialist)
  • KCSA (Kubernetes and Cloud Native Security Associate)

🌱 Open Source Contributions

Actively contributing to CNCF (Cloud Native Computing Foundation) and ASWF (Academy Software Foundation) projects:

CNCF (Cloud Native Computing Foundation)

| Project | Description | Contributions |
| --- | --- | --- |
| Dragonfly | P2P-based file distribution and image acceleration | client#1665 - Add Hugging Face backend support with hf:// protocol, client#1673 - Add ModelScope backend support with modelscope:// protocol, d7y.io#386 - Add hf:// protocol documentation, d7y.io#398 - Add P2P-accelerated AI model downloads blog post, helm-charts#455 - Add injector support to helm chart, helm-charts#480 - Replace deprecated bitnamilegacy/mysql with bitnami/mysql |
| Kubernetes | Production-Grade Container Orchestration | #53891 - Document deployment.kubernetes.io/* annotations, #53892 - Add kubectl apply view-last-applied documentation |
| TiKV | Distributed transactional key-value database | #19225 - Add AGENTS.md for AI agent guidance |
| Volcano | Cloud-native batch scheduling for AI/HPC | #5095 - GPU NUMA topology awareness in scheduler, apis#229 - Add GPUInfo type to NumatopoSpec CRD, resource-exporter#12 - GPU NUMA topology discovery via sysfs |
| KEDA | Kubernetes Event-driven Autoscaling | keda-docs#1658 - Removing metricName from the kedadocs, #7538 - GPU/AI inference scaler architectural analysis |
| Metal³ | Bare metal host provisioning for Kubernetes | #624 - Fix redirect links in tryit.md |
| OpenTelemetry | Observability framework | #8632 - Add .NET troubleshooting page |
| kpt | Kubernetes-native packaging and resource management | #4278 - Fix kpt fn doc command for KRM functions expecting input |

ASWF (Academy Software Foundation)

| Project | Description | Contributions |
| --- | --- | --- |
| OpenColorIO | Color management library | #2229 - Add release signing workflow, #2230 - Add Dependabot configuration, #2243 - Add Vulkan unit test framework |
| OpenCue | Cloud rendering management system | #2134 - Add scheduled subscription recalculation task |
| OpenImageIO | Image processing library | #4976 - Fix IBA::compare_Yee() channel access |
| RAWtoACES | RAW to ACES image conversion | #222 - Add build developer documentation |
| xSTUDIO | Playback and review application | #186 - Fix broken build guide links |

Total: 26 PRs across 15 projects in CNCF and ASWF foundations!

Personal Projects

| Project | Description | Contributions |
| --- | --- | --- |
| keda-gpu-scaler | KEDA External gRPC Scaler for GPU/AI workloads | Native NVML metrics, DaemonSet deployment, pre-built scaling profiles (vLLM, Triton, training), Helm chart, scale-to-zero |
| otel-gpu-receiver | OpenTelemetry Collector receiver for GPU metrics | NVIDIA GPU metrics via NVML, OpenTelemetry-native, Prometheus exporter, multi-GPU support |
| kube-topology-agent | K8s topology discovery & automated root-cause analysis | Knowledge graph of cluster resources, AlertManager webhook integration, GPU workload classification, blast-radius analysis |
| Golden Kubestronaut Learning | Kubernetes certification study guides and resources | #23 - Dark mode persistence, #24 - PDF generation workflow |

🚀 Featured Projects - Looking for Contributors!

I'm actively developing these open source projects and welcome contributors of all skill levels!

🎮 KEDA GPU Scaler

KEDA External gRPC Scaler for GPU/AI workloads

  • 🎮 Native NVML - Direct GPU metrics via go-nvml
  • 🚀 Scaling Profiles - vLLM, Triton, training presets
  • DaemonSet - Per-node GPU metric collection
  • 🔄 Scale-to-Zero - GPU-aware idle detection

Tech Stack: Go, gRPC, NVIDIA NVML, Kubernetes, Helm

Referenced in KEDA #7538
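
To make the "GPU-aware idle detection" idea concrete, here is a minimal sketch in Go. It is not taken from the keda-gpu-scaler source: the `Sample` type, the `isIdle` function, and the thresholds are all hypothetical, and the readings are hard-coded rather than fetched through go-nvml. It only illustrates one plausible rule, namely treating a GPU as idle once a full window of recent samples stays below utilization and memory thresholds.

```go
package main

import "fmt"

// Sample is a hypothetical per-GPU reading. A real scaler would fill
// these values from NVML; here they are plain numbers for illustration.
type Sample struct {
	UtilizationPct float64 // GPU compute utilization, 0-100
	MemoryUsedMiB  float64 // device memory currently in use
}

// isIdle reports whether each of the last `window` samples is at or below
// both thresholds. With fewer than `window` samples it returns false, so a
// freshly started workload is not scaled to zero on thin evidence.
func isIdle(samples []Sample, window int, maxUtilPct, maxMemMiB float64) bool {
	if window <= 0 || len(samples) < window {
		return false
	}
	for _, s := range samples[len(samples)-window:] {
		if s.UtilizationPct > maxUtilPct || s.MemoryUsedMiB > maxMemMiB {
			return false
		}
	}
	return true
}

func main() {
	history := []Sample{{85, 30000}, {2, 400}, {1, 400}, {0, 400}}
	fmt.Println(isIdle(history, 3, 5, 512)) // last three samples are quiet
	fmt.Println(isIdle(history, 4, 5, 512)) // window still includes the busy sample
}
```

Requiring a full window of quiet samples trades a slower scale-down for fewer false positives between inference bursts. The actual project may use a different signal or rule.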

otel-gpu-receiver

OpenTelemetry Collector receiver for GPU metrics

  • 🔋 NVIDIA NVML - GPU utilization, memory, temperature
  • OTel Native - Standard OTLP export pipeline
  • Multi-GPU - All devices on the node
  • 📈 Prometheus - Built-in Prometheus exporter

Tech Stack: Go, OpenTelemetry Collector SDK, NVML

kube-topology-agent

K8s knowledge graph & automated root-cause analysis

  • 🗺️ Knowledge Graph - Real-time resource topology
  • Root-Cause Traversal - Graph-based incident investigation
  • 🎮 GPU Aware - Training/inference/batch classification
  • 🔔 AlertManager - Webhook integration for auto-investigation

Tech Stack: Go, Kubernetes API, Gorilla Mux, Helm
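
The blast-radius analysis listed for kube-topology-agent can be pictured as a graph traversal. The sketch below is an assumption about the general technique, not the project's actual code: the `Graph` type, the `blastRadius` function, and resource names such as `node/gpu-1` are made up for illustration. It runs a breadth-first search over "depends on me" edges to collect everything downstream of a failing resource.

```go
package main

import (
	"fmt"
	"sort"
)

// Graph maps a resource to the resources that depend on it.
// Keys like "node/gpu-1" are hypothetical identifiers.
type Graph map[string][]string

// blastRadius returns every resource reachable from start by following
// dependency edges breadth-first, excluding start itself, sorted by name.
func blastRadius(g Graph, start string) []string {
	seen := map[string]bool{start: true}
	queue := []string{start}
	var affected []string
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, next := range g[cur] {
			if !seen[next] {
				seen[next] = true
				affected = append(affected, next)
				queue = append(queue, next)
			}
		}
	}
	sort.Strings(affected)
	return affected
}

func main() {
	g := Graph{
		"node/gpu-1":        {"pod/trainer-0", "pod/inference-0"},
		"pod/inference-0":   {"service/inference"},
		"service/inference": {"ingress/api"},
	}
	fmt.Println(blastRadius(g, "node/gpu-1"))
}
```

The `seen` map keeps the traversal safe on cyclic graphs, which matters because real cluster topologies can contain circular references.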

๐Ÿค How to Contribute

  1. Pick a project that interests you
  2. Check Issues labeled good first issue or help wanted
  3. Fork & Clone the repository
  4. Submit a PR - I review all PRs promptly!

All contributions welcome:

  • 💻 Code contributions
  • 📖 Documentation improvements
  • 🐛 Bug reports
  • 💡 Feature suggestions
  • ⭐ Star the repos!

More projects: KubeAI Autoscaler · Ingress2Gateway · LLMOps

โ˜๏ธ Cloud Platforms

  • AWS - Primary cloud platform for production workloads
  • Azure - Previous experience with enterprise deployments

🔧 Technologies & Tools

Container Orchestration & GitOps

  • Kubernetes - Production cluster management, multi-tenancy, and workload orchestration
  • ArgoCD - GitOps-driven continuous delivery and application lifecycle management
  • Docker - Container image building and runtime management
  • Crossplane - Kubernetes-native infrastructure provisioning and composition

Observability

  • Prometheus & Grafana - Metrics collection, alerting, and dashboard visualization
  • Splunk - Enterprise log aggregation and security analytics
  • Datadog - Full-stack monitoring and application performance management
  • OpenTelemetry - Vendor-neutral distributed tracing and telemetry collection

Policy Management

  • Kyverno - Kubernetes-native policy engine for security and compliance
  • OPA (Open Policy Agent) - Unified policy enforcement across the stack

CI/CD

  • GitHub Actions - Cloud-native workflow automation and CI/CD pipelines
  • Jenkins - Enterprise CI/CD automation and pipeline orchestration
  • Flux - GitOps toolkit for Kubernetes continuous delivery
  • UrbanCode Deploy - Enterprise application release automation

Big Data

  • PrestoDB & Trino - High-performance distributed SQL query engines for analytics
  • Apache Superset - Modern data exploration and business intelligence platform
  • Alluxio - Unified data orchestration for compute and storage
  • Jupyter Notebooks - Interactive data science and machine learning workflows

🤖 Interests

Deeply interested in the convergence of AI/ML and Kubernetes - enabling organizations to run machine learning workloads at scale on cloud-native infrastructure. Exploring MLOps practices, GPU scheduling, and AI platform engineering.

๐Ÿ“ Blog

Sharing insights on DevOps best practices, Kubernetes deep-dives, and cloud-native architecture:

๐Ÿ‘‰ pavanmadduri.wordpress.com

💬 Let's Connect

Always open to connecting with fellow engineers and enthusiasts in the cloud-native and AI/ML space!

  • 💬 Collaborate - Open an issue or discussion on any of my repositories
  • 🤝 Partner - Let's contribute to CNCF or ASWF projects together
  • 👋 Network - Happy to exchange ideas and share experiences

Let's build something great together! 🚀

Pinned

  1. llmops

    🚀 The Ultimate Curated List of LLMOps Tools, Frameworks, and Resources - A comprehensive collection of the best tools for Large Language Model Operations

    Shell · 8 stars · 4 forks

  2. pmady

    8 stars · 1 fork

  3. golden-kubestronaut-learning

    A comprehensive learning resource for achieving Kubestronaut and Golden Kubestronaut status through CNCF certifications

    Markdown · 12 stars · 7 forks

  4. kubeai-autoscaler

    Go · 10 stars · 5 forks

  5. ingress2gateway

    Convert Kubernetes Ingress objects to Gateway API resources - Web GUI and REST API

    Python · 7 stars · 3 forks

  6. keda-gpu-scaler

    KEDA External gRPC Scaler for GPU workloads: native NVML metrics via DaemonSet, no Prometheus required

    Go · 16 stars · 10 forks