Responsibilities:
- Lead deployment and configuration of on-prem Linux-based platforms for AI workloads
- Design and configure KVM virtualization
- Architect and implement GPU-enabled environments for production LLM inference
- Deploy and operate containerized LLM serving stacks in production
- Design and validate GPU utilization, isolation, and health monitoring
- Integrate deployments with CI/CD pipelines and security controls
- Apply security hardening, RBAC, encryption, and audit-ready configurations
- Design high-availability (HA) and disaster-recovery (DR) strategies
- Lead troubleshooting, performance tuning, and stabilization