About this role
Emergent builds autonomous coding agents that replace traditional software development by generating, testing, and deploying production applications directly from plain-language intent. Our systems run in production at global scale and are used to build millions of real applications.
Since public launch, Emergent has reached $100M ARR in 8 months. 6M+ users across 190+ countries have built 6.5M+ applications on Emergent. We've raised $100M+, backed by Khosla Ventures, SoftBank, Google, Lightspeed, Prosus, Together, and Y Combinator.
We're solving the hard part of AI-driven software creation: correctness, reliability, security, and scale in real production systems. The team is built by repeat founders, Olympiad medalists, IIT & IIM alumni, and leaders from Google, Amazon, and Dropbox.
We're hiring builders who want ownership, speed, and impact at global scale.
What You'll Be Responsible For
Platform & Infrastructure
• Maintain stability of our platform consisting of distributed microservices closely interacting with Kubernetes and cloud providers (GCP, AWS)
• Manage Kubernetes workloads with ArgoCD (GitOps) — deploy, monitor, and troubleshoot application syncs, resource trees, and rollouts
• Debug and resolve complex Kubernetes issues across clusters
• Manage CDN and edge infrastructure (Cloudflare) for performance, caching, and traffic management
• Automate infrastructure lifecycle operations and workflows
Observability & Incident Response
• Own the observability stack: Grafana (dashboards, Loki logs, Prometheus metrics), New Relic (APM, golden metrics, transaction analysis)
• Enhance monitoring, alerting, and distributed tracing across services
• Participate in the on-call rotation via PagerDuty, handle incident response, and perform root cause analysis
• Proactively identify reliability risks before they become incidents
AI Agent Infrastructure
• Support the platform that runs AI agent workloads — job scheduling, trajectory tracking, environment provisioning, deployments and cost attribution
• Develop Kubernetes controllers and operators to extend platform capabilities for agent orchestration
Collaboration & Internal Tooling
• Work closely with product and backend teams to ensure platform scalability and reliability
• Build internal tools, automate workflows, and integrate systems to improve team productivity
• Stay current with Kubernetes releases, CNCF ecosystem updates, and cloud-native best practices
What We're Looking For
Core Requirements
• 4+ years of software/platform engineering experience with production systems
• Strong proficiency in Go or Python — you write production code in at least one daily
• Hands-on experience building and deploying services on Kubernetes — not just YAML, you've developed something that runs on K8s
• Experience with GitOps tooling (ArgoCD, Flux, or similar)
Systems Fundamentals
• Strong networking and DNS fundamentals — TCP/IP, HTTP, load balancing, DNS resolution, TLS, and debugging connectivity issues
• Solid Linux/OS fundamentals: process management, filesystem, memory, systemd, and comfort debugging with tools like strace, tcpdump, and netstat
Data & Messaging Infrastructure
• Relational databases — experience with PostgreSQL, MySQL, or similar; indexing, query optimization, replication, and backup/restore procedures
• NoSQL databases — familiarity with MongoDB, DynamoDB, Redis, or similar for document/key-value workloads
• Caching — experience with Redis, Memcached, or similar for application and infrastructure-level caching
• Message queues & streaming — hands-on with Kafka, SQS, RabbitMQ, or similar for event-driven architectures
• Strong SQL skills for debugging and operational queries
Infrastructure & Observability
• Comfortable with the CNCF ecosystem — Helm, Kustomize, cert-manager, Ingress controllers, CNI/CSI interfaces
• Hands-on with at least one observability stack (Grafana/Prometheus/Loki, New Relic, Datadog, or similar)
• Familiarity with GCP and/or AWS — managed Kubernetes (GKE/EKS), networking, IAM, storage, and cloud-native services (SES, SQS, S3, etc.)
• Experience with CDN/edge platforms (Cloudflare, CloudFront, or similar)
Nice to Have
• Experience building Kubernetes Operators (kubebuilder, operator-sdk, or controller-runtime)
• Experience tuning Kubernetes core components (API server, kubelet, scheduler)
• Familiarity with AI/LLM infrastructure — token management, cost tracking, agent orchestration
• Experience with CI/CD pipelines (GitHub Actions, automated testing, deployment pipelines)
• Infrastructure as Code experience (Terraform, Pulumi, or similar)
• Previous work on large-scale distributed systems or platform-as-a-service
• Startup experience — you thrive in fast-paced, ambiguous environments
What You're Like
• You're a generalist who can context-switch between debugging a K8s deployment, setting up a Grafana alert, and configuring CDN rules — all in the same day
• You enjoy solving complex infrastructure challenges and automating away toil
• You dig deep — when something breaks, you find the root cause, not just the workaround
• You communicate clearly and can collaborate effectively in a fast-moving, distributed team
Tech Stack
We don't require previous experience with our entire stack, but enthusiasm for learning is key.
Go · Python · Kubernetes · ArgoCD · Helm · GCP · AWS · Cloudflare · Grafana · Prometheus · Loki · New Relic · PagerDuty · PostgreSQL · MongoDB · Redis · Kafka · GitHub
Why Emergent Labs
• YC S24-backed, with strong investor support
• Building at the frontier of AI-powered software creation
• Small team, high ownership, real impact from day one
Benefits and Perks:
• 401(k)
• Health, dental, and vision insurance
• Unlimited Paid Time Off: take the time you need to recharge and come back refreshed
• Flexible Working Hours: work arrangements that fit your life and commitments
Let's build the future of software together.
About Emergent Labs
Emergent Labs is hiring for the Software Engineer - Infrastructure role.