Manus AI

DevOps & SRE Engineer

Location: Singapore
Posted: 2025-08-27
Type: Full-time
Experience: 2+ years
Source: Ashby

Key Responsibilities

Cluster Operations & Management

- Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units

- Ensure optimal performance, scalability, and reliability of distributed systems

Infrastructure Platform Development

- Design, build, and enhance infrastructure operation platforms

- Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging

- Drive platform standardization and automation initiatives

High Availability & Reliability

- Ensure maximum uptime for production services through proactive monitoring and incident response

- Continuously optimize service architecture, deployment strategies, and operational processes

- Implement and maintain SLA/SLO frameworks and reliability engineering practices

Automation & Process Improvement

- Lead the development of automated operations and maintenance systems

- Create self-service tools and workflows to improve team productivity

- Establish best practices for infrastructure as code and configuration management

Required Qualifications

Experience & Education

- 2+ years of hands-on experience in Systems Operations, DevOps, or Site Reliability Engineering (SRE)

- Bachelor's degree in Computer Science, Engineering, or related technical field preferred

Cloud & Infrastructure

- Experience with public cloud platforms (AWS, Azure, or GCP) is highly valued

- Strong understanding of large-scale internet architecture and distributed systems

- Proven experience with infrastructure monitoring, logging, and observability tools

Technical Skills

- Proficiency in scripting and automation using Shell, Python, or similar languages

- Strong knowledge of containerization technologies (Kubernetes, Docker)

- Hands-on experience operating production-grade container clusters and managing CI/CD pipelines

- Strong familiarity with common infrastructure components: Nginx, MySQL, Redis, Kafka, Elasticsearch

Advanced Networking (Preferred)

- Experience with Service Mesh architectures, Cilium CNI, and eBPF technologies

- Understanding of network security, load balancing, and traffic management

- Knowledge of cloud-native networking patterns and best practices

About Manus AI

Manus is a general AI agent that bridges minds and actions: it doesn't just think, it delivers results. Manus excels at various tasks in work and life, getting everything done while you rest. At Manus AI, we offer a highly collaborative and innovative environment where experts across engineering, research, and business come together to push the boundaries of AI applications. If you're passionate about cutting-edge technology and making a real impact, we’d love to hear from you!

Contact us: [email protected]