DevOps Engineer Manager
About Us
Nalanda is a leading Spanish multinational dedicated to bridging the gap between large companies and their suppliers through an innovative digital platform. Our platform streamlines business processes such as document exchange, purchases, invoices, and vital business information. We specialize in coordinating activities between contractors and their suppliers, minimizing costs, time, and risks, while fostering transparent and effective business relationships.
We are a dynamic, forward-thinking company committed to building an inclusive workplace where talent thrives. At Nalanda, we believe that the development of people drives organizational success. Join us as we continue to build a culture of growth, inclusivity, and excellence.
We are also part of Once For All, an international group with a presence in the UK, France, and Latin America, and more than 1,000 people working on digital solutions for supply chain management and regulatory compliance.
Role Summary
We are looking for an experienced and passionate Team Lead to head our DevOps team. You will lead the team while staying hands-on technically, focusing on platform engineering, infrastructure automation, observability, identity, and support for development teams. Your work will play a vital role in ensuring the reliability, scalability, security, and performance of the cloud infrastructure that powers our digital platforms and serves thousands of users.
You will collaborate closely with development teams, the SecOps team, operations, and other team leads to deliver high-quality platform solutions and maintain a robust, efficient infrastructure that enables our core business across multiple products. You should feel comfortable diving into infrastructure code and supporting development teams at a code level when needed.
Key responsibilities
Lead the DevOps team in alignment with company OKRs and head-of-area guidelines, staying hands-on technically while owning the team’s delivery cadence, quality, and outcomes.
Drive delivery and accountability: set and track measurable goals, run regular 1:1s and feedback, and continuously raise the team’s velocity and engineering standards.
Surface blockers and risks early, balance reactive operational load against roadmap work, and keep stakeholders informed on progress and trade-offs.
Facilitate daily work for development teams by removing platform blockers and improving developer experience through self-service capabilities, automation, and direct contributions to infrastructure-related code when needed.
Design, implement, and maintain AWS platform infrastructure using Terraform across multiple environments (dev/pre/prod) and multiple products, with proper security and access controls.
Lead by example through clean infrastructure code, technical ownership, and strong platform engineering practices.
Operate and evolve the identity and access platform (Keycloak, OAuth2/OIDC, API authentication and authorization) and zero-trust access patterns.
Implement and maintain comprehensive platform monitoring, alerting, and observability solutions (CloudWatch, SNS, Dynatrace) to ensure system reliability.
Modernize platform automation and CI/CD (GitLab CI, Atlantis/GitOps), including migrating legacy job execution (Rundeck) to containerized batch workloads.
Partner closely with the SecOps team to implement and operationalize security requirements on the platform — network controls (NACLs, Transit Gateway, peering), threat-detection pipelines, audit-remediation actions, and access provisioning.
Own relational database operations alongside the team (PostgreSQL/EnterpriseDB and MySQL): high availability, Multi-AZ, failover, major-version upgrades, and performance tuning.
Drive platform cost optimization and resource management (instance/family right-sizing, scaling strategy, data-transfer awareness).
Support the growth of the team by mentoring peers, guiding code reviews, and fostering a culture of technical excellence in platform engineering.
Collaborate with backend developers to align platform evolution with application needs, contributing to code reviews and shared architectural decisions.
Qualifications
5+ years of professional experience in platform engineering, DevOps, SRE, or infrastructure engineering, including large-scale cloud environments.
Experience leading and growing a technical team — setting expectations, managing performance, and improving delivery predictability.
Expert knowledge of AWS services (ECS, RDS, VPC, CloudWatch, SNS, etc.) and multi-account AWS Organizations.
Deep experience with Terraform and infrastructure as code, including modular architecture and multi-environment management.
Strong relational database operations experience (PostgreSQL/EnterpriseDB, MySQL): HA, Multi-AZ, failover, major-version upgrades, and performance tuning.
Solid Kubernetes/EKS experience, with the ability to mature a growing EKS platform.
Proven track record delivering high-performance, secure, and scalable platform infrastructure.
Proficiency with containerization (Docker), orchestration platforms, and CI/CD pipelines.
Experience with GitOps workflows and Atlantis for automated deployments.
Comfortable navigating and improving legacy infrastructure while advocating for long-term platform quality.
Experience with platform monitoring and observability tools, automation scripting, and version control (Git).
Comfortable switching context between infrastructure automation and application-level performance/debugging.
Understanding of platform security best practices, and experience working alongside security/SOC teams to implement controls and remediation.
Experience using AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude).
High proficiency in English (spoken and written).
Bonus points for
Knowledge of PostgreSQL/EnterpriseDB administration and optimization at an advanced level.
Java coding experience and the ability to investigate/debug application code without developer assistance, understanding how infrastructure choices impact application behavior and developer velocity.
Familiarity with zero-trust networking solutions.
Knowledge of disaster recovery planning and performance optimization best practices.
Experience acting as a technical bridge between infrastructure and application teams, including joint debugging, logging, and performance optimization.
What we offer
🤸🕒 Flexible working time.
🏞️💻 Teleworking.
☀️📅 Intensive working time in summer.
🍽️💳 Flexible benefits.
🫂🔝 A dynamic and inclusive workplace with opportunities for growth and development.
✨🏢 The chance to make a significant impact on our organizational culture and talent strategy.
💰🎁 Competitive compensation package, with a salary range of €50,000 – €85,000 gross per year, depending on experience and fit.
- Department: Dirección Tecnología Grupo
- Locations: Nalanda Global - Madrid
- Remote status: Hybrid