Stackable Cluster Platform Administrator

Experience: 5+ years
Permanent
Developer
Kuala Lumpur, Malaysia

About the Role

We are seeking a skilled Stackable Cluster Platform Administrator to manage, optimize, and secure our data and compute infrastructure. The successful candidate will administer Kubernetes clusters, the Hadoop ecosystem (HDFS, Hive), Linux servers, virtualized environments, and network/firewall configurations. The role requires strong troubleshooting skills, performance-tuning expertise, and hands-on experience with enterprise-scale distributed systems.

Key Responsibilities

  • Cluster & Platform Administration
    • Install, configure, and manage Kubernetes clusters and related components.
    • Administer and monitor Hadoop ecosystem services (HDFS, Hive, YARN, etc.).
    • Ensure stability, scalability, and high availability of the data platform.
  • Linux & VM Management
    • Administer Linux servers (Red Hat, CentOS, Ubuntu), including patching, upgrades, and performance tuning.
    • Manage virtualized infrastructure (VMware/Hyper-V/KVM) for compute and storage workloads.
  • Security & Networking
    • Configure and manage firewall policies, VPNs, and network security rules.
    • Implement authentication, authorization, and encryption best practices across the platform.
  • Monitoring & Support
    • Set up monitoring, alerting, and logging for proactive issue resolution.
    • Troubleshoot incidents, perform root cause analysis, and apply fixes.
    • Work with DevOps/Data teams to support deployment and integration pipelines.
  • Documentation & Process
    • Maintain detailed documentation of configurations, processes, and procedures.
    • Automate routine tasks using shell scripting, Python, or Ansible.

Required Skills & Experience

  • Kubernetes administration (installation, scaling, upgrades, RBAC, Helm, operators).
  • Linux system administration (performance tuning, kernel parameters, networking).
  • Hadoop ecosystem: HDFS, Hive, YARN, and security integration (Kerberos/Ranger).
  • Virtualization platforms (VMware ESXi, KVM, or similar).
  • Firewall & networking knowledge (iptables, Palo Alto, Cisco ASA, or equivalent).
  • Strong scripting/automation skills (Bash, Python, Ansible, Terraform).
  • Familiarity with CI/CD, Git, and monitoring tools (Prometheus, Grafana, Nagios, ELK).

Preferred Qualifications

  • Experience with cloud-native deployments (AWS EMR, GCP Dataproc, or Azure HDInsight).
  • Knowledge of security hardening in big data platforms.
  • Hands-on experience with container security, ingress controllers, and service mesh.
  • Prior exposure to disaster recovery planning and capacity management.

Candidate Profile

The ideal candidate is a hands-on platform engineer or administrator with experience managing large-scale distributed clusters. They should have:

  • Strong problem-solving and analytical skills.
  • Ability to work cross-functionally with data engineers, security teams, and DevOps.
  • A proactive mindset for automation and continuous improvement.
  • Strong documentation and communication skills.
