We are looking for a talented and experienced Hadoop Administrator with strong expertise in managing and optimizing Hadoop clusters, hands-on experience with Apache Kafka for streaming data, and proficiency in Scala for writing efficient, scalable code. In this role, you will oversee the installation, configuration, and maintenance of Hadoop ecosystem components while ensuring the system’s performance, scalability, and reliability in a Big Data environment.
Key Responsibilities:
- Administer, monitor, and manage large-scale Hadoop clusters, ensuring high availability and performance.
- Implement and maintain Apache Kafka for real-time data streaming and integration within the Hadoop ecosystem.
- Tune Hadoop and Kafka environments to keep data processing pipelines running efficiently and without interruption.
- Design and implement scalable Scala applications for data ingestion, transformation, and processing in Hadoop (a representative sketch follows this list).
- Troubleshoot performance issues and resolve system failures promptly.
- Collaborate with data engineers, data scientists, and other teams to ensure data workflows run smoothly across the Hadoop infrastructure.
- Ensure data security, access controls, and compliance requirements are met within the Hadoop and Kafka ecosystems.
- Develop and maintain automation scripts for cluster deployment, monitoring, and management.
- Provide support and guidance on best practices for data ingestion, storage, and processing in Hadoop.
- Stay up-to-date with the latest trends in Big Data technologies and propose improvements to the current architecture.
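To give a concrete sense of the Scala work described above, here is a minimal, illustrative sketch of a Kafka-to-HDFS ingestion job, assuming Spark Structured Streaming with the spark-sql-kafka connector; the broker addresses, topic name, and HDFS paths are placeholders, not details of our environment.

```scala
import org.apache.spark.sql.SparkSession

object KafkaToHdfsIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-to-hdfs-ingest")
      .getOrCreate()

    // Read a Kafka topic as a streaming DataFrame (placeholder brokers and topic).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
      .option("subscribe", "events")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")

    // Land the stream on HDFS as Parquet; the checkpoint lets the job
    // resume safely after a failure.
    events.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/raw/events")
      .option("checkpointLocation", "hdfs:///checkpoints/events")
      .start()
      .awaitTermination()
  }
}
```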
Required Skills & Qualifications:
- Proven experience as a Hadoop Administrator in a large-scale production environment.
- Strong expertise with Hadoop ecosystem tools such as HDFS, YARN, MapReduce, Hive, Pig, and Spark.
- Experience with Apache Kafka for real-time data streaming and integration.
- Proficiency in Scala for developing and optimizing data processing applications.
- Knowledge of additional Big Data tools such as HBase or Flume is a plus.
- Strong understanding of Linux-based operating systems, networking, and system administration.
- Experience with Hadoop security, including Kerberos and access control mechanisms (see the sketch after this list).
- Familiarity with data ingestion frameworks and ETL processes.
- Solid troubleshooting and problem-solving skills in distributed systems.
- Ability to work in a fast-paced, dynamic environment and manage multiple priorities.
- Excellent communication and collaboration skills.
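For the Kerberos requirement above, the following is a minimal sketch of authenticated HDFS access from Scala using Hadoop's UserGroupInformation API; the principal and keytab path are hypothetical examples, not actual cluster details.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

object SecureHdfsCheck {
  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml / hdfs-site.xml from the classpath.
    val conf = new Configuration()
    conf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(conf)

    // Hypothetical principal and keytab; substitute a real service account.
    UserGroupInformation.loginUserFromKeytab(
      "hadoop-admin@EXAMPLE.COM",
      "/etc/security/keytabs/hadoop-admin.keytab"
    )

    // List a directory to confirm the authenticated session works end to end.
    val fs = FileSystem.get(conf)
    fs.listStatus(new Path("/data")).foreach(status => println(status.getPath))
    fs.close()
  }
}
```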
Preferred Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
- Certifications in Hadoop or other Big Data technologies.
- Experience with cloud platforms like AWS, Azure, or Google Cloud.