Data Engineer (Spark / Streaming / Java)
#Data Engineer #Apache Spark #Databricks #Java #Apache Kafka #Batch Processing #Structured Streaming #Azure #SQL #Microservices #CI/CD #Docker #DDD
Are you ready to join our international team as a Data Engineer? Let us tell you why you should...
What product do we develop?
We are building an innovative solution, KMD Elements, on the Microsoft Azure cloud, dedicated to the energy distribution market (electrical energy, gas, water, and similar utility businesses). Our customers include institutions and companies operating in the energy market as transmission service operators, market regulators, distribution service operators, and energy trading and retail companies.
KMD Elements delivers components that cover the full lifecycle of a customer on the energy market: meter data processing, connection to the network, physical network management, change of operator, full billing process support, payment and debt management, and customer communication, finishing with customer account termination and network disconnection.
The key market advantage of KMD Elements is its support for highly flexible, complex billing models and its scalability to handle large volumes of data. Our solution enables energy companies to promote efficient energy generation and usage patterns, supporting sustainable and green energy generation and consumption.
We always work with up-to-date versions of:
Apache Spark on Azure Databricks
Apache Kafka
Delta Lake
Java
MS SQL Server and NoSQL storages such as Elasticsearch, Redis, and Azure Data Explorer
Docker containers
Azure DevOps and fully automated CI/CD pipelines with Databricks Asset Bundles, ArgoCD, GitOps, Helm charts
Automated tests
How do we work?
#Agile #Scrum #Teamwork #CleanCode #CodeReview #Feedback #BestPractices
We follow Scrum principles in our work – we work in biweekly iterations and deliver production-ready functionality at the end of each iteration. Every 3 iterations, we plan the next product release
We have end-to-end responsibility for the features we develop – from business requirements, through design and implementation up to running features on production
We spend more than 75% of our time on new product features
Our teams are cross-functional (7–8 people) – they develop, test, and maintain the features they build
Teams own their domains in the solution and the corresponding system components
We value feedback and continuously seek improvements
We value software best practices and craftsmanship
Product principles:
Domain model created using domain-driven design principles
Distributed event-driven architecture / microservices
Large-scale system for large volumes of data (>100 TB), processed by Apache Spark streaming and batch jobs on the Databricks platform
Your responsibilities:
Develop and maintain the leading IT solution for the energy market using Apache Spark, Databricks, Delta Lake, and Apache Kafka
Have end-to-end responsibility for the full lifecycle of features you develop
Design technical solutions for business requirements from the product roadmap
Maintain alignment with architectural principles defined at the project and organizational levels
Ensure optimal performance through continuous monitoring and code optimization
Refactor existing code and enhance system architecture to improve maintainability and scalability
Design and evolve the test automation strategy, including its technology stack and solution architecture
Prepare reviews, participate in retrospectives, estimate user stories, and refine features to ensure their readiness for development
Personal requirements:
Have 4+ years of Apache Spark experience and have faced various data engineering challenges in batch or streaming
Have an interest in stream processing with Apache Spark Structured Streaming on top of Apache Kafka
Have experience leading technical solution designs
Have experience with distributed systems on a cloud platform
Have experience with large-scale systems in a microservice architecture
Are familiar with Git and CI/CD practices and can design or implement the deployment process for your data pipelines
Possess a proactive approach and can-do attitude
Are fluent in English and Polish, both written and spoken
Hold a higher-education degree in computer science or a related field
Are a team player with strong communication skills
Nice to have requirements:
Apache Spark Structured Streaming
Azure
Domain-Driven Design
Docker containers and Kubernetes
Message brokers (e.g., Kafka) and event-driven architecture
Agile/Scrum
Our offer:
Contract type: B2B
Work Mode: Flexible — this role supports on-site, hybrid, and remote arrangements, depending on your individual preferences.
Occasional on-site presence may be required, for example to onboard new team members, explore new business domains, refine requirements in close collaboration with stakeholders, or take part in team-building activities.
What does the recruitment process look like?
Phone conversation with Recruitment Partner
Technical interview with the Hiring Team
Cognitive test
Offer

KMD
KMD Poland is the KMD Group's largest unit outside of its Danish headquarters, with 600 IT and business specialists on board. Our innovative solutions utilize technologies such as .Net, Java, SAP, Angular, Azure, and Kub...