- Design, develop, and maintain real-time data streaming pipelines using Spark Streaming or Azure Functions.
- Load, merge, and process machine logs from Kafka, ensuring efficient data flow and transformation.
- Write processed data to a Redis cache and deliver it to the data lake for long-term storage and analysis.
- Implement and optimize data processing solutions using Python.
- Apply software engineering best practices, including code reviews, version control, and continuous integration/deployment (CI/CD).
- Collaborate with cross-functional teams to understand requirements and deliver high-quality data solutions.
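
The core of the pipeline above — parsing machine logs and merging them per machine before caching and archiving — can be sketched in Python. The log format (`ts,machine_id,metric,value`), field names, and sample values are assumptions for illustration only; in the real pipeline the records would arrive from a Kafka consumer and the merged result would be written to Redis and the data lake.

```python
import json
from collections import defaultdict

def parse_log(raw: str) -> dict:
    """Parse one machine-log line of the assumed form 'ts,machine_id,metric,value'."""
    ts, machine_id, metric, value = raw.strip().split(",")
    return {"ts": int(ts), "machine": machine_id, "metric": metric, "value": float(value)}

def merge_by_machine(records: list[dict]) -> dict:
    """Merge parsed records into one dict per machine, keeping the latest value per metric."""
    merged: dict = defaultdict(dict)
    for rec in sorted(records, key=lambda r: r["ts"]):  # time-order so later values win
        merged[rec["machine"]][rec["metric"]] = rec["value"]
    return dict(merged)

# Sample lines standing in for a Kafka consumer poll (hypothetical data).
raw_lines = [
    "1700000000,m1,temp,71.5",
    "1700000001,m1,temp,72.0",
    "1700000000,m2,rpm,900",
]
records = [parse_log(line) for line in raw_lines]
merged = merge_by_machine(records)

# In the real pipeline, `merged` would be cached in Redis (e.g. per-machine hashes)
# and serialized to the data lake; here we just serialize it locally.
print(json.dumps(merged))
```

The merge step deliberately sorts by timestamp so that out-of-order log lines still resolve to the most recent metric value, which is the behavior a cache layer like Redis typically needs.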
