Open Opportunity

Principal Data Engineer (Java, Spark, Hadoop, Spring, AWS)

Remote

Key Responsibilities:

  • Lead the design and development of a Big Data predictive analytics SaaS customer data platform, applying object-oriented analysis and design, strong programming skills, and design patterns
  • Implement ETL workflows for data matching, data cleansing, data integration, and data management
  • Maintain existing data pipelines and develop new data pipelines using big data technologies (a minimal PySpark sketch follows this list)
  • Lead the effort to continuously improve the reliability, scalability, and stability of the enterprise data platform
  • Contribute to and lead the continuous improvement of the software development framework and processes by collaborating with Quality Assurance engineers
  • Reproduce, troubleshoot, and determine the root cause of production issues
  • Help manage a high-performance team of Data Engineers
  • Contribute to and help lead the team in designing, building, testing, scaling, and maintaining data pipelines from a variety of source systems and streams (internal, third-party, cloud-based, etc.), according to business and technical requirements
  • Deliver observable, reliable, and secure software, embracing a “you build it, you run it” mentality with a focus on automation and GitOps
  • Apply experience in CI/CD, infrastructure as code (IaC), platform automation, and release management
  • Participate in daily stand-up meetings, bi-weekly sprint planning, and sprint-end demos/retrospectives, and work cross-functionally with other teams at Lattice to drive product innovation
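
For illustration, here is a minimal sketch of the batch ETL work described above, written in PySpark; the paths, column names, and dedup key are hypothetical placeholders, not Lattice's actual schema:

    # Minimal PySpark ETL sketch: read raw customer records, cleanse,
    # deduplicate, and write curated output. All paths and columns are
    # hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("customer-etl-sketch").getOrCreate()

    raw = spark.read.parquet("s3://example-bucket/raw/customers/")  # hypothetical path

    cleansed = (
        raw
        .withColumn("email", F.lower(F.trim(F.col("email"))))  # normalize the match key
        .filter(F.col("email").isNotNull())                    # drop unusable records
        .dropDuplicates(["email"])                             # naive dedup on the match key
    )

    cleansed.write.mode("overwrite").parquet("s3://example-bucket/curated/customers/")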

Qualifications for this role:

  • Must want to be actively hands-on writing code in Java and Python (70-90% of the time)
  • Experience building data pipeline frameworks and workflows to process large data sets
  • Willingness and passion to mentor junior engineers and perform code reviews
  • Strong knowledge of, or familiarity with, either Apache Beam or AWS managed services for data (Glue, Athena, Data Pipeline, Flink, Spark), plus Snowflake
  • Experience with near-real-time and batch data pipeline development in a similar big data engineering role
  • Experience developing data catalogs and data-cleanliness standards to ensure the clarity and correctness of key business metrics
  • Experience building streaming data pipelines using Kafka, Spark, or Flink (see the streaming sketch after this list)
  • Experience building and maintaining data warehouses in support of BI tools
  • Experience processing structured and unstructured data into a form suitable for analysis and reporting, integrating with a variety of data and metric providers spanning advertising, web analytics, and consumer devices
  • Strong expertise, with 5+ years of experience, in large-scale distributed system design and enterprise data processing
  • Strong knowledge of common algorithms, data structures, and object-oriented programming and design
  • Strong analytical and problem-solving skills
  • Ability to hit the ground running and learn/adapt quickly
  • Excellent verbal and written communication skills
  • Self-driven, willing to work in a fast-paced, dynamic environment
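
For the streaming qualification above, here is a minimal sketch using Spark Structured Streaming with a Kafka source; the broker, topic, and sink paths are hypothetical placeholders:

    # Minimal Spark Structured Streaming sketch: consume events from Kafka
    # and append them to a parquet sink. Broker, topic, and paths are
    # hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
        .option("subscribe", "customer-events")            # hypothetical topic
        .load()
        .select(F.col("value").cast("string").alias("payload"))
    )

    query = (
        events.writeStream
        .format("parquet")
        .option("path", "s3://example-bucket/streams/customer-events/")
        .option("checkpointLocation", "s3://example-bucket/checkpoints/")
        .start()
    )
    query.awaitTermination()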

Nice to have:

  • Experience with graph-based data workflows using Apache Airflow (a minimal DAG sketch follows below)
  • Strong test-driven development background, with an understanding of the levels of testing required to continuously deliver value to production
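
For orientation, a minimal sketch of a graph-based workflow in Apache Airflow; the DAG id, schedule, and task callables are hypothetical placeholders:

    # Minimal Apache Airflow DAG sketch: extract >> transform >> load.
    # DAG id, schedule, and task bodies are hypothetical placeholders.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pull raw data from a source system")

    def transform():
        print("cleanse and match records")

    def load():
        print("write curated data to the warehouse")

    with DAG(
        dag_id="customer_pipeline_sketch",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t1 = PythonOperator(task_id="extract", python_callable=extract)
        t2 = PythonOperator(task_id="transform", python_callable=transform)
        t3 = PythonOperator(task_id="load", python_callable=load)

        t1 >> t2 >> t3  # graph-based dependency: extract -> transform -> load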