If you know someone who fits this description, you can share the opportunity using this link: https://bit.ly/2AmzuZO
Openings with a very well-established company in Covina, CA. They did over $140M in revenue last year, clearing $40M before taxes. They recently brought in a new rockstar CTO whom I’ve worked with for 15 years, ever since our time together at Overture/Yahoo! back in the early 2000s. He’s basically building out a tech start-up within this company, taking them from old mainframe systems to a state-of-the-art stack (React front end, Go back end, Google Cloud), as well as building out brand-new, state-of-the-art 10,000 sq ft offices.
As a senior data engineer, you will help lead the effort to transform the data systems. You will work with a diverse set of data sources and systems, and provide the design and definition of a new data platform.
- Define the data architecture, development platform and tools for collection, processing, and storage/retrieval of petabyte-scale data sets.
- Establish and enforce a set of development practices for ensuring minimal defects and promoting efficient collaboration.
- Design and implement core data pipeline components utilizing Google Cloud Platform tools and technologies.
- Define interfaces to internal and external systems based on standard transports and formats.
- Define and evolve data models, layouts, and access patterns for all data collections.
- Provide proactive support for data platform pipelines and components and respond to ad-hoc requests.
- Bachelor’s degree in Computer Science or equivalent, or relevant experience.
- 7+ years of design and development of data pipelines and/or data-intensive applications encompassing the following areas:
- Batch and real-time delivery of data across multiple technologies (pub-sub, APIs, log processing).
- Developing and deploying modules inside data processing frameworks such as Spark and MapReduce.
- Experience with each of the following technologies: relational databases, NoSQL databases, data warehousing.
- Building data pipelines using standard workflow tools and concepts.
- Proficiency with SQL and Java, Scala, or Python.
- Experience with building and deploying data applications on any cloud infrastructure (Google Cloud Platform, Amazon Web Services, Azure).
- Robust knowledge of data warehousing technology and concepts.
- Experience with building data processing components using a standard big data technology stack (or equivalent): MapReduce, Spark, Spark SQL, Spark Streaming, Hive, Pig, Impala, HBase.
- Experience interacting with and/or administering any of the following relational databases: PostgreSQL, MySQL, Oracle, SQL Server.
- Experience with one of the following pub/sub technologies: Google Cloud Pub/Sub, Kafka, Kinesis.
- Cloud platform certification (big data related) is a plus.
- Experience leading development teams focused on building big data related applications.
- Deep understanding of distributed environments and resource management.
- Strong communication skills (verbal and written) and ability to communicate with both business and technical teams.