If you know someone who fits this opportunity, you can share it with this link: https://bit.ly/2CYXm7X
Openings with a very well-established company in Covina, CA. They did over $140M in revenue last year, clearing $40M before taxes. They recently brought in a rockstar CTO whom I've known for 15 years, since we worked together at Overture/Yahoo! back in the early 2000s. He's essentially building a tech start-up within the company, taking them from old mainframe systems to a state-of-the-art stack (React front end, Go backend, Google Cloud) while also building out brand-new, state-of-the-art 10,000 sq. ft. offices.
As a data engineer, you will design and build components for the big data platform. You will work with business teams to understand product requirements, explore solutions, and collaborate with technical teams on the design, build, test, and deployment of various data components and supporting applications and services.
- Review requirements, provide feedback, and engage in the exchange of ideas with teams across the org.
- Explore and test solutions on Google Cloud Platform (GCP) and work with operations on planning, setup, and execution.
- Implement core functionality for the data platform utilizing GCP big data services.
- Implement supporting applications and services, including pipeline orchestration and resilient workflow execution.
- Implement data access APIs across multiple data sources and technologies.
- Provide clear, detailed documentation for data components and processes.
- Provide proactive support for data platform pipelines and components, and respond to ad-hoc requests.
Required Education and Experience
- Bachelor’s degree in Computer Science, Informatics, Information Systems, or a related field
- 5+ years of experience in a data engineering role, including design and/or development in the following areas:
- Big data toolset such as Hadoop, Spark, Beam, Kafka
- Relational, NoSQL, and analytical databases such as Postgres, HBase, Cassandra, Amazon Redshift, Google BigQuery
- Workflow and orchestration such as Airflow, Cloud Dataflow, Cloud Composer
- Data analysis toolset such as pandas, NumPy, Jupyter
- Data serialization and encoding such as Avro, Protocol Buffers, Thrift
- Streaming technologies such as Spark Streaming, Kafka Streams, Flink
- Messaging such as Kafka, Google Pub/Sub, AWS Kinesis
- Machine learning such as TensorFlow, XGBoost
- Advanced knowledge of SQL and experience writing complex data manipulation queries across a variety of databases.
- Advanced knowledge of Python, Go, Scala, or Java, and experience using at least one for processing large datasets within a big data environment.
- Experience implementing tooling for data ingestion, data transformation, data mapping, and metadata management.
- Experience building/deploying software using toolsets within common cloud infrastructures such as Google Cloud Platform (GCP), Amazon Web Services (AWS), Azure.
- Deep understanding of the inner workings of data systems, and experience with optimization and performance tuning.
- Experience designing and creating data models across multiple unstructured datasets.
- Deep understanding of distributed environments and resource management.
- Strong communication skills (verbal and written) and ability to communicate with both business and technical teams.