JOB DESCRIPTION:
- Strong Python/PySpark scripting skills, 3+ years minimum (the focus is on Python/PySpark; Snowflake experience is less critical).
- Must have hands-on experience implementing a big data lake on AWS using EMR and Spark.
- 3+ years of working experience with Spark, Hive, message queue or pub/sub systems, and streaming technologies.
- 6+ years of experience developing data pipelines using a mix of languages (Python, Scala, SQL, etc.) and open-source frameworks to implement data ingestion, processing, and analytics.
- Experience leveraging open-source big data processing frameworks such as Apache Spark and Hadoop, and streaming technologies such as Kafka.
- Hands-on experience with newer technologies in the data space such as Spark, Airflow, Apache Druid, and Snowflake (or any other OLAP database).
- Experience developing and deploying data pipelines and real-time data streams within a cloud-native infrastructure, preferably AWS.
- Experience using CI/CD pipelines (GitLab).
- Experience implementing code quality checks using PEP 8/Pylint or any other code quality tool.
- Experience with Python plugins/operators such as FTP Sensor, Oracle Operator, etc.
- Ability to implement industry standards and best practices.
- Excellent analytical and problem-solving skills
- Excellent verbal and written communication skills
COMPANY DESCRIPTION:
Sky Solutions, LLC is a niche services company that provides next-generation solutions to enterprises that want to improve their business outcomes. These solutions are built on the digital platforms the marketplace is already investing in.