Hybrid - Databricks Developer to implement robust data pipelines using Apache Spark on Databricks, perform data modelling using Medallion architecture, and manage Delta Lake performance
S.i. Systems
Vancouver, BC
Number of positions available: 1
Salary: To be discussed
Permanent job
Published on September 10th, 2025
Starting date: 1 position to fill as soon as possible
Description
S.i. Systems' global client, with an office in Vancouver, is seeking a Hybrid - Intermediate Databricks Developer to implement robust data pipelines using Apache Spark on Databricks, perform data modelling using Medallion architecture, and manage Delta Lake performance. This is a hands-on development role focused on engineering scalable, maintainable, and optimized data flows in a modern cloud-based environment (a minimal sketch of the medallion flow appears below).
Full-time permanent role based in the Vancouver office (hybrid, 1-4 days per week onsite, negotiable)
Salary range: $100,000 - $200,000 CAD per annum
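For illustration only, here is a minimal sketch of the bronze/silver/gold (medallion) flow this role would build on Databricks, assuming PySpark with Delta Lake; the paths, schemas, table names, and columns are hypothetical placeholders, not the client's actual data model.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks notebooks

    # Bronze: land raw files as-is into a Delta table (path and schema are hypothetical)
    raw = spark.read.json("/mnt/landing/sensor_readings/")
    raw.write.format("delta").mode("append").saveAsTable("bronze.sensor_readings")

    # Silver: cleanse and conform types; drop duplicates and obviously bad records
    silver = (
        spark.table("bronze.sensor_readings")
        .withColumn("reading_ts", F.to_timestamp("reading_ts"))
        .dropDuplicates(["device_id", "reading_ts"])
        .filter(F.col("value").isNotNull())
    )
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.sensor_readings")

    # Gold: aggregate into an analytics-ready table
    gold = (
        silver.groupBy("device_id", F.to_date("reading_ts").alias("reading_date"))
        .agg(F.avg("value").alias("avg_value"), F.count("*").alias("reading_count"))
    )
    gold.write.format("delta").mode("overwrite").saveAsTable("gold.daily_device_summary")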
MUST HAVE SKILLS:
- 5+ years of experience in data engineering or big data development.
- Strong hands-on experience with Databricks and Apache Spark (PySpark/SQL).
- Proven experience with Azure Data Factory, Azure Data Lake, and related Azure services.
- Experience integrating with APIs using libraries such as requests and http (see the sketch after this list).
- Deep understanding of Delta Lake architecture, including performance tuning and advanced features.
- Proficiency in SQL and Python for data processing, transformation, and validation.
- Familiarity with data lakehouse architecture and both real-time and batch processing design patterns.
- Comfortable working with Git, DevOps pipelines, and Agile delivery methodologies.
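As a rough illustration of the API-integration skill above, a minimal sketch that pulls JSON from a REST endpoint with requests and lands it as a Delta table on Databricks; the URL, authorization header, and table name are hypothetical placeholders.

    import requests
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # available by default in Databricks

    # Hypothetical endpoint returning a JSON array of records
    response = requests.get(
        "https://api.example.com/v1/devices",
        headers={"Authorization": "Bearer <token>"},  # placeholder auth
        timeout=30,
    )
    response.raise_for_status()
    records = response.json()

    # Convert the JSON payload into a DataFrame and persist it as a Delta table
    df = spark.createDataFrame(records)
    df.write.format("delta").mode("append").saveAsTable("bronze.devices_api")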
NICE TO HAVE SKILLS:
- Experience with dbt, Azure Synapse, or Microsoft Fabric.
- Familiarity with Unity Catalog features in Databricks.
- Relevant certifications such as Azure Data Engineer, Databricks, or similar.
- Understanding of predictive modeling, anomaly detection, or machine learning, particularly with IoT datasets.
JOB DUTIES:
- Design, build, and maintain scalable data pipelines and workflows using Databricks (SQL, PySpark, Delta Lake).
- Develop efficient ETL/ELT pipelines for structured and semi-structured data using Azure Data Factory (ADF) and Databricks notebooks/jobs.
- Integrate and transform large-scale datasets from multiple sources into unified, analytics-ready outputs.
- Optimize Spark jobs and manage Delta Lake performance using techniques such as partitioning, Z-ordering, broadcast joins, and caching (see the first sketch after this list).
- Design and implement data ingestion pipelines for RESTful APIs, transforming JSON responses into Spark tables.
- Apply best practices in data modeling and data warehousing concepts.
- Perform data validation and quality checks.
- Work with various data formats, including JSON, Parquet, and Avro.
- Build and manage data orchestration pipelines, including linked services and datasets for ADLS, Databricks, and SQL Server.
- Create parameterized and dynamic ADF pipelines, and trigger Databricks notebooks from ADF (see the notebook-parameter sketch after this list).
- Collaborate closely with Data Scientists, Data Analysts, Business Analysts, and Data Architects to deliver trusted, high-quality datasets.
- Contribute to data governance, metadata documentation, and ensure adherence to data quality standards.
- Use version control tools (e.g., Git) and CI/CD pipelines to manage code deployment and workflow changes.
- Develop real-time and batch processing pipelines for streaming data sources such as MQTT, Kafka, and Event Hub (see the streaming sketch after this list).
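A sketch of the optimization techniques named above (partitioning, Z-ordering, broadcast joins, caching), assuming a Databricks runtime with Delta Lake; table and column names are placeholders.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Partition a large Delta table on a low-cardinality column at write time
    events = spark.table("silver.events")
    (events.write.format("delta")
        .mode("overwrite")
        .partitionBy("event_date")
        .saveAsTable("silver.events_partitioned"))

    # Z-order frequently filtered columns to improve data skipping (Databricks SQL)
    spark.sql("OPTIMIZE silver.events_partitioned ZORDER BY (device_id)")

    # Broadcast a small dimension table to avoid a shuffle join
    dim = spark.table("silver.device_dim")
    joined = events.join(F.broadcast(dim), "device_id")

    # Cache an intermediate result that is reused by several downstream steps
    joined.cache()
    joined.count()  # action to materialize the cache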
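On the notebook side of an ADF-triggered run, parameters passed from a parameterized pipeline typically arrive as Databricks widgets; a minimal sketch, assuming the Databricks notebook environment (spark and dbutils are predefined only there) and hypothetical parameter names.

    # Read parameters passed by the ADF Databricks Notebook activity (names are hypothetical)
    dbutils.widgets.text("load_date", "")         # e.g. "2025-09-10"
    dbutils.widgets.text("source_container", "")  # e.g. "landing"

    load_date = dbutils.widgets.get("load_date")
    source_container = dbutils.widgets.get("source_container")

    # Use the parameters to drive a dynamic read/write path (storage account is a placeholder)
    input_path = f"abfss://{source_container}@<storage_account>.dfs.core.windows.net/events/{load_date}/"
    df = spark.read.parquet(input_path)
    df.write.format("delta").mode("append").saveAsTable("bronze.events")

    # Return a status to ADF so the pipeline can branch on the result
    dbutils.notebook.exit("OK")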
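For the streaming duty above, a minimal Structured Streaming sketch that reads from Kafka (Event Hubs can be consumed through its Kafka-compatible endpoint) and writes to a Delta table; broker address, topic, and checkpoint path are placeholders, and authentication options are omitted.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Read a Kafka (or Event Hubs Kafka-endpoint) topic as a streaming DataFrame
    stream = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "<broker>:9093")  # placeholder broker
        .option("subscribe", "telemetry")                     # placeholder topic
        .option("startingOffsets", "latest")
        .load()
    )

    # Kafka delivers the payload as bytes; cast it to a string for downstream parsing
    parsed = stream.select(
        F.col("key").cast("string"),
        F.col("value").cast("string").alias("payload"),
        "timestamp",
    )

    # Write the stream to a Delta table with a checkpoint for recovery
    query = (
        parsed.writeStream.format("delta")
        .option("checkpointLocation", "/mnt/checkpoints/telemetry")  # placeholder path
        .outputMode("append")
        .toTable("bronze.telemetry_stream")
    )
    query.awaitTermination()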