CONTRACT TO PERM - Senior Site Reliability Engineer with Linux and Python experience to improve and optimize the batch jobs of applications - 39027

S.i. Systèmes

Toronto, ON

Salary: To be discussed
Employment type: Contract
Posted 1 day ago
1 position to fill as soon as possible

Description

Location Address: Hybrid - 44 King - 3 days/week onsite (days will vary depending on team)

Subject to change: 3-4 days onsite may be required based on business needs

Contract Duration: 6 months (Must convert to perm after 6 months)

Schedule Hours: 9am-5pm Monday-Friday; standard 37.5 hrs/week

Story Behind the Need

Business group: Global Banking and Markets Engineering (GBME) is the fast-moving, award-winning technology engine that powers Scotiabank’s Corporate, Investment Banking and Capital Markets businesses. The team works with all GBME applications to ensure they are reliable.
Project: GBME is searching for SREs who are continuous learners and are eager to boost the capabilities of capital markets products and analytics platforms. The work focuses on improving and optimizing application batch jobs.
The resource will be aligned to an application portfolio in GBME and will ensure its batches are optimized and running resiliently, as measured by SLA adherence for batch jobs.

Typical Day in Role:

Reliability & Performance: Ensure stability and optimize batch processing pipelines; reduce runtime and failure rates, engineering for resiliency.
Observability: Implement and maintain monitoring with Dynatrace; create dashboards, alerts, and runbooks.
Systems Engineering: Manage and tune Linux and Windows systems for performance and resilience.
Automation & Orchestration: Create, modify, and optimize Airflow DAGs; build CI/CD pipelines for automation.
Incident Management: Lead incident response, root cause analysis, and postmortems; enforce SLOs and reliability practices.
Security & Compliance: Apply security best practices and ensure regulatory compliance in systems and automation.

Candidate Requirements/Must Have Skills:

1) 7-10 years of relevant working experience - flexible

2) Linux Systems Expertise: Kernel/OS tuning, networking, filesystem optimization, process management, and troubleshooting.

3) Experience with application performance monitoring

4) Experience with modern development languages (Python required; Java and others an asset).

5) Proven experience optimizing batch workloads for performance, reliability, and cost. Strong understanding of distributed systems concepts: retries, idempotency, backpressure, and data integrity. Strong understanding of backend systems and batch optimization.

6) Proven experience with containers and orchestration (Docker, Kubernetes).

7) Excellent incident management and root cause analysis skills.

Nice-To-Have Skills:

1) Dynatrace Mastery: Custom dashboards, KPIs, anomaly detection, tagging strategy, and alerting configuration.

2) Proficiency with CI/CD pipelines (GitHub Actions, Azure DevOps, Jenkins) and Infrastructure as Code (Terraform, Ansible).

3) Experience with some automated deployment.

4) Understanding of networking protocols and security principles

5) Capital Markets product knowledge

6) GCP Cloud experience

7) Experience working with real-time, high availability and low latency systems

8) Airflow Experience: DAG design best practices, SLA management, scheduler/executor tuning, and scaling strategies.

Education:

Bachelor’s degree in Computer Science, Engineering, or a related field.

Cloud certifications an asset

IaC automation certifications an asset

Best vs. Average Candidate:

The ideal candidate is passionate about Site Reliability Engineering (SRE), with a strong focus on building reusable, efficient, and scalable environments. They thrive in an innovative, cross-functional team setting and bring a strong technical and engineering mindset to the role.

Key attributes of the successful candidate include:

Extensive batch processing experience and a hands-on approach to problem-solving.

Proficiency in programming, deep Linux system expertise, and solid application monitoring experience.

Ideally, a developer who has transitioned into an SRE role, combining development skills with reliability engineering practices.

Familiarity with typical SRE/DevOps tools is helpful but less critical for this position.

Candidate Review & Selection - Interview Process

2 rounds - 1 hour - in person at 44 King

1st with the hiring manager (HM)

2nd with GBME

Disclaimer:
AI may be used in evaluating candidates.
This posting is for an existing vacancy.

Requirements

Education level: not specified
Diploma: not specified
Years of experience: not specified
Written languages: not specified
Spoken languages: not specified
Internal reference no.: 149783
