Data Science Student · Open to Co-op 2026

Sahibjeet Pal Singh

I turn raw, messy data into systems that work: ML pipelines, analytical dashboards, and decision-ready outputs that hold up under scrutiny.

5+ Projects Built
6M+ Records Analyzed
3+ Years at Amazon
10+ Tools & Libraries
welcome to my corner
a little about me

I'm a Data Science student at Simon Fraser University with a deep curiosity for how numbers tell stories. My work sits at the crossroads of statistical thinking and creative problem-solving — finding patterns where others see noise.

Before diving into data full-time, I spent 3+ years at Amazon where I learned to think in systems — monitoring KPIs, spotting bottlenecks, and optimizing workflows at scale. That operational mindset now shapes how I approach every dataset.

My coursework spans Data Structures, Probability & Statistics, and Software Engineering. When I'm not wrangling data, you'll find me building side projects, competing on Kaggle, or sketching ideas before they become code.

the journey so far

Warehouse Associate

Amazon
Nov 2022 - Present · Pitt Meadows, BC
  • Monitored operational KPIs including scan compliance, dispatch accuracy, and package throughput — contributing to 99%+ on-time delivery
  • Identified recurring error patterns in sorting workflows and created standardized process guides, reducing repeat issues by 15%
  • Collaborated with cross-functional teams of 20+ associates to streamline pick-and-pack operations

Bachelor of Science — Data Science

Simon Fraser University
Sept 2024 - Present · Burnaby, BC

Coursework: Data Structures & Algorithms, Probability and Statistics, Introduction to Software Engineering, Introduction to Computer Systems, Artificial Intelligence, Linear Algebra, Discrete Mathematics

Selected Work

Fraud Detection Pipeline

A machine learning pipeline that classifies financial transactions not just as fraud or legitimate, but by how urgently they need to be investigated. Built for HackML 2026 on 6.2 million real-world transactions with extreme class imbalance.

Python XGBoost SMOTE Scikit-learn

Silent Patterns — Vancouver Crime

22 years of Vancouver Open Data across 24 neighbourhoods, queried in SQLite and visualised with ggplot2 faceted charts. The Central Business District accounts for 30.3% of all incidents — but counts without population denominators are not the same as risk.

R SQLite ggplot2 Vancouver Open Data
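
To illustrate the denominator point, here is a minimal Python sketch. The area names, incident counts, and populations are made up for illustration and are not Vancouver Open Data figures; `rate_per_1000` is a hypothetical helper, not code from the project.

```python
# Hypothetical counts and resident populations (NOT real Vancouver data).
areas = {
    "Area A": {"incidents": 3000, "residents": 60000},
    "Area B": {"incidents": 900, "residents": 9000},
}


def rate_per_1000(incidents, residents):
    """Incidents per 1,000 residents."""
    return incidents * 1000 / residents


rates = {name: rate_per_1000(**row) for name, row in areas.items()}

# Rank the same areas two ways: by raw count and by per-capita rate.
top_by_count = max(areas, key=lambda name: areas[name]["incidents"])
top_by_rate = max(rates, key=rates.get)

print("Highest count:", top_by_count)
print("Highest rate per 1,000 residents:", top_by_rate)
```

In this toy data the area with the most incidents is not the one with the highest per-capita rate, which is why raw counts alone can mislead.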

BC Surgical Wait Times

An end-to-end analytics pipeline for 16 years of BC Ministry of Health surgical data — 202,953 rows cleaned with 10 documented decisions, 5 audience-specific SQL views in DuckDB, and a Power BI dashboard that surfaces the P90 story the median always hides.

Python DuckDB Power BI SQL
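
As a rough sketch of why P90 matters, the Python below compares two sets of wait times with the same median. The numbers are invented for illustration and are not Ministry of Health data; `percentile_nearest_rank` is a hypothetical helper that uses the nearest-rank definition, which may differ from the percentile method used in the actual dashboard.

```python
import math
from statistics import median


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical wait times in weeks for two made-up procedures.
waits_a = [4, 5, 5, 6, 6, 6, 7, 7, 8, 9]
waits_b = [2, 3, 4, 5, 6, 6, 7, 9, 30, 52]

for name, waits in (("A", waits_a), ("B", waits_b)):
    print(name, "median:", median(waits), "P90:", percentile_nearest_rank(waits, 90))
```

Both procedures have the same median in this toy data, but the second has a far longer tail, and only the P90 shows it.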

Renewable Energy Classifier

A physics-grounded scoring engine that takes five climate inputs and recommends Solar, Wind, or Hydro, with full point breakdowns and a confidence margin. Validated against real NOAA station data using ANOVA and permutation feature importance across a 3,000+ line R pipeline.

R ggplot2 NOAA Data Vanilla JS

Rubik's Cube Solver

A solver that finds near-optimal solutions for any Rubik's Cube scramble in under 100ms, with a 3D animated interface that plays back every move step by step. Built using Kociemba's Two-Phase Algorithm with precomputed pruning tables.

Java Three.js Kociemba Algorithm

Tools & Skills

01

Analysis & Querying

Python SQL R Pandas NumPy dplyr
02

Visualization & BI

Power BI DAX Matplotlib Seaborn ggplot2 Excel
03

Machine Learning

Scikit-learn XGBoost SMOTE K-Means DBSCAN Random Forest
04

Tools & Infra

Git Docker FastAPI React Java Streamlit
don't be a stranger

Let's connect