Contact


Skills

Statistical:
Bayesian inference · Hierarchical models · Random forest · Experimental design

Programming:
R · Python · Julia · Stan · SQL · DuckDB · Bash

Data Engineering:
ETL/ELT · Apache Arrow · Cloud-optimized data (Parquet, COG) · Azure (Blob, Databricks) · HPC

Reproducibility:
Quarto · Markdown · Jupyter · Pandoc · LaTeX · Make

DevOps:
Unix · Git · GitHub · CI/CD pipelines · Docker · automated testing

Web & Interactive Tools:
HTML/CSS · Shiny · API design basics


Outreach

· Developed and maintained the QCBS R Workshop Series, reaching nearly one thousand graduate students
· 5 published papers & preprints with ~100 citations
· Translated client needs into technical analyses and delivered clear, actionable results


Languages

Portuguese · Native
French · Full Professional
English · Full Professional

Disclaimer

CV source code hosted on

Last updated: 2025-11-19

PDF download available

Willian Vieira, PhD

Data scientist with a PhD in quantitative ecology, trained to extract signal from messy, uncertain spatial data. I combine Bayesian statistics and machine learning with data engineering practices to build automated, scalable pipelines. My focus is on making complex datasets reliable, accessible, and directly useful for decision-making.

Experience

Data Analyst

Habitat, Montreal, Canada

2024 - 2025
(1 yr 9 m)

Delivered custom geospatial machine learning models for client decision-making | Built and maintained reproducible pipelines in R, Python, and Julia for cloud-hosted spatial data | Designed a lakehouse architecture for streaming vector and raster datasets via cloud storage | Developed a metadata-driven pipeline framework enabling scalable Azure processing and improved data governance | Modernized workflows with containerization, CI/CD, and DevOps practices | Led the team-wide transition from Windows to Linux, validating full workflow compatibility and building an automated provisioning script to set up new Linux machines

Biostatistician

Environment and Climate Change Canada - Quebec, Canada

2020 - 2022
(part-time)

Developed a standardized sampling protocol ensuring unbiased, representative boreal bird surveys in Quebec | Led R&D to correct spatial bias from legacy survey sites, creating a novel method later adopted across other Canadian provinces | Produced a fully reproducible, open workflow with automated reports documenting methods and results
Code: , Report:

Teaching Assistant

Université de Sherbrooke - Sherbrooke, Canada

2018 - 2022
(part-time)

TA for undergraduate biology (4+ cohorts of ~40 students): Introduction to Scientific Programming and Methods in Computational Ecology | Contributed to course materials and built CI/CD pipelines to render and deploy teaching content | TA (1 yr) for undergraduate Probability & Statistics in engineering | TA for graduate Biodiversity Modelling Summer School, teaching Bayesian modelling with Stan

Education

PhD, Ecology

Integrative Ecology Lab - Université de Sherbrooke, Canada

2017 - 2024

How climate, competition, and forest management shape the limits of tree species distributions: from individuals to metapopulations
Mathematical modelling | Hierarchical models | Bayesian statistics | Machine learning | High-performance computing (HPC) | Open science

Masters 2, Agroecology and Resource Management

Bordeaux Sciences Agro, Bordeaux, France

2015 - 2016

Modelling the dispersion of weed species in agricultural landscapes

BSc in Agronomy

Universidade Federal de Santa Catarina, Florianópolis, Brazil

2010 - 2015

Defaunation impact on a threatened species: araucaria in southern Brazil