About Me

Master's student in Data Science at the University of Luxembourg with hands-on experience in machine learning, data analysis, time-series forecasting, NLP, and predictive modelling. I build end-to-end data science projects that connect data preprocessing, feature engineering, model development, evaluation, and dashboarding, with current interests in forecasting, credit risk analytics, and research-grade time-series imputation.

Current Role: MSc Data Science Student
Current Focus: ML, Forecasting, NLP, Credit Risk Analytics
Location: Luxembourg, LU
Availability: Open to internships, thesis work, and data science roles

Skills

I work across the full data science workflow: understanding the problem, preparing data, building features, training and evaluating models, and turning results into clear reports or dashboards.

Programming & Databases

Core technical stack

Python (Advanced)

Primary language for modelling, analysis, and dashboards.

SQL (Advanced)

KPI marts, reporting views, joins, and analytics queries.

R (Strong)

Statistics, visualization, and coursework-based analysis.

DuckDB (Working)

Local analytics warehouse and SQL-backed workflows.

MongoDB / Neo4j (Working)

NoSQL databases and graph-oriented data modelling.

Machine Learning & AI

Predictive modelling

Scikit-learn (Advanced)

Classical ML, evaluation, preprocessing, and pipelines.

PyTorch (Advanced)

Deep learning models, sequence models, and research workflows.

XGBoost / LightGBM (Strong)

Boosted-tree baselines and tabular modelling.

TensorFlow (Strong)

Deep learning framework familiarity and applied coursework.

NLP / Transformers (Strong)

FinBERT, Hugging Face workflows, and sentiment modelling.

Statistics, Forecasting & Modelling

Quantitative methods

Probability & Statistics (Advanced)

Core statistical reasoning and model interpretation.

Feature Engineering / PCA (Advanced)

Lag, rolling, calendar, and exogenous features, plus dimensionality reduction.

Forecast Evaluation / Backtesting (Advanced)

Time-based validation, leakage control, and forecast comparison.

ARIMA / ARIMAX / Holt-Winters (Strong)

Classical time-series forecasting and validation-based tuning.

LSTM / GRU (Strong)

Deep sequence models for time-series forecasting workflows.

Data Analysis & Visualization

Exploration to insight

Pandas / NumPy (Advanced)

Data cleaning, transformation, aggregation, and analysis.

Matplotlib / Seaborn (Advanced)

Exploratory plots, diagnostics, and visual reporting.

Plotly / ggplot2 (Strong)

Interactive charts and grammar-of-graphics style visualization.

SciPy / Tidyverse (Strong)

Scientific computing and R-based data workflows.

Data Scraping / Data Storytelling (Strong)

Collecting inputs and communicating insights clearly.

Development & Deployment

Building usable outputs

Git / GitHub (Advanced)

Version control, project organization, and reproducibility.

Streamlit (Advanced)

Interactive data science apps and project dashboards.

Docker (Strong)

Containerized projects and reproducible environments.

Flask / FastAPI (Working)

Backend APIs and lightweight deployment foundations.

Jupyter / Quarto / Nix (Strong)

Notebooks, reproducible reports, and development environments.

Analytics, BI & Reporting

Business-facing analytics

KPI Reporting (Advanced)

Structured metrics for monitoring and decision support.

Dashboarding (Advanced)

Interactive reporting with Streamlit and BI-style layouts.

Data Quality Checks (Advanced)

Validation rules, consistency checks, and audit-friendly outputs.

Power BI (Strong)

Business intelligence dashboards and stakeholder reporting.

Excel (Strong)

Analysis, reporting, and practical business workflows.

Projects

Selected projects showing practical work in forecasting, credit risk analytics, business intelligence, NLP, and applied machine learning.

Certifications

Certifications that support my formal and project-based work in machine learning and data science.

Machine Learning Specialization

Stanford University / Coursera

Covers supervised learning, unsupervised learning, model evaluation, neural networks, and practical machine learning workflows.

IBM Data Science Specialization

IBM / Coursera

Applied data science program covering Python, analysis workflows, data visualization, machine learning, and project-based data science practice.

Contact Me

For direct contact or to review my work, use the links below.

Stocks Next-Day Returns Prediction

Time Series | Oct 2025 – Mar 2026

Objective

Predict next-day AAPL log returns using daily market data while avoiding data leakage and preserving a realistic time-based modelling setup.
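A minimal sketch of how a next-day log-return target can be paired with past-only features. The prices are made up, the helper names `log_returns` and `next_day_pairs` are hypothetical, and this is an illustration rather than the project's actual code.

```python
import math

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1}); one element shorter than prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

def next_day_pairs(returns, n_lags=2):
    """Pair each window of past returns with the following day's return.

    Features for day t use returns up to and including day t only, and the
    target is the return on day t + 1, so no future value reaches the inputs.
    """
    rows = []
    for t in range(n_lags - 1, len(returns) - 1):
        features = returns[t - n_lags + 1 : t + 1]
        target = returns[t + 1]
        rows.append((features, target))
    return rows

prices = [100.0, 101.0, 99.5, 100.2, 102.0]  # illustrative closes, not real AAPL data
rets = log_returns(prices)
pairs = next_day_pairs(rets, n_lags=2)
```

Each element of `pairs` is a `(features, target)` tuple, and the last available return is only ever used as a target, never as an input for its own day.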

Approach

Built an end-to-end pipeline using yfinance data from 2013–2026, engineered endogenous, exogenous, calendar, lag, and rolling features, then applied train-only scaling and PCA for dimensionality reduction.
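The train-only scaling and PCA step can be sketched as follows. This is an illustrative NumPy version under stated assumptions (synthetic data, a hypothetical 150/50 chronological split, three components). The project itself lists scikit-learn, where fitting `StandardScaler` and `PCA` on the training split serves the same purpose.

```python
import numpy as np

def fit_scaler_pca(X_train, n_components):
    """Estimate scaling and PCA parameters from the training rows only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X_train - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components]

def transform(X, mean, std, components):
    """Apply training-set statistics to any split (train, validation, test)."""
    return ((X - mean) / std) @ components.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))   # synthetic feature matrix, rows in time order
split = 150                     # chronological split, no shuffling
X_train, X_test = X[:split], X[split:]

mean, std, components = fit_scaler_pca(X_train, n_components=3)
Z_train = transform(X_train, mean, std, components)
Z_test = transform(X_test, mean, std, components)
```

Because the mean, standard deviation, and principal directions come from `X_train` alone, the test rows never influence the transformation applied to them.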

Results

Benchmarked against Naive, ARIMA, ARIMAX, and XGBoost baselines, then trained an LSTM-to-GRU model with Optuna-based hyperparameter search and structured evaluation reporting.

Tools Used

Python, PyTorch, scikit-learn, yfinance, Optuna, PCA, LSTM, GRU.

Credit Risk BI & Decisioning Dashboard

BI Dashboard | Nov 2025 – Feb 2026

Objective

Create a practical credit risk analytics tool for underwriting, portfolio monitoring, probability-of-default scoring, and decision policy simulation.

Approach

Designed a workflow from ingestion to data-quality contracts, DuckDB warehousing, PD modelling, SQL KPI marts, and Streamlit dashboards for model and business reporting.

Results

Implemented model diagnostics including ROC/PR, calibration, Brier score, threshold analysis, and a simulator that converts PD into approve, review, and decline decisions under business constraints.
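As a rough illustration of the Brier score and the PD-to-decision step, the sketch below uses two hypothetical cut-offs (`approve_below`, `decline_above`) and made-up scores and outcomes. The real simulator's constraints and thresholds are not described here, so treat every number as an assumption.

```python
def brier_score(y_true, pd_hat):
    """Mean squared error between predicted PD and the 0/1 default outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, pd_hat)) / len(y_true)

def decide(pd_hat, approve_below=0.05, decline_above=0.20):
    """Map a probability of default to a three-way decision.

    Both cut-offs are hypothetical policy parameters.
    """
    if pd_hat < approve_below:
        return "approve"
    if pd_hat > decline_above:
        return "decline"
    return "review"

def simulate_policy(pd_scores, approve_below, decline_above):
    """Count decisions and report the mean PD of the approved applications."""
    decisions = [decide(p, approve_below, decline_above) for p in pd_scores]
    approved = [p for p, d in zip(pd_scores, decisions) if d == "approve"]
    return {
        "approve": decisions.count("approve"),
        "review": decisions.count("review"),
        "decline": decisions.count("decline"),
        "approved_mean_pd": sum(approved) / len(approved) if approved else None,
    }

pd_scores = [0.01, 0.03, 0.08, 0.15, 0.30, 0.55]  # illustrative PD scores
outcomes = [0, 0, 0, 1, 0, 1]                      # illustrative default flags
summary = simulate_policy(pd_scores, approve_below=0.05, decline_above=0.20)
score = brier_score(outcomes, pd_scores)
```

Moving the two cut-offs changes the approve/review/decline mix and the risk of the approved book, which is the trade-off a policy simulator of this kind is meant to expose.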

Tools Used

Python, Streamlit, scikit-learn, DuckDB, SQL, Docker.

Walmart Weekly Sales Forecasting

Retail Forecasting | Dec 2025 – Feb 2026

Objective

Forecast weekly Walmart sales and compare department-level forecasting against an aggregate-forecast-disaggregate strategy.

Approach

Implemented additive Holt-Winters forecasting from scratch with validation-based tuning, then evaluated performance across store and department time series.
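A compact sketch of additive Holt-Winters with a small validation-based grid search is shown below. The initialisation from the first two seasons, the parameter grid, and the toy series are assumptions made for illustration; the project's own implementation may differ in each of these choices.

```python
from itertools import product

def holt_winters_additive(y, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: returns (one-step fitted values, forecasts)."""
    m = season_length
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")

    # Simple initialisation from the first two seasons (one of several conventions).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonal = [y[i] - level for i in range(m)]

    fitted = []
    for t, obs in enumerate(y):
        s_old = seasonal[t % m]
        fitted.append(level + trend + s_old)  # one-step-ahead prediction for time t
        prev_level = level
        level = alpha * (obs - s_old) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (obs - level) + (1 - gamma) * s_old

    n = len(y)
    forecasts = [
        level + h * trend + seasonal[(n + h - 1) % m]
        for h in range(1, horizon + 1)
    ]
    return fitted, forecasts

def tune_on_validation(train, valid, season_length, grid=(0.1, 0.3, 0.5)):
    """Pick (alpha, beta, gamma) with the lowest MAE on the validation block."""
    best = None
    for alpha, beta, gamma in product(grid, repeat=3):
        _, fc = holt_winters_additive(
            train, season_length, alpha, beta, gamma, len(valid)
        )
        mae = sum(abs(f - v) for f, v in zip(fc, valid)) / len(valid)
        if best is None or mae < best[0]:
            best = (mae, alpha, beta, gamma)
    return best

# Toy series: a flat level of 10 plus a repeating 4-period pattern.
pattern = [15.0, 5.0, 13.0, 7.0]
train, valid = pattern * 3, pattern
best_mae, alpha, beta, gamma = tune_on_validation(train, valid, season_length=4)
_, forecasts = holt_winters_additive(train, 4, alpha, beta, gamma, horizon=4)
```

The validation block always follows the training block in time, so the tuning step respects the same ordering a real forecast would face.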

Results

Compared models using MAE, RMSE, and WAPE, and modelled a rebate-contract scenario using horizon-2 forecasts and rolling bias correction to study KPI and accuracy trade-offs.
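For reference, the three accuracy metrics can be written in a few lines. The numbers below are illustrative, not project results, and WAPE is taken here as total absolute error divided by total absolute actuals.

```python
import math

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    )

def wape(actual, forecast):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total

actual = [100.0, 200.0, 300.0]     # illustrative weekly sales
forecast = [110.0, 190.0, 330.0]   # illustrative forecasts
metrics = {
    "MAE": mae(actual, forecast),
    "RMSE": rmse(actual, forecast),
    "WAPE": wape(actual, forecast),
}
```

WAPE weights errors by sales volume, which makes it easier to compare across stores and departments of very different sizes than a per-row percentage error.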

Tools Used

Python, Pandas, NumPy, Matplotlib, Jupyter Notebook.

NVDA News Sentiment Portfolio

Financial NLP | Oct 2025 – Dec 2025

Objective

Analyze whether NVDA news sentiment can be aligned with next-day and next-3-day stock returns for downstream financial analysis.

Approach

Collected and de-duplicated headlines from NewsAPI and yfinance, fine-tuned FinBERT on manually labelled headlines, and aligned after-hours news to the next NASDAQ trading day.
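The after-hours alignment rule can be sketched as below. It assumes timestamps are already expressed in New York local time, a 16:00 regular-session close, and a hypothetical `holidays` set supplied by the caller. A real pipeline would typically take the trading calendar from an exchange-calendar source instead.

```python
from datetime import date, datetime, time, timedelta

MARKET_CLOSE = time(16, 0)  # assumed regular-session close, New York time

def is_trading_day(d, holidays=frozenset()):
    """Weekdays that are not in the caller-supplied holiday set."""
    return d.weekday() < 5 and d not in holidays

def next_trading_day(d, holidays=frozenset()):
    """First trading day strictly after d."""
    d += timedelta(days=1)
    while not is_trading_day(d, holidays):
        d += timedelta(days=1)
    return d

def assign_trading_day(published_local, holidays=frozenset()):
    """Map a headline timestamp to the trading day whose return it can affect.

    Headlines published at or after the close, or on a non-trading day,
    roll forward to the next trading day.
    """
    d = published_local.date()
    after_close = published_local.time() >= MARKET_CLOSE
    if not is_trading_day(d, holidays) or after_close:
        return next_trading_day(d, holidays)
    return d

friday_evening = datetime(2025, 10, 10, 17, 30)
aligned = assign_trading_day(friday_evening)  # rolls over the weekend
```

Rolling late headlines forward keeps the sentiment feature for a given day limited to news the market could have reacted to during that session.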

Results

Created daily sentiment features including mean sentiment and tail-negativity, then mapped them to returns and abnormal returns for analysis.
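A small sketch of the daily aggregation step follows. It assumes each headline already carries a signed sentiment score in [-1, 1], and it defines tail-negativity as the share of a day's headlines scoring at or below a hypothetical threshold of -0.5. That definition is an assumption for illustration and may not match the one used in the project.

```python
from collections import defaultdict

def daily_sentiment_features(scored_headlines, tail_threshold=-0.5):
    """Aggregate (trading_day, score) pairs into per-day features.

    tail_threshold is a hypothetical cut-off for "strongly negative".
    """
    by_day = defaultdict(list)
    for day, score in scored_headlines:
        by_day[day].append(score)

    features = {}
    for day in sorted(by_day):
        scores = by_day[day]
        n_tail = sum(1 for s in scores if s <= tail_threshold)
        features[day] = {
            "n_headlines": len(scores),
            "mean_sentiment": sum(scores) / len(scores),
            "tail_negativity": n_tail / len(scores),
        }
    return features

# Illustrative scores only, keyed by the trading day each headline maps to.
scored = [
    ("2025-11-03", 0.6),
    ("2025-11-03", -0.8),
    ("2025-11-03", 0.2),
    ("2025-11-04", -0.6),
    ("2025-11-04", -0.7),
]
features = daily_sentiment_features(scored)
```

Keeping a tail measure alongside the mean lets a day with one very negative headline among several neutral ones stand out, which the mean alone would smooth away.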

Tools Used

Python, Hugging Face Transformers, FinBERT, NewsAPI, yfinance.
