🚀 Projects & Research

A collection of technical and applied work, ranging from algorithmic trading and crypto analytics to applied data science.


🔁 Trading Systems & Blockchain Analytics

MLTradingBot

Fully automated pipeline for alpha signal ingestion, feature engineering, model training, and execution logic. Real-time system that logs indicators, tracks token trajectories, and pushes alerts for low-cap Solana tokens.
Key Features: OHLCV parsing, LightGBM classification, live backtesting, Discord notifications, Dex Screener & SolanaTracker integration.
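As a rough illustration of the OHLCV parsing step, here is a minimal sketch. It assumes raw rows arrive as `[timestamp, open, high, low, close, volume]` lists; the `Candle` class, the function names, and the sample rows are hypothetical and are not taken from the actual bot.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candle:
    timestamp: int  # assumed to be Unix seconds
    open: float
    high: float
    low: float
    close: float
    volume: float


def parse_ohlcv(rows: Sequence[Sequence[float]]) -> List[Candle]:
    """Convert raw [timestamp, open, high, low, close, volume] rows into Candles."""
    candles = []
    for ts, o, h, l, c, v in rows:
        # Basic sanity check: open and close must lie within [low, high].
        if not (l <= min(o, c) and max(o, c) <= h):
            raise ValueError(f"inconsistent candle at timestamp {ts}")
        candles.append(Candle(int(ts), float(o), float(h), float(l), float(c), float(v)))
    return candles


def close_returns(candles: Sequence[Candle]) -> List[float]:
    """Simple close-to-close returns, one per consecutive pair of candles."""
    return [b.close / a.close - 1.0 for a, b in zip(candles, candles[1:])]


# Hypothetical sample rows, for illustration only.
rows = [
    [1700000000, 1.00, 1.10, 0.95, 1.05, 1200.0],
    [1700000060, 1.05, 1.30, 1.00, 1.26, 3400.0],
]
candles = parse_ohlcv(rows)
returns = close_returns(candles)
```

Returns like these could then feed the feature engineering stage; the real pipeline may use different fields or a dataframe library.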


Solana Alpha Signal Tracker

Rebuilt the signal bot in TypeScript for higher concurrency. Tracks token metadata, market cap growth, and volume trends after DEX listing.
Highlights: Grew $20 to $4,000 over 5 weeks in live simulation while running advanced heuristics (wallet clustering, liquidity traps, rug-pull detection).


QRF Forecasting of Mid-Cap Solana Returns (MSc Dissertation — Ongoing)

Predicting 72-hour return intervals on mid-cap Solana tokens using Quantile Regression Forests and bootstrapped LightGBM. Benchmarked with statistical baselines and volatility-sensitive metrics.
Outcome: Research-grade ML forecasting framework with tail risk calibration and token-specific feature interpretation.
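To make the idea of interval calibration concrete, the sketch below shows two standard checks for quantile forecasts: the pinball (quantile) loss and the empirical coverage of a prediction interval. This is a generic illustration, not the dissertation's evaluation code; the function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean pinball loss of predicted q-quantiles against observed values."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(
    y_true: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


# Hypothetical 72-hour returns with predicted 5% and 95% quantiles.
observed = [0.10, -0.20, 0.05, 0.40]
q05 = [-0.15, -0.10, -0.05, -0.10]
q95 = [0.20, 0.15, 0.10, 0.30]

coverage = interval_coverage(observed, q05, q95)
lower_loss = pinball_loss(observed, q05, 0.05)
upper_loss = pinball_loss(observed, q95, 0.95)
```

A well-calibrated 5%-95% interval should cover roughly 90% of observations on held-out data; coverage far below that suggests the tails are underestimated.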


Podcast Listening Time - Kaggle Competition

End-to-end pipeline for predicting user listening time on podcast episodes. Engineered features from the competition dataset and tuned a LightGBM model for leaderboard submission.


📊 Academic Data Science Projects

Skills: Mixed-effects modelling, spatial classification, war impact detection

Used GLMMs to classify conflict-driven fire activity across Ukraine. Temporal-spatial features and animated visualisations included.


Deepwater Horizon: Phytoplankton Disruption

Long-term zonal anomaly tracking of Gulf phytoplankton after the oil spill, using STL decomposition and satellite chlorophyll data.


TB Risk in Brazil (GAMs)

Spatial-temporal risk modelling of tuberculosis with Tweedie GAMs. Maps seasonal and population-adjusted case surges across 557 regions.


Compared GLMs for AIDS case progression, including Poisson, Gaussian, and Tweedie fits with AIC-based model selection.
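The AIC comparison can be sketched as follows. This is a deliberately simplified illustration using intercept-only Poisson and Gaussian fits on made-up counts; the coursework models included covariates and a Tweedie fit, which are omitted here. All names and data are hypothetical.

```python
import math
from typing import Sequence


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2.0 * n_params - 2.0 * log_likelihood


def poisson_loglik(y: Sequence[int]) -> float:
    """Maximised log-likelihood of an intercept-only Poisson model."""
    lam = sum(y) / len(y)  # the MLE of the rate is the sample mean
    return sum(v * math.log(lam) - lam - math.lgamma(v + 1) for v in y)


def gaussian_loglik(y: Sequence[float]) -> float:
    """Maximised log-likelihood of an intercept-only Gaussian model."""
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / n  # MLE variance (divides by n)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


# Hypothetical case counts, for illustration only.
counts = [2, 3, 1, 4, 2, 3, 5, 2]

scores = {
    "poisson": aic(poisson_loglik(counts), n_params=1),   # rate
    "gaussian": aic(gaussian_loglik(counts), n_params=2),  # mean and variance
}
best = min(scores, key=scores.get)
```

The model with the lowest AIC is preferred; differences of less than about 2 are usually treated as weak evidence either way.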


Space & Time Modelling of Global Temperatures

Combined ARIMA, kriging, and dynamic linear models to explore SST, AMOC and regional California temperatures over time.


Reaction Time Modelling & ML Classification

Hybrid Bayesian + ML workflow for patient data: MCMC diagnostics plus supervised classifiers (QDA, RF, SVM) on synthetic reaction-time datasets.


Multi-Part Coursework (PCA, Clustering, Regression)

Data challenges spanning sea level rise modelling, power demand clustering, and gene prediction via PCA and logistic regression.


Want to know more about any of these?
→ Feel free to contact me or visit my GitHub for full repos and updates.