PhD candidate at National Taipei University of Technology working on multi-agent deep reinforcement learning for UAV-assisted networks, reconfigurable intelligent surfaces (RIS), and space-air-ground integrated networks (SAGIN), with a hands-on focus on GPU-accelerated PHY (CUDA, cuSolver/cuBLAS, NVIDIA Sionna). I also apply the same RL/ML toolkit to systematic trading: regime detection, risk control, and production ML. GPA: 4.00 / 4.00
Specialty domains: OFDM PHY · MIMO · LDPC · 5G NR · RIS · O-RAN · Federated Learning · MADDPG · PPO · CUDA kernels · Systematic trading
End-to-end OFDM PHY + AI-RAN stack where every claim is backed by runnable code: CUDA C++ MMSE channel estimation (custom kernel + cuSolver Cpotrf/Cpotrs + cuBLAS Cgemm), a Sionna 5G LDPC + 3GPP TR 38.901 TDL BLER link, and link adaptation (OLLA / model-based greedy / model-free learned policy) driven by the measured BLER curves.
Result: 133× speedup over NumPy on an RTX 4090 (verified to ~1e-4 vs a CPU reference) · model-free policy matches a tuned OLLA / greedy from ACK/NACK feedback alone, at the lowest BLER · figures, green CI, and an NVIDIA_REVIEW.md design/verification guide
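As a rough illustration of the kind of CPU reference such a GPU path can be checked against, here is a minimal NumPy sketch of LMMSE channel estimation solved through a Cholesky factorization. It is not the repository's code: the function name `lmmse_estimate`, the exponential covariance model, and all parameter values are assumptions chosen for the example. The three steps loosely mirror a potrf / potrs / gemm sequence.

```python
import numpy as np

def lmmse_estimate(h_ls, r_hh, noise_var):
    """LMMSE smoothing of a least-squares channel estimate.

    h_ls      : (n, batch) complex LS estimates at n pilot subcarriers
    r_hh      : (n, n) Hermitian channel covariance (assumed known)
    noise_var : noise variance per pilot (assumed known)
    Returns R_hh (R_hh + noise_var * I)^{-1} h_ls.
    """
    n = r_hh.shape[0]
    a = r_hh + noise_var * np.eye(n)        # Hermitian positive definite
    chol = np.linalg.cholesky(a)            # a = chol @ chol^H (factorization step)
    y = np.linalg.solve(chol, h_ls)         # forward substitution
    x = np.linalg.solve(chol.conj().T, y)   # back substitution (solve step)
    return r_hh @ x                         # apply covariance (matrix-multiply step)

# Toy demo with a hypothetical exponential frequency-correlation model.
rng = np.random.default_rng(0)
n, batch, noise_var = 64, 200, 0.1
idx = np.arange(n)
r_hh = 0.95 ** np.abs(idx[:, None] - idx[None, :])

def complex_normal(shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

h_true = np.linalg.cholesky(r_hh) @ complex_normal((n, batch))  # correlated channel
h_ls = h_true + np.sqrt(noise_var) * complex_normal((n, batch)) # noisy LS estimate
h_mmse = lmmse_estimate(h_ls, r_hh, noise_var)

mse_ls = np.mean(np.abs(h_ls - h_true) ** 2)
mse_mmse = np.mean(np.abs(h_mmse - h_true) ** 2)
print(f"LS MSE: {mse_ls:.4f}  LMMSE MSE: {mse_mmse:.4f}")
```

Comparing a GPU result against a reference like this with `np.allclose` and a tolerance near 1e-4 is one plausible way to run the single-precision agreement check mentioned above.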
A CUDA kernel-optimization case study on the MMSE-apply complex GEMM: naive → shared-memory tiled → cuBLAS, benchmarked on an RTX 4090, with an Nsight Compute profiling methodology. Keeps a NumPy/CuPy reference and checks correctness against a CPU baseline. Result: clean naive→tiled→cuBLAS performance breakdown · GPU == CPU to ~1e-4 · CUDA compile-checked in CI · the GPU-kernel companion to the flagship
Standalone deep-RL study for 5G NR link adaptation: self-contained PyTorch PPO vs an OLLA industry baseline, 28-index MCS table (3GPP TS 38.214), non-stationary SNR with mobility/handover scenarios. Result: PPO learns a competitive policy from scratch (~3 min CPU training), evaluated fairly head-to-head vs OLLA · 15/15 unit tests
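For readers unfamiliar with the OLLA baseline, the sketch below shows the usual outer-loop idea: an SNR offset is nudged up on each ACK and down on each NACK, with the step ratio set so the long-run block error rate settles near a target. This is a toy under stated assumptions, not the project's implementation: the 28 evenly spaced SNR thresholds, the logistic error model, the 2 dB reporting bias, and all names are hypothetical.

```python
import math
import random

def select_mcs(snr_eff_db, thresholds_db):
    """Highest MCS index whose SNR threshold is met (index 0 as fallback)."""
    best = 0
    for idx, thr in enumerate(thresholds_db):
        if thr <= snr_eff_db:
            best = idx
    return best

class Olla:
    """Outer-loop link adaptation: ACK/NACK-driven SNR offset."""

    def __init__(self, bler_target=0.1, step_down_db=0.5):
        self.offset_db = 0.0
        self.step_down_db = step_down_db
        # Step ratio chosen so the offset has zero drift at the target BLER.
        self.step_up_db = step_down_db * bler_target / (1.0 - bler_target)

    def update(self, ack):
        if ack:
            self.offset_db += self.step_up_db
        else:
            self.offset_db -= self.step_down_db

def toy_block_error_prob(snr_db, threshold_db):
    """Hypothetical logistic BLER curve: 50% error exactly at the threshold."""
    return 1.0 / (1.0 + math.exp(2.0 * (snr_db - threshold_db)))

def simulate(num_slots=20000, true_snr_db=10.0, report_bias_db=2.0, seed=0):
    rng = random.Random(seed)
    thresholds = [-5.0 + i for i in range(28)]   # hypothetical 28-entry table
    olla = Olla()
    nacks = 0
    for _ in range(num_slots):
        reported = true_snr_db + report_bias_db  # optimistic channel report
        mcs = select_mcs(reported + olla.offset_db, thresholds)
        ack = rng.random() >= toy_block_error_prob(true_snr_db, thresholds[mcs])
        olla.update(ack)
        nacks += not ack
    return nacks / num_slots, olla.offset_db

bler, final_offset = simulate()
print(f"empirical BLER: {bler:.3f}, final offset: {final_offset:.2f} dB")
```

Because the offset only ever moves by these two step sizes, the empirical BLER is pinned close to the target as long as the offset stays bounded, which is why OLLA is a natural reference point for a learned policy.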
Federated learning for CSI feedback compression: CsiNet autoencoder + FedAvg under non-IID channel statistics, aligned with the 3GPP Release 18 AI-RAN study item. Result: FedAvg matches centralised (~−2 dB NMSE) and beats local-only by ~2 dB · 16/16 unit tests
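The two quantities behind that result can be sketched in a few lines: FedAvg aggregation as a sample-count-weighted average of client parameters, and NMSE in dB as the reconstruction metric. This is a minimal sketch with hypothetical names (`fedavg`, `nmse_db`) and toy arrays, not the repository's training loop or the CsiNet model itself.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Average per-client parameter dicts, weighted by local sample count."""
    total = float(sum(client_sizes))
    return {
        key: sum(params[key] * (size / total)
                 for params, size in zip(client_params, client_sizes))
        for key in client_params[0]
    }

def nmse_db(h_true, h_hat):
    """Normalized mean squared error in dB: 10 log10(|H - H_hat|^2 / |H|^2)."""
    err = np.sum(np.abs(h_true - h_hat) ** 2)
    ref = np.sum(np.abs(h_true) ** 2)
    return 10.0 * np.log10(err / ref)

# Toy usage: two clients with unequal data volumes.
clients = [{"w": np.array([1.0, 3.0])}, {"w": np.array([3.0, 5.0])}]
global_params = fedavg(clients, client_sizes=[1, 3])

# A reconstruction that is off by 10% in amplitude gives about -20 dB NMSE.
h = np.ones(4, dtype=complex)
score = nmse_db(h, 0.9 * h)
print(global_params["w"], round(float(score), 2))
```

Lower (more negative) NMSE is better, so a gap of about 2 dB between aggregated and local-only models is a difference in this metric.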
- ris-beamforming-optimizer: RIS phase optimization (manifold + deep-learning methods)
- oran-resource-allocation-xapp: O-RAN xApp-style resource scheduling with DRL
frix-project: friction-realistic execution benchmark
An honest, reproducible benchmark for intraday order execution on real crypto limit-order-book (LOB) data: classical (TWAP/VWAP/POV/Almgren-Chriss) vs deep RL (PPO/SAC) vs an LLM meta-controller, under one friction model (fees, queue, partial fills, adverse selection). Result (the honest null): no ML method beats a simple classical schedule on BTC or ETH; on ETH, deep RL is significantly worse (PPO, p = 0.035). The benchmark caught four simulator artifacts that had faked a ~2 bps edge; a training-length ablation rules out undertraining · paired-bootstrap significance · 13 unit tests · CI
market-regime-engine
Regime detection, risk allocation and live health monitoring: the open-sourced production layer of a systematic book. Six-indicator regime classifier (BULL/NEUTRAL/BEAR/CRISIS) with hard crisis gates, inverse-volatility weighting with portfolio vol targeting, and a monitoring battery with a trailing-drawdown kill-switch.
Result: explainable-by-construction regime calls · fail-safe tradeable flag for automated halts · 9/9 unit tests · CI
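To make the risk-allocation step concrete, here is a minimal sketch of inverse-volatility weighting followed by portfolio volatility targeting. It is an illustration under assumptions, not the engine's code: the function names, the 252-day annualization, the 10% target, the 2× leverage cap, and the synthetic returns are all hypothetical.

```python
import numpy as np

def inverse_vol_weights(returns):
    """Weights proportional to 1 / volatility, normalized to sum to 1.

    returns : (num_periods, num_assets) array of periodic returns
    """
    vols = returns.std(axis=0, ddof=1)
    raw = 1.0 / vols
    return raw / raw.sum()

def vol_target_leverage(returns, weights, target_vol,
                        periods_per_year=252, max_leverage=2.0):
    """Scale factor that brings annualized portfolio vol to target_vol (capped)."""
    cov = np.cov(returns, rowvar=False)
    port_vol = np.sqrt(weights @ cov @ weights * periods_per_year)
    return min(target_vol / port_vol, max_leverage)

# Toy usage on synthetic daily returns with three different volatilities.
rng = np.random.default_rng(0)
daily_returns = rng.normal(0.0, [0.005, 0.01, 0.02], size=(1000, 3))

weights = inverse_vol_weights(daily_returns)
leverage = vol_target_leverage(daily_returns, weights, target_vol=0.10)
print(np.round(weights, 3), round(float(leverage), 2))
```

The quietest asset receives the largest weight, and the leverage cap keeps the scaling bounded when measured volatility is unusually low.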
AI-Capital (public sample; full system private)
Multi-asset systematic trading system on QuantConnect: cross-asset momentum, regime detection, volatility targeting and crisis routing, iterated through 11+ versions under strict out-of-sample validation with bias auditing (look-ahead, weight-cap, vol-estimation pitfalls). Status: active paper-trading track record · public repo contains a representative architecture sample
Systematic factor research on a professional simulation platform; first passing alpha cleared the platform's evaluation thresholds.
8+ IEEE publications in AI-native wireless networks:
- Hybrid federated learning with MADDPG for UAV-assisted access networks
- Reconfigurable intelligent surface optimization for 6G
- SAGIN architectures with LEO (Starlink) integration
- Channel estimation and beamforming for next-gen PHY
ORCID
- PhD candidate, National Taipei University of Technology, 4.00 / 4.00 GPA
- NVIDIA NGC 6G Developer Program: Member, 2026 cohort
- Advisors: Prof. Hsin-Piao Lin · Assoc. Prof. Rong-Terng Juang
Email · LinkedIn · ORCID · Taipei, Taiwan · 🇫🇷 🇬🇧 🇹🇼