• Contributed to Meta's PyTorch OpenEnv library for RL post-training, implementing environment wrappers, reward shaping modules, and observation preprocessing utilities
∙ Applied these learnings to win 1st place at the Meta AI/AMD/PyTorch Synthetic Data Hackathon for LLM fine-tuning, using GRPO with custom reward weights