Building Veqiro 🚀 AI workforce for founders — marketing, legal, finance, ops & more on one platform.

Built an intelligent Tic Tac Toe simulator with two competing agents: one trained via reinforcement learning (Q-learning with ε-greedy exploration) and the other using the MiniMax algorithm with memoization. The RL agents converged to near-optimal play after 10,000+ training episodes. The memoized MiniMax agent achieved perfect adversarial play and outperformed under-trained RL agents. Created an interactive environment in Jupyter Notebook for step-by-step simulation of RL vs RL, RL vs MiniMax, and MiniMax vs MiniMax matches. Added dynamic performance tracking with visual and tabular analytics for win/loss/draw stats. Demonstrated the contrast between model-free learning and deterministic search through learning curves and move-by-move decision comparisons.

Tech Stack: Python · Jupyter Notebook · NumPy · Matplotlib
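For illustration, here is a minimal sketch of the two techniques named above. It is not the project's actual code. The board representation (a 9-tuple of "X", "O", or " "), the function names, and the `alpha`/`gamma` defaults are all assumptions made for this example. Memoization is done with `functools.lru_cache` from the standard library, and the Q-table is a plain dictionary.

```python
import random
from functools import lru_cache

# Assumed board format: tuple of 9 cells, each "X", "O", or " ".
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == " "]

def play(board, move, player):
    return board[:move] + (player,) + board[move + 1:]

@lru_cache(maxsize=None)  # memoization: each (board, player) is solved once
def minimax(board, player):
    """Value of `board` for X (+1 win, 0 draw, -1 loss), `player` to move."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    moves = legal_moves(board)
    if not moves:
        return 0  # full board, no winner: draw
    other = "O" if player == "X" else "X"
    values = [minimax(play(board, m, player), other) for m in moves]
    return max(values) if player == "X" else min(values)

def best_move(board, player):
    """Pick the move with the best minimax value for `player`."""
    other = "O" if player == "X" else "X"
    pick = max if player == "X" else min
    return pick(legal_moves(board),
                key=lambda m: minimax(play(board, m, player), other))

def epsilon_greedy(q, board, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy move."""
    moves = legal_moves(board)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda m: q.get((board, m), 0.0))

def q_update(q, board, move, reward, next_board, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: Q <- Q + alpha * (target - Q).

    `next_board` is the state the agent sees on its next turn
    (or a terminal board, in which case there is no future value).
    """
    next_moves = legal_moves(next_board) if winner(next_board) is None else []
    best_next = max((q.get((next_board, m), 0.0) for m in next_moves),
                    default=0.0)
    old = q.get((board, move), 0.0)
    q[(board, move)] = old + alpha * (reward + gamma * best_next - old)
```

Calling `minimax((" ",) * 9, "X")` returns 0, since Tic Tac Toe is a draw under perfect play from both sides. That is why an under-trained Q-learning agent can lose to a MiniMax opponent, while a well-trained one can at best draw against it.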


Tic Tac Toe AI – Reinforcement Learning vs Mi
