Blog
Welcome to my blog! Here I share insights from my daily reading of AI research papers and write about the open-source projects I’m working on. I hope these posts are useful to the community and spark interesting discussions.
AI Daily Paper Summary
I read AI and ML research papers daily and post concise summaries here. Each entry covers the key contributions, methods, and takeaways from recent papers across topics like deep learning, reinforcement learning, NLP, computer vision, and more.
Latest — May 05, 2026
- LLMs Get Lost In Multi-Turn Conversation — LLMs show consistent performance drops in multi-turn conversations with underspecified instructions vs. single-turn evals. ICLR 2026 Outstanding Paper. arxiv
- Transformers are Inherently Succinct — Theoretical proof that transformers encode formal languages far more succinctly than classical representations like finite automata. ICLR 2026 Outstanding Paper. arxiv
- The Polar Express — GPU-friendly polynomial approximations for the matrix sign function that are provably optimal, improving Muon optimizer throughput and stability. ICLR 2026 Honorable Mention. arxiv
- AcademiClaw — Bilingual benchmark of 80 long-horizon real student academic tasks for evaluating AI agents, bypassing contamination issues. arxiv
- MolmoAct2 — Open-weight, hardware-agnostic VLA model with real-time latency for robot manipulation from Ai2. arxiv
- Code World Model Preparedness Report — Meta’s public safety assessment of their code-generation model against catastrophic-risk domains. arxiv
- VLA-RFT — RL fine-tuning framework for robot VLA policies using world simulators as verified-reward environments. ICLR 2026. arxiv
- 12 Angry AI Agents — Multi-agent LLM benchmark based on jury deliberation from 12 Angry Men, testing deliberation and persuasion. arxiv
- Odysseus — RL-based approach scaling VLMs to 100+ turn game decision-making without human trajectory supervision. arxiv
- EngiAgent — Fully connected multi-agent architecture for solving open-ended engineering problems under data and physical constraints. arxiv
Interesting Projects
In this section, I share results and lessons learned from the open-source projects I’m working on. The goal is to give back to the community and document the journey of building and experimenting with different ideas.
