Blog
Welcome to my blog! Here I share insights from my daily reading of AI research papers and write about the open-source projects I’m working on. I hope these posts are useful to the community and spark interesting discussions.
AI Daily Paper Summary
I read AI and ML research papers daily and post concise summaries here. Each entry covers the key contributions, methods, and takeaways from recent papers across topics like deep learning, reinforcement learning, NLP, computer vision, and more.
Latest — May 21, 2026
- Anti-Self-Distillation for Reasoning RL — PMI-based suppression of over-confident teacher tokens sharpens focus on genuine deliberation steps; consistent gains on math/reasoning benchmarks without an external teacher. arxiv
- Mega-ASR — Scalable compound-data pipeline for in-the-wild² ASR robustness, tackling compositional noise/reverb/codec distortions that trip up current models. arxiv
- When Vision Speaks for Sound — Video MLLMs exploit visual-acoustic shortcuts instead of genuinely processing audio; paper exposes the “Clever Hans effect” and proposes evaluation protocols to catch it. arxiv
- Active Learners as Efficient PRP Rerankers — Reframes LLM pairwise reranking as active learning from noisy comparisons, improving NDCG@10 per LLM call within a fixed budget. arxiv
- Video2GUI — Fully automated framework that extracts GUI interaction trajectories from screen-recording videos for large-scale agent pretraining without human labeling. arxiv
- Train-Free Infinite-Frame Video Generation — Attention windowing + consistency regularization fixes train/inference mismatch in FIFO-diffusion, enabling coherent infinite-length video without fine-tuning. arxiv
- OpenComputer — Verifier-grounded framework for constructing reproducible software environments to evaluate and train computer-use agents on real desktop applications. arxiv
- GoLongRL — Fully open-source long-context RLVR recipe with 23K samples and multitask alignment to prevent capability regression during long-context RL training. arxiv
- Rank-1 RLVR Trajectories — RLVR weight updates are near-rank-1; extrapolating along this direction achieves strong reasoning gains with minimal actual training compute. arxiv
- OScaR: Extreme KV Cache Quantization — Principled replacement for per-channel quantization that handles Key tensor outliers, enabling sub-2-bit KV cache compression for longer contexts on the same hardware. arxiv
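For readers unfamiliar with the NDCG@10 metric mentioned in the reranking entry above, here is a minimal sketch of how it is commonly computed. It is not taken from the paper: the function names `dcg_at_k` and `ndcg_at_k` are my own, the gain is the plain relevance label (some evaluations use `2**rel - 1` instead), and the ideal ranking is built from the same list of labels, which assumes every relevant document is already in the candidate list.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k items.

    `relevances` holds graded relevance labels in ranked order.
    Assumption: linear gain (rel), not the exponential 2**rel - 1 variant.
    """
    # The item at 0-based position `rank` is discounted by log2(rank + 2),
    # so the first item has discount log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    # Assumption: the ideal ordering is the same labels sorted descending.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A perfectly ordered list scores 1.0, and a list with no relevant items scores 0.0 by the convention chosen here. Comparing this score against the number of LLM comparisons spent is one way to read the "NDCG@10 per LLM call" framing.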
Interesting Projects
In this section, I share results and lessons learned from the open-source projects I’m working on. The goal is to give back to the community and document the journey of building and experimenting with different ideas.
