Blog
Welcome to my blog! Here I share insights from my daily reading of AI research papers and write about interesting open-source projects I’m working on. I hope these posts are helpful to the community and spark interesting discussions.
AI Daily Paper Summary
I read AI and ML research papers daily and post concise summaries here. Each entry covers the key contributions, methods, and takeaways of a recent paper, on topics such as deep learning, reinforcement learning, NLP, and computer vision.
Latest — June 12, 2026
- MiniMax Sparse Attention — Blockwise sparse attention achieving near-full-attention quality at nearly linear cost, enabling million-token contexts in production. arxiv
- MaxProof — RL-trained proof generation, verification, and critique-repair merged into one model; population-level test-time scaling for competition-level math. arxiv
- EvoArena — Benchmark tracking agent memory evolution across dynamically shifting tasks, stress-testing robustness to stale knowledge. arxiv
- SpatialClaw — Redesigned action interface for tool-augmented VLMs yields large gains on 3D localization, relation understanding, and motion prediction. arxiv
- EurekAgent — Formalizes “agent environment engineering” and shows well-designed environments unlock autonomous scientific discovery beyond human-designed approaches. arxiv
- InterleaveThinker — RL-trained interleaved text-image generation for unified multimodal models, unlocking visual narrative and step-by-step guidance capabilities. arxiv
- FORT-Searcher — Synthesizes shortcut-resistant search tasks to force genuine multi-step retrieval, fixing “shortcut collapse” in deep search training data. arxiv
- LabVLA — Vision-Language-Action models adapted for scientific lab settings, bridging AI reasoning and physical bench-work execution. arxiv
- HYDRA-X — First unified multimodal model using a single ViT for both image and video tokenization, providing a shared representation space for understanding and generation. arxiv
- WeaveBench — 114 long-horizon computer-use tasks across 8 real-world domains requiring sustained cross-interface orchestration (GUI + CLI + browser + code). arxiv
Interesting Projects
In this section, I share results and lessons learned from open-source projects I’m working on. The goal is to give back to the community and document the process of building and experimenting with different ideas.
