Blog

Welcome to my blog! Here I share insights from my daily reading of AI research papers and write about the open-source projects I’m working on. I hope these posts are useful to the community and spark good discussions.


AI Daily Paper Summary

I read AI and ML research papers daily and post concise summaries here. Each entry covers the key contributions, methods, and takeaways from recent papers across topics like deep learning, reinforcement learning, NLP, computer vision, and more.

Latest — July 03, 2026

  1. Program-as-Weights — New programming paradigm that compiles natural-language function specs into compact neural adapters, letting a 0.6B model match Qwen3-32B at 1/50th the memory for offline/on-device AI. arxiv.org/abs/2607.02512
  2. EvoPolicyGym — First benchmark evaluating how well agents iteratively improve executable RL policies under fixed budgets, with trajectory-level diagnostics across 16 environments. arxiv.org/abs/2607.02440
  3. AgenticSTS — Bounded-memory testbed for long-horizon LLM agents that isolates the effect of individual memory components under a fixed-context contract. arxiv.org/abs/2607.02255
  4. Morphing into Hybrid Attention — Learned layer-selection method for converting full-attention Transformers into efficient hybrid models, outperforming heuristic placement strategies. arxiv.org/abs/2606.30562
  5. AgenticDataBench — Comprehensive benchmark for LLM-based data agents spanning ingestion, transformation, analysis, and visualization across the full data science pipeline. arxiv.org/abs/2607.01647
  6. Multi-Resolution Flow Matching — Training-free diffusion acceleration via staged multi-resolution sampling, achieving >5× speedup on text-to-image models with no retraining. arxiv.org/abs/2607.01642
  7. WorldDirector — Controllable video world model that decouples semantic motion orchestration from pixel rendering, maintaining persistent dynamic object memory across viewpoints. arxiv.org/abs/2607.02517
  8. Breaking Failure Cascades — Step-aware RL for medical multimodal reasoning that penalizes cascading errors in clinical image analysis, improving diagnostic reliability. arxiv.org/abs/2606.31825
  9. SkillCoach — Self-evolving rubric system that helps LLM agents improve skill selection and execution in large skill repositories without human annotation. arxiv.org/abs/2607.01874
  10. Distribution-wise Rewards for Visual Generation — Framework that fine-tunes generative models with distribution-level rewards to prevent reward hacking and mode collapse in RLHF for images. arxiv.org/abs/2607.02291

Browse all paper summaries →


Interesting Projects

In this section, I share results and lessons learned from open-source projects I’m working on. The goal is to give back to the community and document the journey of building and experimenting with different ideas.

Browse all project posts →