📄 Notable* Recent AI/ML arXiv Papers


📄 AI-Driven Structure Refinement of X-ray Diffraction
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16372v1
👥 Authors: Bin Cao (possible past Microsoft (United States) affiliation), Qian Zhang (possible past University of Washington affiliation), Zhenjie Feng, Taolue Zhang, Jiaqiang Huang, Lu-Tao Weng, Tong-Yi Zhang
Abstract

Artificial intelligence can rapidly propose candidate phases and structures from X-ray diffraction (XRD), but these hypotheses often fail in downstream refinement because peak intensities cannot be stably assigned under severe overlap and diffraction consistency is enforced only weakly. Here we introduce WPEM, a physics-constrained whole-pattern decomposition and refinement workflow that turns Bragg's law into an explicit constraint within a batch expectation–maximization framework. WPEM models...

📄 Multi-agent cooperation through in-context co-player inference
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16301v1
👥 Authors: Marissa A. Weis, Maciej Wołczyk, Rajai Nasser, Rif A. Saurous (possible past Google (United States) affiliation), Blaise Agüera y Arcas (possible past Google (United States) affiliation), João Sacramento (possible past ETH Zurich affiliation), Alexander Meulemans
Abstract

Achieving cooperation among self-interested agents remains a fundamental challenge in multi-agent reinforcement learning. Recent work showed that mutual cooperation can be induced between "learning-aware" agents that account for and shape the learning dynamics of their co-players. However, existing approaches typically rely on hardcoded, often inconsistent, assumptions about co-player learning rules or enforce a strict separation between "naive learners" updating on fast timescales and "meta-lea...

📄 Geometric Neural Operators via Lie Group-Constrained Latent Dynamics
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16209v1
👥 Authors: Jiaquan Zhang, Fachrina Dewi Puspitasari, Songbo Zhang, Yibei Liu, Kuien Liu, Caiyan Qin, Fan Mo, Peng Wang (possible past Peking University affiliation), Yang Yang (possible past Tencent (China) affiliation), Chaoning Zhang
Abstract

Neural operators offer an effective framework for learning solutions of partial differential equations for many physical systems in a resolution-invariant and data-driven manner. Existing neural operators, however, often suffer from instability in multi-layer iteration and long-horizon rollout, which stems from the unconstrained Euclidean latent space updates that violate the geometric and conservation laws. To address this challenge, we propose to constrain manifolds with low-rank Lie algebra p...

📄 Rethinking ANN-based Retrieval: Multifaceted Learnable Index for Large-scale Recommendation System
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16124v1
👥 Authors: Jiang Zhang (possible past Google (United States) affiliation), Yubo Wang, Wei Chang, Lu Han (possible past Google (United States) affiliation), Xingying Cheng, Feng Zhang, Min Li, Songhao Jiang, Wei Zheng, Harry Tran, Zhen Wang, Lei Chen, Yueming Wang, Benyu Zhang, Xiangjun Fan, Bi Xue, Qifan Wang (possible past Google (United States) affiliation)
Abstract

Approximate nearest neighbor (ANN) search is widely used in the retrieval stage of large-scale recommendation systems. In this stage, candidate items are indexed using their learned embedding vectors, and ANN search is executed for each user (or item) query to retrieve a set of relevant items. However, ANN-based retrieval has two key limitations. First, item embeddings and their indices are typically learned in separate stages: indexing is often performed offline after embeddings are trained, wh...
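To make the retrieval stage concrete, here is a minimal sketch of the setup the abstract describes: items indexed by learned embeddings, with a per-query top-k lookup. Exact inner-product search stands in for a real ANN index (all names and dimensions are illustrative assumptions, not the paper's code):

```python
import numpy as np

def retrieve_top_k(query: np.ndarray, item_embeddings: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k items with the highest inner-product score.

    Exact scan shown for clarity; production systems replace it with an
    ANN index (e.g. IVF or HNSW) built over the same embedding table."""
    scores = item_embeddings @ query      # one relevance score per item
    return np.argsort(-scores)[:k]        # indices of the k best scores

rng = np.random.default_rng(0)
items = rng.normal(size=(1000, 32))                    # toy item embedding table
items /= np.linalg.norm(items, axis=1, keepdims=True)  # unit-normalize rows
query = items[42] + 0.01 * rng.normal(size=32)         # query close to item 42

top = retrieve_top_k(query, items, k=5)   # item 42 should rank first
```

The paper's point is that this index is usually built offline, *after* the embeddings are trained; their method instead learns the index jointly.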

📄 OmniCT: Towards a Unified Slice-Volume LVLM for Comprehensive CT Analysis
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16110v1
👥 Authors: Tianwei Lin (possible past Baidu (China) affiliation), Zhongwei Qiu, Wenqiao Zhang, Jiang Liu, Yihan Xie, Mingjian Gao, Zhenxuan Fan, Zhaocheng Li, Sijing Li, Zhongle Xie, Peng Lu, Yueting Zhuang, Yingda Xia, Ling Zhang (possible past Nvidia (United States) affiliation), Beng Chin Ooi (possible past National University of Singapore affiliation)
Abstract

Computed Tomography (CT) is one of the most widely used and diagnostically information-dense imaging modalities, covering critical organs such as the heart, lungs, liver, and colon. Clinical interpretation relies on both slice-driven local features (e.g., sub-centimeter nodules, lesion boundaries) and volume-driven spatial representations (e.g., tumor infiltration, inter-organ anatomical relations). However, existing Large Vision-Language Models (LVLMs) remain fragmented in CT slice versus volum...

📄 Improving Interactive In-Context Learning from Natural Language Feedback
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.16066v1
👥 Authors: Martin Klissarov, Jonathan Cook, Diego Antognini, Hao Sun, Jingling Li, Natasha Jaques (possible past University of California, Berkeley affiliation), Claudiu Musat, Edward Grefenstette (possible past University of Oxford affiliation)
Abstract

Adapting one's thought process based on corrective feedback is an essential ability in human learning, particularly in collaborative settings. In contrast, the current large language model training paradigm relies heavily on modeling vast, static corpora. While effective for knowledge acquisition, it overlooks the interactive feedback loops essential for models to adapt dynamically to their context. In this work, we propose a framework that treats this interactive in-context learning ability not...

📄 How Uncertain Is the Grade? A Benchmark of Uncertainty Metrics for LLM-Based Automatic Assessment
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.16039v1
👥 Authors: Hang Li (possible past Huawei Technologies (China) affiliation), Kaiqi Yang, Xianxuan Long, Fedor Filippov, Yucheng Chu, Yasemin Copur-Gencturk, Peng He (possible past Tencent (China) affiliation), Cory Miller, Namsoo Shin, Joseph Krajcik, Hui Liu, Jiliang Tang
Abstract

The rapid rise of large language models (LLMs) is reshaping the landscape of automatic assessment in education. While these systems demonstrate substantial advantages in adaptability to diverse question types and flexibility in output formats, they also introduce new challenges related to output uncertainty, stemming from the inherently probabilistic nature of LLMs. Output uncertainty is an inescapable challenge in automatic assessment, as assessment results often play a critical role in informi...
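The abstract does not list the specific metrics benchmarked, but one standard output-uncertainty measure for an LLM grader is the entropy of the empirical grade distribution obtained by sampling the grader several times on the same answer. A minimal sketch (the function name and recipe are illustrative assumptions):

```python
import math
from collections import Counter

def grade_entropy(sampled_grades):
    """Shannon entropy (in nats) of the empirical distribution over grades
    from repeated samples of an LLM grader on the same student answer.
    0.0 means every sample agreed; larger values mean more uncertainty."""
    counts = Counter(sampled_grades)
    n = len(sampled_grades)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# A unanimous grader has zero entropy; a grader split between two
# scores has entropy ln 2.
unanimous = grade_entropy(["B+"] * 10)
split = grade_entropy(["A", "A", "B", "B"])
```

Metrics like this let an assessment pipeline flag low-agreement grades for human review rather than reporting them as certain.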

📄 Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15827v1
👥 Authors: Zhen Wu, Xiaoyu Huang, Lujie Yang, Yuanhang Zhang, Koushil Sreenath, Xi Chen (possible past University of California, Berkeley affiliation), Pieter Abbeel (possible past University of California, Berkeley affiliation), Rocky Duan, Angjoo Kanazawa (possible past University of California, Berkeley affiliation), Carmelo Sferrazza, Guanya Shi, C. Karen Liu (possible past Stanford University affiliation)
Abstract

While recent advances in humanoid locomotion have achieved stable walking on varied terrains, capturing the agility and adaptivity of highly dynamic human motions remains an open challenge. In particular, agile parkour in complex environments demands not only low-level robustness, but also human-like motion expressiveness, long-horizon skill composition, and perception-driven decision-making. In this paper, we present Perceptive Humanoid Parkour (PHP), a modular framework that enables humanoid r...

📄 MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15733v1
👥 Authors: Qiang Zhang (possible past Tsinghua University affiliation), Jiahao Ma, Peiran Liu, Shuai Shi, Zeran Su, Zifan Wang, Jingkai Sun, Wei Cui, Jialin Yu, Gang Han, Wen Zhao, Pihai Sun, Kangning Yin, Jiaxu Wang, Jiahang Cao, Lingfeng Zhang, Hao Cheng (possible past Tencent (China) affiliation), Xiaoshuai Hao, Yiding Ji, Junwei Liang (possible past Carnegie Mellon University affiliation), Jian Tang, Renjing Xu, Yijie Guo
Abstract

Humanoid motion control has witnessed significant breakthroughs in recent years, with deep reinforcement learning (RL) emerging as a primary catalyst for achieving complex, human-like behaviors. However, the high dimensionality and intricate dynamics of humanoid robots make manual motion design impractical, leading to a heavy reliance on expensive motion capture (MoCap) data. These datasets are not only costly to acquire but also frequently lack the necessary geometric context of the surrounding...

📄 Spanning the Visual Analogy Space with a Weight Basis of LoRAs
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15727v1
👥 Authors: Hila Manor, Rinon Gal, Haggai Maron, Tomer Michaeli (possible past Technion – Israel Institute of Technology affiliation), Gal Chechik (possible past Google (United States) affiliation)
Abstract

Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations difficult to articulate in words. Given a triplet $\{\mathbf{a}, \mathbf{a}', \mathbf{b}\}$, the goal is to generate $\mathbf{b}'$ such that $\mathbf{a} : \mathbf{a}' :: \mathbf{b} : \mathbf{b}'$. Recent methods adapt text-to-image models to this task using a single Low-Rank Adaptation (LoRA) module, but they face a fundamental limitation...

📄 PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15669v1
👥 Authors: Xiachong Feng, Liang Zhao (possible past Baidu (China) affiliation), Weihong Zhong, Yichong Huang, Yuxuan Gu, Lingpeng Kong (possible past Google (United States) affiliation), Xiaocheng Feng, Bing Qin
Abstract

Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves fine-tuning level performance through direct manipulation of personality vectors in activation space. Our key insight is that personality traits appear as extractable, approximately orthogonal directions in the model's representation space that supp...

📄 STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15620v2
👥 Authors: Shiqi Liu, Zeyu He, Guojian Zhan, Letian Tao, Zhilong Zheng, Jiang Wu, Yinuo Wang, Yang Guan, Kehua Sheng, Bo Zhang (possible past Tencent (China) affiliation), Keqiang Li, Jingliang Duan (possible past Tsinghua University affiliation), Shengbo Eben Li (possible past Tsinghua University affiliation)
Abstract

Reinforcement Learning (RL) has significantly improved large language model reasoning, but existing RL fine-tuning methods rely heavily on heuristic techniques such as entropy regularization and reweighting to maintain stability. In practice, they often suffer from late-stage performance collapse, leading to degraded reasoning quality and unstable training. Our analysis shows that the magnitude of token-wise policy gradients in RL is negatively correlated with token probability and local policy ...

📄 Dynamic Training-Free Fusion of Subject and Style LoRAs
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15539v1
👥 Authors: Qinglong Cao, Yuntian Chen, Chao Ma (possible past Shanghai Jiao Tong University affiliation), Xiaokang Yang (possible past Shanghai Jiao Tong University affiliation)
Abstract

Recent studies have explored the combination of multiple LoRAs to simultaneously generate user-specified subjects and styles. However, most existing approaches fuse LoRA weights using static statistical heuristics that deviate from LoRA's original purpose of learning adaptive feature adjustments and ignore the randomness of sampled inputs. To address this, we propose a dynamic training-free fusion framework that operates throughout the generation process. During the forward pass, at each LoRA-ap...
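For context, the static baseline the abstract criticizes merges LoRAs with fixed scales: each LoRA stores a low-rank update $B_iA_i$, and the fused weight is $W' = W + \sum_i s_i B_iA_i$. A minimal sketch of that baseline (shapes and scales are illustrative; the paper's contribution is adapting the combination dynamically during the forward pass, which is not shown here):

```python
import numpy as np

def merge_loras(W, loras, scales):
    """Static LoRA fusion baseline: W' = W + sum_i s_i * (B_i @ A_i).

    Each LoRA is a pair (B, A) with B of shape (d_out, r) and A of shape
    (r, d_in). The dynamic method in the paper would instead recompute
    the mixing per LoRA-applied layer and per sampled input."""
    W_merged = W.copy()
    for (B, A), s in zip(loras, scales):
        W_merged += s * (B @ A)
    return W_merged

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 16, 4
W = rng.normal(size=(d_out, d_in))                               # base layer weight
subject_lora = (rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))
style_lora = (rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))

W_fused = merge_loras(W, [subject_lora, style_lora], scales=[0.8, 0.6])
```

The weakness of this baseline, per the abstract, is exactly that the scales are statistical constants rather than functions of the sampled input.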

📄 CDRL: A Reinforcement Learning Framework Inspired by Cerebellar Circuits and Dendritic Computational Strategies
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15367v1
👥 Authors: Sibo Zhang (possible past Baidu (China) affiliation), Rui Jing, Liangfu Lv, Jian Zhang (possible past Tencent (China) affiliation), Yunliang Zang
Abstract

Reinforcement learning (RL) has achieved notable performance in high-dimensional sequential decision-making tasks, yet remains limited by low sample efficiency, sensitivity to noise, and weak generalization under partial observability. Most existing approaches address these issues primarily through optimization strategies, while the role of architectural priors in shaping representation learning and decision dynamics is less explored. Inspired by structural principles of the cerebellum, we propo...

📄 On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15322v1
👥 Authors: Taejong Joo, Wenhan Xia, Cheolmin Kim, Ming Zhang (possible past Peking University affiliation), Eugene Ie (possible past Google (United States) affiliation)
Abstract

Training large language models (LLMs) relies almost exclusively on dense adaptive optimizers with increasingly sophisticated preconditioners. We challenge this by showing that randomly masking parameter updates can be highly effective, with a masked variant of RMSProp consistently outperforming recent state-of-the-art optimizers. Our analysis reveals that the random masking induces a curvature-dependent geometric regularization that smooths the optimization trajectory. Motivated by this finding,...
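The core mechanism is easy to state: take an ordinary RMSProp step, then zero out a random subset of the per-coordinate updates. A minimal sketch, with hyperparameters and the Bernoulli masking scheme as illustrative assumptions (the paper's exact variant may differ):

```python
import numpy as np

def masked_rmsprop_step(params, grads, state, lr=1e-3, beta=0.99,
                        eps=1e-8, keep_prob=0.5, rng=None):
    """One RMSProp step where each coordinate's update is applied with
    probability `keep_prob` and dropped (zeroed) otherwise.

    state["v"] holds the exponential moving average of squared gradients
    (the RMSProp preconditioner)."""
    rng = rng or np.random.default_rng()
    state["v"] = beta * state["v"] + (1 - beta) * grads ** 2
    update = lr * grads / (np.sqrt(state["v"]) + eps)
    mask = rng.random(params.shape) < keep_prob   # Bernoulli(keep_prob) per coord
    return params - mask * update, state

rng = np.random.default_rng(0)
p = rng.normal(size=100)                  # toy parameter vector
g = rng.normal(size=100)                  # toy gradient
st = {"v": np.zeros_like(p)}
p_new, st = masked_rmsprop_step(p, g, st, keep_prob=0.5, rng=rng)
```

Masked-out coordinates are left exactly unchanged for that step, which is the source of the curvature-dependent regularization effect the abstract describes.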

📄 X-MAP: eXplainable Misclassification Analysis and Profiling for Spam and Phishing Detection
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15298v1
👥 Authors: Qi Zhang (possible past Tencent (China) affiliation), Dian Chen (possible past University of California, Berkeley affiliation), Lance M. Kaplan, Audun Jøsang, Dong Hyun Jeong, Feng Chen, Jin-Hee Cho
Abstract

Misclassifications in spam and phishing detection are very harmful, as false negatives expose users to attacks while false positives degrade trust. Existing uncertainty-based detectors can flag potential errors, but they can be deceived and offer limited interpretability. This paper presents X-MAP, an eXplainable Misclassification Analysis and Profiling framework that reveals topic-level semantic patterns behind model failures. X-MAP combines SHAP-based feature attributions with non-negative ma...

📄 Retrieval-Augmented Foundation Models for Matched Molecular Pair Transformations to Recapitulate Medicinal Chemistry Intuition
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16684v1
👥 Authors: Bo Pan, Peter Zhiping Zhang, Hao-Wei Pang, Alex Zhu, Xiang Yu (possible past University of Washington affiliation), Liying Zhang, Liang Zhao (possible past Baidu (China) affiliation)
Abstract

Matched molecular pairs (MMPs) capture the local chemical edits that medicinal chemists routinely use to design analogs, but existing ML approaches either operate at the whole-molecule level with limited edit controllability or learn MMP-style edits from restricted settings and small models. We propose a variable-to-variable formulation of analog generation and train a foundation model on large-scale MMP transformations (MMPTs) to generate diverse variables conditioned on an input variable. To e...

📄 Factored Latent Action World Models
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16229v1
👥 Authors: Zizhao Wang, Chang Shi, Jiaheng Hu, Kevin Rohling, Roberto Martín-Martín (possible past Stanford University affiliation), Amy Zhang (possible past University of California, Berkeley affiliation), Peter Stone
Abstract

Learning latent actions from action-free video has emerged as a powerful paradigm for scaling up controllable world model learning. Latent actions provide a natural interface for users to iteratively generate and manipulate videos. However, most existing approaches rely on monolithic inverse and forward dynamics models that learn a single latent action to control the entire scene, and therefore struggle in complex environments where multiple entities act simultaneously. This paper introduces Fac...

📄 Amortized Predictability-aware Training Framework for Time Series Forecasting and Classification
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16224v1
👥 Authors: Xu Zhang (possible past Tencent (China) affiliation), Peng Wang (possible past Peking University affiliation), Yichen Li, Wei Wang (possible past University of Oxford affiliation)
Abstract

Time series data are prone to noise in various domains, and training samples may contain low-predictability patterns that deviate from the normal data distribution, leading to training instability or convergence to poor local minima. Therefore, mitigating the adverse effects of low-predictability samples is crucial for time series analysis tasks such as time series forecasting (TSF) and time series classification (TSC). While many deep learning models have achieved promising performance, few con...

📄 SEMixer: Semantics Enhanced MLP-Mixer for Multiscale Mixing and Long-term Time Series Forecasting
🗓️ Published: 2/18/2026
🔗 http://arxiv.org/abs/2602.16220v1
👥 Authors: Xu Zhang (possible past Tencent (China) affiliation), Qitong Wang, Peng Wang (possible past Peking University affiliation), Wei Wang (possible past University of Oxford affiliation)
Abstract

Modeling multiscale patterns is crucial for long-term time series forecasting (TSF). However, redundancy and noise in time series, together with semantic gaps between non-adjacent scales, make the efficient alignment and integration of multi-scale temporal dependencies challenging. To address this, we propose SEMixer, a lightweight multiscale model designed for long-term TSF. SEMixer features two key components: a Random Attention Mechanism (RAM) and a Multiscale Progressive Mixing Chain (MPMC)....

📄 Examining Fast Radiative Feedbacks Using Machine-Learning Weather Emulators
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.16090v1
👥 Authors: Ankur Mahesh, William D. Collins, Travis A. O'Brien, Paul B. Goddard, Sinclaire Zebaze, Shashank Subramanian, James P. C. Duncan, Oliver Watt-Meyer, Boris Bonev, Thorsten Kurth (possible past Nvidia (United States) affiliation), Karthik Kashinath (possible past Nvidia (United States) affiliation), Michael S. Pritchard, Da Yang
Abstract

The response of the climate system to increased greenhouse gases and other radiative perturbations is governed by a combination of fast and slow feedbacks. Slow feedbacks are typically activated in response to changes in ocean temperatures on decadal timescales and manifest as changes in climatic state with no recent historical analogue. However, fast feedbacks are activated in response to rapid atmospheric physical processes on weekly timescales, and they are already operative in the present-da...

📄 Operationalising the Superficial Alignment Hypothesis via Task Complexity
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15829v1
👥 Authors: Tomás Vergara-Browne, Darshan Patil, Ivan Titov, Siva Reddy (possible past University of Edinburgh affiliation), Tiago Pimentel (possible past ETH Zurich affiliation), Marius Mosbach
Abstract

The superficial alignment hypothesis (SAH) posits that large language models learn most of their knowledge during pre-training, and that post-training merely surfaces this knowledge. The SAH, however, lacks a precise definition, which has led to (i) different and seemingly orthogonal arguments supporting it, and (ii) important critiques of it. We propose a new metric called task complexity: the length of the shortest program that achieves a target performance on a task. In this framework, the SA...

📄 Dex4D: Task-Agnostic Point Track Policy for Sim-to-Real Dexterous Manipulation
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15828v1
👥 Authors: Yuxuan Kuang, Sungjae Park, Katerina Fragkiadaki (possible past University of California, Berkeley affiliation), Shubham Tulsiani (possible past University of California, Berkeley affiliation)
Abstract

Learning generalist policies capable of accomplishing a plethora of everyday tasks remains an open challenge in dexterous manipulation. In particular, collecting large-scale manipulation data via real-world teleoperation is expensive and difficult to scale. While learning in simulation provides a feasible alternative, designing multiple task-specific environments and rewards for training is similarly challenging. We propose Dex4D, a framework that instead leverages simulation for learning task-a...

📄 GLM-5: from Vibe Coding to Agentic Engineering
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15763v1
👥 Authors: GLM-5 Team, Aohan Zeng, Xin Lv, Zhenyu Hou (possible past Baidu (China) affiliation), Zhengxiao Du, Qinkai Zheng, Bin Chen, Da Yin, Chendi Ge, Chengxing Xie, Cunxiang Wang, Gengzheng Pan, Hao Zeng, Haoke Zhang, Haoran Wang, Huilong Chen, Jiajie Zhang, Jian Jiao, Jiaqi Guo, Jingsen Wang, Jingzhao Du, Jinzhu Wu, Kedong Wang, Lei Li (possible past Carnegie Mellon University affiliation), Lin Fan, Lucen Zhong, Mingdao Liu, Mingming Zhao, Pengfan Du, Qian Dong, Rui Lu, Shuang-Li, Shulin Cao, Song Liu, Ting Jiang, Xiaodong Chen, Xiaohan Zhang, Xuancheng Huang, Xuezhen Dong, Yabo Xu, Yao Wei, Yifan An, Yilin Niu, Yitong Zhu, Yuanhao Wen, Yukuo Cen, Yushi Bai, Zhongpei Qiao, Zihan Wang (possible past Tsinghua University affiliation), Zikang Wang, Zilin Zhu, Ziqiang Liu, Zixuan Li, Bojie Wang, Bosi Wen, Can Huang, Changpeng Cai, Chao Yu, Chen Li (possible past Tencent (China) affiliation), Chen Li (possible past Tencent (China) affiliation), Chenghua Huang, Chengwei Hu, Chenhui Zhang, Chenzheng Zhu, Congfeng Yin, Daoyan Lin, Dayong Yang, Di Wang, Ding Ai, Erle Zhu, Fangzhou Yi, Feiyu Chen, Guohong Wen, Hailong Sun, Haisha Zhao, Haiyi Hu, Hanchen Zhang, Hanrui Liu, Hanyu Zhang, Hao Peng (possible past Tsinghua University affiliation), Hao Tai, Haobo Zhang, He Liu (possible past Google (United States) affiliation), Hongwei Wang, Hongxi Yan, Hongyu Ge, Huan Liu (possible past Tsinghua University affiliation), Huan Liu (possible past Tsinghua University affiliation), Huanpeng Chu, Jia'ni Zhao, Jiachen Wang, Jiajing Zhao, Jiamin Ren, Jiapeng Wang, Jiaxin Zhang, Jiayi Gui, Jiayue Zhao, Jijie Li, Jing An, Jing Li (possible past Tencent (China) affiliation), Jingwei Yuan, Jinhua Du, Jinxin Liu, Junkai Zhi, Junwen Duan, Kaiyue Zhou, Kangjian Wei, Ke Wang (possible past Google (United States) affiliation), Keyun Luo, Laiqiang Zhang, Leigang Sha, Liang Xu, Lindong Wu, Lintao Ding, Lu Chen, Minghao Li, Nianyi Lin, Pan Ta, Qiang Zou, Rongjun Song, Ruiqi Yang, Shangqing Tu, Shangtong Yang, Shaoxiang Wu, Shengyan Zhang, Shijie Li, Shuang Li, Shuyi Fan, Wei Qin, Wei Tian, Weining Zhang, Wenbo Yu, Wenjie Liang, Xiang Kuang, Xiangmeng Cheng, Xiangyang Li, Xiaoquan Yan, Xiaowei Hu, Xiaoying Ling, Xing Fan, Xingye Xia, Xinyuan Zhang, Xinze Zhang, Xirui Pan, Xunkai Zhang, Yandong Wu, Yanfu Li, Yidong Wang, Yifan Zhu, Yijun Tan, Yilin Zhou, Yiming Pan, Ying Zhang (possible past Tencent (China) affiliation), Yinpei Su, Yipeng Geng, Yipeng Geng, Yong Yan, Yonglin Tan, Yuean Bi, Yuhan Shen, Yuhao Yang, Yujiang Li, Yunan Liu, Yunqing Wang (possible past Google (United States) affiliation), Yuntao Li, Yurong Wu, Yutao Zhang, Yuxi Duan, Yuxuan Zhang, Zezhen Liu, Zhengtao Jiang, Zhenhe Yan, Zheyu Zhang, Zhixiang Wei, Zhuo Chen, Zhuoer Feng, Zijun Yao, Ziwei Chai, Ziyuan Wang, Zuzhou Zhang, Bin Xu, Minlie Huang, Hongning Wang, Juanzi Li, Yuxiao Dong (possible past Microsoft (United States) affiliation), Jie Tang (possible past Tsinghua University affiliation)
Abstract

We present GLM-5, a next-generation foundation model designed to transition the paradigm of vibe coding to agentic engineering. Building upon the agentic, reasoning, and coding (ARC) capabilities of its predecessor, GLM-5 adopts DSA to significantly reduce training and inference costs while maintaining long-context fidelity. To advance model alignment and autonomy, we implement a new asynchronous reinforcement learning infrastructure that drastically improves post-training efficiency by decoupli...

📄 World Action Models are Zero-shot Policies
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15922v1
👥 Authors: Seonghyeon Ye, Yunhao Ge, Kaiyuan Zheng, Shenyuan Gao, Sihyun Yu, George Kurian, Suneel Indupuru, You Liang Tan, Chuning Zhu, Jiannan Xiang, Ayaan Malik, Kyungmin Lee, William Liang, Nadun Ranawaka, Jiasheng Gu, Yinzhen Xu, Guanzhi Wang (possible past Stanford University affiliation), Fengyuan Hu, Avnish Narayan, Johan Bjorck, Jing Wang (possible past Google (United States) affiliation), Gwanghyun Kim, Dantong Niu, Ruijie Zheng, Yuqi Xie, Jimmy Wu, Qi Wang (possible past Tsinghua University affiliation), Ryan Julian, Danfei Xu, Yilun Du (possible past Massachusetts Institute of Technology affiliation), Yevgen Chebotar (possible past Google (United States) affiliation), Scott Reed (possible past Google (United States) affiliation), Jan Kautz (possible past Nvidia (United States) affiliation), Yuke Zhu (possible past Stanford University affiliation), Linxi "Jim" Fan, Joel Jang
Abstract

State-of-the-art Vision-Language-Action (VLA) models excel at semantic generalization but struggle to generalize to unseen physical motions in novel environments. We introduce DreamZero, a World Action Model (WAM) built upon a pretrained video diffusion backbone. Unlike VLAs, WAMs learn physical dynamics by predicting future world states and actions, using video as a dense representation of how the world evolves. By jointly modeling video and action, DreamZero learns diverse skills effectively f...

📄 ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
🗓️ Published: 2/17/2026
🔗 http://arxiv.org/abs/2602.15521v1
👥 Authors: Ziyu Zhao, Tong Zhu (possible past Nvidia (United States) affiliation), Zhi Zhang, Tiantian Fan, Jinluan Yang, Kun Kuang, Zhongyu Wei, Fei Wu (possible past Google (United States) affiliation), Yu Cheng (possible past National University of Singapore affiliation)
Abstract

Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation. However, training high-quality MoEs from scratch is prohibitively expensive. A promising alternative is to convert pretrained dense models into sparse MoEs. Existing dense-to-MoE methods fall into two categories: dynamic structural pruning that converts dense models into MoE architectures with moderate sparsity to balance performance and inference effici...

*Notable papers are those with at least two authors from a "big" AI/ML lab.