Prolonging Tool Life: Learning Skillful Use of General-Purpose Tools Through Lifespan-Guided Reinforcement Learning
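| Authors | Po-Yen Wu · Cheng-Yu Kuo · Yuki Kadokawa · Takamitsu Matsubara |
| Journal | IEEE Access |
| Publication date | February 2026 |
| Volume/Issue | Volume 14 (issue not listed) |
| Technical category | Intelligence and AI applications |
| Technical tags | Reinforcement learning, finite element simulation, machine learning, reliability analysis |
| Relevance score | ★★ 2.0 / 5.0 |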
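Summary
The lifespan of general-purpose tools used in uncertain task environments is highly sensitive to how the tools are operated. To address this, the paper proposes a framework that embeds Remaining Useful Life (RUL) in the reinforcement learning reward. The framework combines finite element analysis with Miner's Rule to estimate the lifespan degradation caused by accumulated stress, and introduces an adaptive reward normalization mechanism. The resulting policies significantly prolong tool life and transfer effectively from simulation to real hardware.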
English Abstract
In inaccessible environments with uncertain task demands, robots often rely on general-purpose tools that lack predefined usage strategies. These tools are not tailored for particular operations, making their longevity highly sensitive to how they are used. This creates a fundamental challenge: how can a robot learn a tool-use policy that both completes the task and prolongs the tool’s lifespan? In this work, we address this challenge by introducing a reinforcement learning (RL) framework that incorporates tool lifespan as a factor during policy optimization. Our framework leverages Finite Element Analysis (FEA) and Miner’s Rule to estimate Remaining Useful Life (RUL) based on accumulated stress, and integrates the RUL into the RL reward to guide policy learning toward lifespan-guided behavior. To handle the fact that RUL can only be estimated after task execution, we introduce an Adaptive Reward Normalization (ARN) mechanism that dynamically adjusts reward scaling based on estimated RULs, ensuring stable learning signals. We validate our method across simulated and real-world tool use tasks, including Object-Moving and Door-Opening with multiple general-purpose tools. The learned policies consistently prolong tool lifespan (up to $8.01\times $ in simulation) and transfer effectively to real-world settings, demonstrating the practical value of learning lifespan-guided tool use strategies.
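The pipeline described in the abstract (accumulated stress, Miner's Rule, RUL, normalized reward) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the Basquin S-N relation and its constants `sigma_f` and `b`, the definition of RUL as task repetitions until the damage sum reaches 1, the running min-max scaling used as a stand-in for ARN, and the names `estimate_rul`, `AdaptiveRewardNormalizer`, and `episode_reward`. In the paper the stress values come from FEA; here they are plain inputs.

```python
import math
from dataclasses import dataclass
from typing import Sequence


def cycles_to_failure(stress_amplitude: float,
                      sigma_f: float = 900.0,
                      b: float = -0.1) -> float:
    """Cycles to failure from Basquin's relation S = sigma_f * (2N)^b.

    sigma_f (MPa) and b are hypothetical material constants.
    stress_amplitude must be positive.
    """
    return 0.5 * (stress_amplitude / sigma_f) ** (1.0 / b)


def estimate_rul(stress_amplitudes: Sequence[float]) -> float:
    """Task repetitions until Miner's damage sum D = sum(n_i / N_i) reaches 1.

    Each entry is the amplitude of one stress cycle (n_i = 1) observed
    during a single task execution.
    """
    damage_per_task = sum(1.0 / cycles_to_failure(s) for s in stress_amplitudes)
    return math.inf if damage_per_task == 0.0 else 1.0 / damage_per_task


@dataclass
class AdaptiveRewardNormalizer:
    """Assumed stand-in for ARN: rescale RUL by the range seen so far."""
    rul_min: float = math.inf
    rul_max: float = -math.inf

    def update(self, rul: float) -> None:
        # Track the smallest and largest finite RUL estimates observed.
        if math.isfinite(rul):
            self.rul_min = min(self.rul_min, rul)
            self.rul_max = max(self.rul_max, rul)

    def normalize(self, rul: float) -> float:
        span = self.rul_max - self.rul_min
        if not math.isfinite(span) or span <= 0.0:
            return 0.0  # no usable range yet
        return min(1.0, max(0.0, (rul - self.rul_min) / span))


def episode_reward(task_success: bool, rul: float,
                   normalizer: AdaptiveRewardNormalizer,
                   rul_weight: float = 1.0) -> float:
    """End-of-episode reward: task term plus normalized RUL bonus on success."""
    normalizer.update(rul)
    if not task_success:
        return 0.0
    return 1.0 + rul_weight * normalizer.normalize(rul)


if __name__ == "__main__":
    normalizer = AdaptiveRewardNormalizer()
    harsh = estimate_rul([450.0, 450.0])   # two high-stress cycles per task
    gentle = estimate_rul([225.0])         # one low-stress cycle per task
    for name, rul in (("harsh", harsh), ("gentle", gentle)):
        print(name, rul, episode_reward(True, rul, normalizer))
```

Because the RUL is only available once the episode has finished, the sketch applies the lifespan bonus as a terminal reward. The normalizer keeps that bonus in a fixed range even though raw RUL values can differ by orders of magnitude between gentle and harsh usage.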
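SunView In-Depth Analysis
This study explores lifespan-aware intelligent decision-making at the level of robotic manipulation, so its direct relevance to Sungrow's business is low. However, its lifespan modeling (FEA plus Miner's Rule) and its reinforcement-learning-driven approach to reliability optimization could inform the design of coupled thermal-electrical-mechanical multiphysics lifespan prediction and active protection strategies for power equipment such as PCS and string inverters under dynamic operating conditions. The approach is particularly applicable to online RUL assessment and derating control optimization for key components, such as IGBT modules and capacitors, in long-running energy storage systems such as PowerTitan. We recommend that the reliability laboratory carry out preliminary research on lifespan-guided control at the power-device level.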