Citation: CHEN Cheng, SU Chengjie. Energy efficient path planning method for coal mine patrol robot[J]. Safety in Coal Mines, 2024, 55(6): 211−216. doi: 10.13347/j.cnki.mkaq.20240016

Energy efficient path planning method for coal mine patrol robot

Abstract: Existing path planning methods for mining robots suffer from low efficiency, slow convergence, and a tendency to become trapped in local optima. To address these shortcomings, a path planning method based on the Actor-Critic algorithm is proposed. First, the steering angle of the patrol robot is calculated from the real-time positions of the inspection targets and obstacles to determine its heading, which can significantly improve the efficiency of path planning. With the goals of minimizing energy consumption and avoiding collisions, the patrol robot then learns the order in which to inspect the targets and its travel speed as the mine environment changes dynamically and randomly. Because the mine environment changes dynamically and continuously, the state dimension is high, so deep learning networks are used to estimate the actions and rewards generated by continuous states. To improve learning efficiency, two networks are used, a policy (Actor) network and a value (Critic) network, so that the policy and the value estimate are updated in real time. Simulation results show that, with the proposed method, the patrol robot can plan a safe and reasonable inspection route in a dynamic environment and can complete the inspection task with a 98% success probability and lower energy consumption.
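To make the Actor-Critic idea in the abstract concrete, the following is a minimal sketch of a one-step Actor-Critic update in plain Python. It is not the authors' implementation: it uses small lookup tables as a stand-in for the deep policy and value networks, and the corridor environment, the two speed levels, the quadratic energy model, and all names (`N_CELLS`, `SPEEDS`, `energy`, `actor_critic_step`, `train`) are assumptions chosen only for illustration.

```python
import math
import random

# Hypothetical toy setup (not from the paper): a 1-D corridor of N_CELLS cells
# with the inspection target in the last cell. The robot picks a speed level
# at every step, and energy use is assumed to grow with the square of speed.
N_CELLS = 6
SPEEDS = [1, 2]  # cells moved per step


def energy(speed):
    """Assumed energy model: quadratic in speed."""
    return float(speed ** 2)


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, v, s, a, r, s_next, done,
                      alpha_pi=0.1, alpha_v=0.1, gamma=1.0):
    """Apply a one-step Actor-Critic update in place and return the TD error.

    theta[s][a] holds the actor's action preferences (the policy),
    v[s] holds the critic's state-value estimate.
    """
    target = r if done else r + gamma * v[s_next]
    td_error = target - v[s]
    probs = softmax(theta[s])        # policy before the update
    v[s] += alpha_v * td_error       # critic update
    for b in range(len(theta[s])):   # actor update: gradient of log-softmax
        grad = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += alpha_pi * td_error * grad
    return td_error


def train(episodes=2000, seed=0):
    """Train on the toy corridor and return (theta, v)."""
    rng = random.Random(seed)
    theta = [[0.0] * len(SPEEDS) for _ in range(N_CELLS)]
    v = [0.0] * N_CELLS
    for _ in range(episodes):
        s = 0
        while s < N_CELLS - 1:
            # Sample a speed level from the current policy.
            a = rng.choices(range(len(SPEEDS)), weights=softmax(theta[s]))[0]
            s_next = min(s + SPEEDS[a], N_CELLS - 1)
            done = s_next == N_CELLS - 1
            r = -energy(SPEEDS[a])   # reward is the negative energy cost
            actor_critic_step(theta, v, s, a, r, s_next, done)
            s = s_next
    return theta, v
```

Calling `theta, v = train()` and then `softmax(theta[0])` gives the learned probabilities of each speed level in the starting cell. Because the critic's TD error drives both updates, the policy and the value estimate change at every step rather than at the end of an episode, which is the property the abstract refers to as real-time updating.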

       
