Received: 2016-08-08 Revised: 2017-06-15
Abstract: To address the low degree of intelligence in arrival flight sequencing, a reinforcement learning model for arrival flight sequencing is proposed. The state, action, agent, environment, reward function, constraints, and Q-learning procedure of the model are defined first. The state is the set of arrival times of the arriving flights, and an action is an adjustment to a flight's arrival time. The agent adjusts the arrival times, the environment responds to each action, and a new arrival time and a reward value are passed back to the agent. The reward function accounts for delay time, economic cost, and the impact on subsequent flights. The constraints are that a flight cannot land early, that its assigned arrival time is no earlier than its scheduled arrival time, and that the arrival flow cannot exceed the airport's arrival capacity. The model was validated with arrival flight data from Shuangliu Airport. First-come-first-served (FCFS) sequencing and the reinforcement learning model were compared in terms of landing sequence, delay time, delay cost, delay cost of subsequent flights, and reward value. The reward function value was 3164 for FCFS and 2880 for the reinforcement learning model, and the reinforcement learning model performed better. The evaluation indices, weights, and constraints of the reward function can be set according to actual air traffic control practice, so the model can provide decision support to air traffic controllers sequencing arrival flights.
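The abstract names the ingredients of the model (state, action, reward, constraints, Q-learning) but gives no formulas, so the following is only a minimal sketch of how tabular Q-learning could be applied to a toy sequencing problem. It is not the authors' formulation: here an action picks which flight lands next rather than adjusting an arrival time, the reward is simply the negative weighted delay, and the flight data, the `SEPARATION` value standing in for arrival capacity, and all function names are hypothetical.

```python
import random

# Hypothetical toy data: (flight id, scheduled arrival in minutes, delay cost per minute).
FLIGHTS = [("A", 0, 1), ("B", 1, 5), ("C", 2, 1), ("D", 3, 4)]
SEPARATION = 3  # assumed minimum spacing between landings, standing in for arrival capacity


def land(last_time, flight):
    """Return (assigned time, weighted delay cost) for landing `flight` next."""
    _, scheduled, weight = flight
    # Constraint: never earlier than scheduled, never closer than SEPARATION to the previous landing.
    assigned = scheduled if last_time is None else max(scheduled, last_time + SEPARATION)
    return assigned, weight * (assigned - scheduled)


def evaluate(order):
    """Total weighted delay cost and assigned times for a landing order (flight indices)."""
    last, total, times = None, 0, []
    for i in order:
        last, cost = land(last, FLIGHTS[i])
        total += cost
        times.append(last)
    return total, times


def q_learning(episodes=5000, alpha=0.5, gamma=1.0, seed=0):
    """Tabular Q-learning; state = (set of landed flights, last landing time)."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated return
    n = len(FLIGHTS)
    for _ in range(episodes):
        landed, last = frozenset(), None
        while len(landed) < n:
            state = (landed, last)
            # Uniform random exploration; Q-learning is off-policy, so this is allowed.
            action = rng.choice([i for i in range(n) if i not in landed])
            new_last, cost = land(last, FLIGHTS[action])
            new_landed = landed | {action}
            remaining = [i for i in range(n) if i not in new_landed]
            best_next = max(
                (q.get(((new_landed, new_last), a), 0.0) for a in remaining),
                default=0.0,  # terminal state: no future reward
            )
            old = q.get((state, action), 0.0)
            # Reward is the negative delay cost, so maximizing return minimizes cost.
            q[(state, action)] = old + alpha * (-cost + gamma * best_next - old)
            landed, last = new_landed, new_last
    return q


def greedy_order(q):
    """Read a landing order out of the learned Q-table by always taking the best action."""
    landed, last, order = frozenset(), None, []
    while len(landed) < len(FLIGHTS):
        remaining = [i for i in range(len(FLIGHTS)) if i not in landed]
        action = max(remaining, key=lambda a: q.get(((landed, last), a), 0.0))
        last, _ = land(last, FLIGHTS[action])
        landed = landed | {action}
        order.append(action)
    return order


# Baseline: first come, first served (sorted by scheduled arrival time).
fcfs_order = sorted(range(len(FLIGHTS)), key=lambda i: FLIGHTS[i][1])
fcfs_cost, fcfs_times = evaluate(fcfs_order)
rl_order = greedy_order(q_learning())
rl_cost, rl_times = evaluate(rl_order)
print("FCFS:", [FLIGHTS[i][0] for i in fcfs_order], fcfs_cost)
print("Q-learning:", [FLIGHTS[i][0] for i in rl_order], rl_cost)
```

The sketch compares total weighted delay, where lower is better. This mirrors the comparison reported in the abstract, in which the smaller value (2880 against 3164) belongs to the better schedule. A reward closer to the one described there would add terms for economic cost and for the impact on subsequent flights, each with its own weight.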
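Article ID: 201600781 CLC number: Document code:
Funding: National Air Traffic Control Committee research project "Research on Integrated Simulation Training Technology for Joint Military-Civil Air Traffic Control Operations" (GKG201403004)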
Author | Affiliation | Email |
Wu Xiping | Sichuan University | wuxipingstar@126.com |
Yang Hongyu | Sichuan University | |
Yang Bo* | Sichuan University | boyang@scu.edu.cn |
Cite this article:
Wu Xiping, Yang Hongyu, Yang Bo. Research on Reinforcement Learning Model of Arrival Flights Scheduling[J]. Advanced Engineering Sciences, 2017, 49(Z2): 173-178.