Paper Title

Sales Time Series Analytics Using Deep Q-Learning

Author

Pavlyshenko, Bohdan M.

Abstract

This article describes the use of deep Q-learning models in sales time series analytics. In contrast to supervised machine learning, which is a kind of passive learning from historical data, Q-learning is a kind of active learning whose goal is to maximize a reward through an optimal sequence of actions. The work considers a model-free Q-learning approach to optimal pricing strategies and supply-demand problems. The main idea of the study is to show that, with a deep Q-learning approach to time series analytics, the sequence of actions can be optimized by maximizing the reward function, both when the environment for the learning agent's interaction is modeled with a parametric model and when it is modeled from historical data. In the pricing optimization case study, the environment was modeled using the dependence of sales on the extra price and randomly simulated demand. In the supply-demand case study, historical demand time series were proposed for environment modeling, and agent states were represented by promo actions, previous demand values, and weekly seasonality features. The results show that deep Q-learning can optimize the decision-making process for price optimization and supply-demand problems. Environment modeling with parametric models and historical data can be used for the cold start of the learning agent. In subsequent steps, after the cold start, the trained agent can be used in a real business environment.
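
To make the idea of learning a pricing policy against a simulated environment concrete, here is a minimal sketch. It is not the paper's model: it uses plain tabular Q-learning instead of a deep Q-network, and the price grid, unit cost, linear demand curve, noise range, and weekly seasonality factors are all hypothetical values chosen only for illustration. The state is the day of the week, the action is the price, and the reward is the daily profit.

```python
import random

# All numbers below are illustrative assumptions, not values from the paper.
PRICES = [7.0, 10.0, 13.0]                            # candidate prices (actions)
UNIT_COST = 6.0                                       # assumed cost per unit
WEEKLY_FACTOR = [0.8, 0.9, 1.0, 1.0, 1.1, 1.3, 1.2]   # assumed weekly seasonality


def step(day, action, rng):
    """Toy parametric environment: linear price response with random demand noise."""
    price = PRICES[action]
    base_demand = max(0.0, 120.0 - 8.0 * price)       # assumed demand curve
    demand = base_demand * WEEKLY_FACTOR[day] * rng.uniform(0.9, 1.1)
    reward = (price - UNIT_COST) * demand             # profit for the day
    next_day = (day + 1) % 7
    return next_day, reward


def train(n_steps=50_000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in range(7)]       # q[state][action]
    day = 0
    for _ in range(n_steps):
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.randrange(len(PRICES))
        else:
            action = max(range(len(PRICES)), key=lambda a: q[day][a])
        next_day, reward = step(day, action, rng)
        # Q-learning update: move toward reward + discounted best next value.
        target = reward + gamma * max(q[next_day])
        q[day][action] += alpha * (target - q[day][action])
        day = next_day
    return q


def greedy_prices(q):
    """Price with the highest learned Q-value for each day of the week."""
    return [PRICES[max(range(len(PRICES)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    print(greedy_prices(train()))
```

Under these assumed numbers the expected daily profit before seasonality is 64, 160, and 112 for the three prices, so the middle price is the best choice on every day. A deep Q-learning agent would replace the table `q` with a neural network over richer state features, such as the promo actions and previous demand values mentioned in the abstract.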
