Paper Title

Streaming Active Deep Forest for Evolving Data Stream Classification

Authors

Luong, Anh Vu; Nguyen, Tien Thanh; Liew, Alan Wee-Chung

Abstract

In recent years, Deep Neural Networks (DNNs) have gained momentum in many areas of machine learning. The layer-by-layer processing of DNNs has inspired the development of many deep models, including deep ensembles. The most notable deep ensemble-based model is Deep Forest, which can achieve highly competitive performance while having far fewer hyper-parameters than DNNs. Despite its success in the batch learning setting, no effort has been made to adapt Deep Forest to the context of evolving data streams. In this work, we introduce the Streaming Deep Forest (SDF) algorithm, a high-performance deep ensemble method specially adapted to stream classification. We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context. We compare the proposed methods to state-of-the-art streaming algorithms on a wide range of datasets. The results show that, by following the AVU active learning strategy, SDF with only 70% of the labeling budget significantly outperforms other methods trained with all instances.
