Paper Title

S3-Net: A Fast and Lightweight Video Scene Understanding Network by Single-shot Segmentation

Authors

Cheng, Yuan, Yang, Yuchao, Chen, Hai-Bao, Wong, Ngai, Yu, Hao

Abstract

Real-time understanding of video is crucial in various AI applications such as autonomous driving. This work presents a fast single-shot segmentation strategy for video scene understanding. The proposed net, called S3-Net, quickly locates and segments target sub-scenes while extracting structured time-series semantic features as inputs to an LSTM-based spatio-temporal model. Utilizing tensorization and quantization techniques, S3-Net is designed to be lightweight for edge computing. Experiments on the CityScapes, UCF11, HMDB51, and MOMENTS datasets demonstrate that S3-Net achieves an 8.1% accuracy improvement over the 3D-CNN-based approach on UCF11, a 6.9x storage reduction, and an inference speed of 22.8 FPS on CityScapes with a GTX1080Ti GPU.
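The temporal stage the abstract describes, per-frame semantic features fed into an LSTM-based spatio-temporal model, can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation: the feature and hidden dimensions, the random weights, and the single-cell LSTM are all assumptions for demonstration.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gates stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    H = h.size
    i = 1.0 / (1.0 + np.exp(-z[:H]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[H:2 * H]))   # forget gate
    g = np.tanh(z[2 * H:3 * H])             # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3 * H:]))    # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Illustrative sizes: 16-dim per-frame semantic feature vector
# (e.g. pooled from the single-shot segmentation branch), 8-dim state.
rng = np.random.default_rng(0)
feat_dim, hidden = 16, 8
W = rng.normal(0, 0.1, (4 * hidden, feat_dim))
U = rng.normal(0, 0.1, (4 * hidden, hidden))
b = np.zeros(4 * hidden)

h = np.zeros(hidden)
c = np.zeros(hidden)
for t in range(5):                          # 5 frames of features
    x = rng.normal(size=feat_dim)           # stand-in for real features
    h, c = lstm_step(x, h, c, W, U, b)

print(h.shape)  # prints (8,)
```

The final hidden state `h` summarizes the scene over the frame sequence and would feed a classification head in a full model; the paper's tensorization and quantization steps would further compress the weight matrices `W` and `U` for edge deployment.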
