Paper Title

Protect, Show, Attend and Tell: Empowering Image Captioning Models with Ownership Protection

Authors

Jian Han Lim, Chee Seng Chan, Kam Woh Ng, Lixin Fan, Qiang Yang

Abstract

Existing intellectual property (IP) protection for deep neural networks typically (i) focuses only on image classification and (ii) follows a standard digital watermarking framework that was conventionally used to protect the ownership of multimedia and video content. This paper demonstrates that the current digital watermarking framework is insufficient to protect image captioning, a task often regarded as one of the frontier AI problems. As a remedy, this paper studies and proposes two different embedding schemes in the hidden memory state of a recurrent neural network to protect the image captioning model. Empirically, we show that a forged key yields an unusable image captioning model, defeating the purpose of infringement. To the best of our knowledge, this work is the first to propose ownership protection for the image captioning task. Extensive experiments also show that the proposed method does not compromise the original image captioning performance on any of the common captioning metrics on the Flickr30k and MS-COCO datasets, while withstanding both removal and ambiguity attacks. Code is available at https://github.com/jianhanlim/ipr-imagecaptioning
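The abstract does not spell out the two embedding schemes, so the following is only a toy illustration of the general idea of tying a recurrent hidden state to a secret key. It assumes an element-wise product between the hidden state and a key of +1/-1 values; the cell, the weights, and the names `rnn_step`, `run`, `owner_key`, and `forged_key` are all hypothetical and are not taken from the authors' repository.

```python
import math

def rnn_step(x, h, w_x, w_h, key):
    """One step of a toy vanilla RNN whose hidden state is tied to `key`.

    Assumption for illustration only: the key is embedded by an
    element-wise product with the new hidden state.
    """
    new_h = []
    for i in range(len(h)):
        pre = sum(w_x[i][j] * x[j] for j in range(len(x)))
        pre += sum(w_h[i][j] * h[j] for j in range(len(h)))
        new_h.append(math.tanh(pre) * key[i])
    return new_h

def run(inputs, w_x, w_h, key):
    """Run the toy RNN over a sequence and return the final hidden state."""
    h = [0.0] * len(key)
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, key)
    return h

# Small fixed weights so the example is deterministic (hypothetical values).
HIDDEN, INPUT = 4, 2
w_x = [[0.1 * (i + j + 1) for j in range(INPUT)] for i in range(HIDDEN)]
w_h = [[0.05 * (i - j) for j in range(HIDDEN)] for i in range(HIDDEN)]
inputs = [[1.0, 0.5], [0.2, -0.3]]

owner_key = [1.0, -1.0, 1.0, -1.0]    # key assumed to be used during training
forged_key = [1.0, 1.0, -1.0, -1.0]   # an attacker's guess

h_owner = run(inputs, w_x, w_h, owner_key)
h_forged = run(inputs, w_x, w_h, forged_key)

print("owner :", h_owner)
print("forged:", h_forged)
```

In this sketch the forged key flips the sign of some hidden units, so every later step sees a different state than the one the weights were fitted to. In a real captioning model trained with the owner's key, that kind of mismatch is what would be expected to degrade the generated captions, which is the behavior the abstract reports for forged keys.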
