Paper Title
Leaky Buddies: Cross-Component Covert Channels on Integrated CPU-GPU Systems
Paper Authors
Abstract
Graphics Processing Units (GPUs) are a ubiquitous component across the range of today's computing platforms, from phones and tablets, through personal computers, to high-end server-class platforms. With the increasing importance of graphics and video workloads, recent processors ship with GPU devices integrated on the same chip. Integrated GPUs share some resources with the CPU and, as a result, there is a potential for microarchitectural attacks from the GPU to the CPU or vice versa. We believe this type of attack, which crosses the component boundary (GPU to CPU or vice versa), is novel: it introduces unique challenges, but also provides the attacker with new capabilities that must be considered when designing defenses against microarchitectural attacks in these environments. Specifically, we consider the potential for covert channel attacks that arise either from shared microarchitectural components (such as caches) or through shared contention domains (e.g., shared buses). We illustrate these two types of channels by developing two reliable covert channel attacks. The first covert channel uses the shared last-level cache (LLC) in Intel's integrated GPU architectures. The second is a contention-based channel targeting the ring bus that connects the CPU and GPU to the LLC. Cross-component channels introduce a number of new challenges that we had to overcome, since they occur across heterogeneous components that use different computation models and are interconnected using asymmetric memory hierarchies. We also exploit GPU parallelism to increase the bandwidth of the communication, even without relying on a common clock. The LLC-based channel achieves a bandwidth of 120 kbps with a low error rate of 2%, while the contention-based channel delivers up to 400 kbps with a 0.8% error rate.