Paper Title
BigBird: Big Data Storage and Analytics at Scale in Hybrid Cloud
Paper Authors
Abstract
Implementing big data storage at scale is a complex and arduous task that requires advanced infrastructure. With the rise of public cloud computing, a variety of big data management services can be readily leveraged. As a critical part of Twitter's "Project Partly Cloudy", the cold storage data and analytics systems are being moved to the public cloud. This paper presents our approach to designing a scalable big data storage and analytics management framework using BigQuery on Google Cloud Platform while ensuring security, privacy, and data protection. The paper also discusses the limitations of public cloud resources and how they can be effectively overcome when designing a big data storage and analytics solution at scale. Although the paper discusses the framework's implementation on Google Cloud Platform, the framework can easily be applied to all major cloud providers.