Big Data Architecture Design and Implementation

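Big Data Architecture Design and Implementation (task statement, proposal report, foreign-literature translation, thesis of 11,000 characters)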
Abstract
With the rapid development of computer technology and the Internet and the spread of smart terminals, information is generated and transmitted ever faster, data of every kind is growing explosively, and massive data is gradually becoming part of daily life. Faced with such large and varied volumes of data, people are no longer concerned only with acquiring data; they have begun to extract valuable information from it through data mining. However, traditional data storage and processing methods can no longer keep pace with data growth. Moreover, this massive data has no uniform structure and is stored in many different formats, which makes it cumbersome to process.
This thesis first studies big data. Then, based on the characteristics of big data and practical requirements, it analyzes a range of big data storage and analysis technologies and selects Hadoop, an open-source distributed computing platform for big data storage and processing, for in-depth study. Finally, it designs and deploys a big data architecture based on Hadoop and configures a batch data processing platform.
Key Words: mass data; Hadoop; HDFS; Map/Reduce; control node


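Contents
Chapter 1 Introduction
1.1 Background and Significance of the Topic
1.2 Research Status in China and Abroad
1.2.1 Research Status Abroad
1.2.2 Research Status in China
1.3 Main Research Content and Technical Approach
Chapter 2 Big Data and Related Technologies
2.1 Big Data
2.1.1 Definition of Big Data
2.1.2 Big Data Analysis
2.2 NoSQL Databases
2.3 Hadoop
Chapter 3 Hadoop-Based Big Data Architecture Design
3.1 Hadoop Architecture
3.2 Hadoop Core Design
3.2.1 HDFS
3.2.2 MapReduce
3.2.3 HBase
Chapter 4 Hadoop-Based Big Data Architecture Deployment
4.1 Environment Installation and Configuration
4.1.1 JDK
4.1.2 Cygwin
4.1.3 Hadoop
4.2 Trial Run
4.3 Application Example
Chapter 5 Conclusion
5.1 Summary
5.2 Outlook
References
Acknowledgements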