
The Best Practices of Enterprise-level Data Center Construction

2017-07-28 16:27:09 | Source: 中培企業IT培訓網

At present, most data centers used by engineering enterprises are built with traditional technology, which has several disadvantages: high construction cost, weak scalability, and limited capacity for computation and analysis. To meet the big data requirements for data storage, processing, analysis and application, enterprise-level data centers combine technologies such as parallel computing, large-scale data analysis, linear expansion and support for all data types. They can thereby centralize the integration and analysis of data resources across all businesses, levels and types.

At present, the data centers built by most enterprises in the engineering industry have accumulated large amounts of structured data, unstructured data, geographic information data and massive real-time data. Most of them use centralized server architectures (such as Oracle RAC), which scale poorly and cannot keep up with growing storage needs. Data processing is mainly based on single-point models and lacks real-time parallel computing, so massive data cannot be processed in real time. In addition, storage and processing can only cope with structured data: such data centers cannot effectively store, process or analyze unstructured data, cannot provide storage and processing services for all types of data in a big data environment, and cannot support in-depth data analysis.

The overall structure of a big data-based enterprise-level data center in the engineering industry is shown in Figure 1. It is divided into seven layers: the data source layer, data integration layer, data storage layer, analysis/service layer, business application layer, front-end access layer and the overall data management platform.

 

Figure 1. The Overall Structure of Enterprise-level Data Centers in Engineering Industry Based on Big Data

Using interface tables, interface files, data reception services and data information reception, the data center acquires structured, unstructured and real-time data and meets different data timeliness requirements. In the data storage layer, the data center contains data repository platforms, distributed data platforms and streaming data platforms, which store data with different characteristics and provide the related data services. The data center delivers integrated result data through batch push and real-time data services, and meets data sharing and data application requirements through asynchronous data push. It also provides comprehensive information display and analysis and decision-making functions, and presents all kinds of analysis results in an integrated way across front ends (such as PC, large-screen and mobile terminals). Meanwhile, the data center provides data resource management, which covers metadata, data quality, data standards, data models and data resources.

Data Integration Layer:

[including data acquisition and job scheduling]

Data acquisition

Data acquisition refers to collecting structured data, unstructured data and real-time data from the source systems and delivering it to the data center. It covers interface table processing, message reception processing, data reception processing, real-time data acquisition processing and unstructured file processing.

Job scheduling

Job scheduling provides unified, centralized scheduling of three kinds of jobs: the acquisition of structured, unstructured and real-time data; data operations inside the data center (including ETL, MapReduce, Sqoop, etc.); and the jobs that push data to each target system. It implements a scheduling engine, supports both automatic and manual modes for running jobs, and controls job execution order based on the configured job dependencies. It also controls job concurrency and records each job's results and logs.
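The dependency-ordered, concurrency-limited scheduling described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the scheduling engine of any particular product: the function name `run_jobs`, the job names and the log record layout are all assumptions made for the example, and it relies on the standard-library `graphlib` module (Python 3.9+).

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_jobs(jobs, dependencies, max_concurrency=2):
    """Run jobs in dependency order, at most `max_concurrency` at a time.

    jobs: dict mapping a job name to a zero-argument callable (hypothetical).
    dependencies: dict mapping a job name to the set of jobs it waits for.
    Returns (job name, status, detail) records in completion order.
    """
    sorter = TopologicalSorter(
        {name: dependencies.get(name, set()) for name in jobs}
    )
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    log = []
    running = {}  # future -> job name
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        while sorter.is_active():
            # Submit every job whose upstream jobs have all succeeded.
            for name in sorter.get_ready():
                running[pool.submit(jobs[name])] = name
            if not running:
                break  # the remaining jobs are blocked by a failed job
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                error = future.exception()
                if error is None:
                    log.append((name, "ok", future.result()))
                    sorter.done(name)  # unblocks dependent jobs
                else:
                    # Not marking the job done keeps its dependents blocked.
                    log.append((name, "failed", repr(error)))
    return log


if __name__ == "__main__":
    example_jobs = {
        "extract": lambda: "rows read",
        "transform": lambda: "rows cleaned",
        "push": lambda: "rows pushed",
    }
    example_deps = {"transform": {"extract"}, "push": {"transform"}}
    for record in run_jobs(example_jobs, example_deps):
        print(record)
```

In this sketch a failed job simply blocks everything downstream of it; a real scheduler would likely add retries and manual re-runs, which the text mentions but does not specify.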

Data Storage Layer:

[It contains traditional data repository platforms based on relational databases, as well as distributed data platforms and streaming data platforms based on the Hadoop ecosystem, which store different kinds of data and provide different data services.]

The data repository platform

The data repository platform uses a hierarchical design, divided into a buffer layer, an integration layer, a summary layer and a data mart layer.

The buffer layer stores the data that the data center collects from source systems. It relieves source systems of the pressure of distributing data in bulk and in real time, and it avoids the performance pressure, timing differences between versions, repeated development and redundant storage caused by fetching the same data repeatedly. As a data source in its own right, it also shields the integration layer and summary layer from changes in the original systems (such as data structure or time window).

The integration layer holds business data after cleaning, conversion and integration; it is the core data layer of the data center.

The summary layer holds statistical and aggregated enterprise data organized by subject dimension. Aggregates can also be built to meet the requirements of subject reports. Aggregated data is stored by subject and is calculated from business data along the dimensions of data, subject and processing type.

The data mart layer holds analytical data sets for specific business units (such as business departments). Its data is mainly derived from the integration layer and summary layer, and it also contains analytical data supporting specific targets.
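To make the flow between these layers concrete, the following Python sketch passes a few rows from a buffer through cleaning, aggregation by one dimension, and a per-unit subset. It is a toy illustration under stated assumptions: the field names (`project`, `cost_type`, `amount`), the sample values and the function names are invented for the example, and a real repository would implement these steps in a relational database rather than in memory.

```python
from collections import defaultdict

# Buffer layer: rows exactly as received from source systems (sample data).
buffer_rows = [
    {"project": " P-001 ", "cost_type": "material", "amount": "1200.50"},
    {"project": "P-001", "cost_type": "labor", "amount": "800"},
    {"project": "P-002", "cost_type": "material", "amount": "n/a"},
    {"project": "P-002", "cost_type": "material", "amount": "300"},
]


def integrate(rows):
    """Integration layer: clean and convert rows, dropping invalid amounts."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real system would route this row to a reject table
        cleaned.append(
            {
                "project": row["project"].strip(),
                "cost_type": row["cost_type"],
                "amount": amount,
            }
        )
    return cleaned


def summarize(rows, dimension):
    """Summary layer: total the amount for each value of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["amount"]
    return dict(totals)


def build_mart(rows, project):
    """Data mart layer: the subset of integrated data one unit analyzes."""
    return [row for row in rows if row["project"] == project]


if __name__ == "__main__":
    integrated = integrate(buffer_rows)
    print(summarize(integrated, "project"))
    print(summarize(integrated, "cost_type"))
    print(build_mart(integrated, "P-002"))
```

Because the summary and mart functions read only from the integrated rows, a change in the raw source format affects `integrate` alone, which mirrors the isolation role the text gives to the buffer layer.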

The distributed data platform

The distributed data platform mainly stores data that is difficult to store in a traditional relational database: massive structured data, unstructured file data, and dumps of streaming data and relational database data. Based on the storage requirements and the characteristics of the distributed platform's technology components, the platform is divided into an HBase-based data storage area and a Hive-based data storage area.

The unstructured data layer stores the unstructured data from all source systems, including office documents, design drawings, text files, image files, etc.

The massive structured data layer stores the massive structured data from all structured systems.

The streaming data dump layer stores the periodic dumps from the streaming data platform, helping the streaming data platform achieve persistent storage of real-time data.

The streaming data platform

The streaming data platform includes a real-time data integration layer, a real-time data summary layer and a business data buffer layer.

In the real-time data integration layer, source systems all connect through socket communication, which avoids inconsistency among data sources. The data center listens on the sockets of the source systems. When a source system has data, the listener obtains it and writes the data, together with its source information, to the corresponding message queue.

The real-time data summary layer uses Storm to process the source data from the integration layer's message queues as a stream. It aggregates, calculates and stores data according to business requirements.

The business data buffer layer holds the result data that Storm's stream computation produces according to the specific business logic. (The corresponding architecture is shown in Figure 2).
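The hand-off between the three streaming layers can be sketched with Python's standard library. This is a deliberately simplified stand-in, not Storm and not a socket listener: a plain function call replaces the socket monitor, `queue.Queue` replaces the message queue, a dictionary replaces the business data buffer, and all names and the per-source average are assumptions chosen for illustration.

```python
import queue
from collections import defaultdict

message_queue = queue.Queue()  # stands in for the integration-layer queue
business_buffer = {}           # stands in for the business data buffer layer


def on_source_data(source, value):
    """Integration layer: tag incoming data with its source and enqueue it."""
    message_queue.put({"source": source, "value": value})


def summarize_pending():
    """Summary layer: drain the queue and aggregate readings per source."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    while True:
        try:
            message = message_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to process in this pass
        stats = totals[message["source"]]
        stats["count"] += 1
        stats["sum"] += message["value"]
    # Publish the computed results to the business data buffer.
    for source, stats in totals.items():
        business_buffer[source] = {
            "count": stats["count"],
            "average": stats["sum"] / stats["count"],
        }
    return business_buffer


if __name__ == "__main__":
    on_source_data("sensor-a", 10.0)
    on_source_data("sensor-a", 20.0)
    on_source_data("sensor-b", 5.0)
    print(summarize_pending())
```

The sketch processes one batch per call, whereas a streaming engine such as Storm processes messages continuously; the point here is only the separation between tagging and enqueuing, aggregating, and publishing results.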

 

Analysis/Service Layer:

It includes a comprehensive information display platform, an intelligent analysis and decision-making platform, and data services (shown in Figure 3).

Comprehensive information display platform

The comprehensive information display platform, built on the data storage layer, is an application for report query and comprehensive analysis. It supports dynamic configuration of page content, layout, components, CCTV, linkage relations, etc.

Intelligent analysis and decision-making platform

The intelligent analysis and decision-making platform includes several modules, such as data loading, data preprocessing, data mining algorithms, analysis model management and model operation scheduling. It provides technical support for data understanding, data preprocessing, algorithm modeling, model evaluation, model application, etc. To meet the requirements of big data analysis, it also provides a library of mining algorithms adapted to big data. The library includes three types of mining algorithms: descriptive mining algorithms, such as clustering analysis and correlation analysis; predictive mining algorithms, such as classification analysis, evolution analysis and heterogeneous analysis; and mining algorithms for specialized data, such as text analysis, speech analysis, image analysis and video analysis.

Data services

Data services mainly provide real-time data services, subscription and publication, batch data services, etc. They also provide a cache to improve the overall performance of the system.
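The text does not say how the cache is implemented. One common approach is a time-to-live (TTL) cache in front of the storage layer, sketched below in Python; the class name `CachedDataService`, the `loader` callback and the 60-second default are assumptions made for the example, not details from the original design.

```python
import time


class CachedDataService:
    """Serve values from a cache, reloading an entry once it has expired."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader    # hypothetical function that queries storage
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._cache = {}         # key -> (expiry time, value)
        self.loader_calls = 0    # counts how often storage was actually hit

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit: storage is not touched
        value = self._loader(key)
        self.loader_calls += 1
        self._cache[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    service = CachedDataService(lambda key: "report for " + key)
    service.get("project-1")
    service.get("project-1")     # served from the cache
    print(service.loader_calls)
```

A TTL trades freshness for speed, so it suits batch results better than real-time data services, where a shorter TTL or explicit invalidation on new data would be more appropriate.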

 

Data Management Layer:

It includes metadata management, data quality management, master data management, data standard management, and centralized job scheduling and monitoring (shown in Figure 4).

Metadata management

It enables rapid search, acquisition, use and sharing of metadata in the data center. It also provides metadata support for the data center's data sharing and exchange, multidimensional analysis, decision support, data mining, etc.

Data quality management

It performs standardized quality audits of data in the data center and ensures the timeliness, completeness and compliance of data received from business systems.
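The three audit criteria named above (timeliness, completeness and compliance) can be expressed as simple per-record checks. The following Python sketch is illustrative only: the required fields, the one-hour delay limit and the rule that amounts must be non-negative are hypothetical rules chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical required fields for a record received from a business system.
REQUIRED_FIELDS = ("contract_id", "amount", "received_at")


def audit_record(record, now, max_delay=timedelta(hours=1)):
    """Return a list of quality issues; an empty list means the record passes."""
    issues = []

    # Completeness: every required field must be present and not None.
    missing = [name for name in REQUIRED_FIELDS if record.get(name) is None]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Compliance: the amount must be a non-negative number (assumed rule).
    amount = record.get("amount")
    if amount is not None:
        if not isinstance(amount, (int, float)) or amount < 0:
            issues.append("non-compliant: amount must be a non-negative number")

    # Timeliness: the record must have arrived within the allowed delay.
    received_at = record.get("received_at")
    if received_at is not None and now - received_at > max_delay:
        issues.append("late: received more than " + str(max_delay) + " ago")

    return issues


if __name__ == "__main__":
    now = datetime(2017, 7, 28, 12, 0)
    record = {"contract_id": None, "amount": -1, "received_at": now - timedelta(hours=2)}
    for issue in audit_record(record, now):
        print(issue)
```

Returning every issue rather than stopping at the first one lets a quality report show all problems with a record in a single pass.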

Master data management

It provides unified management, application and maintenance of master data such as materials, projects and contracts, ensuring that changes to master data remain consistent and stable.

Data standard management

It provides unified management of standard documents in the data center.

Centralized job scheduling and monitoring

It provides unified scheduling, management and monitoring of ETL interface jobs and big data jobs.

 

As informatization in the engineering industry has advanced, information systems have become fully integrated into all aspects of enterprise production and management, and have accumulated large amounts of structured data, unstructured data, geographic information data and massive real-time data. Big data-based enterprise-level data centers can make up for the disadvantages of traditional technology, solve the problems of weak scalability, high construction cost and limited capacity for computation, analysis and mining, and meet the requirements for storing, processing, analyzing and applying all types of data in a big data environment.

