Tencent Technology Class | Building a Log Analysis Platform Based on Elastic Stack

To help readers better understand how to build a log analysis platform based on Elastic Stack, the Tencent Technology Engineering official account invited Chen Xi, an engineer in Tencent's Infrastructure Department, to record a narrated slide presentation (voice plus PPT) in the "Tencent Technology Class" app.

The following is the text of the course.

With the rapid development of the Internet and the Internet of Things, software and hardware system architectures are becoming more and more complex, and it is becoming harder to analyze the logs generated by all these systems. I believe most readers have run into the following problems during log analysis.

1. Locating a problem takes a lot of time. The modules of a system are usually scattered across separate machines, so when locating a problem, operations engineers can only log in to the machines one by one to check the logs. For distributed systems in particular, the logs have to be checked module by module, which is cumbersome and wastes a lot of time.

2. After the problem log is located, it is hard to filter out the useful information. Logs are stored as plain text, and systems in production environments in particular print a lot of redundant logs. When looking for the log lines related to a problem around an error message, the only option is to filter with simple commands such as less and grep, which is inefficient. The filtered result set may also not be what was expected, which makes analysis harder.

3. The log volume is large and logs are easily deleted. As mentioned above, systems in production environments tend to print a relatively large number of redundant logs so that more information is available when a failure occurs. These logs take up too much local storage and encroach on the resources available to the services themselves. And when a machine fails and cannot be logged in to, there is no way to view its logs at all.

Elastic Stack solves all the above problems! This session will focus on how to solve various problems encountered with log analysis through Elastic Stack.

Elastic Stack Architecture

Elastic Stack consists of four components: Beats, Logstash, Elasticsearch, and Kibana. Together they collect, clean, store, and visualize logs, and form a fairly complete architecture.

1. Beats: a set of lightweight data collectors. They are installed on every machine in the production environment; after collecting the local logs, they send the data to Logstash or Elasticsearch.

2. Logstash: a dynamic data collection pipeline, responsible for cleaning and formatting data, adding supplementary information, and so on. Logstash is feature-rich: it can collect data from many kinds of systems and can also push data to many kinds of systems.

3. Elasticsearch: a distributed search engine where data storage and queries are concentrated. It has fairly powerful cluster management features of its own and supports horizontal scaling. Data is stored in multiple shards with multiple replicas, which both provides highly concurrent write and query capability and ensures data reliability.

4. Kibana: a data visualization platform. It supports a rich variety of charts for presenting log data visually, and it also provides an easy-to-use search interface that simplifies locating problems.

Looking at the architecture of the Elastic Stack, it is clear that it has the following capabilities.

1. A complete solution. Elastic Stack provides a full tool chain covering log collection, cleaning, storage, and visualization. There are no external dependencies, so the architecture of the whole log analysis system is relatively simple. It is also fully featured and covers most requirements in the field of log analysis.

2. Powerful analytical capabilities. Elastic Stack can format plain text into structured JSON for storage, and it provides rich query and aggregation interfaces that make statistical analysis easy. It also supports full-text indexing, fuzzy matching, and Chinese word segmentation, which are very useful for locating problems.

3. Reliable distributed storage. The storage core, Elasticsearch, is a distributed storage system that scales elastically (horizontal scaling). Multi-replica storage ensures data reliability, and its own fairly complete cluster management makes it easy to use.

Use Elastic Stack for log analysis

The previous section covered the Elastic Stack architecture and its basic capabilities. Next, we focus on how to use Elastic Stack for log analysis. Three components are used:

1. Elasticsearch, mainly responsible for log cleaning, storage, and querying.

2. Filebeat, the log collection tool. It is a member of the Beats family and, as the name implies, is used to collect the contents of text files.

3. Kibana, which mainly provides data visualization.

Elastic Stack has four components in its architecture, but this course leaves out Logstash. Although Logstash is powerful, it is also relatively resource intensive and can easily become the performance bottleneck of the whole system. Its role of formatting log data is taken over by Elasticsearch's data pre-processing module. This streamlines the architecture and improves overall performance.

The diagram in the middle is the architecture used in this course and also shows the flow of log data from left to right: through Beats (here, Filebeat), Elasticsearch, and Kibana. To make things easier to follow, however, the later sections introduce Elasticsearch first, since it is the core of the whole architecture, then Filebeat, the log collector, and finally Kibana, the visualization tool.

After this course, you will know how to turn plain-text log files into structured data, and how to quickly search for useful information in Kibana or turn log data into visual charts.

Introduction to Elasticsearch

The first part introduces Elasticsearch, referred to below as ES. It sits at the center of the Elastic Stack and is the core of data storage, so it is the most important part.

Here we look at the overall architecture of ES and the Ingest Pipeline responsible for log cleaning.

The ES data model is divided into several layers, Index, Type, and Document. This diagram draws an analogy between these tiers and related concepts in traditional relational databases to make it easier to understand.

Index: similar to Database or Table of relational database, used to do the top-level classification of data. We can take the logs generated by different modules and store them in different Indexes.

Type: Similar to the concept of Table in relational databases, here there is some duplication with the concept of Index above. Therefore, in ES versions from 6.0 onwards, only one Type can exist for an Index, and support for Type will be removed in versions from 7.0 onwards. When using ES, don't specify more than one Type, and just divide the data classification entirely by Index.

Document: Similar to the concept of Row in a relational database, that is, one row of data. Each Document in ES is a JSON structure that can have multiple fields holding different information.

There is also a separate concept of Mapping, analogous to a schema in relational databases, which specifies the data type of each field. Normally, ES is schema-free. A Mapping only needs to be specified for special field types, such as Date (time) fields that ES does not recognize by default, or String fields that should not be tokenized. This course does not cover Mapping and simply uses the Mapping that ES generates by default.

The Index described above is a logical concept: it classifies the data stored in ES by namespace (similar to splitting data into tables), with data of the same class placed in the same Index.

In actual storage, the data of an Index is distributed across multiple shards, each holding a portion of the Index's data. Each shard in turn can have multiple copies: one is the primary copy, called the primary, and the others are secondary copies, called replicas.

There are two Indices in the figure, Index1 and Index2. Index1 has two primary shards, P1 and P2, and each primary shard has two replicas; for example, P1 has two R1 replicas.

At this point you understand the relationship between Index and Shard, and the concepts of Primary, Replica.

ES itself is a multi-node distributed cluster, and shards are spread across its nodes. Different nodes can run on different machines.

By default, ES does not place the primary and a replica of the same shard on the same Node, so even if a Node fails and leaves the cluster, no data is lost because another copy remains in the cluster.

The figure shows P1 and two R1s distributed over three Nodes (Node1, Node2, Node3), all carrying exactly the same data. Even when Node1 and Node2 are down at the same time, there is still an R1 in the ES cluster, so read and write service is not affected, let alone any data lost.
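The placement rule described above can be illustrated with a small simulation. This is only a minimal sketch, not how ES allocates shards internally: the `allocate` and `surviving_shards` helpers and the round-robin strategy are assumptions made for illustration. Only the rule that copies of the same shard land on different nodes comes from the text.

```python
from itertools import cycle


def allocate(shards, copies_per_shard, nodes):
    """Place each shard's copies on distinct nodes (hypothetical round-robin)."""
    if copies_per_shard > len(nodes):
        raise ValueError("need at least as many nodes as copies per shard")
    placement = {node: [] for node in nodes}
    node_ring = cycle(nodes)
    for shard in shards:
        # Consecutive picks from the ring are distinct nodes because
        # copies_per_shard <= len(nodes).
        for copy_no in range(copies_per_shard):
            node = next(node_ring)
            role = "P" if copy_no == 0 else "R"
            placement[node].append(f"{role}{shard}")
    return placement


def surviving_shards(placement, failed_nodes):
    """Return the shard numbers that still have at least one copy."""
    alive = set()
    for node, copies in placement.items():
        if node not in failed_nodes:
            alive.update(copy[1:] for copy in copies)
    return alive


nodes = ["Node1", "Node2", "Node3"]
# One primary plus two replicas per shard, as in the figure.
placement = allocate(shards=[1, 2], copies_per_shard=3, nodes=nodes)
remaining = surviving_shards(placement, failed_nodes={"Node1", "Node2"})
print(placement)
print(sorted(remaining))
```

With three copies per shard on three nodes, every shard still has a copy after two nodes fail, which matches the scenario described for the figure.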

The distributed architecture of ES brings several benefits:

High performance

o Distributed search engine; scales linearly to increase system performance

o Multiple shards share the load, supporting highly concurrent writes and queries

High reliability

o Multi-replica storage; no data loss when a node fails

o Rack and data-center awareness: copies of the same shard can be assigned to different data centers, which avoids data loss when a data center fails and achieves primary/standby disaster recovery

Easy management

o Built-in, powerful cluster management with elastic scaling: shards are rebalanced automatically, and nodes can be added or removed at any time as business needs change


o RESTful interface; easy to develop against and debug

o Powerful aggregation and analysis capabilities for complex statistical analysis (bucketing, geohash, multi-level aggregation, etc.)

By default, after downloading the ES installation package, go to the ES directory and run the startup command directly. ES will start in the background and listen on port 9200 of the machine. Data can then be read and written through this port, with no additional ES configuration required.

The second part of this ES section is devoted to a special ES module, Ingest Pipeline, which pre-processes data as it is written.

Before data is actually written to an ES Index, it can be modified by an Ingest Pipeline. A pipeline can define a number of processors, and different processors do different things to the incoming data. For example, the Grok Processor can format logs in place of Logstash.

The diagram at the bottom shows how the Grok Processor works. For text written to ES (Simple Data), an expression (Grok Pattern) can be defined to specify how to parse the input text. That is, the input text is split into parts according to certain rules, and each part becomes a separate field. After parsing, the structured data at the bottom is formed.

A Grok Pattern is really just a regular expression; ES defines aliases for common regular expressions to make them easier to use. Text parsing here is simply regular-expression matching: the matched parts are placed into separate fields to form structured data.

The data in the figure is a sample log, divided into three parts, which are parsed into three fields by regular-expression matching.
- The red part is parsed as the time field, representing the time when this log was generated.
- The yellow part is parsed as the client field, representing the IP address of the client accessing the service.
- The blue part is parsed as the duration field, representing the elapsed time of this request.
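The parsing step can be imitated with an ordinary regular expression. The sketch below is not Grok itself: it uses Python's `re` module with named groups in place of Grok aliases, and the sample line and its exact layout are hypothetical, since the log in the figure is not reproduced in the text.

```python
import re

# Hypothetical sample line with a time, a client IP and a duration.
SAMPLE = "2018-04-10T12:00:01 192.168.0.1 0.043"

# Named groups play the role of Grok's %{PATTERN:field} aliases.
LOG_PATTERN = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) "
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<duration>\d+(?:\.\d+)?)"
)


def parse_line(line):
    """Return a dict of fields, or None when the line does not match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    return match.groupdict() if match else None


print(parse_line(SAMPLE))
```

Each named group becomes one field of the resulting structure, which is the same idea as a Grok Pattern turning a text line into a structured document.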

So how should a Pipeline be defined?

The figure shows the pipeline used in this course. According to the ES API, a Pipeline can be defined with an HTTP PUT request.

The _ingest/pipeline part of the URL is the fixed ES API path for defining pipelines, and the trailing apache_log is the name of the pipeline we define (pipeline_name), which is used later when writing data. Apache logs will be collected with Filebeat later, so the pipeline here is named apache_log.

The HTTP body defines processors in JSON format and contains three processors: grok, date, and date_index_name.
- Grok processor: covered above. The pattern used here is an expression predefined in ES for parsing Apache logs, so it can be used directly.
- Date processor: converts the timestamp field generated by grok to the Date type. Fields generated by grok are strings by default, so a conversion is needed here.
- date_index_name processor: Logs are usually time-dependent. To make management easier, for example deleting expired logs, indices are usually divided by time range: for instance, one index is created per day, all of that day's data is written to it, and the next day's logs are written to a new index. The date_index_name processor generates the index name automatically from a time field (the timestamp field generated above), so no index name has to be specified when writing data. The index name defined in the figure is the prefix apache_log@ plus the date of the day; the resulting index name is shown at the bottom of the figure.
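As a rough illustration, a pipeline of this shape could be written as follows. This is a sketch under stated assumptions, not the exact body shown in the figure: the `message` source field, the `COMBINEDAPACHELOG` pattern, the date format string, and the daily `date_rounding` value are assumptions, and `expected_index_name` is a hypothetical helper that only imitates a prefix plus `yyyy-MM-dd` naming scheme.

```python
import json
from datetime import datetime

PIPELINE_NAME = "apache_log"
INDEX_PREFIX = "apache_log@"

pipeline_body = {
    "description": "Parse Apache access logs (illustrative sketch)",
    "processors": [
        # Split the raw line into fields using a predefined Apache pattern.
        {"grok": {"field": "message", "patterns": ["%{COMBINEDAPACHELOG}"]}},
        # Convert the string timestamp produced by grok into a date.
        {
            "date": {
                "field": "timestamp",
                "target_field": "timestamp",
                "formats": ["dd/MMM/yyyy:HH:mm:ss Z"],
            }
        },
        # Route each document to a per-day index derived from its timestamp.
        {
            "date_index_name": {
                "field": "timestamp",
                "index_name_prefix": INDEX_PREFIX,
                "date_rounding": "d",
            }
        },
    ],
}


def expected_index_name(moment):
    """Imitate the daily index naming: prefix plus the calendar date."""
    return INDEX_PREFIX + moment.strftime("%Y-%m-%d")


# The request that would be sent to ES; nothing is sent here.
print(f"PUT /_ingest/pipeline/{PIPELINE_NAME}")
print(json.dumps(pipeline_body, indent=2))
print(expected_index_name(datetime(2018, 4, 10, 12, 0, 1)))
```

Keeping the body as plain data makes it easy to review before sending it to the cluster with any HTTP client.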

This concludes the ES portion of the presentation.

Filebeat principle

The second part introduces Filebeat, which sits at the most upstream end of the Elastic Stack and is responsible for log collection. The focus is on the basics of Filebeat and how to configure and start it.

The basics of Filebeat

Filebeat is a member of the Beats family. Beats includes many tools; interested readers can learn more on the official website.

As shown in the first image, Filebeat is a text collector that watches text files, similar to the Linux tailf command. It continuously collects new content from the files and sends the collected data to Logstash or Elasticsearch.

It has to be installed in the production environment, on the machines that generate the log files, but it is written in Go and is efficient; it is also simple, relatively lightweight, and uses few machine resources.

It can resume from where it left off after a restart. Filebeat saves the meta-information of the files it is watching in a registry file. As shown in the second figure, the file path, the inode information, and the position the collector has read up to are recorded. After the Filebeat process exits, the registry file is read on the next start, and for each watched file, reading continues from the recorded offset.

Filebeat is also load-aware: it dynamically adjusts the rate at which it sends data according to the current load of the ES cluster, to avoid overloading the cluster.
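The resume behavior can be sketched in a few lines. This is a simplified illustration of the idea, not Filebeat's actual registry format or implementation: the `read_new_lines` helper, the JSON layout, and the file names are all hypothetical.

```python
import json
import os
import tempfile


def read_new_lines(log_path, registry_path):
    """Read lines appended since the offset stored in the registry file."""
    offset = 0
    if os.path.exists(registry_path):
        with open(registry_path, encoding="utf-8") as reg:
            offset = json.load(reg).get(log_path, {}).get("offset", 0)
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
        new_offset = log.tell()
    # Record path, inode and read position, as the registry file does.
    state = {log_path: {"inode": os.stat(log_path).st_ino, "offset": new_offset}}
    with open(registry_path, "w", encoding="utf-8") as reg:
        json.dump(state, reg)
    return data.decode("utf-8").splitlines()


workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "access.log")
registry_path = os.path.join(workdir, "registry.json")

with open(log_path, "w", encoding="utf-8") as f:
    f.write("line 1\nline 2\n")
first = read_new_lines(log_path, registry_path)

# Simulate a restart: new content arrives, then the reader starts again.
with open(log_path, "a", encoding="utf-8") as f:
    f.write("line 3\n")
second = read_new_lines(log_path, registry_path)

print(first, second)
```

Because the offset is persisted after each read, the second call returns only the line appended after the first call, even though no state is kept in memory between the two.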

Basic configuration of Filebeat

The Filebeat configuration file is in the Filebeat root directory, in YAML format. By default, only input (which files are collected) and output (where the data is sent) need to be modified.

input: specifies which files to collect.

o enabled: defaults to false; Filebeat collects the file contents only after it is changed to true.

o paths: the paths of the files to collect. It is a list, so multiple paths can be specified and Filebeat will watch the files under all of them at the same time. paths also supports wildcards, which makes it convenient to configure multiple files under one path.

output: The logs need to be sent to ES, so output.elasticsearch needs to be modified:

o hosts: the IP and port of each node of the ES cluster. It is a list, so the addresses of multiple nodes in the cluster can be configured.

o pipeline: the pipeline used when writing data. Fill in apache_log, the pipeline created in ES earlier.
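To make the settings above concrete, the following sketch renders a minimal configuration from Python. It rests on several assumptions: the top-level key is written as `filebeat.inputs` with `type: log` (the key name differs between Filebeat versions), the log path and host address are hypothetical, and `render_filebeat_config` is an illustrative helper rather than part of Filebeat.

```python
def render_filebeat_config(paths, hosts, pipeline):
    """Render a minimal filebeat.yml-style document as text."""
    lines = [
        "filebeat.inputs:",
        "- type: log",
        "  enabled: true",  # collection stays off unless this is true
        "  paths:",
    ]
    lines += [f"    - {path}" for path in paths]
    quoted_hosts = ", ".join(f'"{host}"' for host in hosts)
    lines += [
        "output.elasticsearch:",
        f"  hosts: [{quoted_hosts}]",
        f'  pipeline: "{pipeline}"',
    ]
    return "\n".join(lines) + "\n"


config = render_filebeat_config(
    paths=["/var/log/apache2/*.log"],  # hypothetical path with a wildcard
    hosts=["localhost:9200"],          # hypothetical single-node cluster
    pipeline="apache_log",
)
print(config)
```

The printed text is only a starting point; the key names should be checked against the documentation for the Filebeat version in use.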

Filebeat is now configured. To start it, just run the filebeat binary. To run it in the background, start it with the nohup command. If you are worried about Filebeat using too many machine resources, you can limit its resource usage by starting it with taskset to pin it to specific CPU cores.

When the amount of logs collected is relatively large, the default configuration may not be able to meet the demand, so some commonly used tuning parameters are given here for your reference. Of course the optimal configuration of parameters varies from scenario to scenario, and you need to make the appropriate trade-offs and adjustments based on your actual usage scenario. For these parameters, without going into too much detail here, you can refer to the official documentation for filebeat. You can share your questions in the comments, and I will also sort out the reasons and effects of adjusting these parameters later when I have time.

Introduction to Kibana

The last section introduces Kibana, which sits at the far downstream end of the Elastic Stack and mainly provides data analysis and visualization. The focus is on the basic features of Kibana and how to use it to query data and generate visual charts.

Kibana provides the ability to query, visualize, and statistically analyze data.

It is a separate process that has to be downloaded and deployed independently. It can bind to only one ES cluster and cannot query data from multiple ES clusters at the same time.

Search: Through the Discover query interface, target data can be found with simple query conditions, which makes locating problems more efficient.

Visualize: The Visualize feature is Kibana's biggest highlight. With the help of ES's aggregation query interface, it can generate the various charts shown in the figure.

Data management: The Console provides a command-line interface for adding, deleting, and querying data, and for managing the ES cluster through commands.

X-Pack Monitor: When used together with X-Pack, a value-added service provided by ES, Kibana can also provide monitoring, which is outside the scope of this course.

Starting Kibana is also very simple: by default, just run the kibana binary in the kibana/bin directory. It connects to port 9200 on the local machine by default, which is the default listening port of ES.

To query data through Kibana, follow these steps.

Step 1: First tell Kibana which Index we want to query. As shown in the first image, configure an index pattern on the Management page to match the target Index. Filebeat uploads data through the pipeline named apache_log, which generates indices whose names begin with apache_log@, so the index pattern should start with apache_log@ (for example apache_log@*) in order to match the indices created by the pipeline, as shown in the blue section of the figure.

Step 2: Select the time field for the data; here it is the timestamp field defined in the pipeline.

Step 3: Search for data in the Discover interface. Enter the query conditions in the text box marked in red in the figure. For example, to find Apache logs with a response code of 403, just type response:403. The matching Apache logs are shown below as structured data; you can see that fields such as the client IP, verb, and request have been parsed out.
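For comparison, a search like response:403 can also be expressed directly against the ES search API. The sketch below only builds the request and does not send it. The `build_search_request` helper and the `apache_log@*` index pattern are assumptions for illustration; `query_string` is the ES query type that accepts this kind of field:value syntax.

```python
import json


def build_search_request(index_pattern, query):
    """Return (method, path, body) for a query-string search."""
    path = f"/{index_pattern}/_search"
    body = {"query": {"query_string": {"query": query}}}
    return "GET", path, body


method, path, body = build_search_request("apache_log@*", "response:403")
print(method, path)
print(json.dumps(body, indent=2))
```

The printed request could be pasted into the Kibana Console or sent with any HTTP client to port 9200.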

The specific syntax of the query can be found in the official Kibana documentation.

The main Kibana interfaces related to data visualization are Visualize, Timelion, and Dashboard.

Because visualization charts are built on ES aggregation queries, some understanding of ES aggregation queries is required. The focus here is therefore on introducing the functionality; for how to create specific charts, first get familiar with ES aggregation queries and then follow Kibana's official documentation.

The Visualize interface is mainly used to create individual charts. The first chart, a pie chart, shows that requests with a response code of 200 account for 82.06% of all Apache requests. The second, an area chart, shows the data throughput of Apache at each moment.

The Timelion interface is mainly used to see the relationship between different metrics at the same point in time. The red vertical lines in the two charts in the figure mark the same moment; by reading the value of each chart at that line, you can see how different metrics correlate at the same moment, which helps with locating and analyzing problems.

A Dashboard lets you place the charts created above on a single page, then name and save it. This way, each time you log in to Kibana you can see the charts and metrics at a glance on the Dashboard. A Dashboard can also generate a URL for easy sharing, and it provides an iframe link so it can be embedded in the front-end pages of other systems.
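A pie chart of response codes corresponds roughly to a terms aggregation in ES. The following sketch builds such an aggregation body and computes percentages from buckets of the shape ES returns. The `response.keyword` field name, the aggregation name `by_response`, and the sample bucket counts are hypothetical and do not reproduce the data in the figure.

```python
def response_code_aggregation(field="response.keyword"):
    """Build a terms aggregation that counts documents per response code."""
    return {"size": 0, "aggs": {"by_response": {"terms": {"field": field}}}}


def bucket_percentages(buckets):
    """Turn terms-aggregation buckets into percentages of the total."""
    total = sum(bucket["doc_count"] for bucket in buckets)
    if total == 0:
        return {}
    return {
        bucket["key"]: round(100 * bucket["doc_count"] / total, 2)
        for bucket in buckets
    }


# Hypothetical buckets in the shape ES returns for a terms aggregation.
sample_buckets = [
    {"key": "200", "doc_count": 80},
    {"key": "404", "doc_count": 15},
    {"key": "403", "doc_count": 5},
]
print(response_code_aggregation())
print(bucket_percentages(sample_buckets))
```

Kibana performs the equivalent query and percentage calculation behind the scenes when it draws the chart.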

This concludes the course. Finally, I would like to share some of the work done by our ES team in the Infrastructure Department. We have developed two products based primarily on Elasticsearch: a native Elasticsearch service and a time-series database, CTSDB.

Elasticsearch Service (CES):

This course is short and covers a lot of ground, so the use of ES is only covered briefly. For more on using and tuning ES, see the article "Elasticsearch Tuning in Practice", or scan the QR code above to read it. Feedback and corrections are welcome.


For an introduction to time-series data and Tencent Cloud CTSDB, see the article "Tencent's Only Timing Database: CTSDB Demystified", or scan the QR code above to read it. Feedback and corrections are welcome.

If you need either of these two products, search for CES and CTSDB on Tencent Cloud's official website, where more detailed introductions are available.

Thank you very much for taking part in today's course. We have set up an Elasticsearch Lab column in the Tencent Cloud plus Community, where we regularly publish related articles and discuss related issues. You are welcome to visit. Thank you again!
