Hadoop 3.2.1 HA Deployment in Practice: ES-7.7.1 + Kibana-7.7.1 Cluster Setup

    Tech · 2023-10-04

    I. Environment

    Operating system: CentOS 8

    elasticsearch-7.7.1

    kibana-7.7.1

    Hosts: hadoop102, hadoop103, hadoop104, hadoop105, hadoop106

    II. Install elasticsearch-7.7.1

    1. Upload elasticsearch-7.7.1-linux-x86_64.tar.gz and kibana-7.7.1-linux-x86_64.tar.gz to /opt/software

    2. Extract them to /opt/module

[deploy@hadoop102 module]$ ls -tlr
total 24
drwxr-xr-x. 13 deploy deploy  211 Feb  3 03:47 spark-2.4.5-bin-hadoop2.7
lrwxrwxrwx.  1 deploy deploy   12 May 30 23:57 jdk-default -> jdk1.8.0_171
drwxr-xr-x.  8 deploy deploy 4096 May 30 23:59 jdk1.8.0_171
lrwxrwxrwx.  1 deploy deploy   26 May 31 00:32 zookeeper-default -> apache-zookeeper-3.6.1-bin
drwxr-xr-x.  8 deploy deploy  159 May 31 04:31 apache-zookeeper-3.6.1-bin
drwxr-xr-x.  4 deploy deploy   43 May 31 09:12 job_history
lrwxrwxrwx.  1 deploy deploy   34 Jun  2 01:15 spark-default -> spark-3.0.0-preview2-bin-hadoop3.2
drwxr-xr-x.  6 deploy deploy   99 Jun 11 12:01 maven
drwxr-xr-x. 11 deploy deploy  195 Jun 13 23:42 hadoop-2.10.0
drwxr-xr-x. 10 deploy deploy  184 Jun 14 00:17 apache-hive-3.1.2-bin
drwxr-xr-x. 10 deploy deploy  184 Jun 14 00:21 apache-hive-2.3.7-bin
drwxr-xr-x. 11 deploy deploy  173 Jun 14 01:10 hadoop-2.7.2
drwxr-xr-x.  3 deploy deploy   18 Jun 14 02:01 hive
drwxr-xr-x.  5 deploy deploy 4096 Jun 14 02:11 tez
lrwxrwxrwx.  1 deploy deploy   12 Jun 14 02:26 hadoop-default -> hadoop-3.2.1
lrwxrwxrwx.  1 deploy deploy   21 Jun 14 06:06 hive-default -> apache-hive-3.1.2-bin
drwxr-xr-x. 11 deploy deploy  173 Jun 14 06:32 hadoop-3.2.1
-rw-rw-r--.  1 deploy deploy  265 Jun 14 06:46 TestDFSIO_results.log
drwxr-xr-x. 14 deploy deploy  224 Jun 14 09:46 spark-3.0.0-preview2-bin-hadoop3.2
-rw-------.  1 deploy deploy 8534 Jun 14 12:31 nohup.out
drwxrwxr-x. 13 deploy deploy  266 Jun 14 20:16 kibana-7.7.1-linux-x86_64
lrwxrwxrwx.  1 deploy deploy   19 Jun 14 20:19 elasticsearch-default -> elasticsearch-7.7.1
lrwxrwxrwx.  1 deploy deploy   25 Jun 14 20:19 kibana-default -> kibana-7.7.1-linux-x86_64
drwxr-xr-x. 10 deploy deploy  167 Jun 15 05:05 elasticsearch-7.7.1

    3. Switch to /opt/module/elasticsearch-7.7.1/config and edit elasticsearch.yml

    Things to watch when editing the yml:

    Every setting must start at the first column of its line, with no leading spaces

    There must be one space after the colon of each setting
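    Both rules can be checked mechanically before starting the node. The sketch below writes a deliberately broken sample file (a hypothetical /tmp/check.yml, not part of the deployment) and uses grep to flag each kind of violation:

```shell
# Hypothetical sample: line 2 has a leading space, line 3 lacks a space after the colon.
cat > /tmp/check.yml <<'EOF'
cluster.name: my-es
 node.name: node-102
http.port:9200
EOF

# Rule 1: flag non-comment lines with leading whitespace (flags line 2)
grep -nE '^[[:space:]]+[^#]' /tmp/check.yml
# Rule 2: flag "key:value" with no space after the colon (flags line 3)
grep -nE '^[a-z][^:]*:[^ ]' /tmp/check.yml
```

    Running the same two greps against the real elasticsearch.yml should print nothing once the file is formatted correctly.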

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
# cluster.name must be identical on every machine
#
# Use a descriptive name for your cluster:
#
cluster.name: my-es
#
# ------------------------------------ Node ------------------------------------
# node.name must be different on every machine
#
# Use a descriptive name for the node:
#
node.name: node-102
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
# Turn off the bootstrap self-checks
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# Network: change this to the current machine's address; keep the default port 9200
network.host: hadoop102
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# Self-discovery: the hosts a new node contacts to report to the cluster
discovery.seed_hosts: ["hadoop102","hadoop103","hadoop104","hadoop105","hadoop106"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
# New configuration item in version 7, covered further below
cluster.initial_master_nodes: ["node-102","node-103","node-104","node-105","node-106"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
#

    That completes one server.

    Next, distribute the installation to the other 4 servers:

    [deploy@hadoop102 module]$ xsync elasticsearch-7.7.1

    Then edit the configuration file on hadoop103, hadoop104, hadoop105 and hadoop106 so that node.name and network.host match each host (node-103 / hadoop103 on hadoop103, and so on); the distributed copies still carry the hadoop102 values node-102 and hadoop102.
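    The per-host edits can also be generated with sed instead of being typed by hand. A minimal local sketch, using a hypothetical staging directory /tmp/es-conf in place of /opt/module/elasticsearch-7.7.1/config (in the real cluster the generated files would be pushed out with xsync or scp):

```shell
# Stand-in for the hadoop102 copy of elasticsearch.yml (only the two per-host keys).
mkdir -p /tmp/es-conf
printf 'node.name: node-102\nnetwork.host: hadoop102\n' > /tmp/es-conf/elasticsearch.yml

# Derive one config per remaining host: node-10X follows the hadoop10X hostname.
for host in hadoop103 hadoop104 hadoop105 hadoop106; do
    n=${host#hadoop}   # 103, 104, ...
    sed -e "s/^node.name: .*/node.name: node-${n}/" \
        -e "s/^network.host: .*/network.host: ${host}/" \
        /tmp/es-conf/elasticsearch.yml > "/tmp/es-conf/elasticsearch.yml.${host}"
done

# Spot-check one generated file
grep -H '' /tmp/es-conf/elasticsearch.yml.hadoop105
```

    This keeps the naming convention (node number = host number) consistent across all five nodes, which is exactly what cluster.initial_master_nodes above relies on.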

    III. Modify the Linux configuration (create any file that does not exist)

    Why do the Linux settings need to change?

    By default Elasticsearch assumes single-machine use: only the local host talks to it.

    But later we will certainly allow application servers to reach it over the network. Once it binds to a non-local address, Elasticsearch enforces its production bootstrap checks, complains about the low single-machine defaults, and may even refuse to start.

    So here we raise a few OS limits so the servers can support more concurrency.

    Problem 1: max file descriptors [4096] for elasticsearch process likely too low, increase to at least [65536]

    Cause: the maximum number of files the Elasticsearch process may open must be raised to 65536.

    Fix: vi /etc/security/limits.conf and append:

* soft nofile 65536
* hard nofile 131072
* soft nproc 2048
* hard nproc 65536

    Note: do not omit the "*".

    Distribute the file: xsync /etc/security/limits.conf

    Problem 2: max number of threads [1024] for user [judy2] likely too low, increase to at least [2048]

    Cause: the maximum number of processes per user must be raised to 4096.

    Fix: vi /etc/security/limits.d/90-nproc.conf and change:

* soft nproc 1024
# change to
* soft nproc 4096

    Distribute the file: xsync /etc/security/limits.d/90-nproc.conf
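    Changes in limits.conf only take effect for new login sessions, so after re-logging in it is worth confirming what the shell actually received. A quick check (nofile is the open-file limit, nproc the max user processes):

```shell
# Print the limits of the current session; compare against the values set above.
echo "nofile=$(ulimit -n)"
echo "nproc=$(ulimit -u)"
```

    If nofile still prints 4096, the session predates the edit or PAM is not applying limits.conf.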

    Problem 3: max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]

    Cause: vm.max_map_count limits how many virtual memory areas a single process may own.

    Fix: append the line vm.max_map_count=262144 to the end of /etc/sysctl.conf to make the change permanent.

    Distribute the file: xsync /etc/sysctl.conf
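    To see whether the kernel has picked up the new value, read it back through the Linux /proc interface:

```shell
# Read the live value of vm.max_map_count (prints 0 if the file is unavailable).
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)
echo "vm.max_map_count=${current}"
```

    Running `sysctl -p` (or the reboot below) applies the edited /etc/sysctl.conf; `sysctl -w vm.max_map_count=262144` would also set it immediately without editing the file.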

    Reboot all 5 servers.

    IV. Write a cluster start/stop script (cluster_es.sh)

#!/bin/bash
es_home=/opt/module/elasticsearch-default
kibana_home=/opt/module/kibana-default
case $1 in
"start"){
    for i in hadoop102 hadoop103 hadoop104 hadoop105 hadoop106
    do
        ssh $i "source /etc/profile;${es_home}/bin/elasticsearch >/dev/null 2>&1 &"
    done
    nohup ${kibana_home}/bin/kibana >kibana.log 2>&1 &
};;
"stop"){
    ps -ef | grep ${kibana_home} | grep -v grep | awk '{print $2}' | xargs kill
    for i in hadoop102 hadoop103 hadoop104 hadoop105 hadoop106
    do
        ssh $i "ps -ef | grep $es_home | grep -v grep | awk '{print \$2}' | xargs kill" >/dev/null 2>&1
    done
};;
esac
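    One subtlety in the stop branch: inside the double-quoted ssh command, `\$2` is escaped so that awk's field variable reaches the remote shell intact instead of being expanded locally. The extraction itself can be exercised locally; the sample line below is a hypothetical one-line stand-in for `ps -ef` output, where column 2 is the PID:

```shell
# Hypothetical ps -ef line for a running ES process; column 2 is the PID.
sample='deploy   12345     1  0 10:00 ?  00:00:01 /opt/module/elasticsearch-default/bin/elasticsearch'

# Same pipeline the stop branch runs before piping into `xargs kill`:
echo "$sample" | grep elasticsearch-default | grep -v grep | awk '{print $2}'   # prints 12345
```

    `grep -v grep` drops the grep process itself from the real `ps -ef` listing so the script does not try to kill it.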

    Upload it to /home/deploy/bin

    Give it 777 permissions (chmod 777 cluster_es.sh)

    V. Start the cluster

    cluster_es.sh start

    VI. Test

    Test command: curl http://hadoop102:9200/_cat/nodes?v

    A successful start looks like this:

[deploy@hadoop102 module]$ curl http://hadoop102:9200/_cat/nodes?v
ip            heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.1.106            7          34   0    0.25    0.20     0.08 dilmrt    -      node-106
192.168.1.103           12          40   0    0.09    0.09     0.03 dilmrt    -      node-103
192.168.1.104            9          34   0    0.10    0.09     0.03 dilmrt    -      node-104
192.168.1.102            7          30   0    0.39    0.31     0.14 dilmrt    -      node-102
192.168.1.105            9          34   0    0.08    0.07     0.02 dilmrt    *      node-105
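    In this output, the `*` in the master column marks the elected master (node-105 here). When scripting against saved `_cat/nodes` output, that row can be picked out with awk; the sample line is taken from the run above:

```shell
# Find the elected master: the row whose 9th whitespace-separated column is "*",
# then print column 10 (the node name). Sample line from the curl output above.
nodes='192.168.1.105 9 34 0 0.08 0.07 0.02 dilmrt * node-105'
echo "$nodes" | awk '$9 == "*" { print $10 }'   # prints node-105
```

    Against the live cluster the same filter would be fed from `curl -s http://hadoop102:9200/_cat/nodes`.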

    VII. If startup failed, remember to check the logs

[deploy@hadoop102 module]$ cd elasticsearch-default/
[deploy@hadoop102 elasticsearch-default]$ ls
bin  config  data  jdk  lib  LICENSE.txt  logs  modules  NOTICE.txt  plugins  README.asciidoc
[deploy@hadoop102 elasticsearch-default]$ cd logs
[deploy@hadoop102 logs]$ ls -tlr
total 97176
-rw-rw-r--. 1 deploy deploy        0 Jun 15 05:05 my-es_audit.json
-rw-rw-r--. 1 deploy deploy        0 Jun 15 05:05 my-es_index_indexing_slowlog.log
-rw-rw-r--. 1 deploy deploy        0 Jun 15 05:05 my-es_index_search_slowlog.log
-rw-rw-r--. 1 deploy deploy        0 Jun 15 05:05 my-es_index_search_slowlog.json
-rw-rw-r--. 1 deploy deploy        0 Jun 15 05:05 my-es_index_indexing_slowlog.json
-rw-rw-r--. 1 deploy deploy     1043 Jun 15 05:49 kibana.log
-rw-rw-r--. 1 deploy deploy   361948 Jun 15 23:17 my-es_server.json
-rw-rw-r--. 1 deploy deploy   289878 Jun 15 23:17 my-es.log
-rw-rw-r--. 1 deploy deploy   673629 Jun 15 23:18 gc.log.0.current
-rw-rw-r--. 1 deploy deploy 28354899 Jun 15 23:20 my-es_deprecation.log
-rw-rw-r--. 1 deploy deploy 45716694 Jun 15 23:20 my-es_deprecation.json

    VIII. Install Kibana

    1. Locate kibana.yml

[deploy@hadoop102 kibana-7.7.1-linux-x86_64]$ ls -tlr
total 1864
-rw-rw-r--.    1 deploy deploy    4057 May 29 01:11 README.txt
drwxrwxr-x.    2 deploy deploy       6 May 29 01:11 plugins
-rw-rw-r--.    1 deploy deploy     738 May 29 01:11 package.json
-rw-rw-r--.    1 deploy deploy 1807200 May 29 01:11 NOTICE.txt
-rw-rw-r--.    1 deploy deploy   13675 May 29 01:11 LICENSE.txt
drwxrwxr-x.    2 deploy deploy      24 Jun 14 20:16 config
drwxrwxr-x.    5 deploy deploy      43 Jun 14 20:16 built_assets
drwxrwxr-x.    6 deploy deploy     108 Jun 14 20:16 node
drwxrwxr-x.    3 deploy deploy      55 Jun 14 20:16 optimize
drwxrwxr-x. 1632 deploy deploy   49152 Jun 14 20:16 node_modules
drwxrwxr-x.    2 deploy deploy     132 Jun 14 20:16 webpackShims
drwxrwxr-x.   11 deploy deploy     160 Jun 14 20:16 src
drwxrwxr-x.    5 deploy deploy     129 Jun 14 20:16 x-pack
drwxrwxr-x.    2 deploy deploy      81 Jun 15 06:33 bin
drwxrwxr-x.    3 deploy deploy      46 Jun 15 06:37 data
[deploy@hadoop102 kibana-7.7.1-linux-x86_64]$ cd config/
[deploy@hadoop102 config]$ ls -tlr
total 8
-rw-r--r--. 1 deploy deploy 5241 Jun 14 22:37 kibana.yml

    2. Modify kibana.yml; the two changed settings are:

    server.host: "0.0.0.0"

    elasticsearch.hosts: ["http://hadoop102:9200"]

# Kibana is served by a back end server. This setting specifies the port to use.
#server.port: 5601

# Specifies the address to which the Kibana server will bind. IP addresses and host names are both valid values.
# The default is 'localhost', which usually means remote machines will not be able to connect.
# To allow connections from remote users, set this parameter to a non-loopback address.
server.host: "0.0.0.0"

# Enables you to specify a path to mount Kibana at if you are running behind a proxy.
# Use the `server.rewriteBasePath` setting to tell Kibana if it should remove the basePath
# from requests it receives, and to prevent a deprecation warning at startup.
# This setting cannot end in a slash.
#server.basePath: ""

# Specifies whether Kibana should rewrite requests that are prefixed with
# `server.basePath` or require that they are rewritten by your reverse proxy.
# This setting was effectively always `false` before Kibana 6.3 and will
# default to `true` starting in Kibana 7.0.
#server.rewriteBasePath: false

# The maximum payload size in bytes for incoming server requests.
#server.maxPayloadBytes: 1048576

# The Kibana server's name. This is used for display purposes.
#server.name: "hadoop102"

# The URLs of the Elasticsearch instances to use for all your queries.
elasticsearch.hosts: ["http://hadoop102:9200"]

# When this setting's value is true Kibana uses the hostname specified in the server.host
# setting. When the value of this setting is false, Kibana uses the hostname of the host
# that connects to this Kibana instance.
#elasticsearch.preserveHost: true

# Kibana uses an index in Elasticsearch to store saved searches, visualizations and
# dashboards. Kibana creates a new index if the index doesn't already exist.
#kibana.index: ".kibana"

# The default application to load.
#kibana.defaultAppId: "home"

# If your Elasticsearch is protected with basic authentication, these settings provide
# the username and password that the Kibana server uses to perform maintenance on the Kibana
# index at startup. Your Kibana users still need to authenticate with Elasticsearch, which
# is proxied through the Kibana server.
#elasticsearch.username: "kibana"
#elasticsearch.password: "pass"

# Enables SSL and paths to the PEM-format SSL certificate and SSL key files, respectively.
# These settings enable SSL for outgoing requests from the Kibana server to the browser.
#server.ssl.enabled: false
#server.ssl.certificate: /path/to/your/server.crt
#server.ssl.key: /path/to/your/server.key

# Optional settings that provide the paths to the PEM-format SSL certificate and key files.
# These files are used to verify the identity of Kibana to Elasticsearch and are required when
# xpack.security.http.ssl.client_authentication in Elasticsearch is set to required.
#elasticsearch.ssl.certificate: /path/to/your/client.crt
#elasticsearch.ssl.key: /path/to/your/client.key

# Optional setting that enables you to specify a path to the PEM file for the certificate
# authority for your Elasticsearch instance.
#elasticsearch.ssl.certificateAuthorities: [ "/path/to/your/CA.pem" ]

# To disregard the validity of SSL certificates, change this setting's value to 'none'.
#elasticsearch.ssl.verificationMode: full

# Time in milliseconds to wait for Elasticsearch to respond to pings. Defaults to the value of
# the elasticsearch.requestTimeout setting.
#elasticsearch.pingTimeout: 1500

# Time in milliseconds to wait for responses from the back end or Elasticsearch. This value
# must be a positive integer.
#elasticsearch.requestTimeout: 30000

# List of Kibana client-side headers to send to Elasticsearch. To send *no* client-side
# headers, set this value to [] (an empty list).
#elasticsearch.requestHeadersWhitelist: [ authorization ]

# Header names and values that are sent to Elasticsearch. Any custom headers cannot be overwritten
# by client-side headers, regardless of the elasticsearch.requestHeadersWhitelist configuration.
#elasticsearch.customHeaders: {}

# Time in milliseconds for Elasticsearch to wait for responses from shards. Set to 0 to disable.
#elasticsearch.shardTimeout: 30000

# Time in milliseconds to wait for Elasticsearch at Kibana startup before retrying.
#elasticsearch.startupTimeout: 5000

# Logs queries sent to Elasticsearch. Requires logging.verbose set to true.
#elasticsearch.logQueries: false

# Specifies the path where Kibana creates the process ID file.
#pid.file: /var/run/kibana.pid

# Enables you specify a file where Kibana stores log output.
#logging.dest: stdout

# Set the value of this setting to true to suppress all logging output.
#logging.silent: false

# Set the value of this setting to true to suppress all logging output other than error messages.
#logging.quiet: false

# Set the value of this setting to true to log all events, including system usage information
# and all requests.
#logging.verbose: false

# Set the interval in milliseconds to sample system and process performance
# metrics. Minimum is 100ms. Defaults to 5000.
#ops.interval: 5000

# Specifies locale to be used for all localizable strings, dates and number formats.
# Supported languages are the following: English - en , by default , Chinese - zh-CN .
#i18n.locale: "en"

    3. Start the service

    nohup /opt/module/kibana-7.7.1-linux-x86_64/bin/kibana &

    4. Test

    Open http://hadoop102:5601/ in a browser; if Kibana started correctly you will land on its home page.

    5. Open Dev Tools and run requests against ES directly
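    As a first smoke test in the Dev Tools console (which translates each request into an HTTP call to Elasticsearch), something like the sketch below works; the index name my_index is a hypothetical example, not something created earlier in this guide:

```
GET _cat/nodes?v

PUT /my_index

POST /my_index/_doc/1
{
  "msg": "hello es"
}

GET /my_index/_search
```

    The first request should return the same node table as the curl test above, confirming that Kibana is talking to the cluster.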

