nginx is the first layer and forwards incoming requests. haproxy is the standard load balancer component provided by Rancher; it proxies requests to specific applications based on rules and, when an application has multiple instances, load-balances across them.
Identification of problems
- Pinging the domain name succeeds => The network itself is working.
- When accessing the website address, the request status in the nginx log is 502 or 504 => The request reached nginx, and the problem lies with the gateway behind it.
Note: 502 Bad Gateway; 504 Gateway Time-out
- Viewing all hosts in Rancher shows that healthcheck, one of Rancher's network containers, is stuck in the initializing state on every host, and containers on different hosts cannot ping each other => This confirms that the problem is in the Rancher network.
The healthcheck status of all hosts is shown in the following screenshot.
- The logs of the healthcheck, rancher-agent, rancher-server and network-manager containers reveal nothing => Awkward: when you use third-party tools without in-depth knowledge of them, you are left very passive when they break.
- Recalling the last Rancher network issue I dealt with (Rancher could not start healthcheck and lb), I followed the official Rancher troubleshooting steps.
- UFW is not enabled on the hosts, which rules out firewall interference.
- Check in the console that each host's IP is correct. This turned up the following clue.
One host's IP had become 172.17.0.1, which is not the machine's real IP; it is normally the IP of the docker0 bridge.
<img width="60%" src="https://media.chenyongjun.vip/2018/06/26/4941b27646624b84a7bf71ef210b35d7.png">
- Running ifconfig on the problem host confirms that 172.17.0.1 is the IP of docker0. According to the Rancher website, a host registered with the wrong IP has to be re-registered.
That is bad news: the containers on that host have to be removed or stopped.
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.17.0.1 netmask 255.255.0.0 broadcast 0.0.0.0
        inet6 fe80::42:9cff:fea1:bc40 prefixlen 64 scopeid 0x20<link>
        ether 02:42:9c:a1:bc:40 txqueuelen 0 (Ethernet)
        RX packets 144756223 bytes 17497382352 (16.2 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 124049363 bytes 79629803176 (74.1 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
- After removing the problematic host and restarting the healthcheck service on the other hosts, communication between hosts returned to normal. This concludes the problem identification.
The problem host was then removed and re-added, after which it also returned to normal.
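The nginx log check in the steps above can be scripted when there are many requests to sift through. The following is only a minimal sketch: it assumes nginx's default "combined" access log format, and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: nginx's default "combined" log format, where the status code
# is the first field after the quoted request line.
STATUS_RE = re.compile(r'"[^"]*" (\d{3}) ')

def count_gateway_errors(lines):
    """Count 502 and 504 responses in an iterable of access log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match and match.group(1) in ("502", "504"):
            counts[match.group(1)] += 1
    return counts

# Made-up sample lines, for illustration only.
sample = [
    '10.0.0.5 - - [26/Jun/2018:10:00:01 +0800] "GET / HTTP/1.1" 502 173 "-" "curl/7.47.0"',
    '10.0.0.5 - - [26/Jun/2018:10:00:02 +0800] "GET /api HTTP/1.1" 504 183 "-" "curl/7.47.0"',
    '10.0.0.5 - - [26/Jun/2018:10:00:03 +0800] "GET /ok HTTP/1.1" 200 612 "-" "curl/7.47.0"',
]
errors = count_gateway_errors(sample)
print(errors["502"], errors["504"])
```

A non-zero count for either code means requests are reaching nginx and failing further downstream, which is the conclusion drawn in the second step.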
Note: I have lost count of how many times I have had to deal with Rancher network problems. With each Rancher version upgrade I have hit plenty of pitfalls as well.
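The ifconfig check used during identification, finding out which interface owns the IP shown in the Rancher console, can be sketched as follows. This assumes the newer net-tools output format seen in this post (header lines of the form `name: flags=...`); the helper names are hypothetical.

```python
import re

# Assumption: newer net-tools format, where each interface starts with
# "<name>: flags=..." and its IPv4 address appears on a later "inet" line.
HEADER_RE = re.compile(r"^(\S+): flags=")
INET_RE = re.compile(r"\binet (\d{1,3}(?:\.\d{1,3}){3})\b")

def interface_ips(ifconfig_output):
    """Map interface name -> IPv4 address from ifconfig text output."""
    ips = {}
    current = None
    for line in ifconfig_output.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            continue
        inet = INET_RE.search(line)  # does not match "inet6" lines
        if inet and current is not None:
            ips[current] = inet.group(1)
    return ips

def owner_of(ip, ifconfig_output):
    """Return the interface that holds the given IP, or None."""
    for name, addr in interface_ips(ifconfig_output).items():
        if addr == ip:
            return name
    return None

# First lines of the docker0 output from the problem host.
output = """docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.17.0.1 netmask 255.255.0.0 broadcast 0.0.0.0
        inet6 fe80::42:9cff:fea1:bc40 prefixlen 64 scopeid 0x20<link>
"""
print(owner_of("172.17.0.1", output))
```

If the IP registered in Rancher turns out to belong to docker0 rather than to a physical interface, the host was registered with the bridge address.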
Reproducing the problem
Setting aside the root cause, I was curious how a single host with the wrong IP could bring down all hosts in an avalanche, so I tried to reproduce the problem.
Reproduction method: in an environment whose network is healthy, add a host whose IP is the docker0 bridge IP, 172.17.0.1.
Reproduction result: as soon as the host with IP 172.17.0.1 is added, the network of the entire environment becomes abnormal and the hosts can no longer communicate with each other, which reproduces the problem above.
Questions to explore
Why does the host IP become 172.17.0.1?
The FAQ on the Rancher website describes this under cross-host communication:
Every so often, the IP of the host will accidentally pick up the docker bridge IP instead of the actual IP. These are typically 172.17.42.1 or starting with 172.17.x.x. If this is the case, you need to re-register your host with the correct IP by explicitly setting the CATTLE_AGENT_IP environment variable in the docker run command.
In other words, a host is occasionally registered with the docker bridge IP instead of its actual IP; such addresses are usually 172.17.42.1 or otherwise start with 172.17. When this happens, the host has to be re-registered with the correct IP, set explicitly through CATTLE_AGENT_IP.
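Since the symptom is a registered IP inside the docker bridge range, a quick scan of the host list can flag suspects before the whole network degrades. The sketch below assumes Docker's default bridge subnet, 172.17.0.0/16, which matches the netmask in the ifconfig output above; the host names and the non-bridge address are hypothetical.

```python
import ipaddress

# Assumption: Docker's default bridge subnet. A customised daemon may use
# a different range, in which case this constant has to be adjusted.
DOCKER_BRIDGE_NET = ipaddress.ip_network("172.17.0.0/16")

def suspect_hosts(registered):
    """Return names of hosts whose registered IP lies in the bridge subnet."""
    return [
        name
        for name, ip in registered.items()
        if ipaddress.ip_address(ip) in DOCKER_BRIDGE_NET
    ]

# Hypothetical host list, e.g. copied by hand from the Rancher console.
hosts = {
    "host-a": "192.168.1.10",
    "host-b": "172.17.0.1",
    "host-c": "172.17.42.1",
}
print(suspect_hosts(hosts))
```

Any host this returns is a candidate for re-registration with CATTLE_AGENT_IP set to its real address.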
TODO: the underlying cause still needs to be investigated.
Why does a problem with one host affect all hosts?
TODO: still to be investigated.