Kubernetes is designed to integrate with major cloud providers' load balancers to provide public IP addresses and direct traffic into a cluster. Kubernetes does not have a built-in network load-balancer implementation. A bare-metal cluster, or really any cluster deployed outside a public cloud and lacking expensive professional hardware, needs another solution.
Some professional network equipment manufacturers also offer controllers to integrate their physical load-balancing products into Kubernetes installations in private data centers. However, this can be prohibitively expensive.
Load Balancer vs Ingress
Kubernetes does support Ingress, but Ingress is limited to HTTP and HTTPS traffic, whereas MetalLB can support any network traffic. In a nutshell, MetalLB resolves a previously unassigned IP address to a particular cluster node and assigns that IP to a Service, while Ingress uses a specific IP address and internally routes HTTP or HTTPS traffic to one or more Services based on routing rules.
MetalLB is a network load balancer that can expose cluster services on a dedicated IP address on the network, allowing external clients to connect to services inside the Kubernetes cluster. It does this via either Layer 2 (data link) using Address Resolution Protocol (ARP) or Layer 3 (network) using Border Gateway Protocol (BGP).
Instead of using a NodePort to expose services (not ideal for a production implementation), MetalLB offers a network load-balancer implementation that integrates with the networking equipment you already have in place. This makes it an ideal solution for bare-metal and VM-based clusters outside public cloud environments.
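To illustrate the kind of Service MetalLB assigns an address to, here is a minimal sketch of a `LoadBalancer` Service exposing a non-HTTP TCP port, something an Ingress could not route. The name `example-db`, its selector label, and port 5432 are hypothetical placeholders rather than values from any particular deployment.

```yaml
# Hypothetical Service: name, label, and port are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: example-db
spec:
  # Asking for a LoadBalancer is what prompts MetalLB to allocate an external IP.
  type: LoadBalancer
  selector:
    app: example-db
  ports:
  - name: postgres
    protocol: TCP
    port: 5432        # port exposed on the external IP
    targetPort: 5432  # port the pods listen on
```

Once an address is allocated, it should appear in the `EXTERNAL-IP` column of `kubectl get service example-db`.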
ARP vs BGP
MetalLB works via either ARP or BGP to resolve IP addresses to specific hosts. So, when a client attempts to connect to a specific IP, it will ask "which host has this IP?" and the response will point it to the correct host (i.e., the host's MAC address).
With ARP, the request is broadcast to the entire network, and a host that knows which MAC address has that IP address responds to the request. In this case, MetalLB's response directs the client to the correct node. Once the traffic has arrived at a host, Kubernetes takes over and directs it to the correct pods.
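For comparison with the BGP sample later in this section, a Layer 2 (ARP) address pool might look like the sketch below. It assumes the same `configInline` values format used by that sample; the pool name and address range are illustrative and should be replaced with unused addresses on your own network.

```yaml
# Sketch only: pool name and address range are assumptions.
configInline:
  address-pools:
  - name: default
    # layer2 makes MetalLB answer ARP requests for these addresses;
    # no peers section or router configuration is involved.
    protocol: layer2
    addresses:
    - 192.168.1.240-192.168.1.250
```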
With BGP, each "peer" maintains a table of routing information that directs clients to the host handling a particular IP, covering the IPs and hosts that peer knows about, and it advertises this information to its peers. When configured for BGP, MetalLB peers each of the nodes in the cluster with the network's router, allowing the router to direct clients to the correct host.
Consumer-grade routers generally do not support BGP, and even professional routers that do can be difficult to configure. ARP can be just as useful, requires no configuration on the network, and can be considerably easier to implement. A simple configuration that establishes a peering session with a BGP peer is given below. For services that rely on DNS and require a static IP, you can create multiple address pools and disable auto-assignment on one of them to ensure its IPs are never allocated dynamically.
    configInline:
      peers:
      - peer-address: 10.0.0.1
        peer-asn: 64501
        my-asn: 64500
      address-pools:
      - name: dynamic-pool
        protocol: bgp
        addresses:
        - 192.168.10.0/25
        auto-assign: true
      - name: static-pool
        protocol: bgp
        addresses:
        - 192.168.10.128/25
        auto-assign: false
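A Service could then request an address from the static pool, roughly as sketched here. The Service name, selector, port, and the address 192.168.10.130 are hypothetical. The annotation and the `loadBalancerIP` field reflect how MetalLB releases using this configuration format let a Service pick a pool or a specific address, so check the documentation for the version you run.

```yaml
# Hypothetical Service: name, label, port, and IP are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: example-web
  annotations:
    # Ask MetalLB to allocate from the pool named static-pool.
    metallb.universe.tf/address-pool: static-pool
spec:
  type: LoadBalancer
  # Request one specific address; it must fall inside 192.168.10.128/25.
  loadBalancerIP: 192.168.10.130
  selector:
    app: example-web
  ports:
  - name: https
    protocol: TCP
    port: 443
    targetPort: 8443
```

Because `static-pool` sets `auto-assign: false`, its addresses are only given to Services that ask for them explicitly, which keeps a DNS record pointing at 192.168.10.130 stable.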
MetalLB supports application-specific metrics and alerting. You can enable metrics scraping in the custom values file. A sample config is shown below.
    prometheus:
      # scrape annotations specifies whether to add Prometheus metric
      # auto-collection annotations to pods. See
      # https://github.com/prometheus/prometheus/blob/release-2.1/documentation/examples/prometheus-kubernetes.yml
      # for a corresponding Prometheus configuration. Alternatively, you
      # may want to use the Prometheus Operator
      # (https://github.com/coreos/prometheus-operator) for more powerful
      # monitoring configuration. If you use the Prometheus operator, this
      # can be left at false.
      scrapeAnnotations: false

      # port both controller and speaker will listen on for metrics
      metricsPort: 7472

      # Prometheus Operator PodMonitors
      podMonitor:
        # enable support for Prometheus Operator
        enabled: true

        # Job label for scrape target
        jobLabel: "app.kubernetes.io/name"

        # Scrape interval. If not set, the Prometheus default scrape interval is used.
        interval:

        # metric relabel configs to apply to samples before ingestion.
        metricRelabelings:
        # - action: keep
        #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
        #   sourceLabels: [__name__]

        # relabel configs to apply to samples before ingestion.
        relabelings:
        # - sourceLabels: [__meta_kubernetes_pod_node_name]
        #   separator: ;
        #   regex: ^(.*)$
        #   target_label: nodename
        #   replacement: $1
        #   action: replace

      # Prometheus Operator alertmanager alerts
      prometheusRule:
        # enable alertmanager alerts
        enabled: true

        # MetalLBStaleConfig
        staleConfig:
          enabled: true
          labels:
            severity: warning

        # MetalLBConfigNotLoaded
        configNotLoaded:
          enabled: true
          labels:
            severity: warning

        # MetalLBAddressPoolExhausted
        addressPoolExhausted:
          enabled: true
          labels:
            severity: alert

        addressPoolUsage:
          enabled: true
          thresholds:
          - percent: 75
            labels:
              severity: warning
          - percent: 85
            labels:
              severity: warning
          - percent: 95
            labels:
              severity: alert

        # MetalLBBGPSessionDown
        bgpSessionDown:
          enabled: true
          labels:
            severity: alert

        extraAlerts: