Link Aggregation (LACP): Bonding Network Links for Speed and Redundancy
Link Aggregation bundles multiple physical network links into a single logical connection, increasing throughput and providing failover if a cable or port fails. Whether you are connecting a server with dual NICs to a switch or linking two switches together, LACP (802.3ad) is the industry-standard protocol that makes it work. This guide covers static vs dynamic LAG, MLAG and vPC for multi-chassis aggregation, practical use cases, and common mistakes.
What Is Link Aggregation?
Link aggregation, also known as port channelling, NIC teaming, or bonding, combines two or more physical Ethernet links into a single logical link called a Link Aggregation Group (LAG). The resulting logical interface offers the combined bandwidth of all member links and remains operational as long as at least one member link is active. For example, four 1 Gbps links aggregated together form a 4 Gbps logical link. If one cable is unplugged, the LAG continues to operate at 3 Gbps: flows that were using the failed link are redistributed across the remaining links automatically, typically with only a brief disruption to those flows.
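The capacity arithmetic can be sketched in a few lines of Python. This is an illustration only; the function name `lag_capacity_gbps` is made up for this example and does not correspond to any switch or OS API.

```python
def lag_capacity_gbps(member_speeds_gbps, link_up):
    """Return the aggregate capacity of a LAG: the sum of the members that are up."""
    return sum(speed for speed, up in zip(member_speeds_gbps, link_up) if up)


members = [1, 1, 1, 1]  # four 1 Gbps member links

print(lag_capacity_gbps(members, [True, True, True, True]))   # 4
print(lag_capacity_gbps(members, [True, False, True, True]))  # 3 (one cable unplugged)
```

Keep in mind that this is aggregate capacity across many flows, not the speed available to any single flow.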
Link aggregation is defined by the IEEE 802.3ad standard (now incorporated into IEEE 802.1AX). It is supported by virtually every managed switch, server operating system, and network-attached storage device on the market. For Australian IT resellers, LAG is one of the most practical tools in your toolkit — it increases capacity without requiring faster (and more expensive) optics, and it provides resilience at the physical layer. A dual-attached server with LACP is far more robust than one hanging off a single cable and port.
Static LAG vs Dynamic LAG (LACP)
A static LAG (sometimes called a manual port channel or EtherChannel "on" mode) bundles links without any negotiation protocol. Both ends must be configured identically — same number of ports, same speed, same duplex. The advantage is simplicity: no protocol overhead and no dependency on LACP support. The disadvantage is that the switch has no way to detect a misconfiguration or a unidirectional link failure. If one side thinks it has a four-port bundle but the other side is configured differently, traffic can be black-holed with no alarm raised.
Dynamic LAG using LACP (Link Aggregation Control Protocol) adds a negotiation layer. Each switch port sends LACP Data Units (LACPDUs) to its partner, exchanging system ID, port priority, and aggregation key information. Both ends must agree on these parameters before the link is added to the bundle. LACP can detect mismatches (e.g., a port cabled to the wrong switch), unidirectional failures, and speed mismatches, placing offending ports into a suspended state rather than forwarding traffic into a black hole. For this reason, LACP is always recommended over static LAG unless the remote device does not support it.
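To make the idea of "agreeing before bundling" concrete, here is a deliberately simplified Python model. It is not the IEEE 802.1AX state machine: the `PartnerInfo` class and `select_members` function are hypothetical names, and the rule "the first port that has heard from a partner sets the reference" is an assumption made to keep the sketch short. Real implementations also weigh system and port priorities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartnerInfo:
    """Subset of what a port learns about its partner from LACPDUs (simplified)."""
    system_id: str  # partner's system ID
    key: int        # partner's aggregation key


def select_members(ports):
    """Split ports into (bundled, suspended) lists of port names.

    `ports` maps a local port name to PartnerInfo, or to None when no LACPDU
    has been received on that port. The first port with partner information
    sets the reference; ports whose partner differs are kept out of the bundle.
    """
    reference = None
    bundled, suspended = [], []
    for name, info in ports.items():
        if info is None:
            suspended.append(name)  # silent partner: do not forward
            continue
        if reference is None:
            reference = info
        if info == reference:
            bundled.append(name)
        else:
            suspended.append(name)  # e.g. cabled to the wrong switch
    return bundled, suspended


switch_a = PartnerInfo(system_id="00:11:22:33:44:55", key=10)
switch_b = PartnerInfo(system_id="66:77:88:99:aa:bb", key=10)

bundled, suspended = select_members(
    {"eth1": switch_a, "eth2": switch_a, "eth3": switch_b, "eth4": None}
)
print(bundled)    # ['eth1', 'eth2']
print(suspended)  # ['eth3', 'eth4']
```

A static LAG has no equivalent of this check, which is why a miscabled port there can silently black-hole traffic.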
How Traffic Is Distributed Across Member Links
A common misconception is that link aggregation provides per-packet load balancing across all member links. In reality, LAG uses a hashing algorithm to assign each traffic flow to a specific member link. The hash is typically computed from a combination of source and destination MAC addresses, IP addresses, and TCP/UDP port numbers. All packets belonging to the same flow (same source/destination pair) always traverse the same physical link, preserving packet ordering. This means a single TCP session between two hosts will never exceed the speed of one member link.
The benefit of aggregation becomes apparent when multiple flows are active simultaneously. A server with a four-port LAG serving 50 clients will distribute those 50 flows across all four links, achieving aggregate throughput approaching the combined capacity. The hash algorithm is configurable on most managed switches — using a five-tuple hash (source IP, destination IP, protocol, source port, destination port) provides the best distribution. Simpler hashes based only on MAC addresses can lead to uneven distribution, especially when traffic passes through a router (which presents a single MAC address for all routed traffic).
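The following Python sketch illustrates flow-based distribution and why a MAC-only hash suffers behind a router. It assumes a CRC32 hash taken modulo the number of links. Real switches use vendor-specific hardware hash functions, so the exact link chosen for a given flow will differ. All addresses and the helper names (`pick_link`, `five_tuple`, `mac_pair`) are invented for the example.

```python
import zlib


def pick_link(fields, num_links):
    """Map a tuple of header fields to a member link index using a stable hash."""
    key = "|".join(str(field) for field in fields).encode()
    return zlib.crc32(key) % num_links


def five_tuple(flow):
    return (flow["src_ip"], flow["dst_ip"], flow["proto"],
            flow["src_port"], flow["dst_port"])


def mac_pair(flow):
    return (flow["src_mac"], flow["dst_mac"])


# 50 client flows that all arrive via one router: the MAC pair is identical
# for every flow, while the IP addresses and source ports differ.
flows = [
    {"src_mac": "aa:aa:aa:aa:aa:01", "dst_mac": "bb:bb:bb:bb:bb:01",
     "src_ip": f"10.0.0.{i}", "dst_ip": "10.0.1.10", "proto": "tcp",
     "src_port": 40000 + i, "dst_port": 445}
    for i in range(1, 51)
]

links_five_tuple = {pick_link(five_tuple(f), 4) for f in flows}
links_mac_only = {pick_link(mac_pair(f), 4) for f in flows}

print("links used with five-tuple hash:", sorted(links_five_tuple))
print("links used with MAC-only hash:  ", sorted(links_mac_only))
```

Because every flow shares the same MAC pair, the MAC-only hash places all 50 flows on a single link, whereas the five-tuple hash has enough variation in its input to spread them out. Either way, a given flow always hashes to the same link, which is what preserves packet ordering.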
MLAG and vPC: Multi-Chassis Link Aggregation
Standard LACP requires all member links to terminate on the same switch. This creates a single point of failure — if that switch loses power or crashes, the entire LAG goes down. Multi-Chassis Link Aggregation (MLAG) solves this by allowing a LAG to span two separate physical switches that appear as a single logical switch to the connected device. Different vendors have their own implementations: Cisco calls it vPC (Virtual Port Channel), Arista uses MLAG, Juniper has MC-LAG, and HPE/Aruba offers VSX (Virtual Switching Extension).
In an MLAG configuration, the two peer switches maintain a peer link (sometimes called an inter-chassis link or ICL) that synchronises MAC address tables, ARP entries, and LACP state between them. A keepalive mechanism detects if the peer switch fails, and the surviving switch takes over all forwarding. From the server's perspective, it has a single LACP bundle spread across two switches, providing both bandwidth aggregation and switch-level redundancy. MLAG is the standard design for data centre access layer connectivity and is used extensively in server rooms and top-of-rack deployments.
Link Aggregation Options Compared
| Feature | Static LAG | LACP (Single Switch) | MLAG/vPC (Dual Switch) |
|---|---|---|---|
| Negotiation protocol | None | LACP (802.3ad) | LACP + vendor-specific peering |
| Misconfiguration detection | No | Yes | Yes |
| Switch redundancy | No | No | Yes |
| Configuration complexity | Low | Low | Medium-High |
| Typical use case | Simple NAS/device bonding | Server to single switch | Server to switch pair (data centre) |
| Vendor interoperability | Varies — no standard negotiation | Excellent — IEEE standard | Peer switches must be same vendor |
Use Cases: Where Link Aggregation Shines
Server connectivity is the most common use case. Dual-NIC servers bonded via LACP to a switch pair (using MLAG) gain both increased throughput and redundancy at the NIC, cable, and switch level. This is standard practice for VMware ESXi hosts, Hyper-V hosts, and file servers where downtime from a single link failure is unacceptable. On Linux, bonding is configured via the bonding kernel module with mode 4 (802.3ad). On Windows Server, NIC teaming is built into the OS and supports LACP natively through the "LACP" teaming mode; the "Switch Independent" and "Static Teaming" modes do not negotiate LACP.
Switch-to-switch uplinks are another prime candidate. Instead of a single 10 Gbps uplink between an access switch and the distribution/core, you can aggregate two or four 10 Gbps links for 20-40 Gbps of inter-switch bandwidth. This is more cost-effective than upgrading to 25 or 40 Gbps optics in some scenarios and provides built-in redundancy. NAS and SAN connectivity also benefits — Synology and QNAP NAS devices support LACP out of the box, and bonding multiple gigabit or 2.5 GbE ports dramatically improves throughput for multi-user file access.
Common Mistakes and Troubleshooting
The most frequent LACP mistake is mismatched configuration between the switch and the server. If the switch is configured for LACP but the server NIC team is set to "Switch Independent" mode (which does not use LACP), the switch never receives LACPDUs from the server and the bundle does not form; depending on the switch platform, the member ports are suspended or fall back to operating as individual ports. Always verify that both ends agree on the aggregation method. Another common issue is STP interaction: if Spanning Tree does not recognise the LAG as a single logical interface, individual member ports may be blocked. Ensure the LAG is configured on the switch before cabling the links to avoid transient STP issues.
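On a Linux server, one way to verify the server side is to inspect the bonding driver's status file (for example `/proc/net/bonding/bond0`). The Python sketch below parses a heavily abbreviated, illustrative sample of that output rather than reading a live system. The sample text, the function names, and the interface names are assumptions for this example, and the checks are a starting point, not a complete diagnostic.

```python
SAMPLE_STATUS = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth0
MII Status: up
Aggregator ID: 1

Slave Interface: eth1
MII Status: up
Aggregator ID: 2
"""


def parse_bond_status(text):
    """Return (mode, members), where members maps interface name to its fields."""
    mode = None
    members = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field == "Bonding Mode":
            mode = value
        elif field == "Slave Interface":
            current = value
            members[current] = {}
        elif current is not None and field in ("MII Status", "Aggregator ID"):
            members[current][field] = value
    return mode, members


def check_bond(text):
    """Return a list of human-readable problems found in the bond status."""
    mode, members = parse_bond_status(text)
    problems = []
    if mode is None or "802.3ad" not in mode:
        problems.append(f"bond mode is {mode!r}, not 802.3ad")
    aggregator_ids = {info.get("Aggregator ID") for info in members.values()}
    if len(aggregator_ids) > 1:
        ids = ", ".join(sorted(str(a) for a in aggregator_ids))
        problems.append(f"members are in different aggregators: {ids}")
    for name, info in members.items():
        if info.get("MII Status") != "up":
            problems.append(f"{name} link is not up")
    return problems


for problem in check_bond(SAMPLE_STATUS):
    print("WARNING:", problem)
```

In the sample, both links are up but sit in different aggregators, which usually means they did not negotiate into the same bundle with the switch. To run the check against a real host, read the status file into a string and pass it to `check_bond`.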
Expecting per-packet load balancing is another pitfall. As discussed, LAG distributes flows, not packets. If you have a single client performing a large file copy to a NAS with a four-port LAG, that single flow will use only one link. You will see aggregate throughput increase only when multiple clients access the NAS concurrently. Similarly, mixing link speeds in a LAG is not supported — all member links must operate at the same speed and duplex. If a port auto-negotiates to a lower speed (e.g., due to a bad cable), LACP will exclude it from the bundle, which is actually a safety feature that prevents asymmetric forwarding.
Pros
- Increased aggregate bandwidth using existing switch ports and NICs
- Automatic failover if a link, cable, or port fails
- Industry-standard (IEEE 802.3ad/802.1AX) — works across vendors
- MLAG/vPC provides switch-level redundancy for critical workloads
- Simple configuration on most managed switches and server operating systems
Cons
- Single flow cannot exceed the speed of one member link
- Hash-based distribution can be uneven with few concurrent flows
- All member links must be the same speed and duplex
- MLAG requires same-vendor switch pairs and adds configuration complexity
- Does not replace proper network design — oversubscription still matters
LACP Configuration Best Practices
Follow these best practices to ensure reliable LAG deployments. Use LACP in active mode on both ends rather than static LAG wherever the remote device supports it, as LACP detects misconfiguration and failures that leave the link physically up. Use LACP fast timers (LACPDUs every 1 second instead of the default 30 seconds) where rapid failover is critical: a partner is declared lost after three missed LACPDUs, so fast timers cut the detection time for such failures from 90 seconds to 3 seconds. Configure the hash algorithm to use the widest tuple available (L3+L4 or five-tuple) for the best traffic distribution. When using MLAG, always provision a dedicated, high-bandwidth peer link and a separate out-of-band keepalive link to avoid split-brain scenarios.
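The timer figures follow from a simple rule: the LACP timeout is three times the LACPDU interval. A minimal Python illustration (the function name is made up for this example):

```python
LACP_TIMEOUT_MULTIPLIER = 3  # partner is timed out after three missed LACPDUs


def detection_time_seconds(lacpdu_interval_seconds):
    """Worst-case time to notice that a partner has stopped sending LACPDUs."""
    return LACP_TIMEOUT_MULTIPLIER * lacpdu_interval_seconds


print(detection_time_seconds(30))  # 90 (slow timers, the default)
print(detection_time_seconds(1))   # 3  (fast timers)
```

This applies to failures where the physical link stays up; a port that loses link is normally removed from the bundle as soon as the link-down event is seen.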
Link aggregation is the duct tape of networking — simple, versatile, and effective. It will not solve every bandwidth problem, but it solves the most common ones elegantly.