Chapter 5. The Internet Protocol (IP)

IP is the workhorse protocol of the TCP/IP protocol suite. TCPv1

Introduction

IP provides a best-effort, connectionless datagram delivery service. When something goes wrong, such as a router temporarily running out of buffers, IP simly throws away some data. Any required reliability must be provided by the upper layers (e.g. TCP). IPv4 and IPv6 both use this basic best-effort delivery model.

The term connectionless means that IP does not maintain any connection state information about related datagrams within the network elements (within the routers):

This chapter is on IPv4 and IPv6 header fields, and describes how IP forwarding works.

IPv4 and IPv6 Headers

IPv4 Header *

The IPv4 datagram. The header is of variable size, limited to fifteen 32-bit words (60 bytes) by the 4-bit IHL field. A typical IPv4 header contains 20 bytes (no options). The source and destination addresses are 32 bits long. Most of the second 32-bit word is used for the IPv4 fragmentation function. A header checksum helps ensure that the fields in the header are delivered correctly to the proper destination but does not protect the data.

IPv6 Header *

The IPv6 header is of fixed size (40 bytes) and contains 128-bit source and destination addresses. The Next Header field is used to indicate the presence and types of additional extension headers that follow the IPv6 header, forming a daisy chain of headers that may include special extensions or processing directives. Application data follows the header chain, usually immediately following a transport-layer header.

Size and network byte order *

In our pictures of headers and datagrams, for a 32-bit value, the most significant bit is numbered 0 at the left, and the least significant bit of a 32-bit value is numbered 31 on the right. The 4 bytes in a 32-bit value are transmitted in the following order: bits 0–7 first, then bits 8–15, then 16–23, and bits 24–31 last. This is called big endian byte ordering, which is the byte ordering required for all binary integers in the TCP/IP headers as they traverse a network. It is also called network byte order. Computer CPUs that store binary integers in little endian format must convert the header values into network byte order for transmission and back again for reception.

IP Header Fields

The Internet Checksum

The Internet checksum is a 16-bit mathematical sum used to determine, with reasonably high probability, whether a received message or portion of a message matches the one sent. the Internet checksum algorithm is not the same as the common cyclic redundancy check (CRC), which offers stronger protection.

To compute the IPv4 header checksum for an outgoing datagram, the value of the datagram’s Checksum field is first set to 0. Then, the 16-bit one’s complement sum of the header is calculated (the entire header is considered a sequence of 16-bit words). The 16-bit one’s complement of this sum is then stored in the Checksum field to make the datagram ready for transmission.

When an IPv4 datagram is received, a checksum is computed across the whole header, including the value of the Checksum field itself. Assuming there are no errors, the computed checksum value is always 0 (a one’s complement of the value FFFF). The value of the Checksum field in the packet can never be FFFF. If it were, the sum (prior to the final one’s complement operation at the sender) would have to have been 0. No sum can ever be 0 using one’s complement addition unless all the bytes are 0. (end-round carry)

When the header is found to be bad (the computed checksum is nonzero), the IPv4 implementation discards the received datagram. No error message is generated. It is up to the higher layers to somehow detect the missing datagram and retransmit if necessary.

Mathematics of the Internet Checksum

For the mathematically inclined, the set of 16-bit hexadecimal values V = {0001, . . . , FFFF} and the one’s complement sum operation + together form an Abelian group. The following properties are obeyed:

Note that in the set V and the group <V,+>, number 0000 deleted the from consideration. If we put the number 0000 in the set V, then <V,+> is not a group any longer. [p187-188]

DS Field and ECN

The third and fourth fields of the IPv4 header (second and third fields of the IPv6 header) are the Differentiated Services (called DS Field) and ECN fields, formerly called the ToS Byte or IPv6 Traffic Class.

Differentiated Services (called DiffServ) is a framework and set of standards aimed at supporting differentiated classes of service (beyond just best-effort) on the Internet. IP datagrams that are marked in certain ways may be forwarded differently (e.g., with higher priority) and can lead to increased or decreased queuing delay in the network and other special effects (possibly with associated special fees imposed by an ISP). [p188]

The Differentiated Services Code Point (DSCP) is a number (in the DS Field) that refers to a particular predefined arrangement of bits with agreed-upon meaning. Datagrams have a DSCP assigned to them when they are given to the network infrastructure that remains unmodified during delivery ,but policies (such as how many high-priority packets are allowed to be sent in a period of time) may cause a change in DSCP during delivery. [p188]

The pair of ECN bits marks a datagram with a congestion indicator when passing through a router that has a significant amount of internally queued traffic. Both bits are set by persistently congested ECN-aware routers when forwarding packets. When a marked packet is received at the destination, some protocol (such as TCP) will notice that the packet is marked and indicate this fact back to the sender, which would then slow down, thereby easing congestion before a router is forced to drop traffic because of overload. This mechanism is one of several aimed at avoiding or dealing with network congestion.

(Original uses for the ToS and Traffic Class skipped) [p188-189]

The 6-bit DS Field holds the DSCP, providing support for 64 distinct code points. The particular value of the DSCP, expressed as per-hop behavior (PHB), tells a router the forwarding treatment or special handling the datagram should receive. The default value for the DSCP is generally 0, which corresponds to routine, best-effort Internet traffic.

The DS Field contains the DSCP in 6 bits (5 bits are currently standardized to indicate the forwarding treatment the datagram should receive when forwarded by a compliant router). The following 2 bits are used for ECN and may be turned on in the datagram when it passes through a persistently congested router. When such datagrams arrive at their destinations, the congestion indication is sent back to the source in a later datagram to inform the source that its datagrams are passing through one or more congested routers.

As indicated in the table below, the DSCP values are divided into three pools: standardized, experimental/local use (EXP/LU), and experimental/local use.

Pool Code Point Prefix Policy
1 xxxxx0 Standards
2 xxxx11 EXP/LU
3 xxxx01 EXP/LU(*)

A router is to first segregate traffic into different classes. Traffic within a common class may have different drop probabilities, allowing the router to decide what traffic to drop first if it is forced to discard traffic.

The following table indicates the class selector DSCP values:

Name Value Reference Description
CS0 000000 [RFC2474] Class selector (best-effort/routine)
CS1 001000 [RFC2474] Class selector (priority)
CS2 010000 [RFC2474] Class selector (immediate)
CS3 011000 [RFC2474] Class selector (flash)
CS4 100000 [RFC2474] Class selector (flash override)
CS5 101000 [RFC2474] Class selector (CRITIC/ECP)
CS6 110000 [RFC2474] Class selector (internetwork control)
CS7 111000 [RFC2474] Class selector (control)
AF11 001010 [RFC2597] Assured Forwarding (class 1,dp 1)
AF12 001100 [RFC2597] Assured Forwarding (1,2)
AF13 001110 [RFC2597] Assured Forwarding (1,3)
AF21 010010 [RFC2597] Assured Forwarding (2,1)
AF22 010100 [RFC2597] Assured Forwarding (2,2)
AF23 010110 [RFC2597] Assured Forwarding (2,3)
AF31 011010 [RFC2597] Assured Forwarding (3,1)
AF32 011100 [RFC2597] Assured Forwarding (3,2)
AF33 011110 [RFC2597] Assured Forwarding (3,3)
AF41 100010 [RFC2597] Assured Forwarding (4,1)
AF42 100100 [RFC2597] Assured Forwarding (4,2)
AF43 100110 [RFC2597] Assured Forwarding (4,3)
EF PHB 101110 [RFC3246] Expedited Forwarding
VOICE-ADMIT 101100 [RFC5865] Capacity-Admitted Traffic

IP Options

IP options may be selected on a per-datagram basis. Many of the options are no longer practical or desirable because of the limited size of the IPv4 header or concerns regarding security. With IPv6, most of the options have been removed or altered and are in the basic IPv6 header but are placed after the IPv6 header in one or more extension headers.

An IP router that receives a datagram containing options should perform special processing. In some cases IPv6 routers process extension headers, but many headers are designed to be processed only by end hosts. In some routers, datagrams with options or extensions are not forwarded as fast as ordinary datagrams.

The table shows most of the IPv4 options that have been standardized over the years.

Options, if present, are carried in IPv4 packets immediately after the basic IPv4 header. Options are identified by an 8-bit option Type field. This field is subdivided into three subfields: Copy (1 bit), Class (2 bits), and Number (5 bits). Options 0 and 1 are a single byte long, and most others are variable in length. Variable options consist of 1 byte of type identifier, 1 byte of length, and the option itself.

The options area always ends on a 32-bit boundary. Pad bytes with a value of 0 are added if necessary. This ensures that the IPv4 header is always a multiple of 32 bits (as required by the IHL field). [p192]

Options are identified by an 8-bit option Type field. This field is subdivided into three subfields: Copy (1 bit), Class (2 bits), and Number (5 bits). Options 0 and 1 are a single byte long, and most others are variable in length. Variable options consist of 1 byte of type identifier, 1 byte of length, and the option itself.

Most of the standardized options are rarely or never used in the Internet today. In addition, the options are primarily for diagnostic purposes and make the construction of firewalls more cumbersome and risky. Thus, IPv4 options are typically disallowed or stripped at the perimeter of enterprise networks by firewalls. (Chapter 7)

Within enterprise networks, where the average path length is smaller and protection from malicious users may be less of a concern, options can still be useful. In addition, since the Router Alert option is designed primarily as a performance optimization and does not change fundamental router behavior, it is permitted more often than the other options. Some router implementations have a highly optimized internal pathway for forwarding IP traffic containing no options. The Router Alert option informs routers that a packet requires processing beyond the conventional forwarding algorithms. The experimental Quick-Start option at the end of the table is applicable to both IPv4 and IPv6.

IPv6 Extension Headers

In IPv6, special functions such as those provided by options in IPv4 can be enabled by adding extension headers that follow the IPv6 header. IPv6 header is fixed at 40 bytes, and extension headers are added only when needed. [p194]

In choosing the IPv6 header to be of a fixed size, and requiring that extension headers be processed only by end hosts (with one exception), design and construction of high-performance routers are easier because the demands on packet processing at routers can be simpler than with IPv4.

Extension headers, along with headers of higher-layer protocols such as TCP or UDP, are chained together with the IPv6 header to form a cascade of headers (see the figure below). The Next Header field in each header indicates the type of the subsequent header, which could be an IPv6 extension header or some other type. The value of 59 indicates the end of the header chain. The most possible values for the Next Header field are provided in the following table.

IPv6 headers form a chain using the Next Header field. Headers in the chain may be IPv6 extension headers or transport headers. The IPv6 header appears at the beginning of the datagram and is always 40 bytes long.

This figure shows IPv6 headers form a chain using the Next Header field. Headers in the chain may be IPv6 extension headers or transport headers. The IPv6 header appears at the beginning of the datagram and is always 40 bytes long.

The values for the IPv6 Next Header field may indicate extensions or headers for other protocols. The same values are used with the IPv4 Protocol field, where appropriate.

This table show values for the IPv6 Next Header field may indicate extensions or headers for other protocols. The same values are used with the IPv4 Protocol field, where appropriate. The IPv6 extension header mechanism distinguishes some functions (e.g., routing and fragmentation) from options.

IPv6 Options

IPv6 options, if present, are grouped into either of the following:

Hopby-Hop Options (called HOPOPTs) are the only ones that need to be processed by every router a packet encounters. The format for encoding options within the Hop-by-Hop and Destination Options extension headers is common.

The Hop-by-Hop and Destination Options headers are capable of holding more than one option. Each of these options is encoded as type-length-value (TLV) sets, as shown below:

Hop-by-hop and Destination Options are encoded as TLV sets. The first byte gives the option type, including subfields indicating how an IPv6 node should behave if the option is not recognized, and whether the option data might change as the datagram is forwarded. The Opt Data Len field gives the size of the option data in bytes.

In the TLV sets, the first byte gives the option type, including subfields indicating how an IPv6 node should behave if the option is not recognized, and whether the option data might change as the datagram is forwarded. The Opt Data Len field gives the size of the option data in bytes.

The 2 high-order bits in an IPv6 TLV option type indicate whether an IPv6 node should forward or drop the datagram if the option is not recognized, and whether a message indicating the datagram’s fate should be sent back to the sender, as shown in the table below:

The 2 high-order bits in an IPv6 TLV option type indicate whether an IPv6 node should forward or drop the datagram if the option is not recognized, and whether a message indicating the datagram’s fate should be sent back to the sender.

Options in IPv6 are carried in either Hop-by-Hop (H) or Destination (D) Options extension headers. The option Type field contains the value from the "Type" column with the Action and Change subfields denoted in binary. The "Length" column contains the value of the Opt Data Len byte. See the table below:

Options in IPv6 are carried in either Hop-by-Hop (H) or Destination (D) Options extension headers. The option Type field contains the value from the “Type” column with the Action and Change subfields denoted in binary. The “Length” column contains the value of the Opt Data Len byte from Figure 5-7. The Pad1 option is the only one lacking this byte.

[p196-197]

Pad1 and PadN

IPv6 options are aligned to 8-byte offsets, so options that are naturally smaller are padded with 0 bytes to round out their lengths to the nearest 8 bytes. [p197]

IPv6 Jumbo Payload

In some TCP/IP networks, such as those used to interconnect supercomputers, the normal 64KB limit on the IP datagram size can lead to unwanted overhead when moving large amounts of data. The IPv6 Jumbo Payload option specifies an IPv6 datagram with payload size larger than 65,535 bytes, called a jumbogram. This option need not be implemented by nodes attached to links with MTU sizes below 64KB. The Jumbo Payload option provides a 32-bit field for holding the payload size for datagrams with payloads of sizes between 65,535 and 4,294,967,295 bytes (4 GB).

When a jumbogram is formed for transmission, its normal Payload Length field is set to 0. The TCP protocol makes use of the Payload Length field in order to compute its checksum using the Internet checksum algorithm described previously. When the Jumbo Payload option is used, TCP must be careful to use the length value from the option instead of the regular Length field in the base header. [p198]

Tunnel Encapsulation Limit

Tunneling refers to the encapsulation of one protocol in another that does not conform to traditional layering. For example, IP datagrams may be encapsulated inside the payload portion of another IP datagram.

Using Tunnel Encapsulation Limit option, a sender can specify a limit to have control over how many tunnel levels are ultimately used for encapsulation. Using this option.

Router Alert

The Router Alert option indicates that the datagram contains information that needs to be processed by a router. It is used for the same purpose as the IPv4 Router Alert option.

Quick-Start

The Quick-Start (QS) option is used in conjunction with the experimental QuickStart procedure for TCP/IP specified in [RFC4782]. It is applicable to both IPv4 and IPv6 but at present is suggested only for private networks and not the global Internet. [p199]

CALIPSO

This option is used for supporting the Common Architecture Label IPv6 Security Option (CALIPSO) [RFC5570] in certain private networks.

Home Address

This option holds the "home" address of the IPv6 node sending the datagram when IPv6 mobility options are in use. Mobile IP (Section 5.5) specifies a set of procedures for handling IP nodes that may change their point of network attachment without losing their higher-layer network connections. [p199]

Routing Header

The IPv6 Routing header provides a mechanism for the sender of an IPv6 datagram to control the path the datagram takes through the network. Two different versions of the routing extension header have been specified: type 0 (RH0) and type 2 (RH2):

To best understand the Routing header, we begin by discussing RH0 and then investigate why it has been deprecated and how it differs from RH2. RH0 specifies one or more IPv6 nodes to be "visited" as the datagram is forwarded.

The now-deprecated Routing header type 0 (RH0) generalizes the IPv4 loose and strict Source Route and Record Route options. It is constructed by the sender to include IPv6 node addresses that act as waypoints when the datagram is forwarded. Each address can be specified as a loose or strict address. A strict address must be reached by a single IPv6 hop, whereas a loose address may contain one or more other hops in between. The IPv6 Destination IP Address field in the base header is modified to contain the next waypoint address as the datagram is forwarded.

The IPv6 Routing header shown below generalizes the loose Source and Record Route options from IPv4. RH0 allows the sender to specify a vector of IPv6 addresses for nodes to be visited. [p200-201]

A Routing header is not processed until it reaches the node whose address is contained in the Destination IP Address field of the IPv6 header. At this time, the Segments Left field is used to determine the next hop address from the address vector, and this address is swapped with the Destination IP Address field in the IPv6 header. Thus, as the datagram is forwarded, the Segments Left field grows smaller, and the list of addresses in the header reflects the node addresses that forwarded the datagram.

The forwarding procedure is shown in the below figure:

Using an IPv6 Routing header (RH0), the sender (S) is able to direct the datagram through the intermediate nodes R2 and R3 . The other nodes traversed are determined by the normal IPv6 routing. Note that the destination address in the IPv6 header is updated at each hop specified in the Routing header.

  1. The sender (S) constructs the datagram with destination address R1 and a Routing header (type 0) containing the addresses R2, R3, and D. The final destination of the datagram is the last address in the list (D). The Segments Left field (labeled "Left") starts at 3.
  2. The datagram is forwarded toward R1 automatically by S and R0 . Because R0's address is not present in the datagram, no modifications of the Routing header or addresses are performed by R0 .
  3. Upon reaching R1, the destination address from the base header is swapped with the first address listed in the Routing header and the Segments Left field is decremented.
  4. As the datagram is forwarded, the process of swapping the destination address with the next address from the address list in the Routing header repeats until the last destination listed in the Routing header is reached.

RH0 has been deprecated by [RFC5095] because of a security concern that allows RH0 to be used to increase the effectiveness of DoS attacks. The problem is that RH0 allows the same address to be specified in multiple locations within the Routing header. This can lead to traffic being forwarded many times between two or more hosts or routers along a particular path. The potentially high traffic loads that can be created along particular paths in the network can cause disruption to other traffic flows competing for bandwidth across the same path. Consequently, RH0 has been deprecated and only RH2 remains as the sole Routing header supported by IPv6. RH2 is equivalent to RH0 except it has room for only a single address and uses a different value in the Routing Type field.

Fragment Header

The Fragment header is used by an IPv6 source when sending a datagram larger than the path MTU of the datagram’s intended destination. (Path MTU and how it is determined are detailed in Chapter 13). 1280 bytes is a network-wide link-layer minimum MTU for IPv6 [RFC2460].

The Fragment header includes the same information as is found in the IPv4 header, but the Identification field is 32 bits instead of the 16 that are used for IPv4. The larger field provides the ability for more fragmented packets to be outstanding in the network simultaneously. The Fragment header uses the format shown in the figure below:

The IPv6 Fragment header contains a 32-bit Identification field (twice as large as the Identification field in IPv4). The M bit field indicates whether the fragment is the last of an original datagram. As with IPv4, the Fragment Offset field gives the offset of the payload into the original datagram in 8-byte units.

The datagram serving as input to the fragmentation process is called the "original packet" and consists of two parts: the "unfragmentable part" and the "fragmentable part":

When the original packet is fragmented, multiple fragment packets are produced, each of which contains a copy of the unfragmentable part of the original packet followed by the Fragment header. In each fragmented IPv6 packet:

The following example illustrates the way an IPv6 source might fragment a datagram:

 An example of IPv6 fragmentation where a 3960-byte payload is split into three fragment packets of size 1448 bytes or less. Each fragment contains a Fragment header with the identical Identification field. All but the last fragment have the More Fragments field (M) set to 1. The offset is given in 8-byte units—the last fragment, for example, contains data beginning at offset (362 * 8) = 2896 bytes from the beginning of the original packet’s data. The scheme is similar to fragmentation in IPv4.

In the figure above, a payload of 3960 bytes is fragmented such that no fragment’s total packet size exceeds 1500 bytes (a typical MTU for Ethernet), yet the fragment data sizes still are arranged to be multiples of 8 bytes. [p204-205]

The receiver must ensure that all fragments of an original datagram have been received before performing reassembly. As with fragmentation in IPv4 (Chapter 10), fragments may arrive out of order at the receiver but are reassembled in order to form a datagram that is given to other protocols for processing.

[p205-208]

(Wireshark example skipped)

IP Forwarding

IP forwarding is simple, especially for a host:

A host differs from a router in how IP datagrams are handled: a host never forwards datagrams it does not originate, whereas routers do.

The IP protocol can receive a datagram from either of the following:

The IP layer has some information in memory, called a routing table or forwarding table, which it searches each time it receives a datagram to send.

When a datagram is received from a network interface, IP first checks if the destination IP address is one of its own IP addresses (one of the IP addresses associated with one of its network interfaces) or some other address for which it should receive traffic such as an IP broadcast or multicast address:

Forwarding Table

The data in a forwarding table is up to the implementation, though several key pieces of information are generally required to implement the forwarding table for IP. Each entry in the routing or forwarding table contains (at least conceptually) the following information fields:

IP forwarding is performed on a hop-by-hop basis. The routers and hosts do not contain the complete forwarding path to any destination, except those destinations that are directly connected to the host or router. IP forwarding provides the IP address of only the next-hop entity to which the datagram is sent. The following assumption are made:

The job of ensuring correctness of the routing table is given to one or more routing protocols. Many different routing protocols are available to do this job, including RIP, OSPF, BGP, and IS-IS.

IP Forwarding Actions

When the IP layer in a host or router needs to send an IP datagram to a next-hop router or host, it first examines the destination IP address (D) in the datagram Using the value D, the following longest prefix match algorithm is executed on the forwarding table:

  1. Search the table for all entries (masking and comparing) for which the following property holds:

    (D & mj) = dj

    Where:

    • mj is the value of the mask field associated with the forwarding entry ej having index j
    • dj is the value of the destination field associated with ej.

    This means that the destination IP address D is bitwise ANDed with the mask in each forwarding table entry (mj), and the result is compared against the destination in the same forwarding table entry (dj). If the property holds, the entry (ej here) is a "match" for the destination IP address. When a match happens, the algorithm notes the entry index (j here) and how many bits in the mask mj were 1. The more bits are 1, the "better" the match.

  2. The best matching entry ek (the one with the largest number of 1 bits in its mask mj) is selected, and its next-hop field nmk is used as the next-hop IP address in forwarding the datagram.

If no matches in the forwarding table are found, the datagram is undeliverable:

In some circumstances, more than one entry may match an equal number of 1 bits: for example, when more than one default route is available (e.g., when attached to more than one ISP, called multihoming). A common behavior is for the system to simply choose the first match. More sophisticated systems may attempt to load-balance or split traffic across the multiple routes. Studies suggest that multihoming can be beneficial not only for large enterprises, but also for residential users. [p210]

Examples of IP Forwarding

There are two cases of IP forwarding:

Direct Delivery

Host S (with IPv4 address S and MAC address S) has an IP datagram to send to Host D (IPv4 address D, MAC address D). Both hosts are on the same Ethernet (front cover) and are interconnected using a switch. [p210]

The top part of the following figure shows the delivery of the datagram and the forwarding table on S to contain the information as shown in the following table:

Direct delivery does not require the presence of a router—IP datagrams are encapsulated in a link-layer frame that directly identifies the source and destination. Indirect delivery involves a router—data is forwarded to the router using the router’s link-layer address as the destination link-layer address. The router’s IP address does not appear in the IP datagram (unless the router itself is the source or destination, or when source routing is used).

Destination Mask Gateway (Next Hop) Interface
0.0.0.0 0.0.0.0 10.0.0.1 10.0.0.100
10.0.0.0 255.255.255.128 10.0.0.100 10.0.0.100

The (unicast) IPv4 forwarding table at host S contains only two entries. Host S is configured with IPv4 address and subnet mask 10.0.0.100/25. Datagrams destined for addresses in the range 10.0.0.1 through 10.0.0.126 use the second forwarding table entry and are sent using direct delivery. All other datagrams use the first entry and are given to router R with IPv4 address 10.0.0.1.

Direct delivery does not require the presence of a router: IP datagrams are encapsulated in a link-layer frame that directly identifies the source and destination. In the above forwarding table, the destination IPv4 address D (10.0.0.9) matches both the first and second forwarding table entries. Because it matches the second entry better (25 bits instead of none), the "gateway" or next-hop address is 10.0.0.100, the address S. Thus, the gateway portion of the entry contains the address of the sending host’s own network interface (no router is referenced), indicating that direct delivery is to be used to send the datagram.

The datagram is encapsulated in a lower-layer frame destined for the target host D. If the lower-layer address of the target host is unknown, the ARP protocol (for IPv4; Chapter 4) or Neighbor Solicitation (for IPv6; Chapter 8) operation may be invoked at this point to determine the correct lower-layer address, D. Once known, the destination address in the datagram is D’s IPv4 address (10.0.0.9), and D is placed in the Destination IP Address field in the lower-layer header. The switch delivers the frame to D based solely on the link-layer address D; it pays no attention to the IP addresses.

Indirect Delivery

Host S has an IP datagram to send to the Host D (ftp.uu.net), whose IPv4 address is 192.48.96.9. The bottom part of the Figure 5-16 shows the conceptual path of the datagram through four routers. Host S searches its forwarding table but does not find a matching prefix on the local network. It uses its default route entry (which matches every destination, but with no 1 bits at all).

The IP addresses correspond to the source and destination hosts, but the lower-layer addresses do not. The lower-layer addresses determine which machines receive the frame containing the datagram on a per-hop basis.

In this example:

The following table is the forwarding table of R1:

Destination Mask Gateway (Next Hop) Interface Note
0.0.0.0 0.0.0.0 70.231.159.254 70.231.132.85 NAT
10.0.0.0 255.255.255.128 10.0.0.100 10.0.0.1 NAT

The forwarding table at R1 indicates that address translation should be performed for traffic. The router has a private address on one side (10.0.0.1) and a public address on the other (70.231.132.85). Address translation is used to make datagrams originating on the 10.0.0.0/25 network appear to the Internet as though they had been sent from 70.231.132.85.

IPv6 uses a slightly different mechanism (Neighbor Solicitation messages) from IPv4 (which uses ARP) to ascertain the lower-layer address of its next hop (Chapter 8). In addition, IPv6 has both link-local addresses and global addresses (Chapter 2). While global addresses behave like regular IP addresses, link-local addresses can be used only on the same link. In addition, because all the link-local addresses share the same IPv6 prefix (fe80::/10), a multihomed host may require user to determine which interface to use when sending a datagram destined for a link-local destination.

[p213-214]

To see the path taken to an IP destination, we can use the traceroute program:

Linux% traceroute -n ftp.uu.net
traceroute to ftp.uu.net (192.48.96.9), 30 hops max, 38 byte packets
 1 70.231.159.254 9.285 ms 8.404 ms 8.887 ms
 2 206.171.134.131 8.412 ms 8.764 ms 8.661 ms
 3 216.102.176.226 8.502 ms 8.995 ms 8.644 ms
 4 151.164.190.185 8.705 ms 8.673 ms 9.014 ms
 5 151.164.92.181 9.149 ms 9.057 ms 9.537 ms
 6 151.164.240.134 9.680 ms 10.389 ms 11.003 ms
 7 151.164.41.10 11.605 ms 37.699 ms 11.374 ms
 8 12.122.79.97 13.449 ms 12.804 ms 13.126 ms
 9 12.122.85.134 15.114 ms 15.020 ms 13.654 ms
 MPLS Label=32307 CoS=5 TTL=1 S=0
10 12.123.12.18 16.011 ms 13.555 ms 13.167 ms
11 192.205.33.198 15.594 ms 15.497 ms 16.093 ms
12 152.63.57.102 15.103 ms 14.769 ms 15.128 ms
13 152.63.34.133 77.501 ms 77.593 ms 76.974 ms
14 152.63.38.1 77.906 ms 78.101 ms 78.398 ms
15 207.18.173.162 81.146 ms 81.281 ms 80.918 ms
16 198.5.240.36 77.988 ms 78.007 ms 77.947 ms
17 198.5.241.101 81.912 ms 82.231 ms 83.115 ms

This program lists each of the IP hops traversed while sending a series of datagrams to the destination ftp.uu.net (192.48.96.9). The traceroute program uses a combination of UDP datagrams (with increasing TTL over time) and ICMP messages (used to detect each hop when the UDP datagrams expire) to accomplish its task. Three UDP packets are sent at each TTL value, providing three roundtrip-time measurements to each hop. The following line indicates that Multiprotocol Label Switching (MPLS) [RFC3031] is being used.

MPLS Label=32307 CoS=5 TTL=1 S=0

MPLS is a form of link-layer network capable of carrying multiple network-layer protocols. Many network operators use it for traffic engineering purposes (controlling where network traffic flows through their networks). [p215]

Discussion (IP Forwarding)

Key points regarding the operation of IP unicast forwarding:

  1. Most of the hosts and routers in this example used a default route consisting of a single forwarding table entry of this form: mask 0, destination 0, next hop <some IP address>. Indeed, most hosts and most routers at the edge of the Internet can use a default route for everything other than destinations on local networks because there is only one interface available that provides connectivity to the rest of the Internet.
  2. The source and destination IP addresses in the datagram never change once in the regular Internet. This is always the case unless either source routing is used, or when other functions (such as NAT, as in the example) are encountered along the data path. Forwarding decisions at the IP layer are based on the destination address.
  3. A different lower-layer header is used on each link that uses addressing, and the lower-layer destination address (if present) always contains the lower-layer address of the next hop. Therefore, lower-layer headers routinely change as the datagram is moved along each hop toward its destination. In our example, both Ethernet LANs encapsulated a link-layer header containing the next hop’s Ethernet address, but the DSL link did not. Lower-layer addresses are normally obtained using ARP (see Chap

Mobile IP

(skipped)

[p215-220]

Host Processing of IP Datagrams

Host Models