Data Transfer Across the Internet - II
This is part 2 of the article on data transfer across the Internet. You may wish to read Part 1 here.
The previous section covered several important topics. Here's a brief summary before we proceed:
We introduced the OSI Model, which describes the implementation of networking as 7 layers with a hierarchical relationship. Although the OSI Model is the most general networking model, the working of the Internet can be thought of in terms of a simpler, 4 or 5 layer model. This simplified model consists of:
- Application Layer: which includes the Application, Presentation and part of the Session Layer of the OSI model. How much of the Session Layer is included depends to some degree upon how smart switches are, that is, how much of the work of these layers you can offload to the switch.
- Transport Layer: corresponds to the Transport Layer of the OSI model. This is what actually packetizes the data received from the Application and other layers above it. Depending on the protocol, it can produce several different kinds of packets, such as TCP packets, UDP packets, etc.
- Network Layer: corresponds to the Network Layer of the OSI model. This layer receives TCP or UDP packets (or other kinds of packets generated by the Transport Layer), fragments them according to various schemes, sometimes based on Path MTU Discovery, stamps the source and destination addresses, and repackages them into Network Layer packets. On the Internet, this usually means IP Packets.
- Link Layer: corresponds to the Data-Link Layer of the OSI model. This layer receives the IP packets and then repackages them into frames, depending upon the actual physical link used (ethernet over copper, fiber, microwaves, whatever). Sometimes people include the Physical Layer of the OSI model in this layer as well.
- Physical Layer: corresponds to the Physical Layer in the OSI model. Listed separately, unless you want to include it with the Link Layer above. This is the actual physical medium that transmits the information.
We talked about different kinds of devices used on the Internet, including hosts, switches, routers, bridges, gateways, etc. We differentiated between devices based on which layers of the OSI model they deal with.
We covered the IP Packet in detail. The IP Packet is the basic unit of information across the Internet, a packet that can be independently routed from source to destination.
We talked about fragmentation at the level of IP packets, how IP packet size should ideally match the MTU across all hops on a route in order to minimize fragmentation and ensure the highest possible data density.
If you are unfamiliar with any of these concepts, you might want to review Part 1 before proceeding further.
Encapsulation
So far, we've talked about the IP packet, which is the basic unit of data transmission across the Internet. We saw how IP packets actually contain other types of packetized data, for example, they may contain TCP data or UDP data. We saw that IP packets are themselves repackaged by layers below.

All of these things reflect a common concept used in dealing with packetized data, which is encapsulation. Encapsulation simply means taking a packet produced by a certain layer, and then repackaging it in the layer below.
As we've seen before, repackaging can include fragmentation, for example when the Network Layer repackages TCP packets into IP packets. The same thing happens on the layer below, where IP packets are often repackaged into Ethernet frames in order to send them to the Internet gateway.
This scheme of packaging one kind of packet into another (with suitable fragmentation when needed) is called encapsulation. The figure on the right shows a diagrammatic representation of the process.
In reality, things can be much more complex, since we haven't really talked about what happens to the packet once it leaves your cable modem. From there it is the telco's responsibility, but be aware that telcos use a variety of media and topologies to transfer data. In a switching center, they might use multi-gigabit Ethernet (probably with Jumbo Frames). They have fiber, coaxial cable, copper, etc. They have local feeders going to individual homes, aggregation sites where several such pipes combine, trunks leading to their switching centers, backbones connecting various cities, etc. Not all of these links have to use cables or wires; some might be microwave links. These are all generally Link Layer devices, so they may operate on various Link Layer protocols specific to the physical medium. However, at the end of it all, your IP packet will be assembled and forwarded to the host at the other end, intact. That is to say, the host at the other end will receive your data in the form of IP packets.
If you wish to see an example of encapsulation in action, you can use a packet sniffer on your own computer to examine the traffic on your machine. There's a lot of packet capture/analysis software available on the Internet. I would highly recommend Wireshark, a free open-source packet analyzer with lots of nifty features. If you don't want to bother with that now, here are a few screenshots showing how encapsulation works.
Encapsulation is also sometimes called "framing", which is really just a subtype of encapsulation that applies to certain types of packets such as Ethernet frames. Unlike TCP or IP packets, which contain a header at the beginning followed by data, an Ethernet packet is actually a "frame": a header at the beginning, followed by the data, and then a footer (trailer) at the end. The data is "framed", or surrounded, by the header at one end and the footer at the other. The principle is the same: a higher-level packet (say an IP packet) becomes the data (or Service Data Unit) of the Link Layer, and in turn acquires a new header.
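The nesting described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the headers are placeholder bytes of the right size (20 bytes for TCP and IP, 14 for Ethernet), not real protocol fields, and the helper name `wrap` is made up for this example. The 4-byte footer is computed with `zlib.crc32`, which uses the same CRC-32 polynomial as the Ethernet frame check sequence; the exact on-the-wire bit ordering is glossed over here.

```python
import struct
import zlib


def wrap(header: bytes, payload: bytes, trailer: bytes = b"") -> bytes:
    """Encapsulate: the upper layer's entire packet becomes this layer's data."""
    return header + payload + trailer


# Application Layer data (a minimal HTTP request, used only as sample bytes).
app_data = b"GET / HTTP/1.1\r\n\r\n"

# Placeholder headers: sizes are realistic, contents are NOT real header fields.
tcp_header = b"T" * 20
ip_header = b"I" * 20
eth_header = b"E" * 14  # destination MAC (6) + source MAC (6) + EtherType (2)

# Each layer treats whatever it receives from above as opaque data.
tcp_segment = wrap(tcp_header, app_data)
ip_packet = wrap(ip_header, tcp_segment)

# Framing: Ethernet adds a header in front AND a 4-byte checksum footer behind.
fcs = struct.pack("<I", zlib.crc32(eth_header + ip_packet))
frame = wrap(eth_header, ip_packet, fcs)

for name, unit in [("application data", app_data), ("TCP packet", tcp_segment),
                   ("IP packet", ip_packet), ("Ethernet frame", frame)]:
    print(f"{name}: {len(unit)} bytes")
```

Peeling the layers off in reverse order (`frame[14:-4]`, then `[20:]` twice) recovers the original application data, which is what the receiving host's layers do on the way up.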
The TCP Packet
As mentioned in the previous section, the two most common Transport Layer protocols on the Internet are TCP and UDP. Other Transport Layer protocols include SCTP (Stream Control Transmission Protocol), DCCP (Datagram Congestion Control Protocol), and many more.
TCP includes several features such as flow control, congestion avoidance, error detection, automatic retransmission of data that is lost, etc. For these reasons, it is very popular for delivering data of many different types. Most web browser based data (HTTP) is transferred through TCP packets, for example. Many popular protocols such as IRC (Internet Relay Chat), SSH (Secure Shell), Telnet, and FTP (File Transfer Protocol) also encapsulate their data in TCP packets.
The basic structure of a TCP packet is shown below.

As can be seen, it very much resembles the IP packet described earlier. Like the IP packet, the TCP packet also has a minimum 20 byte (160 bit) header, plus an optional part. The fields, of course, are very different, because it has different information stored in it. Following is a brief description of the fields in the TCP header, which is followed by a more detailed description of how these fields work.
| Source Port | This 16-bit field lists the source port for the data. Modern operating systems can sort through multiple streams of traffic arriving through their network cards simultaneously, based on port number. For example, a computer might be simultaneously loading web pages in a browser, transferring files through FTP or BitTorrent, connected to an IRC server, etc. Each of these applications can direct its own data to a specified port, and thereby separate its own stream from others. The IANA determines which ports should be used by specific applications. Some ports are considered well-known ports, such as port 23 for Telnet, port 25 for SMTP, port 80 for HTTP, etc. Other ports are registered with the IANA for specific applications, for example, port 1214 for Kazaa, 1220 for QuickTime, 1512 for WINS, 3074 for Xbox Live, etc. Still other ports may be unofficial standard ports (not approved by IANA, but commonly accepted for a certain service), such as 3724 for World of Warcraft, 30301 for BitTorrent, etc. And finally, applications may just pick a port of their own choosing to use, so long as it doesn't conflict with any well-known or registered ports. | ||
| Destination Port | This is the destination port for the data. Like the source port, it is a 16-bit field. | ||
| Sequence Number | The sequence number and acknowledgment number (below) are used together to keep tabs on what data has already been sent, what remains to be sent, as well as acknowledgments of data received. As mentioned earlier, a TCP stream may contain many thousands of packets (think of watching a movie online, or transferring a massive file). Sequence and acknowledgment numbers help the computers at the originating and receiving ends keep track of all those thousands of packets. They are ways of telling each other "this is the data I'm sending you now, this is how much data I've already sent", and "this is what I've received so far". | ||
| Acknowledgment Number | See Sequence Number above for a brief explanation. A more detailed explanation follows below in the section explaining how TCP works. | ||
| Data Offset | This is simply the header size, in units of 32-bit words. The name reflects the fact that if you apply this offset to a TCP packet, you will be at the start of the data. This is a 4-bit field, allowing for numbers up to 15. So the maximum TCP header size is 15 words, or 60 bytes. The minimum, as mentioned earlier, is 5 words, or 20 bytes, because all fields in the header (except the Options field) are required. | ||
| Reserved | This is an unused field reserved for future use. It should be set to 0. | ||
| Flags | TCP allows for delivery confirmation, congestion avoidance, flow control, etc. These processes are controlled through various flags. There are 8 flags, and each is 1 bit (so it can be either on or off). The flags are: | ||
| CWR | Congestion Window Reduced | This, along with ECN-Echo (ECE), is used for congestion reduction. | |
| ECE | ECN Echo | Used with CWR for congestion reduction. | |
| URG | Urgent | Indicates that the Urgent Pointer field is significant. | |
| ACK | Acknowledgment | Indicates that the Acknowledgment field is significant. | |
| PSH | Push | Used for the Push function. | |
| RST | Reset | Used to reset the connection. | |
| SYN | Synchronize | Used to synchronize sequence numbers. | |
| FIN | Finish | Finish, meaning no more data from the sender. | |
| Window Size | The window is used for flow control. This field refers to the receive window: the number of bytes, counted from the byte indicated in the Acknowledgment Number field, that the sender of this packet is currently willing to receive. | ||
| Checksum | This is used for error checking both the header and data. | ||
| Urgent Pointer | If the URG flag is set, then this field refers to the offset from the sequence number to the last urgent data byte. | ||
| Options | Options are not required, but some options are very commonly used in TCP packets. The options header must be some multiple of a whole word (32-bits), from 0 (meaning no options) to 320 bits (40 bytes). | ||
| 0 | End of Options List | indicates the end of the list of options | |
| 1 | No Operation | used for padding | |
| 2 | Maximum Segment Size | declares the largest segment the sender of this option is willing to receive, announced during the handshake | |
| 3 | Window Scale | a scale factor that can be applied to window size | |
| 4 | Selective Acknowledgment Permitted | indicates that the sender supports selective acknowledgment | |
| 5 | |||
| 6 | |||
| 7 | |||
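| 8 | Timestamp | carries two 4-byte timestamps (the sender's own value and an echo of the peer's), used for round-trip time measurement and for protection against wrapped sequence numbers | |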
| Options are used for enhancing TCP in various ways. Unlike the Options field in the IP packet header, the TCP options are quite frequently used. More details about how various options such as selective acknowledgment, setting maximum segment size, etc. work, are provided in later sections dealing with the operation of TCP. | |||
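To make the field layout above concrete, here is a minimal Python sketch that decodes a TCP header from raw bytes. The function names `parse_tcp_header` and `parse_options` and the sample packet values are made up for this illustration; the sketch assumes a well-formed packet, does not verify the checksum, and treats every option other than kinds 0 and 1 as a generic kind/length/value triple.

```python
import struct

# Flag names ordered from the least significant bit (FIN) to the most (CWR).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR"]


def parse_options(raw: bytes) -> list:
    """Walk the options area, returning (kind, value_bytes) pairs."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:          # End of Options List
            break
        if kind == 1:          # No Operation: a single padding byte
            i += 1
            continue
        if i + 1 >= len(raw):  # truncated option, stop quietly
            break
        length = raw[i + 1]    # length counts the kind and length bytes too
        if length < 2 or i + length > len(raw):
            break
        options.append((kind, raw[i + 2:i + length]))
        i += length
    return options


def parse_tcp_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header, then any options."""
    (src_port, dst_port, seq, ack, offset_reserved, flag_bits,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", packet[:20])
    header_len = (offset_reserved >> 4) * 4  # Data Offset counts 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": [n for i, n in enumerate(FLAG_NAMES) if flag_bits & (1 << i)],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
        "options": parse_options(packet[20:header_len]),
        "data": packet[header_len:],
    }


# A hand-built SYN packet (hypothetical values, checksum left at 0).
syn = struct.pack("!HHIIBBHHH", 49152, 80, 1000, 0, 6 << 4, 0x02, 65535, 0, 0)
syn += struct.pack("!BBH", 2, 4, 1460)  # option kind 2: MSS of 1460 bytes

parsed = parse_tcp_header(syn)
print(parsed["header_len"], parsed["flags"], parsed["options"])
```

Note how the Data Offset value of 6 words (24 bytes) is what tells the parser that 4 bytes of options follow the fixed 20-byte header, and where the data would begin.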
Data Transmission via TCP
Consider a case where a server needs to send 10 megabytes of data to a client. The application doing this transfer operates over TCP, therefore TCP packets will be used for sending this information. Let's go over the steps of how this will be done.
