Time: 5 hours
Concepts: What is TCP/IP? IP addresses, subnet masks and gateway addresses. TCP/UDP ports. IPv4 and IPv6. DNS and DHCP.
This is a very basic overview of the TCP/IP protocol: what it is and how it works. We will explain how addresses are assigned and how we create names that point to a numerical address.
TCP/IP stands for Transmission Control Protocol and Internet Protocol. It is a routable, nonproprietary protocol, independent of any vendor or software developer. It was developed by the U.S. Department of Defense (DOD) to allow connectivity from anywhere in the world. It is now the standard protocol used across the internet to connect all entities. It can be, and is, also used on internal private networks in schools, businesses and government agencies.
TCP/IP has two parts that work together.
TCP or Transmission Control Protocol decides how the data is assembled and sent across the network and then received and reassembled on the other end. The data is encapsulated into packets and then transmitted.
IP or Internet Protocol is the addressing of hosts on the network or on the internet. This is like a street address that lets other computers know where to send the data. Each computer has a unique IP address that is assigned to it. IP uses subnet masks to break up networks into smaller parts and gateways to tell traffic how to get from one network to another. These paths are called routes. The devices or software that tell data which route to take are called routers. Each router has a routing table that helps it decide which way to send the traffic over the network.
Like the OSI model we learned about in Computer Networking, TCP/IP has its own model, broken up into 4 parts or “layers”. Below we show how the TCP model matches up with the corresponding OSI model layers.
OSI Model (7 layers)      TCP Model (4 layers)
7 Application             4 Application
6 Presentation            4 Application
5 Session                 4 Application
4 Transport               3 Transport
3 Network                 2 Internet/Network
2 Data Link               1 Data Link
1 Physical                1 Data Link
The TCP Model
The Data Link layer is the layer that states how data is going to get from one node to the other using the physical connections.
The Internet/Network layer handles the connection and routing between different networks and is the layer where the IP protocol is put into play.
The Transport Layer is responsible for flow control, reliability and multiplexing.
The Application layer is what decides how applications communicate with each other. Things like email (SMTP), Websites (HTTP) and file transfer (FTP) are handled at this layer.
Introduction to TCP/IP
By Michael Lamont - Process Software
Now we will take a look at how electronic data is encapsulated into data packets that travel across copper wire (Ethernet), fiber, or wireless networks.
The Transport Layer utilizes two major protocols.
UDP (User Datagram Protocol)
TCP (Transmission Control Protocol)
UDP is a connectionless protocol that contains no reliability, flow-control, or error-recovery functions. Because of its simplicity, UDP headers contain fewer bytes and consume less network overhead than TCP. UDP is useful in situations where the reliability mechanisms of TCP are not necessary, such as in cases where a higher-layer protocol might provide error and flow control. Like TCP, UDP uses ports to deliver data to the right application, but unlike TCP it does not detect or resend lost data: UDP packets can be discarded before reaching their targets, and the sender is never told. UDP is useful when TCP would be too complex, too slow, or just unnecessary.
The UDP packet format contains four fields:
Source Port and Destination Port fields identify the endpoints of the connection.
Length field specifies the length of the header and data.
Checksum field allows packet integrity checking (optional).
UDP takes the message received from the layers above it in the OSI model and formats that message into UDP packets. The sending application sends the packets to a peer application on the receiving host. UDP requires no notification of receipt and does not use the three-way handshake (SYN, SYN-ACK, ACK, which we’ll discuss in more detail a bit later). Again, UDP is a much simpler protocol than TCP and is useful in situations where the reliability mechanisms of TCP are not necessary.
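The UDP header fields listed above can be illustrated with a short Python sketch. This is only an illustration, not part of the original lesson: the port numbers and payload are made-up values, and the checksum is simply left at zero rather than computed.

```python
import struct

# Each UDP header field is a 16-bit unsigned integer sent in network
# (big-endian) byte order, so the whole header is 8 bytes long.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port, dst_port, payload):
    """Prepend a UDP header to payload (checksum left as 0, i.e. not computed)."""
    length = UDP_HEADER.size + len(payload)  # header plus data, in bytes
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram):
    """Split a datagram back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(
        datagram[:UDP_HEADER.size]
    )
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


# Hypothetical example values: a random source port and destination port 53.
datagram = build_udp_datagram(50000, 53, b"hello")
fields = parse_udp_datagram(datagram)
print(fields)
```

Note how little there is to parse: no sequence numbers, acknowledgements or flags, which is exactly why UDP is cheaper than TCP.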
Transmission Control Protocol (TCP)
TCP is a connection-oriented Layer 4 protocol that provides full-duplex, acknowledged, and flow-controlled service to upper-layer protocols. It moves data in a continuous, unstructured byte stream. Sequence numbers identify bytes within that stream. TCP can also support numerous simultaneous upper-layer conversations.
Figure B shows the packet format.
The TCP packet format consists of these fields:
Source Port and Destination Port fields identify the endpoints of the connection.
Sequence Number field specifies the number assigned to the first byte of data in the current message. Under certain circumstances, it can also be used to identify an initial sequence number to be used in the upcoming transmission.
Acknowledgement Number field contains the value of the next sequence number that the sender of the segment is expecting to receive, if the ACK control bit is set. Note that the sequence number refers to the stream flowing in the same direction as the segment, while the acknowledgement number refers to the stream flowing in the opposite direction from the segment.
Data Offset field tells how many 32-bit words are contained in the TCP header. This information is needed because the Options field has variable length, so the header length is variable too.
Reserved field must be zero. This is for future use.
Flags field contains the various flags:
URG—Indicates that some urgent data has been placed.
ACK—Indicates that acknowledgement number is valid.
PSH—Indicates that data should be passed to the application as soon as possible.
RST—Resets the connection.
SYN—Synchronizes sequence numbers to initiate a connection.
FIN—Means that the sender of the flag has finished sending data.
Window field specifies the size of the sender's receive window.
Checksum field indicates whether the segment (header and data) was damaged in transit.
Urgent pointer field points to the first urgent data byte in the packet.
Options field specifies various TCP options.
Data field contains upper-layer information.
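As a rough sketch of how these fields sit in the fixed 20-byte part of the header, the following Python code unpacks a TCP header with the standard struct module. The sample segment is built by hand from hypothetical values (ports, sequence number, window); it is not captured traffic.

```python
import struct

# Fixed part of the TCP header: ports (2 x 16 bits), sequence and
# acknowledgement numbers (2 x 32 bits), data offset/reserved/flags (16 bits),
# window, checksum and urgent pointer (3 x 16 bits). Options would follow.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment):
    """Return the main header fields of a raw TCP segment as a dict."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = TCP_HEADER.unpack(segment[:20])
    data_offset = offset_and_flags >> 12  # header length in 32-bit words
    flags = [name for name, bit in FLAG_BITS.items() if offset_and_flags & bit]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": data_offset * 4,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "data": segment[data_offset * 4:],
    }


# Hypothetical SYN segment: data offset 5 (no options), only the SYN flag set.
sample = TCP_HEADER.pack(50000, 80, 1000, 0,
                         (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
header = parse_tcp_header(sample)
print(header)
```

The Data Offset value is what lets the parser find where the header ends and the data begins, which matters as soon as options are present.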
TCP makes up for IP's deficiencies by providing reliable, stream-oriented connections. The protocol suite gets its name because most TCP/IP protocols are based on TCP, which is in turn based on IP. TCP and IP are the twin pillars of TCP/IP. TCP adds a great deal of functionality to the IP service.
TCP almost always operates in full-duplex mode (two independent byte streams traveling in opposite directions). Only during the start and end of a connection will data be transferred in one direction and not the other. TCP uses segments to determine whether the receiving host is ready to receive the data.
When the sending TCP host wants to establish a connection, it sends a segment called a SYN to the peer TCP protocol running on the receiving host. The receiving TCP returns a segment called a SYN-ACK, which acknowledges the successful receipt of the SYN and carries the receiver's own SYN. The sending TCP sends another ACK segment and then proceeds to send the data. This exchange of control information is referred to as a three-way handshake.
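The handshake can be traced with a small simulation. This is a simplified sketch, not real networking code: the initial sequence numbers are arbitrary example values, and the second segment is modelled as carrying both the SYN and ACK flags, as TCP implementations do.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    # 1. The client proposes its initial sequence number (ISN).
    syn = ("client", "SYN", client_isn, None)
    # 2. The server acknowledges the SYN (which uses up one sequence number)
    #    and proposes its own ISN in the same segment.
    syn_ack = ("server", "SYN+ACK", server_isn, client_isn + 1)
    # 3. The client acknowledges the server's SYN; data can now flow.
    ack = ("client", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each acknowledgement number is the other side's sequence number plus one, which is how both hosts confirm they agree on where each byte stream starts.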
TCP packets are very complex and incorporate several mechanisms to ensure connection state, reliability, and flow control of data packets:
Streams: TCP data is organized as a stream of bytes, much like a file.
Reliable delivery: Sequence numbers are used to coordinate which data has been transmitted and received. TCP will arrange for retransmission if it determines that data has been lost.
Network adaptation: TCP will dynamically learn the delay characteristics of a network and adjust its operation to maximize throughput without overloading the network.
Flow control: TCP manages data buffers and coordinates traffic so its buffers will never overflow. Fast senders will be stopped periodically to keep up with slower receivers.
Round-trip time estimation: TCP continuously monitors the exchange of data packets, develops an estimate of how long it should take to receive an acknowledgement, and automatically retransmits if this time is exceeded.
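To make round-trip time estimation more concrete, here is a minimal sketch of a smoothed estimator. It assumes the smoothing constants commonly cited from RFC 6298 (1/8 and 1/4) and leaves out the clock granularity and minimum/maximum bounds that real TCP stacks apply; the sample values are invented.

```python
class RttEstimator:
    """Smoothed round-trip time estimator (simplified, RFC 6298 style)."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variation

    def __init__(self):
        self.srtt = None    # smoothed round-trip time, in seconds
        self.rttvar = None  # estimate of how much the RTT varies

    def update(self, sample):
        """Feed one measured RTT and return the new retransmission timeout."""
        if self.srtt is None:
            # First measurement: start from the sample itself.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.timeout()

    def timeout(self):
        """Retransmit if no acknowledgement arrives within this many seconds."""
        return self.srtt + 4 * self.rttvar


estimator = RttEstimator()
for measured in (0.100, 0.200, 0.120):  # hypothetical RTT samples in seconds
    print(round(estimator.update(measured), 4))
```

Because the timeout follows both the average delay and its variation, a slow or jittery network gets a longer timeout instead of a flood of unnecessary retransmissions.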
Internet Protocol (IP)
IP is the Layer 3 protocol that provides fragmentation and reassembly of datagrams and error reporting. Along with TCP, IP represents the core of the Internet protocol suite.
Figure C shows the IP packet format.
The IP packet format consists of these fields:
Version field indicates the version of IP currently used.
IP Header Length (IHL) field indicates how many 32-bit words are in the IP header.
Type-of-service field specifies how a particular upper-layer protocol would like the current datagram to be handled. Datagrams can be assigned various levels of importance through this field.
Total Length field specifies the length of the entire IP packet, including data and header, in bytes.
Identification field contains an integer that identifies the current datagram. This field is used to help reconstruct datagram fragments.
Flags field controls whether routers are allowed to fragment a packet and tells the receiver whether more fragments of the packet follow.
Time-to-live field maintains a counter that gradually decrements to zero, at which point the datagram is discarded. This keeps packets from looping endlessly.
Protocol field indicates which upper-layer protocol receives incoming packets after IP processing is complete.
Header Checksum field helps ensure IP header integrity.
Source Address field specifies the sending node.
Destination Address field specifies the receiving node.
Options field allows IP to support various options, such as security.
Data field contains upper-layer information.
IP attaches an IP header to the segment or packet, in addition to the header already added by TCP or UDP. Information in the IP header includes the IP addresses of the sending and receiving hosts, the datagram length, and the datagram sequence order. The sequence information is provided in case the datagram exceeds the allowable byte size for network packets and must be fragmented.
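The IP header fields described above can be read out of raw bytes with a few lines of Python. This is a sketch under stated assumptions: it handles only the fixed 20-byte IPv4 header (no options), and the sample packet is assembled by hand from example values rather than taken from a real capture.

```python
import ipaddress
import struct

# Fixed 20-byte IPv4 header: version/IHL, type of service, total length,
# identification, flags/fragment offset, TTL, protocol, checksum, addresses.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def parse_ipv4_header(packet):
    """Return the fixed IPv4 header fields of a raw packet as a dict."""
    (version_ihl, tos, total_length, identification, flags_fragment,
     ttl, protocol, checksum, src, dst) = IPV4_HEADER.unpack(packet[:20])
    return {
        "version": version_ihl >> 4,
        "ihl_words": version_ihl & 0x0F,       # header length in 32-bit words
        "type_of_service": tos,
        "total_length": total_length,
        "identification": identification,
        "dont_fragment": bool(flags_fragment & 0x4000),
        "more_fragments": bool(flags_fragment & 0x2000),
        "fragment_offset": flags_fragment & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,                  # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


# Hand-built example header: IPv4, 5-word header, "don't fragment" set,
# TTL 64, carrying TCP. The checksum is left at 0 for simplicity.
sample_packet = IPV4_HEADER.pack(
    0x45, 0, 40, 1, 0x4000, 64, 6, 0,
    ipaddress.IPv4Address("192.168.1.152").packed,
    ipaddress.IPv4Address("192.168.1.1").packed,
)
ip_fields = parse_ipv4_header(sample_packet)
print(ip_fields)
```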
IPv4 is currently the most common version of IP addressing in use. It is the fourth version of the Internet Protocol, and most internet traffic today is still routed using IPv4. IPv4 uses a hierarchical addressing scheme. An IP address, which is 32 bits in length, is divided into two or three parts: a network part, an optional sub-network part, and a host part.
A single IP address can contain information about the network, its sub-network, and ultimately the host. This scheme makes IP addresses hierarchical: a network can have many sub-networks, which in turn can have many hosts.
The 32-bit IP address contains information about both the host and its network, and the two must be distinguished. For this, routers use a subnet mask, which is also 32 bits long; its 1 bits mark the network portion of the IP address. If the IP address in binary is ANDed with its subnet mask, the result is the network address. For example, if the IP address is 192.168.1.152 and the subnet mask is 255.255.255.0, then ANDing the two yields the network address 192.168.1.0.
This way the Subnet Mask helps extract Network ID and Host from an IP Address. It can be identified now that 192.168.1.0 is the Network number and 192.168.1.152 is the host on that network.
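The AND operation can be checked directly in Python. The sketch below uses the address and mask from the example above; the helper names are made up for illustration.

```python
import ipaddress


def to_binary(dotted):
    """Show a dotted-decimal address as four 8-bit binary octets."""
    return ".".join(f"{int(octet):08b}" for octet in dotted.split("."))


def network_address(ip, mask):
    """AND the IP address with its subnet mask to get the network address."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


ip, mask = "192.168.1.152", "255.255.255.0"
network = network_address(ip, mask)
print("IP address :", to_binary(ip))
print("Subnet mask:", to_binary(mask))
print("Network    :", to_binary(network), "=", network)
```

Wherever the mask has a 1 bit, the address bit is kept; wherever it has a 0 bit, the result is 0, which wipes out the host part and leaves only the network number.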
The positional value method is the simplest way to convert a binary value to decimal. An IP address is a 32-bit value divided into 4 octets. A binary octet contains 8 bits, and the value of each bit is determined by the position of each '1' bit in the octet, counting from the right.
The positional value of a bit is 2 raised to the power (position - 1); that is, a 1 bit at position 6 has the value 2^(6-1) = 2^5 = 32. The total value of the octet is the sum of the positional values of its 1 bits. The value of 11000000 is 128 + 64 = 192. For more information, please watch the video below.
Subnet Masks - How Subnet Masks Work
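The positional value method described above can be written out as a short Python function. The function name is just an illustrative choice; positions are counted from the right, starting at 1, as in the text.

```python
def octet_to_decimal(bits):
    """Convert an 8-bit binary string to decimal using positional values."""
    total = 0
    # Position 1 is the right-most bit, position 8 the left-most.
    for position, bit in enumerate(reversed(bits), start=1):
        if bit == "1":
            total += 2 ** (position - 1)
    return total


print(octet_to_decimal("00100000"))  # a single 1 bit at position 6
print(octet_to_decimal("11000000"))  # 1 bits at positions 8 and 7
```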
The IETF (Internet Engineering Task Force) redesigned IP addressing to mitigate the drawbacks of IPv4. The new version of IP is version 6, and it uses 128-bit addresses, enough to give every single inch of the earth millions of IP addresses.
Today the majority of devices on the Internet still use IPv4, and they cannot all be moved to IPv6 overnight. IPv6 therefore provides mechanisms by which IPv4 and IPv6 can coexist until the Internet shifts entirely to IPv6.
IPv6 addresses are 128 bits long, compared to 32 bits for IPv4 addresses. IPv6 addresses are represented as 8 chunks of 16 bits in hexadecimal separated by colons, compared to 4 chunks of 8 bits in decimal (dotted-quad) for IPv4. For example, an IPv4 address looks like 192.168.1.152, whereas an IPv6 address looks like 2620:0:e50:0:0:0:0:0.
In IPv6, leading 0s in each chunk can be omitted, and a single run of consecutive all-zero chunks can be replaced with "::". For example, the IPv6 addresses 2620:0000:0e50:0000:0000:0000:0000:0000, 2620:0:e50:0:0:0:0:0 and 2620:0:e50:: are equivalent.
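These shortening rules can be verified with Python's standard ipaddress module. The address below reuses the 2620:0:e50 prefix that appears later in this lesson; the final chunk (1) is an arbitrary value chosen for illustration.

```python
import ipaddress

# The same address written in full and in shortened form.
full = ipaddress.IPv6Address("2620:0000:0e50:0000:0000:0000:0000:0001")
short = ipaddress.IPv6Address("2620:0:e50::1")

print(full == short)     # both spellings name the same 128-bit address
print(full.compressed)   # shortest form: leading 0s dropped, "::" used once
print(short.exploded)    # full form: 8 chunks of 4 hexadecimal digits
```

The "::" shortcut may appear only once in an address, otherwise it would be impossible to tell how many zero chunks each one stands for.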
Finally, IPv6 addresses use the same prefix/length notation which is now used for IPv4 to specify blocks of address space. The "prefix" is the base address of the block. The "length" is the number of bits from the left which are the same for all addresses in the block. This is often called "CIDR notation" because it was created when Classless Inter-Domain Routing was employed in the IPv4 Internet. For example:
Block Addresses in Block
18.104.22.168/24 22.214.171.124 - 126.96.36.199
188.8.131.52/22 184.108.40.206 - 220.127.116.11
18.104.22.168/16 22.214.171.124 - 126.96.36.199
2620:0:e50::/48 2620:0:e50:0:0:0:0:0 - 2620:0:e50:ffff:ffff:ffff:ffff:ffff
fd9a:2c75:7d0c::/48 fd9a:2c75:7d0c:0:0:0:0:0 - fd9a:2c75:7d0c:ffff:ffff:ffff:ffff:ffff
The IPv6 address space which will be used on most of the uiowa campus net will be the 2620:0:e50::/48 block. We also have a "local" fd9a:2c75:7d0c::/48 block for devices which need campus connectivity but not access to the general Internet.
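The first and last address of a CIDR block can be computed rather than worked out by hand. The following sketch uses Python's standard ipaddress module on the two IPv6 blocks mentioned above; the IPv4 block 192.168.1.0/24 is an extra example added here for comparison.

```python
import ipaddress


def block_range(cidr):
    """Return (first address, last address, number of addresses) of a block."""
    network = ipaddress.ip_network(cidr)
    return str(network[0]), str(network[-1]), network.num_addresses


for block in ("192.168.1.0/24", "2620:0:e50::/48", "fd9a:2c75:7d0c::/48"):
    first, last, count = block_range(block)
    print(f"{block}: {first} - {last} ({count} addresses)")
```

A /48 prefix fixes the first 48 bits and leaves 128 - 48 = 80 bits free, so each of these IPv6 blocks holds 2^80 addresses.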
How IP V6 Works
IP V4 VS IP V6
In this section we will discuss the concept of ports and how they work with IP addresses. The devices and computers connected to the Internet use a protocol called TCP/IP to communicate with each other. When a computer in New York wants to send a piece of data to a computer in England, it must know the destination IP address that it would like to send the information to. That information is sent most often via two methods, UDP and TCP.
TCP and UDP Ports
As you know every computer or device on the Internet must have a unique number assigned to it called the IP address. This IP address is used to recognize your particular computer out of the millions of other computers connected to the Internet. When information is sent over the Internet to your computer how does your computer accept that information? It accepts that information by using TCP or UDP ports.
An easy way to understand ports is to imagine your IP address is a cable box and the ports are the different channels on that cable box. The cable company knows how to send cable to your cable box based upon a unique serial number associated with that box (IP Address), and then you receive the individual shows on different channels (Ports).
Ports work the same way. You have an IP address, and then many ports on that IP address. When I say many, I mean many. You can have a total of 65,535 TCP ports and another 65,535 UDP ports. When a program on your computer sends or receives data over the Internet, it sends that data to an IP address and a specific port on the remote computer, and receives the data on a usually random port on its own computer. If it uses the TCP protocol to send and receive the data, it will connect and bind itself to a TCP port. If it uses the UDP protocol to send and receive data, it will use a UDP port. Figure 1, below, is a representation of an IP address split into its many TCP and UDP ports. Note that once an application binds itself to a particular port, that port cannot be used by any other application. It is first come, first served.
This all probably still feels confusing to you, and there is nothing wrong with that, as this is a complicated concept to grasp. Therefore, I will give you an example of how this works in real life so you can have a better understanding. We will use web servers in our example as you all know that a web server is a computer running an application that allows other computers to connect to it and retrieve the web pages stored there.
In order for a web server to accept connections from remote computers, such as yours, it must bind the web server application to a local port. It will then use this port to listen for and accept connections from remote computers. Web servers typically bind to TCP port 80, which is what the HTTP protocol uses by default, and then wait and listen for connections from remote devices. Once a device is connected, the server will send the requested web pages to the remote device and, when done, close the connection.
On the other hand, if you are the remote user connecting to a web server, it works in reverse. Your web browser picks a random TCP port from a certain range of port numbers and attempts to connect to port 80 on the IP address of the web server. When the connection is established, the web browser sends the request for a particular web page and receives it from the web server. Then both computers close the connection.
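The server and client roles described above can be tried out on a single machine with Python's socket module. This is a minimal sketch with some assumptions: it runs over the loopback address 127.0.0.1, the "server" binds to whatever free port the operating system hands out (port 0) instead of port 80, which normally requires administrator rights, and the reply text is made up.

```python
import socket

# Server side: bind to a local port and listen for connections.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0 = let the OS pick a free port
server.listen(1)
server_port = server.getsockname()[1]

# Client side: connecting picks a random local port automatically.
client = socket.create_connection(("127.0.0.1", server_port))
client_port = client.getsockname()[1]

# The server accepts the connection and sees the client's address and port.
connection, peer = server.accept()

client.sendall(b"GET / HTTP/1.0\r\n\r\n")   # client asks for a page
request = connection.recv(1024)
connection.sendall(b"hello from the server")  # server answers
reply = client.recv(1024)

print("server listening on port", server_port)
print("client connected from port", client_port)
print("reply:", reply)

# Both sides close the connection when done.
for sock in (connection, client, server):
    sock.close()
```

The connection is identified by both ends together (client address and port, server address and port), which is how one server port can serve many clients at once.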
Now, what if you wanted to run an FTP server, which is a server that allows you to transfer and receive files from remote computers, on the same machine as the web server? FTP servers use TCP ports 20 and 21 to send and receive information, so you won't have any conflicts with the web server running on TCP port 80. Therefore, when the FTP server application starts, it will bind itself to TCP ports 20 and 21 and wait for connections in order to send and receive data.
Most major applications have a specific port that they listen on and they register this information with an organization called IANA. You can see a list of applications and the ports they use at the IANA Registry. With developers registering the ports their applications use with IANA, the chances of two programs attempting to use the same port, and therefore causing a conflict, will be diminished.
TCP and UDP
DHCP stands for Dynamic Host Configuration Protocol. It is a network management protocol used to dynamically assign IP addresses on a TCP/IP network. DHCP assigns the address and other network configuration parameters to devices on a network so they can communicate with each other.
DHCP simplifies the administrative management of IP address configuration by automating address configuration for network clients. The DHCP standard provides for the use of DHCP servers, which are defined as any computer running the DHCP service. The DHCP server automatically allocates IP addresses and related TCP/IP configuration settings to DHCP-enabled clients on the network.
Every device on a TCP/IP-based network must have a unique IP address in order to access the network and its resources. Without DHCP, IP configuration must be done manually for new computers, computers moving from one subnet to another, and computers removed from the network.
By deploying DHCP in a network, this entire process is automated and centrally managed. The DHCP server maintains a pool of IP addresses and leases an address to any DHCP-enabled client when it logs on to the network. Because the IP addresses are dynamic (leased) rather than static (permanently assigned), addresses no longer in use are automatically returned to the pool for reallocation.
DHCP minimizes configuration errors caused by manual IP address configuration, such as typographical errors, as well as address conflicts caused by a currently assigned IP address accidentally being reissued to another computer. It is centralized and automated.
A DHCP server is a server that is configured with the DHCP service. This can be a Windows server, a router, a switch, a Linux server or any device capable of running DHCP services.
A DHCP client is any device that connects to the network and is configured to make a DHCP request.
The DHCP scope is the range of addresses that the DHCP server is configured to hand out on a specific network, usually a local area network. The scope can also be configured to exclude certain addresses from being handed out. This is known as the exclusion range. Sometimes there will be servers, printers or other network devices that never leave the network and must always be available on your network. It makes sense to give these devices statically assigned IP addresses that are in your DHCP exclusion range. The reason for excluding the addresses of these statically assigned devices from the scope is to avoid duplicate IP addresses on your network. As we stated before, each computer or device on the network must have its own IP address.
A DHCP reservation is a permanent address lease assignment from the DHCP server to the client. Reservations ensure that a specified hardware device on the subnet can always use the same IP address. This is useful for computers such as remote access gateways, WINS, or DNS servers that must have a static IP address.
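The ideas of scope, exclusion range, lease and reservation can be tied together in a toy model. The class below is a simplified sketch, not how any real DHCP server is implemented: there are no lease timers or network messages, clients are identified by made-up hardware (MAC) address strings, and all addresses are example values.

```python
import ipaddress


class DhcpScope:
    """Toy DHCP scope: hands out addresses, honouring exclusions and reservations."""

    def __init__(self, first, last, exclusions=(), reservations=None):
        start = int(ipaddress.IPv4Address(first))
        end = int(ipaddress.IPv4Address(last))
        excluded = {ipaddress.IPv4Address(a) for a in exclusions}
        # Reservations map a hardware address to the address it always gets.
        self.reservations = {mac: ipaddress.IPv4Address(ip)
                             for mac, ip in (reservations or {}).items()}
        unavailable = excluded | set(self.reservations.values())
        # The free pool is the scope minus excluded and reserved addresses.
        self.free = [ipaddress.IPv4Address(n) for n in range(start, end + 1)
                     if ipaddress.IPv4Address(n) not in unavailable]
        self.leases = {}

    def request(self, mac):
        """Lease an address to the client with this hardware address."""
        if mac in self.reservations:
            return str(self.reservations[mac])
        if mac not in self.leases:
            if not self.free:
                raise RuntimeError("scope exhausted: no free addresses")
            self.leases[mac] = self.free.pop(0)
        return str(self.leases[mac])

    def release(self, mac):
        """Return a leased address to the pool for reallocation."""
        address = self.leases.pop(mac, None)
        if address is not None:
            self.free.append(address)


scope = DhcpScope("192.168.1.10", "192.168.1.13",
                  exclusions=["192.168.1.11"],            # e.g. a static printer
                  reservations={"aa:bb:cc:00:00:01": "192.168.1.12"})
print(scope.request("aa:bb:cc:00:00:02"))  # first free address
print(scope.request("aa:bb:cc:00:00:01"))  # always its reserved address
```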
DNS - Domain Name System
DNS is the system that helps us give names to systems and web sites on a network or over the internet. Having to look for a website by an IP address like 188.8.131.52 just isn’t very user friendly. Instead we give it the name yahoo.com and then have that name point to that address.
If we had to remember the IP address of all our favorite websites, we'd probably go nuts! Human beings are just not that good at remembering strings of numbers. We are good at remembering words, however, and that is where domain names come in. You probably have hundreds of domain names stored in your head, such as Google.com, mit.edu or yahoo.com.
Domain names are strings of characters with dots in them like computer.howstuffworks.com.
The last word in a domain name represents a top-level domain. These top-level domains are controlled by the IANA in what's called the Root Zone Database. The following are some common top-level domains:
COM -- commercial Web sites, though open to everyone
NET -- network Web sites, though open to everyone
ORG -- non-profit organization websites, though open to everyone
EDU -- restricted to schools and educational organizations
MIL -- restricted to the U.S. military
GOV -- restricted to the U.S. government
US, UK, RU and other two-letter country codes -- each is assigned to a domain name authority in the respective country
In a domain name, each word and dot combination you add before a top-level domain indicates a level in the domain structure. Each level refers to a server or a group of servers that manage that domain level. For example, "howstuffworks" in our domain name is a second-level domain off the COM top-level domain. An organization may have a hierarchy of sub-domains further organizing its Internet presence, like "bbc.co.uk" which is the BBC's domain under CO, an additional level created by the domain name authority responsible for the UK country code.
The left-most word in the domain name, such as www or mail, is a host name. It specifies the name of a specific machine (with a specific IP address) in a domain, typically dedicated to a specific purpose. A given domain can potentially contain millions of host names as long as they're all unique to that domain.
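The level structure of a domain name can be seen by splitting it at the dots. The function below is a deliberately naive sketch: it only splits text and knows nothing about registries, so it cannot tell that "co.uk" is an extra level created by a country's domain authority.

```python
def describe_domain(name):
    """Split a domain name into its levels, read from right to left."""
    labels = name.lower().rstrip(".").split(".")
    return {
        "top_level": labels[-1],
        "second_level": labels[-2] if len(labels) > 1 else None,
        # Only treat the left-most word as a host name if something is
        # left over after the top-level and second-level domains.
        "left_most": labels[0] if len(labels) > 2 else None,
        "levels": list(reversed(labels)),
    }


print(describe_domain("computer.howstuffworks.com"))
print(describe_domain("bbc.co.uk")["levels"])
```

For "bbc.co.uk" the split yields the levels uk, co, bbc, and the left-most word there is the BBC's domain rather than a host name, which is exactly the limitation noted above.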
Because all of the names in a given domain need to be unique, there has to be some way to control the list and make sure no duplicates arise. That's where registrars come in. A registrar is an authority that can assign domain names directly under one or more top-level domains and register them with InterNIC, a service of ICANN, which enforces uniqueness of domain names across the Internet. Each domain registration becomes part of a central domain registration database known as the whois database. Network Solutions, Inc. (NSI) was one of the first registrars, and today companies like GoDaddy.com offer domain registration in addition to many other Web site and domain management services. [source: InterNIC]
Internet service providers, corporations, universities and any organization that has its own servers and uses the TCP/IP protocol usually have their own DNS server, or their systems are configured to connect to an external DNS server. When a user types in the name of a website or attempts to send an email, the system contacts a DNS server with the name it is looking for. The DNS server looks up the name in its DNS database and returns the address the request should be routed to. If the DNS server does not have the information in its local database, it will forward that request on to the Internet Service Provider's DNS server or a root DNS server.
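The lookup-then-forward behaviour described above can be modelled with two small objects. This is only a sketch of the idea, not real DNS: the "databases" are plain dictionaries, the names are made up, and the addresses come from ranges set aside for private use and documentation.

```python
class DnsServer:
    """Toy DNS server: answers from its own records or asks a forwarder."""

    def __init__(self, records, forwarder=None):
        self.records = {name.lower(): address for name, address in records.items()}
        self.forwarder = forwarder  # another DnsServer, e.g. the ISP's

    def resolve(self, name):
        name = name.lower()
        # 1. Look in the local database first.
        if name in self.records:
            return self.records[name]
        # 2. Otherwise pass the question on to the next server, if any.
        if self.forwarder is None:
            raise LookupError(f"{name} not found")
        return self.forwarder.resolve(name)


# Hypothetical setup: a company server that forwards to its ISP's server.
isp_dns = DnsServer({"www.example.com": "192.0.2.10"})
company_dns = DnsServer({"intranet.example": "10.0.0.5"}, forwarder=isp_dns)

print(company_dns.resolve("intranet.example"))  # answered locally
print(company_dns.resolve("www.example.com"))   # answered by the forwarder
```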
Networks are separated from each other by specialized hosts, called gateways or routers with specialized software support optimized for routing. In routers, packets arriving at any interface are examined for source and destination addressing and queued to the appropriate outgoing interface according to their destination address and a set of rules and performance metrics. Rules are encoded in a routing table that contains entries for all interfaces and their connected networks. If no rule satisfies the requirements for a network packet, it is forwarded to a default route. Routing tables are maintained either manually by a network administrator, or updated dynamically with a routing protocol. Routing rules may contain other parameters than source and destination, such as limitations on available bandwidth, expected packet loss rates, and specific technology requirements.
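A routing table lookup with a default route can be sketched as follows. The interface names and networks are invented for the example, and the rule for choosing among several matching entries (prefer the most specific network, often called longest prefix match) is an assumption added here, since the text above does not say how ties are resolved.

```python
import ipaddress


class RoutingTable:
    """Toy routing table mapping destination networks to outgoing interfaces."""

    def __init__(self, default_interface):
        self.routes = []
        self.default_interface = default_interface

    def add_route(self, cidr, interface):
        self.routes.append((ipaddress.ip_network(cidr), interface))

    def lookup(self, destination):
        """Return the interface a packet for this destination is queued to."""
        address = ipaddress.ip_address(destination)
        matches = [(net, iface) for net, iface in self.routes if address in net]
        if not matches:
            # No rule matches: fall back to the default route.
            return self.default_interface
        # Prefer the most specific matching network (longest prefix).
        _, interface = max(matches, key=lambda match: match[0].prefixlen)
        return interface


table = RoutingTable(default_interface="wan0")
table.add_route("192.168.1.0/24", "eth0")
table.add_route("192.168.0.0/16", "eth1")

for dest in ("192.168.1.152", "192.168.7.9", "8.8.8.8"):
    print(dest, "->", table.lookup(dest))
```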
IP forwarding algorithms take into account the size of each packet, the type of service specified in the header, as well as characteristics of the available links to other routers in the network, such as link capacity, utilization rate, and maximum datagram size that is supported on the link. In general, most routing software determines a route through a shortest path algorithm. However, other routing protocols may use other metrics for determining the best path. Based on the metrics required and present for each link, each path has an associated cost. The routing algorithm attempts to minimize the cost when choosing the next hop.
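Choosing the lowest-cost path can be illustrated with Dijkstra's algorithm, one common shortest path algorithm. The sketch below is not taken from any particular routing protocol: the four routers and their link costs are made up, and a real router would keep only the next hop rather than the whole path.

```python
import heapq


def cheapest_path(links, source, target):
    """Return (total cost, path) of the lowest-cost route, or None if unreachable."""
    queue = [(0, source, [source])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in links.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical network of four routers with a cost on each link.
links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2, "D": 7},
    "C": {"A": 5, "B": 2, "D": 1},
    "D": {"B": 7, "C": 1},
}

cost, path = cheapest_path(links, "A", "D")
print("cost:", cost, "path:", path, "next hop:", path[1])
```

Even though A has a direct link to C, going through B is cheaper overall, so router A would forward traffic for D to B as its next hop.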
A routing protocol is a software mechanism by which routers communicate and share information about the topology of the network and the capabilities of each routing node. It thus implements the network-global rules by which traffic is directed within a network and across multiple networks. Different protocols are often used for different topologies or different application areas. For example, the Open Shortest Path First (OSPF) protocol is generally used for routing packets between subnetworks within an enterprise, and the Border Gateway Protocol (BGP) is used on a global scale. BGP, in particular, is the de facto standard of worldwide Internet routing.