Chapter 5. TCP Client/Server Example¶
Introduction¶
We will now use the elementary functions from the previous chapter to write a complete TCP client/server example. Our simple example is an echo server that performs the following steps:
- The client reads a line of text from its standard input and writes the line to the server.
- The server reads the line from its network input and echoes the line back to the client.
- The client reads the echoed line and prints it on its standard output.
The figure below depicts this simple client/server:
Despite the two arrows between the client and server in the above figure, it is really a single full-duplex TCP connection. The fgets and fputs functions are from the standard I/O library. The writen and readline functions were shown in Section 3.9.
The echo client/server is a valid, simple example of a network application. To expand this example into your own application, all you need to do is change what the server does with the input it receives from its clients.
Besides running the client/server in normal mode (type in a line and watch it echo), we examine lots of boundary conditions:
- What happens when the client and server are started?
- What happens when the client terminates normally?
- What happens to the client if the server process terminates before the client is done?
- What happens to the client if the server host crashes?
In all these examples, we have "hard-coded" protocol-specific constants such as addresses and ports. There are two reasons for this:
- We must understand exactly what needs to be stored in the protocol-specific address structures
- We have not yet covered the library functions that can make this more portable
TCP Echo Server: main Function¶
Our TCP client and server follow the flow of functions that we diagrammed in Figure 4.1. The code below is the concurrent server program:

    #include "unp.h"

    int
    main(int argc, char **argv)
    {
        int                 listenfd, connfd;
        pid_t               childpid;
        socklen_t           clilen;
        struct sockaddr_in  cliaddr, servaddr;

        listenfd = Socket(AF_INET, SOCK_STREAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family      = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port        = htons(SERV_PORT);

        Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));

        Listen(listenfd, LISTENQ);

        for ( ; ; ) {
            clilen = sizeof(cliaddr);
            connfd = Accept(listenfd, (SA *) &cliaddr, &clilen);

            if ( (childpid = Fork()) == 0) {    /* child process */
                Close(listenfd);    /* close listening socket */
                str_echo(connfd);   /* process the request */
                exit(0);
            }
            Close(connfd);          /* parent closes connected socket */
        }
    }
The above code does the following:
- Create socket, bind server's well-known port
  - A TCP socket is created.
  - An Internet socket address structure is filled in with the wildcard address (INADDR_ANY) and the server's well-known port (SERV_PORT, which is defined as 9877 in our unp.h header). Binding the wildcard address tells the system that we will accept a connection destined for any local interface, in case the system is multihomed. Our choice of the TCP port number is based on Figure 2.10 in Section 2.9. It should be greater than 1023 (we do not need a reserved port), greater than 5000 (to avoid conflict with the ephemeral ports allocated by many Berkeley-derived implementations), less than 49152 (to avoid conflict with the "correct" range of ephemeral ports), and it should not conflict with any registered port. [p122]
  - The socket is converted into a listening socket by listen.
- Wait for client connection to complete
  - The server blocks in the call to accept, waiting for a client connection to complete.
- Concurrent server
  - For each client, fork spawns a child, and the child handles the new client. The child closes the listening socket and the parent closes the connected socket. (Section 4.8)
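The address setup described above can be sketched without the unp.h wrappers. This is a minimal illustration, not the book's source: the helper names make_wildcard_addr and port_in_suggested_range are hypothetical, and the second helper only encodes the "greater than 5000, less than 49152" rule from the text (it cannot check for conflicts with registered ports).

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build an IPv4 wildcard address for a given port. */
static struct sockaddr_in make_wildcard_addr(uint16_t port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));            /* same effect as bzero */
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* accept on any local interface */
    addr.sin_port = htons(port);               /* port in network byte order */
    return addr;
}

/* Hypothetical helper: the numeric part of the port-selection rule. */
static int port_in_suggested_range(unsigned int port)
{
    return port > 5000 && port < 49152;
}
```

Both fields that cross the network are stored in network byte order, which is why the port must be read back with ntohs when it is inspected.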
TCP Echo Server: str_echo Function¶
The function str_echo performs the server processing for each client: It reads data from the client and echoes it back to the client.

    #include "unp.h"

    void
    str_echo(int sockfd)
    {
        ssize_t n;
        char    buf[MAXLINE];

    again:
        while ( (n = read(sockfd, buf, MAXLINE)) > 0)
            Writen(sockfd, buf, n);

        if (n < 0 && errno == EINTR)
            goto again;
        else if (n < 0)
            err_sys("str_echo: read error");
    }
The above code does the following:
- Read a buffer and echo the buffer
  - read reads data from the socket and the line is echoed back to the client by writen. If the client closes the connection (the normal scenario), the receipt of the client's FIN causes the child's read to return 0. This causes the str_echo function to return, which terminates the child.
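str_echo relies on writen to send the whole buffer even if write accepts only part of it. The following is a minimal sketch of such a helper under plain POSIX, named writen_sketch here to make clear that it is an illustration and not the library source from Section 3.9.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: write exactly n bytes unless an error occurs.
 * Returns n on success, -1 on error. */
static ssize_t writen_sketch(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;       /* interrupted before any data was written */
            return -1;          /* real error */
        }
        p += w;                 /* short write: advance and try again */
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

The loop is what distinguishes this from a single write call: on a socket, write may return fewer bytes than requested when the send buffer is nearly full.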
TCP Echo Client: main Function¶

    #include "unp.h"

    int
    main(int argc, char **argv)
    {
        int                 sockfd;
        struct sockaddr_in  servaddr;

        if (argc != 2)
            err_quit("usage: tcpcli <IPaddress>");

        sockfd = Socket(AF_INET, SOCK_STREAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_port   = htons(SERV_PORT);
        Inet_pton(AF_INET, argv[1], &servaddr.sin_addr);

        Connect(sockfd, (SA *) &servaddr, sizeof(servaddr));

        str_cli(stdin, sockfd);     /* do it all */

        exit(0);
    }
The above code does the following:
- Create socket, fill in Internet socket address structure
  - A TCP socket is created and an Internet socket address structure is filled in with the server's IP address and port number. The server's IP address is taken from the command-line argument and the server's well-known port (SERV_PORT) is from our unp.h header.
- Connect to server
  - connect establishes the connection with the server. The function str_cli handles the rest of the client processing.
TCP Echo Client: str_cli Function¶
The str_cli function handles the client processing loop: It reads a line of text from standard input, writes it to the server, reads back the server's echo of the line, and outputs the echoed line to standard output.

    #include "unp.h"

    void
    str_cli(FILE *fp, int sockfd)
    {
        char sendline[MAXLINE], recvline[MAXLINE];

        while (Fgets(sendline, MAXLINE, fp) != NULL) {

            Writen(sockfd, sendline, strlen(sendline));

            if (Readline(sockfd, recvline, MAXLINE) == 0)
                err_quit("str_cli: server terminated prematurely");

            Fputs(recvline, stdout);
        }
    }
The above code does the following:
- Read a line, write to server
  - fgets reads a line of text and writen sends the line to the server.
- Read echoed line from server, write to standard output
  - readline reads the line echoed back from the server and fputs writes it to standard output.
- Return to main
  - The loop terminates when fgets returns a null pointer, which occurs when it encounters either an end-of-file (EOF) or an error. Our Fgets wrapper function checks for an error and aborts if one occurs, so Fgets returns a null pointer only when an end-of-file is encountered.
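The client's "server terminated prematurely" check depends on readline returning 0 when the peer has closed the connection. A deliberately simple sketch of that contract follows, reading one byte at a time. The name readline_sketch is hypothetical and this is not the buffered implementation from Section 3.9.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: read up to and including '\n', one byte at a time.
 * Returns the number of bytes stored (not counting the terminating NUL),
 * 0 on EOF with no data read, or -1 on error. */
static ssize_t readline_sketch(int fd, char *buf, size_t maxlen)
{
    size_t n = 0;

    if (maxlen == 0) {
        errno = EINVAL;
        return -1;
    }
    while (n + 1 < maxlen) {
        char c;
        ssize_t rc = read(fd, &c, 1);

        if (rc == 1) {
            buf[n++] = c;
            if (c == '\n')
                break;              /* newline is stored, like fgets */
        } else if (rc == 0) {
            break;                  /* EOF: return what we have, possibly 0 */
        } else if (errno != EINTR) {
            return -1;              /* real error; EINTR just retries */
        }
    }
    buf[n] = '\0';
    return (ssize_t)n;
}
```

One system call per byte is slow, which is one reason a production version would buffer its reads.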
Normal Startup¶
Although the TCP example is small, it is essential that we understand:
- How the client and server start and end,
- What happens when something goes wrong:
- the client host crashes,
- the client process crashes,
- network connectivity is lost
Only by understanding these boundary conditions, and their interaction with the TCP/IP protocols, can we write robust clients and servers that can handle these conditions.
Start the server in the background¶
First, we start the server in the background:
    linux % tcpserv01 &
    [1] 17870

When the server starts, it calls socket, bind, listen, and accept, blocking in the call to accept.
Run netstat¶
Before starting the client, we run the netstat program to verify the state of the server's listening socket.

    linux % netstat -a
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    tcp        0      0 *:9877                  *:*                     LISTEN

This command shows the status of all sockets on the system. We must specify the -a flag to see listening sockets.
In the output, a socket is in the LISTEN state with a wildcard for the local IP address and a local port of 9877. netstat prints an asterisk for an IP address of 0 (INADDR_ANY, the wildcard) or for a port of 0.
Start the client on the same host¶
We then start the client on the same host, specifying the server's IP address of 127.0.0.1 (the loopback address). We could have also specified the server's normal (nonloopback) IP address.
linux % tcpcli01 127.0.0.1
The client calls socket and connect, the latter causing TCP's three-way handshake to take place. When the three-way handshake completes, connect returns in the client and accept returns in the server. The connection is established. The following steps then take place:
- The client calls str_cli, which will block in the call to fgets.
- When accept returns in the server, it calls fork and the child calls str_echo. This function calls read, which blocks while waiting for a line to be sent from the client.
- The server parent, on the other hand, calls accept again, and blocks while waiting for the next client connection.
Notes from the previous three steps:
- All three processes are asleep (blocked): client, server parent, and server child.
- We purposely list the client step first, and then the server steps when the three-way handshake completes. This is because accept returns one-half of the RTT after connect returns (see Figure 2.5):
  - On the client side, connect returns when the second segment of the handshake is received.
  - On the server side, accept does not return until the third segment of the handshake is received.
Run netstat after connection completes¶
Since we are running the client and server on the same host, netstat now shows two additional lines of output, corresponding to the TCP connection:

    linux % netstat -a
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    tcp        0      0 localhost:9877          localhost:42758         ESTABLISHED
    tcp        0      0 localhost:42758         localhost:9877          ESTABLISHED
    tcp        0      0 *:9877                  *:*                     LISTEN

- The first ESTABLISHED line corresponds to the server child's socket, since the local port is 9877.
- The second ESTABLISHED line is the client's socket, since the local port is 42758.
If we were running the client and server on different hosts, the client host would display only the client's socket, and the server host would display only the two server sockets.
Run ps to check process status and relationship¶

    linux % ps -t pts/6 -o pid,ppid,tty,stat,args,wchan
      PID  PPID TT       STAT COMMAND          WCHAN
    22038 22036 pts/6    S    -bash            wait4
    17870 22038 pts/6    S    ./tcpserv01      wait_for_connect
    19315 17870 pts/6    S    ./tcpserv01      tcp_data_wait
    19314 22038 pts/6    S    ./tcpcli01 127.0 read_chan

Very specific arguments to ps are used:
- The TT column (pts/6): client and server are run from the same window, pseudo-terminal number 6.
- The PID and PPID columns show the parent and child relationships.
  - The first tcpserv01 line is the parent and the second tcpserv01 line is the child, since the PPID of the child is the parent's PID.
  - The PPID of the parent is the shell (bash).
- The STAT column for all three of our network processes is "S", meaning the process is sleeping (waiting for something).
- The WCHAN column specifies the condition when a process is asleep.
  - Linux prints wait_for_connect when a process is blocked in either accept or connect, tcp_data_wait when a process is blocked on socket input or output, or read_chan when a process is blocked on terminal I/O.
  - In ps(1), the WCHAN column indicates the name of the kernel function in which the process is sleeping, a "-" if the process is running, or a "*" if the process is multi-threaded and ps is not displaying threads.
Normal Termination¶
At this point, the connection is established and whatever we type to the client is echoed back.
    linux % tcpcli01 127.0.0.1      # we showed this line earlier
    hello, world                    # we now type this
    hello, world                    # and the line is echoed
    good bye
    good bye
    ^D                              # Control-D is our terminal EOF character

If we immediately execute netstat, we have:

    linux % netstat -a | grep 9877
    tcp        0      0 *:9877                  *:*                     LISTEN
    tcp        0      0 localhost:42758         localhost:9877          TIME_WAIT

This time we pipe the output of netstat into grep, printing only the lines with our server's well-known port:
- The client's side of the connection (since the local port is 42758) enters the TIME_WAIT state
- The listening server is still waiting for another client connection.
The following steps are involved in the normal termination of client and server:
- When we type our EOF character, fgets returns a null pointer and the function str_cli (Section 5.5) returns.
- str_cli returns to the client main function (Section 5.5), which terminates by calling exit.
- Part of process termination is the closing of all open descriptors, so the client socket is closed by the kernel. This sends a FIN to the server, to which the server TCP responds with an ACK. This is the first half of the TCP connection termination sequence. At this point, the server socket is in the CLOSE_WAIT state and the client socket is in the FIN_WAIT_2 state (Figure 2.4 and Figure 2.5).
- When the server TCP receives the FIN, the server child is blocked in a call to read (Section 3.8), and read then returns 0. This causes the str_echo function to return to the server child main. [Errata] [p128]
- The server child terminates by calling exit. (Section 5.2)
- All open descriptors in the server child are closed.
  - The closing of the connected socket by the child causes the final two segments of the TCP connection termination to take place: a FIN from the server to the client, and an ACK from the client.
- Finally, the SIGCHLD signal is sent to the parent when the server child terminates.
  - This occurs in this example, but we do not catch the signal in our code, and the default action of the signal is to be ignored. Thus, the child enters the zombie state. We can verify this with the ps command.

    linux % ps -t pts/6 -o pid,ppid,tty,stat,args,wchan
      PID  PPID TT       STAT COMMAND          WCHAN
    22038 22036 pts/6    S    -bash            read_chan
    17870 22038 pts/6    S    ./tcpserv01      wait_for_connect
    19315 17870 pts/6    Z    [tcpserv01 <defu do_exit

The STAT of the child is now Z (for zombie).
We need to clean up our zombie processes and doing this requires dealing with Unix signals. The next section will give an overview of signal handling.
POSIX Signal Handling¶
A signal is a notification to a process that an event has occurred. Signals are sometimes called software interrupts. Signals usually occur asynchronously, which means that a process doesn't know ahead of time exactly when a signal will occur.
Signals can be sent:
- By one process to another process (or to itself).
- By the kernel to a process.
  - For example, whenever a process terminates, the kernel sends a SIGCHLD signal to the parent of the terminating process.
Every signal has a disposition, which is also called the action associated with the signal. We set the disposition of a signal by calling the sigaction function and we have three choices for the disposition:
- Catching a signal. We can provide a function called a signal handler that is called whenever a specific signal occurs. The two signals SIGKILL and SIGSTOP cannot be caught. Our function is called with a single integer argument that is the signal number and the function returns nothing. Its function prototype is therefore:

      void handler (int signo);

  For most signals, we can call sigaction and specify the signal handler to catch it. A few signals, SIGIO, SIGPOLL, and SIGURG, all require additional actions on the part of the process to catch the signal.
- Ignoring a signal. We can ignore a signal by setting its disposition to SIG_IGN. The two signals SIGKILL and SIGSTOP cannot be ignored.
- Setting the default disposition for a signal. This can be done by setting its disposition to SIG_DFL. The default is normally to terminate a process on receipt of a signal, with certain signals also generating a core image of the process in its current working directory. There are a few signals whose default disposition is to be ignored: SIGCHLD and SIGURG (sent on the arrival of out-of-band data) are two that we will encounter in this text.
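The three dispositions can be exercised with a small helper around sigaction. This is an illustrative sketch: set_disposition, on_usr1, and got_usr1 are hypothetical names, and SIGUSR1 is used only because it is convenient to raise by hand.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_usr1 = 0;

/* A signal handler: one int argument, returns nothing. */
static void on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;
    return;
}

/* Hypothetical helper: disp is a handler, SIG_IGN, or SIG_DFL.
 * Returns 0 on success, -1 on error (for example, for SIGKILL). */
static int set_disposition(int signo, void (*disp)(int))
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = disp;
    sigemptyset(&act.sa_mask);      /* block no additional signals */
    act.sa_flags = 0;
    return sigaction(signo, &act, NULL);
}
```

Calling set_disposition(SIGUSR1, on_usr1) followed by raise(SIGUSR1) sets the flag, the same call with SIG_IGN leaves it untouched, and an attempt to change SIGKILL fails, which matches the rule that SIGKILL and SIGSTOP can be neither caught nor ignored.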
signal Function¶
The POSIX way to establish the disposition of a signal is to call the sigaction function, which is complicated in that one argument to the function is a structure (struct sigaction) that we must allocate and fill in.
An easier way to set the disposition of a signal is to call the signal function. The first argument is the signal name and the second argument is either a pointer to a function or one of the constants SIG_IGN or SIG_DFL.
However, signal is an historical function that predates POSIX. Different implementations provide different signal semantics when it is called, providing backward compatibility, whereas POSIX explicitly spells out the semantics when sigaction is called.
The solution is to define our own function named signal that just calls the POSIX sigaction function. This provides a simple interface with the desired POSIX semantics. We include this function in our own library, along with our err_XXX functions and our wrapper functions. [p130]

    #include "unp.h"

    Sigfunc *
    signal(int signo, Sigfunc *func)
    {
        struct sigaction act, oact;

        act.sa_handler = func;
        sigemptyset(&act.sa_mask);
        act.sa_flags = 0;
        if (signo == SIGALRM) {
    #ifdef SA_INTERRUPT
            act.sa_flags |= SA_INTERRUPT;   /* SunOS 4.x */
    #endif
        } else {
    #ifdef SA_RESTART
            act.sa_flags |= SA_RESTART;     /* SVR4, 44BSD */
    #endif
        }
        if (sigaction(signo, &act, &oact) < 0)
            return(SIG_ERR);
        return(oact.sa_handler);
    }
    /* end signal */

    Sigfunc *
    Signal(int signo, Sigfunc *func)    /* for our signal() function */
    {
        Sigfunc *sigfunc;

        if ( (sigfunc = signal(signo, func)) == SIG_ERR)
            err_sys("signal error");
        return(sigfunc);
    }
Simplify function prototype using typedef¶
The normal function prototype for signal is complicated by the level of nested parentheses.

    void (*signal (int signo, void (*func) (int))) (int);

To simplify this, we define the Sigfunc type in our unp.h header as

    typedef void Sigfunc(int);

stating that signal handlers are functions with an integer argument and the function returns nothing (void). The function prototype then becomes

    Sigfunc *signal (int signo, Sigfunc *func);

A pointer to a signal handling function is the second argument to the function, as well as the return value from the function.
Set handler¶
The sa_handler member of the sigaction structure is set to the func argument.
Set signal mask for handler¶
POSIX allows us to specify a set of signals that will be blocked when our signal handler is called. Any signal that is blocked cannot be delivered to a process. We set the sa_mask member to the empty set, which means that no additional signals will be blocked while our signal handler is running. POSIX guarantees that the signal being caught is always blocked while its handler is executing.
Set SA_RESTART flag¶
SA_RESTART is an optional flag. When the flag is set, a system call interrupted by this signal will be automatically restarted by the kernel.
If the signal being caught is not SIGALRM, we specify the SA_RESTART flag, if defined. SIGALRM is the exception because the purpose of generating the SIGALRM signal is normally to place a timeout on an I/O operation, in which case we want the blocked system call to be interrupted by the signal, not restarted. Where SA_INTERRUPT is defined (SunOS 4.x, according to the comment in the code), we set that flag for SIGALRM for the same reason. [p131]
Call sigaction¶
We call sigaction and then return the old action for the signal as the return value of the signal function.
Throughout this text, we will use the signal function from the above definition.
Handling SIGCHLD Signals¶
The purpose of the zombie state is to maintain information about the child for the parent to fetch later, which includes:
- process ID of the child,
- termination status,
- information on the resource utilization of the child.
If a parent process of zombie children terminates, the parent process ID of all the zombie children is set to 1 (the init process), which will inherit the children and clean them up (init will wait for them, which removes the zombie). [p132]
Handling Zombies¶
Zombies take up space in the kernel and eventually we can run out of processes. Whenever we fork children, we must wait for them to prevent them from becoming zombies. We can establish a signal handler to catch SIGCHLD and call wait within the handler. We establish the signal handler by adding the following function call after the call to listen in the server's main function (it must be done before forking the first child and needs to be done only once):

    Signal (SIGCHLD, sig_chld);

The signal handler, the function sig_chld, is defined below:

    #include "unp.h"

    void
    sig_chld(int signo)
    {
        pid_t   pid;
        int     stat;

        pid = wait(&stat);
        printf("child %d terminated\n", pid);
        return;
    }

Note that calling standard I/O functions such as printf in a signal handler is not recommended. We call printf here as a diagnostic tool to see when the child terminates.
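If the diagnostic is wanted without calling printf from the handler, one option is to format the message by hand and emit it with write, which POSIX lists as async-signal-safe. The sketch below is an alternative for illustration, not the version used in this text. The names sig_chld_safe and fmt_decimal are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: store v in decimal at buf (no NUL), return length.
 * Makes no library calls, so it can be used inside a signal handler. */
static size_t fmt_decimal(unsigned long v, char *buf)
{
    char tmp[24];
    size_t n = 0, i;

    do {
        tmp[n++] = (char)('0' + v % 10);    /* digits come out reversed */
        v /= 10;
    } while (v > 0);
    for (i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    return n;
}

/* Variant of sig_chld that reports through write(2) instead of printf. */
static void sig_chld_safe(int signo)
{
    static const char head[] = "child ";
    static const char tail[] = " terminated\n";
    char msg[64];
    size_t len = 0, i;
    int saved_errno = errno;    /* wait() and write() may change errno */
    int stat;
    pid_t pid;

    (void)signo;
    pid = wait(&stat);
    if (pid > 0) {
        for (i = 0; head[i] != '\0'; i++)
            msg[len++] = head[i];
        len += fmt_decimal((unsigned long)pid, msg + len);
        for (i = 0; tail[i] != '\0'; i++)
            msg[len++] = tail[i];
        if (write(STDOUT_FILENO, msg, len) < 0) {
            /* nothing safe to do about a failed diagnostic write */
        }
    }
    errno = saved_errno;
    return;
}
```

Saving and restoring errno keeps the handler from disturbing whatever system call the main program was in the middle of checking.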
Compiling and running the program on Solaris *¶
This program (tcpcliserv/tcpserv02.c) is compiled on Solaris 9 and uses the signal function from the system library (not our version).

    solaris % tcpserv02 &               # start server in background
    [2] 16939
    solaris % tcpcli01 127.0.0.1        # then start client in foreground
    hi there                            # we type this
    hi there                            # and this is echoed
    ^D                                  # we type our EOF character
    child 16942 terminated              # output by printf in signal handler
    accept error: Interrupted system call   # main function aborts

The sequence of steps is as follows:
- We terminate the client by typing our EOF character. The client TCP sends a FIN to the server and the server responds with an ACK.
- The receipt of the FIN delivers an EOF to the child's pending read. The child terminates.
- The parent is blocked in its call to accept when the SIGCHLD signal is delivered. The sig_chld function executes (our signal handler), wait fetches the child's PID and termination status, and printf is called from the signal handler. The signal handler returns.
- Since the signal was caught by the parent while the parent was blocked in a slow system call (accept), the kernel causes the accept to return an error of EINTR (interrupted system call). The parent does not handle this error (see the server's main function), so it aborts.
From this example, we know that when writing network programs that catch signals, we must be cognizant of interrupted system calls, and we must handle them. In this example, the signal function provided in the standard C library does not cause an interrupted system call to be automatically restarted by the kernel. Some other systems automatically restart the interrupted system call. If we run the same example under 4.4BSD, using its library version of the signal function, the kernel restarts the interrupted system call and accept does not return an error. Handling this difference between operating systems is one reason we define our own version of the signal function. [p134]
As part of the coding conventions used in this text, we always code an explicit return in our signal handlers, even though this is unnecessary for a function returning void. This serves as a reminder that the return may interrupt a system call.
Handling Interrupted System Calls¶
The term "slow system call" is used to describe any system call that can block forever, such as accept. That is, the system call need never return. Most networking functions fall into this category. Examples are:
- accept: there is no guarantee that a server's call to accept will ever return, if there are no clients that will connect to the server.
- read: the server's call to read in the server's str_echo function will never return if the client never sends a line for the server to echo.
Other examples of slow system calls are reads and writes of pipes and terminal devices. A notable exception is disk I/O, which usually returns to the caller (assuming no catastrophic hardware failure).
When a process is blocked in a slow system call and the process catches a signal and the signal handler returns, the system call can return an error of EINTR. Some kernels automatically restart some interrupted system calls. For portability, when we write a program that catches signals (most concurrent servers catch SIGCHLD), we must be prepared for slow system calls to return EINTR. [p134]
To handle an interrupted accept, we change the call to accept in the server's main function, at the beginning of the for loop, to the following:

    for ( ; ; ) {
        clilen = sizeof (cliaddr);
        if ( (connfd = accept (listenfd, (SA *) &cliaddr, &clilen)) < 0) {
            if (errno == EINTR)
                continue;       /* back to for () */
            else
                err_sys ("accept error");
        }

Note that this accept is not our wrapper function Accept, since we must handle the failure of the function ourselves.
The modified version of the server source code is tcpcliserv/tcpserv03.c.
Restarting the interrupted system call is fine for:
- accept
- read
- write
- select
- open
However, there is one function that we cannot restart: connect. If this function returns EINTR, we cannot call it again, as doing so will return an immediate error. When connect is interrupted by a caught signal and is not automatically restarted, we must call select to wait for the connection to complete.
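One way to follow this rule is sketched below, assuming a blocking socket: if connect fails with EINTR, wait for the socket to become writable with select and then read the outcome from the SO_ERROR socket option. The function name connect_wait_on_eintr is hypothetical, and this is an illustration of the approach, not code from this text.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>      /* htonl etc. for callers building the address */
#include <errno.h>
#include <netinet/in.h>     /* struct sockaddr_in for callers */
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: connect(), and if a caught signal interrupts it, wait for the
 * in-progress connection instead of calling connect() again.
 * Returns 0 on success, -1 with errno set on failure. */
static int connect_wait_on_eintr(int fd, const struct sockaddr *sa,
                                 socklen_t salen)
{
    fd_set wset;
    int err = 0, rc;
    socklen_t errlen = sizeof(err);

    if (connect(fd, sa, salen) == 0)
        return 0;
    if (errno != EINTR)
        return -1;

    /* The handshake continues in the kernel; writability marks its end. */
    do {
        FD_ZERO(&wset);
        FD_SET(fd, &wset);
        rc = select(fd + 1, NULL, &wset, NULL, NULL);
    } while (rc < 0 && errno == EINTR);
    if (rc < 0)
        return -1;

    /* SO_ERROR holds the result of the connection attempt. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

Unlike the accept case, the retry loop here is around select, never around connect itself.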
wait and waitpid Functions¶
We can call the wait function to handle the terminated child.

    #include <sys/wait.h>

    pid_t wait (int *statloc);
    pid_t waitpid (pid_t pid, int *statloc, int options);

    /* Both return: process ID if OK, 0 or -1 on error */

wait and waitpid both return two values: the return value of the function is the process ID of the terminated child, and the termination status of the child (an integer) is returned through the statloc pointer.
There are three macros that we can call that examine the termination status (see APUE):
- WIFEXITED: tells if the child terminated normally
- WIFSIGNALED: tells if the child was killed by a signal
- WIFSTOPPED: tells if the child was just stopped by job control
Additional macros let us then fetch the exit status of the child, or the value of the signal that killed the child, or the value of the job-control signal that stopped the child. We will use the WIFEXITED and WEXITSTATUS macros for this purpose.
If there are no terminated children for the process calling wait, but the process has one or more children that are still executing, then wait blocks until the first of the existing children terminates.
waitpid gives us more control over which process to wait for and whether or not to block:
- The pid argument specifies the process ID that we want to wait for. A value of -1 says to wait for the first of our children to terminate.
- The options argument specifies additional options. The most common option is WNOHANG, which tells the kernel not to block if there are no terminated children.
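As a small illustration of the two return values and the status macros, the sketch below forks a child that exits with a chosen code, waits for that specific child with waitpid, and decodes the status. The helper name child_exit_status is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the child's exit status, or -1 if the
 * fork or wait failed or the child did not terminate normally. */
static int child_exit_status(int code)
{
    int stat;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                        /* child: terminate at once */

    if (waitpid(pid, &stat, 0) != pid)      /* block for this child only */
        return -1;
    if (WIFEXITED(stat))
        return WEXITSTATUS(stat);           /* low 8 bits of the exit code */
    return -1;                              /* killed by a signal or stopped */
}
```

Passing the child's PID restricts the wait to that one child. Passing -1 with WNOHANG, by contrast, polls for any terminated child without blocking.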
Difference between wait and waitpid¶
The following example illustrates the difference between the wait and waitpid functions when used to clean up terminated children.
We modify our TCP client as below, which establishes five connections with the server and then uses only the first one (sockfd[0]) in the call to str_cli. The purpose of establishing multiple connections is to spawn multiple children from the concurrent server.

    #include "unp.h"

    int
    main(int argc, char **argv)
    {
        int                 i, sockfd[5];
        struct sockaddr_in  servaddr;

        if (argc != 2)
            err_quit("usage: tcpcli <IPaddress>");

        for (i = 0; i < 5; i++) {
            sockfd[i] = Socket(AF_INET, SOCK_STREAM, 0);

            bzero(&servaddr, sizeof(servaddr));
            servaddr.sin_family = AF_INET;
            servaddr.sin_port   = htons(SERV_PORT);
            Inet_pton(AF_INET, argv[1], &servaddr.sin_addr);

            Connect(sockfd[i], (SA *) &servaddr, sizeof(servaddr));
        }

        str_cli(stdin, sockfd[0]);      /* do it all */

        exit(0);
    }

When the client terminates, all open descriptors are closed automatically by the kernel (we do not call close, only exit), and all five connections are terminated at about the same time. This causes five FINs to be sent, one on each connection, which in turn causes all five server children to terminate at about the same time. This causes five SIGCHLD signals to be delivered to the parent at about the same time. This causes the problem under discussion.
We first run the server (tcpcliserv/tcpserv03.c) in the background and then our new client:

    linux % tcpserv03 &
    [1] 20419
    linux % tcpcli04 127.0.0.1
    hello                       # we type this
    hello                       # and it is echoed
    ^D                          # we then type our EOF character
    child 20426 terminated      # output by server

Only one printf is output, when we expect all five children to have terminated. If we execute ps, we see that the other four children still exist as zombies.

      PID TTY          TIME CMD
    20419 pts/6    00:00:00 tcpserv03
    20421 pts/6    00:00:00 tcpserv03 <defunct>
    20422 pts/6    00:00:00 tcpserv03 <defunct>
    20423 pts/6    00:00:00 tcpserv03 <defunct>

Establishing a signal handler and calling wait from that handler are insufficient for preventing zombies. The problem is that all five signals are generated before the signal handler is executed, and the signal handler is executed only one time because Unix signals are normally not queued. This problem is nondeterministic: depending on the timing of the FINs arriving at the server host, the signal handler may instead be executed two, three, or even four times.
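The "not queued" behavior can be observed in isolation, without any networking. The sketch below blocks a signal, generates it several times, and then unblocks it. On Linux and other systems that do not queue standard signals, the handler runs once for the whole burst. SIGUSR1 stands in for SIGCHLD here, and the names count_signal, deliveries, and deliveries_after_burst are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t deliveries = 0;

static void count_signal(int signo)
{
    (void)signo;
    deliveries++;
    return;
}

/* Block SIGUSR1, generate it `times` times, unblock it, and report how
 * many times the handler ran. Returns -1 if the setup fails. */
static int deliveries_after_burst(int times)
{
    struct sigaction act, oact;
    sigset_t block, old;
    int i;

    memset(&act, 0, sizeof(act));
    act.sa_handler = count_signal;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGUSR1, &act, &oact) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    deliveries = 0;
    for (i = 0; i < times; i++)
        raise(SIGUSR1);                     /* stays pending while blocked */

    sigprocmask(SIG_SETMASK, &old, NULL);   /* pending signal delivered here */
    sigaction(SIGUSR1, &oact, NULL);        /* restore previous disposition */
    return (int)deliveries;
}
```

This is the same situation the server parent is in when several children terminate while SIGCHLD is blocked during an earlier run of the handler.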
The correct solution is to call waitpid instead of wait. The code below shows the version of our sig_chld function that handles SIGCHLD correctly. This version works because we call waitpid within a loop, fetching the status of any of our children that have terminated, with the WNOHANG option, which tells waitpid not to block if there are running children that have not yet terminated. We cannot call wait in a loop, because there is no way to prevent wait from blocking if there are running children that have not yet terminated.

    #include "unp.h"

    void
    sig_chld(int signo)
    {
        pid_t   pid;
        int     stat;

        while ( (pid = waitpid(-1, &stat, WNOHANG)) > 0)
            printf("child %d terminated\n", pid);
        return;
    }

The code below shows the final version of our server. It correctly handles a return of EINTR from accept and it establishes a signal handler (code above) that calls waitpid for all terminated children.

    #include "unp.h"

    int
    main(int argc, char **argv)
    {
        int                 listenfd, connfd;
        pid_t               childpid;
        socklen_t           clilen;
        struct sockaddr_in  cliaddr, servaddr;
        void                sig_chld(int);

        listenfd = Socket(AF_INET, SOCK_STREAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family      = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port        = htons(SERV_PORT);

        Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));

        Listen(listenfd, LISTENQ);

        Signal(SIGCHLD, sig_chld);      /* must call waitpid() */

        for ( ; ; ) {
            clilen = sizeof(cliaddr);
            if ( (connfd = accept(listenfd, (SA *) &cliaddr, &clilen)) < 0) {
                if (errno == EINTR)
                    continue;           /* back to for() */
                else
                    err_sys("accept error");
            }

            if ( (childpid = Fork()) == 0) {    /* child process */
                Close(listenfd);    /* close listening socket */
                str_echo(connfd);   /* process the request */
                exit(0);
            }
            Close(connfd);          /* parent closes connected socket */
        }
    }
The purpose of this section has been to demonstrate three scenarios that we can encounter with network programming:
- We must catch the SIGCHLD signal when forking child processes.
- We must handle interrupted system calls when we catch signals.
- A SIGCHLD handler must be coded correctly using waitpid to prevent any zombies from being left around.
Connection Abort before accept
Returns¶
There is another condition similar to the interrupted system call that can cause accept to return a nonfatal error, in which case we should just call accept again. The sequence of packets shown below has been seen on busy servers (typically busy Web servers), where the server receives an RST for an ESTABLISHED connection before accept is called.
The three-way handshake completes, the connection is established, and then the client TCP sends an RST (reset). On the server side, the connection is queued by its TCP, waiting for the server process to call accept when the RST arrives. Sometime later, the server process calls accept.
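A server can treat this case the same way it treats EINTR. The sketch below is an illustration under assumptions rather than the book's code: accept_retry and accept_error_is_transient are hypothetical names, and ECONNABORTED is used because POSIX documents it as the accept error for a connection that was aborted; what an implementation actually returns in this scenario can differ.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Returns 1 for accept errors that only mean "call accept again". */
int accept_error_is_transient(int err)
{
    return err == EINTR || err == ECONNABORTED;
}

/* Hypothetical wrapper: call accept until it succeeds or fails with an
   error that is not transient. Returns the connected descriptor, or -1
   with errno set by accept. */
int accept_retry(int listenfd)
{
    for ( ; ; ) {
        int connfd = accept(listenfd, NULL, NULL);

        if (connfd >= 0)
            return connfd;
        if (!accept_error_is_transient(errno))
            return -1;
        /* interrupted or aborted connection: back to accept */
    }
}
```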
An easy way to simulate this scenario is to start the server, have it call socket, bind, and listen, and then go to sleep for a short period of time before calling accept. While the server process is asleep, start the client and have it call socket and connect. As soon as connect returns, the client sets the SO_LINGER socket option so that closing the socket generates an RST, and then terminates.
Termination of Server Process¶
We will now start our client/server and then kill the server child process, which simulates the crashing of the server process. We must be careful to distinguish between the crashing of the server process and the crashing of the server host.
The following steps take place:
1. We start the server and client and type one line to the client to verify that all is okay. That line is echoed normally by the server child.
2. We find the process ID of the server child and kill it. As part of process termination, all open descriptors in the child are closed. This causes a FIN to be sent to the client, and the client TCP responds with an ACK. This is the first half of the TCP connection termination.
3. The SIGCHLD signal is sent to the server parent and handled correctly.
4. Nothing happens at the client. The client TCP receives the FIN from the server TCP and responds with an ACK, but the problem is that the client process is blocked in the call to fgets waiting for a line from the terminal.
5. Running netstat at this point shows the state of the sockets.

        linux % netstat -a | grep 9877
        tcp        0      0 *:9877              *:*                 LISTEN
        tcp        0      0 localhost:9877      localhost:43604     FIN_WAIT2
        tcp        1      0 localhost:43604     localhost:9877      CLOSE_WAIT

6. We can still type a line of input to the client. Here is what happens at the client starting from Step 1:

        linux % tcpcli01 127.0.0.1      # start client
        hello                           # the first line that we type
        hello                           # is echoed correctly
                                        # we kill the server child on the server host
        another line                    # we then type a second line to the client
        str_cli: server terminated prematurely

   When we type "another line," str_cli calls writen and the client TCP sends the data to the server. This is allowed by TCP because the receipt of the FIN by the client TCP only indicates that the server process has closed its end of the connection and will not be sending any more data. The receipt of the FIN does not tell the client TCP that the server process has terminated (which in this case, it has). When the server TCP receives the data from the client, it responds with an RST since the process that had that socket open has terminated. We can verify that the RST was sent by watching the packets with tcpdump.
7. The client process will not see the RST because it calls readline immediately after the call to writen, and readline returns 0 (EOF) immediately because of the FIN that was received in Step 2. Our client is not expecting to receive an EOF at this point (str_cli), so it quits with the error message "server terminated prematurely."
8. When the client terminates (by calling err_quit in str_cli), all its open descriptors are closed.

The outcome described in Step 7 depends on timing:
- If the readline happens before the RST is received (as shown in this example), the result is an unexpected EOF in the client.
- If the RST arrives first, the result is an ECONNRESET ("Connection reset by peer") error return from readline.
The problem in this example is that the client is blocked in the call to fgets when the FIN arrives on the socket. The client is really working with two descriptors, the socket and the user input. Instead of blocking on input from only one of the two sources, it should block on input from either source. Indeed, this is one purpose of the select and poll functions described in Chapter 6.
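As a preview of that idea, here is a minimal sketch of blocking on two descriptors at once with select. It is not the Chapter 6 version of str_cli; wait_for_either is a hypothetical helper, and it only reports which descriptor became readable. An end-of-file condition, such as the one the server's FIN produces on the socket, also counts as readable.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: block until fd_a or fd_b is readable (data or
   end-of-file). Returns the readable descriptor, preferring fd_a when
   both are ready, or -1 if select fails. */
int wait_for_either(int fd_a, int fd_b)
{
    fd_set rset;
    int maxfd = fd_a > fd_b ? fd_a : fd_b;

    FD_ZERO(&rset);
    FD_SET(fd_a, &rset);
    FD_SET(fd_b, &rset);

    if (select(maxfd + 1, &rset, NULL, NULL, NULL) < 0)
        return -1;
    return FD_ISSET(fd_a, &rset) ? fd_a : fd_b;
}
```

A client built this way would pass the descriptor for standard input and the socket, and would learn about the server's FIN even while the user is idle.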
SIGPIPE Signal¶
The rules are:
- When a process writes to a socket that has received an RST, the SIGPIPE signal is sent to the process. The default action of this signal is to terminate the process, so the process must catch the signal to avoid being involuntarily terminated.
- If the process either catches the signal and returns from the signal handler, or ignores the signal, the write operation returns EPIPE.
We can simulate this from the client by performing two writes to the server (which has already sent a FIN to the client) before reading anything back. Two writes are needed because the first write elicits the RST from the server TCP, and only the second write, made after the RST has arrived, generates the signal. It is okay to write to a socket that has received a FIN, but it is an error to write to a socket that has received an RST.
We modify our client as below:
    #include "unp.h"

    void
    str_cli(FILE *fp, int sockfd)
    {
        char sendline[MAXLINE], recvline[MAXLINE];

        while (Fgets(sendline, MAXLINE, fp) != NULL) {
            Writen(sockfd, sendline, 1);
            sleep(1);
            Writen(sockfd, sendline+1, strlen(sendline)-1);

            if (Readline(sockfd, recvline, MAXLINE) == 0)
                err_quit("str_cli: server terminated prematurely");

            Fputs(recvline, stdout);
        }
    }
writen is called two times, with a one-second sleep in between so that the RST has time to arrive. The intent is for the first writen to elicit the RST and then for the second writen to generate SIGPIPE.
Run the program on the Linux host:
    linux % tcpcli11 127.0.0.1
    hi there            # we type this line
    hi there            # this is echoed by the server
                        # here we kill the server child
    bye                 # then we type this line
    Broken pipe         # this is printed by the shell
We start the client, type in one line, see that line echoed correctly, and then terminate the server child on the server host. We then type another line ("bye") and the shell tells us the process died with a SIGPIPE signal.
The recommended way to handle SIGPIPE depends on what the application wants to do when this occurs:
- If there is nothing special to do, then setting the signal disposition to SIG_IGN is easy, assuming that subsequent output operations will catch the error of EPIPE and terminate.
- If special actions are needed when the signal occurs (writing to a log file perhaps), then the signal should be caught and any desired actions can be performed in the signal handler.
- If multiple sockets are in use, the delivery of the signal will not tell us which socket encountered the error. If we need to know which write caused the error, then we must either ignore the signal or return from the signal handler and handle EPIPE from the write.
Crashing of Server Host¶
To simulate what happens when the server host crashes, we must run the client and server on different hosts. We then start the server, start the client, type in a line to the client to verify that the connection is up, disconnect the server host from the network, and type in another line at the client. This also covers the scenario of the server host being unreachable when the client sends data (i.e., some intermediate router goes down after the connection has been established).
The following steps take place:
- When the server host crashes (which means it is not shut down by an operator), nothing is sent out on the existing network connections.
- We type a line of input to the client, it is written by writen (str_cli), and is sent by the client TCP as a data segment. The client then blocks in the call to readline, waiting for the echoed reply.
- With tcpdump, we will see the client TCP continually retransmitting the data segment, trying to receive an ACK from the server. Berkeley-derived implementations retransmit the data segment 12 times, waiting for around 9 minutes before giving up. When the client TCP finally gives up (assuming the server host has not been rebooted during this time, or that the server host is still unreachable), an error is returned to the client process's readline. The error can be one of the following:
    - If the server host crashed and there were no responses at all to the client's data segments, the error is ETIMEDOUT.
    - If some intermediate router determined that the server host was unreachable and responded with an ICMP "destination unreachable" message, the error is either EHOSTUNREACH or ENETUNREACH.
To detect that the peer is down or unreachable sooner than 9 minutes, we can place a timeout on the call to readline, which is discussed in Chapter 14.
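One possible way to bound the wait is to call select with a timeout before reading. The sketch below is only an illustration of that idea, not the technique as Chapter 14 presents it; read_with_timeout is a hypothetical helper, and reporting the timeout as ETIMEDOUT is a choice made here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: like read, but give up if nothing becomes readable
   within timeout_ms milliseconds. On timeout it returns -1 with errno set
   to ETIMEDOUT; otherwise it returns whatever read returns. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int ready;

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(fd + 1, &rset, NULL, NULL, &tv);
    if (ready < 0)
        return -1;              /* select failed, errno is already set */
    if (ready == 0) {
        errno = ETIMEDOUT;      /* nothing arrived in time */
        return -1;
    }
    return read(fd, buf, len);
}
```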
This example detects that the server host has crashed only when we send data to that host. If we want to detect the crashing of the server host even if we are not actively sending it data, another technique is required: the SO_KEEPALIVE socket option (Chapter 7).
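Turning the option on is a single setsockopt call. A minimal sketch with hypothetical helper names (enable_keepalive, keepalive_enabled); how often probes are sent and how long it takes to declare the peer dead are system settings that this sketch does not touch:

```c
#include <sys/socket.h>

/* Enable keep-alive probes on a socket. Returns 0 on success, -1 on error. */
int enable_keepalive(int sockfd)
{
    int on = 1;

    return setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Returns 1 if SO_KEEPALIVE is set, 0 if it is not, -1 on error. */
int keepalive_enabled(int sockfd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val != 0;
}
```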
Crashing and Rebooting of Server Host¶
In the following example, we will establish a connection between the client and server and then assume the server host crashes and reboots. The easiest way to simulate this is to establish the connection, disconnect the server from the network, shut down the server host and then reboot it, and then reconnect the server host to the network. We do not want the client to see the server host shut down.
As stated in the previous section, if the client is not actively sending data to the server when the server host crashes, the client is not aware that the server host has crashed. The following steps take place:
- We start the server and then the client. We type a line to verify that the connection is established.
- The server host crashes and reboots.
- We type a line of input to the client, which is sent as a TCP data segment to the server host.
- When the server host reboots after crashing, its TCP loses all information about connections that existed before the crash. Therefore, the server TCP responds to the received data segment from the client with an RST.
- Our client is blocked in the call to readline when the RST is received, causing readline to return the error ECONNRESET.
If it is important for our client to detect the crashing of the server host, even if the client is not actively sending data, then some other technique, such as the SO_KEEPALIVE socket option or some client/server heartbeat function, is required.
Shutdown of Server Host¶
This section discusses what happens if the server host is shut down by an operator while our server process is running on that host.
When a Unix system is shut down, the following steps happen:
- The init process normally sends the SIGTERM signal to all processes (we can catch this signal).
- The init process waits some fixed amount of time (often between 5 and 20 seconds).
- The init process sends the SIGKILL signal (which we cannot catch) to any processes still running.
This gives all running processes a short amount of time to clean up and terminate. When our server process terminates, all its open descriptors are closed (the sequence of steps is the same as in Termination of Server Process). We must use the select or poll function in our client to have the client detect the termination of the server process as soon as it occurs.
Summary of TCP Example¶
Before any TCP client and server can communicate with each other, each end must specify the socket pair for the connection: the local IP address, local port, foreign IP address, and foreign port. These four values are shown as bullets in the two figures below.
Client's perspective¶
- connect. The foreign IP address and foreign port must be specified by the client in the call to connect. The two local values are normally chosen by the kernel as part of the connect function.
- bind. The client has the option of specifying either or both of the local values, by calling bind before connect, but this is not common.
- getsockname. The client can obtain the two local values chosen by the kernel by calling getsockname after the connection is established.
Server's perspective¶
- bind. The local port (the server's well-known port) is specified by bind. Normally, the server also specifies the wildcard IP address in this call.
- getsockname. If the server binds the wildcard IP address on a multihomed host, it can determine the local IP address by calling getsockname after the connection is established.
- accept. The two foreign values are returned to the server by accept.
- getpeername. If another program is exec'ed by the server that calls accept, that program can call getpeername to determine the client's IP address and port, if necessary.
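Both lookups follow the same pattern, so they can share one small helper. The sketch below is hypothetical (endpoint_port is not from the book) and handles IPv4 sockets only; it returns the port of either end of a socket in host byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: return the port, in host byte order, of the local
   end (peer == 0) or the foreign end (peer != 0) of an IPv4 socket.
   Returns -1 on error or if the socket is not an IPv4 socket. */
int endpoint_port(int sockfd, int peer)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int rc;

    rc = peer ? getpeername(sockfd, (struct sockaddr *)&addr, &len)
              : getsockname(sockfd, (struct sockaddr *)&addr, &len);
    if (rc < 0 || addr.sin_family != AF_INET)
        return -1;
    return ntohs(addr.sin_port);
}
```

On a connected pair, the client's local port equals the server's foreign port and the reverse also holds, which is the socket pair seen from both ends.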
Data Format¶
Normally we must worry about the format of the data exchanged between the client and server.
Example: Passing Text Strings between Client and Server¶
We modify our server so that it still reads a line of text from the client, but the server now expects that line to contain two integers separated by white space, and the server returns the sum of those two integers. All that changes is our str_echo function:
    #include "unp.h"

    void
    str_echo(int sockfd)
    {
        long arg1, arg2;
        ssize_t n;
        char line[MAXLINE];

        for ( ; ; ) {
            if ( (n = Readline(sockfd, line, MAXLINE)) == 0)
                return; /* connection closed by other end */

            if (sscanf(line, "%ld%ld", &arg1, &arg2) == 2)
                snprintf(line, sizeof(line), "%ld\n", arg1 + arg2);
            else
                snprintf(line, sizeof(line), "input error\n");

            n = strlen(line);
            Writen(sockfd, line, n);
        }
    }
We call sscanf to convert the two arguments from text strings to long integers, and then snprintf is called to convert the result into a text string.
This modified client and server work fine, regardless of the byte ordering of the client and server hosts.
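The conversion step can be tried on its own, without a socket. The sketch below pulls the sscanf and snprintf calls of str_echo out into a standalone function; sum_line is a hypothetical name used only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: parse two integers from a line of text and write
   their sum, or an error message, into out as a line of text. */
void sum_line(const char *line, char *out, size_t outlen)
{
    long arg1, arg2;

    if (sscanf(line, "%ld%ld", &arg1, &arg2) == 2)
        snprintf(out, outlen, "%ld\n", arg1 + arg2);
    else
        snprintf(out, outlen, "input error\n");
}
```

Since both the request and the reply are plain text, nothing in this exchange depends on the byte order or integer size of either host.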
Example: Passing Binary Structures between Client and Server¶
We now modify our client and server to pass binary values across the socket, instead of text strings. We will see that this does not work when the client and server are run on hosts with different byte orders, or on hosts that do not agree on the size of a long integer.
We define one structure for the two arguments, another structure for the result, and place both definitions in our sum.h header. The modified str_cli and str_echo functions are shown below.
    #include "unp.h"
    #include "sum.h"

    void
    str_cli(FILE *fp, int sockfd)
    {
        char sendline[MAXLINE];
        struct args args;
        struct result result;

        while (Fgets(sendline, MAXLINE, fp) != NULL) {
            if (sscanf(sendline, "%ld%ld", &args.arg1, &args.arg2) != 2) {
                printf("invalid input: %s", sendline);
                continue;
            }
            Writen(sockfd, &args, sizeof(args));

            if (Readn(sockfd, &result, sizeof(result)) == 0)
                err_quit("str_cli: server terminated prematurely");

            printf("%ld\n", result.sum);
        }
    }
sscanf converts the two arguments from text strings to binary, and we call writen to send the structure to the server. We call readn to read the reply, and print the result using printf.
    #include "unp.h"
    #include "sum.h"

    void
    str_echo(int sockfd)
    {
        ssize_t n;
        struct args args;
        struct result result;

        for ( ; ; ) {
            if ( (n = Readn(sockfd, &args, sizeof(args))) == 0)
                return; /* connection closed by other end */

            result.sum = args.arg1 + args.arg2;
            Writen(sockfd, &result, sizeof(result));
        }
    }
We read the arguments by calling readn, calculate and store the sum, and call writen to send back the result structure.
If we run the client and server on two machines of the same architecture, say two SPARC machines, everything works fine. But when the client and server are on two machines of different architectures (say the server is on the big-endian SPARC system freebsd and the client is on the little-endian Intel system linux), it does not work.

    linux % tcpcli09 206.168.112.96
    1 2                 # we type this
    3                   # and it works
    -22 -77             # then we type this
    -16777314           # and it does not work
The problem is that the two binary integers are sent across the socket in little-endian format by the client, but interpreted as big-endian integers by the server. It appears to work for positive integers but fails for negative integers. There are really three potential problems:
- Different implementations store binary numbers in different formats. The most common formats are big-endian and little-endian, as we described in Section 3.4.
- Different implementations can store the same C datatype differently. For example, most 32-bit Unix systems use 32 bits for a long but 64-bit systems typically use 64 bits for the same datatype. There is no guarantee that a short, int, or long is of any certain size.
- Different implementations pack structures differently, depending on the number of bits used for the various datatypes and the alignment restrictions of the machine. Therefore, it is never wise to send binary structures across a socket.
There are two common solutions to this data format problem:
- Pass all numeric data as text strings.
- Explicitly define the binary formats of the supported datatypes (number of bits, big- or little-endian) and pass all data between the client and server in this format. RPC packages normally use this technique. RFC 1832 describes the External Data Representation (XDR) standard that is used with the Sun RPC package.
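A small instance of the second solution: fix the format of the two arguments as two 32-bit integers in network (big-endian) byte order, eight bytes in total, and convert on both sides. This is a sketch of the idea only, not XDR and not the book's code; the format, ARGS_WIRE_SIZE, encode_args, and decode_args are assumptions made for the illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define ARGS_WIRE_SIZE 8    /* two 32-bit integers, big-endian */

/* Encode two arguments into the fixed 8-byte format. */
void encode_args(int32_t arg1, int32_t arg2,
                 unsigned char out[ARGS_WIRE_SIZE])
{
    uint32_t n1 = htonl((uint32_t)arg1);
    uint32_t n2 = htonl((uint32_t)arg2);

    memcpy(out, &n1, 4);
    memcpy(out + 4, &n2, 4);
}

/* Decode the fixed 8-byte format back into two host integers. */
void decode_args(const unsigned char in[ARGS_WIRE_SIZE],
                 int32_t *arg1, int32_t *arg2)
{
    uint32_t n1, n2;

    memcpy(&n1, in, 4);
    memcpy(&n2, in + 4, 4);
    n1 = ntohl(n1);
    n2 = ntohl(n2);
    memcpy(arg1, &n1, 4);   /* copy the bits back into signed storage */
    memcpy(arg2, &n2, 4);
}
```

The sender would pass the 8-byte buffer to writen and the receiver would fill it with readn, so neither the byte order, the size of long, nor structure padding on either host affects what is on the wire.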
Summary¶
- The first problem was zombie children and we caught the SIGCHLD signal to handle this. Our signal handler then called waitpid and we must call this function instead of the older wait function, since Unix signals are not queued.
- The next problem we encountered was the client not being notified when the server process terminated. We saw that our client's TCP was notified, but we did not receive that notification since we were blocked, waiting for user input. We will use the select or poll function in Chapter 6 to handle this scenario, by waiting for any one of multiple descriptors to be ready, instead of blocking on a single descriptor.
- If the server host crashes, we do not detect this until the client sends data to the server. Some applications must be made aware of this fact sooner; we will look at the SO_KEEPALIVE socket option in Chapter 7.