C/C++ Programming Tools under Linux

Maciej Zawadziński
Article date: 2006-07-18 14:44:23

In this article I will attempt to familiarise readers with basic techniques and tools that assist the programmer in developing software, both simple and complex.

About the Author

Maciej Zawadziński is a student at the Faculty of Mathematics and Computer Science of the University of Wrocław, a programmer, an administrator and a fan of *BSD systems. He is a member of Akademia Alternatywnych Systemów Operacyjnych (Academy of Alternate Operating Systems; http://www.aaso.pl) and a co-founder of the student association AASOC (http://www.aasoc.pwr.wroc.pl). He develops software based on Open Source technologies and takes an active part in events in the IT sector.

Contact: mzawadzinski@gmail.com

A well-written program ought to, when an error occurs, give the programmer information detailed enough to locate and fix the problem quickly. Unfortunately, most applications fall well short of this ideal. What is more, it is sometimes very hard to determine where, and above all why, a problem occurs just by reading the source code and watching the program's behaviour. As a first step we can get help from two tools, strace and ltrace, which do not require us to modify the application in any way: we just run it. This is particularly useful when we don't have the source code but want to see what could have caused an error, or when we're simply curious about how our favourite program works.
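As an illustration of what "detailed information" could mean in practice, here is a minimal sketch of an error-reporting helper. The names format_error and REPORT_ERROR are hypothetical and do not come from the article's code; the sketch only assumes the standard C library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the article's code): build a failure
   report containing the source location, the failing operation and the
   textual meaning of the errno value. Returns what snprintf() returns. */
static int format_error(char *buf, size_t len, const char *file, int line,
                        const char *what, int err)
{
    return snprintf(buf, len, "%s:%d: %s failed: %s (errno=%d)",
                    file, line, what, strerror(err), err);
}

/* Hypothetical convenience macro: print the report for the current
   source location to stderr. errno is saved first, because the library
   calls used for reporting may overwrite it. */
#define REPORT_ERROR(what)                                        \
    do {                                                          \
        int saved_errno_ = errno;                                 \
        char msg_[256];                                           \
        format_error(msg_, sizeof msg_, __FILE__, __LINE__,       \
                     (what), saved_errno_);                       \
        fprintf(stderr, "%s\n", msg_);                            \
    } while (0)
```

A caller would write, for example, REPORT_ERROR("accept()") right after a failed call. Compared with a bare perror(), the message also says where in the source the failure was detected.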

Tracing Library Function Calls and System Calls

The purpose of strace and ltrace is to trace a running process. The former displays system calls, the latter calls to library functions. Both also report the signals the process received, the arguments each system call (or library function, respectively) was invoked with and, most importantly, the return values and what they mean.
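To make the distinction between the two kinds of calls concrete, consider the small sketch below. It assumes a typical Linux C library, where the library function fopen() ends up issuing an open-family system call. The function name try_open is made up for this example.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical example function: try to open a file through the C
   library. Returns 0 on success, or the errno value on failure. */
static int try_open(const char *path)
{
    errno = 0;
    FILE *f = fopen(path, "r");  /* library call: the level ltrace works at */
    if (f == NULL)
        return errno;            /* set by the failed system call underneath,
                                    which is the level strace works at */
    fclose(f);
    return 0;
}
```

Calling try_open() on a path that does not exist yields ENOENT, the same symbolic error name that strace prints next to a failed system call.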

To start with, we shall examine a simple server for the daytime service (RFC 867), listening on TCP port 13. Once a client has established a connection, the server sends the current date and time in plain text and closes the connection. Our program will make use of the system command date. The first version of the server can be found in Listing 1. Having compiled and started the program, we can see the service available on TCP port 13 (the port assigned to the daytime service in the port list maintained by IANA; more information at http://www.iana.org/). Let us try connecting to the service:

# telnet 127.0.0.1 daytime
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
accept(): Bad file descriptor
Connection closed by foreign host.

That certainly didn't look like the current date! What is more, the program returned something it shouldn't have. However, before we start poking around the source code to locate the bug, we will make use of strace. Execute the following command:

# strace -p 6277 -f

Then use another console to reconnect to the service, obtaining the same error message.

The option -f indicates that we also want to trace system calls made by child processes of the program under investigation, whereas -p [pid] specifies the identifier of the already-running process we want to attach to. The inset "The Command lsof" shows how one can quickly obtain the process identifier (PID) of a service knowing the port it listens on.

Listing 2 shows the output of strace. We are mainly interested in child processes, i.e. those which handle connecting clients. Note the fragment highlighted in red - this is the part responsible for the error.

Listing 1. Daytime.c

/* a simple daytime service server (RFC 867) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/socket.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <netinet/in.h>

#define DAYTIME_PORT 13

int main(void)
{
    int lsock, csock;
    socklen_t cliaddrlen;
    struct sockaddr_in servaddr, *cliaddr;

    /* create an IPv4/TCP socket */
    if( (lsock = socket(AF_INET, SOCK_STREAM, 0)) < 0 ) {
        perror("socket()");
        exit(-1);
    }
    /* initialise the data: assign the address and port */
    memset(&servaddr,0,sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(DAYTIME_PORT);

    if( bind(lsock, (struct sockaddr*)&servaddr, sizeof(servaddr)) < 0 ) {
        perror("bind()");
        exit(-1);
    }
    /* put the socket into listening mode */
    if ( listen(lsock, 128) < 0 ) {
        perror("listen()");
        exit(-1);
    }
    /* main loop: accept connections and hand them over to child processes */
    while( 1 ) {
        cliaddr = calloc(1, sizeof(struct sockaddr_in));
        cliaddrlen = sizeof(struct sockaddr_in);

        /* wait for incoming connections */
        if( (csock = accept(lsock,(struct sockaddr*)cliaddr, &cliaddrlen)) < 0) {
            perror("accept()");
            exit(-1);
        }
        if( fork() == 0 ) {
            /* child process: redirect standard input, standard output
               and standard error to the connected client */

            close(lsock);

            dup2(csock, STDIN_FILENO);
            close(csock);
            dup2(STDIN_FILENO, STDOUT_FILENO);
            dup2(STDIN_FILENO, STDERR_FILENO);

            /* run the external program; when it finishes, all descriptors
               are closed, including the client connection */

            execl("date", "date", 0);
        } else
            close(csock);
    }
    return 0;
}

The most important thing from our point of view is what happens to the child process that handled our client connection. Let us examine the results: the program did not execute date. The call to execve() returned an error (ENOENT), and the child carried on running the parent's code until it reached the call to accept(). By then descriptor 3 had already been closed in the child, so accept() failed with EBADF; the error message was written to stderr (in this case, the TCP connection to the client) and the process terminated.
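The accept() failure can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system; the function name accept_on_closed_socket is invented for this example and is not part of Listing 1.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demonstration: create a socket, close it, then call
   accept() on the stale descriptor, much as the child process in
   Listing 1 does after close(lsock). Returns the errno value set by
   accept(), 0 if accept() unexpectedly succeeded, or -1 if the socket
   could not be created. */
static int accept_on_closed_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    close(fd);                       /* the descriptor number is now invalid */

    errno = 0;
    int c = accept(fd, NULL, NULL);  /* called on a closed descriptor */
    if (c >= 0) {                    /* not expected in this scenario */
        close(c);
        return 0;
    }
    return errno;
}
```

In a single-threaded program the function is expected to return EBADF, the error that strace decodes as "Bad file descriptor".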

Once we correct the call to execl() and add code to handle its failure, the program runs properly. As it turns out, we passed an incorrect argument to execl(): the function does not search PATH, so the relative name "date" was not found. Furthermore, we did not anticipate the call returning an error, which left the child process alive and executing the code of its parent. The simplest solution is to put a call to exit() right after execl(). The corrected code could look like this:

...
            execl("/bin/date","date",0);
            exit(1);
        }
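The same pattern can be written as a small standalone function. This is only a sketch under a few assumptions: the name run_program and the exit status 127 are choices made for this example, and the terminating argument is written as (char *)NULL, the form the execl() manual page asks for.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` in a child process and
   return its exit status, or -1 if fork() or waitpid() fails or the
   child did not exit normally. If execl() fails, the child reports the
   error and terminates with status 127 instead of falling through into
   the caller's code. */
static int run_program(const char *path, const char *name)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execl(path, name, (char *)NULL);
        /* Reached only when execl() has failed. */
        perror("execl()");
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a path that does not exist, the child prints the execl() error and the function returns 127, so the caller can tell a failed launch from a successful run.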

This time the source code was very short, so we could have pinpointed the likely source of the problem immediately. In larger projects, however, hunting for such bugs by reading the source code alone may prove too tedious, or even practically impossible within a reasonable time.

Note that even though our corrected program works, it contains another serious, yet commonly ignored, bug. We'll get to that in a moment.

Listing 2. The output of strace

root@3[artykul]# strace -p 6277 -f
Process 6277 attached - interrupt to quit
accept(3, 0, NULL) = 4
clone(Process 6947 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7f8d708) = 6947
[pid 6277] close(4) = 0
[pid 6277] accept(3, <unfinished ...>
[pid 6947] close(3) = 0
[pid 6947] dup2(4, 0) = 0
[pid 6947] close(4) = 0
[pid 6947] dup2(0, 1) = 1
[pid 6947] dup2(0, 2) = 2
[pid 6947] execve("date", ["date"], [/* 31 vars */]) = -1 ENOENT (No such file or directory)
[pid 6947] accept(3, 0, NULL) = -1 EBADF (Bad file descriptor)
[pid 6947] dup(2) = 3
[pid 6947] fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
[pid 6947] brk(0) = 0x804a000
[pid 6947] brk(0x806b000) = 0x806b000
[pid 6947] fstat64(3, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
[pid 6947] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f9b000
[pid 6947] _llseek(3, 0, 0xbfb99c08, SEEK_CUR) = -1 ESPIPE (Illegal seek)
[pid 6947] write(3, "accept(): Bad file descriptor\n", 30) = 30
[pid 6947] close(3) = 0
[pid 6947] munmap(0xb7f9b000, 4096) = 0
[pid 6947] exit_group(-1) = ?
Process 6947 detached
<... accept resumed> 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
accept(3, <unfinished ...>
Process 6277 detached
root@3[artykul]#

Copyright © 2006 by Software Developer's Journal. All rights reserved.