
Using GDB

Using GDB on Windows

I know many people have written about using GDB but, who knows, I might write something useful to you!

Attaching to a Running Process

So, you have a process on a machine (perhaps your pre-deployment machine) and it is doing something silly. You need to break into it and see where it is blocked. Here you will see an actual problem I had and how I tried to solve it.

Step 1: get MinGW (I am using MinGW-64) onto the machine where the server process is running, start gdb and attach to the running process.

(gdb) attach 121832
Attaching to process 121832
[New Thread 121832.0x1dbec]
[New Thread 121832.0x1dbf0]
[New Thread 121832.0x20a8c]
Reading symbols from C:\Users\Administrator\Desktop\clusterclient\bin\clusterclient.exe...done.
0x0000000077510591 in ntdll!DbgBreakPoint () from C:\Windows\SYSTEM32\ntdll.dll
(gdb)

121832 is the PID, which you can find using ps or in the Windows Task Manager. I compiled symbols into my code, so everything is OK. If you did not, you can use symbol-file to load symbols from a different executable, provided it was compiled with debugging information from exactly the same source using the same compiler.

(gdb) info threads
  Id   Target Id         Frame
* 3    Thread 4132.0x1378 0x00000000771e0591 in ntdll!DbgBreakPoint ()
   from C:\Windows\SYSTEM32\ntdll.dll
  2    Thread 4132.0x114c 0x00000000771e12fa in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
  1    Thread 4132.0x107c 0x00000000771e12fa in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
(gdb) thread 1
[Switching to thread 1 (Thread 4132.0x107c)]
#0  0x00000000771e12fa in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
(gdb)

The programme is a communications programme and spends a lot of time waiting on a socket or a semaphore. This is frustrating because the stack trace does not show anything useful when queried with backtrace or bt (these commands are the same).

(gdb) bt
#0  0x00000000771e12fa in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
#1  0x000007fefc790f75 in WSPStartup () from C:\Windows\system32\mswsock.dll
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

To get out of the waiting function we can use step.

(gdb) step
Single stepping until exit from function ntdll!ZwWaitForSingleObject,
which has no line number information.
[Thread 2208.0xa30 exited with code 0]
0x000007fefc790f75 in WSPStartup () from C:\Windows\system32\mswsock.dll
(gdb) step
Single stepping until exit from function WSPStartup,
which has no line number information.
0x000007fefeb64efc in select () from C:\Windows\system32\ws2_32.dll
(gdb) backtrace
#0  0x000007fefeb64efc in select () from C:\Windows\system32\ws2_32.dll
#1  0x000007fefeb64e7d in select () from C:\Windows\system32\ws2_32.dll
#2  0x00000000004078ec in wait_for_data (sock=100, timeout=30000)
    at src/socketsutility.c:21
#3  0x000000000040793f in recv_fill_buf (sock=100, buf=0x22fd60, len=16,
    bytesRead=0x22fd5c, timeout=30000) at src/socketsutility.c:41
#4  0x0000000000405136 in packet_recv (ctx=0x5d5ab0, timeout=30000)
    at src/application.c:68
#5  0x0000000000403bfc in main (argc=1, argv=0x5d5f60) at src/main.c:201
(gdb)

Now you can see that I have returned to select(), which is the call I made to wait for the socket, and the backtrace works properly. Ordinarily you might expect to use finish to get back. Unfortunately finish needs GDB to be able to find the calling frame, which it cannot always do without debugging information, so it will sometimes return an error:

(gdb) thread 1
[Switching to thread 1 (Thread 2208.0x1244)]
#0  0x00000000771e12fa in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
(gdb) finish
Run till exit from #0  0x00000000771e12fa in ntdll!ZwWaitForSingleObject ()
   from C:\Windows\SYSTEM32\ntdll.dll
[Thread 2208.0xd10 exited with code 0]
0x000007fefc790f75 in WSPStartup () from C:\Windows\system32\mswsock.dll
(gdb) finish
"finish" not meaningful in the outermost frame.
(gdb) step
Single stepping until exit from function WSPStartup,
which has no line number information.
0x000007fefeb64efc in select () from C:\Windows\system32\ws2_32.dll
(gdb)

From my backtrace I can see where execution currently is, and I can look at my code to see what is wrong. Firstly, looking at wait_for_data():

static int wait_for_data(SOCKET sock, uint32_t timeout) {
	struct timeval s_timeout;
	fd_set socks;
	int readsocks;

	if (timeout == 0) {
		timeout = 5*1000;
	}

	s_timeout.tv_sec = timeout / 1000;
	s_timeout.tv_usec = (timeout % 1000) * 1000;

	FD_ZERO(&socks);
	FD_SET(sock, &socks);

	readsocks = select(sock+1, &socks, (fd_set *)0, (fd_set *)0, (PTIMEVAL)&s_timeout);

	if (readsocks == 0) {
		return -1;
	}
	return 0;
}

Well, that is simple, I cannot see any problems there... Let's step until we come back to the calling function, recv_fill_buf(). You can do this in several ways; the best is probably to set a breakpoint. This is easy if you have your source file: just type break socketsutility.c:42, since we know that line 42 is the next line after the call to wait_for_data(). Naturally it would be nice to know the value of res.

(gdb) break socketsutility.c:42
Breakpoint 1 at 0x407942: file src/socketsutility.c, line 42.
(gdb) cont
Continuing.
[Thread 5340.0x169c exited with code 0]

Breakpoint 1, recv_fill_buf (sock=100, buf=0x22fd60, len=16,
    bytesRead=0x22fd5c, timeout=30000) at src/socketsutility.c:42
42                      if (res == -1) {
(gdb) print res
$2 = 1
(gdb)

Or you could do it the long way round. Please note that you should try to use finish rather than step if you are not in your own code: GDB locked up on my machine when I tried step instead of finish.

(gdb) step
Single stepping until exit from function ntdll!ZwWaitForSingleObject,
which has no line number information.
[Thread 5340.0x1518 exited with code 0]
0x000007fefc790f75 in WSPStartup () from C:\Windows\system32\mswsock.dll
(gdb) step
Single stepping until exit from function WSPStartup,
which has no line number information.
0x000007fefeb64efc in select () from C:\Windows\system32\ws2_32.dll
(gdb) finish <-- if you use step here gdb seems to crash
Run till exit from #0  0x000007fefeb64efc in select ()
   from C:\Windows\system32\ws2_32.dll
0x000007fefeb64e7d in select () from C:\Windows\system32\ws2_32.dll
(gdb) finish
Run till exit from #0  0x000007fefeb64e7d in select ()
   from C:\Windows\system32\ws2_32.dll
0x00000000004078ec in wait_for_data (sock=100, timeout=30000)
    at src/socketsutility.c:21
21              readsocks = select(sock+1, &socks, (fd_set *)0, (fd_set *)0, (PTIMEVAL)&s_timeout);
(gdb) finish
Run till exit from #0  0x00000000004078ec in wait_for_data (sock=100,
    timeout=30000) at src/socketsutility.c:21
0x000000000040793f in recv_fill_buf (sock=100, buf=0x22fd60, len=16,
    bytesRead=0x22fd5c, timeout=30000) at src/socketsutility.c:41
41                      res = wait_for_data(sock, timeout);
Value returned is $1 = 1
(gdb) step
42                      if (res == -1) {
(gdb)
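One thing worth noting before moving on: wait_for_data() only tests for a select() result of 0, so a select() failure is reported to the caller in the same way as "data is ready". The sketch below is one possible way to keep the three outcomes apart. It is not the code from my programme: it is written against POSIX sockets so it can be tried on a non-Windows machine, and the names wait_result, ms_to_timeval and wait_for_data_checked are made up for illustration. On Winsock the socket type would be SOCKET and the first argument to select() is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical result codes; the original function only returns 0 or -1. */
enum wait_result { WAIT_READY = 0, WAIT_TIMEOUT = -1, WAIT_ERROR = -2 };

/* Split a millisecond timeout into whole seconds and leftover microseconds. */
struct timeval ms_to_timeval(uint32_t timeout_ms) {
	struct timeval tv;
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;
	return tv;
}

/* Like wait_for_data(), but a select() failure gets its own result code. */
enum wait_result wait_for_data_checked(int sock, uint32_t timeout_ms) {
	/* Same default as the original: 0 means wait five seconds. */
	struct timeval tv = ms_to_timeval(timeout_ms == 0 ? 5 * 1000 : timeout_ms);
	fd_set socks;
	int ready;

	assert(sock >= 0 && sock < FD_SETSIZE);
	FD_ZERO(&socks);
	FD_SET(sock, &socks);

	ready = select(sock + 1, &socks, NULL, NULL, &tv);
	if (ready < 0) {
		return WAIT_ERROR;   /* select() itself failed */
	}
	if (ready == 0) {
		return WAIT_TIMEOUT; /* nothing readable before the timeout */
	}
	return WAIT_READY;       /* readable: data, or the end of the stream */
}
```

Note that WAIT_READY also covers a socket whose peer has closed the connection, because select() reports such a socket as readable. The caller still has to look at what recv() returns.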

The source for the recv_fill_buf() function.

int recv_fill_buf(SOCKET sock, void * buf, int len, int * bytesRead, uint32_t timeout) {
	int res;
	int bytesReceived;
	int totalBytesReceived = 0;

	while (totalBytesReceived < len) {
		res = wait_for_data(sock, timeout);
		if (res == -1) { /* <-- line number 42 */
			res = 0;
			break;
		}
		bytesReceived = recv(sock, (char *)((PBYTE)buf) + totalBytesReceived, len - totalBytesReceived, 0);
		if (bytesReceived > 0) {
			totalBytesReceived += bytesReceived;
		}
		if (bytesReceived < 0) {
			res = -1;
			break;
		}
	}
	*bytesRead = totalBytesReceived;
	return res;
}

I am sure you have noticed the error already (I am a plonker, and I noticed it immediately too). On a Linux machine this might not have shown up as an infinite loop, because a SIGPIPE can be sent to the process when it next writes to the closed socket, but recv() returns 0 there as well and this is no excuse for shoddy code. Let's follow it through anyway...

(gdb) print res
$1 = 0
(gdb) step
46                      bytesReceived = recv(sock, (char *)((PBYTE)buf) + totalBytesReceived, len - totalBytesReceived, 0);
(gdb) step
47                      if (bytesReceived > 0) {
(gdb) print bytesReceived
$2 = 0
(gdb) step
50                      if (bytesReceived < 0) {
(gdb) step
40              while (totalBytesReceived < len) {
(gdb) step
41                      res = wait_for_data(sock, timeout);
(gdb)

Reading the Windows documentation for recv() (search for it), you will notice a line under Return value that says:

If the connection has been gracefully closed, the return value is zero.

Did I check for a 0... NO! Fortunately it took many times longer to write this than it did to discover the bug, and now I know more about how to use GDB on Windows.
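For completeness, here is a minimal sketch of a receive loop that does check for 0. It is an illustration rather than the fix that went into my programme: it uses POSIX sockets so it can be tried away from Windows, it leaves out the wait_for_data() timeout to stay short, and the name recv_fill_buf_checked and the FILL_* return codes are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical return codes for this sketch. */
enum { FILL_OK = 0, FILL_FAILED = -1, FILL_CLOSED = -2 };

/*
 * Read until len bytes have arrived, the peer closes the connection,
 * or recv() fails. The number of bytes actually read is always stored
 * in *bytes_read, so a short read can still be inspected by the caller.
 */
int recv_fill_buf_checked(int sock, void *buf, size_t len, size_t *bytes_read) {
	size_t total = 0;
	int res = FILL_OK;

	assert(buf != NULL && bytes_read != NULL);
	while (total < len) {
		ssize_t got = recv(sock, (char *)buf + total, len - total, 0);
		if (got == 0) {      /* peer closed: no more data will ever come */
			res = FILL_CLOSED;
			break;
		}
		if (got < 0) {       /* recv() reported an error */
			res = FILL_FAILED;
			break;
		}
		total += (size_t)got;
	}
	*bytes_read = total;
	return res;
}
```

The important difference from the original is the got == 0 branch, which leaves the loop instead of going round again. Giving the closed connection its own return code is a design choice; folding it into the error case would stop the loop just as well.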
