Instrumenting All Function Calls
References
- Trace and profile function calls with GCC, Freedom Embedded.
- Monitoring Function Calls, By Aurelian Melinte.
- The addr2line source code, GitHub.
- Binary File Descriptor Library, Wikipedia.
- LIB BFD, the Binary File Descriptor Library.
- Linux Applications Debugging Techniques, Aurelian Melinte, see page 20, chapter "The Interposition Library".
- Trace and profile function calls fast, JehTech code.
Why Instrument All Function Calls
I had a complex program test that failed. The execution path is long, and I wanted to see exactly what path led up to the failure.
Small Modification of Freedom Embedded's Article
I was Googling for a way to do this and found Freedom Embedded's article [Ref]. It almost worked straight out of the box for me, which was good.
The problem was that my program was relocated, so the addresses it recorded were not the same as in the object files, because Linux applies a randomised load offset to the program when it loads it.
To get around this I modified his constructor function to print the address of main(), so that the first line of trace.out had the address of a known function. By looking up main in the objdump of the executable I could then figure out the load offset and subtract it from all the addresses in trace.out.
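The arithmetic is small enough to sketch in C. This is only an illustration: the function names load_offset and to_file_address are made up for this example, and the addresses in the usage note are invented values, not taken from a real run.

```c
#include <assert.h>
#include <stdint.h>

/* Load offset = address of main() as recorded at run time (first line of
 * trace.out) minus the address of main() shown by objdump for the file. */
uint64_t load_offset(uint64_t runtime_main, uint64_t file_main)
{
    assert(runtime_main >= file_main);
    return runtime_main - file_main;
}

/* Map an address recorded in trace.out back to the address that objdump
 * and addr2line expect for the executable file. */
uint64_t to_file_address(uint64_t traced, uint64_t offset)
{
    assert(traced >= offset);
    return traced - offset;
}
```

For instance, if trace.out reported main() at 0x555555555139 and objdump listed it at 0x1139, the offset would be 0x555555554000, and a traced address of 0x555555555200 would map back to 0x1200.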
I also borrowed from A. Melinte's article and book [Ref][Ref] and added the thread ID to the output.
The resulting C is this:
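// See https://balau82.wordpress.com/2010/10/06/trace-and-profile-function-calls-with-gcc/
// This is a derivative of the above article.
#if defined(__GNUC__) || defined(__GNUG__)

#include <stdio.h>
#include <pthread.h>

#ifdef __cplusplus
extern "C" {
#endif

static FILE *fp_trace;

// The hooks are marked no_instrument_function so that they are not
// instrumented themselves if this file is built with -finstrument-functions.

// Runs before main(): open the trace file and record where main() was loaded.
void __attribute__((constructor, no_instrument_function)) trace_begin(void)
{
    fp_trace = fopen("trace.out", "w");
    if (fp_trace != NULL) {
        extern int main(int, char **);
        fprintf(fp_trace, "ADDRESS_OF_MAIN %p\n", (void *)main);
    }
}

// Runs after main() returns: close the trace file.
void __attribute__((destructor, no_instrument_function)) trace_end(void)
{
    if (fp_trace != NULL) {
        fclose(fp_trace);
    }
}

// Called on entry to every instrumented function.
// The cast assumes pthread_t is an integer or pointer type, as it is on Linux.
void __attribute__((no_instrument_function)) __cyg_profile_func_enter(void *func, void *caller)
{
    if (fp_trace != NULL) {
        const pthread_t self = pthread_self();
        fprintf(fp_trace, "e %p %p %p\n", (void *)self, func, caller);
    }
}

// Called on exit from every instrumented function.
void __attribute__((no_instrument_function)) __cyg_profile_func_exit(void *func, void *caller)
{
    if (fp_trace != NULL) {
        const pthread_t self = pthread_self();
        fprintf(fp_trace, "x %p %p %p\n", (void *)self, func, caller);
    }
}

#ifdef __cplusplus
}
#endif

#endif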
The shell script became this:
#!/bin/bash
if test ! -f "$1"; then
    echo "Error: executable $1 does not exist."
    exit 1
fi
if test ! -f "$2"; then
    echo "Error: trace log $2 does not exist."
    exit 1
fi
EXECUTABLE="$1"
TRACELOG="$2"

# Address of main in the executable file (from objdump) and as loaded
# at run time (first line of the trace log), both as bare hex digits.
PMA_MAIN=$(objdump -d "$EXECUTABLE" | grep "<main>:" | cut -d" " -f1 | sed -e 's/^0*\([1-9][0-9]*\)/\1/g')
VMA_MAIN=$(head -1 "$TRACELOG" | cut -d" " -f2)
VMA_MAIN=${VMA_MAIN#0x}
VMA_OFFSET=$((16#$VMA_MAIN - 16#$PMA_MAIN))

INDENT=2
COUNT=0
while read LINETYPE TID FADDR CADDR; do
    # The first line holds the address of main(), not a trace record.
    if test "${LINETYPE}" = "ADDRESS_OF_MAIN"; then continue; fi

    # Strip the leading 0x, remove the load offset, and format as hex again.
    NEWFADDR="$((16#${FADDR#0x} - $VMA_OFFSET))"
    FADDR="$(printf "0x%x" "$NEWFADDR")"
    NEWCADDR="$((16#${CADDR#0x} - $VMA_OFFSET))"
    CADDR="$(printf "0x%x" "$NEWCADDR")"

    FNAME="$(addr2line -f -e ${EXECUTABLE} ${FADDR} | head -1)"
    if [ "$FNAME" == "??" ]; then FNAME=$FADDR; fi

    if test "${LINETYPE}" = "e"
    then
        CNAME="$(addr2line -f -e ${EXECUTABLE} ${CADDR} | head -1)"
        CLINE="$(addr2line -s -e ${EXECUTABLE} ${CADDR})"
        SPACES="$(printf "%0.s " $(seq 1 $COUNT))"
        echo "${TID}: ${SPACES}Enter ${FNAME}, called from ${CNAME} (${CLINE})"
        COUNT=$((COUNT + INDENT))
    fi
    if test "${LINETYPE}" = "x"
    then
        COUNT=$((COUNT - INDENT))
        SPACES="$(printf "%0.s " $(seq 1 $COUNT))"
        echo "${TID}: ${SPACES}Exit ${FNAME}"
    fi
done < "${TRACELOG}"
Remembering to compile with the options -O0 -g -finstrument-functions, it all worked like a charm. Thank you Balau and Melinte!
WARNING: This will produce a tonne of debug info. The file trace.out will potentially be many, many GIGABYTES of information.
NOTE: For anything but really small examples the shell script will be REALLY SLOW.
Parse Function Traces Efficiently
As noted, Balau's excellent example will run slowly. I liked his idea of just dumping addresses to a trace file, as it doesn't really require the program being traced to link against anything new... the code just slots in. I also liked Melinte's use of the thread ID, but didn't want to have all that extra code in my GCC trace functions, and I also didn't manage to find all his functions.
So... I decided to modify Balau's example in two ways. Firstly, to get it to output a binary trace format, to try, in vain it seems, to reduce the trace.out file size. Secondly, to replace the bash script, which has to spawn a new addr2line process for each line, with something that looks up the whole trace in one process, like addr2line, but for the trace output file.
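To give a feel for what a binary trace format might look like, here is a minimal sketch in C. The record layout and the names trace_record, trace_write and trace_read are assumptions made up for this illustration; they are not necessarily what the code in the repo uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed-size trace record (an assumption for illustration).
 * Every field is 64 bits wide so that, on common ABIs, the struct has no
 * padding bytes to leak into the file. */
struct trace_record {
    uint64_t type;      /* 'e' = function entry, 'x' = function exit */
    uint64_t thread_id; /* value of pthread_self() */
    uint64_t func;      /* address of the function entered or exited */
    uint64_t caller;    /* address of the call site */
};

/* Append one record to the trace file. Returns 1 on success, 0 on failure. */
int trace_write(FILE *fp, char type, uint64_t tid, uint64_t func, uint64_t caller)
{
    struct trace_record rec = { (uint64_t)type, tid, func, caller };

    assert(type == 'e' || type == 'x');
    return fwrite(&rec, sizeof rec, 1, fp) == 1;
}

/* Read the next record. Returns 1 on success, 0 at end of file or on error. */
int trace_read(FILE *fp, struct trace_record *rec)
{
    return fread(rec, sizeof *rec, 1, fp) == 1;
}
```

A reader tool would call trace_read in a loop and resolve each address inside the same process, instead of starting addr2line once per line. A file written this way is tied to the byte order and type sizes of the machine that produced it.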
The result was a fairly lengthy investigation into how addr2line works and into the BFD library, and the outcome is this GitHub repo. Enjoy :)