cflow - C Language Function Call Relation Generator

Posted by swampone on Wed, 29 Dec 2021 08:43:28 +0100

Preface

When you are handed an unfamiliar C project and want to understand how the whole thing fits together, what method do you use?

Function calls

A common approach is to trace function calls to work out how the whole project runs, typically with Source Insight: start at main, see which functions it calls, then step into each of those in turn, until the overall structure gradually unfolds. I used this approach for a long time and think of it as sorting things out by sheer manual labor. It is time consuming; even a small project can take more than half a day to map out. So I often wondered whether a tool could automate it. The first one I found was calltree, which, as the name suggests, prints a tree of function calls. However, it has not been maintained for a long time, the available version is old, and it is not easy to use.

cflow

Later I found a better one: cflow. On Ubuntu it can be installed directly with apt:

sudo apt install cflow

cflow analyzes C source files and prints the function call tree; the -T option draws it as an ASCII tree. For example:

$ cflow -T log.c 
+-log_init() <void log_init (void) at log.c:193>
  +-InitializeCriticalSection()
  +-wget_console_init()
  +-wget_logger_set_func()
  +-wget_get_logger()
  +-write_debug_stderr() <void write_debug_stderr (const char *data, size_t len) at log.c:157>
  | \-write_debug() <void write_debug (FILE *fp, const char *data, size_t len) at log.c:138>
  |   \-write_out() <void write_out (FILE *default_fp, const char *data, size_t len, int with_timestamp, const char *colorstring, wget_console_color color_id) at log.c:55>
  |     +-strcmp()
  |     +-open()
  |     +-wget_buffer_init()
  |     +-isatty()
  |     +-fileno()
  |     +-wget_buffer_strcpy()
  |     +-gettime()
  |     +-localtime_r()
  |     +-wget_buffer_printf_append()
  |     +-wget_buffer_memcat()
  |     +-wget_buffer_strcat()
  |     +-fwrite()
  |     +-EnterCriticalSection()
  |     +-wget_console_set_fg_color()
  |     +-fflush()
  |     +-wget_console_reset_fg_color()
  |     +-LeaveCriticalSection()
  |     +-write()
  |     +-close()
  |     \-wget_buffer_deinit()
...
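
In the output above, a line carrying a <signature at file:line> annotation is a function defined in the analyzed file, while a bare name() line is a function for which cflow found no definition, typically a library call. As a rough sketch, the external names can be pulled out of cflow output with standard text tools. The helper name list_external is made up for this illustration and is not part of cflow, and the sample input is a shortened copy of the listing above.

```shell
#!/bin/bash
# list_external (hypothetical helper): read cflow output on stdin and print the
# names of called functions that have no "<... at file:line>" annotation.
list_external() {
    grep -v ' at ' \
    | sed -E -e 's/^[^A-Za-z_]*//' -e 's/\(\).*$//' \
    | grep -v '^$' \
    | sort -u
}

# Demo on a shortened copy of the tree shown above.
list_external <<'EOF'
+-log_init() <void log_init (void) at log.c:193>
  +-InitializeCriticalSection()
  +-wget_console_init()
  +-write_debug_stderr() <void write_debug_stderr (const char *data, size_t len) at log.c:157>
  | \-write_debug() <void write_debug (FILE *fp, const char *data, size_t len) at log.c:138>
EOF
```

With cflow installed, the same helper can be fed directly: cflow -T log.c | list_external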

tree2dotx

For a more intuitive display of the call relationships, we can use the xdot tool. xdot needs a graph description in DOT format, which is produced by another tool, tree2dotx. Save the script (the optimized version is listed in full at the end of this post) as a file named tree2dotx, make it executable, and put it in a directory on the system path.
Run it to see the result

$ cflow log.c | tree2dotx
digraph G{
ranksep = 1;
	rankdir=LR;
	size="1920,1080";
	node [fontsize=16,fontcolor=blue,style=filled,fillcolor=Wheat,shape=box];
	"log_init" -> "InitializeCriticalSection";
	"log_init" -> "wget_console_init";
	"log_init" -> "wget_logger_set_func";
	"log_init" -> "wget_get_logger";
	"log_init" -> "write_debug_stderr";
	"write_debug_stderr" -> "write_debug";
	"write_debug" -> "write_out";
	"write_out" -> "strcmp";
	"write_out" -> "open";
	"write_out" -> "wget_buffer_init";
	"write_out" -> "isatty";
	"write_out" -> "fileno";
	"write_out" -> "wget_buffer_strcpy";
	"write_out" -> "gettime";
	"write_out" -> "localtime_r";
...
}
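
The conversion shown above boils down to reading the nesting depth of each line from its indentation and linking every node to the most recent node one level up. The following is a minimal sketch of that idea, not the tree2dotx implementation: the helper name tree_to_edges is hypothetical, and it assumes plain cflow output (no -T) with 4 spaces of indentation per level.

```shell
#!/bin/bash
# tree_to_edges (hypothetical helper): turn an indented call tree into DOT edges.
# Assumption: each nesting level is indented by 4 spaces, as in plain cflow output.
tree_to_edges() {
    sed -E -e 's/ <.*//' -e 's/\(\).*$//' \
    | awk '
        NF == 0 { next }                       # ignore blank lines
        {
            match($0, /[^ ]/)                  # RSTART = column of the first non-space character
            depth = int((RSTART - 1) / 4)      # indentation width -> nesting depth
            name = substr($0, RSTART)
            parent[depth] = name               # latest node seen at this depth
            if (depth > 0)
                printf("\t\"%s\" -> \"%s\";\n", parent[depth - 1], name)
        }'
}

# Demo: wrap the edges in a digraph so the result is a complete DOT file.
{
    echo 'digraph G {'
    tree_to_edges <<'EOF'
log_init() <void log_init (void) at log.c:193>:
    wget_console_init()
    write_debug_stderr() <void write_debug_stderr (const char *data, size_t len) at log.c:157>:
        write_debug() <void write_debug (FILE *fp, const char *data, size_t len) at log.c:138>:
EOF
    echo '}'
}
```

The real script is more general: it measures the indentation step from the input instead of assuming 4, and it supports filtering and a second output format.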

xdot

xdot is an interactive viewer that displays the nodes and edges of a DOT graph. On Ubuntu it can be installed with apt:

sudo apt install xdot

Generate the DOT file and open it in xdot

$ cflow log.c | tree2dotx > out.dot
$ xdot out.dot 


The function call relationships in log.c are now shown graphically. When the mouse hovers over a function, the arrows entering and leaving it turn red, highlighting its callers and callees.

Optimization

The tree2dotx script above has a few problems, so I made some improvements to it:

  • Duplicate removal
    The output of tree2dotx contains duplicate edges, which are drawn as doubled connections. Removing the duplicate lines gives a much simpler graph.
    The command is as follows
cflow log.c | tree2dotx | awk '!a[$0]++' > out.dot
  • strip
    When the original tree2dotx script converts a function call relationship into a node, some function names are followed by a space, which changes the node name. The sed -e "s/ <.*>.*//g" step of the script now removes that space together with the annotation.
  • Increase Child Nodes
    This shows which C file the current function belongs to.
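
The de-duplication one-liner in the first item relies on an awk idiom that is worth spelling out. The sketch below wraps it in a function (the name dedup_lines and the array name seen are chosen for this illustration) and runs it on a few sample edges. Unlike sort -u, it keeps the lines in their original order, which matters because the digraph G{ header has to stay first and the closing } last.

```shell
#!/bin/bash
# dedup_lines (hypothetical name): drop repeated lines while keeping first-seen order.
dedup_lines() {
    # seen[$0]++ evaluates to 0 the first time a line appears, so !seen[$0]++ is
    # true only then; a true pattern with no action makes awk print the line.
    awk '!seen[$0]++'
}

# Demo: the repeated strcmp edge is printed only once.
dedup_lines <<'EOF'
"write_out" -> "strcmp";
"write_out" -> "open";
"write_out" -> "strcmp";
EOF
```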

Appendix

Optimized full tree2dotx code

$ cat /usr/local/bin/tree2dotx 
#!/bin/bash
#
# tree2dotx --- convert a "tree" (such as the output of tree, calltree, cflow -b)
#                 to a picture described in the DOT language (provided by Graphviz)
#
# Author: falcon <wuzhangjin@gmail.com>
# Update: 2007-11-14, 2015-3-19
# Usage:
#
#       tree -L 2 -d /path/to/a/directory | bash tree2dotx | dot -Tsvg -o tree.svg
#       cd /path/to/a/c/project/; calltree -gb -np -m *.c | bash tree2dotx | dot -Tsvg -o calltree.svg
#       cd /path/to/a/c/project/; cflow -b -m setup_rw_floppy kernel/blk_drv/floppy.c | bash tree2dotx | dot -Tsvg -o cflow.svg
#

# Set the picture size, direction(LR=Left2Right,TB=Top2Bottom) and shape(diamond, circle, box)
size="1920,1080"
direction="LR"
shape="box"

# color, X11 color name: http://en.wikipedia.org/wiki/X11_color_names
fontcolor="blue"
fillcolor="Wheat"

# fontsize
fontsize=16

# Specify the symbols to ignore here, separated by spaces
filterstr="";

input=`cat`

# output: dot, flame
output="dot"

has_subgraph="0"
ordering="0"

# Usage

#grep -v ^$ | cat

function usage
{
        echo ""
        echo "  $0 "
        echo ""
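        echo "     [ -f  \"filter1 filter2 ...\" ]"
        echo "     [ -s  size, ex: 1080,760; 1920,1080 ]"
        echo "     [ -S  shape, ex: box; circle; diamond ]"
        echo "     [ -d  direction, ex: LR; TB ]"
        echo "     [ -e  0|1, draw a subgraph per source file ]"
        echo "     [ -o  output format: dot; flame ]"
        echo "     [ -r  0|1, keep the order in which functions appear ]"
        echo "     -h  get help"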
        echo ""
}

function subgraph() {
	echo "$input" \
	| grep -e " at " \
	| sed 's/).* at /)/g;s/:.*//g;s/ //g' \
	| sed -r 's/^(.*)\(\)(.*)$/\tsubgraph "cluster_\2" { label="\2";\1;}/' \
	| sort -u
}

while getopts "f:s:S:d:e:ho:r:" opt;
do
        case $opt in
                f)
                        filterstr=$OPTARG
                ;;
                s)
                        size=$OPTARG
                ;;
                S)
                        shape=$OPTARG
                ;;
                d)
                        direction=$OPTARG
                ;;
                e)
                        has_subgraph=$OPTARG
                ;;
                o)
                        output=$OPTARG
                ;;
                r)
                        ordering=$OPTARG
                ;;
                h|?)
                        usage $0;
                        exit 1;
                ;;
        esac
done

# Transfer the tree result to a file described in DOT language

echo "$input" | \
grep -v ^$ | grep -v "^[0-9]* director" \
| sed -e "s/ <.*>.*//g" | tr -d '\(' | tr -d '\)' | tr '|' ' ' \
| sed -e "s/ \[.*\].*//g" \
| awk '{if(NR==1) system("basename "$0); else printf("%s\n", $0);}' \
| awk -v fstr="$filterstr" '# function for filter the symbols you not concern
        function need_filter(node) {
                for ( i in farr ) {
                    if (match(node,farr[i]" ") == 1 || match(node,"^"farr[i]"$") == 1) {
                            return 1;
                    }
                }
                return 0;
        }
        BEGIN{
                # Filternode array are used to record the symbols who have been filtered.
                oldnodedepth = -1; oldnode = ""; nodep[-1] = ""; filter[nodep[-1]] = 0;
                oldnodedepth_orig = -1; nodepre = 0; nodebase = 0; nodefirst = 0;
                output = "'$output'";

                #printf("output = %s\n", output);

                # Store the symbols to an array farr
                split(fstr,farr," ");

                # print some setting info
                if (output == "dot") {
                    printf("digraph G{\n");

                    # ordering is a shell variable, so splice it in like $direction below
                    if ("'$ordering'" == "1") {
                        printf("ordering=out;\n");
                    }
                    printf("ranksep = 1;\n");

                    printf("\trankdir='$direction';\n");
                    printf("\tsize=\"'$size'\";\n");
                    printf("\tnode [fontsize='$fontsize',fontcolor='$fontcolor',style=filled,fillcolor='$fillcolor',shape='$shape'];\n");
                }
        }{
                # Get the node, and its depth(nodedepth)
                # nodedepth = match($0, "[^| `]");
                nodedepth = match($0, "[[:alnum:]_]");
                node = substr($0,nodedepth);

                # printf("%d %d %s \n", nodedepth, oldnodedepth_orig, node);
                if (nodefirst == 1 && oldnodedepth_orig > 0) {
                        nodefirst = 0;
                        nodebase = nodedepth-oldnodedepth_orig;
                }

                if (nodedepth == 0)
                        nodedepth=1;

                tmp = nodedepth;
                # printf("pre=%d base=%d np=%d oldnp=%d node=%s \n", nodepre, nodebase, tmp, oldnodedepth_orig, node);

                if (nodedepth != 0 && oldnodedepth_orig == -1) {
                        nodepre = nodedepth-1;
                        nodefirst = 1;
                        nodedepth = 0;
                } else if (nodebase != 0) {
                        nodedepth = int((nodedepth-nodepre)/nodebase);
                }

                # if the previous node is exactly one level shallower, it is the parent of this node
                if (nodedepth-oldnodedepth == 1) {
                        nodep[nodedepth-1] = oldnode;
                }

                # for debugging
                # printf("%d %s\n", nodedepth, node);
                # printf("\t\"%s\";\n",node);
                # print the vectors

                if (oldnodedepth != -1) {
                        # skip a node that matches a filter or whose parent was filtered, and flag it as filtered
                        if (need_filter(node) || filter[nodep[nodedepth-1]] == 1) {
                                filter[node] = 1;
                        #       printf("node = %s, filter[node] = %d\n", node, filter[node]);
                        } else if (nodep[nodedepth-1] != "") {
                                if (output == "dot") {
                                    printf("\t\"%s\" -> \"%s\";\n", nodep[nodedepth-1], node);
                                } else {
                                    for (i = 0; i < nodedepth; i++)
                                        printf("%s;", nodep[i]);
                                    printf("%s 1\n", node);
                                }
                        #       printf("\t\"%s\" -> \"%s\"[label=\"%s>%s\"];\n", nodep[nodedepth-1], node, nodep[nodedepth-1], node);
                        }
                }

                # save the old depth and the old node
                oldnodedepth_orig = tmp;
                oldnodedepth = nodedepth;
                oldnode = node;
        } END {
#                if (output == "dot")
#			printf("}");
        }'

echo ""
if [ "$has_subgraph" = "1" ]
then
	subgraph
fi
echo "}"

Example usage

$ cflow -d 3 wget.c | tree2dotx -e 1 -r 1 | awk '!a[$0]++' > out.dot && cat out.dot

tree2dotx -e 0/1 specifies whether subgraphs are drawn (each subgraph groups the functions defined in one source file)
tree2dotx -r 0/1 specifies whether nodes are laid out in the order in which the functions appear
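
The -e option works because cflow annotates every function it finds a definition for with <signature at file:line>, so the file name can be read straight from the call tree. Graphviz draws a subgraph whose name starts with cluster as a labelled box. The sketch below shows the idea in a simplified form; it is not the subgraph function from the script, and the helper name cluster_lines is hypothetical.

```shell
#!/bin/bash
# cluster_lines (hypothetical helper): emit one DOT cluster line per function that
# cflow annotated with "<... at file:line>", grouping by the file name.
cluster_lines() {
    grep -e ' at ' \
    | sed -E 's/^[^A-Za-z_]*([A-Za-z_][A-Za-z0-9_]*)\(\).* at ([^:]+):.*$/subgraph "cluster_\2" { label="\2"; "\1"; }/' \
    | sort -u
}

# Demo: only the two annotated functions produce a cluster line.
cluster_lines <<'EOF'
log_init() <void log_init (void) at log.c:193>:
    wget_console_init()
    write_debug_stderr() <void write_debug_stderr (const char *data, size_t len) at log.c:157>:
EOF
```

The lines it prints belong inside the digraph, before the closing brace, which is where the script places the output of its own subgraph function.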

Exporting an image

The graph shown by xdot can also be written to an image file with the Graphviz dot command

dot -Tgif out.dot -o out.gif

References

https://graphviz.gitlab.io/_pages/pdf/dotguide.pdf