NAME

cscan - C source scan (C::Scan alternative)

VERSION

version 0.37

AUTHOR

Jean-Damien Durand <jeandamiendurand@free.fr>

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Jean-Damien Durand.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

NAME

cscan - C source scan

SYNOPSIS

 cscan [options] file

 Startup Options:
   --help                Brief help message.
   --cpprun <argument>   Preprocessor run command.
   --cppflags <argument> Preprocessor flags.
   --filter <argument>   File to look at after preprocessing. Defaults to file argument.
   --get <argument,...>  Dump the result of getting <argument>.
   --xml                 Print out an XML view of all declarations and definitions.
   --out <argument>      Redirect any output to this filename.
   --err <argument>      Redirect any error to this filename.
   --in <argument>       Load a full XML view of all declarations and definitions from this filename.
   --xpath <argument>    Dump the result of an xpath query on declarations and definitions.

OPTIONS

--help

Print this help message.

--cpprun <argument>

cpp run command. Default is the value when perl was compiled, i.e.:

$CPPRUN (if you read this message, do not worry: this is replaced by correct value at run-time)

This option can be repeated.

--cppflags <argument>

cpp flags. Default is the value when perl was compiled, i.e.:

$CPPFLAGS (if you read this message, do not worry: this is replaced by correct value at run-time)

--filter <argument>

File to look at after preprocessing. Defaults to file argument.

cscan runs the preprocessor. Every #include statement in your original source code tells the preprocessor to read another file, and the preprocessor records this with a line like:

 #line ... "information on the file processed"

in the generated output. The --filter argument selects which processed file is of interest, and defaults to the one given on the command line. If the --filter value starts with a slash "/", it is assumed to be a full regular expression (including modifier flags).
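The selection described above can be sketched in a few lines of Python. This is only an illustration of the idea, not cscan's actual implementation: the helper names `compile_filter` and `select_lines` are hypothetical, the marker pattern assumes either the `# N "file"` or the `#line N "file"` form, and only the "i" regular expression flag is handled.

```python
import re

# Matches both '# 12 "file.h" 1' and '#line 12 "file.h"' style markers.
LINE_MARKER = re.compile(r'^#\s*(?:line\s+)?(\d+)\s+"([^"]*)"')


def compile_filter(spec):
    """Turn a --filter style value into a predicate on file names.

    Assumption: a value starting with "/" has the form /pattern/flags and
    only the "i" flag is handled; any other value is compared literally.
    """
    if spec.startswith("/"):
        pattern, _, flags = spec[1:].rpartition("/")
        regex = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
        return lambda name: regex.search(name) is not None
    return lambda name: name == spec


def select_lines(preprocessed, spec):
    """Keep only the lines that the markers attribute to a matching file."""
    wanted = compile_filter(spec)
    keep = False
    selected = []
    for line in preprocessed.splitlines():
        marker = LINE_MARKER.match(line)
        if marker:
            # A marker switches the "current file" for the lines that follow.
            keep = wanted(marker.group(2))
            continue
        if keep:
            selected.append(line)
    return selected


# Invented preprocessor output, for illustration only.
sample = '''# 1 "test.c"
# 1 "inc/defs.H" 1
typedef int my_int;
# 2 "test.c" 2
my_int x;
'''

print(select_lines(sample, "test.c"))
print(select_lines(sample, r"/\.H$/i"))
```

With the literal value only the line coming from test.c is kept, while the regular expression form keeps the line coming from the included header.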

--get <argument,...>

Dump the result of getting <argument> using perl module Data::Dumper. A comma "," is the separator for multiple arguments.

<argument> is exactly one of the C::Scan-like methods supported by MarpaX::Languages::C::Scan, i.e.:

ast

The AST (Abstract Syntax Tree)

decls

All declarations.

defines_args

Macros with arguments.

defines_no_args

Macros without arguments.

defs

All definitions.

fdecls

Declarations of functions.

includes

Included files.

inlines

Definitions of functions.

macros

List of macros

parsed_fdecls

List of parsed function declarations.

strings

List of strings.

typedef_hash

Hash which contains known typedefs as keys.

typedef_structs

Hash with known typedefs as keys.

typedef_texts

List of known expansions of typedef.

typedefs_maybe

List of typedefed names.

vdecl_hash

Hash of parsed extern variable declarations.

vdecls

List of extern variable declarations.

--xml

Print out an XML view of all declarations and definitions.

The XML will have this structure:

 <C>
   <decls>...</decls>
   <decls>...</decls>
   ...
   <defs>...</defs>
   <defs>...</defs>
 </C>

There is one <decls/> element per declaration and one <defs/> element per function definition. Both can have the following attributes:

rt

Return type of a function.

nm

Identifier

ft

Full text used to get this information.

mod

Array modifiers, if any (for example, char x[2] gives mod the value '[2]').

ty

Type of a declarator. For a function, the type contains only the stars '*', if any.

extern

"1" value means this is an 'extern' declaration.

typedef

"1" value means this is a 'typedef' declaration.

init

Declarator initialization, if any. For example, with char *x = "value" init will be the string "value".

func

"1" means this is a function declaration.

struct

"1" means this is a struct declaration.

union

"1" means this is a union declaration.

structOrUnion

"1" means this is a struct or union declaration. If true, it is guaranteed that one of the 'struct' or 'union' attributes is true.

type

"1" means this is a type declaration. If true, it is guaranteed that one of the 'typedef' or 'structOrUnion' attributes is true, and that the 'var' attribute (see below) is false.

var

"1" means this is a variable declaration. If true, it is guaranteed that the 'type' attribute is false.

file

Filename where this parsed statement occurs. The filename is derived from the preprocessor output, with no modification.

line

Line number within the file where the text of the 'ft' attribute begins.

The only possible child element is:

args

Array reference of parsed argument declarations, which can have the same attributes as listed above, and args children of their own.
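Because args elements can nest, a consumer of the XML typically walks them recursively. The following Python sketch shows one possible way to do that with the standard xml.etree.ElementTree module. The XML is an abridged copy of the first <decls/> element from the EXAMPLES section, and the function name `outline` is a hypothetical choice made for this illustration.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the <decls> element shown in the EXAMPLES section.
XML = '''<C>
  <decls func="1" nm="func1" rt="int" var="1">
    <args nm="x1" ty="int" var="1" />
    <args nm="x2" ty="double *" var="1" />
    <args func="1" nm="f1" rt="float *" var="1">
      <args nm="x11" ty="int" var="1" />
      <args nm="x12" ty="double" var="1" />
    </args>
  </decls>
</C>'''


def outline(node, depth=0):
    """Return one indented line per element, descending into nested <args>."""
    if node.get("func") == "1":
        text = f"{node.get('nm')}: function returning {node.get('rt')}"
    else:
        text = f"{node.get('nm')}: variable of type {node.get('ty')}"
    lines = ["  " * depth + text]
    for child in node.findall("args"):
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(XML)
for decl in root.findall("decls"):
    print("\n".join(outline(decl)))
```

The function-pointer argument f1 appears as a nested entry with its own two arguments, which mirrors the nesting of the args elements.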

--in <argument>

Load a full XML view of all declarations and definitions from this filename.

Doing so prevents any preprocessor call and analysis: cscan will assume this XML has the correct format. This option exists because analysing preprocessor output takes time, and often, once the analysis is done, it is easier to reload the result to run XPath queries. For instance, first you create the XML:

 cscan --out /tmp/file.xml --xml /tmp/file.c

then you do xpath queries on the reloaded output:

 cscan --in /tmp/file.xml --xpath "//*[contains(@nm,'x')]"
 cscan --in /tmp/file.xml --xpath "//*[contains(@nm,'Y')]"

When you use the --in option, the --get option becomes a no-op.
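A saved XML file can also be post-processed outside cscan. The Python sketch below is one way to approximate the query //*[contains(@nm,'x')] with the standard xml.etree.ElementTree module, whose own XPath subset has no contains() function. The helper name `names_containing` and the inline XML are invented for this illustration; in practice the root would come from the file written with --out.

```python
import xml.etree.ElementTree as ET


def names_containing(root, fragment):
    """Rough equivalent of the XPath query //*[contains(@nm,'fragment')]."""
    return [elem for elem in root.iter() if fragment in elem.get("nm", "")]


# Stand-in for ET.parse("/tmp/file.xml").getroot(); the content is invented.
root = ET.fromstring('''<C>
  <decls func="1" nm="max" rt="int" var="1">
    <args nm="x" ty="int" var="1" />
    <args nm="Y" ty="int" var="1" />
  </decls>
</C>''')

for elem in names_containing(root, "x"):
    print(elem.tag, dict(elem.attrib))
```

Like the XPath version, the match is case-sensitive, so searching for 'x' and for 'Y' selects different elements.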

--xpath <argument>

Dump the result of an XPath (version 1) query on the XML structure described above. Found nodes are converted to hashes for readability and printed out. For example, to find all nodes having an identifier containing "x":

 cscan --xpath "//*[contains(@nm,'x')]" /tmp/file.c

To find all declared strings:

 cscan --xpath "//*[starts-with(@init,'\"')]" /tmp/file.c

To find all function definitions that have at least one argument of type "double":

 cscan --xpath "//*[@func=\"1\"]/args[@var=\"1\" and @ty=\"double\"]/.." /tmp/file.c

--out <argument>

Redirect any output to this filename.

--err <argument>

Redirect any error to this filename.

EXAMPLES

 cscan --get strings                                                                         /tmp/file.c
 cscan --get strings,macros --cppflags "-I/tmp/dir1            -DMYDEFINE"                   /tmp/file.c
 cscan --get strings,macros --cppflags  -I/tmp/dir1 --cppflags -DMYDEFINE                    /tmp/file.c
 cscan --get strings        --cppflags  -I/tmp/dir1 --cppflags -DMYDEFINE --filter '/\.H$/i' /tmp/file.c

The following source code, in file test.c:

 int func1(int x1, double *x2, float *( f1)(int x11, double x12));
 int func1(int x1, double *x2, float *( f1)(int x11, double x12)) {
   char *string = "&";
   return 0;
 }

will be converted to xml using:

 cscan --xml test.c

giving:

 <C>
   <decls file="test.c" ft="int func1(int x1, double *x2, float *( f1)(int x11, double x12))" func="1" line="1" nm="func1" rt="int" var="1">
     <args file="test.c" ft="int x1" line="1" nm="x1" ty="int" var="1" />
     <args file="test.c" ft="double *x2" line="1" nm="x2" ty="double *" var="1" />
     <args file="test.c" ft="float *( f1)(int x11, double x12)" func="1" line="1" nm="f1" rt="float *" var="1">
       <args file="test.c" ft="int x11" line="1" nm="x11" ty="int" var="1" />
       <args file="test.c" ft="double x12" line="1" nm="x12" ty="double" var="1" />
     </args>
   </decls>
   <defs file="test.c" ft="int func1(int x1, double *x2, float *( f1)(int x11, double x12)) {
  char *string = &quot;&amp;&quot;;
  return 0;
}" func="1" line="2" nm="func1" rt="int">
     <args file="test.c" ft="int x1" line="2" nm="x1" ty="int" var="1" />
     <args file="test.c" ft="double *x2" line="2" nm="x2" ty="double *" var="1" />
     <args file="test.c" ft="float *( f1)(int x11, double x12)" func="1" line="2" nm="f1" rt="float *" var="1">
       <args file="test.c" ft="int x11" line="2" nm="x11" ty="int" var="1" />
       <args file="test.c" ft="double x12" line="2" nm="x12" ty="double" var="1" />
     </args>
   </defs>
 </C>

while the following source code:

 struct s1_ {
   int x;
   enum {E1, E2} e;
   struct {
     long y;
     double z;
     char *s[1024][32];
   } innerStructure;
 };

will give the following XML:

 <C>
   <decls enum="1" nm="ANON0" ty="ANON0" type="1">
     <args file="test.c" ft="E1" line="3" nm="E1" ty="int" var="1" />
   </decls>
   <decls nm="s1_" struct="1" structOrUnion="1" ty="struct s1_" type="1">
     <args file="test.c" ft="int x" line="2" nm="x" ty="int" var="1" />
     <args file="test.c" ft="enum {E1, E2} e" line="3" nm="e" ty="ANON0" var="1" />
     <args nm="ANON1" struct="1" structOrUnion="1" ty="struct ANON1" type="1">
       <args file="test.c" ft="long y" line="5" nm="y" ty="long" var="1" />
       <args file="test.c" ft="double z" line="6" nm="z" ty="double" var="1" />
       <args file="test.c" ft="char *s[1024][32]" line="7" mod="[1024][32]" nm="s" ty="char *" var="1" />
     </args>
     <args file="test.c" ft="struct {
     long y;
     double z;
     char *s[1024][32];
   } innerStructure" line="4" nm="innerStructure" ty="struct ANON1" var="1" />
   </decls>
   <decls file="test.c" ft="struct s1_ {
   int x;
   enum {E1, E2} e;
   struct {
     long y;
     double z;
     char *s[1024][32];
   } innerStructure;
 };" line="1" nm="ANON2" ty="struct s1_" var="1" />
 </C>

In the latter example you can see that anonymous types can appear in an args element. They do not have the attribute "var". Anonymous types fall into two categories: structOrUnion (divided again into struct or union), and enum. By definition, enum types are always global, wherever and whenever they appear, i.e. they will always be a direct child of <decls/>. On the contrary, structOrUnion types always stay in the scope of their declaration.
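The difference in placement between the two kinds of anonymous types can be checked with a short script. The Python sketch below uses an abridged copy of the XML above and reports every element that lacks the "var" attribute together with its nesting depth. The function name `type_elements` is a hypothetical choice for this illustration.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the XML shown above for struct s1_.
XML = '''<C>
  <decls enum="1" nm="ANON0" ty="ANON0" type="1">
    <args nm="E1" ty="int" var="1" />
  </decls>
  <decls nm="s1_" struct="1" structOrUnion="1" ty="struct s1_" type="1">
    <args nm="x" ty="int" var="1" />
    <args nm="e" ty="ANON0" var="1" />
    <args nm="ANON1" struct="1" structOrUnion="1" ty="struct ANON1" type="1">
      <args nm="y" ty="long" var="1" />
    </args>
    <args nm="innerStructure" ty="struct ANON1" var="1" />
  </decls>
</C>'''


def type_elements(root):
    """Yield (depth, element) for every element without a 'var' attribute."""
    def walk(node, depth):
        for child in node:
            if child.get("var") is None:
                yield depth, child
            yield from walk(child, depth + 1)
    yield from walk(root, 0)


root = ET.fromstring(XML)
found = [(depth, elem.tag, elem.get("nm")) for depth, elem in type_elements(root)]
for depth, tag, name in found:
    print(depth, tag, name)
```

The anonymous enum ANON0 is reported at depth 0 as a decls element, while the anonymous struct ANON1 is reported at depth 1 as an args element inside s1_.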

NOTES

Any unknown option on the command line is passed through to --cppflags. I.e.:

 cscan --get strings,macros --cppflags  -I/tmp/dir1 --cppflags -DMYDEFINE /tmp/file.c

and

 cscan --get strings,macros -I/tmp/dir1 -DMYDEFINE /tmp/file.c

are equivalent. A restriction is that the filename must be the last argument.
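The pass-through rule can be pictured with the following Python sketch. It is an assumption-laden illustration rather than cscan's real option handling: the function name `split_arguments` is hypothetical, the option list is taken from the SYNOPSIS, and only the case where a file name is given as the last argument is covered.

```python
def split_arguments(argv):
    """Separate known options, preprocessor flags and the trailing file name."""
    # Options from the SYNOPSIS that take a value; --help and --xml do not.
    with_value = {"--cpprun", "--cppflags", "--filter", "--get",
                  "--out", "--err", "--in", "--xpath"}
    no_value = {"--help", "--xml"}
    if not argv:
        raise ValueError("a file name is expected as the last argument")
    *rest, filename = argv
    options, cppflags = {}, []
    i = 0
    while i < len(rest):
        arg = rest[i]
        if arg in with_value:
            if i + 1 >= len(rest):
                raise ValueError(f"missing value for {arg}")
            value = rest[i + 1]
            if arg == "--cppflags":
                cppflags.append(value)
            else:
                options.setdefault(arg, []).append(value)
            i += 2
        elif arg in no_value:
            options[arg] = True
            i += 1
        else:
            # Unknown option: treated as if it had been given to --cppflags.
            cppflags.append(arg)
            i += 1
    return options, cppflags, filename


explicit = ["--get", "strings,macros", "--cppflags", "-I/tmp/dir1",
            "--cppflags", "-DMYDEFINE", "/tmp/file.c"]
implicit = ["--get", "strings,macros", "-I/tmp/dir1", "-DMYDEFINE",
            "/tmp/file.c"]
print(split_arguments(explicit) == split_arguments(implicit))
```

Both argument lists end up with the same preprocessor flags and the same file name, which is the equivalence stated above.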

SEE ALSO

MarpaX::Languages::C::Scan

Data::Dumper