TODO revision cecec2e61f9131ae14672ce205185383a5f2c768
-*-org-*-
* TODO
** Automatic prototype discovery:
*** Use debuginfo if available
    Alternatively, use debuginfo to generate a config file.
*** Mangled identifiers contain partial prototypes themselves
    They don't contain return type info, which can change the
    parameter passing convention.  We could use them and hope for the
    best.  They also don't include the hidden this pointer, which may
    be present.
** Automatically update list of syscalls?
** More operating systems (Solaris?)
** Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
** Implement displaced tracing
   A technique used in GDB (and in uprobes, I believe), whereby the
   instruction under the breakpoint is moved somewhere else and
   followed by a jump back to the original place.  When the breakpoint
   hits, the IP is moved to the displaced instruction, and the process
   is continued.  We avoid all the fuss with single-stepping and
   re-enablement.
** Create different ltrace processes to trace different children
** Config file syntax
*** mark some symbols as exported
    For PLT hits, only exported prototypes would be considered.  For
    symtab entry point hits, all would be.

*** named arguments
    This would be useful for replacing the arg1, elt2 etc.

*** parameter pack improvements
    The format tweaks described below require that packs that expand
    to no types at all be supported.  If this works, then it should be
    relatively painless to implement conditionals:

    | void ptrace(REQ=enum(PTRACE_TRACEME=0,...),
    |             if[REQ==0](pack(),pack(pid_t, void*, void *)))

    This is of course dangerously close to a programming language, and
    I think ltrace should be careful to stay as simple as possible.
    (We can hook into Lua, or TinyScheme, or some such if we want more
    general scripting capabilities.  Implementing something ad-hoc is
    undesirable.)
    But the above can be nicely expressed by pattern matching:

    | void ptrace(REQ=enum[int](...)):
    |   [REQ==0] => ()
    |   [REQ==1 or REQ==2] => (pid_t, void*)
    |   [true] => (pid_t, void*, void*);

    Or:

    | int open(string, FLAGS=flags[int](O_RDONLY=00,...,O_CREAT=0100,...)):
    |   [(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,...))

    This would still require pretty complete expression evaluation,
    _including_ pointer dereferences and such.  And e.g. in accept, we
    need subtraction:

    | int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*);

    Perhaps we should hook to something after all.

*** system call error returns

    This is closely related to the above.  Take the following syscall
    prototype:

    | long read(int,+string0,ulong);

    string0 means the same as string(array(char, zero(retval))*).  But
    if read returns a negative value, that value signifies an errno.
    zero takes it at face value and becomes suspicious:

    | read@SYS(3 <no return ...>
    | error: maximum array length seems negative
    | , "\n\003\224\003\n", 4096) = -11

    Ideally we would do what strace does, e.g.:

    | read@SYS(3, 0x12345678, 4096) = -EAGAIN

*** errno tracking
    Some calls result in setting errno.  Somehow mark those, and on
    failure, show errno.  System calls return errno as a negative
    value (see the previous point).

*** second conversions?
    This definitely calls for some general scripting.  The goal is to
    have seconds in adjtimex calls show as e.g. 10s, 1m15s or some
    such.

*** format should take arguments like string does
    Format should take a value argument describing the value that
    should be analyzed.  The following rewriting rules would then
    apply:

    | format       | format(array(char, zero)*) |
    | format(LENS) | X=LENS, format[X]          |

    The latter expanded form would be canonical.
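
    The two rewriting rules above can be sketched as a small string
    transformation.  This is only an illustration, not ltrace code:
    the function name canonicalize_format, the default argument name
    X, and the choice to model prototype arguments as plain strings
    are all assumptions made for the sketch.

```python
import re

# Hypothetical sketch: ltrace has no such function. Prototype arguments
# are modelled as plain strings purely for illustration.
DEFAULT_LENS = "array(char, zero)*"


def canonicalize_format(arg: str, name: str = "X") -> str:
    """Rewrite 'format' or 'format(LENS)' into 'X=LENS, format[X]'."""
    arg = arg.strip()
    if arg == "format":
        # Rule 1: a bare format gets the default string-like lens.
        arg = f"format({DEFAULT_LENS})"
    match = re.fullmatch(r"format\((.*)\)", arg)
    if match is None:
        raise ValueError(f"not a format argument: {arg!r}")
    # Rule 2: name the analyzed value and let format refer to it.
    return f"{name}={match.group(1)}, format[{name}]"
```

    Under these assumptions, a bare format and format(string) both end
    up in the named form, which is why that form can serve as the
    canonical one.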

    The format change depends on named arguments and parameter pack
    improvements (we need to be able to construct parameter packs that
    expand to nothing).

*** More fine-tuned control of right arguments
    A combination of named arguments and some extensions could take
    care of that:

    | void func(X=hide(int*), long*, +pack(X)); |

    This would show long* as an input argument (i.e. the function
    could mangle it), and later show the pre-fetched X.  The "pack"
    syntax is utterly undeveloped as of now.  The general idea is to
    produce arguments that expand to some mix of types and values.
    But maybe all we need is something like

    | void func(out int*, long*); |

    ltrace would know that out/inout/in arguments are given in the
    right order, but the left pass should display in and inout
    arguments only, and the right pass then out and inout.  + would be
    backward-compatible syntactic sugar, expanded like so:

    | void func(int*, int*, +long*, long*);              |
    | void func(in int*, in int*, out long*, out long*); |

    But sometimes we may want to see a different type on the way in and
    on the way out.  E.g. in asprintf, what's interesting on the way in
    is the address, but on the way out we want to see buffer contents.
    Does something like the following make sense?

    | void func(X=void*, long*, out string(X)); |

** Support for functions that never return
   This would be useful for __cxa_throw, presumably also for longjmp
   (do we handle that at all?) and perhaps a handful of others.

** Support flag fields
   enum-like syntax, except disjunction of several values is assumed.
** Support long long
   We currently can't define time_t on 32-bit machines.  That means we
   can't describe a range of time-related functions.

** Support signed char, unsigned char, char
   Also, don't format it as a character by default; the string lens
   can do that.
   Perhaps introduce byte and ubyte and leave 'char' as an alias of
   one of those with the string lens applied by default.

** Support fixed-width types
   Really we should keep everything as {u,}int{8,16,32,64} internally,
   and have long, short and others be translated to one of those
   according to architecture rules.  Maybe this could be achieved by a
   per-arch config file with typedefs such as:

   | typedef ulong = uint64_t; |

** Support for ARM/AARCH64 types
   - ARM and AARCH64 both support half-precision floating point
     - there are two different half-precision formats, IEEE 754-2008
       and "alternative".  Both have 10 bits of mantissa and 5 bits of
       exponent, and differ only in how exponent==0x1F is handled.  In
       IEEE format, we get NaNs and infinities; in alternative
       format, this encodes the normalized value
       (-1)^S × 2¹⁶ × (1.mant)
     - The Floating-Point Control Register (FPCR) selects the
       half-precision format, where applicable, through the FPCR.AHP
       bit.
   - AARCH64 supports fixed-point interpretation of {,double}words
     - e.g. fixed(int, X) (int interpreted as a fixed-point number
       with X binary digits of fraction).
   - AARCH64 supports 128-bit quad words in SIMD

** Some more functions in vect might be made to take const*
   Or even marked __attribute__((pure)).

** pretty printer support
   GDB supports Python pretty printers.  We might want to hook this in
   and use it to format certain types.

* BUGS
** After a clone(), syscalls may be seen as sysrets in s390 (see trace.c:syscall_p())