TODO revision cecec2e61f9131ae14672ce205185383a5f2c768
-*-org-*-
* TODO
** Automatic prototype discovery:
*** Use debuginfo if available
    Alternatively, use debuginfo to generate a config file.
*** Mangled identifiers contain partial prototypes themselves
    They don't contain return type info, which can change the
    parameter passing convention.  We could use them anyway and hope
    for the best.  They also don't include the hidden "this" pointer,
    which may or may not be present.
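
    To illustrate what a mangled name does and does not carry, here is
    a minimal sketch that decodes only the simplest Itanium C++ ABI
    encodings (_Z, a length-prefixed name, then builtin type codes).
    The helper name and the small type table are assumptions made for
    this note, not ltrace code; nested names, templates and
    substitutions are not handled.

```python
# Subset of Itanium C++ ABI builtin type codes, enough for the illustration.
BUILTIN = {"v": "void", "c": "char", "i": "int", "j": "unsigned int",
           "l": "long", "m": "unsigned long", "f": "float", "d": "double"}


def partial_prototype(mangled):
    """Return (name, parameter types) for a simple non-member function.

    Hypothetical helper: note that nothing in the input describes the
    return type, so the result is only a partial prototype.
    """
    if not mangled.startswith("_Z"):
        raise ValueError("not an Itanium-mangled name")
    pos = 2
    digits = ""
    while pos < len(mangled) and mangled[pos].isdigit():
        digits += mangled[pos]
        pos += 1
    if not digits:
        raise ValueError("only plain <length><name> encodings are handled")
    name = mangled[pos:pos + int(digits)]
    pos += int(digits)
    params = []
    while pos < len(mangled):
        stars = 0
        while mangled[pos] == "P":      # P = pointer to the following type
            stars += 1
            pos += 1
        params.append(BUILTIN[mangled[pos]] + "*" * stars)
        pos += 1
    if params == ["void"]:              # "v" alone means an empty parameter list
        params = []
    return name, params


print(partial_prototype("_Z3fooid"))
```

    The parameter list is recovered, but whether foo returns int, a
    struct by value, or nothing at all cannot be read from the name.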
** Automatically update list of syscalls?
** More operating systems (Solaris?)
** Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
** Implement displaced tracing
   A technique used in GDB (and in uprobes, I believe), whereby the
   instruction under the breakpoint is copied somewhere else and
   followed by a jump back to the original place.  When the breakpoint
   hits, the IP is moved to the displaced instruction and the process
   is continued.  This avoids all the fuss with singlestepping and
   re-enabling the breakpoint.
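
   The control flow can be sketched on a toy machine.  Everything
   below (the opcodes, the scratch address, the helper names) is
   invented for illustration; a real implementation would also have
   to fix up PC-relative instructions, which this sketch ignores.

```python
def insert_displaced_breakpoint(code, addr, scratch):
    """Move the instruction at addr to scratch and plant a breakpoint."""
    code[scratch] = code[addr]               # displaced copy of the original
    code[scratch + 1] = ("JMP", addr + 1)    # jump back past the breakpoint
    code[addr] = ("BRK", scratch)            # toy breakpoint remembers its slot


def run(code, ip, on_hit):
    """Interpret a toy program; code maps address -> (opcode, operand)."""
    acc = 0
    while True:
        op, arg = code[ip]
        if op == "BRK":
            on_hit(ip)       # the tracer reports the hit...
            ip = arg         # ...then just redirects the IP and continues
        elif op == "ADD":
            acc += arg
            ip += 1
        elif op == "JMP":
            ip = arg
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {op!r}")


program = {0: ("ADD", 1), 1: ("ADD", 10), 2: ("ADD", 100), 3: ("HALT", None)}
hits = []
insert_displaced_breakpoint(program, addr=1, scratch=100)
result = run(program, 0, hits.append)
print(result, hits)
```

   The breakpoint stays in place the whole time: there is no
   remove, singlestep, re-insert cycle.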
** Create different ltrace processes to trace different children
** Config file syntax
*** mark some symbols as exported
    For PLT hits, only exported prototypes would be considered.  For
    symtab entry point hits, all would be.

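    A minimal sketch of that lookup rule, assuming a hypothetical
    in-memory table where each prototype carries an "exported" flag
    (neither the table layout nor the hit-kind names come from ltrace):

```python
# Hypothetical parsed config: symbol name -> (prototype text, exported?)
PROTOTYPES = {
    "open": ("int open(string, int)", True),
    "helper": ("void helper(int)", False),
}


def lookup_prototype(name, hit_kind):
    """Return the prototype to use for a hit, or None if none applies.

    hit_kind is "plt" for PLT hits and "symtab" for symtab entry points.
    """
    entry = PROTOTYPES.get(name)
    if entry is None:
        return None
    prototype, exported = entry
    if hit_kind == "plt" and not exported:
        return None          # PLT hits only see exported prototypes
    return prototype


print(lookup_prototype("helper", "plt"), lookup_prototype("helper", "symtab"))
```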
*** named arguments
    This would be useful for replacing arg1, elt2, etc.

*** parameter pack improvements
    The format tweaks described below require that packs that expand
    to no types at all be supported.  If this works, then it should be
    relatively painless to implement conditionals:

    | void ptrace(REQ=enum(PTRACE_TRACEME=0,...),
    |             if[REQ==0](pack(),pack(pid_t, void*, void *)))

    This is of course dangerously close to a programming language, and
    I think ltrace should be careful to stay as simple as possible.
    (We can hook into Lua, or TinyScheme, or some such if we want more
    general scripting capabilities.  Implementing something ad-hoc is
    undesirable.)  But the above can be nicely expressed by pattern
    matching:

    | void ptrace(REQ=enum[int](...)):
    |   [REQ==0] => ()
    |   [REQ==1 or REQ==2] => (pid_t, void*)
    |   [true] => (pid_t, void*, void*);

    Or:

    | int open(string, FLAGS=flags[int](O_RDONLY=00,...,O_CREAT=0100,...)):
    |   [(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,...))

    This would still require pretty complete expression evaluation,
    _including_ pointer dereferences and such.  And e.g. in accept, we
    need subtraction:

    | int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*);

    Perhaps we should hook into something after all.

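    To get a feel for how the guarded clauses would be evaluated, here
    is a rough sketch that models each clause as a (guard, types)
    pair and picks the first guard that holds.  The representation is
    an assumption for illustration only; the "mode" placeholder and the
    fall-through clause for open are not part of the proposal above.

```python
# Clauses for ptrace, mirroring the pattern-matching example above.
PTRACE_CLAUSES = [
    (lambda req: req == 0, []),
    (lambda req: req in (1, 2), ["pid_t", "void*"]),
    (lambda req: True, ["pid_t", "void*", "void*"]),
]

# Clauses for open; the final catch-all clause is an assumption.
OPEN_CLAUSES = [
    (lambda flags: (flags & 0o100) != 0, ["mode"]),
    (lambda flags: True, []),
]


def remaining_params(clauses, value):
    """Return the parameter types of the first clause whose guard holds."""
    for guard, types in clauses:
        if guard(value):
            return types
    raise LookupError("no clause matched")


print(remaining_params(PTRACE_CLAUSES, 0))
print(remaining_params(PTRACE_CLAUSES, 2))
print(remaining_params(OPEN_CLAUSES, 0o101))
```

    Note that the first ptrace clause expands to no types at all,
    which is exactly the empty-pack case mentioned at the top.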
*** system call error returns

    This is closely related to the above.  Take the following syscall
    prototype:

    | long read(int,+string0,ulong);

    string0 means the same as string(array(char, zero(retval))*).  But
    if read returns a negative value, that value is a negated errno
    code, not a length.  zero takes it at face value and flags the
    length as suspicious:

    | read@SYS(3 <no return ...>
    | error: maximum array length seems negative
    | , "\n\003\224\003\n", 4096)                  = -11

    Ideally we would do what strace does, e.g.:

    | read@SYS(3, 0x12345678, 4096)                = -EAGAIN

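    A small sketch of both halves of the fix, using Python's errno
    table as a stand-in for whatever table ltrace would carry.  The
    function names are made up for this note.

```python
import errno


def format_syscall_result(retval):
    """Show a negative syscall return as -ENAME when the code is known."""
    if retval < 0:
        name = errno.errorcode.get(-retval)
        if name is not None:
            return "-" + name
    return str(retval)


def array_length_from_retval(retval, limit):
    """Length to use for zero(retval): never negative, never above limit."""
    return min(max(retval, 0), limit)


print(format_syscall_result(-errno.ENOENT))
print(format_syscall_result(4096))
print(array_length_from_retval(-11, 4096))
```

    With the length clamped to zero on failure, the buffer argument
    would simply not be dereferenced, and could be shown as a bare
    address instead.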
*** errno tracking
    Some calls result in setting errno.  Somehow mark those, and on
    failure, show errno.  System calls return errno as a negative
    value (see the previous point).

*** second conversions?
    This definitely calls for some general scripting.  The goal is to
    have seconds in adjtimex calls show as e.g. 10s, 1m15s or some
    such.

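    For reference, the conversion itself is tiny.  A sketch, where the
    hour and day units are my own extension of the 10s / 1m15s examples:

```python
def format_seconds(total):
    """Render a whole number of seconds as e.g. 10s, 1m15s or 2h5s."""
    if total < 0:
        return "-" + format_seconds(-total)
    parts = []
    for suffix, size in (("d", 86400), ("h", 3600), ("m", 60)):
        count, total = divmod(total, size)
        if count:
            parts.append(f"{count}{suffix}")
    if total or not parts:      # keep "0s" for a zero duration
        parts.append(f"{total}s")
    return "".join(parts)


print(format_seconds(10), format_seconds(75))
```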
*** format should take arguments like string does
    Format should take a value argument describing the value that
    should be analyzed.  The following rewriting rules would then
    apply:

    | format       | format(array(char, zero)*) |
    | format(LENS) | X=LENS, format[X]          |

    The latter expanded form would be canonical.

    This depends on named arguments and parameter pack improvements
    (we need to be able to construct parameter packs that expand to
    nothing).

*** More fine-tuned control of right arguments
    A combination of named arguments and some extensions could take
    care of that:

    | void func(X=hide(int*), long*, +pack(X)); |

    This would show long* as an input argument (i.e. the function
    could mangle it), and later show the pre-fetched X.  The "pack"
    syntax is utterly undeveloped as of now.  The general idea is to
    produce arguments that expand to some mix of types and values.  But
    maybe all we need is something like

    | void func(out int*, long*); |

    ltrace would know that out/inout/in arguments are given in the
    right order, but the left pass should display in and inout
    arguments only, and the right pass then out and inout.  + would be
    backward-compatible syntactic sugar, expanded like so:

    | void func(int*, int*, +long*, long*);              |
    | void func(in int*, in int*, out long*, out long*); |

    But sometimes we may want to see a different type on the way in and
    on the way out.  E.g. in asprintf, what's interesting on the way in
    is the address, but on the way out we want to see buffer contents.
    Does something like the following make sense?

    | void func(X=void*, long*, out string(X)); |

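    The in/out/inout idea and the "+" sugar can be sketched as follows.
    The list-of-strings representation and the function names are
    assumptions for illustration; the expansion rule is the one from
    the example above, where "+" switches that parameter and all later
    ones to out.

```python
def expand_plus(params):
    """Expand '+' sugar into explicit (direction, type) pairs."""
    direction = "in"
    expanded = []
    for param in params:
        if param.startswith("+"):
            direction = "out"       # '+' and everything after it is out
            param = param[1:]
        expanded.append((direction, param))
    return expanded


def left_pass(params):
    """Types displayed on function entry: in and inout arguments."""
    return [t for d, t in params if d in ("in", "inout")]


def right_pass(params):
    """Types displayed on function return: out and inout arguments."""
    return [t for d, t in params if d in ("out", "inout")]


proto = expand_plus(["int*", "int*", "+long*", "long*"])
print(proto)
print(left_pass(proto), right_pass(proto))
```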
** Support for functions that never return
   This would be useful for __cxa_throw, presumably also for longjmp
   (do we handle that at all?) and perhaps a handful of others.

** Support flag fields
   Enum-like syntax, except that the value is assumed to be a
   disjunction of several of the named values.
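
   A sketch of how such a value might be rendered.  The function name
   is hypothetical, and the flag table is illustrative (the O_WRONLY
   value is the usual Linux one, added here for the example).

```python
def format_flags(value, names):
    """Render value as NAME|NAME|..., with unknown leftover bits in hex."""
    if value == 0:
        zero_names = [name for name, bits in names.items() if bits == 0]
        return zero_names[0] if zero_names else "0"
    parts = []
    rest = value
    for name, bits in names.items():
        if bits and (value & bits) == bits:
            parts.append(name)
            rest &= ~bits           # these bits are now accounted for
    if rest:
        parts.append(hex(rest))     # bits with no known name
    return "|".join(parts)


OPEN_FLAGS = {"O_RDONLY": 0o0, "O_WRONLY": 0o1, "O_CREAT": 0o100}
print(format_flags(0o101, OPEN_FLAGS))
```

   Zero-valued names such as O_RDONLY need the special case at the
   top, since they can never be detected by masking.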
** Support long long
   We currently can't define time_t on 32-bit machines.  That means we
   can't describe a range of time-related functions.

** Support signed char, unsigned char, char
   Also, don't format it as a character by default; the string lens
   can do that.  Perhaps introduce byte and ubyte and leave 'char' as
   an alias of one of those with the string lens applied by default.

** Support fixed-width types
   Really we should keep everything as {u,}int{8,16,32,64} internally,
   and have long, short and others be translated to one of those
   according to architecture rules.  Maybe this could be achieved by a
   per-arch config file with typedefs such as:

   | typedef ulong = uint64_t; |

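   A sketch of the resolution step, assuming per-arch typedef tables
   along these lines.  The table contents follow the usual LP64 and
   ILP32 conventions; the names and layout are invented for this note.

```python
# The internal fixed-width types: int8..int64 and uint8..uint64.
FIXED = {f"{u}int{w}" for u in ("", "u") for w in (8, 16, 32, 64)}

# Hypothetical per-arch typedef tables.
ARCH_TYPEDEFS = {
    "x86_64": {"short": "int16", "int": "int32", "long": "int64",
               "ulong": "uint64", "size_t": "ulong"},
    "i386": {"short": "int16", "int": "int32", "long": "int32",
             "ulong": "uint32", "size_t": "ulong"},
}


def resolve(name, arch):
    """Follow typedefs for arch until a fixed-width type is reached."""
    typedefs = ARCH_TYPEDEFS[arch]
    seen = set()
    while name not in FIXED:
        if name in seen or name not in typedefs:
            raise KeyError(f"cannot resolve {name!r} on {arch}")
        seen.add(name)
        name = typedefs[name]
    return name


print(resolve("size_t", "x86_64"), resolve("size_t", "i386"))
```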
** Support for ARM/AARCH64 types
   - ARM and AARCH64 both support half-precision floating point
     - there are two different half-precision formats, IEEE 754-2008
       and "alternative".  Both have 10 bits of mantissa and 5 bits of
       exponent, and differ only in how exponent==0x1F is handled.  In
       IEEE format, we get NaNs and infinities; in alternative
       format, this encodes the normalized value
       (-1)^S × 2¹⁶ × (1.mant)
     - The FPCR.AHP bit of the Floating-Point Control Register (FPCR)
       selects the half-precision format where applicable.
   - AARCH64 supports fixed-point interpretation of {,double}words
     - e.g. fixed(int, X) (int interpreted as a fixed-point number
       with X binary digits of fraction).
   - AARCH64 supports 128-bit quad words in SIMD

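   The two half-precision interpretations and the fixed-point one can
   be sketched like this.  The function names are made up; the IEEE
   path leans on the "e" format of Python's struct module, and the
   alternative path applies the formula above for exponent==0x1F.

```python
import struct


def decode_half(bits, alternative=False):
    """Decode a 16-bit half-precision pattern to a Python float."""
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    if alternative and exponent == 0x1F:
        # No NaNs or infinities: the top exponent is one more normal range.
        return sign * 2.0 ** 16 * (1 + mantissa / 1024)
    # The two formats agree everywhere else, so reuse the IEEE decoder.
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def decode_fixed(raw, fraction_bits):
    """Interpret an integer as fixed-point with fraction_bits of fraction."""
    return raw / (1 << fraction_bits)


print(decode_half(0x3C00))
print(decode_half(0x7C00), decode_half(0x7C00, alternative=True))
print(decode_fixed(0x18000, 16))
```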
** Some more functions in vect might be made to take const*
   Or even marked __attribute__((pure)).

** pretty printer support
   GDB supports Python pretty printers.  We might want to hook this in
   and use it to format certain types.

* BUGS
** After a clone(), syscalls may be seen as sysrets on s390 (see trace.c:syscall_p())