TODO revision 6e570e5bf8e0a9049cb1bad3ca4f82fae116c741
1-*-org-*-
2* TODO
3** Keep exit code of traced process
4   See https://bugzilla.redhat.com/show_bug.cgi?id=105371 for details.
5
6** Automatic prototype discovery:
7*** Use debuginfo if available
8    Alternatively, use debuginfo to generate the config file.
9*** Mangled identifiers contain partial prototypes themselves
10    They don't contain return type info, which can change the
11    parameter passing convention.  We could use it and hope for the
12    best.  Also they don't include the potentially present hidden this
13    pointer.
14** Automatically update list of syscalls?
15** More operating systems (Solaris?)
16** Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
17** Implement displaced tracing
18   A technique used in GDB (and in uprobes, I believe), whereby the
19   instruction under breakpoint is moved somewhere else, and followed
20   by a jump back to original place.  When the breakpoint hits, the IP
21   is moved to the displaced instruction, and the process is
22   continued.  We avoid all the fuss with singlestepping and
23   reenablement.
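
   The idea can be illustrated with a toy interpreter.  This is only a
   sketch: the instruction set and every name in it are hypothetical
   and unrelated to ltrace's actual breakpoint code.  Each instruction
   under a breakpoint is copied to a scratch area and followed by a
   jump back; on a hit, the IP is simply redirected to the copy.

```python
# Toy model of displaced stepping.  The instruction set and all names are
# hypothetical; nothing here mirrors ltrace's real breakpoint handling.

def run(program, breakpoints):
    """Execute `program`; return (accumulator, addresses of breakpoint hits)."""
    mem = list(program)
    displaced = {}  # breakpoint address -> address of the displaced copy
    for addr in breakpoints:
        displaced[addr] = len(mem)
        mem.append(mem[addr])          # original instruction, moved aside
        mem.append(("jmp", addr + 1))  # jump back to just after the breakpoint
        mem[addr] = ("brk",)           # the breakpoint itself is never removed

    acc, ip, hits = 0, 0, []
    while True:
        op = mem[ip]
        if op[0] == "brk":
            hits.append(ip)            # report the event to the tracer
            ip = displaced[ip]         # continue at the displaced copy
        elif op[0] == "add":
            acc += op[1]
            ip += 1
        elif op[0] == "jmp":
            ip = op[1]
        elif op[0] == "halt":
            return acc, hits
        else:
            raise ValueError(f"unknown instruction {op!r}")


PROGRAM = [("add", 1), ("add", 2), ("add", 3), ("halt",)]
```

   Calling run(PROGRAM, [0, 2]) reports both breakpoints and still
   computes the same sum as the unmodified program.  A real
   implementation would also have to fix up IP-relative instructions,
   which this toy ignores.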
24** Create different ltrace processes to trace different children
25** Config file syntax
26*** mark some symbols as exported
27    For PLT hits, only exported prototypes would be considered.  For
28    symtab entry point hits, all would be.
29
30*** named arguments
31    This would be useful for replacing the arg1, elt2 etc.
32
33*** parameter pack improvements
34    The above format tweaks require that packs that expand to no types
35    at all be supported.  If this works, then it should be relatively
36    painless to implement conditionals:
37
38    | void ptrace(REQ=enum(PTRACE_TRACEME=0,...),
39    |             if[REQ==0](pack(),pack(pid_t, void*, void *)))
40
41    This is of course dangerously close to a programming language, and
42    I think ltrace should be careful to stay as simple as possible.
43    (We can hook into Lua, or TinyScheme, or some such if we want more
44    general scripting capabilities.  Implementing something ad-hoc is
45    undesirable.)  But the above can be nicely expressed by pattern
46    matching:
47
48    | void ptrace(REQ=enum[int](...)):
49    |   [REQ==0] => ()
50    |   [REQ==1 or REQ==2] => (pid_t, void*)
51    |   [true] => (pid_t, void*, void*);
52
53    Or:
54
55    | int open(string, FLAGS=flags[int](O_RDONLY=00,...,O_CREAT=0100,...)):
56    |   [(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,...))
57
58    This would still require pretty complete expression evaluation.
59    _Including_ pointer dereferences and such.  And e.g. in accept, we
60    need subtraction:
61
62    | int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*);
63
64    Perhaps we should hook to something after all.
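
    The pattern-matching form above could be evaluated roughly as
    follows.  This is a sketch under assumptions: guards are plain
    Python callables standing in for whatever expression language is
    eventually chosen, and select_pack, PTRACE_RULES and OPEN_RULES are
    hypothetical names.  Types are kept as opaque strings.

```python
# Hypothetical evaluator for guarded prototypes.  Guards are Python
# callables here; a real config syntax would need its own expression parser.

PTRACE_RULES = [
    (lambda req: req == 0,      ()),
    (lambda req: req in (1, 2), ("pid_t", "void*")),
    (lambda req: True,          ("pid_t", "void*", "void*")),
]

OPEN_RULES = [
    (lambda flags: (flags & 0o100) != 0, ("flags[int](S_IRWXU,...)",)),
]


def select_pack(rules, value):
    """Return the parameter pack of the first rule whose guard holds."""
    for guard, pack in rules:
        if guard(value):
            return pack
    # Assumption: when no rule matches, there are no further arguments.
    return ()
```

    The first matching rule wins, so the [true] rule acts as a
    default.  The open example has no default rule; treating that as
    an empty pack is an assumption of this sketch.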
65
66*** system call error returns
67
68    This is closely related to the above.  Take the following syscall
69    prototype:
70
71    | long read(int,+string0,ulong);
72
73    string0 means the same as string(array(char, zero(retval))*).  But
74    if read returns a negative value, that value is a negated errno.
75    zero takes it at face value as a length and complains:
76
77    | read@SYS(3 <no return ...>
78    | error: maximum array length seems negative
79    | , "\n\003\224\003\n", 4096)                  = -11
80
81    Ideally we would do what strace does, e.g.:
82
83    | read@SYS(3, 0x12345678, 4096)                = -EAGAIN
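
    A minimal sketch of the two pieces involved, assuming the Linux
    convention that raw syscall returns in the range [-4095, -1] are
    negated errno values.  format_sysret and array_length are
    hypothetical names, not existing ltrace functions.

```python
import errno

# Assumption: Linux convention, where raw syscall return values in
# [-4095, -1] carry a negated errno.
MAX_ERRNO = 4095


def format_sysret(retval):
    """Render a raw syscall return value, using a symbolic name for errors."""
    if -MAX_ERRNO <= retval < 0:
        name = errno.errorcode.get(-retval)
        return "-" + name if name else str(retval)
    return str(retval)


def array_length(retval):
    """Length to use for zero(retval): nothing to fetch if the call failed."""
    return 0 if retval < 0 else retval
```

    With this, a failed read would produce no array-length complaint,
    and the return value would be shown by name whenever the platform's
    errno table knows it.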
84
85*** errno tracking
86    Some calls result in setting errno.  Somehow mark those, and on
87    failure, show errno.  System calls return errno as a negative
88    value (see the previous point).
89
90*** second conversions?
91    This definitely calls for some general scripting.  The goal is to
92    have seconds in adjtimex calls show as e.g. 10s, 1m15s or some
93    such.
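
    The conversion itself is small; the open question is how to hook
    it into the config syntax.  A sketch of the formatting step, where
    format_seconds is a hypothetical name and the "h" unit is an
    assumption beyond the examples given above:

```python
def format_seconds(total):
    """Render a non-negative number of seconds as e.g. 10s or 1m15s."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")  # assumption: hours get their own unit
    if minutes:
        parts.append(f"{minutes}m")
    if seconds or not parts:       # always print something, even for 0
        parts.append(f"{seconds}s")
    return "".join(parts)
```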
94
95*** format should take arguments like string does
96    Format should take value argument describing the value that should
97    be analyzed.  The following rewriting rules would then apply:
98
99    | format       | format(array(char, zero)*) |
100    | format(LENS) | X=LENS, format[X]          |
101
102    The latter expanded form would be canonical.
103
104    This depends on named arguments and parameter pack improvements
105    (we need to be able to construct parameter packs that expand to
106    nothing).
107
108*** More fine-tuned control of right arguments
109    Combination of named arguments and some extensions could take care
110    of that:
111
112    | void func(X=hide(int*), long*, +pack(X)); |
113
114    This would show long* as input argument (i.e. the function could
115    mangle it), and later show the pre-fetched X.  The "pack" syntax is
116    utterly undeveloped as of now.  The general idea is to produce
117    arguments that expand to some mix of types and values.  But maybe
118    all we need is something like
119
120    | void func(out int*, long*); |
121
122    ltrace would know that out/inout/in arguments are given in the
123    right order, but the left pass should display in and inout arguments
124    only, and the right pass then out and inout.  + would be
125    backward-compatible syntactic sugar, expanded like so:
126
127    | void func(int*, int*, +long*, long*);              |
128    | void func(in int*, in int*, out long*, out long*); |
129
130    This is useful in particular for:
131
132    | ulong mbsrtowcs(+wstring3_t, string*, ulong, addr); |
133    | ulong wcsrtombs(+string3, wstring_t*, ulong, addr); |
134
135    Where we would like to render arg2 on the way in, and arg1 on the
136    way out.
137
138    But sometimes we may want to see a different type on the way in and
139    on the way out.  E.g. in asprintf, what's interesting on the way in
140    is the address, but on the way out we want to see buffer contents.
141    Does something like the following make sense?
142
143    | void func(X=void*, long*, out string(X)); |
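
    The in/out/inout idea can be sketched as follows.  This only
    models the bookkeeping: parameters are opaque strings, and
    expand_plus and passes are hypothetical names.

```python
def expand_plus(params):
    """Expand '+' sugar: every parameter from the first '+' on is 'out'."""
    expanded, direction = [], "in"
    for param in params:
        if param.startswith("+"):
            direction = "out"
            param = param[1:]
        expanded.append((direction, param))
    return expanded


def passes(expanded):
    """Return (left, right): 1-based argument indices shown in each pass."""
    left = [i for i, (d, _) in enumerate(expanded, 1) if d in ("in", "inout")]
    right = [i for i, (d, _) in enumerate(expanded, 1) if d in ("out", "inout")]
    return left, right


# mbsrtowcs written with explicit directions instead of '+'.
MBSRTOWCS = [("out", "wstring3_t"), ("in", "string*"),
             ("in", "ulong"), ("in", "addr")]
```

    For MBSRTOWCS the left pass covers arguments 2 to 4 and the right
    pass only argument 1, which is the rendering asked for above.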
144
145** Support for functions that never return
146   This would be useful for __cxa_throw, presumably also for longjmp
147   (do we handle that at all?) and perhaps a handful of others.
148
149** Support flag fields
150   enum-like syntax, except disjunction of several values is assumed.
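
   A sketch of how such a field could be rendered.  format_flags and
   OPEN_FLAGS are hypothetical names; O_RDONLY and O_CREAT use the
   values from the open example above, and O_WRONLY=01 is the Linux
   value, added here for illustration.

```python
# O_RDONLY and O_CREAT as in the open example; O_WRONLY is the Linux value.
OPEN_FLAGS = {"O_RDONLY": 0o0, "O_WRONLY": 0o1, "O_CREAT": 0o100}


def format_flags(value, names):
    """Render `value` as a '|'-separated disjunction of known flag names."""
    parts = [name for name, bit in names.items()
             if bit and (value & bit) == bit]
    rest = value
    for name in parts:
        rest &= ~names[name]
    if rest:
        parts.append(oct(rest))  # leftover bits that have no name
    if not parts:
        # Zero value: use the zero-valued name if the definition has one.
        parts = [name for name, bit in names.items() if bit == 0][:1] or ["0"]
    return "|".join(parts)
```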
151** Support long long
152   We currently can't define time_t on 32-bit machines.  That means we
153   can't describe a range of time-related functions.
154
155** Support signed char, unsigned char, char
156   Also, don't format it as a character by default; the string lens can do
157   it.  Perhaps introduce byte and ubyte and leave 'char' as alias of
158   one of those with string lens applied by default.
159
160** Support fixed-width types
161   Really we should keep everything as {u,}int{8,16,32,64} internally,
162   and have long, short and others be translated to one of those
163   according to architecture rules.  Maybe this could be achieved by a
164   per-arch config file with typedefs such as:
165
166   | typedef ulong = uint8_t; |
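
   A sketch of the translation step, assuming two common data models
   (ILP32 and LP64) in place of real per-arch config files.  TYPEDEFS
   and resolve are hypothetical names, and the fixed-width names give
   widths in bits.

```python
# Assumed data models: ILP32 (typical 32-bit Linux) and LP64 (64-bit Linux).
TYPEDEFS = {
    "ILP32": {"short": "int16", "ushort": "uint16",
              "int": "int32", "uint": "uint32",
              "long": "int32", "ulong": "uint32"},
    "LP64":  {"short": "int16", "ushort": "uint16",
              "int": "int32", "uint": "uint32",
              "long": "int64", "ulong": "uint64"},
}


def resolve(type_name, model):
    """Translate a C-level type name to a fixed-width one for `model`."""
    # Names that are already fixed-width pass through unchanged.
    return TYPEDEFS[model].get(type_name, type_name)
```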
167
168** Support for ARM/AARCH64 types
169   - ARM and AARCH64 both support half-precision floating point
170     - there are two different half-precision formats, IEEE 754-2008
171       and "alternative".  Both have 10 bits of mantissa and 5 bits of
172       exponent, and differ only in how exponent==0x1F is handled.  In
173       IEEE format, we get NaN's and infinities; in alternative
174       format, this encodes the normalized value (-1)^S × 2¹⁶ × (1.mant)
175     - The Floating-Point Control Register (FPCR) selects the
176       half-precision format, where applicable, via the FPCR.AHP bit.
177   - AARCH64 supports fixed-point interpretation of {,double}words
178     - e.g. fixed(int, X) (int interpreted as a fixed-point number with
179       X binary digits of fraction).
180   - AARCH64 supports 128-bit quad words in SIMD
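
   The half-precision and fixed-point interpretations described above
   can be sketched as follows.  decode_half and decode_fixed are
   hypothetical names.  The exponent bias of 15 and the subnormal
   handling follow IEEE 754-2008 binary16; that the alternative format
   shares them is an assumption consistent with the description above.

```python
import math


def decode_half(bits, alternative=False):
    """Decode a 16-bit half-precision value passed as an integer."""
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    if exponent == 0:
        # Zero and subnormals: mantissa / 2^10 * 2^-14.
        return sign * mantissa * 2.0 ** -24
    if exponent == 0x1F and not alternative:
        # IEEE format: infinities and NaNs.
        return sign * math.inf if mantissa == 0 else math.nan
    # Normal number.  In the alternative format this also covers exponent
    # 0x1F, which yields (-1)^S * 2^16 * (1.mant).
    return sign * (1 + mantissa / 1024) * 2.0 ** (exponent - 15)


def decode_fixed(raw, fraction_bits):
    """Interpret integer `raw` as fixed-point with `fraction_bits` of fraction."""
    return raw / (1 << fraction_bits)
```

   The two formats agree on every bit pattern except those with
   exponent 0x1F, e.g. 0x7C00 is infinity in IEEE format and 65536.0
   in the alternative one.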
181
182** Some more functions in vect might be made to take const*
183   Or even marked __attribute__((pure)).
184
185** pretty printer support
186   GDB supports Python pretty printers.  We might want to hook this in
187   and use it to format certain types.
188
189* BUGS
190** After a clone(), syscalls may be seen as sysrets in s390 (see trace.c:syscall_p())
191