Dealing with missing system call or ioctl wrappers in Valgrind
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You're probably reading this because Valgrind bombed out whilst
running your program, and advised you to read this file.  The good
news is that, in general, it's easy to write the missing syscall or
ioctl wrappers you need, so that you can continue your debugging.  If
you send the resulting patches to me, then you'll be doing a favour to
all future Valgrind users too.

Note that an "ioctl" is just a special kind of system call, really; so
there's not a lot of need to distinguish them (at least conceptually)
in the discussion that follows.

All this machinery is in coregrind/m_syswrap.


What are syscall/ioctl wrappers?  What do they do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Valgrind does what it does, in part, by keeping track of everything your
program does.  When a system call happens, for example a request to read
part of a file, control passes to the Linux kernel, which fulfills the
request, and returns control to your program.  The problem is that the
kernel will often change the status of some part of your program's memory
as a result, and tools (instrumentation plug-ins) may need to know about
this.

Syscall and ioctl wrappers have two jobs:

1. Tell a tool what's about to happen, before the syscall takes place.  A
   tool could perform checks beforehand, e.g. whether memory about to be
   written is actually writable.  This part is useful, but not strictly
   essential.

2. Tell a tool what just happened, after a syscall takes place.  This is
   so it can update its view of the program's state, e.g. that memory has
   just been written to.  This step is essential.

The "happenings" mostly involve reading/writing of memory.
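The two jobs can be pictured with a toy model.  The sketch below is not
Valgrind code: it assumes a tool that keeps one "defined" flag per byte of
a small buffer, and every name in it (toy_mem, toy_pre_mem_write and so on)
is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: one "defined" flag per byte of a small buffer.
   All names here are hypothetical and are not Valgrind APIs. */
#define TOY_SIZE 16

static unsigned char toy_mem[TOY_SIZE];      /* the program's memory  */
static unsigned char toy_defined[TOY_SIZE];  /* the tool's view of it */

/* Job 1: before the syscall, check the range it is about to write.
   Returns 1 if the range lies inside the buffer, 0 otherwise. */
static int toy_pre_mem_write(size_t off, size_t len)
{
   return off <= TOY_SIZE && len <= TOY_SIZE - off;
}

/* Stand-in for the kernel: fills the range behind the tool's back. */
static void toy_kernel_fill(size_t off, size_t len)
{
   memset(toy_mem + off, 0x2a, len);
}

/* Job 2: after the syscall, record that the range now holds data. */
static void toy_post_mem_write(size_t off, size_t len)
{
   memset(toy_defined + off, 1, len);
}

/* One wrapped "syscall": pre-check, kernel work, post-update. */
static int toy_wrapped_call(size_t off, size_t len)
{
   if (!toy_pre_mem_write(off, len))
      return -1;   /* in this toy, a bad range simply aborts the call */
   toy_kernel_fill(off, len);
   toy_post_mem_write(off, len);
   return 0;
}
```

Without the post-update step the tool's flags would still say the bytes are
undefined even though the "kernel" has filled them, which is why the second
job is the essential one.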

So, let's look at an example of a wrapper for a system call which
should be familiar to many Unix programmers.


The syscall wrapper for time()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The wrapper for the time system call looks like this:

  PRE(sys_time)
  {
     /* time_t time(time_t *t); */
     PRINT("sys_time ( %p )",ARG1);
     PRE_REG_READ1(long, "time", int *, t);
     if (ARG1 != 0) {
        PRE_MEM_WRITE( "time(t)", ARG1, sizeof(vki_time_t) );
     }
  }

  POST(sys_time)
  {
     if (ARG1 != 0) {
        POST_MEM_WRITE( ARG1, sizeof(vki_time_t) );
     }
  }

The first thing we do happens before the syscall occurs, in the PRE() function.
The PRE() function typically starts by invoking the PRINT() macro, which
implements support for the --trace-syscalls command line option.
Next, the tool is told the return type of the syscall, that the syscall has
one argument, the type of that argument, and that the argument is being
read from a register:

     PRE_REG_READ1(long, "time", int *, t);

Next, if a non-NULL buffer is passed in as the argument, tell the tool that the
buffer is about to be written to:

     if (ARG1 != 0) {
        PRE_MEM_WRITE( "time(t)", ARG1, sizeof(vki_time_t) );
     }

Finally, the really important bit, after the syscall occurs, in the POST()
function:  if, and only if, the system call was successful, tell the tool that
the memory was written:

     if (ARG1 != 0) {
        POST_MEM_WRITE( ARG1, sizeof(vki_time_t) );
     }

The POST() function won't be called if the syscall failed, so you
don't need to worry about checking that in the POST() function.
(Note: this is sometimes a bug; some syscalls do return results when
they "fail" - for example, nanosleep returns the amount of unslept
time if interrupted. TODO: add another per-syscall flag for this
case.)
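
The "POST() only runs on success" rule can be illustrated with a toy
dispatcher.  This is only a sketch under assumed names (toy_entry,
toy_dispatch and friends are hypothetical); it does not reproduce how
Valgrind's own syscall driver is written.

```c
#include <stddef.h>

/* Toy dispatcher, not Valgrind code: every name here is hypothetical. */
typedef long (*toy_syscall_fn)(long arg);
typedef void (*toy_hook_fn)(long arg);

struct toy_entry {
   toy_hook_fn    pre;    /* always runs before the call       */
   toy_syscall_fn call;   /* stand-in for the real kernel call */
   toy_hook_fn    post;   /* runs only if the call succeeded   */
};

static int toy_pre_count, toy_post_count;

static void toy_pre(long arg)  { (void)arg; toy_pre_count++; }
static void toy_post(long arg) { (void)arg; toy_post_count++; }

/* Succeeds for non-negative arguments, fails (returns -1) otherwise. */
static long toy_call(long arg) { return arg >= 0 ? arg : -1; }

static long toy_dispatch(const struct toy_entry *e, long arg)
{
   long res;
   if (e->pre)
      e->pre(arg);
   res = e->call(arg);
   if (res >= 0 && e->post)   /* on failure the POST hook is skipped */
      e->post(arg);
   return res;
}
```

A syscall like nanosleep, which hands back data even when it "fails", would
not be served well by this simple rule; that is the case the TODO above
refers to.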

Note that we use the type 'vki_time_t'.  This is a copy of the kernel
type, with 'vki_' prefixed.  Our copies of such types are kept in the
appropriate vki*.h file(s).  We don't include kernel headers or glibc headers
directly.


Writing your own syscall wrappers (see below for ioctl wrappers)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If Valgrind tells you that system call NNN is unimplemented, do the
following:

1.  Find out the name of the system call:

       grep NNN /usr/include/asm/unistd*.h

    This should tell you something like  __NR_mysyscallname.
    Copy this entry to include/vki/vki-scnums-$(VG_PLATFORM).h.


2.  Do 'man 2 mysyscallname' to get some idea of what the syscall
    does.  Note that the actual kernel interface can differ from this,
    so you might also want to check a version of the Linux kernel
    source.

    NOTE: any syscall which has something to do with signals or
    threads is probably "special", and needs more careful handling.
    Post something to valgrind-developers if you aren't sure.


3.  Add a case to the already-huge collection of wrappers in
    the coregrind/m_syswrap/syswrap-*.c files.
    For each in-memory parameter which is read or written by
    the syscall, do one of

      PRE_MEM_READ( ... )
      PRE_MEM_RASCIIZ( ... )
      PRE_MEM_WRITE( ... )

    for that parameter.  Then do the syscall.  Then, if the syscall
    succeeds, issue suitable POST_MEM_WRITE( ... ) calls.
    (There's no need for POST_MEM_READ calls.)

    Also, add it to the syscall_table[] array; use one of GENX_, GENXY,
    LINX_, LINXY, PLAX_, PLAXY.
    Use GEN* for generic syscalls (in syswrap-generic.c), LIN* for
    Linux-specific ones (in syswrap-linux.c) and PLA* for the
    platform-dependent ones (in syswrap-$(PLATFORM)-linux.c).
    Use the *XY variant if the syscall requires both a PRE() and a
    POST() function, and the *X_ variant if it only requires a PRE()
    function.

    If you find this difficult, read the wrappers for other syscalls
    for ideas.  A good tip is to look for the wrapper for a syscall
    which has a similar behaviour to yours, and use it as a
    starting point.

    If you need structure definitions and/or constants for your syscall,
    copy them from the kernel headers into include/vki.h and co., with
    the appropriate vki_*/VKI_* name mangling.  Don't #include any
    kernel headers.  And certainly don't #include any glibc headers.

    Test it.

    Note that a common error is to call POST_MEM_WRITE( ... )
    with 0 (NULL) as the first (address) argument.  This usually means
    your logic is slightly inadequate.  It's a sufficiently common bug
    that there's a built-in check for it, and you'll get a "probably
    sanity check failure" for the syscall wrapper you just made, if this
    is the case.


4.  Once happy, send us the patch.  Pretty please.
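
As a rough illustration of step 3, here is what a wrapper pair for a
made-up syscall might look like.  Everything specific in it is an
assumption: the syscall mysyscallname, its signature and the struct
vki_my_info do not exist, and the macros are replaced by simple stand-ins
that only mimic the argument order used in the time() example, so that
the sketch compiles outside the Valgrind tree.

```c
#include <stddef.h>

/* Stand-ins so this sketch compiles on its own.  They only mimic the
   argument order seen in the time() wrapper; they are NOT Valgrind's
   real definitions. */
static long   stub_args[2];      /* hypothetical syscall arguments   */
static size_t stub_pre_bytes;    /* bytes announced before the call  */
static size_t stub_post_bytes;   /* bytes marked written afterwards  */

#define ARG1 (stub_args[0])
#define ARG2 (stub_args[1])
#define PRE(name)   static void pre_##name(void)
#define POST(name)  static void post_##name(void)
#define PRE_MEM_WRITE(what, addr, len) \
   ((void)(what), (void)(addr), stub_pre_bytes += (len))
#define POST_MEM_WRITE(addr, len) \
   ((void)(addr), stub_post_bytes += (len))

/* Hypothetical vki_ copy of a kernel structure. */
struct vki_my_info { long a; long b; };

/* Hypothetical: long mysyscallname(int flags, struct my_info *out); */
PRE(sys_mysyscallname)
{
   if (ARG2 != 0)
      PRE_MEM_WRITE( "mysyscallname(out)", ARG2,
                     sizeof(struct vki_my_info) );
}

POST(sys_mysyscallname)
{
   /* Guard against NULL, the common POST_MEM_WRITE mistake noted above. */
   if (ARG2 != 0)
      POST_MEM_WRITE( ARG2, sizeof(struct vki_my_info) );
}
```

Because this hypothetical syscall needs both a PRE() and a POST() function,
its syscall_table[] entry would use one of the *XY variants.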


Writing your own ioctl wrappers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Writing ioctl wrappers is pretty much the same as writing syscall
wrappers, except that all the action happens within PRE(ioctl) and
POST(ioctl).

There's a default case, but sometimes it isn't correct and you have to
write a more specific case to get the right behaviour.

As above, please create a bug report and attach the patch as described
on http://www.valgrind.org.
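
The idea of a default case plus more specific cases can be sketched as a
switch on the request number.  This is a toy model only: the request
numbers, the sizes and the function toy_ioctl_written are invented for
illustration and do not correspond to real ioctls or to Valgrind's code.

```c
#include <stddef.h>

/* Toy model of per-request handling; the request numbers and sizes
   below are made up and are not real ioctl values. */
enum { TOY_REQ_GET_FLAGS = 1, TOY_REQ_GET_NAME = 2 };

/* Returns the number of bytes the request writes through its argument.
   *used_default is set to 1 when no specific case matched. */
static size_t toy_ioctl_written(unsigned long request,
                                size_t default_guess,
                                int *used_default)
{
   *used_default = 0;
   switch (request) {
   case TOY_REQ_GET_FLAGS:
      return sizeof(int);
   case TOY_REQ_GET_NAME:
      return 32;            /* hypothetical fixed-size name buffer */
   default:
      /* Generic fallback: may be wrong for a particular request, which
         is when a dedicated case has to be added above. */
      *used_default = 1;
      return default_guess;
   }
}
```

Adding support for a new ioctl then amounts to adding one more case that
describes exactly which memory the request reads or writes.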