============
CMake Primer
============

.. contents::
   :local:

.. warning::
   Disclaimer: This documentation is written by LLVM project contributors,
   *not* anyone affiliated with the CMake project. This document may contain
   inaccurate terminology, phrasing, or technical details. It is provided with
   the best intentions.

Introduction
============

The LLVM project and many of the core projects built on LLVM build using CMake.
This document aims to provide a brief overview of CMake for developers modifying
LLVM projects or building their own projects on top of LLVM.

The official CMake language reference is available in the cmake-language
manpage and in the `cmake-language online documentation
<https://cmake.org/cmake/help/v3.4/manual/cmake-language.7.html>`_.

10,000 ft View
==============

CMake is a tool that reads script files in its own language that describe how a
software project builds. As CMake evaluates the scripts, it constructs an
internal representation of the software project. Once the scripts have been
fully processed, if there are no errors, CMake generates build files to
actually build the project. CMake supports generating build files for a variety
of command line build tools as well as for popular IDEs.

When a user runs CMake, it performs a variety of checks similar to how autoconf
worked historically. During the checks and the evaluation of the build
description scripts, CMake caches values into the CMakeCache. This is useful
because it allows the build system to skip long-running checks during
incremental development. CMake caching also has some drawbacks, but those will
be discussed later.

Scripting Overview
==================

CMake's scripting language has a very simple grammar. Every language construct
is a command that matches the pattern ``name(args)``. Commands come in three
primary types: language-defined (commands implemented in C++ in CMake), defined
functions, and defined macros. The CMake distribution also contains a suite of
CMake modules that contain definitions for useful functionality.

The example below is the full CMake build for building a C++ "Hello World"
program. The example uses only CMake language-defined commands.

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)

The CMake language provides control flow constructs in the form of foreach loops
and if blocks. To make the example above more complicated you could add an if
block to define "APPLE" when targeting Apple platforms:

.. code-block:: cmake

   cmake_minimum_required(VERSION 3.2)
   project(HelloWorld)
   add_executable(HelloWorld HelloWorld.cpp)
   if(APPLE)
     target_compile_definitions(HelloWorld PUBLIC APPLE)
   endif()

Variables, Types, and Scope
===========================

Dereferencing
-------------

In CMake, variables are "stringly" typed. All variables are represented as
strings throughout evaluation. Wrapping a variable name in ``${}`` dereferences
it and results in a literal substitution of the value for the name. CMake refers
to this as "variable evaluation" in its documentation. Dereferences are
performed *before* the command being called receives the arguments. This means
dereferencing a list in an unquoted argument results in multiple separate
arguments being passed to the command.
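
As a minimal sketch of this behavior, the snippet below uses a hypothetical
helper, ``count_args``, that reports how many arguments it received. Quoting the
dereference keeps the list together as a single argument:

.. code-block:: cmake

   # count_args is a hypothetical helper used only for illustration.
   function(count_args)
     message("received ${ARGC} argument(s)")
   endfunction()

   set(my_list a b c)
   count_args(${my_list})   # prints: received 3 argument(s)
   count_args("${my_list}") # prints: received 1 argument(s)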

Variable dereferences can be nested and used to model complex data. For
example:

.. code-block:: cmake

   set(var_name var1)
   set(${var_name} foo) # same as "set(var1 foo)"
   set(${${var_name}}_var bar) # same as "set(foo_var bar)"

Dereferencing an unset variable results in an empty expansion. It is a common
pattern in CMake to conditionally set a variable, knowing that it will also be
used in code paths where it isn't set. There are examples of this throughout
the LLVM CMake build system.

An example of variable empty expansion is:

.. code-block:: cmake

   if(APPLE)
     set(extra_sources Apple.cpp)
   endif()
   add_executable(HelloWorld HelloWorld.cpp ${extra_sources})

In this example the ``extra_sources`` variable is only defined if you're
targeting an Apple platform. For all other targets ``extra_sources`` will be
evaluated as empty before ``add_executable`` is given its arguments.

One big "Gotcha" with variable dereferencing is that ``if`` commands implicitly
dereference values. This has some unexpected results. For example:

.. code-block:: cmake

   if("${SOME_VAR}" STREQUAL "MSVC")

In this code sample, if ``MSVC`` is a defined variable it will be implicitly
dereferenced, which results in the ``if`` command comparing the values of the
variables ``SOME_VAR`` and ``MSVC``. A common workaround for this problem is to
prepend an ``x`` to the strings being compared.

.. code-block:: cmake

   if("x${SOME_VAR}" STREQUAL "xMSVC")

This works because while ``MSVC`` is a defined variable, ``xMSVC`` is not. This
pattern is uncommon, but it does occur in LLVM's CMake scripts.

.. note::

   Once the LLVM project upgrades its minimum CMake version to 3.1 or later we
   can prevent this behavior by setting CMP0054 to ``NEW``. For more
   information on CMake policies please see the cmake-policies manpage or the
   `cmake-policies online documentation
   <https://cmake.org/cmake/help/v3.4/manual/cmake-policies.7.html>`_.
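
The effect of the policy can be sketched as follows. This is an illustrative
snippet, not code from LLVM, and it assumes CMake 3.1 or later so that CMP0054
is known. The variable values are chosen only to make the difference visible:

.. code-block:: cmake

   cmake_policy(SET CMP0054 NEW)

   set(MSVC 1)      # stands in for the variable CMake defines for MSVC builds
   set(SOME_VAR 1)

   # With CMP0054 set to NEW, quoted arguments are never dereferenced again,
   # so this compares the strings "1" and "MSVC".
   if("${SOME_VAR}" STREQUAL "MSVC")
     message("equal")
   else()
     message("not equal") # this branch runs
   endif()

Under the old behavior the quoted ``"MSVC"`` would be replaced by the value of
the ``MSVC`` variable, and the comparison would succeed.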

Lists
-----

In CMake lists are semicolon-delimited strings, and it is strongly advised that
you avoid using semicolons in list elements; it doesn't go smoothly. A few
examples of defining lists:

.. code-block:: cmake

   # Creates a list with members a, b, c, and d
   set(my_list a b c d)
   set(my_list "a;b;c;d")

   # Creates a string "a b c d"
   set(my_string "a b c d")
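
Lists are usually manipulated with the built-in ``list`` command. The sketch
below shows a few common operations and why a semicolon inside an element
causes trouble. The variable names are illustrative only:

.. code-block:: cmake

   set(my_list a b c)
   list(APPEND my_list d)      # my_list is now "a;b;c;d"
   list(LENGTH my_list count)
   message("${count}")         # prints: 4
   list(GET my_list 0 first)
   message("${first}")         # prints: a

   # A semicolon inside an element silently splits it into two elements.
   set(surprise "x;y" z)
   list(LENGTH surprise count)
   message("${count}")         # prints: 3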

Lists of Lists
--------------

One of the more complicated patterns in CMake is lists of lists. Because a list
cannot contain an element with a semicolon, to construct a list of lists you
make a list of variable names that refer to other lists. For example:

.. code-block:: cmake

   set(list_of_lists a b c)
   set(a 1 2 3)
   set(b 4 5 6)
   set(c 7 8 9)

With this layout you can iterate through the list of lists printing each value
with the following code:

.. code-block:: cmake

   foreach(list_name IN LISTS list_of_lists)
     foreach(value IN LISTS ${list_name})
       message(${value})
     endforeach()
   endforeach()

You'll notice that the inner foreach loop's list is doubly dereferenced. This is
because the first dereference turns ``list_name`` into the name of the sub-list
(a, b, or c in the example), then the second dereference, performed by
``IN LISTS``, gets the value of the list.

This pattern is used throughout CMake. The most common example is the compiler
flags options, which CMake refers to using the following variable expansions:
``CMAKE_${LANGUAGE}_FLAGS`` and ``CMAKE_${LANGUAGE}_FLAGS_${CMAKE_BUILD_TYPE}``.
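
As a rough sketch of how such computed variable names can be read, the snippet
below prints the flags for two languages. It assumes a single-configuration
generator where ``CMAKE_BUILD_TYPE`` is set, and it upper-cases the build type
because the per-configuration variables use upper-case suffixes (for example
``CMAKE_CXX_FLAGS_RELEASE``):

.. code-block:: cmake

   string(TOUPPER "${CMAKE_BUILD_TYPE}" build_type_upper)
   foreach(lang C CXX)
     # The inner ${lang} is expanded first, producing the variable name that
     # the outer dereference then looks up.
     message("${lang} flags: ${CMAKE_${lang}_FLAGS}")
     message("${lang} ${CMAKE_BUILD_TYPE} flags: "
             "${CMAKE_${lang}_FLAGS_${build_type_upper}}")
   endforeach()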

Other Types
-----------

Variables that are cached or specified on the command line can have types
associated with them. The variable's type is used by CMake's UI tool to display
the right input field. The variable's type generally doesn't impact evaluation.
One of the few exceptions is PATH variables, which CMake does have some special
handling for. You can read more about the special handling in `CMake's set
documentation
<https://cmake.org/cmake/help/v3.5/command/set.html#set-cache-entry>`_.
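
A short sketch of typed cache entries follows. The ``EXAMPLE_*`` names are
hypothetical and exist only for illustration:

.. code-block:: cmake

   # Shown as a checkbox in CMake's UI tools.
   set(EXAMPLE_ENABLE_FEATURE ON CACHE BOOL "Enable the example feature")

   # Shown as a free-form text field.
   set(EXAMPLE_LOG_LEVEL "info" CACHE STRING "Logging level for the example")

   # Shown with a directory chooser. Per the set documentation, a relative
   # value given on the command line without a type (for example
   # -DEXAMPLE_TOOLS_DIR=../tools) is converted to an absolute path.
   set(EXAMPLE_TOOLS_DIR "" CACHE PATH "Directory containing example tools")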

Scope
-----

CMake inherently has directory-based scoping. Setting a variable in a
CMakeLists file will set the variable for that file and all subdirectories.
Variables set in a CMake module that is included in a CMakeLists file will be
set in the scope they are included from, and all subdirectories.

When a variable that is already set is set again in a subdirectory it overrides
the value in that scope and any deeper subdirectories.

The CMake ``set`` command provides two scope-related options. ``PARENT_SCOPE``
sets a variable into the parent scope, and not the current scope. The ``CACHE``
option sets the variable in the CMakeCache, which results in it being set in all
scopes. The ``CACHE`` option will not set a variable that already exists in the
cache unless the ``FORCE`` option is specified.
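
The cache behavior can be sketched as below. The example assumes a
``CMakeLists.txt`` configured in a fresh build directory, with no value for the
hypothetical ``EXAMPLE_SETTING`` entry passed on the command line and no normal
variable of the same name:

.. code-block:: cmake

   # Creates the cache entry because it does not exist yet.
   set(EXAMPLE_SETTING "first" CACHE STRING "An example cache entry")

   # Ignored: the entry already exists in the cache.
   set(EXAMPLE_SETTING "second" CACHE STRING "An example cache entry")
   message("${EXAMPLE_SETTING}") # prints: first

   # FORCE overwrites the existing entry.
   set(EXAMPLE_SETTING "third" CACHE STRING "An example cache entry" FORCE)
   message("${EXAMPLE_SETTING}") # prints: third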

In addition to directory-based scope, CMake functions also have their own scope.
This means variables set inside functions do not bleed into the parent scope.
This is not true of macros, and it is for this reason LLVM prefers functions
over macros whenever reasonable.
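
A minimal sketch of these rules, using hypothetical command names:

.. code-block:: cmake

   function(set_locally)
     set(result "local")                        # visible only in the function
   endfunction()

   function(set_in_parent)
     set(result "from function" PARENT_SCOPE)   # written to the caller's scope
   endfunction()

   macro(set_from_macro)
     set(result "from macro")                   # macros share the caller's scope
   endmacro()

   set(result "original")
   set_locally()
   message("${result}") # prints: original
   set_in_parent()
   message("${result}") # prints: from function
   set_from_macro()
   message("${result}") # prints: from macro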

.. note::
  Unlike C-based languages, CMake's loop and control flow blocks do not have
  their own scopes.

Control Flow
============

CMake features the same basic control flow constructs you would expect in any
scripting language, but there are a few quirks because, as with everything in
CMake, control flow constructs are commands.

If, ElseIf, Else
----------------

.. note::
  For the full documentation on the CMake if command go
  `here <https://cmake.org/cmake/help/v3.4/command/if.html>`_. That resource is
  far more complete.

In general CMake if blocks work the way you'd expect:

.. code-block:: cmake

  if(<condition>)
    # do stuff
  elseif(<condition>)
    # do other stuff
  else()
    # do other other stuff
  endif()

The single most important thing to know about CMake's if blocks coming from a C
background is that they do not have their own scope. Variables set inside
conditional blocks persist after the ``endif()``.
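
For instance, in the following sketch (the variable name is illustrative) the
variable is still set after the block ends:

.. code-block:: cmake

   if(TRUE)
     set(set_inside_if "still here")
   endif()
   message("${set_inside_if}") # prints: still here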

Loops
-----

The most common form of the CMake ``foreach`` block is:

.. code-block:: cmake

  foreach(var ...)
    # do stuff
  endforeach()

The variable argument portion of the ``foreach`` block can contain dereferenced
lists, values to iterate, or a mix of both:

.. code-block:: cmake

  foreach(var foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var ${my_list})
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var ${my_list} out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

There is also a more modern CMake foreach syntax. The code below is equivalent
to the code above:

.. code-block:: cmake

  foreach(var IN ITEMS foo bar baz)
    message(${var})
  endforeach()
  # prints:
  #  foo
  #  bar
  #  baz

  set(my_list 1 2 3)
  foreach(var IN LISTS my_list)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3

  foreach(var IN LISTS my_list ITEMS out_of_bounds)
    message(${var})
  endforeach()
  # prints:
  #  1
  #  2
  #  3
  #  out_of_bounds

Similar to the conditional statements, these generally behave how you would
expect, and they do not have their own scope.

CMake also supports ``while`` loops, although they are not widely used in LLVM.
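
For completeness, a minimal ``while`` loop might look like the sketch below.
The loop condition follows the same rules as ``if``, so the unquoted ``i`` is
implicitly dereferenced:

.. code-block:: cmake

  set(i 0)
  while(i LESS 3)
    message("${i}")
    math(EXPR i "${i} + 1")
  endwhile()
  # prints:
  #  0
  #  1
  #  2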

Modules, Functions and Macros
=============================

Modules
-------

Modules are CMake's vehicle for enabling code reuse. CMake modules are just
CMake script files. They can contain code to execute on include as well as
definitions for commands.
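
A sketch of how a project might load one of its own modules follows. The
``cmake/modules`` directory and the ``ExampleHelpers`` module are hypothetical
names chosen for illustration:

.. code-block:: cmake

   # Tell CMake where to look for the project's own modules.
   list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules")

   # Finds and runs cmake/modules/ExampleHelpers.cmake. Any top-level code in
   # the file executes now, and any commands it defines become callable.
   include(ExampleHelpers)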

In CMake, macros and functions are universally referred to as commands, and they
are the primary method of defining code that can be called multiple times.

In LLVM we have several CMake modules that are included as part of our
distribution for developers who don't build our project from source. Those
modules are the fundamental pieces needed to build LLVM-based projects with
CMake. We also rely on modules as a way of organizing the build system's
functionality for maintainability and re-use within LLVM projects.

Argument Handling
-----------------

When defining a CMake command, handling arguments is very useful. The examples
in this section will all use the CMake ``function`` block, but this all applies
to the ``macro`` block as well.

CMake commands can have named arguments, but all commands are implicitly
variable argument. If the command has named arguments they are required and must
be specified at every call site. Below is a trivial example of providing a
wrapper function for CMake's built-in command ``add_dependencies``.

.. code-block:: cmake

   function(add_deps target)
     add_dependencies(${target} ${ARGN})
   endfunction()

This example defines a new function named ``add_deps`` which takes a required
first argument, and just calls another command passing through the first
argument and all trailing arguments. When variable arguments are present CMake
defines them in a list named ``ARGN``, and the total count of arguments passed
is defined in ``ARGC``. The full argument list, including the named arguments,
is available in ``ARGV``.
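
The argument variables that CMake defines inside a function can be inspected
with a small sketch such as the one below. ``show_args`` is a hypothetical
helper, not an LLVM or CMake command:

.. code-block:: cmake

   function(show_args first)
     message("first: ${first}") # the named argument
     message("ARGC: ${ARGC}")   # count of all arguments passed
     message("ARGV: ${ARGV}")   # all arguments, including named ones
     message("ARGN: ${ARGN}")   # only the arguments after the named ones
   endfunction()

   show_args(a b c)
   # prints:
   # first: a
   # ARGC: 3
   # ARGV: a;b;c
   # ARGN: b;c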

CMake provides a module ``CMakeParseArguments`` which provides an implementation
of advanced argument parsing. We use this all over LLVM, and it is recommended
for any function that has complex argument-based behaviors or optional
arguments. CMake's official documentation for the module is in the
``cmake-modules`` manpage, and is also available at the
`cmake-modules online documentation
<https://cmake.org/cmake/help/v3.4/module/CMakeParseArguments.html>`_.

.. note::
  As of CMake 3.5 the ``cmake_parse_arguments`` command has become a native
  command and the ``CMakeParseArguments`` module is empty and only left around
  for compatibility.
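
A minimal sketch of ``cmake_parse_arguments`` in use is shown below. The
``example_add_tool`` command and its keywords are hypothetical and are not
taken from LLVM's build system:

.. code-block:: cmake

   # Only needed for CMake older than 3.5; harmless on newer versions.
   include(CMakeParseArguments)

   function(example_add_tool name)
     cmake_parse_arguments(ARG
       "VERBOSE"          # options (boolean flags)
       "OUTPUT_DIR"       # keywords that take a single value
       "SOURCES;DEPENDS"  # keywords that take multiple values
       ${ARGN})

     # Anything that matched no keyword ends up here.
     if(ARG_UNPARSED_ARGUMENTS)
       message(FATAL_ERROR "Unknown arguments: ${ARG_UNPARSED_ARGUMENTS}")
     endif()

     if(ARG_VERBOSE)
       message("${name}: sources=${ARG_SOURCES} output=${ARG_OUTPUT_DIR}")
     endif()
   endfunction()

   example_add_tool(my-tool VERBOSE OUTPUT_DIR bin SOURCES a.cpp b.cpp)
   # prints: my-tool: sources=a.cpp;b.cpp output=bin

Each parsed keyword is stored in a variable named with the chosen prefix, here
``ARG_``, followed by the keyword.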

Functions Vs Macros
-------------------

Functions and Macros look very similar in how they are used, but there is one
fundamental difference between the two. Functions have their own scope, and
macros don't. This means variables set in macros will bleed out into the calling
scope. That makes macros suitable for defining very small bits of functionality
only.

The other difference between CMake functions and macros is how arguments are
passed. Arguments to macros are not set as variables; instead, dereferences to
the parameters are resolved across the macro before executing it. This can
result in some unexpected behavior if using non-dereferenced variables. For
example:

.. code-block:: cmake

   macro(print_list my_list)
     foreach(var IN LISTS my_list)
       message("${var}")
     endforeach()
   endmacro()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # a
   # b
   # c
   # d

Generally speaking this issue is uncommon because it requires using
non-dereferenced variables with names that overlap in the parent scope, but it
is important to be aware of because it can lead to subtle bugs.
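
One way to avoid the surprise, sketched here under the assumption that the
caller passes the *name* of a list, is to use a function and dereference the
parameter explicitly:

.. code-block:: cmake

   function(print_list list_name)
     # ${list_name} expands to the name passed by the caller, and IN LISTS
     # then reads the list stored under that name.
     foreach(var IN LISTS ${list_name})
       message("${var}")
     endforeach()
   endfunction()

   set(my_list a b c d)
   set(my_list_of_numbers 1 2 3 4)
   print_list(my_list_of_numbers)
   # prints:
   # 1
   # 2
   # 3
   # 4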

LLVM Project Wrappers
=====================

LLVM projects provide lots of wrappers around critical CMake built-in commands.
We use these wrappers to provide consistent behaviors across LLVM components
and to reduce code duplication.

We generally (but not always) follow the convention that commands prefaced with
``llvm_`` are intended to be used only as building blocks for other commands.
Wrapper commands that are intended for direct use are generally named with the
project in the middle of the command name (e.g. ``add_llvm_executable``
is the wrapper for ``add_executable``). The LLVM ``add_*`` wrapper functions are
all defined in ``AddLLVM.cmake`` which is installed as part of the LLVM
distribution. It can be included and used by any LLVM sub-project that requires
LLVM.

.. note::

   Not all LLVM projects require LLVM for all use cases. For example compiler-rt
   can be built without LLVM, and the compiler-rt sanitizer libraries are used
   with GCC.
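
As a rough sketch, a project built against an installed LLVM might pull in the
wrappers as shown below. The target and source names are hypothetical, and a
real project will likely need further setup (compile definitions, additional
components) that is not shown here:

.. code-block:: cmake

   # Locate an installed LLVM through its CMake package configuration.
   find_package(LLVM REQUIRED CONFIG)

   # Make LLVM's installed modules visible, then load the add_* wrappers.
   list(APPEND CMAKE_MODULE_PATH "${LLVM_CMAKE_DIR}")
   include(AddLLVM)

   include_directories(${LLVM_INCLUDE_DIRS})

   # The wrapper reads the components to link from LLVM_LINK_COMPONENTS.
   set(LLVM_LINK_COMPONENTS Support)
   add_llvm_executable(example-tool example-tool.cpp)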

Useful Built-in Commands
========================

CMake has a bunch of useful built-in commands. This document isn't going to
go into details about them because the CMake project has excellent
documentation. To highlight a few useful commands, see:

* `add_custom_command <https://cmake.org/cmake/help/v3.4/command/add_custom_command.html>`_
* `add_custom_target <https://cmake.org/cmake/help/v3.4/command/add_custom_target.html>`_
* `file <https://cmake.org/cmake/help/v3.4/command/file.html>`_
* `list <https://cmake.org/cmake/help/v3.4/command/list.html>`_
* `math <https://cmake.org/cmake/help/v3.4/command/math.html>`_
* `string <https://cmake.org/cmake/help/v3.4/command/string.html>`_
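
As an illustration of how the first two commands are commonly combined, the
sketch below copies a file into the build directory as part of the default
build. The file and target names are hypothetical:

.. code-block:: cmake

   # Describes how to produce output.txt and when it is out of date.
   add_custom_command(
     OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/output.txt
     COMMAND ${CMAKE_COMMAND} -E copy
             ${CMAKE_CURRENT_SOURCE_DIR}/input.txt
             ${CMAKE_CURRENT_BINARY_DIR}/output.txt
     DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/input.txt
     COMMENT "Copying input.txt to the build directory")

   # A target that depends on the generated file, so the command above runs.
   add_custom_target(copy_example ALL
     DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/output.txt)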

The full documentation for CMake commands is in the ``cmake-commands`` manpage
and available on `CMake's website <https://cmake.org/cmake/help/v3.4/manual/cmake-commands.7.html>`_.