Enabling OpenMP support for data.table on macOS
Xcode compilers on macOS do not support OpenMP out of the box. As a consequence, R packages that rely on OpenMP use only one of the available cores for their computations, which often results in longer computing times. In this blog post, I show how to re-enable OpenMP on macOS and how to rebuild affected R packages.
The R package data.table provides very efficient functions to read and write large CSV files: fread and fwrite. To do so, data.table makes heavy use of multithreaded C code.
Yet, given Xcode’s missing OpenMP support[1] on macOS,[2] data.table greets users with the following message:
> library(data.table)
data.table 1.14.8 using 1 threads (see ?getDTthreads). Latest news:
r-datatable.com
**********
This installation of data.table has not detected OpenMP support. It
should still work but in single-threaded mode.
This is a Mac. Please read https://mac.r-project.org/openmp/. Please
engage with Apple and ask them for support. Check r-datatable.com for
updates, and our Mac instructions here:
https://github.com/Rdatatable/data.table/wiki/Installation. After
several years of many reports of installation problems on Mac, it's
time to gingerly point out that there have been no similar problems
on Windows or Linux.
**********
Fortunately, fixing this is not too complicated.[3] All we have to do to convince Xcode’s Clang to enable multithreading in data.table (and other R packages) is to install the OpenMP runtime and add some build flags.
Get the OpenMP runtime library
The runtime consists of a dynamic library (libomp.dylib) and three header files, which can be fetched from https://mac.r-project.org/openmp and need to be copied to /usr/local/lib and /usr/local/include respectively.
Alternatively, fetch the LLVM sources, check out the commit matching the version shipped with Xcode (4ba6a9c9f65b for Xcode 14.x), and build libomp from source.[4]
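Whichever route is taken, it may be worth confirming that the files ended up where the build flags used later expect them. The following is a minimal sketch, assuming the default /usr/local prefix; check_libomp is a hypothetical helper name and not part of any tool mentioned in this post. It only looks for libomp.dylib and the main header omp.h.

```shell
# Hypothetical helper: check that the OpenMP runtime files exist.
# The prefix defaults to /usr/local; pass another one as first argument.
check_libomp() {
  omp_prefix="${1:-/usr/local}"
  omp_rc=0
  for f in "$omp_prefix/lib/libomp.dylib" "$omp_prefix/include/omp.h"; do
    if [ -f "$f" ]; then
      echo "found:   $f"
    else
      echo "missing: $f"
      omp_rc=1
    fi
  done
  # non-zero exit status if at least one file is missing
  return "$omp_rc"
}
```

Calling check_libomp without arguments inspects /usr/local and returns a non-zero status if either file is missing.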
Build data.table
Now that we have the OpenMP runtime, let’s (re-)build data.table. We may choose one of these approaches:
- Update ~/.R/Makevars.[5]

      CPPFLAGS += -Xclang -fopenmp
      LDFLAGS += -lomp -Wl,-rpath,/usr/local/lib

  Then, build data.table from within R with install.packages("data.table", type = "source") or from the command line. Installing from the command line requires the package tarball to be present in the current working directory.

      R CMD INSTALL data.table_1.14.8.tar.gz

- Directly provide R CMD INSTALL with the information to compile and link data.table with OpenMP.

      PKG_CPPFLAGS='-Xclang -fopenmp' \
      PKG_LIBS='-lomp -Wl,-rpath,/usr/local/lib' \
      R CMD INSTALL data.table_1.14.8.tar.gz

- Alternatively, link the OpenMP runtime statically to data.table. This comes with the usual advantages and disadvantages of linking statically,[6] and requires libomp to be built as a static library (see footnote 4).

      PKG_CPPFLAGS='-Xclang -fopenmp' \
      PKG_LIBS='/usr/local/lib/libomp.a' \
      R CMD INSTALL data.table_1.14.8.tar.gz
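For the first approach, the two Makevars lines can also be appended from the command line. This is a sketch only: add_openmp_makevars is a hypothetical helper, and it assumes that the user-level Makevars lives at ~/.R/Makevars and that any existing line containing -fopenmp means the flags are already set.

```shell
# Hypothetical helper: append the OpenMP flags to a Makevars file once.
# The file defaults to ~/.R/Makevars; pass another path as first argument.
add_openmp_makevars() {
  makevars="${1:-$HOME/.R/Makevars}"
  mkdir -p "$(dirname "$makevars")"
  touch "$makevars"
  # Assumption: a line mentioning -fopenmp means the flags are already there.
  if grep -qF -e '-fopenmp' "$makevars"; then
    echo "OpenMP flags already present in $makevars"
  else
    printf '%s\n' \
      'CPPFLAGS += -Xclang -fopenmp' \
      'LDFLAGS += -lomp -Wl,-rpath,/usr/local/lib' >> "$makevars"
    echo "OpenMP flags appended to $makevars"
  fi
}
```

Running add_openmp_makevars a second time leaves the file unchanged, so it should be safe to call before every rebuild.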
Use the rebuilt data.table
After (re-)building data.table, start R and load the freshly built package. The startup message should now report how many threads data.table will use.
> library(data.table)
data.table 1.14.8 using 4 threads (see ?getDTthreads). Latest news: r-datatable.com
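The same check can be run from the command line without opening an interactive session. A minimal sketch, assuming Rscript is on the PATH; dt_threads and the RSCRIPT override variable are hypothetical names introduced here, while getDTthreads() is the data.table function referred to in the startup message.

```shell
# Hypothetical helper: print the number of threads data.table will use.
# Set RSCRIPT to use a specific Rscript binary instead of the one on PATH.
dt_threads() {
  "${RSCRIPT:-Rscript}" -e 'writeLines(as.character(data.table::getDTthreads()))'
}
```

If dt_threads prints 1 on a multi-core machine, the package was most likely still built without OpenMP.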
Happy coding!
- This is not to say that macOS doesn’t provide means to build multithreaded applications. In fact, the system ships with, for instance, pthread.h as well as Apple’s Grand Central Dispatch.

- Thanks to the work of the R project at https://mac.r-project.org/openmp.

- What I did was something along these lines:

      # fetch sources and switch to OpenMP directory
      git clone --branch release/15.x https://github.com/llvm/llvm-project.git
      cd llvm-project
      git checkout 4ba6a9
      cd openmp
      # configure and build (to build a static library, add
      # -DLIBOMP_ENABLE_SHARED=false)
      mkdir build && cd build
      cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ ..
      make -j
      # install
      sudo make install

- Avid readers of the data.table installation documentation on GitHub or the macOS documentation on R-Project (https://mac.r-project.org/openmp) may notice that I added -Wl,-rpath,/usr/local/lib to PKG_LIBS. This is necessary to ensure that libomp.dylib is actually found by data.table.

- See e.g. https://en.wikipedia.org/wiki/Static_build and https://en.wikipedia.org/wiki/Library_(computing).