R packages by Hadley Wickham


Namespace

The package NAMESPACE is one of the most confusing parts of building a package. Roxygen2 aims to make it as easy as possible to build a package that is a well-behaved member of the R ecosystem. This is a little frustrating at first, but soon becomes second-nature.

As the name suggest, namespaces provide “spaces” for “names”, providing a context for evaluating which object is found when you look for it. When developing code, they allow you to specify which package to look for a function when there are multiple packages it could come from, and when developing packages they ensure that your functions always call functions in the same place.

Namespaces make it possible for packages to refer to specific functions in other packages, not the functions that you have defined in the global workspace.

For example, take the simple nrow() function:

nrow
#> function (x) 
#> dim(x)[1L]
#> <bytecode: 0xdf4ad8>
#> <environment: namespace:base>

What happens if we override dim() with our own definition — does nrow() break?

dim <- function(x) c(1, 1)
dim(mtcars)
#> [1] 1 1
nrow(mtcars)
#> [1] 32

Suprisingly, it does not! That’s because when nrow() looks for an object called dim(), it finds the function in the base package, not our function.

Namespaces also provide an extension to the standard way of looking up the value of an object: instead of just typing objectname, you type package::objectname.

Namespaces also control the functions and methods that your package makes available for use by others. Namespaces make it easier to come up with your own function names without worrying about what names other packages have used. A namespace means you can use any name you like for internal functions, and when there is a conflict with an exported function, there is a standard disambiguation procedure.

The easiest way to use namespaces is with roxygen2, because it keeps the namespace definitions next to the function that it concerns. The translation between roxygen2 tags and NAMESPACE directives is usually straightforward: the tag becomes a function name and its space-separated arguments become comma separated arguments to the function. For example @import plyr becomes import(plyr) and @importFrom plyr ddply becomes importFrom(plyr, ddply). The chief exception is @export which will automatically figure out the function name to export.

Exports

For a function to be usable outside of your package, you must export it. By default roxygen2 doesn’t export anything from your package. If you want an object to be publically available, you must explicitly tag it with @export.

Use the following guidelines to decide what to export:

  • Functions: export functions that you want to make available. Exported functions must be documented, and you must be cautious when changing their interface.

  • Datasets: all datasets are publicly available. They exist outside of the package namespace and should not be exported.

  • S3 classes: if you want others to be able to create instances of the class @export the constructor function.

  • S3 generics: the generic is a function so @export if you want it to be usable outside the package

  • S3 methods: every S3 method must be exported, even if the generic is not. Otherwise the S3 method table will not be generated correctly and internal generics will not find the correct method.

  • S4 classes: if you want others to be able to extend your class, @export it. If you want others to create instances of your class, but not extend it, @export the constructor function, but not the class.

    # Can extend and create
    #' @export
    setClass("A")
    
    # Can extend, but constructor not exported
    #' @export
    B <- setClass("B")
    
    # Can create, but not extend
    #' @export C
    C <- setClass("C")
    
    # Can create and extend
    #' @export D
    #' @exportClass D
    D <- setClass("D")
  • S4 generics: @export if you want the generic to be publicly usable.

  • S4 methods: you only need to @export methods for generics that you did not define. But @exporting every method is a good idea as it will not cause problems and prevents you from forgetting to export an important method.

  • RC classes: the same principles apply as for S4 classes. @export will only export the class.

S4

To have S4 work consistently you need two things:

  • Depends: methods in DESCRIPTION
  • imports(methods) in NAMESPACE, typically generated with the roxygen #' @imports methods placed somewhere in your package (it doesn’t matter where, but if you have package level docs, that’s the best place).

This unusual combination of depends and imports is necessary because of an oddity of the methods package: it is loaded and attached when you run R interactively, but is not attached when you run R with Rscript.

Imports

The NAMESPACE also controls which functions from other packages are made available to your package. Only unique directives are saved to the NAMESPACE file, so you can repeat them as needed to maintain a close link between the functions where they are needed and the namespace file.

If you are using just a few functions from another package, the recommended option is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun(). Alternatively, though no longer recommended due to its poorer readability, use @importFrom, e.g., @importFrom pgk fun, and call the function(s) without ::.

If you are using many functions from another package, use @import package to import them all and make available without using ::.

If you want to add a new method to an S3 generic, import it with @importFrom pkg generic.

If you are using S4 you may also need:

  • @importClassesFrom package classa classb ... to import selected S4 classes.

  • @importMethodsFrom package methoda methodb ... to import methods for selected S4 generics.

To import compiled code from another package, use @useDynLib

  • @useDynLib package imports all compiled functions.

  • @useDynLib package routinea routineb imports selected compiled functions.

  • Any @useDynLib specification containing a comma, e.g. @useDynLib mypackage, .registration = TRUE will be inserted as is into the the NAMESPACE, e.g. useDynLib(mypackage, .registration = TRUE)

S3

One complexity arises when you want to register S3 methods for a generic that’s defined in a suggested package. You can’t use S3method() because the generic is not available at package load time. Instead, you can use the following code to set up hooks that load automatically.

From htmltools:

# COPYRIGHT RStudio, GPL >= 2
registerMethods <- function(methods) {
  lapply(methods, function(method) {
    pkg <- method[[1]]
    generic <- method[[2]]
    class <- method[[3]]
    func <- get(paste(generic, class, sep="."))
    if (pkg %in% loadedNamespaces()) {
      registerS3method(generic, class, func, envir = asNamespace(pkg))
    }
    setHook(
      packageEvent(pkg, "onLoad"),
      function(...) {
        registerS3method(generic, class, func, envir = asNamespace(pkg))
      }
    )
  })
}