Naming your package
When creating a package the first thing (and sometimes the most difficult) is to come up with a name for it. There’s only one formal requirement:
- The package name can only consist of letters and numbers, and must start with a letter.
But I have a few additional recommendations:
Make the package name googleable, so that if you google the name you can easily find it. This makes it easy for potential users to find your package, and it’s also useful for you, because it makes it easier to find out who is using it.
Avoid using both upper and lower case letters: they make the package name hard to type and hard to remember. For example, I can never remember if it’s
Some strategies I’ve used in the past to create packages names:
Find a name evocative of the problem and modify it so that it’s unique:
plyr(generalisation of apply tools),
lubridate(makes dates and times easier),
classifly(high-dimensional views of classification).
lvplot(letter value plots),
meifly(models explored interactively).
Add an extra R:
Once you have a name, create a directory with that name, and inside that create an
R subdirectory and a
DESCRIPTION file (note that there’s no extension, and the file name must be all upper case).
A minimal description file (this one is taken from an early version of plyr) looks like this:
Package: plyr Title: Tools for splitting, applying and combining data Description: Version: 0.1 Author: Hadley Wickham <email@example.com> Maintainer: Hadley Wickham <firstname.lastname@example.org> License: MIT
This is the critical subset of package metadata: what it’s called (
Package), what it does (
Description), who’s allowed to use and distribute it (
License), who wrote it (
Author), and who to contact if you have problems (
Maintainer). Here I’ve left the
Description blank to illustrate that if you haven’t decided what the correct value is yet, it’s ok to leave it blank.
Again, the six required elements are:
Package: name of the package. Should be the same as the directory name.
Title: a one line description of the package.
Description: a more detailed paragraph-length description.
Version: the version number, which should be of the the form
?package_versionfor more details on the package version formats. I recommended following the principles of semantic versioning.
Maintainer: a single name and email address for the person responsible for package maintenance.
License: a standard abbreviation for an open source license, like
BSD. A complete list of possibilities can be found by running
file.show(file.path(R.home(), "share/licenses/license.db")). If you are using a non-standard license, put
file LICENSEand then include the full text of the license in a
A more complete
DESCRIPTION (this one from a more recent version of
plyr) looks like this:
Package: plyr Title: Tools for splitting, applying and combining data Description: plyr is a set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each pieces and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of plyr has been generously supported by BD (Becton Dickinson). URL: http://had.co.nz/plyr Version: 1.3 Maintainer: Hadley Wickham <email@example.com> Author: Hadley Wickham <firstname.lastname@example.org> Depends: R (>= 2.11.0) Suggests: abind, testthat (>= 0.2), tcltk, foreach Imports: itertools, iterators License: MIT
DESCRIPTION includes other components that are optional, but still important:
Enhancesdescribe which packages this package needs. They are described in more detail in [[namespaces]].
URL: a url to the package website. Multiple urls can be separated with a comma or whitespace.
Author, you can
Authors@R, which takes a vector of
person() elements. Each person object specifies the name of the person and their role in creating the package:
aut: full authors who have contributed much to the package
ctb: people who have made smaller contributions, like patches.
cre: the package creator/maintainer, the person you should bother if you have problems
Other roles are listed in the help for person. Using
Authors@R is useful when your package gets bigger and you have multiple contributors that you want to acknowledge appropriately. The equivalent
Authors@R syntax for plyr would be:
Authors@R: person("Hadley", "Wickham", role = c("aut", "cre"))
There are a number of other less commonly used fields like
Language. A complete list of the
DESCRIPTION fields that R understands can be found in the R extensions manual.
R loads files in alphabetical order. Unfortunately not every alphabet puts letters in the same order, so you can’t rely on alphabetic ordering if you need one file loaded before another. The order in which files are loaded doesn’t matter for most packages. But if you’re using S4, you’ll need to make sure that classes are loaded before subclasses and generics are defined before methods.
Rather than relying on alphabetic ordering, roxygen2 provides an explicit way of saying that one file must be loaded before another:
@include tag gives a space separated list of file names that should be loaded before the current file:
#' @include class-a.r setClass("B", contains = "A")
@include tags are present in the package, roxygen2 will set the
Collate field in the
DESCRIPTION, which ensures that files are always loaded in the same order.