I often want to look at the source code of a function, and since R is open source I can usually just type the function’s name at the prompt. For example, colnames produces

function (x, do.NULL = TRUE, prefix = "col") 
{
    if (is.data.frame(x) && do.NULL) 
        return(names(x))
    dn <- dimnames(x)
    if (!is.null(dn[[2L]])) 
        dn[[2L]]
    else {
        nc <- NCOL(x)
        if (do.NULL) 
            NULL
        else if (nc > 0L) 
            paste0(prefix, seq_len(nc))
        else character()
    }
}
<bytecode: 0x9e02f78>
<environment: namespace:base>

and it’s laid bare. Not always, however. Recently, to pick an example, I tried to figure out what was inside ‘svymean’ from the ‘survey’ package.

> svymean
function (x, design, na.rm = FALSE, ...) 
{
    .svycheck(design)
    UseMethod("svymean", design)
}

Well, that isn’t helpful. Not all is lost, though. Compare it with a function that hands off to compiled C or Fortran code, like ‘sample’ from base R:

> sample
function (x, size, replace = FALSE, prob = NULL) 
{
    if (length(x) == 1L && is.numeric(x) && x >= 1) {
        if (missing(size)) 
            size <- x
        .Internal(sample(x, size, replace, prob))
    }
    else {
        if (missing(size)) 
            size <- length(x)
        x[.Internal(sample(length(x), size, replace, prob))]
    }
}
<bytecode: 0x0000000010504cf0>
<environment: namespace:base>

In here, where it calls

.Internal(sample(x, size, replace, prob))

it’s invoking compiled C code that R registers internally under the name sample, so to read it you need to download R’s source code (or the package’s source, if the function isn’t in base) and find the relevant C files. Fortunately, svymean in the survey package is written in R. So what then?
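Before digging through C files, it can help to check whether a function hands off to compiled code at all. The following is a minimal sketch, not an official tool: uses_compiled is a hypothetical helper name, and it only inspects the function's own R-level body, not anything that body calls. It uses paste rather than sample because the definition of sample may differ between R versions.

```r
# Hypothetical helper: TRUE if `f` is a primitive, or if its R-level body
# mentions one of the usual entry points into compiled code.
uses_compiled <- function(f) {
  if (is.primitive(f)) return(TRUE)  # primitives have no R body at all
  hooks <- c(".Internal", ".Primitive", ".Call", ".External", ".C", ".Fortran")
  any(hooks %in% all.names(body(f)))
}

uses_compiled(sum)       # a primitive
uses_compiled(paste)     # a thin wrapper around .Internal(paste(...))
uses_compiled(colnames)  # written entirely in R
```

A TRUE here means the interesting part lives in C or Fortran sources; a FALSE means the R code is all there is to read, although it may still be hidden behind a namespace or a generic.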

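A full treatment of namespaces would take a lot of text (see the ‘Writing R Extensions’ manual), but it’s enough here to say that, to keep packages from overwriting each other’s functions and variables, each object can be referred to by a ‘full address’, e.g. survey:::svymean rather than just svymean, in case svymean is also the name of a variable or of a function in another package. Two colons (survey::svymean) reach only the objects a package exports; three colons (survey:::svymean) also reach the ones it keeps internal.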

Sometimes specifying the namespace is enough, if the function simply isn’t ‘exported’ from its package and so isn’t visible on the search path, e.g.

survey:::svymean

but it’s possible you end up where you started:

function (x, design, na.rm = FALSE, ...) 
{
    .svycheck(design)
    UseMethod("svymean", design)
}
<environment: namespace:survey>
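The difference between exported and internal names can be checked directly. This is a small sketch that uses the stats package only because it ships with every R installation; t.test.default stands in for any function a package keeps internal.

```r
# Names a package exports are the ones reachable with `::`
# (and, once the package is attached, by their bare name).
exports <- getNamespaceExports("stats")

"median" %in% exports          # exported
"t.test.default" %in% exports  # internal: registered as a method, not exported

# `:::` reaches into the namespace regardless of export status.
f <- stats:::t.test.default
environmentName(environment(f))  # the namespace the function lives in
```

So `:::` helps when a function is merely unexported. It does not help when the function you reach is itself only a dispatcher, which is the situation with svymean.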

Another thing to try is to check whether your function is a generic with several methods, that is, whether it dispatches to different code depending on the class of its input (here the class of design, as the UseMethod call shows). You can list the methods with methods():

methods(svymean)
[1] svymean.DBIsvydesign*   svymean.ODBCsvydesign*  svymean.pps*           
[4] svymean.survey.design*  svymean.survey.design2* svymean.svyrep.design* 
[7] svymean.twophase*       svymean.twophase2*    

   Non-visible functions are asterisked
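To see why printing a generic reveals so little, here is a toy version of the same mechanism. Everything in it is hypothetical (the generic describe and its two methods are made up for illustration); it only assumes R's standard S3 dispatch through UseMethod.

```r
# A made-up generic: like svymean, its body is just a dispatch call.
describe <- function(x, ...) UseMethod("describe")

# Methods are ordinary functions named <generic>.<class>.
describe.default <- function(x, ...) "something else"
describe.data.frame <- function(x, ...) {
  paste("a data frame with", ncol(x), "columns")
}

describe(1:3)                       # falls through to describe.default
describe(data.frame(a = 1, b = 2))  # picks describe.data.frame

# Printing the generic shows only the dispatcher, never the method bodies.
describe
```

The real work always sits in the methods, so those are the functions worth printing.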

Now we’re getting somewhere! The methods() output includes the line

   Non-visible functions are asterisked

and indeed

svymean.pps

doesn’t find anything: R reports that the object is not found, because the method is not exported. However, if we go through the namespace and ask for the specific method,

> survey:::svymean.pps
function (x, design, na.rm = FALSE, deff = FALSE, ...) 
{
    if (inherits(x, "formula")) {
        mf <- model.frame(x, model.frame(design), na.action = na.pass)
        xx <- lapply(attr(terms(x), "variables")[-1], function(tt) model.matrix(eval(bquote(~0 + 
            .(tt))), mf))
        cols <- sapply(xx, NCOL)
....
....
    return(average)
}
<environment: namespace:survey>

Success!
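Two helpers from the utils package get to the same place without needing to know which namespace holds the method. The sketch below uses t.test from stats so that it runs on a plain R installation. For the example above, the analogous call would be getS3method("svymean", "pps") with the survey package loaded, which should return the same function as survey:::svymean.pps.

```r
# getS3method() takes the generic and the class as two separate strings.
m <- utils::getS3method("t.test", "default")

# getAnywhere() also searches loaded namespaces and registered S3 methods,
# and records where it found each copy of the object.
found <- utils::getAnywhere("t.test.default")
found$where

# Both routes lead to the function that `:::` would give.
identical(m, stats:::t.test.default)
```

getAnywhere is the handier of the two when you only have a name from the methods() listing and no idea which package it belongs to.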
