Create HTML page containing all package examples using knitr and markdown

Yesterday, I was updating my website to include more information about some R packages I am working on. I wanted to create a page showing examples of the functionality available in each package. I usually write quite extensive examples in the help pages, covering most of the functionality. A single page with the examples of each function in the package, including the output, would be a simple but quite nice way to quickly demonstrate what the package can do.

This is not hard to do by hand: copy the examples to an R Markdown file, add knitr chunks and a title here and there, and it should look good. But this seemed like something that would be nice to redo at every update, so I wanted to automate it. The results (which can be seen here and here) were quite nice, so I figured others might be able to use this code as well. Do keep in mind that this code is hardly tested!

Below is the code for the function examplePage. It scans all of a package's .Rd files for an examples section and extracts it. The function creates and compiles a markdown document with the following title levels:

  1. The package name (only at the top)
  2. The help-file name
  3. Any line in the example section that contains word characters and either starts with three or more #s or starts with a # and ends with a # or -.
  4. Any line that starts (after optional spaces) with exactly ##, followed by word characters, and does not end with # or -.

These titles are chosen specifically so that a section header inserted with the control-shift-R command in RStudio, or something like #### BIG SECTION ####, results in a large title.
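To get a feel for which comment lines end up as titles, here is a minimal sketch that applies the same regular expressions used in examplePage to a few sample lines. The helper name classifyLine, the labels it returns, and the sample lines are made up for illustration and are not part of the function itself.

```r
# Classify one line of example code with the patterns examplePage uses.
# classifyLine and its labels are hypothetical names for this sketch.
classifyLine <- function(line) {
  if (grepl("^\\s*##\\s*(\\w|\\s)+$", line)) {
    # Exactly two hashes followed only by word characters and spaces
    "small title"
  } else if (grepl("\\w", line) &&
             (grepl("^\\s*###", line) || grepl("^\\s*#.*[#-]\\s*$", line))) {
    # Three or more hashes, or a comment that ends with # or -
    "large title"
  } else if (grepl("^\\s*#\\W*$", line)) {
    # A comment without any word characters, e.g. a separator line
    "dropped"
  } else {
    "kept in chunk"
  }
}

sampleLines <- c("#### BIG SECTION ####", "## Small section", "# ----------",
                 "# An ordinary comment", "x <- 1")
result <- vapply(sampleLines, classifyLine, character(1), USE.NAMES = FALSE)
print(data.frame(line = sampleLines, result = result))
```

Ordinary comments and code stay inside the R chunk, while separator lines without any words are removed altogether.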

Usage:

examplePage has the following arguments:

  • pkg: Path to the package folder; it should contain a directory man with the Rd files.
  • openChunk: Code used to open chunks. Defaults to "```{r, message=FALSE, warning = FALSE, error = FALSE}". This can be used to pass more knitr options.
  • includeDontshow: Logical stating whether \dontshow environments should be included in the code. Defaults to FALSE.
  • includeDontrun: Logical stating whether \dontrun environments should be included in the code. Defaults to TRUE.
  • exclude: Optional regular expression; Rd files whose path matches it are skipped.
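A call could look roughly like the sketch below. The path, the extra fig.width chunk option, and the exclude pattern are only placeholders, and the call is guarded so that nothing happens unless examplePage has been defined and the folder actually exists.

```r
# Hypothetical path; replace with the folder of your own package source.
pkgPath <- "~/Rpackages/myPackage"

# Build the chunk header from pieces so it is easy to add knitr options.
chunkHeader <- paste0(strrep("`", 3),
                      "{r, message=FALSE, warning=FALSE, fig.width=6}")

# Only run when examplePage() is defined and the man folder exists.
if (exists("examplePage") && dir.exists(file.path(pkgPath, "man"))) {
  htmlFile <- examplePage(
    pkgPath,
    openChunk = chunkHeader,
    includeDontrun = FALSE,   # leave \dontrun code out of the page
    exclude = "internal"      # skip Rd files with "internal" in their path
  )
}
```

The function writes the .Rmd, .md and .html files to the current working directory and returns the name of the HTML file.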

The function:

examplePage <- function(pkg, openChunk = "```{r, message=FALSE, warning = FALSE, error = FALSE}", 
    includeDontshow = FALSE, includeDontrun = TRUE, exclude) {
    if (!require("knitr")) 
        stop("'knitr' must be installed.")
    if (!require("markdown")) 
        stop("'markdown' must be installed.")

    # Inner function to find closing brackets:
    findClose <- function(x, openLoc, open = "\\{", close = "\\}") {
        # Find close:
        nest <- 1
        i <- openLoc + 1
        repeat {
            # If open bracket in line:
            if (grepl(open, x[i])) {
                nest <- nest + length(gregexpr(open, x[i])[[1]])
            }
            if (grepl(close, x[i])) {
                nest <- nest - length(gregexpr(close, x[i])[[1]])
            }
            if (nest == 0) 
                break
            i <- i + 1
        }
        return(i)
    }

    files <- list.files(paste0(pkg, "/man"), pattern = "\\.Rd$", ignore.case = TRUE, 
        full.names = TRUE)

    # Exclude:
    if (!missing(exclude)) 
        files <- files[!grepl(exclude, files)]

    # Preparation:
    n <- length(files)
    subs <- character(n)

    # For each rd file:
    for (i in seq_along(files)) {
        # Read file:
        txt <- readLines(files[i])

        # Only include if there is only one example section:
        if (sum(grepl("\\\\examples\\{", txt)) == 1) {
            # Extract examples section:
            start <- grep("\\\\examples\\{", txt)
            end <- findClose(txt, start)
            txt <- txt[(start + 1):(end - 1)]

            # Don't show fields:
            dontshows <- grep("\\\\dontshow\\{", txt)
            if (length(dontshows) > 0) {
                ends <- numeric(length(dontshows))
                for (k in seq_along(dontshows)) {
                  ends[k] <- findClose(txt, dontshows[k])
                }

                # Remove:
                if (includeDontshow) {
                  txt <- txt[-c(dontshows, ends)]
                } else txt <- txt[-do.call(c, mapply(dontshows, ends, FUN = ":", 
                  SIMPLIFY = FALSE))]
            }

            # Don't run fields:
            dontruns <- grep("\\\\dontrun\\{", txt)
            if (length(dontruns) > 0) {
                ends <- numeric(length(dontruns))
                for (k in seq_along(dontruns)) {
                  ends[k] <- findClose(txt, dontruns[k])
                }

                # Remove:
                if (includeDontrun) {
                  txt <- txt[-c(dontruns, ends)]
                } else txt <- txt[-do.call(c, mapply(dontruns, ends, FUN = ":", 
                  SIMPLIFY = FALSE))]
            }

            # Enter main title and first R chunk:
            txt <- c(paste("##", gsub("\\.rd$", "", basename(files[i]), ignore.case = TRUE)), 
                openChunk, txt, "```")

            # Crawl over lines. If a title is encountered, close chunk and replace
            # title with markdown:
            j <- 3
            repeat {
                # Small section (starts with exactly two hashes, does not end with a non-word character):
                if (grepl("^\\s*##\\s*(\\w|\\s)+$", txt[j])) {
                  txt[j] <- gsub("^\\s*##\\s*", "#### ", txt[j])
                  txt <- c(txt[1:(j - 1)], "```", txt[j], openChunk, txt[(j + 
                    1):length(txt)])
                  j <- j + 2

                  # Else large section, starts with #, ends with nonchar, or starts with
                  # more than 2 #'s
                } else if (grepl("\\w", txt[j]) & (grepl("^\\s*###", txt[j]) | 
                  grepl("^\\s*#.*[#-]\\s*$", txt[j]))) {
                  txt[j] <- gsub("^\\W*(?=\\w)", "### ", txt[j], perl = TRUE)
                  txt[j] <- gsub("(?<=\\w)\\W*$", "", txt[j], perl = TRUE)

                  txt <- c(txt[1:(j - 1)], "```", txt[j], openChunk, txt[(j + 
                    1):length(txt)])
                  j <- j + 2
                } else if (grepl("^\\s*#\\W*$", txt[j])) {
                  # If start is comment and no words, remove:
                  txt <- txt[-j]
                  j <- j - 1
                }

                j <- j + 1
                if (j > length(txt)) 
                  break
            }

            emptySections <- which(txt[-length(txt)] == openChunk & txt[-1] == 
                "```")
            if (length(emptySections) > 0) 
                txt <- txt[-c(emptySections, emptySections + 1)]

            txt <- gsub("\\\\%", "%", txt)
            subs[i] <- paste(txt, collapse = "\n")
        }
    }

    subs <- subs[order(nchar(subs), decreasing = TRUE)]
    subs <- c(paste0("# ", basename(pkg), "\n\n```{r,echo=FALSE,message=FALSE}\nlibrary(\"", 
        basename(pkg), "\")\n```"), subs)

    # Write Rmd:
    RmdFile <- paste0(basename(pkg), ".Rmd")
    write(paste(subs, collapse = "\n\n"), RmdFile)

    # Knit:
    mdFile <- sub("\\.Rmd$", ".md", RmdFile)
    knit(RmdFile, mdFile)

    # Markdown:
    htmlFile <- sub("\\.Rmd$", ".html", RmdFile)
    markdownToHTML(mdFile, htmlFile)

    browseURL(htmlFile)

    return(htmlFile)
}
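The brace matching that findClose performs can be tried in isolation. The sketch below is a simplified variant, not the exact inner function: it counts literal braces per line and stops with an error when no closing brace is found. The names countChar and findCloseSketch and the small in-memory Rd text are made up for illustration.

```r
# Hypothetical in-memory Rd file; a real one would come from readLines().
rd <- c(
  "\\name{foo}",
  "\\title{Foo}",
  "\\examples{",
  "## Basic usage",
  "x <- c(1, 2, 3)",
  "if (length(x) > 2) {",
  "  mean(x)",
  "}",
  "}"
)

# Count literal occurrences of a single character in one string.
countChar <- function(line, char) {
  hits <- gregexpr(char, line, fixed = TRUE)[[1]]
  sum(hits > 0)  # gregexpr returns -1 when there is no match
}

# Index of the line on which the brace opened on line openLoc is closed.
findCloseSketch <- function(x, openLoc) {
  nest <- 1
  for (i in seq(openLoc + 1, length(x))) {
    nest <- nest + countChar(x[i], "{") - countChar(x[i], "}")
    if (nest == 0) return(i)
  }
  stop("No matching closing brace found.")
}

start <- grep("\\examples{", rd, fixed = TRUE)
end <- findCloseSketch(rd, start)
exampleLines <- rd[(start + 1):(end - 1)]
print(exampleLines)
```

Like the original, this works line by line, so it assumes the closing brace of the examples section sits on a line of its own.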

11 Responses to Create HTML page containing all package examples using knitr and markdown

  1. Sebastian says:

    Hello!
    I was recently thinking about the same thing.
    On my system, your code doesn’t work – don’t know why, but I am currently not motivated to look for the reason.

    I am wondering if there could be an easier solution…
    Is there an easy way to get a list with all help topics in a package?
    Then you could write something along the lines of:

    ```{r ExamplesFromHelp}
    GetPages <- function(package.name) {
        # This function would have to be extended to return all help topics – if anyone has a smart idea, this would be greatly appreciated
        return(c('lm', 'glm'))
    }

    package.name <- 'base'
    library(package.name, character.only = TRUE)
    pages <- GetPages(package.name)
    invisible(sapply(pages, example, character.only = TRUE))
    ```

  2. Yihui says:

    Thanks! This is very nice! Are you aware of knitr::knit_rd? Example: http://tengfei.github.com/ggbio/docs/man/

    A simpler example: http://stackoverflow.com/a/11657083/559676

  3. sachaepskamp says:

    @Sebastian: I think there should be a way to do that. The source code for ‘utils:::example’ gives some clues, though it does require that you give a function name. Of course, those should probably not be too hard to extract.

    @Yihui: I wasn’t aware of knit_rd, but it looks awesome! I’ll put the results from that on my site as well :)

  4. Yihui says:

    Sorry the second link should have been: http://stackoverflow.com/a/11657083/559676 Somehow I pasted the same link twice… could you correct that for me? Thanks!

    In fact `knit_rd()` was using the clue in `utils::example()`. I think it is worth a separate exported function in base R to extract example code from installed packages.

  5. sachaepskamp says:

    Done!

  6. Tyler Rinker says:

    Just yesterday I created a function to grab the examples from my roxygen2 code:

    https://github.com/trinker/acc.roxygen2/blob/master/R/examples.R

    I thought I had something pretty nice but then today I look at your post and was thoroughly impressed. Thanks for sharing. Great idea.

  7. Steve Walker says:

    Cool stuff. And thanks for posting the code. I may look at this more closely in the near future.

    Kind of reminds me of staticdocs: https://github.com/hadley/staticdocs

    Here’s an ugly hack I used to get staticdocs working: http://stevencarlislewalker.wordpress.com/2012/10/26/ugly-hack-to-get-staticdocs-working/

  8. Tyler Rinker says:

    This blog post inspired me to look at Hadley’s staticdocs. Right now the package is not well documented and highlight is archived. However, after running the archived version of highlight, using staticdocs is pretty easy to do as well: https://github.com/hadley/staticdocs and seems similar to what you’ve created here, but it may work off of the roxygen2 coding rather than the Rd files I believe your approach uses. Thanks again for sharing.

  9. Sebastian says:

    Hello!

    @all: Thanks for the valuable hints from your comments that helped me to implement my original idea (however, the other solutions presented here look a lot more powerful)

    Here a short snippet that can be put into a Rmd chunk:

    Best greetings,
    Sebastian

  10. Sebastian says:

    So here it is:

    RunAllExamples <- function(package.name) {
        library(package.name, character.only = TRUE)
        topics <- names(tools:::fetchRdDB(file.path(find.package(package.name),
            "help", package.name)))
        invisible(sapply(topics, example, character.only = TRUE))
    }

    RunAllExamples('lattice')

  11. Bryan Hanson says:

    Great function Sasha! Thanks for making it available.

    I’d suggest this change so that ‘exclude’ can be longer than one file:

    # Exclude:
    if (!missing(exclude))
        for (i in seq_along(exclude)) {
            files <- files[!grepl(exclude[i], files)]
        }