1  Profiling

Author

George G. Vega Yon, Ph.D.

Published

August 14, 2025

Code profiling is a fundamental tool for programmers. With profiling, it is possible to identify potential bottlenecks in your code, as well as excessive memory usage. There are a few important things to consider when profiling your code.

The next section proposes a general workflow for profiling and optimizing your code.

1.1 A proposed workflow

Here is a recipe to follow when profiling and optimizing code:

  1. Before you jump into profiling, ask yourself whether it is worth the time:

[Figure: a diagram about when to profile; generally, you only want to profile code that takes more than a few seconds to run.]

  2. If you decide to go ahead, make sure the profiling run finishes in a reasonable time; that is, use a subset of the data to avoid long waits. Running the profiler adds computing overhead, so try to keep the run short (e.g., about one minute).

  3. There may be many things that could be optimized, so focus on what would deliver the highest impact. It could be a function that is called only once but takes a long time to execute, or a function that is relatively fast but called many times.

  4. Before making any changes, keep a backup of your original code as well as a copy of the current profiling results. If possible, also save the output of the code so you can later verify that the optimized version produces the same results.

  5. Re-run the profiler and compare performance. If no improvement is observed, go back to step 3. Ensure the new version of the code maintains the same functionality as the original (check the results against the saved output); a minimal comparison sketch follows this list.
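To make the comparison in the last step concrete, here is a minimal sketch. The functions slow_version() and fast_version() are hypothetical placeholders for your original and optimized code:

# Hypothetical placeholders: swap in your original and optimized code
slow_version <- function(x) sapply(x, function(v) v * 2)
fast_version <- function(x) x * 2

x <- rnorm(1e6)

# First, make sure both versions produce the same result
stopifnot(isTRUE(all.equal(slow_version(x), fast_version(x))))

# Then compare timings (system.time() is built-in; the bench package gives more detail)
system.time(slow_version(x))
system.time(fast_version(x))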

1.2 Profiling code in R

In the R programming language, the most commonly used profiling tool comes with the profvis package. The package provides a wrapper around the Rprof function, a built-in R function for profiling code. The profvis package makes it easier to visualize the profiling results in a web-based interface.
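For reference, profiling with the built-in Rprof alone looks roughly like this (a minimal sketch; the profiled expression is just an arbitrary example):

Rprof("profile.out")                 # start the sampling profiler, writing to a file
replicate(100, sort(runif(1e5)))     # the code you want to profile
Rprof(NULL)                          # stop profiling
summaryRprof("profile.out")          # summarize the time spent in each function

profvis wraps these steps and renders the result as an interactive report.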

To use profvis, you first need to install it from CRAN:

install.packages("profvis")

Then, you can use it to profile your R code like this:

library(profvis)

profvis({
  # Your R code here
})

This will execute the code and generate a visualization of the profiling results in a new browser window. You can also save the output using the htmlwidgets package:

pv <- profvis({
  # Your R code here
})

htmlwidgets::saveWidget(pv, "profvis.html")

Once the report is open, you will see two visualizations: the flamegraph and the data view. The flamegraph is one of the most useful visualizations: it directly maps the time and memory used by each line of code.

The data view shows the distribution of time spent in each function using a tree structure, which allows you to take deep dives into the call stack.

Tip

When developing R packages, it is a good idea to pair your profvis::profvis call with devtools::load_all() to ensure that all source code is available to the profiler. Otherwise, the flamegraph won’t show your code (it will say “unavailable”).
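For example, a minimal sketch, assuming you are in the root of your package and it exports a (hypothetical) function called my_slow_function():

library(profvis)

devtools::load_all()   # load the package from source so the profiler can see it

pv <- profvis({
  my_slow_function()   # hypothetical function exported by your package
})

print(pv)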

1.3 Exercise: Identifying the bottleneck

# Generate data
times <- 4e5
cols <- 150
data <- as.data.frame(x = matrix(rnorm(times * cols, mean = 5), ncol = cols))
data <- cbind(id = paste0("g", seq_len(times)), data)

pv <- profvis::profvis({
  data1 <- data   # Store in another variable for this run

  # Get column means
  means <- apply(data1[, names(data1) != "id"], 2, mean)

  # Subtract mean from each column
  for (i in seq_along(means)) {
    data1[, names(data1) != "id"][, i] <- data1[, names(data1) != "id"][, i] - means[i]
  }
})

htmlwidgets::saveWidget(pv, "profvis-slow-code.html")

# In interactive mode, we can directly view the profiling results
if (interactive())
  print(pv)

Can you identify where the bottleneck is in this code? What would you do to speed it up?
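If you want to check your intuition, one possible way to speed this up (not necessarily the only one) is to avoid the repeated data-frame subsetting inside the loop and use vectorized column operations instead; a minimal sketch:

data2 <- data
num_cols <- names(data2) != "id"

# Column means in one vectorized call instead of apply()
means <- colMeans(data2[, num_cols])

# Subtract each column's mean in a single sweep() instead of a loop
data2[, num_cols] <- sweep(data2[, num_cols], 2, means, FUN = "-")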


  1. Code copied verbatim from the profvis R package here.↩︎