A basic use case of Scikick is presented below. Links to the command help outputs (e.g. sk <command> --help) are provided throughout the documentation.
sk init should be executed at the project root of an existing or empty project (much like git init). It checks for required dependencies and creates a scikick.yml file to store the workflow definition.
# Go to hello world directory
cd HW
sk init -y
sk: Checking scikick software dependencies
sk: Importing template analysis configuration file
sk: Writing to scikick.yml
sk add is used to add notebooks to the project.
We will add the first notebook (hello.Rmd) to the project.
cat hello.Rmd
A simple notebook to define the first part of our greeting.
```{r}
greeting1 = "Hi,"
# Save the greeting1 object as a file for later use
saveRDS(greeting1,"greeting_part1.RDS")
```
Simulated changes
sk add hello.Rmd
sk: Added hello.Rmd
This has added the notebook to the scikick.yml configuration file (i.e. under “analysis”).
cat scikick.yml
### Scikick Project Workflow Configuration File
# Directory where Scikick will store all standard notebook outputs
reportdir: report
# --- Content below here is best modified by using the Scikick CLI ---
# Notebook Execution Configuration (format summarized below)
# analysis:
#  first_notebook.Rmd:
#  second_notebook.Rmd:
#  - first_notebook.Rmd  # must execute before second_notebook.Rmd
#  - functions.R         # file is used by second_notebook.Rmd
#
# Each analysis item is executed to generate md and html files, E.g.:
# 1. <reportdir>/out_md/first_notebook.md
# 2. <reportdir>/out_html/first_notebook.html
analysis: !!omap
- hello.Rmd:
version_info:
  snakemake: 6.0.2
  ruamel.yaml: 0.16.12
  scikick: 0.2.1
We will also add a second notebook in plain .R format (world.R).
cat world.R
#' A simple notebook to define the second part of our greeting.
greeting2 = "World!"
# Save the greeting2 object as a file for later use
saveRDS(greeting2,"greeting_part2.RDS")
sk add world.R
sk: Added world.R
sk status can now be used to inspect the workflow state for the two notebooks that were added to the project (hello.Rmd and world.R).
sk status
m-- hello.Rmd
m-- world.R
m-- system index (homepage)
Scripts to execute: 3
HTMLs to compile ('---'): 3
sk status shows all notebooks that must be executed and uses a three-character code to show the reason for execution. The ‘m’ in the first slot here indicates that the output files for hello.Rmd, world.R, and the homepage are all missing from the report/out_md/ directory.
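As a small illustration of reading this encoding, the sketch below filters status text for entries whose first slot is ‘m’. It works on a copy of the output shown above; the helper names sample_status and missing_outputs are hypothetical, and piping a live sk status into the filter assumes the command prints this same layout to standard output.

```shell
# Copy of the `sk status` output shown above, used as sample input.
sample_status() {
cat <<'EOF'
m-- hello.Rmd
m-- world.R
m-- system index (homepage)
Scripts to execute: 3
HTMLs to compile ('---'): 3
EOF
}

# Hypothetical filter: print the entries whose three-character status code
# has 'm' (missing output) in the first slot.
missing_outputs() {
  awk '$1 ~ /^m..$/ { sub(/^ *[^ ]+ +/, ""); print }'
}

sample_status | missing_outputs
```

In a project, the same filter could be tried as sk status | missing_outputs, subject to the assumption above.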
sk run can now be used to call on the snakemake backend to generate all out-of-date or missing output files.
sk run
sk: Creating site layout from scikick.yml
sk: Executing code in system's /workflow/notebook_rules/index.Rmd, outputting to report/out_md/index.md
sk: Adding project map to report/out_md/index.md as report/out_md/index_tmp.md
sk: Converting report/out_md/index_tmp.md to report/out_html/index.html
sk: Executing code in hello.Rmd, outputting to report/out_md/hello.md
sk: Adding project map to report/out_md/hello.md as report/out_md/hello_tmp.md
sk: Converting report/out_md/hello_tmp.md to report/out_html/hello.html
sk: Executing code in world.R, outputting to report/out_md/world.md
sk: Adding project map to report/out_md/world.md as report/out_md/world_tmp.md
sk: Converting report/out_md/world_tmp.md to report/out_html/world.html
sk: Done, homepage is report/out_html/index.html
We can see above that Scikick provides messages when executing each stage of the project (for more details on how this execution works, see the “Core Design” page). After execution is finished, the directory structure looks as follows.
ls -h *.* report/*
full_greeting.Rmd greeting_part2.RDS scikick.yml
greeting_part1.RDS hello.Rmd world.R
report/benchmark:
hello index world
report/logs:
hello_logs.txt index_logs.txt world_logs.txt
report/out_html:
hello.html index.html world.html
report/out_md:
_site.yml hello.md index.md world.md
The report/ directory contains all of Scikick’s outputs.
Opening report/out_html/index.html in a web browser shows the website homepage with menu items for each added notebook and a project map with the two notebooks (“Hello” and “World”).
Screenshot of index.html (homepage).
Running sk status again shows that no jobs remain to be run.
sk status
Scripts to execute: 0
HTMLs to compile ('---'): 0
And sk run will do nothing.
sk run
sk: Nothing to be done
Scikick tracks files using snakemake to determine if the report is up-to-date.
For example, after changes are made to hello.Rmd, sk status reports the notebook as out of date:
# Simulating changes to hello.Rmd
printf "\nSimulated changes" >> hello.Rmd
sk status
s-- hello.Rmd
Scripts to execute: 1
HTMLs to compile ('---'): 1
hello.Rmd is now queued for re-execution, and sk run regenerates the page report/out_html/hello.html from scratch.
sk run
sk: Executing code in hello.Rmd, outputting to report/out_md/hello.md
sk: Adding project map to report/out_md/hello.md as report/out_md/hello_tmp.md
sk: Converting report/out_md/hello_tmp.md to report/out_html/hello.html
sk: Done, homepage is report/out_html/index.html
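The up-to-date check can be pictured as a comparison of file modification times, which is snakemake's default criterion for re-running a job. The sketch below illustrates that idea only; it is not Scikick's implementation, and the function name is_stale is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) when the output must be
# rebuilt, i.e. when it is missing or older than its source file.
is_stale() {
  src=$1
  out=$2
  [ ! -e "$out" ] || [ "$src" -nt "$out" ]
}

# Example using the paths from this tutorial.
if is_stale hello.Rmd report/out_md/hello.md; then
  echo "hello.Rmd needs to be re-executed"
else
  echo "report/out_md/hello.md is up to date"
fi
```

Appending text to hello.Rmd, as done above, updates its modification time, which is why a check of this kind would flag the notebook.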
To create a notebook which uses the outputs of other notebooks, we can use sk add -d to ensure notebooks are executed in the correct order.
Let’s add a new notebook full_greeting.Rmd and specify the hello.Rmd and world.R notebooks as dependencies.
cat full_greeting.Rmd
```{r}
greeting1 = readRDS("greeting_part1.RDS")
greeting2 = readRDS("greeting_part2.RDS")
paste(greeting1,greeting2)
```
sk add full_greeting.Rmd -d hello.Rmd -d world.R
sk: Added full_greeting.Rmd
sk: Added dependency hello.Rmd to full_greeting.Rmd
sk: full_greeting.Rmd will be executed after any executions of hello.Rmd
sk: Added dependency world.R to full_greeting.Rmd
sk: full_greeting.Rmd will be executed after any executions of world.R
Executing sk run then creates report/out_html/full_greeting.html.
Screenshot of full_greeting.html.
We see that the project map reflects the workflow configuration and “Full Greeting” was added to the navigation.
The full greeting we have created above is “Hi, World!”, but we would like it to be “Hello, World!”. We must make changes to hello.Rmd to update the greeting1 object.
# Change from 'Hi' to 'Hello' in hello.Rmd
sed -i '' 's/Hi/Hello/g' hello.Rmd
# Inspect the change
cat hello.Rmd
A simple notebook to define the first part of our greeting.
```{r}
greeting1 = "Hello,"
# Save the greeting1 object as a file for later use
saveRDS(greeting1,"greeting_part1.RDS")
```
Simulated changes
Simulated changes
sk run
sk: Adding project map to report/out_md/index.md as report/out_md/index_tmp.md
sk: Creating site layout from scikick.yml
sk: Converting report/out_md/index_tmp.md to report/out_html/index.html
sk: Adding project map to report/out_md/world.md as report/out_md/world_tmp.md
sk: Converting report/out_md/world_tmp.md to report/out_html/world.html
sk: Executing code in hello.Rmd, outputting to report/out_md/hello.md
sk: Executing code in full_greeting.Rmd, outputting to report/out_md/full_greeting.md
sk: Adding project map to report/out_md/hello.md as report/out_md/hello_tmp.md
sk: Converting report/out_md/hello_tmp.md to report/out_html/hello.html
sk: Adding project map to report/out_md/full_greeting.md as report/out_md/full_greeting_tmp.md
sk: Converting report/out_md/full_greeting_tmp.md to report/out_html/full_greeting.html
sk: Done, homepage is report/out_html/index.html
We see above that not only was hello.Rmd re-executed, but full_greeting.Rmd was also re-executed because it lists hello.Rmd as a dependency. The final result in full_greeting.html was updated.
Second screenshot of full_greeting.html.
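The propagation seen in this run can be sketched as a small graph traversal: start from the changed notebook and repeatedly mark every notebook that depends on one already marked. The edge list deps and the function downstream are hypothetical illustrations built from the sk add -d call above; in Scikick itself this resolution is delegated to the snakemake backend, and the sketch covers only re-execution, not the regeneration of HTML pages.

```shell
# Hypothetical edge list, one "<notebook> <dependency>" pair per line,
# mirroring: sk add full_greeting.Rmd -d hello.Rmd -d world.R
deps='full_greeting.Rmd hello.Rmd
full_greeting.Rmd world.R'

# Print the changed notebook plus every notebook that depends on it,
# directly or transitively, in sorted order.
downstream() {
  printf '%s\n' "$deps" | awk -v start="$1" '
    { child[NR] = $1; parent[NR] = $2 }
    END {
      stale[start] = 1
      # Keep sweeping the edge list until no new notebook is marked.
      do {
        grew = 0
        for (i = 1; i <= NR; i++)
          if ((parent[i] in stale) && !(child[i] in stale)) {
            stale[child[i]] = 1
            grew = 1
          }
      } while (grew)
      for (name in stale) print name
    }' | sort
}

downstream hello.Rmd
```

With this edge list, a change to hello.Rmd marks hello.Rmd and full_greeting.Rmd, which matches the two "Executing code in" messages in the run above, while world.R is left alone.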
The usage of sk init, sk add, sk status, and sk run as shown above is typically enough to begin using Scikick for your data analysis.