---
title: "Workflow"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
echo = TRUE,
eval = FALSE
)
```
```{r setup, echo=F, eval=T, message=F}
library(Pmetrics)
r_help <- function(pkg, name, fn = NULL) {
if(is.null(fn)) {fn <- name}
glue::glue("[`{name}`](https://rdrr.io/pkg/{pkg}/man/{fn}.html)")
}
gh_help <- function(name) {
glue::glue("[`{name}`](https://lapkb.github.io/Pmetrics/reference/{name}.html)")
}
pmetrics <- function(){
knitr::asis_output("[Pmetrics]{style=\"color: #841010; font-family: 'Arial', Arial, sans-serif; font-weight: 900;\"}")
}
```
## General Workflow
Recall the general `r pmetrics()` fitting workflow shown in the
following diagram.
```{mermaid}
%%| eval: true
flowchart LR
subgraph RSession[" "]
direction TB
%% DATA["PM_data"]
mid(("edit"))
MODEL["PM_model"]
RESULT["PM_result"]
end
DISK[("Hard Drive")]
MODEL -.-> mid(("edit")) -.-> MODEL
MODEL -- "$fit(PM_data, ...)" --> RESULT
%% RESULT -- "edit" --> MODEL
DISK -- "PM_load()" --> RESULT
RESULT -- "automatic" --> DISK
classDef blue fill:#2f6db6,stroke:#184a8b,color:#fff;
classDef orange fill:#c7662b,stroke:#8a3f18,color:#fff;
classDef disk fill:#d2d3d7,stroke:#7f8084,color:#000;
classDef ghost fill:transparent,stroke:transparent,stroke-width:0px,padding:0px,font-style:italic, color:#555;
class mid ghost;
class DATA,MODEL blue;
class RESULT orange;
class DISK disk;
linkStyle 4 font-style:italic, color:#555
style RSession fill:#e9f0ff,stroke:#9ab0d6,stroke-width:1px,rx:2,ry:2
```
Data are combined with a model during a fitting operation to produce results, which are saved to the hard drive for later retrieval and examination. The blue box represents your R script or Quarto document containing R code that you write to perform your analysis.
## Components of a new project
To implement the above workflow when you work on a new project, you will typically do the following:
1. Create a new directory for your project.
2. Create a new R script or Quarto document in that directory.
3. Build your project code in that script or document.
4. Store source data files in a subdirectory, e.g., "data" or "src", and keep the runs generated when you fit a model to the data in another subdirectory, e.g., "Runs".
### Creating a new project directory
If you have loaded `r pmetrics()` already with `library(Pmetrics)`, the command `r gh_help("PM_tree")` will create a project folder with subfolders for you. Remember, to learn more about this function or any other in R, you can get help by typing `?function_name` in the console.
:::{code-copy='false'}
```{r}
#| eval: false
# examples; do not run
PM_tree("MyProject") # creates a new project called "MyProject" in your current working directory
PM_tree("MyProject", path = "somewhere") # creates a new project called "MyProject" in the "somewhere" directory
```
:::
In the above examples, `r pmetrics()` will create a directory called "MyProject" with several subdirectories.
* **Rscript** contains a skeleton Analysis.R script to begin `r pmetrics()` runs in the new project.
* **Runs** is empty at first, but should contain all files required for a run, and it will also contain the resulting numerically ordered run directories created after each `r pmetrics()` run.
* **Sim** can contain any files related to simulations.
* **src** is a repository for original and manipulated source data files.
You are free to edit this directory tree structure as you please, or make your own entirely.
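If you would rather build your own layout, the following is a minimal base R sketch. The project location under `tempdir()` and the folder names are illustrative assumptions only; they simply mirror the names listed above.

```{r}
# Illustrative only: the project location and folder names are assumptions
project <- file.path(tempdir(), "MyProject")
subdirs <- c("Rscript", "Runs", "Sim", "src")

# Create each subdirectory; recursive = TRUE also creates the project folder
for (d in subdirs) {
  dir.create(file.path(project, d), recursive = TRUE, showWarnings = FALSE)
}

# Confirm that the folders now exist
created <- list.dirs(project, full.names = FALSE, recursive = FALSE)
print(created)
```

Replace `tempdir()` with a permanent location when you set up a real project.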
### Set the working directory
Whether you use `PM_tree` or not, you always need to tell R and `r pmetrics()` which directory to work in. This is the directory where R expects to find files and where it will save output files. There are two ways to do this.
#### Global working directory change
You can set your working directory with the `r r_help("base", "setwd", "getwd")` function.
::: {code-copy='false'}
```{r}
#| label: example-npag-2
setwd("/path/to/your/project/directory") # Mac and Linux users
setwd("C:/path/to/your/project/directory") # Windows users
```
:::
**Windows users:** Separate directories with a forward slash `/` or a double backslash `\\`. Windows is the only major OS that uses the backslash `\` as its path separator, and because R treats a single backslash inside a string as an escape character, R follows the Unix/Linux convention of the forward slash. If you copy the example above, make sure to change the path to one that exists on your computer.
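As a small illustration of the options for writing a Windows path, the sketch below uses a hypothetical path that is never accessed on disk. The raw string form is an additional option not mentioned above and requires R 4.0.0 or later.

```{r}
# Three ways to write the same hypothetical Windows path
p_forward <- "C:/Users/me/project"      # forward slashes
p_double  <- "C:\\Users\\me\\project"   # each backslash escaped
p_raw     <- r"(C:\Users\me\project)"   # raw string, requires R >= 4.0.0

# The escaped and raw forms are the same string
identical(p_double, p_raw) # TRUE

# Converting backslashes to forward slashes gives the first form
identical(gsub("\\", "/", p_double, fixed = TRUE), p_forward) # TRUE
```

Writing `"C:\Users\me\project"` with single backslashes in an ordinary string fails, because R tries to interpret `\U` as an escape sequence.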
#### Local path specification
You can also specify the path in a variable and use it in `r pmetrics()` functions without changing the working directory.
::: {code-copy='false'}
```{r}
#| label: example-npag-3
wd <- "/path/to/your/project/directory" # Mac and Linux users
wd <- "C:/path/to/your/project/directory" # Windows users
dat <- PM_data$new(data = file.path(wd, "data", "mydata.csv")) # see ?file.path for help
```
:::
💡 The `r r_help("base", "file.path")` function creates file paths that are compatible with your operating system.
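Before passing a path to `PM_data$new()`, it can help to confirm that the file exists. This is a minimal base R sketch; `tempdir()` stands in for your project directory, and the "data" folder and "mydata.csv" file name are the same placeholders used above.

```{r}
wd <- tempdir() # stand-in for your project directory
data_file <- file.path(wd, "data", "mydata.csv")

# Report whether the file is where we expect it before trying to load it
if (file.exists(data_file)) {
  message("Found: ", data_file)
} else {
  message("Not found: ", data_file)
}
```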
### Scripting
You can use either an R script or a [Quarto](https://quarto.org) document to paste code from these pages or to write your own code. R scripts have the advantage of being simple and easy to use. Quarto documents have the advantage of being able to combine code with formatted text, images, links, etc. You can use either method to work with `r pmetrics()`. Choose the one that works best for you.
#### R script
- Create a new script with *File -\> New File -\> R Script* (RStudio) or *R File* (Positron).
- Save the script in your project directory with a name like "Learn.R".
- You can then copy and paste code chunks from these pages into your script and run them line by line or all at once.
- It is useful to annotate your code with comments so you can remember what you did later.
Here's a toy example of an R script.
```{r}
# This is a comment in R. It starts with a # symbol.
# Anything other than a comment should be R code.
a <- 1 + 1
print(a) # This will print the value of a to the console.
```
#### Quarto document
- Create a new document with *File -\> New File -\> Quarto Document* (RStudio/Positron).
- Use markdown to add headings, links, images, etc.
- Insert R chunks to paste copied code or write your own.
- Execute code from the chunks.
- Render the document to create a nicely formatted output in HTML, PDF, or Word format.
- See [Quarto documentation](https://quarto.org/docs/guide) for more information.
Here's a toy example of a Quarto document. It mixes text formatted in markdown and R code chunks.
:::{code-copy='false' .no-code-style}
```{r}
#| echo: false
#| results: asis
#| eval: true
cat("````qmd\n",
"---\n",
"title: \"My Project\"\n",
"format: html\n",
"---\n\n",
"## Introduction\n",
"This is my first Quarto document. Below is some R code.\n\n",
"```{r}\n",
"# this is an R code chunk\n",
"b <- 2 + 2\n",
"print(b)\n",
"```\n\n",
"You can see that the value of b is printed above.\n\n",
"````",
sep = ""
)
```
:::
When rendered, the above will look like this:
## My Project {.unnumbered .unlisted}
### Introduction {.unnumbered .unlisted}
This is my first Quarto document. Below is some R code.
:::{code-copy='false' .no-code-style}
```{r}
#| eval: true
# this is an R code chunk
b <- 2 + 2
print(b)
```
:::
You can see that the value of b is printed above.
### Data and model objects
You supply the data and model objects. These can come from the hard drive or be defined directly within R. We devote whole chapters in this book to creating [data](data.qmd) and [model](models.qmd) objects, which are at the heart of `r pmetrics()`.
### Fit the model to the data
When a compiled model and data are combined using the model's `$fit()` method, the analysis is executed and the results are returned. We'll work through an example in the [NPAG](NPAG.qmd) chapter.
### Load previous results
`$fit()` returns a `r gh_help("PM_result")` object. At the end of the run, your hard drive will also contain a new numerically named folder, e.g., 1, 2, 3, ..., in either the current working directory or the directory specified in the `path` argument to `$fit()`. This folder contains the files that can be loaded into R later using `PM_load(x)`, replacing `x` with the folder number. `r gh_help("PM_load")` is an alias for `PM_result$new()` because it creates a new `PM_result` object. We'll see how to do that in the example [NPAG](NPAG.qmd) chapter.
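To find which run numbers are available to load, you can list the numerically named folders with base R. This is a minimal sketch: the "ExampleRuns" directory and its three empty folders are created here only for illustration and are not produced by `r pmetrics()`.

```{r}
# Stand-in for a directory of run folders; created here only for illustration
runs_dir <- file.path(tempdir(), "ExampleRuns")
for (i in c(1, 2, 10)) {
  dir.create(file.path(runs_dir, as.character(i)), recursive = TRUE, showWarnings = FALSE)
}

# Keep only folders whose names are whole numbers, then sort numerically
folders <- list.dirs(runs_dir, full.names = FALSE, recursive = FALSE)
run_numbers <- sort(as.integer(folders[grepl("^[0-9]+$", folders)]))
latest <- max(run_numbers)

print(run_numbers)
print(latest)
```

Converting the folder names to integers matters because sorting them as text would place "10" before "2". With the working directory set to the folder that holds the runs, `PM_load(latest)` would then load the most recent run.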
### Update models to improve fits
To change model structure between fits, update the R code or the file that defines the model. To continue a previous run that did not finish, call `$fit()` again and supply the number of that run as the `prior` argument. Again, we'll see how to do that in the example [NPAG](NPAG.qmd) chapter.
The great advantage of R6 over Legacy is that you no longer have to spend time copying files from prior run folders, modifying them, and ensuring that they are in the working directory. After the initial creation of the data and model objects, everything can be done in R from memory, although results are still saved to the hard drive for later retrieval.
### Examine and use model fits
Once you have a `r gh_help("PM_result")` object, either from a new fit or loaded from the hard drive, you can use its methods and other `r pmetrics()` functions to examine and summarize the results, compare across model fits, generate plots, simulate, and calculate probabilities of target attainment. These topics are described in subsequent chapters.