We have a bunch of structs which we fill from Google Sheets, one per row, e.g.
|
#[derive(Debug, PartialEq, Eq, Serialize)] |
|
pub(crate) struct ReviewerStaffOnlyDetails { |
|
pub(crate) name: String, |
|
pub(crate) attended_training: bool, |
|
pub(crate) checked: CheckStatus, |
|
pub(crate) quality: String, |
|
pub(crate) notes: String, |
|
} |
We currently do this in an annoyingly manual and inflexible way.
For this particular struct, we populate it here:
|
for (row_index, cells) in sheet.rows.iter().enumerate() { |
|
if row_index == 0 { |
|
continue; |
|
} |
|
if cells.len() < 6 { |
|
continue; |
|
} |
|
|
|
let github_login = GithubLogin::from(cell_string(&cells[0])); |
|
|
|
let notes = match cells.get(6) { |
|
Some(cell) => cell_string(cell), |
|
None => String::new(), |
|
}; |
|
|
|
let checked = match (cell_bool(&cells[3]), cell_bool(&cells[4])) { |
|
(true, false) => CheckStatus::CheckedAndOk, |
|
(true, true) => CheckStatus::CheckedAndCheckAgain, |
|
(false, _) => CheckStatus::Unchecked, |
|
}; |
|
|
|
reviewers.insert( |
|
github_login.clone(), |
|
ReviewerStaffOnlyDetails { |
|
name: cell_string(&cells[1]), |
|
attended_training: cell_bool(&cells[2]), |
|
checked, |
|
quality: cell_string(&cells[5]), |
|
notes, |
|
}, |
|
); |
|
} |
We ignore the headings completely, so if someone changed the sheet, we would start reading incorrect values. Then we hard-code the field order and types of each cell.
In other places, we manually check headings, and just error if they're wrong:
|
for (row_number, cells) in sheet_data.into_iter().enumerate() { |
|
// Some sheets have documentation or pivot table |
|
if row_number == 0 && !cells.is_empty() && cell_string(&cells[0]) != "Name" { |
|
continue; |
|
} |
|
if cells.len() < 7 { |
|
return Err(anyhow::anyhow!( |
|
"Not enough columns for row {} - expected at least 7, got {} containing: {}", |
|
row_number, |
|
cells.len(), |
|
format!("{:#?}", cells), |
|
)); |
|
} |
|
if row_number == 0 { |
|
let headings = cells.iter().take(7).map(cell_string).collect::<Vec<_>>(); |
|
if headings |
|
!= [ |
|
"Name", |
|
"Email", |
|
"Timestamp", |
|
"Course", |
|
"Module", |
|
"Day", |
|
"Location", |
|
] |
|
{ |
|
return Err(anyhow::anyhow!( |
|
"Register sheet contained wrong headings: {}", |
|
headings.join(", ") |
|
)); |
|
} |
|
} else { |
|
if cells[0].effective_value.is_none() { |
|
break; |
|
} |
|
let (sprint_number, attendance) = read_row(&cells, register_url.clone()) |
|
.with_context(|| format!("Failed to read attendance from row {}", row_number))?; |
|
if attendance.timestamp.date_naive() <= start_date |
|
|| attendance.timestamp.date_naive() >= end_date |
|
{ |
|
continue; |
|
} |
|
let sprint_index = sprint_number - 1; |
|
while sprints.len() < sprint_number { |
|
sprints.push(IndexMap::new()); |
|
} |
|
if sprints[sprint_index].contains_key(&attendance.email) { |
|
warn!( |
|
"Register sheet contained duplicate entry for sprint {} trainee {}", |
|
sprint_number, attendance.email |
|
); |
|
} else { |
|
sprints[sprint_index].insert(attendance.email.clone(), attendance); |
|
} |
|
} |
|
} |
We also handle partial rows poorly.
I would love instead for us to just use serde. We could derive Deserialize for each struct we populate, and use a serde adaptor to read the sheet into a Vec<T>.
There is an existing serde_sheets crate for doing this, but it has some issues. If it runs into issues with a row, it just prints an error to stdout and continues (https://docs.rs/serde_sheets/latest/src/serde_sheets/lib.rs.html#200).
Ideally we would either improve this crate, or write our own one. I'm imagining supporting a function along the lines of:
pub async fn read_all<T: DeserializeOwned>(
sheets: &mut Sheets,
document_id: &str,
tab_name: &str,
) -> SheetReadResult<T>;
enum SheetReadResult<T> {
AllOk(Vec<T>),
SomeOk {
ok: Vec<T>,
partial_records: Vec<Record>,
deserialization_error_records: Vec<(Record, serde::Error)>,
},
Err(SheetsError),
}
struct Record {
row_number: usize,
cells: Vec<CellData>,
}
We have a bunch of structs which we fill from Google Sheets, one per row, e.g.
trainee-tracker/src/prs.rs
Lines 217 to 224 in aa0d578
We currently do this in an annoyingly manual and inflexible way.
For this particular struct, we populate it here:
trainee-tracker/src/reviewer_staff_info.rs
Lines 48 to 79 in aa0d578
We ignore the headings completely, so if someone changed the sheet, we would start reading incorrect values. Then we hard-code the field order and types of each cell.
In other places, we manually check headings, and just error if they're wrong:
trainee-tracker/src/register.rs
Lines 105 to 160 in aa0d578
We also handle partial rows poorly.
I would love instead for us to just use
serde. We could deriveDeserializefor each struct we populate, and use aserdeadaptor to read the sheet into aVec<T>.There is an existing
serde_sheetscrate for doing this, but it has some issues. If it runs into issues with a row, it just prints an error to stdout and continues (https://docs.rs/serde_sheets/latest/src/serde_sheets/lib.rs.html#200).Ideally we would either improve this crate, or write our own one. I'm imagining supporting a function along the lines of: