
Commit 6fc17ef

committed
section finished
1 parent 6c4aff1 commit 6fc17ef

File tree

1 file changed: +6, -4 lines changed


hands-on.qmd

Lines changed: 6 additions & 4 deletions
@@ -290,7 +290,9 @@ DBI::dbDisconnect(conn, shutdown = TRUE)
 
 ## How did we create this database
 
-You might be wondering, how we created this database from our csv files. Most databases have some function to help you import csv files into databases. Note that since there is not data modeling (does not have to be normalized or tidy) constraints nor data type constraints a lot things can go wrong. This is a great opportunity to implement a QA/QC on your data and help you to keep clean and tidy moving forward as new data are collected. As an example, here's
+You might be wondering how we created this database from our csv files. Most databases provide functions to import data from csv and other types of files. It is also possible to load data into the database programmatically from within R, one row at a time, using insert statements, but it is more common to load data from csv files. Note that since there is little data modeling within a csv file (the data does not have to be normalized or tidy), and no data type or value constraints can be enforced, a lot of things can go wrong. Putting data in a database is thus a great opportunity to implement QA/QC and help you keep your data clean and tidy moving forward as new data are collected.
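The row-at-a-time route mentioned above could look roughly like this from R, using DBI's parameterized statements. This is a sketch, not the code used to build this database: the database file name is a placeholder and the column list is abbreviated, so it is illustrative rather than runnable against the full table.

```{r eval=FALSE}
# Sketch: load one row at a time from R with an insert statement.
# "database.db" is a placeholder file name, and Bird_eggs has more
# columns than shown here.
conn <- DBI::dbConnect(duckdb::duckdb(), "database.db")
DBI::dbExecute(
  conn,
  "INSERT INTO Bird_eggs (Egg_num) VALUES (?)",
  params = list(7)
)
DBI::dbDisconnect(conn, shutdown = TRUE)
```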
+
+To look at one example, below is the SQL code that was used to create the `Bird_eggs` table:
 
 ```{sql eval=FALSE}
 CREATE TABLE Bird_eggs (
@@ -309,11 +311,11 @@ CREATE TABLE Bird_eggs (
 COPY Bird_eggs FROM 'ASDN_Bird_eggs.csv' (header TRUE);
 ```
 
-DuckDB's `COPY` SQL command reads a csv file into a database table. Had we not already created the table in the previous statement, DuckDB would have created a table automatically and guessed at column names and data types. But by explicitly declaring the table, we are able to better characterize the data. Notable in the above:
+DuckDB's `COPY` SQL command reads a csv file into a database table. Had we not already created the table in the previous statement, DuckDB would have created it automatically and guessed at column names and data types. But by explicitly declaring the table, we are able to characterize the data more precisely. Notable in the above:
 
 - `NOT NULL` indicates that missing values are not allowed.
-- Constraints (e.g., `Egg_num BETWEEN 1 and 20`) express expectations about the data and either.
+- Constraints (e.g., `Egg_num BETWEEN 1 AND 20`) express our expectations about the data.
 - A `FOREIGN KEY` declares that a value must refer to an existing value in another table, i.e., it must be a reference.
 - A `PRIMARY KEY` identifies a quantity that should be unique within each row, and that serves as a row identifier.
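For contrast, the automatic route could be sketched as follows; this is an illustration rather than part of this lesson's code, and the table name `Bird_eggs_auto` is made up for the example. `read_csv_auto` is DuckDB's built-in csv reader, which infers column names and types from the file itself.

```{sql eval=FALSE}
-- Sketch of the automatic alternative: let DuckDB guess column names
-- and types from the csv. No NOT NULL, CHECK, or key constraints are
-- created this way, so bad data loads silently.
CREATE TABLE Bird_eggs_auto AS
    SELECT * FROM read_csv_auto('ASDN_Bird_eggs.csv');
```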
 
-Understand that a table declaration serves as documentation, the database actually
+Understand that a table declaration serves as more than documentation; the database actually enforces constraints.
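To see that enforcement in action, here is a sketch; the column list is abbreviated to the one constrained column, so the exact statement is illustrative rather than runnable against the full table.

```{sql eval=FALSE}
-- Illustrative only: Bird_eggs has more columns than shown. An Egg_num
-- of 21 violates the CHECK constraint (BETWEEN 1 AND 20), so DuckDB
-- rejects the INSERT with a constraint violation error instead of
-- storing the bad row.
INSERT INTO Bird_eggs (Egg_num) VALUES (21);
```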
