Note that those are not data frames but tables. What `dbplyr` is actually doing behind the scenes is translating all those dplyr operations into SQL, sending the SQL code to query the database, retrieving results, etc.
170
+
:::{.callout-note}
171
+
## Note
172
+
Note that those are **not** data frames but tables. What `dbplyr` is actually doing behind the scenes is translating all those dplyr operations into SQL, sending the SQL code to query the database, retrieving results, etc.
173
+
:::
169
174
170
-
#### How can I get a "real data frame?"
175
+
#### How can I get a "real" data frame?
171
176
172
177
You add `collect()` to your query.
173
178
@@ -180,9 +185,11 @@ species_db %>%
180
185
collect()
181
186
```
182
187
183
-
Note it means the full query is going to be ran and save in your environment. This might slow things down so you generally want to collect on the smallest data frame you can
188
+
Note that this means the full query will be run and the result saved in your R environment. This might slow things down, so you generally want to collect on the smallest data frame you can.
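As a rough illustration of collecting on the smallest result possible, the sketch below filters and selects on the database side and only then calls `collect()`. It assumes the `RSQLite` and `dbplyr` packages are installed and uses a scratch in-memory database with made-up rows standing in for the real `Species` table; only the column names come from the queries shown in this document.

```r
library(DBI)
library(dplyr)

# Assumption: a scratch in-memory SQLite database replaces the real one
conn <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical rows, only here to make the example runnable
dbWriteTable(conn, "Species", data.frame(
  Code = c("a", "b", "c"),
  Scientific_name = c("Calidris alpina", "Calidris pusilla", "Vulpes lagopus"),
  Relevance = c("Study species", "Study species", "Predator")
))

species_db <- tbl(conn, "Species")

# Reduce rows and columns in the database first, then bring the result into R
study_species <- species_db %>%
  filter(Relevance == "Study species") %>%
  select(Scientific_name) %>%
  collect()

# study_species is now a local tibble, no longer a database table
class(study_species)

dbDisconnect(conn)
```

Placing `collect()` last means only the filtered rows and the single selected column travel from the database to your R session.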
189
+
190
+
#### How can you see the SQL query?
184
191
185
-
#### How can you see the SQL query equivalent to the tidyverse code? => `show_query()`
192
+
Adding `show_query()` at the end of your code block will let you see the SQL code that has been used to query the database.
186
193
187
194
```{r}
188
195
# Add show_query() to the end to see what SQL it is sending!
@@ -203,6 +210,7 @@ Here is how you could run the query using the SQL code directly:
203
210
dbGetQuery(conn, "SELECT Scientific_name FROM Species WHERE (Relevance = 'Study species') ORDER BY Scientific_name LIMIT 3")
204
211
```
205
212
213
+
206
214
You can do pretty much anything with these quasi-tables, including grouping, summarization, joins, etc.
207
215
208
216
Let's count how many species there are per Relevance category:
@@ -221,13 +229,15 @@ species_db %>%
221
229
summarize(num_species = n()) %>%
222
230
show_query()
223
231
```
232
+
224
233
You can also create new columns using mutate:
225
234
226
235
```{r}
227
236
species_db %>%
228
237
mutate(Code = paste("X", Code)) %>%
229
238
head()
230
239
```
240
+
231
241
What does the query look like?
232
242
233
243
```{r}
@@ -236,21 +246,22 @@ species_db %>%
236
246
head() %>%
237
247
show_query()
238
248
```
249
+
239
250
:::{.callout-caution}
240
-
****Limitation: no way to add or update data in the database, `dbplyr` is view only. If you want to add or update data, you'll need to use the `DBI` package functions.***
251
+
***Limitation: no way to add or update data in the database, `dbplyr` is view only. If you want to add or update data, you'll need to use the `DBI` package functions.***
241
252
:::
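For reference, here is a minimal sketch of what adding and updating rows with `DBI` could look like. It assumes the `RSQLite` package is installed and runs against a scratch in-memory database with hypothetical rows, so nothing in the real database is touched; the table and column names mirror the `Species` table used above.

```r
library(DBI)

# Assumption: a scratch in-memory SQLite database, not the course database
conn <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical starting content
dbWriteTable(conn, "Species", data.frame(
  Code = c("a", "b"),
  Relevance = c("Study species", "Predator")
))

# Add a row to the existing table
dbAppendTable(conn, "Species", data.frame(Code = "c", Relevance = "Incidental"))

# Update a row with a parameterized statement;
# dbExecute() returns the number of rows affected
rows_changed <- dbExecute(
  conn,
  "UPDATE Species SET Relevance = ? WHERE Code = ?",
  params = list("Study species", "c")
)

# Read the table back to check the changes
result <- dbGetQuery(conn, "SELECT Code, Relevance FROM Species ORDER BY Code")

dbDisconnect(conn)
```

Passing values through `params` rather than pasting them into the SQL string avoids quoting problems.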
242
253
243
254
### Average egg volume analysis
244
255
245
-
Let's reproduce the egg volume analysis we just did. We can calculate the average bird eggs volume per species directly on the database
256
+
Let's reproduce the egg volume analysis we just did. We can calculate the average bird egg volume per species directly on the database:
246
257
247
258
```{r}
248
259
# loading all the necessary tables
249
260
eggs_db <- tbl(conn, "Bird_eggs")
250
261
nests_db <- tbl(conn, "Bird_nests")
251
262
```
252
263
253
-
Compute the volume:
264
+
Compute the volume using the same code as before: