CKANConnection uses an old dbplyr interface #187
With the release of dbplyr 2.5.0 this is now pretty urgent. Any plans to address this?
@rdenham thanks for the ping! I've taken a quick stab at upgrading the dbplyr generics. Feel free to try out my branch (see PR) if you have time, any feedback appreciated.
Thanks for your prompt reply, @florianm! The problem I have is with the dplyr interface, just like in the example at the start of this issue:
library("ckanr"); library("dplyr")
ckan <- src_ckan("https://dados.mg.gov.br/")
This will throw a slightly misleading error (see this issue) but basically, the dbplyr function
My example URL, in case it's useful, is:
ckan <- src_ckan("https://www.data.qld.gov.au/")
Please let me know if I've missed something.
Just having a look around, it seems like https://github.com/r-dbi/bigrquery might provide me with some idea on how to upgrade this part to work with DBI; there are some similar ideas in there.
I have missed that one, I initially went by the dbplyr upgrade guide. So we need to upgrade src_ckan to use https://dbi.r-dbi.org/reference/DBIConnection-class.html in
Any ideas welcome! I'll set up a local CKAN test instance if I get a chance later. Greetings from Western Australia! (was formerly involved in data.wa.gov.au)
Thanks, I'll spend a little bit of time on it later tonight. Shouldn't be hard, just I need to work out how it all fits together first :-).
And the same to you from Queensland!
If I get this right, we should drop the approach of:
my_ckan <- src_ckan("http://demo.ckan.org")
and instead use something like:
con <- dbConnect(ckanr::ckan(), url = "https://www.data.qld.gov.au/")
tbl(con, "587f65ae-6675-4b8e-bac5-606ce7f4446a")
where
ckan <- function(url) {
  if (!requireNamespace("dplyr", quietly = TRUE)) {
    stop("Please install dplyr", call. = FALSE)
  }
  drv <- new("CKANDriver")
}
Then the normal
Here's my test:
## DBI approach
sql = 'select "Density class", "Full" from "587f65ae-6675-4b8e-bac5-606ce7f4446a" limit 5'
dbGetQuery(con, sql)
## dplyr approach
tbl(con, "587f65ae-6675-4b8e-bac5-606ce7f4446a") %>%
  select(`Density class`, Full)
I do get a
Thanks for that! I can get the connection to work but am running into various errors with the examples. Pending a closer look, this seems like we'd need to implement/change some of the generics and update examples and vignettes. Examples of errors I'm seeing:
I'll keep looking into this as bandwidth allows.
My thoughts were to drop
ckan <- function() {
  drv <- new("CKANDriver")
}
Then do the connection like any other DBI connection, i.e.
con <- dbConnect(ckanr::ckan(), url = "https://www.data.qld.gov.au/")
The
This evening I'll try to go through the dplyr.R code in ckanr to try to clarify, and drop the deprecated components. I'll share that and you can see if that's the approach you'd like to take.
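For readers following along, the driver-plus-dbConnect pattern discussed here can be sketched offline roughly as below. This is only an illustration under stated assumptions: the class names DemoCKANDriver and DemoCKANConnection and the constructor demo_ckan() are hypothetical stand-ins, not ckanr's actual classes, and no CKAN server is contacted.

```r
library(DBI)
library(methods)

# Hypothetical classes for illustration only; ckanr's real classes may differ.
setClass("DemoCKANDriver", contains = "DBIDriver")
setClass("DemoCKANConnection",
         contains = "DBIConnection",
         slots = c(url = "character"))

# Driver constructor, in the same spirit as RSQLite::SQLite()
demo_ckan <- function() new("DemoCKANDriver")

# dbConnect() only records the site URL; nothing is sent over the network here
setMethod("dbConnect", "DemoCKANDriver", function(drv, url, ...) {
  new("DemoCKANConnection", url = url)
})

# dbGetInfo() reports the stored URL
setMethod("dbGetInfo", "DemoCKANConnection", function(dbObj, ...) {
  list(url = dbObj@url)
})

# There is no persistent session to close in this sketch, so this is a no-op
setMethod("dbDisconnect", "DemoCKANConnection", function(conn, ...) {
  invisible(TRUE)
})

con <- dbConnect(demo_ckan(), url = "https://www.data.qld.gov.au/")
dbGetInfo(con)$url
dbDisconnect(con)
```

A real backend would additionally need methods such as dbSendQuery() and dbFetch() that talk to the CKAN DataStore API.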
I'm not a maintainer and only speaking as a contributor here, so this is just my opinion.
If we retain the function name
Pro your approach:
Pedantry aside, I can create a valid DBI connection to the QLD CKAN, but
Obviously, I'd defer to you and the package authors, but my reasoning for the
con <- DBI::dbConnect(ckanr::ckan(), url = "https://www.data.qld.gov.au/")
style is that it follows the standard approach of connecting to a backend, like
con <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
and similarly for SQLite, duckdb etc. So consistent and therefore intuitive for users. I think this would also solve your difficulty in using
library(dbplyr)
library(dplyr)
library(ckanr)
drv = ckan()
dbGetInfo(drv)
summary(drv)
dbUnloadDriver(drv) # not sure what this should do
con <- dbConnect(drv, url = "https://www.data.qld.gov.au/")
dbDisconnect(con) # not sure what this should do
dbGetInfo(con) # shows url
sql = 'select "Density class", "Full" from "587f65ae-6675-4b8e-bac5-606ce7f4446a" limit 5'
rs <- dbSendQuery(con, sql)
dbFetch(rs)
dbGetQuery(con, sql)
dbListTables(con, limit = 5)
res = dbReadTable(con, '587f65ae-6675-4b8e-bac5-606ce7f4446a')
class(res)
# doesn't work at the moment
#dbExistsTable(con, '587f65ae-6675-4b8e-bac5-606ce7f4446a')
sql = 'select "Density class", "Full" from "587f65ae-6675-4b8e-bac5-606ce7f4446a" limit 5'
rs <- dbSendQuery(con, sql)
dbFetch(rs)
dbListFields(rs)
## dplyr interface
tab1 <- tbl(con, "587f65ae-6675-4b8e-bac5-606ce7f4446a")
# check translations
# explain won't work
try(
tab1 %>%
summarise(sdf = sd(Full)) %>%
explain())
# but show query does
tab1 %>%
summarise(sdf = sd(Full)) %>%
show_query()
# paste
tab1 %>%
mutate(x = paste(`_id`, `Monitoring period`)) %>%
show_query()
# count
tab1 %>%
summarise(nobs=n()) %>%
show_query()
tab1 %>%
summarise(xc=cor(Full)) %>%
show_query()
# not sure that's what we want
tab1 %>%
summarise(xp=paste(`Density class`)) %>%
show_query()
tab1 %>%
mutate(x = paste(`_id`, `Monitoring period`)) %>%
select(x)
tab1 %>%
select(-`_full_text`)
I think that most of the code in
I notice that the vignette doesn't have much in it on the
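As a side note, translations like the ones checked above can also be inspected without a live CKAN site. The sketch below assumes that dbplyr's simulated PostgreSQL connection is a reasonable stand-in for the CKAN DataStore SQL dialect (the DataStore is PostgreSQL-backed); it does not use the ckanr backend at all, and the column values are made up for illustration.

```r
library(dplyr)
library(dbplyr)

# A lazy table with made-up values; simulate_postgres() stands in for CKAN
lf <- lazy_frame(
  `Density class` = "a",
  Full = 1,
  con = simulate_postgres()
)

# Build the query lazily, then render the SQL that dbplyr would generate
q <- lf %>%
  summarise(sdf = sd(Full, na.rm = TRUE)) %>%
  sql_render()

q
```

Comparing this output with show_query() on a CKAN table is one way to spot where the backend's translations differ from the generic PostgreSQL ones.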
This looks great, I'll try it out! IMHO it would be beneficial to replace ckanr wrappers with idiomatic DBI / dplyr / dbplyr code and examples (vignette?). The only purpose of ckanr wrappers would be to handle CKAN datastore API behaviour or to simplify code.
Here is the message when we connect to a CKAN site with DataStore enabled:
According to Hadley this should be a simple fix and the details are available at dbplyr 2.0.0 backend API • dbplyr.
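For context, the dbplyr 2.0.0 backend API asks a backend to opt in to the second-edition interface by providing a dbplyr_edition() method for its connection class. A minimal sketch of that opt-in is shown below; the plain S3 object is a stand-in so the snippet runs on its own, and whether this step alone is sufficient for ckanr is not established in this thread.

```r
library(dbplyr)

# Opt the connection class in to the dbplyr 2.0.0 (edition 2) backend API.
# In a package this method would also need to be exported/registered.
dbplyr_edition.CKANConnection <- function(con) 2L

# Stand-in object for illustration only; a real CKANConnection comes from ckanr
fake_con <- structure(list(), class = "CKANConnection")

dbplyr_edition(fake_con)
```

With edition 2 declared, dbplyr dispatches to the newer generics (for example sql_query_fields() in place of db_query_fields()), so any old-style methods in the backend would need to be ported as well.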
Created on 2022-10-21 with reprex v2.0.2