Quosure based resamplr #11
Comments
Some messing around with this:
Suppose the user provides a function to generate the item, and a function to extract samples:
The required inputs for this are:
Question: how to specify the indexes and the extraction function?
Use the new rlang quosures and tidy eval for resampling functions. Quosures are unevaluated expressions, so they don't take up much memory, and they keep the environment in which they should be evaluated, so we don't need to rely on R's internal copy-on-modify mechanics. The latter is nice since I don't want to rely on something that is a little magical and can easily break without me noticing.
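As a rough illustration of the point about memory and environments, the sketch below builds a quosure inside a function and evaluates it later. It assumes the rlang package is installed; `make_quo` and `big` are hypothetical names used only for this example.

```r
library(rlang)

# Hypothetical helper: builds a quosure that refers to a large local object.
make_quo <- function() {
  big <- seq_len(1e6)   # lives in this call's environment
  quo(length(big))      # captures the expression and that environment; nothing is evaluated yet
}

q <- make_quo()

captured_expr <- quo_get_expr(q)                   # the unevaluated expression
has_big <- exists("big", envir = quo_get_env(q))   # the captured environment still holds `big`
n_big <- eval_tidy(q)                              # evaluated only on demand
```

The quosure itself stores only the expression and a reference to the environment, so passing it around does not copy `big`.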
I'm still uncertain about what this looks like, so this issue includes (will include) comments as I puzzle through it.
What any resampling method needs:
- Number of elements to resample from or a vector of identifiers. There are two general classes of resampling methods:
- An extraction function: a function of arguments `x` (the object to extract from) and `idx` (which gives the elements to extract).
Resampling object
Yeah, so quosures are awesome, but how would this work?
The function that creates them will look something like
The `expr` can be a quosure, or we just grab it unevaluated and capture the environment.
Provide a single expression
The expression can be evaluated to find its type, and the appropriate functions for that type can then be applied to get the identifiers and extract elements from the object. These functions could also be optionally provided.
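One way to pick "the appropriate functions" by type is S3 dispatch. The following is a minimal sketch under that assumption; the generics `resample_ids()` and `resample_extract()` are hypothetical names, not functions from the package.

```r
# Hypothetical generics: neither name comes from the package.
resample_ids <- function(x) UseMethod("resample_ids")
resample_extract <- function(x, idx) UseMethod("resample_extract")

# Default: treat `x` as a vector-like object indexed by position.
resample_ids.default <- function(x) seq_along(x)
resample_extract.default <- function(x, idx) x[idx]

# Data frames: identifiers are row numbers, extraction keeps all columns.
resample_ids.data.frame <- function(x) seq_len(nrow(x))
resample_extract.data.frame <- function(x, idx) x[idx, , drop = FALSE]

dat <- data.frame(a = 1:4, b = letters[1:4])
ids <- resample_ids(dat)

# One bootstrap draw: sample positions, map them to identifiers, then extract.
draw <- resample_extract(dat, ids[sample.int(length(ids), replace = TRUE)])
```

A user-supplied function could override either generic for object types the defaults do not handle.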
I could just have the user write an expression where `.idx` stands in for the indexes to be provided.
The problem with this is then the user needs to provide the identifiers to sample. However, they don't need to provide the extraction method. This is very general, but can be somewhat redundant, since the user has to effectively write the extraction function every time they use it.
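A sketch of the `.idx` placeholder idea, assuming rlang is installed and that the placeholder is supplied through the data mask of `rlang::eval_tidy()`. The data frame and variable names are made up for illustration.

```r
library(rlang)

dat <- data.frame(y = c(2.5, 3.1, 4.0, 5.2), g = c("a", "a", "b", "b"))

# The user writes the extraction inline; `.idx` is filled in at evaluation time.
expr <- quo(dat[.idx, , drop = FALSE])

# The user still has to say what is being resampled: here, row numbers.
ids <- seq_len(nrow(dat))

# One bootstrap draw: sample identifiers, then evaluate with `.idx` bound to them.
idx <- ids[sample.int(length(ids), replace = TRUE)]
draw <- eval_tidy(expr, data = list(.idx = idx))

# The same expression evaluated with a fixed set of identifiers.
fixed <- eval_tidy(expr, data = list(.idx = c(4L, 1L)))
```

The redundancy mentioned above shows up in `expr`: the row-subsetting code has to be spelled out for every object the user wants to resample.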
Two ideas for the object itself:
- `tidy_eval` or a wrapper provided by this package.
- `function()`. This has the nice feature that to evaluate it, the user only has to
The identifiers can be extracted via a function, since in either case they'll reside in some environment.
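For the `function()` variant, a minimal base R sketch might look like the following. `new_resample()` and `resample_idx()` are hypothetical names; the sketch only shows that identifiers stored in a closure's environment can be read back by a helper.

```r
# Hypothetical constructor: stores the object and the identifiers in a closure.
new_resample <- function(x, idx) {
  force(x)
  force(idx)
  function() x[idx]
}

# Identifiers live in the closure's environment, so a helper can read them back.
resample_idx <- function(r) get("idx", envir = environment(r), inherits = FALSE)

r <- new_resample(letters, c(5L, 2L, 2L))
sampled <- r()              # in this sketch, evaluating the resample is a plain call
stored <- resample_idx(r)   # identifiers, without materialising the sample
```

A quosure-based object would work the same way for the identifiers, reading them from the quosure's captured environment instead of the closure's.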
Resampling functions
Expose all the lower-level functions, which work only on identifiers or on the number of observations.
The following functions for each resampling algorithm could be written:
Where `x` is some arbitrary object, `idx` is a vector of identifiers, and `n` is the number of elements. The `bootstrap_n` form is the lower-level function, since it is all that is needed for the resampling algorithm, and `bootstrap_idx` will simply apply `bootstrap_n` to vectors of identifiers.
Then `bootstrap()` is only responsible for providing a lower-level function with the number of elements or a list of identifiers.
To handle groups: `bootstrap()` is a generic function with methods:
- `list`: use grouped bootstrapping.
- `default`: use non-grouped bootstrapping.
There is some ambiguity if for some reason identifiers have to be a `list`, but that's tough shit. Identifiers should be atomic vectors. If that is really needed, the user needs to deal with it in the extraction function.
In base R, there is `sample` and `sample.int`, but I can't use that naming convention since `.` should be reserved for S3 methods, and something like `sample_int` would suggest that it returns integer values, e.g. `map_int` in purrr.
One idea would be to have a single generic function with methods, and internal logic that does different things for a scalar integer. I cannot treat a vector with length one as the number of obs, since it doesn't handle the edge case of a single integer identifier.
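The split between `bootstrap_n`, `bootstrap_idx`, and a generic `bootstrap()` could be sketched roughly as below. The function names come from the discussion above, but the signatures, the `R` argument (number of replicates), and the reading of "grouped bootstrapping" as resampling whole groups are assumptions made for this example.

```r
# Lower-level form: all the algorithm needs is the number of elements.
# Returns `R` vectors of positions sampled with replacement from 1..n.
bootstrap_n <- function(n, R = 1L) {
  stopifnot(length(n) == 1L, n >= 1, R >= 1)
  lapply(seq_len(R), function(i) sample.int(n, size = n, replace = TRUE))
}

# Identifier form: sample positions, then map them back to the identifiers.
bootstrap_idx <- function(idx, R = 1L) {
  stopifnot(is.atomic(idx))
  lapply(bootstrap_n(length(idx), R), function(i) idx[i])
}

# Generic front end; dispatching on `list` is how groups are signalled.
bootstrap <- function(idx, R = 1L, ...) UseMethod("bootstrap")

# Non-grouped: `idx` is an atomic vector of identifiers.
bootstrap.default <- function(idx, R = 1L, ...) bootstrap_idx(idx, R)

# Grouped (assumption): each list element holds one group's identifiers;
# whole groups are resampled and their identifiers concatenated.
bootstrap.list <- function(idx, R = 1L, ...) {
  lapply(bootstrap_n(length(idx), R),
         function(i) unlist(idx[i], use.names = FALSE))
}

positions <- bootstrap_n(7L)        # 7 is a count: seven positions from 1..7
single_id <- bootstrap(7L)          # 7 is an identifier: a sample of size one
grouped <- bootstrap(list(g1 = 1:2, g2 = 3:5))
```

Keeping the count form and the identifier form as separate functions sidesteps the scalar ambiguity: `bootstrap_n(7L)` and `bootstrap(7L)` mean different things without any special casing on length-one input.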
Notes
I must ensure that resampling a resample object works