dealing with rows #202
This would actually fix a number of current issues with handling missings, especially when expanding categorical variables into contrasts.
Just as a way of organizing thoughts about how to approach this. There are (at
So it might be necessary to be able to switch between these modes somehow; maybe
The current major blocker for getting rid of `TableStatisticalModel` wrappers (#32) and the `ModelFrame`/`ModelMatrix` structs is that they also keep a mask for the rows that are actually included from the underlying table. It's necessary to keep these around if (possibly among other things) you want to `predict` back into the original table, since you need to know where the missing values were. So just keeping the `FormulaTerm` around isn't enough to have a completely stand-alone table-to-matrix transformation which can replace `ModelFrame`.

A related issue is that missing values in the output are now also introduced by terms themselves, because of the `lead`/`lag` support.

So my current thinking is that we ought to give up on trying to never generate missing values in the output (consistent with #153) and set rows to missing where any missing value is encountered in any column of the table, and provide some kind of functionality to compute the missings mask if necessary so that consumers can decide what to do. That leads to a situation where the consumer has to do a bit more fiddly book-keeping, and it also means that the terms are not fully stand-alone, but I'm not sure I see an alternative at this point (except for some kind of extremely lazy architecture where the terms actually hold views of the underlying table or something like that, I think @nalimilan was talking about this at ZiF...)
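
To make the book-keeping concrete, here is a rough sketch of what the consumer side might look like. It uses only Base Julia, and every name in it (`complete_rows`, `lagged`, `scatter_rows`) is hypothetical and invented for illustration; none of them exist in StatsModels. The toy `lagged` function just stands in for a `lag` term that introduces missings of its own, and the "predictions" are a placeholder rather than output from a fitted model.

```julia
# Hypothetical helpers for illustration only; not part of StatsModels.

# Row mask: true where every column has a non-missing value in that row.
function complete_rows(cols::NamedTuple)
    mask = trues(length(first(cols)))
    for col in cols
        mask .&= .!ismissing.(col)
    end
    return mask
end

# Toy lag that introduces missings at the start, standing in for a lag term.
lagged(v::AbstractVector, n::Integer = 1) =
    [i > n ? v[i - n] : missing for i in eachindex(v)]

# Scatter values computed for the kept rows back to the original table length,
# leaving missing wherever a row was dropped.
function scatter_rows(yhat::AbstractVector, mask::AbstractVector{Bool})
    out = Vector{Union{Missing, eltype(yhat)}}(missing, length(mask))
    out[mask] = yhat
    return out
end

table = (y = [1.0, 2.0, 3.0, 4.0, 5.0], x = [0.5, missing, 1.5, 2.0, 2.5])

# Missings now come from the data (x) and from the term itself (x_lag).
cols = merge(table, (x_lag = lagged(table.x),))

mask = complete_rows(cols)                       # [false, false, false, true, true]
yhat = 2 .* collect(skipmissing(cols.x[mask]))   # placeholder "predictions"
pred = scatter_rows(yhat, mask)                  # [missing, missing, missing, 4.0, 5.0]
```

The point of the sketch is that the mask has to be computed after the terms are applied (here, after `x_lag` is added), and that whoever calls `predict` needs to hold on to it in order to line the results up with the original rows.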