New notebook-level metadata indicating whether outputs should be saved #299
Comments
I don't think that should be in the notebook metadata.
Note that there might be two different user needs here:
In both cases, the question is where the setting would be best stored (ideally somewhere visible to both the client and the contents manager). You could have a server-wide setting to enable/disable/filter, but there would also be a benefit to tying a field to a specific notebook: either "I know that this notebook has safe outputs, please save them" or "I know that this notebook has code that can produce sensitive outputs (like PII), please never save them". In that case, using notebook metadata is the best way, because the setting will endure even if the notebook is copied, moved, or shared with others.
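One way to picture how a server-wide setting and a per-notebook field could interact is the sketch below. It is only an illustration: the metadata key `save_outputs`, the function name, and the rule that a per-notebook value takes precedence over the server default are all assumptions made for this example, not something decided in this thread.

```python
def should_save_outputs(notebook_metadata: dict, server_default: bool) -> bool:
    """Decide whether outputs should be saved for one notebook.

    Assumption: a hypothetical boolean key "save_outputs" in the notebook
    metadata, when present, overrides the server-wide default.
    """
    override = notebook_metadata.get("save_outputs")  # hypothetical key
    if isinstance(override, bool):
        # The notebook carries an explicit choice, so it travels with the file.
        return override
    # No per-notebook choice: fall back to the server-wide setting.
    return server_default
```

With this rule, a notebook marked `{"save_outputs": False}` keeps its outputs out of the saved file even on a server whose default is to save them, and the marking survives a copy or a move because it lives in the file itself.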
Following up on @Carreau's proposal for managing Jupyter notebook results, I would like to suggest creating a As an example, consider the following
This would result in the following results inclusion for notebooks:
For context, we are actively working at Databricks on a prototype implementation of this for exporting Jupyter notebooks, and we’d like to discuss standardizing it if it is useful to the wider community. Please let me know if you have any further questions or suggestions.
For a long time, people have struggled with Jupyter notebooks and version control. One complicating factor that causes churn in notebooks is saved output. For example, in ipywidgets we finally insisted that our example notebooks always be saved only after clearing any outputs, to prevent churn in the repo. Saving output can also raise security or business concerns in certain situations. There are many situations in which a user would like to indicate to the system that a particular notebook should be saved with only its inputs, with the outputs stripped out.
What do people think of having a new notebook-level metadata key that indicates the user wishes to only save inputs, i.e., the user wishes to effectively clear the outputs before saving? Perhaps `jupyter.exclude_outputs`, which, if true, is an explicit user hint to the tool saving the notebook that it should strip outputs before saving.

Not all tools would obey this hint. For example, I imagine that nbconvert would save outputs if the appropriate options were given, regardless of this hint. However, I think it would be great if JupyterLab/Jupyter Notebook and other frontends could respect the setting.
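To make the idea concrete, here is a minimal sketch of what a saving tool might do with such a hint. It works on a notebook that has already been loaded as a plain dict in the nbformat 4 layout (`cells`, `cell_type`, `outputs`, `execution_count`). The function name is made up for illustration, and reading the dotted name `jupyter.exclude_outputs` as a nested `metadata["jupyter"]["exclude_outputs"]` entry is an assumption; the proposal does not pin down where the key lives.

```python
import copy


def strip_outputs_if_requested(notebook: dict) -> dict:
    """Return the notebook to save, with outputs cleared if the hint is set.

    Assumption: the proposed hint is stored as
    notebook["metadata"]["jupyter"]["exclude_outputs"] (hypothetical layout).
    """
    jupyter_meta = notebook.get("metadata", {}).get("jupyter", {})
    hint = isinstance(jupyter_meta, dict) and jupyter_meta.get("exclude_outputs") is True
    if not hint:
        # No hint (or hint is not exactly true): save the notebook unchanged.
        return notebook

    # Work on a copy so the in-memory notebook keeps its outputs.
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned
```

A frontend could call something like this just before writing the file, so that the user still sees outputs in the running session while the saved copy contains inputs only.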
Disclaimer: we are also looking at how outputs are saved in Jupyter notebook exports at Databricks, where users may have a business need to indicate outputs should not be saved in a notebook.