In `/fink_filters/classification.py`, `pyspark` is imported at the top of the file. However, most of the functions defined there (2 out of 3) neither need nor use it: only `extract_fink_classification` uses `pyspark`, while `extract_fink_classification_from_pdf(pdf)` does not.
What's the expected behavior:
We would expect `pyspark` not to be needed when a user imports `extract_fink_classification_from_pdf(pdf)` to extract the classification from a plain pandas DataFrame, or even when using `extract_fink_classification_()`, which only needs the right fields from the alert (this function is called by both the pandas function and the Spark one). `pyspark` should only be required when using `extract_fink_classification`, which is meant to be used with Apache Spark.
What can be done:
The easiest solution would be to move the basic classification function `extract_fink_classification_` (used by both the pandas and Spark functions) into its own module, and then have two separate modules, one that imports pandas as needed and one that imports Spark, both of which import the module containing `extract_fink_classification_`.
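To illustrate the split, here is a rough sketch that writes a toy three-module package to a temporary directory and imports only the pandas-facing module. The package name `split_demo`, the module names, the function names, and the classification rule are all made up for the example; they are not the actual fink-filters layout or logic.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical three-module layout. All names and the toy rule are
# illustrative assumptions, not the real fink-filters code.
FILES = {
    "__init__.py": "",
    "core.py": """
        def classify(cdsxmatch, roid):
            # Toy rule; pure Python, no third-party imports.
            return "Solar System" if roid == 3 else cdsxmatch
    """,
    "pandas_api.py": """
        from .core import classify

        def classify_from_pdf(pdf):
            # Works on any column-indexable table, e.g. a pandas DataFrame.
            return [classify(c, r) for c, r in zip(pdf["cdsxmatch"], pdf["roid"])]
    """,
    "spark_api.py": """
        # Only this module needs pyspark installed.
        from pyspark.sql import functions as F
        from pyspark.sql.types import StringType
        from .core import classify

        classify_udf = F.udf(classify, StringType())
    """,
}

# Write the package to a temporary directory and make it importable.
pkg_root = Path(tempfile.mkdtemp())
pkg_dir = pkg_root / "split_demo"
pkg_dir.mkdir()
for name, body in FILES.items():
    (pkg_dir / name).write_text(textwrap.dedent(body))
sys.path.insert(0, str(pkg_root))

# Importing the pandas-facing module never touches spark_api, so it
# succeeds even on a machine without pyspark.
pandas_api = importlib.import_module("split_demo.pandas_api")
labels = pandas_api.classify_from_pdf(
    {"cdsxmatch": ["RRLyr", "Unknown"], "roid": [0, 3]}
)
```

With this layout, a missing `pyspark` would only surface for users who explicitly import the Spark module.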
Otherwise, maybe find a way to import only the libraries each function uses, rather than all of them whenever the module is imported. What I mean is that importing only `extract_fink_classification_from_pdf` would pull in just the libraries it uses, and importing only `extract_fink_classification` would be what pulls in `pyspark`.
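One way this second option could look, keeping everything in a single module, is to defer the `pyspark` import into the body of the Spark-specific function. This is only a sketch: the function names `classify_core`, `classify_from_pdf` and `classify_spark`, the column names, and the toy rule are placeholders for illustration, not the real functions or logic in `classification.py`.

```python
def classify_core(cdsxmatch, roid):
    """Hypothetical stand-in for the shared per-alert logic (toy rule only)."""
    if roid == 3:
        return "Solar System MPC"
    if cdsxmatch not in ("Unknown", ""):
        return cdsxmatch
    return "Unknown"


def classify_from_pdf(pdf):
    """Pandas-facing wrapper: only indexes columns, so it needs no pyspark."""
    return [classify_core(c, r) for c, r in zip(pdf["cdsxmatch"], pdf["roid"])]


def classify_spark(df):
    """Spark-facing wrapper: pyspark is imported only when this is called."""
    # Deferred import: the module itself can be imported without pyspark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    classify_udf = F.udf(classify_core, StringType())
    return df.withColumn(
        "class", classify_udf(F.col("cdsxmatch"), F.col("roid"))
    )


# The pandas path works on any column-indexable table.
result = classify_from_pdf({"cdsxmatch": ["RRLyr", "Unknown"], "roid": [0, 3]})
```

The trade-off is that a missing `pyspark` is reported at call time rather than at import time, which may or may not be acceptable for the project.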