Error loading a model that was saved with mlnet auto-train #423

RokoToken · 2020-01-29T03:46:19Z

Describe the bug
Creating a model with the mlnet auto-train tool and then loading that model with NimbusML throws an exception.

To Reproduce
Steps to reproduce the behavior:

1. Run mlnet auto-train --dataset ... --task ... to create an ML.NET .zip model file.
2. Using NimbusML, attempt to load that model file and score some data like the following:

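from nimbusml import Pipeline, FileDataStream

# Load the scoring data and the model produced by the mlnet CLI
dataset = FileDataStream.read_csv('TrainingData.csv')
pipeline = Pipeline()
pipeline.load_model("MLModel.zip")
scores = pipeline.predict(dataset, y='target', evaltype='binary')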

Expected behavior
Loading and scoring the model should work as expected.

Actual behavior
You get an exception and scoring is not completed:

Error: *** System.ArgumentOutOfRangeException: 'Could not find label column 'PredictedLabel'
Parameter name: input'
Traceback (most recent call last):
  File "nimbus.py", line 7, in <module>
    scores = pipeline.predict(test_df, evaltype='binary')
  File "C:\Users\eric\Omni\venv\lib\site-packages\nimbusml\internal\utils\utils.py", line 220, in wrapper
    params = func(*args, **kwargs)
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\pipeline.py", line 2228, in predict
    as_binary_data_stream=as_binary_data_stream, **params)
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\internal\utils\utils.py", line 220, in wrapper
    params = func(*args, **kwargs)
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\pipeline.py", line 2172, in _predict
    raise e
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\pipeline.py", line 2169, in _predict
    **params)
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\internal\utils\entrypoints.py", line 449, in run
    output_predictor_modelfilename)
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\internal\utils\entrypoints.py", line 306, in _try_call_bridge
    raise e
  File "C:\Users\eric\venv\lib\site-packages\nimbusml\internal\utils\entrypoints.py", line 278, in _try_call_bridge
    ret = px_call(call_parameters)
RuntimeError: Error: *** System.ArgumentOutOfRangeException: 'Could not find label column 'PredictedLabel'
Parameter name: input'

Desktop (please complete the following information):

OS: Windows
Browser: N/A
Version: 1.6.1

Additional Context

I tried to work around this by adding a column named 'PredictedLabel' to 'TrainingData.csv', but it gave the same error.


ganik · 2020-01-29T04:34:50Z

@RokoToken thank you for reporting this. Could you share the model.zip and a small subset of TrainingData.csv so we can repro this issue? Thanks.

RokoToken · 2020-01-29T22:12:22Z

Modified Titanic CSV Dataset

survived,sex,class,deck,embark_town,alone
TRUE,male,Third,unknown,Southampton,n
TRUE,female,First,C,Cherbourg,n
TRUE,female,Second,unknown,Southampton,y

MLNet CLI Command:

mlnet auto-train --task multiclass-classification --dataset "titanic.csv" --label-column-name "class"

Nimbus Code:

from nimbusml import Pipeline, FileDataStream
dataset = FileDataStream.read_csv('titanic.csv')
pipeline = Pipeline()
pipeline.load_model("MLModel.zip")
scores = pipeline.predict(dataset, y='class', evaltype='binary')
print(scores)

Error:

Error: *** System.ArgumentOutOfRangeException: 'Could not find label column 'PredictedLabel'
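
A quick sanity check on the repro, offered as a sketch rather than a diagnosis: the CLI command above trains a multiclass model on the `class` column, while the script passes evaltype='binary'. The snippet below uses only the Python standard library to list the distinct values of a column in the three sample rows posted above. The helper name `distinct_values` is made up for illustration, and nothing in this thread establishes that the task mismatch is what triggers the exception.

```python
import csv
import io

# The three sample rows posted above, inlined so the snippet is self-contained.
SAMPLE_CSV = """survived,sex,class,deck,embark_town,alone
TRUE,male,Third,unknown,Southampton,n
TRUE,female,First,C,Cherbourg,n
TRUE,female,Second,unknown,Southampton,y
"""


def distinct_values(csv_text, column):
    """Return the sorted distinct values found in `column` of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sorted({row[column] for row in reader})


# Hypothetical check: more than two distinct labels suggests a multiclass task.
labels = distinct_values(SAMPLE_CSV, "class")
task_hint = "multiclass" if len(labels) > 2 else "binary"
print(labels, task_hint)
```

For the sample rows, `class` takes the values First, Second, and Third, which is consistent with the multiclass-classification task used in the CLI command.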

justinormont · 2020-01-29T23:57:54Z

There was a similar issue about six months ago (#201), when we were fixing NimbusML scoring of models trained in the AutoML.NET CLI.

@RokoToken: Can you post your MLModel.zip? Also, which version of the CLI are you using? You can check with mlnet --version.

RokoToken · 2020-01-30T00:39:16Z

@justinormont @ganik
mlnet version = 0.15.28007.4 @BuiltBy: dlab14-DDVSOWINAGE054
MLModel.zip

RokoToken · 2020-02-03T21:51:46Z

Is there a workaround for this? Should I use an older version of MLNet CLI? Is there a way to modify the output column through the Nimbus pipeline? Something like:

from nimbusml import Pipeline, FileDataStream
dataset = FileDataStream.read_csv('titanic.csv')
# add_output_column is a hypothetical parameter, shown only to illustrate the question
pipeline = Pipeline(add_output_column='PredictedLabel')
pipeline.load_model("MLModel.zip")
scores = pipeline.predict(dataset, y='class', evaltype='binary')
print(scores)

ganik · 2020-02-03T22:00:19Z

@RokoToken, the workaround is to find the pipeline parameters from AutoML.NET and re-train the same pipeline using either plain ML.NET or NimbusML. Also, can you try using pipeline.score(...)?

justinormont · 2020-02-04T06:04:08Z

@ganik: Do you see anything odd with the posted model?

@RokoToken: I would expect the AutoML.NET CLI to produce a normal ML.NET model. Your current version is the newest released version.

You can also re-train your model from the code the CLI generated: uncomment the line ModelBuilder.CreateModel() and run the project. You may also want to update the project requirements, as the codegen references an older version of ML.NET.

ganik · 2020-02-26T00:37:56Z

@RokoToken sorry for the delay. Could you please share the titanic.csv file? The model does look OK, so it should work. Thanks.

ganik · 2020-02-26T23:41:37Z

I was able to debug through and get scoring to run after a few fixes in the NimbusML Python code (not ML.NET). However, the returned scores are NaN.
Script:

from nimbusml import Pipeline, FileDataStream

dataset = FileDataStream.read_csv('E:/sources/tmp/titanic.csv')
print(dataset.head(3))

pipeline = Pipeline()
pipeline.load_model("E:/sources/tmp/MLModel.zip")
scores = pipeline.predict(dataset)
print(scores.head(3))

and output:

@justinormont Could you see if you can score this in ML.NET? I am not getting any scores from this model.
I used the CSV test file below:

survived,sex,class,deck,embark_town,alone
TRUE,male,Third,unknown,Southampton,n
TRUE,female,First,C,Cherbourg,n
TRUE,female,Second,unknown,Southampton,y
