Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API gloss update "non-updated fields" #1368

Open
susanodd opened this issue Nov 5, 2024 · 4 comments
Open

API gloss update "non-updated fields" #1368

susanodd opened this issue Nov 5, 2024 · 4 comments
Assignees
Labels

Comments

@susanodd
Copy link
Collaborator

susanodd commented Nov 5, 2024

          @rem0g One more thing relevant to this issue:

We noticed that you send all the fields in the request to update the gloss. However this gives a lot of non-informative changes in the revision history (see https://signbank.cls.ru.nl/dictionary/gloss/49249/history)

We clearly did not document this well ;)
However the intention for this api was to only include the fields that are changed so it only updates those and leaves the others as is and they are not added to the revision history. Just so you know! You can close this issue again when read.

Originally posted by @Jetske in #1360 (comment)

@susanodd
Copy link
Collaborator Author

susanodd commented Nov 5, 2024

DISCUSSION @Woseseltops read here

I made a separate issue for this, since it's not a bug, but a usage concern.

  • For the Signbank Gloss Detail Edit View and the Batch Edit, each of the updates in the Phonology panel, etc. is done by a separate AJAX call. So the updates themselves are very small, one field at a time. This was to limit database locking.

  • But for the API update gloss, this has been communicated poorly in the documentation / Swagger pages.

  • By including a "giant json dictionary" of updates, which in practice is being used with the majority of the fields empty.

  • Signbank is forced to calculate all the "changed fields".

  • This is API server overload compared to the Gloss Edit and Batch Edit interfaces.

  • The gloss update API is causing the database to get locked because of this.

@Woseseltops
Copy link
Collaborator

So the situation is that api_update_gloss is called, and in the payload has giant dictionary of fields, most of which are identical to what we already have in the database. Before we create lots of write actions, we want to check whether a field has actually changed, right? Can't we just load the object like this:

    gloss = get_object_or_404(Gloss, glossid=glossid)

And then for each field in the incoming JSON check if it's different before we do anything with it?

@susanodd
Copy link
Collaborator Author

susanodd commented Nov 21, 2024

Yes, that is what is being done.

I suppose it could be made more like a tuple (the minimal pairs computation makes use of a tuple for comparison), but...

The problem is that the "field choices" are in human readable form. But the gloss has object foreign keys.

So the lookup to compare data is also looking up to find out whether a human readable value is the same as the field choice of the foreign key.

Moreover, the API supports multiple languages to be used in the human readable form. (The language code is given in the header.)

So roughly, to process the API:

  1. activate the language code
  2. for each of the field choice fields in the update json
    i. type check that the human readable value exists as a choice
    ii. compare human readable form with gloss.field.name if gloss.field is not null
    iii. update the field if different
  3. save the gloss

The operation needs to fail if (i) fails.

In additional to field choices, the user can also update annotations, senses, and lemma's if the lemma is not shared by other glosses. These fields refer to other objects, as well as video and image files. These updates can also fail.

Because the update involves many steps that all need to be done in a transaction, any can fail and the database can also become locked. The intention of the API was that the user only provides fields that actually need to be updated.

It becomes tricky because "empty" input can represent deletion as well as ignore.
It was intended that this would mean "delete", since only changed fields should be included in the json.
But since all the fields are being provided in practice, all fields needs to be checked.
Internally, the gloss fields are NULL if never set. But usage of '-' in the input sets them to the field choice with machine value 0. Then they have an object as a field choice.

@susanodd
Copy link
Collaborator Author

susanodd commented Nov 25, 2024

@Woseseltops you can see the extent of the above if in the Admin you look at Gloss Revisions

I can modify it so the "empty" ones are not stored, but the comparison operations are being performed on all the empty fields that are not changed.
In the specification, the API user is not supposed to include unchanged fields. This is because "empty" is the same as "delete", not "ignore".
(See also #1397 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants