WC2-431: Decipher how to aggregate ETL based on current tableau #1833

butofleury · 2024-11-27T12:51:27Z

What problem is this PR solving? Explain here in one sentence.

Related JIRA tickets : WC2-431

Self proofreading checklist

Did I use eslint and black formatters
Is my code clear enough and well documented
Are my typescript files well typed
New translations have been added or updated if new strings have been introduced in the frontend
My migration files are included
Are there enough tests
Documentation has been included (for new feature)

Changes

Aggregated data by period (month and year) and org unit.
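
As a rough illustration of this kind of aggregation, here is a minimal sketch that groups visit rows by org unit and period with `itertools.groupby`. The sample rows and the helper name `aggregate_by_org_unit_and_period` are assumptions made for the example and are not taken from the PR. Only the key names `visit__org_unit_id`, `period`, `rutf_quantity`, `number_visits` and `given_sachet_rutf` appear in the diff. Note that `groupby` only merges adjacent items, so the rows are sorted by the grouping key first.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical visit rows: key names mirror the diff, values are invented.
visits = [
    {"visit__org_unit_id": 2, "period": "2024-02", "rutf_quantity": 5},
    {"visit__org_unit_id": 1, "period": "2024-01", "rutf_quantity": 10},
    {"visit__org_unit_id": 1, "period": "2024-02", "rutf_quantity": 7},
    {"visit__org_unit_id": 1, "period": "2024-01", "rutf_quantity": 3},
]


def aggregate_by_org_unit_and_period(rows):
    """Return one summary row per (org unit, period) pair (hypothetical helper)."""
    key = itemgetter("visit__org_unit_id", "period")
    # groupby only merges adjacent items, so sort by the same key first.
    sorted_rows = sorted(rows, key=key)
    aggregated = []
    for (org_unit, period), group in groupby(sorted_rows, key=key):
        group_rows = list(group)
        aggregated.append(
            {
                "org_unit": org_unit,
                "period": period,
                "number_visits": len(group_rows),
                "given_sachet_rutf": sum(r["rutf_quantity"] for r in group_rows),
            }
        )
    return aggregated


monthly_rows = aggregate_by_org_unit_and_period(visits)
```

With the sample rows above, `monthly_rows` contains three entries, one for each distinct (org unit, period) pair.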

How to test

On a local instance with the wfp coda database, run the ETL script to generate data for:

South Sudan: docker compose run iaso manage etl_ssd
Nigeria: docker compose run iaso manage etl_ng

Then log in to the Django admin and check the Monthly statistics table. You should see data aggregated by period and org unit. You can also filter the data by account (country).

Print screen / video

Administration-de-Wfp-Iaso.webm

Follow the Conventional Commits specification

The merge message of a pull request must follow the Conventional Commits specification.

This convention helps to automatically generate release notes.

Use lowercase for consistency.

Example:

fix: empty instance pop up

Refs: IA-3665

Note that the Jira reference is preceded by a line break.

Both the line break and the Jira reference are entered in the Add an optional extended description… field.

bramj

Tested locally and working. However, I think the variable naming could be better, to improve overall code readability. Avoiding function side effects would also help, although that is more difficult.

bramj · 2024-12-03T13:42:28Z

plugins/wfp/aggregate_journeys.py

+from operator import itemgetter
+
+
+class AGGREGATE_JOURNEY:


Is there a reason why this is not CamelCase?

bramj · 2024-12-03T14:02:43Z

plugins/wfp/common.py

+
+        monlthly_journey.save()
+
+    def journey_with_visit_and_steps_per_visit(self, account, program):


programme?

bramj · 2024-12-03T14:08:49Z

plugins/wfp/common.py

+        monlthly_journey.save()
+
+    def journey_with_visit_and_steps_per_visit(self, account, program):
+        aggregated_journey = []


Some variable names are pretty confusing. I think this would be better called aggregated_journeys to better reflect that it's a list, what do you think?

bramj · 2024-12-03T14:09:05Z

plugins/wfp/common.py

+        )
+        data_by_journey = groupby(list(journeys), key=itemgetter("visit__org_unit_id"))
+
+        for org_unit, journey in data_by_journey:


bramj · 2024-12-03T14:09:16Z

plugins/wfp/common.py

+        data_by_journey = groupby(list(journeys), key=itemgetter("visit__org_unit_id"))
+
+        for org_unit, journey in data_by_journey:
+            visit_by_period = groupby(journey, key=itemgetter("period"))


visits_by_period

bramj · 2024-12-03T14:09:38Z

plugins/wfp/common.py

+        for org_unit, journey in data_by_journey:
+            visit_by_period = groupby(journey, key=itemgetter("period"))
+            assistance = {"rutf_quantity": 0, "rusf_quantity": 0, "csb_quantity": 0}
+            aggregated_journey = AGGREGATE_JOURNEY().group_by_period(


aggregated_journeys

bramj · 2024-12-03T14:20:13Z

plugins/wfp/models.py

+    number_visits = models.IntegerField(default=0)
+    given_sachet_rusf = models.FloatField(null=True, blank=True)
+    given_sachet_rutf = models.FloatField(null=True, blank=True)
+    given_quantity_csb = models.FloatField(null=True, blank=True)


Is there a reason these 3 quantity fields are floats? Intuitively I would think an integer makes more sense?

bramj · 2024-12-03T14:23:28Z

plugins/wfp/aggregate_journeys.py

+                row["number_visits"] = len(all_visits)
+        return row
+
+    def group_by_period(self, visit_by_period, org_unit, all_journey, assistance):


visits_by_period, all_journeys

bramj · 2024-12-03T14:55:17Z

plugins/wfp/common.py

@@ -637,3 +641,72 @@ def save_entity_journey(self, journey, beneficiary, record, entity_type):
        journey.save()

        return journey
+
+    def save_monthly_journey(self, journey, account):


The name suggests that journey is an instance of Journey, but in reality it is a dict representing one monthly statistic. Maybe the naming can be improved?
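
One possible way to make the dict's role explicit, as a sketch only: give the row a typed shape and name the parameter after what it holds. `MonthlyStatisticRow`, `monthly_statistic_fields` and the `account_id` key are hypothetical names introduced for illustration. The actual keys and the model's field names in the PR may differ.

```python
from typing import Optional, TypedDict


class MonthlyStatisticRow(TypedDict):
    """Hypothetical shape of one aggregated row; the keys are assumptions."""

    org_unit: int
    period: str
    number_visits: int
    given_sachet_rutf: Optional[float]


def monthly_statistic_fields(monthly_statistic: MonthlyStatisticRow, account_id: int) -> dict:
    """Return the values to store for one monthly statistic.

    A new dict is built, so the incoming row is left untouched.
    """
    return {**monthly_statistic, "account_id": account_id}


row: MonthlyStatisticRow = {
    "org_unit": 1,
    "period": "2024-01",
    "number_visits": 2,
    "given_sachet_rutf": 13.0,
}
fields = monthly_statistic_fields(row, account_id=7)
```

Naming the argument `monthly_statistic` rather than `journey` signals at the call site that the value is an aggregated row and not a Journey instance.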

bramj · 2024-12-03T15:00:16Z

plugins/wfp/aggregate_journeys.py

+            row["given_sachet_rutf"] = assistance.get("rutf_quantity", 0)
+            row["given_quantity_csb"] = assistance.get("csb_quantity", 0)
+            all_journey.append(row)
+        return all_journey


The code seems to be working, but I find it pretty hard to follow, to be honest. I think the main reason is that certain variables, like assistance and row, are changed in place, and other methods depend on these changes. I'm not sure if that's fixable, but I think we should avoid manipulating objects in place and instead have functions return copies. We should strive for functions without side effects, which makes the code easier to understand.
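
To illustrate the idea, here is a minimal sketch of accumulating the assistance totals without mutating the input. The function names `add_assistance` and `total_assistance` and the sample visits are assumptions for the example, not the PR's code. Only the three `*_quantity` keys come from the diff.

```python
# Starting totals; the key names come from the diff.
EMPTY_ASSISTANCE = {"rutf_quantity": 0, "rusf_quantity": 0, "csb_quantity": 0}


def add_assistance(totals, visit):
    """Return new totals that include the visit's quantities.

    `totals` itself is not modified (hypothetical helper).
    """
    return {key: value + visit.get(key, 0) for key, value in totals.items()}


def total_assistance(visits):
    """Sum the assistance quantities over all visits and return a fresh dict."""
    totals = dict(EMPTY_ASSISTANCE)
    for visit in visits:
        # Rebind to the returned copy instead of updating in place.
        totals = add_assistance(totals, visit)
    return totals


# Invented sample data for the example.
sample_visits = [
    {"rutf_quantity": 2},
    {"rutf_quantity": 3, "csb_quantity": 1.5},
]
assistance = total_assistance(sample_visits)
```

Because each call returns a new dict, callers can rely on the value they passed in staying the same, which makes the data flow easier to trace than with a shared dict updated in several methods.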

butofleury added 10 commits August 9, 2024 12:23

WIP: Creating Perido statistics tables

704bc23

Fix conflicts

23c7733

Merge wfp migrations files

a9778d6

Remove unused field in admin django

7824531

Merge branch 'main' into WC2-431-Decipher-how-to-aggregate-ETL-based-…

8b7fe7b

…on-current-tableau

Merge branch 'main' into WC2-431-Decipher-how-to-aggregate-ETL-based-…

a9336fa

…on-current-tableau

Adding account and year fields to monthly aggregated data

32e3110

Clean aggregated data before populating monlthly statistics table

d884819

Remove yearly statistics table as year field added in monthly statist…

5645f7e

…ics table

Refactoring aggregate function by grouping data by categories

82feb85

butofleury requested review from madewulf and bramj November 27, 2024 12:51

butofleury added 2 commits November 27, 2024 14:52

Reformatting by black

6ccc645

Merge branch 'main' into WC2-431-Decipher-how-to-aggregate-ETL-based-…

2262831

…on-current-tableau

bramj requested changes Dec 3, 2024
