Workflow For Updating The Data #40

Open
Tracked by #28
harshkhandeparkar opened this issue Sep 11, 2023 · 9 comments

Comments

@harshkhandeparkar
Member

Is it possible to make a GitHub Actions workflow that updates the data each semester?

If not, we could make a workflow that, given the timetable link, generates all data and updates the frontend. This workflow could be triggered by a release or on push.

@proffapt
Member

A tag is better if we're not going for a release, because writing release notes is a pain in itself :/

@harshkhandeparkar
Member Author

> A tag is better if we're not going for a release, because writing release notes is a pain in itself :/

We don't necessarily have to write release notes. Making a release creates a tag anyway, and we can keep track of changes.

@proffapt
Member

Just make a tag then, if we're not doing a release ¯_(ツ)_/¯

@harshkhandeparkar
Member Author

> Just make a tag then, if we're not doing a release ¯_(ツ)_/¯

I thought a tag doesn't show the date. Apparently it does. A tag works too.

@shikharish
Member

It is not possible to update data using a GitHub Actions workflow.

  • For the 1st year timetable, the PDF parsing is not accurate for some pages and needs to be fixed manually.
  • Even for other years, the slot format in ERP may sometimes change (like it did in Spring 24).

@harshkhandeparkar
Member Author

> It is not possible to update data using a GitHub Actions workflow.
>
> * For the 1st year timetable, the PDF parsing is not accurate for some pages and needs to be fixed manually.
> * Even for other years, the slot format in ERP may sometimes change (like it did in Spring 24).

In that case it would be good to open a PR using the workflow. The maintainers can manually review and make changes to the PR.

@shikharish
Member

The current workflow is:

  1. Get the 1st year PDF from ERP.
  2. Run the update script (it parses the PDF file into an xlsx file).
  3. The parsing produces errors.
  4. Check what the error is in the xlsx file (usually a column is parsed twice).
  5. Manually fix the errors (in the code, not in the xlsx file).
  6. Run the update script again.
  7. The 1st year scraper runs successfully and creates a JSON file with all the 1st year data.
  8. For the other years' scraper, the erp-login-go package is used; it prompts for credentials and an OTP. After a successful login, all the data is fetched and the JSON file created in the previous step is updated.
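
Step 4 (spotting a column that was parsed twice) is the kind of check that could be scripted before a human looks at the xlsx file. The following is only a minimal sketch under assumptions: the table is represented as a plain list of rows rather than read from the real xlsx output, and `find_duplicate_columns` and the sample `parsed` data are hypothetical names, not part of the project's update script.

```python
from typing import List, Sequence, Tuple


def find_duplicate_columns(rows: Sequence[Sequence[str]]) -> List[Tuple[int, int]]:
    """Return (first_index, duplicate_index) pairs for columns whose cells are identical."""
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows with "" so every column has exactly one cell per row.
    columns = [
        tuple(row[i] if i < len(row) else "" for row in rows)
        for i in range(width)
    ]
    first_seen = {}
    duplicates = []
    for index, column in enumerate(columns):
        if column in first_seen:
            duplicates.append((first_seen[column], index))
        else:
            first_seen[column] = index
    return duplicates


# Hypothetical parsed sheet in which the "Mon" column was emitted twice.
parsed = [
    ["Slot", "Mon", "Mon", "Tue"],
    ["8-9", "MA", "MA", "PH"],
    ["9-10", "CS", "CS", "EE"],
]
print(find_duplicate_columns(parsed))  # prints [(1, 2)]
```

Note that two entirely empty columns would also be reported as duplicates of each other, so the result is a hint for manual review rather than a verdict.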

@harshkhandeparkar
Member Author

> The current workflow is:
>
> 1. Get the 1st year PDF from ERP.
> 2. Run the update script (it parses the PDF file into an xlsx file).
> 3. The parsing produces errors.
> 4. Check what the error is in the xlsx file (usually a column is parsed twice).
> 5. Manually fix the errors (in the code, not in the xlsx file).
> 6. Run the update script again.
> 7. The 1st year scraper runs successfully and creates a JSON file with all the 1st year data.
> 8. For the other years' scraper, the erp-login-go package is used; it prompts for credentials and an OTP. After a successful login, all the data is fetched and the JSON file created in the previous step is updated.

Many of these steps can be automated. If there is an error, the code can be updated and the rest of the workflow can still be automated.
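
One way to picture this: run the steps in order, stop at the first failure with the name of the failing step, and allow a re-run from that step once the code is fixed. The sketch below is purely illustrative. `run_steps`, `StepFailed`, and the `start_at` parameter are hypothetical names, and the step functions stand in for the real update script and scrapers.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


class StepFailed(Exception):
    """Raised when a step fails; records which step needs a manual fix."""

    def __init__(self, name: str, cause: Exception):
        super().__init__(f"step '{name}' failed: {cause}")
        self.name = name
        self.cause = cause


def run_steps(steps: List[Step], start_at: Optional[str] = None) -> List[str]:
    """Run steps in order, optionally resuming at the step named start_at.

    Returns the names of the steps that completed in this run.
    """
    names = [name for name, _ in steps]
    start = names.index(start_at) if start_at is not None else 0
    completed = []
    for name, action in steps[start:]:
        try:
            action()
        except Exception as exc:
            # Stop here so a maintainer can fix the code, then resume from this step.
            raise StepFailed(name, exc) from exc
        completed.append(name)
    return completed
```

After a fix, calling `run_steps(steps, start_at="parse_pdf")` (with whatever step name failed) would re-run only the remaining steps.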

proffapt self-assigned this Jun 8, 2024

@proffapt
Member

proffapt commented Jun 8, 2024

I will automate the shit out of it :D
