Same file with identical content gets downloaded again every time Moodle-DL is executed #206

Open
TheBlueKingLP opened this issue Jan 21, 2024 · 9 comments
Labels
bug Something isn't working not reproducible Given the information provided, we were not able to recreate the issue

Comments

@TheBlueKingLP

Description of the bug

There is a file in my Moodle account that gets downloaded again every time I execute Moodle-DL. The previous copy is then renamed with a suffix such as _old_01 appended.
For example:
XXX_old_28.md
XXX_old_27.md
XXX_old_26.md

Steps to reproduce the issue

Run moodle-dl with no arguments.

2024-01-21 21:42:08  DEBUG  {task}  [0] Starting Task: Task (0, File (module_id: 1, section_name: "Topic 1", section_id: "<id are the same>", module_name: "Lesson 1 exer", content_filepath: /, content_filename: "Lesson 1 exer", content_fileurl: "", content_filesize: 1790, content_timemodified: 0, module_modname: label, content_type: description, content_isexternalfile: False, saved_to: "", time_stamp: 0, modified: True, moved: False, deleted: False, notified: False, hash: <all hash are different>, file_id: None, old_file_id: None), Course (id: 5845, fullname: "Course Title", overwrite_name_with: "None", create_directory_structure: True, files: 26), TaskStatus(state=<TaskState.STARTED: 'STARTED'>, bytes_downloaded=0, external_total_size=0, error=None, yt_dlp_failed_with_error=False, yt_dlp_used_generic_extractor=False, yt_dlp_current_file=None, yt_dlp_total_size_per_file={}, yt_dlp_bytes_downloaded_per_file={}))
2024-01-21 21:42:08  DEBUG  {task}  [0] Renaming old file
2024-01-21 21:42:08  DEBUG  {task}  [0] Starting downloading of: Course Title/Topic 1/Lesson 1 exer.md
2024-01-21 21:42:08  DEBUG  {task}  [0] Creating a description file
2024-01-21 21:42:08  DEBUG  {task}  [1] Starting Task: Task (1, File (module_id: 2, section_name: "Topic 1", section_id: "<id are the same>", module_name: "Lesson 1 exer", content_filepath: /, content_filename: "Lesson 1 exer", content_fileurl: "", content_filesize: 1823, content_timemodified: 0, module_modname: label, content_type: description, content_isexternalfile: False, saved_to: "", time_stamp: 0, modified: True, moved: False, deleted: False, notified: False, hash: <all hash are different>, file_id: None, old_file_id: None), Course (id: 5845, fullname: "Course Title", overwrite_name_with: "None", create_directory_structure: True, files: 26), TaskStatus(state=<TaskState.STARTED: 'STARTED'>, bytes_downloaded=0, external_total_size=0, error=None, yt_dlp_failed_with_error=False, yt_dlp_used_generic_extractor=False, yt_dlp_current_file=None, yt_dlp_total_size_per_file={}, yt_dlp_bytes_downloaded_per_file={}))
2024-01-21 21:42:08  DEBUG  {task}  [1] Renaming old file
2024-01-21 21:42:08  DEBUG  {task}  [1] Starting downloading of: Course Title/Topic 1/Lesson 1 exer_01.md
2024-01-21 21:42:08  DEBUG  {task}  [1] Creating a description file
2024-01-21 21:42:08  DEBUG  {task}  [2] Starting Task: Task (2, File (module_id: 3, section_name: "Topic 1", section_id: "<id are the same>", module_name: "Lesson 2 exer", content_filepath: /, content_filename: "Lesson 2 exer", content_fileurl: "", content_filesize: 1802, content_timemodified: 0, module_modname: label, content_type: description, content_isexternalfile: False, saved_to: "", time_stamp: 0, modified: True, moved: False, deleted: False, notified: False, hash: <all hash are different>, file_id: None, old_file_id: None), Course (id: 5845, fullname: "Course Title", overwrite_name_with: "None", create_directory_structure: True, files: 26), TaskStatus(state=<TaskState.STARTED: 'STARTED'>, bytes_downloaded=0, external_total_size=0, error=None, yt_dlp_failed_with_error=False, yt_dlp_used_generic_extractor=False, yt_dlp_current_file=None, yt_dlp_total_size_per_file={}, yt_dlp_bytes_downloaded_per_file={}))
2024-01-21 21:42:08  DEBUG  {task}  [2] Renaming old file
2024-01-21 21:42:08  DEBUG  {task}  [2] Starting downloading of: Course Title/Topic 1/Lesson 2 exer.md
2024-01-21 21:42:08  DEBUG  {task}  [2] Creating a description file
2024-01-21 21:42:08  DEBUG  {task}  [3] Starting Task: Task (3, File (module_id: 5, section_name: "Topic 1", section_id: "<id are the same>", module_name: "Lesson 2 exer", content_filepath: /, content_filename: "Lesson 2 exer", content_fileurl: "", content_filesize: 1779, content_timemodified: 0, module_modname: label, content_type: description, content_isexternalfile: False, saved_to: "", time_stamp: 0, modified: True, moved: False, deleted: False, notified: False, hash: <all hash are different>, file_id: None, old_file_id: None), Course (id: 5845, fullname: "Course Title", overwrite_name_with: "None", create_directory_structure: True, files: 26), TaskStatus(state=<TaskState.STARTED: 'STARTED'>, bytes_downloaded=0, external_total_size=0, error=None, yt_dlp_failed_with_error=False, yt_dlp_used_generic_extractor=False, yt_dlp_current_file=None, yt_dlp_total_size_per_file={}, yt_dlp_bytes_downloaded_per_file={}))
2024-01-21 21:42:08  DEBUG  {task}  [3] Renaming old file
2024-01-21 21:42:08  DEBUG  {task}  [3] Starting downloading of: Course Title/Topic 1/Lesson 2 exer_02.md
2024-01-21 21:42:08  DEBUG  {task}  [3] Creating a description file
2024-01-21 21:42:08  DEBUG  {task}  [4] Starting Task: Task (4, File (module_id: 4, section_name: "Topic 1", section_id: "<id are the same>", module_name: "Lesson 2 exer", content_filepath: /, content_filename: "Lesson 2 exer", content_fileurl: "", content_filesize: 1769, content_timemodified: 0, module_modname: label, content_type: description, content_isexternalfile: False, saved_to: "", time_stamp: 0, modified: True, moved: False, deleted: False, notified: False, hash: <all hash are different>, file_id: None, old_file_id: None), Course (id: 5845, fullname: "Course Title", overwrite_name_with: "None", create_directory_structure: True, files: 26), TaskStatus(state=<TaskState.STARTED: 'STARTED'>, bytes_downloaded=0, external_total_size=0, error=None, yt_dlp_failed_with_error=False, yt_dlp_used_generic_extractor=False, yt_dlp_current_file=None, yt_dlp_total_size_per_file={}, yt_dlp_bytes_downloaded_per_file={}))
2024-01-21 21:42:08  DEBUG  {task}  [4] Renaming old file
2024-01-21 21:42:08  DEBUG  {task}  [4] Starting downloading of: Course Title/Topic 1/Lesson 2 exer_01.md
2024-01-21 21:42:08  DEBUG  {task}  [4] Creating a description file
2024-01-21 21:42:08  DEBUG  {task}  [1] Download finished
2024-01-21 21:42:08  DEBUG  {task}  [0] Download finished
2024-01-21 21:42:08  DEBUG  {task}  [2] Download finished
2024-01-21 21:42:08  DEBUG  {task}  [3] Download finished
2024-01-21 21:42:08  DEBUG  {task}  [4] Download finished

It reports that files have changed, but they are actually identical, with no change in content.
See the anonymized output below:

14 changes found for the configured Moodle-Account.
Course Title
<file that are not moved to _old_XX.md are redacted>
≠       Course Title/Topic 1/Lesson 1 exer.md
≠       Course Title/Topic 1/Lesson 3 exer.md
≠       Course Title/Topic 1/Lesson 1 exer_01.md
≠       Course Title/Topic 1/Lesson 2 exer_02.md
≠       Course Title/Topic 1/Lesson 2 exer_01.md
≠       Course Title/Topic 1/Lesson 3 exer_02.md
≠       Course Title/Topic 1/Lesson 3 exer_03.md
≠       Course Title/Topic 1/Lesson 2 exer.md
≠       Course Title/Topic 1/Lesson 3 exer_01.md

Expected behavior

A file should be downloaded only once if its content is identical, without the old file being moved to _old_XX.md.
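
One way to confirm that the re-downloaded files really are byte-identical is to group them by content hash. This is a minimal sketch using only the Python standard library; it is not part of moodle-dl, and the function names and the folder path in the usage note are illustrative assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def group_by_content(folder: Path, pattern: str = "*.md") -> dict[str, list[str]]:
    """Map each content digest to the sorted names of the files that share it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(folder.glob(pattern)):
        if path.is_file():
            groups[sha256_of(path)].append(path.name)
    return dict(groups)
```

Calling `group_by_content(Path("Course Title/Topic 1"))` (a hypothetical path) returns one entry per distinct content. If a file and all of its _old_XX copies land in the same group, the content on disk did not change between runs.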

Possible Fix

Technical details

  • OS: Arch Linux with kernel 6.7.0-arch3-1
  • Moodle-DL Version moodle-dl 2.3.2.0

P.S. if you need a more detailed/anonymized log I can send it to you privately.

@TheBlueKingLP TheBlueKingLP added the bug Something isn't working label Jan 21, 2024
@C0D3D3V
Owner

C0D3D3V commented Jan 21, 2024

😅 There is probably something changing in the HTML file (something that gets stripped out in the Markdown conversion).
Do you know how to debug Python?
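
To illustrate the hypothesis above: if a change hash is computed over the raw HTML, a difference that never reaches the converted file still produces a new hash. The sketch below is an assumption-laden illustration, not moodle-dl's actual hashing code. The HTML fragments and the changing `id` attribute are made up, and which part of the real description changes is unknown.

```python
import hashlib
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text nodes of an HTML fragment, dropping tags and attributes."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the text content of `html` with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def digest(text: str) -> str:
    """Return the SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Made-up example: two fetches that differ only in an attribute value.
first = '<div id="yui_1"><p>Lesson 1 exercise</p></div>'
second = '<div id="yui_2"><p>Lesson 1 exercise</p></div>'

print(digest(first) == digest(second))                              # False
print(digest(visible_text(first)) == digest(visible_text(second)))  # True
```

Comparing both digests for two consecutive fetches of the same label would show whether the difference sits in the markup or in the visible text.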

@TheBlueKingLP
Author

Sorry, I am not really a programmer. I can write basic scripts, but not sophisticated software like Moodle-DL. However, I can probably get whatever info you need if instructions are given.

@C0D3D3V
Owner

C0D3D3V commented Mar 28, 2024

Please send me a screenshot of your course. Can it be that you have multiple lessons with the same name?

@C0D3D3V
Owner

C0D3D3V commented Mar 28, 2024

Hm, I just tested having two lessons with the same name. For me it works without redownloading, so I guess it really is something changing in the description of the lessons.

It's pretty funny that Moodle adds links to the other lessons with the same name to the description ^^ (at least for one other lesson with the same name). If you have more than two lessons with the same name in the same section, the links kind of make no sense. But moodle-dl downloads them correctly.
Edit: Correcting myself, Moodle does not refer to lessons with the same name. That happened because I used the name of the lesson in the description. The links were generated by the auto-linking feature: https://docs.moodle.org/403/en/Auto-linking

So I need a call with you so that we can debug this together, maybe on Discord. Please contact me via mail.

@C0D3D3V C0D3D3V added the not reproducible Given the information provided, we were not able to recreate the issue label Mar 28, 2024
@C0D3D3V
Owner

C0D3D3V commented Mar 28, 2024

You could also provide me with the files that always get updated (including the old files); maybe I can see there what is changing.

I even wonder what "Lesson 1 exer.md" is supposed to be, because moodle-dl normally does not create such names.

@TheBlueKingLP
Author

Sorry for the late reply. I got a new account on a new Moodle instance and it still has the issue:
SE (copy).md
SE (copy) (copy).md
SE (copy) (copy) (copy).md
SE (copy) (copy) (copy)_old.md
SE (copy) (copy)_old.md
SE (copy)_01.md
SE (copy)_01_old.md
SE (copy)_old.md

I suspect it is caused by multiple "sections" (not sure if this is the right term) with the same name.
(Sorry, I had to black out most of the things on screen to post this publicly, but I hope this is enough to give some context for the duplicated files.)

image

@C0D3D3V
Owner

C0D3D3V commented Sep 26, 2024

Those are not sections but labels. On Moodle, a label has a name in addition to the text it contains. If you make a copy of a label on Moodle, it adds the "(copy)" suffix to the name of the copy.

If two labels have the same name, moodle-dl will add the _01 suffix to one of the labels.

If a label gets redownloaded, moodle-dl adds the _old suffix to the file name of the previously downloaded file.
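
The duplicate-name rule described above can be sketched roughly as follows. This is a hypothetical illustration and not moodle-dl's implementation: the function names, the two-digit counter format, and the processing order are assumptions; the module IDs and label names are taken from the debug log earlier in this thread.

```python
def with_suffix(stem: str, ext: str, taken: set[str]) -> str:
    """Return stem+ext, or the first stem_NN+ext that is not already in `taken`."""
    candidate = f"{stem}{ext}"
    counter = 0
    while candidate in taken:
        counter += 1
        candidate = f"{stem}_{counter:02d}{ext}"
    taken.add(candidate)
    return candidate


def assign_names(labels: list[tuple[int, str]]) -> dict[int, str]:
    """Assign a file name to each (module_id, label_name) pair, in list order."""
    taken: set[str] = set()
    return {module_id: with_suffix(name, ".md", taken) for module_id, name in labels}


labels = [(3, "Lesson 2 exer"), (4, "Lesson 2 exer"), (5, "Lesson 2 exer")]
print(assign_names(labels))
# {3: 'Lesson 2 exer.md', 4: 'Lesson 2 exer_01.md', 5: 'Lesson 2 exer_02.md'}
print(assign_names(list(reversed(labels))))
# {5: 'Lesson 2 exer.md', 4: 'Lesson 2 exer_01.md', 3: 'Lesson 2 exer_02.md'}
```

In this sketch the file name a label receives depends on the order in which same-named labels are processed. Whether something similar happens inside moodle-dl has not been established in this thread.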

I have to investigate how we could fix this. I probably need more information from you, e.g. via mail.
In any case, we probably need to add a feature that allows numbering the items in a section.

PS: you can turn off downloading of labels if you disable downloading descriptions.

@TheBlueKingLP
Author

Would it be possible to add an option to prefix or suffix the downloaded file name with the ID of the label? For example, I see each label has a unique ID such as data-id="XXXXXX" in the <li> of that label. So that the final file will be something like XXXXXX-TN.md or TN.XXXXXX.md.
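
A rough sketch of what such an option might produce, covering both naming styles from the request. The function and its `style` parameter are hypothetical and do not exist in moodle-dl; the ID is just a placeholder integer.

```python
def name_with_id(label_name: str, label_id: int, style: str = "prefix") -> str:
    """Build a .md file name that embeds a unique ID next to the label name."""
    if style == "prefix":
        return f"{label_id}-{label_name}.md"
    if style == "suffix":
        return f"{label_name}.{label_id}.md"
    raise ValueError(f"unknown style: {style!r}")


print(name_with_id("TN", 123456))                  # 123456-TN.md
print(name_with_id("TN", 123456, style="suffix"))  # TN.123456.md
```

Because the ID differs for every label, two labels with the same name would no longer compete for the same file name.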

image

@C0D3D3V
Owner

C0D3D3V commented Sep 26, 2024

Not exactly the data-id. The data-id does not come from the Moodle database, but probably from some sort of web framework like Angular.

But each label has an instance ID and a module ID; we could use these. Alternatively, we could use the sort-order numbers, so that you would have a file structure like:

1. First Section
1. First Section / 1. Label in First Section
1. First Section / 2. First File in First Section
1. First Section / 3. Directory or Assignment
1. First Section / 3. Directory or Assignment / 1. File in that Directory or Assignment
1. First Section / 3. Directory or Assignment / 2. File in that Directory or Assignment
1. First Section / 4. Label in First Section
1. First Section / 5. Another Label in First Section
2. Second Section
2. Second Section / 1. Label in that Section
2. Second Section / 2. First File in that Section
2. Second Section / 3. Quiz
2. Second Section / 3. Quiz / 1. File in that Quiz
2. Second Section / 3. Quiz / 2. File in that Quiz
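
The sort-order layout above could be generated along these lines. This is only a sketch under assumptions: the `Item` class and `numbered_paths` function are invented for illustration and do not reflect moodle-dl's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A section, label, file, or container, listed in its Moodle sort order."""
    name: str
    children: list["Item"] = field(default_factory=list)


def numbered_paths(items: list[Item], parent: str = "") -> list[str]:
    """Prefix every item with its 1-based position among its siblings."""
    paths: list[str] = []
    for position, item in enumerate(items, start=1):
        path = f"{parent}{position}. {item.name}"
        paths.append(path)
        paths.extend(numbered_paths(item.children, parent=f"{path} / "))
    return paths


course = [
    Item("First Section", [
        Item("Label in First Section"),
        Item("Label in First Section"),  # same name, different position
    ]),
]
for path in numbered_paths(course):
    print(path)
# 1. First Section
# 1. First Section / 1. Label in First Section
# 1. First Section / 2. Label in First Section
```

With the position in the name, two labels that share a name still get distinct paths, so no _01 suffix is needed in this sketch.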

That is basically what is requested in #217.
