[ENH] Refactor `_loss_mapping` #1967
base: main
Conversation
Converting this to a draft PR. Looking for feedback.
Thanks and welcome to pytorch-forecasting, @ParamThakkar123!
I have a few questions/suggestions:
- Do not use `get_test_train_params` for `model_pkg`, as the fixtures come from `get_base_test_params`.
- I suspect that somewhere the `compatible_loss` list is returned empty, and this is the reason some models are looking for `get_test_train_params` (which should not happen for v1).
- Also, can you please update the description of the PR to add the following info:
  - The EXACT changes you are making to the files
  - Why you are making them
  - The expected behavior of the test framework after these changes.
This would make reviewing easier for the reviewers.
If you have any problems/doubts, please feel free to ping us on Discord. You can also join the weekly community meet, which happens every Friday at 13:00 UTC in the meetups channel on Discord.
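To make the concern about an empty `compatible_loss` list concrete, here is a minimal, hypothetical sketch of a tag-driven compatibility lookup that fails loudly instead of silently returning an empty list. The names `METRIC_PKGS`-style package classes and the `info:pred_type` tag key are taken from this thread; the function name and the error handling are illustrative, not the actual pytorch-forecasting implementation.

```python
def get_compatible_losses(model_pred_types, metric_pkgs):
    """Collect metric packages whose prediction type matches the model.

    model_pred_types: list of prediction types the model supports.
    metric_pkgs: iterable of metric package classes exposing a _tags dict.
    """
    compatible_losses = []
    for pkg in metric_pkgs:
        pkg_pred_types = pkg._tags.get("info:pred_type", [])
        if any(pt in model_pred_types for pt in pkg_pred_types):
            compatible_losses.append(pkg)
    if not compatible_losses:
        # An empty list here is what would make some models fall back to
        # get_test_train_params -- surface the problem early instead.
        raise ValueError(
            f"no compatible losses found for prediction types {model_pred_types}"
        )
    return compatible_losses
```

Raising on an empty result (rather than skipping the metric in the test loop) makes collection bugs visible at test time, which matches the advice later in the thread to debug rather than skip.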
Hello @phoeenniixx, thanks for the amazing feedback. And yes, there are some metrics for which no compatible losses are found. Should I skip them in the test loop? Or what kind of changes should I be making, exactly? A fallback?
Ahead of a review - you are reformatting a large number of files.
Could you kindly do that in a separate pull request, to avoid a large scope?
Yes. Will do that.
So, here we use …
These were collected before, so it should be possible to collect them with the new changes; I think we should not skip them. Rather, please check if there are some bugs. To check if all the losses are being collected correctly, please run the tests from … Ideally, both of these tests should pass and all the losses should be collected correctly.
Codecov Report
❌ Patch coverage is …
Additional details and impacted files:
```
@@ Coverage Diff @@
## main #1967 +/- ##
=======================================
  Coverage        ?   87.12%
=======================================
  Files           ?      158
  Lines           ?     9358
  Branches        ?        0
=======================================
  Hits            ?     8153
  Misses          ?     1205
  Partials        ?        0
```
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
@ParamThakkar123, how did you test whether the tests now get collected correctly?
I ran the tests mentioned by @phoeenniixx multiple times, looked at the errors that occurred, and then found bugs in my code related to some tags in the for loop for …
Nice! This is really good!
I made some requests, please have a look at them.
Additionally:
- Please remove the files which are being added due to formatting (as @fkiraly mentioned). I suspect you are running `pre-commit run --all-files` every time before committing, and that is why they get committed every time.
- I would suggest using `pre-commit run --files <path_to_the_changed_files>` instead.
- If you are using Linux and the CLI to commit, `pre-commit` automatically runs on the changed files when you use `git commit`, so I don't think you would need to run pre-commit manually there. But if you are on Windows (or maybe macOS, I do not have any experience with macOS), I would suggest using `pre-commit run --files <path_to_the_changed_files>` rather than `pre-commit run --all-files`.
pytorch_forecasting/metrics/_distributions_pkg/_beta/_beta_distribution_loss_pkg.py
```
@@ -16,6 +17,15 @@ class BetaDistributionLoss_pkg(_BasePtMetric):
    "distribution_type": "beta",
    "info:metric_name": "BetaDistributionLoss",
    "requires:data_type": "beta_distribution_forecast",
    "clip_target": True,
```
I was wondering if we need `clip_target` and `data_loader_kwargs` in the tags, as they do not convey any useful info; rather, they are params to make the dataloaders more compatible with each metric. Maybe we should move them somewhere else in the class?
Yes, you're right. Where do you think I should move it?
Maybe they can be one of the properties? Or args in `__init__`?
What do you think is the better idea: a property or an arg?
@fkiraly what is your opinion about this?
@phoeenniixx I think we can keep it as an arg?
where is it being passed?
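A minimal sketch of the arg-based option discussed in this thread: `clip_target` and `data_loader_kwargs` leave the tag dict and become constructor arguments holding plain instance state. The class name and defaults here are hypothetical stand-ins for the real `_pkg` classes, not the actual pytorch-forecasting API.

```python
class BetaDistributionLossPkgSketch:
    """Illustrative metric package with dataloader params as __init__ args."""

    _tags = {
        "distribution_type": "beta",
        "info:metric_name": "BetaDistributionLoss",
        # clip_target / data_loader_kwargs no longer live in the tag dict
    }

    def __init__(self, clip_target=True, data_loader_kwargs=None):
        # dataloader-compatibility parameters are ordinary instance state
        self.clip_target = clip_target
        self.data_loader_kwargs = data_loader_kwargs or {}
```

Keeping these as args means the tag system stays reserved for discoverable metadata, while per-instance dataloader tweaks remain configurable at construction time.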
@phoeenniixx I have implemented the changes you suggested. Can you please review it?
```
@@ -1,7 +1,25 @@
"""NHiTS package container."""

from pytorch_forecasting.metrics.distributions import (
    BetaDistributionLoss,
    ImplicitQuantileNetworkDistributionLoss,
```
Why are we importing these classes and `DISTR_LOSSES_NUMERIC` here in `_nhits_pkg`? Are they used anywhere in the file?
@phoeenniixx I did that because there's a for loop in the code at line number 108 which loops over this `DISTR_LOSSES_NUMERIC` array.
Hmm, I see, but that still doesn't explain the imports of the loss classes. From what I can see, they were already being imported in `_get_test_dataloaders_from`, see here.
Also, I am a bit uncomfortable about defining `DISTR_LOSSES_NUMERIC`. We imported it before because it was already present in `_loss_mapping`, and now you are refactoring that, so I think we could also refactor a little here. For example: we just check if the loss is `MultivariateNormalDistributionLoss` and return `dataloaders_fixed_window_without_covariates()`, and for the rest of the distribution losses, we perform the required operations and return `dataloaders_with_covariates`. I don't think we would need `DISTR_LOSSES_NUMERIC` then, do we?
Okay, so probably what we can do here is remove the redundant import but keep `DISTR_LOSSES_NUMERIC` as it is.
I think we could even refactor `get_test_dataloaders_from` of `nhits_pkg` here as well :)
Please look at my previous comment. What do you think?
Also, why are we importing the unused classes here? (referring to your latest commit)
I think we don't even need to use `DISTR_LOSSES_NUMERIC` now, thanks to your brilliant refactoring :)
Okay, got it @phoeenniixx, I will make the suggested changes.
Just a heads up: there are other losses being tested for `nhits` along with the distribution losses (like point-prediction losses), so you'd have to add support for them as well. Previously there was support for `TweedieLoss` (see the previous check for the loss), and for the other losses `dataloaders_fixed_window_without_covariates` was being passed...
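The dispatch the reviewers sketch out in words above could look something like the following. This is illustrative pseudocode against stand-in fixtures, not the real `_get_test_dataloaders_from`: the loss and fixture names come from this thread, while the `is_distribution_loss` attribute and the `fixtures` dict are invented for the example.

```python
def pick_test_dataloaders(loss, fixtures):
    """Return the dataloader fixture appropriate for the loss under test.

    fixtures: mapping from fixture name to a zero-arg factory callable,
    standing in for the real dataloader helper functions.
    """
    name = type(loss).__name__
    if name in ("MultivariateNormalDistributionLoss", "TweedieLoss"):
        # these two previously had dedicated checks and use the simple fixture
        return fixtures["dataloaders_fixed_window_without_covariates"]()
    if getattr(loss, "is_distribution_loss", False):
        # remaining distribution losses need the covariate dataloaders
        return fixtures["dataloaders_with_covariates"]()
    # point-prediction losses also fall back to the simple fixture
    return fixtures["dataloaders_fixed_window_without_covariates"]()
```

Branching on the loss itself removes the need for a module-level `DISTR_LOSSES_NUMERIC` list, which is exactly the simplification proposed above.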
@phoeenniixx I made the changes. Does the recent change look better now?
Are all the tests passing locally?
Yup.
- I think `clip_target` and `data_load_kwargs` are not tags. To know where they are best moved: where are they being passed, and when?
- I would rename `expected_loss_ndim` to `loss_ndim` or similar, or `n_pred_dim`?
@fkiraly They are being passed in the …
I see - that is a test config function, so I would either encode the parameters there, or in an additional private function - not in tags. |
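One way to read the suggestion above (encode test-only parameters in a private function rather than in tags) is sketched below. The class and method names are hypothetical; the point is only that test configuration lives in a private classmethod, separate from the discoverable tag metadata.

```python
class BetaPkgTestConfigSketch:
    """Illustrative metric package: tags hold metadata, not test config."""

    _tags = {"info:metric_name": "BetaDistributionLoss"}

    @classmethod
    def _get_test_dataloader_params(cls):
        # test-only configuration, kept out of the tag system
        return {"clip_target": True, "data_loader_kwargs": {}}
```

Because the method is private and class-level, the test framework can call it when building dataloaders while the public tag interface stays limited to information users might query.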
""" | ||
compatible_losses = [] | ||
for pkg in METRIC_PKGS: | ||
pkg_pred_types = pkg._tags.get("info:pred_type", []) |
Here you could use `get_class_tags`, see here for the example.
Yes, that's why I proposed using …
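To illustrate the `get_class_tags`-style suggestion: scikit-base-flavored base classes typically expose a class-level tag lookup that walks the inheritance chain, so the raw `pkg._tags.get(...)` access in the snippet above can go through a public classmethod instead. The sketch below uses a stand-in base class, since the real `_BasePtMetric` is not reproduced here; the `get_class_tag` name and semantics are an assumption modeled on scikit-base.

```python
class BaseObjectSketch:
    """Minimal stand-in for a scikit-base-style tagged base class."""

    _tags = {}

    @classmethod
    def get_class_tag(cls, tag_name, tag_value_default=None):
        # walk the MRO from base to derived so subclasses can override tags
        collected = {}
        for klass in reversed(cls.__mro__):
            collected.update(getattr(klass, "_tags", {}))
        return collected.get(tag_name, tag_value_default)

class PointMetricPkg(BaseObjectSketch):
    _tags = {"info:pred_type": ["point"]}
```

With this, the loop body becomes `pkg.get_class_tag("info:pred_type", [])`, which also picks up tags inherited from parent package classes rather than only those declared directly on `pkg`.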
Reference Issues/PRs
Fixes #1957
What does this implement/fix? Explain your changes.
This refactor moves the legacy loss-compatibility mappings and tensor shape checks into metric tags in the respective `_pkg` classes.
Test-specific parameters and dataloader configurations are now provided by a standardized `_get_test_dataloaders_from` method in each metric package.
All test logic for selecting compatible losses and checking output shapes is now tag-driven, improving maintainability and extensibility.
The legacy mapping code and if/else logic have been removed from the test suite.
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Any other comments?
PR checklist
- Install hooks with `pre-commit install`. To run hooks independent of a commit, execute `pre-commit run --all-files`.