Java: Use force local to make parsing local after global regex finding. #20378
90c450b to 0f65b20
0f65b20 to 2201974
Pull Request Overview
This PR optimizes regex flow analysis in Java by making parsing local after global regex finding. The change uses `forceLocal` to localize the `usedAsRegex` predicate while maintaining the global regex finding capabilities.
Key changes:
- Refactors the `usedAsRegex` predicate to use local evaluation with `forceLocal`
- Extracts the original logic into a helper predicate `usedAsRegexImpl`
- Adds `overlay[local]` annotation to the main predicate
* Holds if `regex` is used as a regex, with the mode `mode` (if known).
* If regex mode is not known, `mode` will be `"None"`.
*
* As an optimisation, only regexes containing an infinite repitition quatifier (`+`, `*`, or `{x,}`)
There's a typo in the comment: 'repitition' should be 'repetition'.
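The optimisation mentioned in the quoted doc comment (only considering regexes that contain an unbounded repetition quantifier) can be illustrated with a small sketch. This is not the CodeQL implementation: the function name `has_unbounded_repetition` and the pattern are hypothetical, and the check is deliberately naive (it does not account for escaped characters or character classes, so it over-approximates).

```python
import re

# Hypothetical pattern: matches `+`, `*`, or an open-ended `{n,}` quantifier.
# Naive on purpose: escapes such as `\+` and classes such as `[+*]` still match,
# so the filter may keep more strings than strictly necessary.
_UNBOUNDED = re.compile(r"[+*]|\{\d+,\}")


def has_unbounded_repetition(regex_source: str) -> bool:
    """Return True if the regex source text appears to contain `+`, `*`, or `{n,}`."""
    return _UNBOUNDED.search(regex_source) is not None
```

Under these assumptions, `a+b` and `a{2,}` pass the filter, while `a{2,5}` and `a?` do not, since their repetition is bounded.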
Code change looks good. Do you have links to DCA experiments? Since we are moving the overlay frontier, there is a risk of optimisation regressions under non-overlay evaluation. I would therefore expect one DCA experiment that shows no performance impact under non-overlay evaluation, and another that shows little to no accuracy regression under overlay evaluation and possibly a speedup.
I just ran DCA and I think the results look good: https://github.com/github/codeql-dca-main/blob/data/alexet/pr-20378-220197__nightly__nightly-queries/reports/summaries/time.theme.md
Yes, timing results look good. There were extraction differences for
smowton states that the extractor errors can't possibly be caused by this PR, so they should be disregarded.
With this change, regex parsing becomes local.
The assumption is that strings in the base don't become regexes or stop being regexes. From what I have seen, such changes seem unlikely.