Primops: init concatMapAttrs #13986
Conversation
FWIW, it might also be productive to work on reducing the overhead/allocations for attrset merges. Do you have a particular benchmark in mind which shows this overhead?

I have a simple eval test that I used for comparison:
Result
How I got these:

With evaluating:

```json
{
  "cpuTime": 0.01888599991798401,
  "envs": {
    "bytes": 2576,
    "elements": 254,
    "number": 68
  },
  "gc": {
    "cycles": 1,
    "heapSize": 402915328,
    "totalBytes": 49008
  },
  "list": {
    "bytes": 88,
    "concats": 0,
    "elements": 11
  },
  "nrAvoided": 61,
  "nrExprs": 3277,
  "nrFunctionCalls": 52,
  "nrLookups": 13,
  "nrOpUpdateValuesCopied": 552,
  "nrOpUpdates": 11,
  "nrPrimOpCalls": 10,
  "nrThunks": 726,
  "sets": {
    "bytes": 22232,
    "elements": 1373,
    "number": 33
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 8,
    "Env": 8,
    "Value": 16
  },
  "symbols": {
    "bytes": 8276,
    "number": 842
  },
  "time": {
    "cpu": 0.01888599991798401,
    "gc": 0.0,
    "gcFraction": 0.0
  },
  "values": {
    "bytes": 14080,
    "number": 880
  }
}
```

With evaluating:

```json
{
  "cpuTime": 0.03371499851346016,
  "envs": {
    "bytes": 1352,
    "elements": 136,
    "number": 33
  },
  "gc": {
    "cycles": 1,
    "heapSize": 402915328,
    "totalBytes": 39504
  },
  "list": {
    "bytes": 8,
    "concats": 0,
    "elements": 1
  },
  "nrAvoided": 42,
  "nrExprs": 1714,
  "nrFunctionCalls": 24,
  "nrLookups": 5,
  "nrOpUpdateValuesCopied": 460,
  "nrOpUpdates": 1,
  "nrPrimOpCalls": 5,
  "nrThunks": 576,
  "sets": {
    "bytes": 18752,
    "elements": 1162,
    "number": 20
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 8,
    "Env": 8,
    "Value": 16
  },
  "symbols": {
    "bytes": 7821,
    "number": 766
  },
  "time": {
    "cpu": 0.03371499851346016,
    "gc": 0.0,
    "gcFraction": 0.0
  },
  "values": {
    "bytes": 11504,
    "number": 719
  }
}
```

Offset, determined via evaluating:

```json
{
  "cpuTime": 0.01800299994647503,
  "envs": {
    "bytes": 1032,
    "elements": 116,
    "number": 13
  },
  "gc": {
    "cycles": 1,
    "heapSize": 402915328,
    "totalBytes": 38224
  },
  "list": {
    "bytes": 8,
    "concats": 0,
    "elements": 1
  },
  "nrAvoided": 10,
  "nrExprs": 1714,
  "nrFunctionCalls": 4,
  "nrLookups": 4,
  "nrOpUpdateValuesCopied": 460,
  "nrOpUpdates": 1,
  "nrPrimOpCalls": 4,
  "nrThunks": 576,
  "sets": {
    "bytes": 17888,
    "elements": 1114,
    "number": 8
  },
  "sizes": {
    "Attr": 16,
    "Bindings": 8,
    "Env": 8,
    "Value": 16
  },
  "symbols": {
    "bytes": 7767,
    "number": 748
  },
  "time": {
    "cpu": 0.01800299994647503,
    "gc": 0.0,
    "gcFraction": 0.0
  },
  "values": {
    "bytes": 11184,
    "number": 699
  }
}
```
@roberth Not quite sure how to benchmark this. I am also questioning if using
```cpp
        "while evaluating the second argument passed to builtins.concatMapAttrs");
    auto inAttrs = args[1]->attrs();

    std::map<SymbolIdx, Value *> attrMap;
```
This must use `traceable_allocator` to make those allocations visible to the GC, I think. Otherwise they are invisible to Boehm and those values can be freed.

This actually should be much less of a concern after #13987.
```cpp
    Value * result = state.allocValue();
    result->mkApp(funApp, i.value);
```
What's the point of allocating the result on the heap? It could just be a variable on the stack, no?
```cpp
    for (const auto & [name, value] : attrMap) {
        attrs.insert(name, value);
    }
```
`std::ranges::copy`?
For large inputs, a k-way merge would make more sense. The `pkgs/by-name` attrset, if* implemented with `concatMapAttrs`, would become a mere linear-time operation; possibly faster than the current in-expression binary merge "tree" algorithm.

*: Nixpkgs could decide based on `builtins ? concatMapAttrs`
I think if we can expose the `Bindings::iterator` as a standalone k-way merge utility this is pretty easy to implement. We can resurrect your old k-way merge branch for this @roberth.
> if* implemented with concatMapAttrs

Wouldn't this also imply some overhead for the function calls? We could maybe implement an optimization for identity functions, though.
> resurrect [...] @roberth

> imply some overhead for the function calls?

I think it's naturally a `concatMapAttrs`, something like

```nix
concatMapAttrs
  (shard: type:
    lib.optionalAttrs (type == "directory") (
      readShard shard
    ))
  (readDir ../by-name)
```
No identity function needed.
> We could maybe implement an optimization for identity functions though

Might be useful for other cases, idk. Not the most promising, but tracking in
```cpp
        "while evaluating the second argument passed to builtins.concatMapAttrs");
    auto inAttrs = args[1]->attrs();

    std::map<SymbolIdx, Value *> attrMap;
```
It might make sense to use a small buffer optimization to avoid unnecessary allocations. As it stands now, this will actually be very costly.
Implements `concatMapAttrs` according to #11887

Motivation
This should allow efficient usage of `concatMapAttrs`, which is currently implemented in nixpkgs/lib. However, due to heavy performance overhead (n update (`//`) operations), it has never seen great adoption. Adding this function to primops may unlock faster mapping between attribute sets, which currently always requires an intermediate list (`attrsToList` -> `map` -> `listToAttrs`).
Context
see #11887
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.
Example ideas:

`filterAttrsRecursive` could be implemented with it, which saves the internal list conversion and concatenation. (Current implementation: https://github.com/NixOS/nixpkgs/blob/26833ad1dad83826ef7cc52e0009ca9b7097c79f/lib/attrsets.nix#L701)
There are many other functions in nixpkgs.lib that use an internal list conversion (`attrsToList` -> `map` -> `listToAttrs`) that could benefit in performance and memory allocation.