feat: add frequencies sketches#44
Conversation
Signed-off-by: Chojan Shang <[email protected]>
Signed-off-by: Chojan Shang <[email protected]>
Signed-off-by: Chojan Shang <[email protected]>
|
only 1 serde case failed on windows, ... on windows, long is 32‑bit...... |
|
Probably don’t need to bother with a dedicated implementation for longs. That was mostly for initial testing, I think. And if rust supports templates/generics with primitives then there’s no reason for a separate implementation for longs. We didn’t do one for c++, just provided a default serde for a few basic types. |
|
you're right.
I'll mark this pr as a draft for now. Later, I'll remove longsketch, update the tests, and skip the serde errors on Windows. |
Signed-off-by: Chojan Shang <[email protected]>
|
I need to see some Rust reviewers before I approve & merge this. |
|
I'll take a look here before the new year :P |
tisonkun
left a comment
There was a problem hiding this comment.
Rest LGTM. Mostly style comments to keep our codebase consistent.
Signed-off-by: Chojan Shang <[email protected]>
Signed-off-by: Chojan Shang <[email protected]>
|
Waiting for @PsiACE |
tisonkun
left a comment
There was a problem hiding this comment.
LGTM. You may leverage the new codec utils in a follow-up PR.
| if let Err(err) = sketch { | ||
| assert!(matches!(err, SerdeError::InsufficientData(_))); | ||
| assert_eq!(err.kind(), ErrorKind::InvalidData); | ||
| assert!( |
There was a problem hiding this comment.
nit: googletest has a contains matcher that can provide a better error message without manual construct.
https://docs.rs/googletest/latest/googletest/matchers/fn.contains.html
| let value = self.hash_map.get(item); | ||
| value.max(0) as u64 |
There was a problem hiding this comment.
I think the cpp impl uses u64 for reverse_purge_hash_map's value. But let's review this point as a follow-up.
Read https://github.com/apache/datasketches-cpp/blob/7bb979d3ef8929e235bcd22d67579e1f695f6ecd/fi/include/reverse_purge_hash_map_impl.hpp#L346-L364 especially for how subtract_and_keep_positive_only replaces:
self.adjust_all_values_by(-median);
self.keep_only_positive_counts();This is the only usage where i64 is needed in the current impl.
|
Cross-reference to #35. I don't know whether this sketch works well in |
Uh oh!
There was an error while loading. Please reload this page.