Package org.elasticsearch.common.util
Class SetBackedScalingCuckooFilter
- java.lang.Object
-
- org.elasticsearch.common.util.SetBackedScalingCuckooFilter
-
- All Implemented Interfaces:
Writeable
public class SetBackedScalingCuckooFilter extends java.lang.Object implements Writeable
An approximate set membership datastructure that scales as more unique values are inserted. Can definitively say if a member does not exist (no false negatives), but may say an item exists when it does not (has false positives). Similar in usage to a Bloom Filter. Internally, the datastructure maintains a Set of hashes up to a specified threshold. This provides 100% accurate membership queries. When the threshold is breached, a list of CuckooFilters are created and used to track membership. These filters are approximate similar to Bloom Filters. This datastructure scales as more values are inserted by growing the list of CuckooFilters. Final size is dependent on the cardinality of data inserted, and the precision specified.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.elasticsearch.common.io.stream.Writeable
Writeable.Reader<V>, Writeable.Writer<V>
-
-
Constructor Summary
Constructors Constructor Description SetBackedScalingCuckooFilter(int threshold, java.util.Random rng, double fpp)
SetBackedScalingCuckooFilter(StreamInput in, java.util.Random rng)
SetBackedScalingCuckooFilter(SetBackedScalingCuckooFilter other)
-
Method Summary
Modifier and Type Method Description void
add(long value)
Add's the provided value to the set for trackingvoid
add(org.apache.lucene.util.BytesRef value)
Add's the provided value to the set for trackingboolean
equals(java.lang.Object other)
long
getSizeInBytes()
Get the approximate size of this datastructure.int
hashCode()
void
merge(SetBackedScalingCuckooFilter other)
Merge `other` cuckoo filter into this cuckoo.boolean
mightContain(long value)
Returns true if the set might contain the provided value, false otherwise.boolean
mightContain(org.apache.lucene.util.BytesRef value)
Returns true if the set might contain the provided value, false otherwise.void
registerBreaker(java.util.function.Consumer<java.lang.Long> breaker)
Registers a circuit breaker with the datastructure.void
writeTo(StreamOutput out)
Write this into the StreamOutput.
-
-
-
Constructor Detail
-
SetBackedScalingCuckooFilter
public SetBackedScalingCuckooFilter(int threshold, java.util.Random rng, double fpp)
- Parameters:
threshold
- The number of distinct values that should be tracked before converting to an approximate representationrng
- A random number generator needed for the cuckoo hashing processfpp
- the false-positive rate that should be used for the cuckoo filters.
-
SetBackedScalingCuckooFilter
public SetBackedScalingCuckooFilter(SetBackedScalingCuckooFilter other)
-
SetBackedScalingCuckooFilter
public SetBackedScalingCuckooFilter(StreamInput in, java.util.Random rng) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
writeTo
public void writeTo(StreamOutput out) throws java.io.IOException
Description copied from interface:Writeable
Write this into the StreamOutput.
-
registerBreaker
public void registerBreaker(java.util.function.Consumer<java.lang.Long> breaker)
Registers a circuit breaker with the datastructure. CuckooFilter's can "saturate" and refuse to accept any new values. When this happens, the datastructure scales by adding a new filter. This new filter's bytes will be tracked in the registered breaker when configured.
-
mightContain
public boolean mightContain(org.apache.lucene.util.BytesRef value)
Returns true if the set might contain the provided value, false otherwise. False values are 100% accurate, while true values may be a false-positive.
-
mightContain
public boolean mightContain(long value)
Returns true if the set might contain the provided value, false otherwise. False values are 100% accurate, while true values may be a false-positive.
-
add
public void add(org.apache.lucene.util.BytesRef value)
Add's the provided value to the set for tracking
-
add
public void add(long value)
Add's the provided value to the set for tracking
-
getSizeInBytes
public long getSizeInBytes()
Get the approximate size of this datastructure. Approximate because only the Set occupants are tracked, not the overhead of the Set itself.
-
merge
public void merge(SetBackedScalingCuckooFilter other)
Merge `other` cuckoo filter into this cuckoo. After merging, this filter's state will be the union of the two. During the merging process, the internal Set may be upgraded to a cuckoo if it goes over threshold
-
hashCode
public int hashCode()
- Overrides:
hashCode
in classjava.lang.Object
-
equals
public boolean equals(java.lang.Object other)
- Overrides:
equals
in classjava.lang.Object
-
-