org.xlattice.crypto.filters
Class BloomSHA1

java.lang.Object
  extended by org.xlattice.crypto.filters.BloomSHA1

public class BloomSHA1
extends Object

A Bloom filter for sets of SHA1 digests. A Bloom filter uses a set of k hash functions to determine set membership. Each hash function produces a value in the range 0..M-1. The filter is of size M. To add a member to the set, apply each function to the new member and set the corresponding bit in the filter. For M very large relative to k, this will normally set k bits in the filter. To check whether x is a member of the set, apply each of the k hash functions to x and check whether the corresponding bits are set in the filter. If any are not set, x is definitely not a member. If all are set, x may be a member. The probability of error (the false positive rate) is f = (1 - e^(-kN/M))^k, where N is the number of set members. This class takes advantage of the fact that SHA1 digests are good- quality pseudo-random numbers. The k hash functions are the values of distinct sets of bits taken from the 20-byte SHA1 hash. The number of bits in the filter, M, is constrained to be a power of 2; M == 2^m. The number of bits in each hash function may not exceed floor(m/k). This class is designed to be thread-safe, but this has not been exhaustively tested.

Author:
< A HREF="mailto:jddixon@users.sourceforge.net">Jim Dixon BloomSHA1.java and KeySelector.java are BSD licensed from the xlattice app - http://xlattice.sourceforge.net/ minor tweaks by jrandom, exposing unsynchronized access and allowing larger M and K. changes released into the public domain. Note that this is used only by DecayingBloomFilter, which uses only the unsynchronized locked_foo() methods. Deprecated for use outside of the router; to be moved to router.jar. As of 0.8.11, the locked_foo() methods are thread-safe, in that they work, but there is a minor risk of false-negatives if two threads are accessing the same bloom filter integer.

Nested Class Summary
static class BloomSHA1.FilterKey
          Store the (opaque) bloom filter offsets for reuse.
 
Constructor Summary
BloomSHA1()
          Creates a filter of 2^20 bits with k defaulting to 8.
BloomSHA1(int m)
          Creates a filter of 2^m bits, with the number of 'hash functions" k defaulting to 8.
BloomSHA1(int m, int k)
          Creates a filter with 2^m bits and k 'hash functions', where each hash function is portion of the 160-bit SHA1 hash.
 
Method Summary
 int capacity()
           
 void clear()
          Synchronized version
 double falsePositives()
           
 double falsePositives(int n)
           
 BloomSHA1.FilterKey getFilterKey(byte[] b, int offset, int len)
          Get the bloom filter offsets for reuse.
 void insert(byte[] b)
          Add a key to the set represented by the filter.
 void insert(byte[] b, int offset, int len)
           
 void locked_insert(BloomSHA1.FilterKey fk)
          Add the key to the filter.
 void locked_insert(byte[] b)
           
 void locked_insert(byte[] b, int offset, int len)
           
 boolean locked_member(BloomSHA1.FilterKey fk)
          Is the key in the filter.
 boolean locked_member(byte[] b)
           
 boolean locked_member(byte[] b, int offset, int len)
           
 boolean member(byte[] b)
          Is a key in the filter.
 boolean member(byte[] b, int offset, int len)
           
 void release(BloomSHA1.FilterKey fk)
           
 int size()
          Returns the number of keys which have been inserted.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BloomSHA1

public BloomSHA1(int m,
                 int k)
Creates a filter with 2^m bits and k 'hash functions', where each hash function is portion of the 160-bit SHA1 hash.

Parameters:
m - determines number of bits in filter
k - number of hash functionsx See KeySelector for important restriction on max m and k

BloomSHA1

public BloomSHA1(int m)
Creates a filter of 2^m bits, with the number of 'hash functions" k defaulting to 8.

Parameters:
m - determines size of filter

BloomSHA1

public BloomSHA1()
Creates a filter of 2^20 bits with k defaulting to 8.

Method Detail

clear

public void clear()
Synchronized version


size

public final int size()
Returns the number of keys which have been inserted. This class (BloomSHA1) does not guarantee uniqueness in any sense; if the same key is added N times, the number of set members reported will increase by N.

Returns:
number of set members

capacity

public final int capacity()
Returns:
number of bits in filter

insert

public void insert(byte[] b)
Add a key to the set represented by the filter. XXX This version does not maintain 4-bit counters, it is not a counting Bloom filter.

Parameters:
b - byte array representing a key (SHA1 digest)

insert

public void insert(byte[] b,
                   int offset,
                   int len)

locked_insert

public final void locked_insert(byte[] b)

locked_insert

public final void locked_insert(byte[] b,
                                int offset,
                                int len)

locked_member

public final boolean locked_member(byte[] b)

locked_member

public final boolean locked_member(byte[] b,
                                   int offset,
                                   int len)

member

public final boolean member(byte[] b)
Is a key in the filter. External interface, internally synchronized.

Parameters:
b - byte array representing a key (SHA1 digest)
Returns:
true if b is in the filter

member

public final boolean member(byte[] b,
                            int offset,
                            int len)

getFilterKey

public BloomSHA1.FilterKey getFilterKey(byte[] b,
                                        int offset,
                                        int len)
Get the bloom filter offsets for reuse. Caller should call release(rv) when done with it.

Since:
0.8.11

locked_insert

public void locked_insert(BloomSHA1.FilterKey fk)
Add the key to the filter.

Since:
0.8.11

locked_member

public boolean locked_member(BloomSHA1.FilterKey fk)
Is the key in the filter.

Since:
0.8.11

release

public void release(BloomSHA1.FilterKey fk)
Since:
0.8.11

falsePositives

public final double falsePositives(int n)
Parameters:
n - number of set members
Returns:
approximate false positive rate

falsePositives

public final double falsePositives()