Overview
I was searching for a very fast hash function to quickly fingerprint hundreds of thousands of files, and I stumbled upon xxHash. It ranks among the fastest hash functions. You can see the benchmark here.
Required libraries
- Download the Java implementation of xxHash, which ships inside the lz4-java jar, and add it to your classpath. It can be found at https://repo1.maven.org/maven2/net/jpountz/lz4/lz4/1.3/lz4-1.3.jar
- The Javadoc can be found at https://jpountz.github.io/lz4-java/1.3.0/docs/
Hash a string
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import net.jpountz.xxhash.StreamingXXHash32;
import net.jpountz.xxhash.XXHashFactory;

// Class name is arbitrary.
public class HashString {
    public static void main(String[] args) {
        XXHashFactory factory = XXHashFactory.fastestInstance();
        try {
            byte[] data = "12345345234572".getBytes(StandardCharsets.UTF_8);
            ByteArrayInputStream in = new ByteArrayInputStream(data);

            // Used to initialize the hash value. Use whatever value you want,
            // but always the same one, or the hashes will not be comparable.
            int seed = 0x9747b28c;
            StreamingXXHash32 hash32 = factory.newStreamingHash32(seed);

            // For real-world usage, use a larger buffer, like 8192 bytes.
            byte[] buf = new byte[8];
            int read;
            while ((read = in.read(buf)) != -1) {
                // Feed only the bytes that were actually read.
                hash32.update(buf, 0, read);
            }

            int hash = hash32.getValue();
            System.out.println(hash); // Expected output: 506742924
        } catch (IOException ex) {
            System.out.println(ex);
        }
    }
}
Hash a file
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import net.jpountz.xxhash.StreamingXXHash32;
import net.jpountz.xxhash.XXHashFactory;

// Class name is arbitrary.
public class HashFile {
    public static void main(String[] args) {
        XXHashFactory factory = XXHashFactory.fastestInstance();

        // Used to initialize the hash value. Use whatever value you want,
        // but always the same one, or the hashes will not be comparable.
        int seed = 0x9747b28c;
        StreamingXXHash32 hash32 = factory.newStreamingHash32(seed);

        // try-with-resources closes the stream even if a read fails.
        try (FileInputStream fileInputStream =
                 new FileInputStream(new File("C:\\temp\\Xuan\\test.txt"))) {
            byte[] bufferBlock = new byte[8192]; // 8192 bytes
            int read;
            while ((read = fileInputStream.read(bufferBlock)) != -1) {
                // Feed only the bytes that were actually read.
                hash32.update(bufferBlock, 0, read);
            }

            int hash = hash32.getValue();
            System.out.println(hash); // Signed 32-bit hash; the value depends on the file contents
        } catch (IOException ex) {
            System.out.println(ex);
        }
    }
}
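Both examples print the hash as a signed Java int, so a hash can show up as a negative number. If you store fingerprints or compare them by eye, an unsigned or hexadecimal form may be more convenient. Here is a minimal sketch of that conversion, assuming Java 8 or later. The class HashFormat and its method names are made up for illustration and are not part of lz4-java; only standard JDK calls are used.

```java
public class HashFormat {

    /** Renders a 32-bit hash as 8 lowercase hex digits, zero-padded. */
    public static String toHex(int hash) {
        return String.format("%08x", hash);
    }

    /** Widens a 32-bit hash to its unsigned value, stored in a long. */
    public static long toUnsigned(int hash) {
        return Integer.toUnsignedLong(hash);
    }

    public static void main(String[] args) {
        // The value printed by the "Hash a string" example.
        int positive = 506742924;
        System.out.println(toHex(positive));      // 1e34488c
        System.out.println(toUnsigned(positive)); // 506742924

        // A hash with the top bit set prints as a negative int.
        int negative = -1;
        System.out.println(toHex(negative));      // ffffffff
        System.out.println(toUnsigned(negative)); // 4294967295
    }
}
```

In the examples above, you would pass the result of hash32.getValue() to toHex or toUnsigned instead of printing the raw int.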
GitHub
- https://github.com/xuanngo2001/java-xxhash