Wednesday, June 16, 2010

suffix/prefix expressions in Google Safe Browsing

Google Safe Browsing hashes host-suffix/path-prefix expressions, rather than whole URLs, to represent the entries on its blacklist and malware list.

To check a URL such as http://www.google.com/header/x.html against the lists, the client tries each combination of host suffix and path prefix, for example:
google.com/
google.com/header/
google.com/header/x.html
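
A rough sketch of how those combinations could be generated is below. The function name `candidate_expressions` is made up for illustration, and the sketch is simplified: it ignores query strings, URL canonicalization, and any limits the real protocol places on the number of host or path components. It also tries the full host (www.google.com) in addition to the shorter suffix shown above.

```python
from urllib.parse import urlsplit

def candidate_expressions(url):
    """Return host-suffix/path-prefix expressions for a URL (simplified sketch)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"

    # Host suffixes: the full host, then drop leading labels one at a time,
    # always keeping at least the last two labels (e.g. google.com).
    labels = host.split(".")
    hosts = [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]

    # Path prefixes: the root, each intermediate directory, then the full path.
    segments = [s for s in path.split("/") if s]
    prefixes = ["/"]
    for i in range(1, len(segments)):
        prefixes.append("/" + "/".join(segments[:i]) + "/")
    if path != "/":
        prefixes.append(path)

    return [h + p for h in hosts for p in prefixes]

if __name__ == "__main__":
    for expr in candidate_expressions("http://www.google.com/header/x.html"):
        print(expr)
```

Each expression returned this way would then be hashed and compared against the downloaded list.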

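In the original design, the client downloads only the first 4 bytes of each hash. When a local lookup matches one of these 4-byte prefixes, the client contacts the Google server again to download the full 32-byte hash and confirm the match.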
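
The two-step lookup might look roughly like this. It assumes the 32-byte hash is SHA-256, and `fetch_full_hashes` is a hypothetical callback standing in for the second request to the server; neither the function names nor the data structures come from Google's client.

```python
import hashlib

PREFIX_LEN = 4  # bytes of each hash kept in the local list

def full_hash(expression):
    """32-byte digest of an expression (assumes SHA-256)."""
    return hashlib.sha256(expression.encode("utf-8")).digest()

def is_listed(expression, local_prefixes, fetch_full_hashes):
    """Check an expression against local 4-byte prefixes, then confirm remotely.

    local_prefixes: set of 4-byte prefixes downloaded in advance.
    fetch_full_hashes: hypothetical callable that takes a prefix and returns
        the set of full 32-byte hashes the server holds for it.
    """
    digest = full_hash(expression)
    prefix = digest[:PREFIX_LEN]
    if prefix not in local_prefixes:
        return False  # no local hit, so no network request is needed
    # A prefix hit may be a collision, so confirm with the full hash.
    return digest in fetch_full_hashes(prefix)
```

Keeping only short prefixes locally makes the downloaded list small, and the server is contacted only for the few URLs whose prefix happens to match.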
