Website Fingerprinting
When browsing the web, many users would prefer to have privacy. Clients who wish to avoid behavioral marketing, tracking and surveillance could use an anonymizing proxy service such as Tor. Tor, however, is susceptible to website fingerprinting, wherein a local, passive adversary (such as your ISP or those who have access to your ISP's data) can identify a user's behavior from patterns in their packet sequence.
We have implemented new and old website fingerprinting attacks and defenses, in order to demonstrate the realistic threat of website fingerprinting and to defend against it. We have five papers describing our research and implementations. DL indicates a direct link and L indicates a link to the entry at the relevant publisher.
T. Wang and I. Goldberg. Improved Website Fingerprinting on Tor (L). WPES 2013.
T. Wang, X. Cai, R. Nithyanand, R. Johnson and I. Goldberg. Effective Attacks and Provable Defenses for Website Fingerprinting (DL). USENIX 2014.
X. Cai, R. Nithyanand, T. Wang, R. Johnson and I. Goldberg. A Systematic Approach to Developing and Evaluating Website Fingerprinting Defenses (L). CCS 2014.
T. Wang and I. Goldberg. On Realistically Attacking Tor with Website Fingerprinting (DL). PETS 2016.
T. Wang and I. Goldberg. Walkie-Talkie: An Effective and Efficient Defense for Website Fingerprinting (DL). USENIX 2017.
Download
You may need to do some editing to get some defenses to work with different data sets (for example, changing the folder names in the code). Please feel free to e-mail taow@cse.ust.hk with any questions.
Repository of implemented attacks (ZIP). Includes notes to explain the attacks. Note that some are based on other authors' works.
Repository of implemented defenses (ZIP). Includes notes to explain the defenses. Note that some are based on other authors' works.
Repository of data. Includes notes to explain the data. Most files are large.
ZIP containing code for the PETS 2016 work. 900MB.
WPES 2013
This work presents an improved version of Cai's OSAD attack, along with some of the first open-world results in the field. It shows that relatively simple changes to the algorithm can greatly reduce the error rate.
Results: In attacks/Wa-OSAD.py, which needs (from the same folder) Ca-OSAD.py, clLev.cpp, clgen_stratify.cpp, loaders.py. Uses data/levdata*.zip.
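For readers unfamiliar with OSAD (optimal string alignment distance), the following is a minimal sketch of the distance itself, applied to sequences of cell directions. It is not the code in Wa-OSAD.py or clLev.cpp: the function name osad, the cost parameters and their unit defaults are assumptions made for illustration, and the attack's actual cost settings are described in the paper.

```python
def osad(a, b, ins_cost=1.0, del_cost=1.0, sub_cost=1.0, trans_cost=1.0):
    """Optimal string alignment distance between sequences a and b.

    Illustrative only: the unit costs are placeholders, not the
    values used by the attack.
    """
    n, m = len(a), len(b)
    # d[i][j] = cost of turning the first i items of a into the first j of b
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * del_cost
    for j in range(1, m + 1):
        d[0][j] = j * ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            best = min(
                d[i - 1][j] + del_cost,      # delete a[i-1]
                d[i][j - 1] + ins_cost,      # insert b[j-1]
                d[i - 1][j - 1] + match,     # keep or substitute
            )
            # Transposition of two adjacent items
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                best = min(best, d[i - 2][j - 2] + trans_cost)
            d[i][j] = best
    return d[n][m]
```

Here a and b would be packet sequences such as [1, 1, -1, -1], with +1 for an outgoing cell and -1 for an incoming cell (an assumed encoding). Lowering trans_cost relative to the other costs makes reordered packets count for less than added or removed ones.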
USENIX 2014
This work presents a new attack based on a k-nearest neighbours algorithm with optimized distances learned using a custom local hill-climbing algorithm. It achieves higher accuracy and a lower false positive rate (FPR) than previous attacks, and is also significantly faster. It also presents a defense, Supersequence.
Results: In attacks/kNN.py, which needs fextractor.py, flearner.cpp. Supersequence is in defenses/cluster.py (warning: not updated, may not work well with new code). Uses data/knndata.zip.
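To illustrate the general idea of a k-nearest neighbours classifier with a weighted distance, here is a minimal sketch. It is not the code in kNN.py: the names weighted_l1 and knn_classify are hypothetical, the weights are taken as given (the weight-learning step implemented in flearner.cpp is not shown), and the plain majority vote is a simplification that may differ from the decision rule used in the paper.

```python
from collections import Counter


def weighted_l1(x, y, weights):
    """Weighted L1 distance between two feature vectors."""
    return sum(w * abs(xi - yi) for w, xi, yi in zip(weights, x, y))


def knn_classify(query, train, weights, k=3):
    """Classify query by majority vote among its k nearest training points.

    train is a list of (feature_vector, label) pairs. The weights are
    assumed to have been learned beforehand.
    """
    nearest = sorted(
        train, key=lambda item: weighted_l1(query, item[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A feature with weight 0 is ignored entirely, so learning the weights amounts to deciding which features of a packet sequence are informative for telling pages apart.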
CCS 2014
This work hypothesizes the minimum overhead of an effective defense against all website fingerprinting attacks, and implements Tamaraw as a proof of concept.
Results: In defenses/tamaraw.py.
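As a rough illustration of the kind of defense Tamaraw is, the sketch below sends cells at a fixed rate in each direction and pads the number of cells in each direction up to a multiple of a parameter L. It is not the code in tamaraw.py. The function name tamaraw_sketch and the default values of rho_out, rho_in and L are placeholders chosen for illustration, and the sketch assumes all real cells are queued from the start, so it ignores the original packet timing.

```python
import math


def tamaraw_sketch(directions, rho_out=0.04, rho_in=0.012, L=100):
    """Return a defended schedule as a sorted list of (time, direction).

    directions is a sequence of +1 (outgoing) and -1 (incoming) cells.
    Parameter values are illustrative placeholders. Original timing is
    ignored: every real cell is assumed to be ready immediately.
    """
    n_out = sum(1 for d in directions if d > 0)
    n_in = sum(1 for d in directions if d < 0)

    def padded(n):
        # Round the cell count up to the next multiple of L
        return math.ceil(n / L) * L

    # One cell (real or dummy) per fixed interval in each direction
    out_cells = [(i * rho_out, 1) for i in range(padded(n_out))]
    in_cells = [(i * rho_in, -1) for i in range(padded(n_in))]
    return sorted(out_cells + in_cells)
```

Under these assumptions, the defended sequence depends only on the padded cell counts in each direction, so any two pages whose counts round up to the same multiples of L produce identical output. The bandwidth overhead is the number of dummy cells divided by the number of real cells.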
PETS 2016
This work addresses several issues that arise when website fingerprinting is carried out in realistic settings, including noise, splitting, and training set maintenance, and proposes new methods to tackle them.
Results: In realpublish.zip.
USENIX 2017
This work proposes Walkie-Talkie, an effective defense against all website fingerprinting attacks, with low overhead achieved by applying half-duplex mode to browser page loading.
There are several ways to use these results:
- Download data/walkiebatch-defended.zip. This is a defended data set in cell format. Any of the attacks should work on this cell format (when extracted into the correct folder). This will let you check the accuracy of various attacks on this data.
- Download defenses/walkie-padding.zip. This is a set of code to test and implement various burst-padding mechanisms. They only work on burst data, which is included.
- Download walkie-browser.zip, which is an implementation of Walkie-Talkie on a (quite old) version of Tor Browser. This is HTTP/1.1 only. Sirinam et al. have a better implementation, but I don't know if anyone has an updated implementation on HTTP/2.
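To make the notion of burst data more concrete, here is a minimal sketch of how a half-duplex cell sequence can be summarized as bursts, and how two burst sequences can be padded to look the same. It is not the code in walkie-padding.zip. The names to_bursts and mold are hypothetical, the +1/-1 direction encoding is an assumption about the cell format, and taking the elementwise maximum is only one simple burst-padding rule.

```python
from itertools import groupby


def to_bursts(directions):
    """Summarize a +1/-1 cell sequence as (outgoing, incoming) burst pairs."""
    # Collapse the sequence into runs of same-direction cells
    runs = [
        (sign, sum(1 for _ in group))
        for sign, group in groupby(directions, key=lambda d: 1 if d > 0 else -1)
    ]
    bursts = []
    i = 0
    # A sequence that starts with incoming cells has an empty outgoing part
    if runs and runs[0][0] < 0:
        bursts.append((0, runs[0][1]))
        i = 1
    while i < len(runs):
        out_count = runs[i][1]
        in_count = runs[i + 1][1] if i + 1 < len(runs) else 0
        bursts.append((out_count, in_count))
        i += 2
    return bursts


def mold(bursts_a, bursts_b):
    """Pad two burst sequences to a common one (elementwise maximum)."""
    n = max(len(bursts_a), len(bursts_b))
    a = list(bursts_a) + [(0, 0)] * (n - len(bursts_a))
    b = list(bursts_b) + [(0, 0)] * (n - len(bursts_b))
    return [(max(ao, bo), max(ai, bi)) for (ao, ai), (bo, bi) in zip(a, b)]
```

If two pages are both padded to mold(a, b), an observer who sees only burst sizes cannot tell which of the two was loaded. The cost is the number of dummy cells needed to reach the common sequence.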