Research Article | Open Access
Renzhi Tang, Guowei Shen, Chun Guo, Yunhe Cui, "SAD: Website Fingerprinting Defense Based on Adversarial Examples", Security and Communication Networks, vol. 2022, Article ID 7330465, 12 pages, 2022. https://doi.org/10.1155/2022/7330465
SAD: Website Fingerprinting Defense Based on Adversarial Examples
Abstract
Website fingerprinting (WF) attacks can infer the name of a website from encrypted network traffic while the victim is browsing it. The inherent defenses of anonymous communication systems such as The Onion Router (Tor) cannot withstand current WF attacks; the state-of-the-art deep learning-based attack achieves over 98% accuracy on Tor traffic. Most existing defenses offer strong protection but incur a relatively high bandwidth overhead, which seriously degrades the user's network experience, while some lighter defenses have little effect on the latest WF attacks. Defenses based on adversarial examples combine strong protection with low bandwidth overhead, but they need the complete website traffic trace before they can generate defense data, which is obviously impractical. In this article, we propose segmented adversarial defense (SAD), a defense based on adversarial examples against deep learning-based WF attacks. SAD divides the sequence data into multiple segments so that it remains feasible in real scenarios, generates adversarial examples for each segment, and inserts the resulting dummy packets after each segment of original data. We also find that tuning the head rate, that is, the end point of the segmented region, yields better results. Our experimental results show that SAD can effectively reduce the accuracy of WF attacks: it drops the accuracy of the state-of-the-art attack from 96% to 3% while incurring only 40% bandwidth overhead. Compared with the existing defense named Deep Fingerprinting Defender (DFD), SAD defends better at the same bandwidth overhead.
1. Introduction
Anonymous communication systems can protect users' information through data encryption and multi-hop proxies; The Onion Router (Tor) [1] is the most widely used such system.
In the past, attackers extracted a variety of packet-level features from encrypted traffic, such as packet number, order, direction, and unique lengths, and then used machine learning for classification [3, 5]. With the success of deep learning in fields such as image classification [6], speech recognition [7], and intrusion detection [8], deep learning-based WF attacks were proposed [9–11], which no longer require manual feature extraction.
In the meantime, some WF defenses have been proposed, such as WTF-PAD [12], Walkie-Talkie (W-T) [13], and DFD [14].
In order to achieve excellent defensive performance, low bandwidth overhead, and availability on real-time traffic, we propose SAD, which is based on adversarial examples and the head rate. SAD consists of two components: an adversarial example generation model and a deep learning-based WF attack model. The former generates adversarial examples according to the direction sequence of a flow; the attack model acts as the adversary of the generation model, similar to a GAN [18].
2. Related Work
2.1. Website Fingerprinting Attack
WF attacks initially required manual feature extraction. Now, deep learning models based on convolutional neural networks (CNNs) are mainly used instead, as they require only packet direction information. Table 1 summarizes the website fingerprinting attacks that are widely recognized by researchers.
Herrmann et al. [19] applied a multinomial naïve Bayes classifier to this task, and subsequent work [3–5, 20] improved the hand-crafted features and classifiers.
With the successful application of deep learning in other fields, many researchers now use it for website fingerprinting attacks. The obvious benefit is that it does not require manual feature extraction. To skip that step, Rimmer et al. [9] proposed AWF, an automated website fingerprinting attack based on deep learning; subsequently, Var-CNN [10], which builds on ResNet [21], and DF [11] pushed the accuracy higher.
2.2. Website Fingerprinting Defense
WF defenses counter WF attacks by inserting dummy packets. This approach brings bandwidth overhead, and Tor's bandwidth resources are precious, so researchers have focused on reducing bandwidth overhead while improving defensive capability. Different methods use different insertion rules, but these rules come from two broad sources: rules hand-crafted by the authors based on their own expertise, and rules derived from attack methods, as shown in Table 2.
Buffered Fixed-Length Obfuscation (BuFLO) [22] sends fixed-length packets at fixed intervals, which yields strong protection at very high bandwidth overhead; later variants such as Tamaraw [23] and CS-BuFLO [24] reduce this cost.
The latest defense method, DFD [14], injects dummy packets into the original traffic using an adversarial learning-based approach.
3. Preliminary
3.1. Threat Model
Tor protects the browsing information of the user through data encryption and multi-hop proxies. Its working principle is shown in Figure 1. In general, the user randomly selects three relay nodes, a guard node, a middle node, and an exit node, from the globally active relay nodes. These selected nodes form a communication link, and all traffic through the link is encrypted. The goal of the WF attacker is to recover the name of the website the user visits from this encrypted traffic.
To be specific, the attacker monitors and saves the traffic between the guard node and the user, then uses deep learning on this traffic to infer the website name. The attacker generally needs only the direction sequence of the traffic packets: outgoing (client to server) and incoming (server to client) packets are represented as +1 and −1, respectively. For a piece of data, the total count of +1 and −1 values is the effective length of the data. However, the attacker cannot collect data for every website, so we mark the target websites as monitored sites and all other sites as unmonitored sites. Accordingly, WF attacks and WF defenses are evaluated in two experimental settings: close-world and open-world. Close-world assumes that users only access monitored websites; the open-world setting is closer to the real scene, as it allows the user to access any website.
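For concreteness, here is a minimal sketch (our own illustration, not code from the paper) of the direction-sequence representation and the effective length:

```python
# A trace is a zero-padded vector of +1 (outgoing) and -1 (incoming)
# packet directions; the effective length counts the real packets.

def effective_length(trace):
    """Number of nonzero (real) packets in a zero-padded sequence."""
    return sum(1 for d in trace if d != 0)

trace = [1, -1, -1, 1, -1, 0, 0, 0]  # 5 real packets, 3 padding slots
assert effective_length(trace) == 5
```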
3.2. Defense Model
The goal of the defender is to reduce the accuracy of WF attacks, and one of the best ways to do so is to change the input data of the WF attack model. Two operations are available: inserting dummy packets and delaying packets. Delaying packets has a greater negative impact on the network than inserting dummy packets, so the former is usually adopted by WF defenses. In addition, there are two scenarios for WF defense: black-box and white-box. Black-box means that the structure and parameters of the WF attack model are unknown; we only know the correspondence between the model's inputs and outputs. White-box means that the structure and parameters of the attack model are available, so we can interact deeply with it.
3.3. Adversarial Examples
Adversarial examples are carefully designed inputs to a deep learning model that cause the model to output incorrect predictions. They were first discovered in the field of image classification [25], and Goodfellow et al. [15] later explained the phenomenon and proposed a fast generation method. WF-GAN [16] and Mockingbird [17] brought adversarial examples into WF defense.
4. Defense Design
In this section, we present the design details of SAD: the method of processing the data, the detailed design of the deep learning model, and the process of training the model.
Figure 2 represents the overall process of SAD: the flow is divided into segments of different lengths, dummy packets are inserted after each segment, and the resulting adversarial example is the output of SAD. The head rate h means that we only generate adversarial examples for the first h percent of each data item. We define the dataset D = {(x_i, y_i), i = 1, ..., n}, where x_i is a direction sequence, l_i is the effective length of x_i, and n is the number of tagged data items in D.
4.1. SAD Model Design
4.1.1. Effective Length Alignment at the Stop Segmentation Point
This step lets us train the model quickly even though the effective length of each data item differs. Algorithm 1 describes the detailed processing: each data item in the dataset is shifted to the right by its own distance so that, finally, all items share a uniform stop segmentation point. This is the key to speeding up model training.
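A minimal sketch of this alignment (our own reconstruction of the idea behind Algorithm 1, not the authors' code, and assuming every effective length is at most the stop index): each zero-padded sequence is shifted right so that its last real packet lands on a common stop index.

```python
# Sketch: right-shift each zero-padded direction sequence so that all
# sequences end their effective data at the same stop index.

def align_to_stop(traces, stop):
    """Shift each trace right so its last real packet sits at index stop-1."""
    aligned = []
    for trace in traces:
        eff = sum(1 for d in trace if d != 0)   # effective length
        shift = stop - eff                      # distance to move right
        aligned.append([0] * shift + trace[:len(trace) - shift])
    return aligned

traces = [[1, -1, 0, 0, 0, 0],      # effective length 2
          [1, -1, -1, 1, 0, 0]]     # effective length 4
for t in align_to_stop(traces, stop=4):
    print(t)   # both now end their real data at index 3
```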
4.1.2. Sequence Data Segmentation
In the existing public datasets, each piece of data is the packet direction sequence generated during one complete visit to a website. We segment these sequences to simulate real packet transmission; Algorithm 2 describes the process in detail. Segmentation cuts the sequence into pieces of different lengths according to different segmented packet numbers (spn), where the SPN set consists of several different spn values. The segmented adversarial examples are inserted after each segment of original data until the stop point determined by the head rate is reached. Finally, the segmented sequence and the segmented adversarial examples are connected in order to form the complete adversarial example.
4.1.3. Generating Adversarial Examples
SAD contains multiple submodels with similar structures; the generation of adversarial examples is shown in Figure 3. The number of submodels equals the number of preset spn values; in other words, each submodel takes a corresponding spn as the size of its input feature dimension. Each submodel is a simple multilayer fully connected neural network, and its output activation, given in formula (1), maps each output value to a packet direction. To shorten training time and improve the robustness of the model, normalization and regularization layers are used. The input to a submodel is one segment of sequence data, and the feature dimension of its output is spn × r, where r is the segment injection rate; together with the head rate h, this strictly controls the bandwidth overhead. Even so, the actual bandwidth overhead is slightly smaller than the preset one, because we tend to drop the last segment to ensure that the adversarial example remains valid. Significantly, the output of a submodel contains only −1 or 1: formula (1) applies a sign mapping to the raw output, so the submodel output directly represents the directions of the dummy packets. The main function of a submodel is thus to generate the adversarial example corresponding to its segmented sequence; the outputs of multiple submodels are spliced with the segmented sequence to produce the complete adversarial example, and the process is shown in Algorithm 2.
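The paper does not give the exact layer configuration, so the following PyTorch sketch is our own reconstruction: a small fully connected network whose ±1 output comes from a sign mapping with a straight-through gradient (the gradient rule described in Section 4.1.4). The layer sizes and the tanh hidden activation are assumptions.

```python
import torch
import torch.nn as nn

class SignSTE(torch.autograd.Function):
    """Map logits to ±1; pass the incoming gradient straight through."""
    @staticmethod
    def forward(ctx, x):
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output          # input gradient used as output gradient

class SubModel(nn.Module):
    """One SAD submodel: segment of length spn in, spn*r dummy directions out."""
    def __init__(self, spn, r, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(spn, hidden), nn.Tanh(),
            nn.Linear(hidden, int(spn * r)),
        )

    def forward(self, segment):
        return SignSTE.apply(self.net(segment))

# One 128-packet segment -> 64 dummy directions at r = 0.5.
model = SubModel(spn=128, r=0.5)
print(model(torch.randn(1, 128)).shape)   # torch.Size([1, 64])
```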
The input dimension sizes of the submodels correspond one-to-one to the elements of the SPN set. First, we randomly select an spn from the SPN set. Second, we choose the submodel whose input dimension size is spn. Finally, the next subsequence of length spn is taken from the original sequence and fed into the selected submodel to obtain its output. In other words, the first segmented sequence is put into its submodel to produce the first segment of dummy packets; the second selected submodel receives the second subsequence and produces the second segment of dummy packets; and so on. Finally, the adversarial outputs of all submodels are spliced with the segmented sequence to form the complete adversarial example, in the form segment, dummy packets, segment, dummy packets, and so on.
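A sketch of this generation loop (our own illustration; the toy `subs` lambdas stand in for the trained submodels of the previous sketch):

```python
import random

def generate_adversarial(trace, submodels, head_rate):
    """Interleave each segment with dummy directions from the submodel
    whose input size matches the segment length (spn)."""
    eff = sum(1 for d in trace if d != 0)
    stop = int(eff * head_rate)      # only the first head_rate fraction
    pos, out = 0, []
    spns = sorted(submodels)
    while pos + spns[0] <= stop:
        fitting = [s for s in spns if pos + s <= stop]
        spn = random.choice(fitting)          # pick a submodel at random
        segment = trace[pos:pos + spn]
        out.extend(segment)
        out.extend(submodels[spn](segment))   # its dummy packet directions
        pos += spn
    out.extend(trace[pos:])                   # tail is left untouched
    return out

# Toy stand-ins for trained submodels: reversed directions per segment.
subs = {2: lambda seg: [-seg[-1]], 4: lambda seg: [-seg[0], -seg[-1]]}
print(generate_adversarial([1, -1, 1, 1, -1, -1, 1, -1], subs, 1.0))
```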
4.1.4. SAD Model Training
First, SAD outputs adversarial examples from the original data. Then, the adversarial examples are fed to the attack model to compute its classification loss. The difference from ordinary training is that our goal is to make this loss larger, and we only update the parameters of SAD; in this way, the accuracy of the attack model decreases on the adversarial examples generated by SAD. The attack model can be any deep learning-based WF attack model. Significantly, we use a sign function to map the SAD output to −1 or 1. Because this function is not differentiable, we use the input gradient as the output gradient (a straight-through estimator).
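A minimal sketch of one SAD training step under these rules (our own code; `sad` is the generator and `attack_model` is any frozen deep learning-based WF classifier):

```python
import torch
import torch.nn.functional as F

def sad_training_step(sad, attack_model, optimizer, traces, labels):
    """Update only SAD's parameters so that the frozen attack model's
    classification loss on the defended traces grows."""
    attack_model.eval()
    for p in attack_model.parameters():
        p.requires_grad_(False)            # the attacker is fixed

    adv = sad(traces)                      # defended direction sequences
    loss = F.cross_entropy(attack_model(adv), labels)

    optimizer.zero_grad()                  # optimizer holds SAD params only
    (-loss).backward()                     # maximize the attacker's loss
    optimizer.step()
    return loss.item()
```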
5. Experiment and Analysis
5.1. Dataset
In this article, the experimental dataset is the one collected by Sirinam et al. [11].
5.1.1. Plain Data
They visited the Alexa top 100 sites in the close-world setting, visiting each website 1250 times, and then kept only websites with at least 1000 valid visits after filtering out invalid data, leaving 95 websites. In the open-world setting, they visited 50000 Alexa websites, excluding the 100 websites in the close-world dataset, and obtained 40716 unmonitored website traces after filtering invalid data.
5.1.2. DFD Data
In order to evaluate the defense performance of DFD and compare it with SAD, the plain data are used to generate defense data according to the DFD algorithm, with 100% client-side injection at several injection rates; the resulting defense data span a range of average bandwidth overheads.
5.1.3. W-T and WTF-PAD Data
For WTF-PAD [12] and W-T [13], we use the defended datasets that Sirinam et al. [11] generated when evaluating DF.
5.1.4. Data Representation
This traffic can be expressed as a sequence of packet records; reduced to directions, each element is +1 or −1, where +1 (−1) means an outgoing (incoming) packet. On this foundation, Sirinam et al. [11] fixed the input length to 5000, padding shorter sequences with 0 and truncating longer ones.
5.2. Performance Index
We define three performance indices: attack success rate (ASR), defense success rate (DSR), and bandwidth overhead (BO): ASR = Nc / N, DSR = Ne / N, and BO = Nd / Nv, where N is the number of browsing records contained in the dataset, Nc is the number of records correctly identified by the WF attack model, Ne is the number of records misidentified by the WF attack model, Nd is the number of inserted dummy packets, and Nv is the number of valid packets in the source data.
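For concreteness, a direct transcription of these three indices (the symbol names are ours):

```python
def metrics(n_total, n_correct, n_dummy, n_valid):
    """ASR, DSR, and BO as defined above."""
    asr = n_correct / n_total                 # attack success rate
    dsr = (n_total - n_correct) / n_total     # defense success rate
    bo = n_dummy / n_valid                    # bandwidth overhead
    return asr, dsr, bo

print(metrics(1000, 30, 4000, 10000))   # e.g. ASR 3%, DSR 97%, BO 40%
```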
5.3. Result and Discussion
5.3.1. WF Attack Model Evaluation in Different Sequence Lengths
Deep learning-based WF attacks use the direction sequence of a flow as input data. The strategy adopted by WF defenses is to insert dummy packets into normal traffic, and the most obvious change this brings is that the effective length of the sequence becomes longer. In previous deep learning-based WF attacks, the input length was usually set to 5000: a sequence shorter than 5000 is padded with 0, and a longer one is truncated. To obtain a credible result, we trained multiple models on different sequence lengths, even though the effective data are the same; the lengths are chosen at and above 5000 to cover the bandwidth overheads considered later. In this experiment, three WF attack models, DF, Var-CNN, and AWF, are used in both close-world and open-world.
It can be seen from Figure 4 that, in the close-world, the ASR is stable across different sequence lengths. When the sequence length is 7500, the ASR of DF reaches its highest value, 98.14%; when it is 7000, the ASR of Var-CNN reaches its highest value, 97.56%, in line with the performance reported in the original papers. In the open-world, the highest ASR of Var-CNN, AWF, and DF is 96.39%, 91.28%, and 96.04%, respectively. In comparison, DF and Var-CNN are relatively stable, and in the subsequent experiments the baseline ASR stays between 96% and 97%, except for AWF in the open-world. We can therefore consider the performance of the three attack models stable over the current sequence length range: appending meaningless padding at the end of the sequence does not reduce the capability of the models, which ensures that the results of subsequent experiments are not affected by the length of the input data.
5.3.2. Defense Model Evaluation in Different Bandwidth Overheads
In this experiment, we vary the bandwidth overhead for a fixed set of segment packet numbers. Because different combinations of segment injection rate and head rate can produce the same real bandwidth overhead, we report the best result for each overhead. The relationship between DSR and BO in different environments is shown in Figure 5, where the solid lines represent the data of this experiment. It can be seen from Figure 5 that DSR improves significantly as BO increases. In the close-world, the DSR against all three attack models is above 90% when BO is 30%, and DSR is close to saturation when BO is 40%. Although the ASR of the three attack models in this experiment is 96%, SAD's defense capability differs across the three. With the same BO, the DSR in the open-world is higher: the close-world favors the attacker, and the open-world favors the defender. In addition, DSR cannot be significantly increased once BO reaches a threshold. In practice, we can ignore unmonitored websites and only confuse monitored websites, which saves computing cost. In the open-world, the attacker marks the unmonitored sites with an additional label, and each unmonitored site appears only once in the experimental data. When the defender obfuscates only the commonly used sites, the attacker's classification of traffic to the targeted sites yields either other targeted sites or unmonitored sites; either way, the attacker cannot deduce valid information from the results.
5.3.3. Different Combinations of Segment Injection Rate and Head Rate
In our design, the actual bandwidth overhead is determined by both the segment injection rate and the head rate, so the DSR may differ even at the same bandwidth overhead. Table 3 explores this situation in detail: we sweep several values of the segment injection rate r and the head rate h, and mark in bold the largest DSR among the combinations with the same bandwidth overhead.
The greater DSR is seen where the head rate is smaller, which means that injecting a large number of dummy packets in the early stage of network traffic transmission gives better defense. Following the general flow, SAD inserts a fixed percentage of dummy packets based on the length of each segment, that is, roughly spn × r dummy packets per segment. In Table 3, when the head rate is 100%, every segment is processed; with a smaller head rate and a correspondingly larger injection rate, the same bandwidth overhead yields a markedly higher DSR. The reason for such a large gap at the same bandwidth overhead is that dummy packets that would have been inserted near the end are instead inserted near the front. A web page contains many different files, and when a browser loads a page the files are downloaded in a relatively fixed order; for example, HTML files are always loaded first. This means the first part of the web traffic contains more unique packet patterns, so we speculate that the deep learning model pays more attention to the front of the data, and our perturbation is most drastic exactly there. On the other hand, in a real scenario where two pages are loaded sequentially, it is difficult for an attacker to separate the first page from the second in the mixed trace; the multi-tab attack in [27] studies precisely this separation problem.
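As a hypothetical worked example (the concrete r and h values of Table 3 cannot be recovered here): with spn = 500, segment injection rate r = 50%, and head rate h = 80%, each processed segment receives 0.5 × 500 = 250 dummy packets, and the overall bandwidth overhead is approximately r × h = 0.5 × 0.8 = 40%, the overhead at which Section 5.3.2 reports DSR saturating.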
5.3.4. Compared with DFD
We use the dummy packet injection algorithm proposed by DFD to generate defense data. These defense data contain only monitored websites; in the open-world, the training data of the attack model contain both the unmonitored and the monitored datasets. We then compute the average bandwidth overhead of the defense data and select DF as the attack model. The results are shown as the dashed lines in Figure 5. It is obvious that SAD is superior to DFD under all conditions of this experiment.
The performance of DFD here differs somewhat from the results reported in its original paper. In DFD, the attack model under test is not AWF, Var-CNN, or DF but a CNN-based model built by the authors, although that model reaches the same level of accuracy as DF. On the other hand, the valid length of the obfuscated data was not increased during their testing: the model input length was fixed at 5000, which can truncate valid data at the back of some traces. In SAD, all data are retained.
5.3.5. Adversary Training
Attackers can obtain defense data when the defense is public. In response, SAD can adjust its bandwidth overhead to slow down the decline of DSR. We simulate this situation in this experiment. First, we train the DF attack model on the defense dataset until its ASR reaches 90%. Then, the attack model is used to verify the effectiveness of adversarial examples generated with various bandwidth overheads. When the defended sequence length and the input feature dimension of the attack model differ, the sequence is padded with 0 if too short and truncated if too long. Furthermore, since WTF-PAD and W-T are the most widely recognized defense methods, we ran the same experiment with them; the results are shown in Table 4.
In Figure 6, when DF is trained on the adversarial examples generated by SAD, its ASR can still reach more than 90% if the defender keeps the same bandwidth overhead. However, the greater the gap between the defender's and the attacker's bandwidth overhead settings, the higher the DSR, so SAD retains its defense ability against an adversarially trained WF attack model by varying the bandwidth overhead. Although deep learning can automatically extract features, it requires the training and test data to share the same dimensionality, and defenders can exploit this by employing a variety of bandwidth overheads to increase the attacker's difficulty. Table 4 shows that WTF-PAD and W-T resist adversarial training to different degrees, with W-T the stronger of the two: WTF-PAD fills in dummy packets according to a fixed rule, whereas W-T modifies the original full-duplex communication and tries to convert a trace into another trace that already exists. At a fixed bandwidth overhead, SAD itself is almost ineffective against adversarial training, because the WF attack model can still extract the relevant features; the defense is recovered by changing the bandwidth overhead.
5.3.6. Black-Box Defense
The purpose of this experiment is to evaluate the robustness of SAD against unknown attack models. Unlike experiments (2)-(5), this experiment is a black-box defense. In training and testing, one of the three WF attack models is selected as the training model of SAD, and the other two are used as test models.
In Figure 7, the dotted and solid lines represent DSR under the white-box and black-box settings, respectively. The black-box setting does not cause SAD's defense performance to fluctuate much; as far as we know, this is because deep learning models are transferable. The three models in this experiment are the three most commonly used WF attack models, which outperform other attack models and do not require manual feature extraction; reference [9] provides a broad comparative evaluation of such deep learning-based attacks.
6. Limitations and Future Work
There are still many limitations to SAD. Our ultimate goal is to deploy SAD in real-world scenarios, so we have to consider how long it takes to generate the adversarial examples. We measured the time SAD spends on generation; the results are presented in Table 5. SAD generates one segment of adversarial examples in less than 2 milliseconds, and a complete trace takes about 200 milliseconds; the number of generation rounds per trace is roughly the covered effective length divided by the segment length, so the generation time grows linearly with the trace length. As far as we know, loading a simple search engine home page takes 1 to 2 seconds, so on the face of it the SAD time overhead is reasonable. In fact, however, these results were obtained with a GPU; with only a CPU, generating the adversarial examples for one trace under the same conditions takes 1 to 2 minutes, which would seriously affect web browsing. WTF-PAD, W-T, and DFD do not have this trouble because they do not use deep learning. Given the current state of deep learning, getting faster speed on a CPU requires reducing model complexity, so subsequent work can proceed in two directions: reduce the model complexity while maintaining performance, or use non-deep-learning methods to achieve the same purpose.
Adversarial examples reduce the classification performance of a deep learning classifier, and SAD is a model that generates them; we can view SAD as transforming a trace toward another class. Ilyas et al. [28] argue that adversarial examples exploit non-robust but predictive features of the data, which offers one direction for understanding why the perturbations generated by SAD remain effective.
7. Conclusion
In this article, we proposed a novel website fingerprinting defense, called segmented adversarial defense (SAD), against deep learning-based WF attacks. SAD segments live traffic and injects dummy packets after each segment to complete the defense. At the same time, limiting the range of segmentation yields more diverse adversarial examples and lets SAD flexibly tune the actual bandwidth overhead, effectively balancing the defense success rate against the bandwidth overhead. In addition, SAD can cope with adversarial training and black-box defense. The experimental results show that, in both the close-world and the open-world, the SAD defense success rate reaches up to 99%; only 30% bandwidth overhead is needed to achieve a 90% defense success rate, and remarkably, a DSR of over 40% is possible with only 5% bandwidth overhead. In summary, SAD can effectively defend against deep learning-based WF attacks.
Data Availability
The dataset used to support the findings of this study can be obtained by following the instructions at https://github.com/deep-fingerprinting/df. The dataset belongs to Sirinam et al.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was funded by the National Natural Science Foundation of China (no. 62062022).
References
1. R. Dingledine, N. Mathewson, and P. Syverson, Tor: The Second-Generation Onion Router, Naval Research Lab, Washington, DC, USA, 2004.
2. A. R. Javed, W. Ahmed, M. Alazab, Z. Jalil, K. Kifayat, and T. R. Gadekallu, "A comprehensive survey on computer forensics: state-of-the-art, tools, techniques, challenges, and future directions," IEEE Access, vol. 10, 2022.
3. T. Wang, X. Cai, R. Nithyanand, R. Johnson, and I. Goldberg, "Effective attacks and provable defenses for website fingerprinting," in Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, pp. 143–157, August 2014.
4. A. Panchenko, F. Lanze, J. Pennekamp et al., "Website fingerprinting at internet scale," in Proceedings of NDSS, 2016.
5. J. Hayes and G. Danezis, "k-fingerprinting: a robust scalable website fingerprinting technique," in Proceedings of the 25th USENIX Security Symposium, Austin, TX, USA, pp. 1187–1203, August 2016.
6. A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105, 2012.
7. G. Hinton, L. Deng, D. Yu et al., "Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
8. S. Agrawal, S. Sarkar, O. Aouedi et al., "Federated learning for intrusion detection system: concepts, challenges and future directions," 2021, https://arxiv.org/abs/2106.09527.
9. V. Rimmer, D. Preuveneers, M. Juarez, T. Van Goethem, and W. Joosen, "Automated website fingerprinting through deep learning," 2017, https://arxiv.org/abs/1708.06376.
10. S. Bhat, D. Lu, A. Kwon, and S. Devadas, "Var-CNN: a data-efficient website fingerprinting attack based on deep learning," Proceedings on Privacy Enhancing Technologies, vol. 2019, no. 4, pp. 292–310, 2019.
11. P. Sirinam, M. Imani, M. Juarez, and M. Wright, "Deep fingerprinting: undermining website fingerprinting defenses with deep learning," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 1928–1943, Toronto, Canada, October 2018.
12. M. Juarez, M. Imani, M. Perry, C. Diaz, and M. Wright, "Toward an efficient website fingerprinting defense," in European Symposium on Research in Computer Security, Springer, pp. 27–46, 2016.
13. T. Wang and I. Goldberg, "Walkie-talkie: an efficient defense against passive website fingerprinting attacks," in Proceedings of the 26th USENIX Security Symposium, Vancouver, Canada, pp. 1375–1390, 2017.
14. A. Abusnaina, R. Jang, A. Khormali, D. Nyang, and D. Mohaisen, "DFD: adversarial learning-based approach to defend against website fingerprinting," in Proceedings of IEEE INFOCOM 2020, pp. 2459–2468, Toronto, ON, Canada, July 2020.
15. I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," 2014, https://arxiv.org/abs/1412.6572.
16. C. Hou, G. Gou, J. Shi, P. Fu, and G. Xiong, "WF-GAN: fighting back against website fingerprinting attack using adversarial learning," in Proceedings of the 2020 IEEE Symposium on Computers and Communications (ISCC), pp. 1–7, Rennes, France, July 2020.
17. M. S. Rahman, M. Imani, N. Mathews, and M. Wright, "Mockingbird: defending against deep-learning-based website fingerprinting attacks with adversarial traces," IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1594–1609, 2020.
18. I. Goodfellow, J. Pouget-Abadie, M. Mirza et al., "Generative adversarial nets," Advances in Neural Information Processing Systems, vol. 27, 2014.
19. D. Herrmann, R. Wendolsky, and H. Federrath, "Website fingerprinting: attacking popular privacy enhancing technologies with the multinomial naïve-Bayes classifier," in Proceedings of the 2009 ACM Workshop on Cloud Computing Security, pp. 31–42, Chicago, IL, USA, November 2009.
20. A. Panchenko, L. Niessen, A. Zinnen, and T. Engel, "Website fingerprinting in onion routing based anonymization networks," in Proceedings of the 10th Annual ACM Workshop on Privacy in the Electronic Society, pp. 103–114, Chicago, IL, USA, October 2011.
21. K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, Las Vegas, NV, USA, June 2016.
22. K. P. Dyer, S. E. Coull, T. Ristenpart, and T. Shrimpton, "Peek-a-boo, I still see you: why efficient traffic analysis countermeasures fail," in Proceedings of the 2012 IEEE Symposium on Security and Privacy, pp. 332–346, San Francisco, CA, USA, May 2012.
23. X. Cai, R. Nithyanand, T. Wang, R. Johnson, and I. Goldberg, "A systematic approach to developing and evaluating website fingerprinting defenses," in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 227–238, NY, USA, November 2014.
24. X. Cai, R. Nithyanand, and R. Johnson, "CS-BuFLO: a congestion sensitive website fingerprinting defense," in Proceedings of the 13th Workshop on Privacy in the Electronic Society, pp. 121–130, Scottsdale, AZ, USA, November 2014.
25. C. Szegedy, W. Zaremba, I. Sutskever et al., "Intriguing properties of neural networks," 2013, https://arxiv.org/abs/1312.6199.
26. M. Juarez, S. Afroz, G. Acar, C. Diaz, and R. Greenstadt, "A critical evaluation of website fingerprinting attacks," in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 263–274, Scottsdale, AZ, USA, November 2014.
27. Y. Xu, T. Wang, Q. Li, Q. Gong, Y. Chen, and Y. Jiang, "A multi-tab website fingerprinting attack," in Proceedings of the 34th Annual Computer Security Applications Conference, pp. 327–341, San Juan, PR, USA, December 2018.
28. A. Ilyas, S. Santurkar, D. Tsipras, L. Engstrom, B. Tran, and A. Madry, "Adversarial examples are not bugs, they are features," Advances in Neural Information Processing Systems, vol. 32, 2019.
Copyright
Copyright © 2022 Renzhi Tang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.