The Tor protocol provides online anonymity for millions of users worldwide. However, browsing over Tor is much slower than browsing over a regular internet connection. Many researchers have proposed circuit-building strategies that aim to improve Tor's performance by optimizing latency and bandwidth. Congestion-Aware Routing (CAR) is currently the most efficient of these solutions. CAR is a decentralized approach in which users measure the congestion of candidate circuits while the circuits are being created and then connect to the best one. However, CAR's decentralized design is limited, because each user has access to only a small fraction of the available information about circuit congestion.
A recently published paper introduces PredicTor, a path selection approach that aims to improve Tor performance by using a Random Forest classifier, trained on the most recent Tor measurements, to estimate the performance of a given path. In this article, we give an overview of PredicTor and of how effectively it improves Tor performance.
Using PredicTor to improve Tor performance:
PredicTor uses performance data from many circuits to pick, with high probability, less congested Tor relay nodes and geographically shorter paths. A Random Forest classifier is trained on the most recent measurements of Tor circuits to predict the performance of a given path. If the path is predicted to be fast and uncongested, a circuit is created using those relay nodes.
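The predict-then-build loop described above can be sketched as follows. This is a minimal illustration and not PredicTor's actual code: the Relay fields, the two path features, the thresholds and the function names are all hypothetical, and stand_in_predictor is a hand-written rule that merely takes the place of a trained Random Forest classifier.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Relay:
    name: str
    congestion: float  # hypothetical recent queueing delay, in ms
    latency: float     # hypothetical latency contribution, in ms


Path = Tuple[Relay, ...]


def features(path: Path) -> List[float]:
    """Summarize a path as [total latency, worst congestion] (assumed features)."""
    return [sum(r.latency for r in path), max(r.congestion for r in path)]


def stand_in_predictor(x: Sequence[float]) -> bool:
    """Placeholder for a trained classifier: True means 'predicted fast'.

    The thresholds are arbitrary illustration values, not measured ones.
    """
    total_latency, worst_congestion = x
    return total_latency < 300 and worst_congestion < 50


def select_path(
    relays: Sequence[Relay],
    predict_fast: Callable[[Sequence[float]], bool],
    attempts: int = 20,
    rng: Optional[random.Random] = None,
) -> Optional[Path]:
    """Sample 3-hop candidates and return the first one predicted to be fast."""
    if rng is None:
        rng = random.Random()
    for _ in range(attempts):
        candidate = tuple(rng.sample(list(relays), 3))
        if predict_fast(features(candidate)):
            return candidate  # a real client would now build this circuit
    return None  # no acceptable candidate found within the attempt budget
```

In a real deployment, predict_fast would wrap a model trained on recent circuit measurements, and candidate sampling would follow Tor's own relay selection rules rather than uniform sampling.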
PredicTor was implemented in Tor's source code and evaluated in Shadow simulations, which showed that it can improve the performance of the Tor network by 23% compared to Vanilla Tor and by 13% compared to CAR. In absolute terms, this is a speedup of more than 500 ms over Vanilla Tor. The experiments also show that PredicTor used around 30% more Tor relay nodes than Vanilla Tor, which yields better load distribution and lets PredicTor make better use of limited network resources.
Experiments on the live Tor network showed that PredicTor boosts performance partly by avoiding connections to highly congested nodes and partly by creating low-latency circuits. PredicTor is the first path selection technique that dynamically considers both latency and congestion based on performance measurements of the live Tor network. In the live experiments, during periods of high congestion, PredicTor showed an improvement of up to 10% compared to Vanilla Tor.
Evaluating anonymity when using PredicTor:
Any technique that attempts to build more efficient circuits must also be evaluated for its effect on anonymity. For instance, an approach that favors relay nodes with high bandwidth could route traffic through fewer relay nodes, making it easier for a small number of adversaries to attack more circuits.
Existing anonymity metrics fall into two categories: metrics that quantify anonymity and metrics that quantify all-or-nothing compromises. Most metrics in the first category are based on entropy. Entropy-based metrics evaluate the average case well, but they are not very representative of worst-case scenarios. Metrics in the second category measure the time that passes until a user connects to a compromised circuit. They give a thorough understanding of the factors that lead to full deanonymization, yet they say little about the anonymity of users who have not been fully deanonymized. Accordingly, two things are needed: a metric that offers insight into a user's anonymity before full deanonymization occurs, and a metric that quantifies an attacker's ability to infer users' key attributes.
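To make the entropy idea concrete, here is a minimal sketch that computes the Shannon entropy of a relay selection distribution. It is only an illustration of how entropy-based metrics behave, not the specific metric used in the PredicTor paper; the function name and the weight values are hypothetical.

```python
import math
from typing import Sequence


def selection_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy (in bits) of the selection distribution implied by weights.

    Each weight is the relative chance that a relay is picked; weights are
    normalized to probabilities, and zero-weight relays contribute nothing.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical selection weights for four relays.
uniform = [1, 1, 1, 1]   # every relay equally likely
skewed = [97, 1, 1, 1]   # one high-bandwidth relay attracts most traffic

print(f"uniform: {selection_entropy(uniform):.3f} bits")
print(f"skewed:  {selection_entropy(skewed):.3f} bits")
```

The skewed distribution has much lower entropy than the uniform one, which reflects traffic concentrating on few relays. Because entropy is a single average over the whole distribution, it does not reveal how exposed any individual user is, which is the limitation noted above.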
For these reasons, the study presents a new anonymity metric named Client AS Inference, or CLASI. CLASI measures an adversary's ability to infer users' Autonomous Systems (ASes) by fingerprinting circuits at the country and network level, together with other information such as the bandwidth of relay nodes. The adversary is given full knowledge of all Tor connections, so the results represent an upper bound. CLASI is useful for evaluating the anonymity of path selection techniques such as PredicTor and of algorithms designed to avoid BGP hijacking attacks.
PredicTor was evaluated using both entropy-based metrics and CLASI. AS leakage with PredicTor was found to be similar to that of Vanilla Tor and somewhat lower than that of CAR, since PredicTor users build paths that are independent of their real-world network locations.
Even though Tor is an effective online privacy solution, its relatively low performance is one of its main disadvantages. PredicTor is an innovative path selection technique that has been shown to boost Tor performance, yet more research is needed to test its resistance to various forms of deanonymization attacks.
by: Tamer Sameeh