Improving Streaming Cryptocurrency Transaction Classification via Biased Sampling and Graph Feedback
We show that knowledge of wallet addresses from the current time state of a blockchain network, such as Bitcoin, improves the performance of illicit activity detection. Based on this finding, we introduce two new methods for sampling classifier training data so that precedence is given to transaction information from the recent past and the current time state. This sampling enables streaming classification, in which a decision on the class of a transaction must be made based on the data seen to date. Our simple yet effective approach is independent of the classifier choice and provides insight into how the dynamics of the blockchain network play a central role in the detection of illicit transactions. Our proposed sampling methods enable graph convolutional network (GCN) and random forest (RF) classifiers to better adapt to changes in the network due to significant events, such as the closure of a large 'Darknet' marketplace. We introduce graphlet spectral correlation analysis for exposing the effect of such network re-organisation due to major events. Finally, based on our analysis, we propose a new two-stage random forest classifier that feeds back intermediate predictions of neighbours to improve the classification decision. Our methodology enables practical streaming classification, even in the scenario of very limited information on the feature space of each transaction.
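As a rough illustration of sampling that gives precedence to recent transactions, the sketch below draws training indices with a probability that decays with the age of each transaction's time step. The abstract does not specify the two sampling methods, so the exponential decay, the `decay` parameter, and the function names `recency_weights` and `biased_sample` are all assumptions made for illustration only, not the authors' method.

```python
import random


def recency_weights(time_steps, current_step, decay=0.5):
    """Return one sampling weight per transaction (hypothetical scheme).

    A transaction observed `age` steps before `current_step` gets weight
    decay ** age. Transactions from future time steps get weight 0 so that
    a streaming classifier only trains on data seen to date.
    """
    weights = []
    for t in time_steps:
        if t > current_step:
            weights.append(0.0)  # not yet observed in the stream
        else:
            weights.append(decay ** (current_step - t))
    return weights


def biased_sample(time_steps, current_step, k, decay=0.5, seed=0):
    """Draw k training indices (with replacement), favouring recent steps."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    weights = recency_weights(time_steps, current_step, decay)
    indices = list(range(len(time_steps)))
    return rng.choices(indices, weights=weights, k=k)


# Four transactions observed at time steps 1..4; the stream is at step 3.
steps = [1, 2, 3, 4]
print(recency_weights(steps, current_step=3))
print(biased_sample(steps, current_step=3, k=10))
```

The selected indices could then be used to build the training set for any classifier, in line with the claim that the sampling is independent of the classifier choice. A smaller `decay` concentrates the sample more heavily on the current time step.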