Leverage temporal convolutional network for the representation learning of URLs

Yunji Liang, Jian Kang, Zhiwen Yu, Bin Guo, Xiaolong Zheng, Saike He

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

Cyber crimes, including computer viruses and malware, spam, illegal sales, and phishing websites, proliferate aggressively via disguised Uniform Resource Locators (URLs). Although numerous studies have addressed the URL classification task, traditional URL classification solutions have fallen behind because of their reliance on hand-crafted feature engineering and the boom of newly generated URLs. In this paper, we study the representation learning of URLs and explore URL classification using deep learning. Specifically, we propose URL2vec to extract both the structural and lexical features of URLs, and apply a temporal convolutional network (TCN) to the URL classification task. The experimental results show that URL2vec outperforms both word2vec and character-level embedding for URL representation, and that TCN outperforms the baselines, with precision up to 95.97%.
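
For readers unfamiliar with TCNs, the building block they are generally built on is a dilated causal 1-D convolution: each output position depends only on the current and earlier input positions, and the dilation spreads the filter taps further back in the sequence. The sketch below is illustrative only and is not taken from the paper; the function name `causal_dilated_conv1d`, the scalar-valued input, and the toy weights are all assumptions made for the example.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Apply a single-channel causal convolution with the given dilation.

    x        : list of floats, the input sequence (e.g. one embedding dimension
               per URL token; hypothetical usage, not the paper's setup)
    weights  : list of floats, the filter taps; weights[0] multiplies the
               current position, weights[j] the position j * dilation steps back
    dilation : spacing between taps

    Positions before the start of the sequence are treated as zero
    (implicit left padding), so the output has the same length as x.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:  # never look at future or out-of-range positions
                acc += w * x[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    seq = [1.0, 2.0, 3.0, 4.0]
    print(causal_dilated_conv1d(seq, [1.0, 1.0], dilation=1))
    print(causal_dilated_conv1d(seq, [1.0, 1.0], dilation=2))
```

Stacking such layers with growing dilations (1, 2, 4, ...) is what lets a TCN cover a long input with few layers; whether the paper uses exactly this schedule is not stated in the abstract.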

Original language: English
Title of host publication: 2019 IEEE International Conference on Intelligence and Security Informatics, ISI 2019
Editors: Xiaolong Zheng, Ahmed Abbasi, Michael Chau, Alan Wang, Lina Zhou
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 74-79
Number of pages: 6
ISBN (Electronic): 9781728125046
State: Published - Jul 2019
Event: 17th IEEE International Conference on Intelligence and Security Informatics, ISI 2019 - Shenzhen, China
Duration: 1 Jul 2019 → 3 Jul 2019

Publication series

Name: 2019 IEEE International Conference on Intelligence and Security Informatics, ISI 2019

Conference

Conference: 17th IEEE International Conference on Intelligence and Security Informatics, ISI 2019
Country/Territory: China
City: Shenzhen
Period: 1/07/19 → 3/07/19

Keywords

  • Cyber Crime
  • Temporal Convolutional Network
  • URL Classification
  • URL2vec
