Detecting obfuscated JavaScripts from known and unknown obfuscators using machine learning

Tellenbach, Bernhard; Paganoni, Sergio; Rennhard, Marc

doi:10.21256/zhaw-1537

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-1537

Full metadata record

DC Field	Value	Language
dc.contributor.author	Tellenbach, Bernhard	-
dc.contributor.author	Paganoni, Sergio	-
dc.contributor.author	Rennhard, Marc	-
dc.date.accessioned	2017-11-29T08:49:13Z	-
dc.date.available	2017-11-29T08:49:13Z	-
dc.date.issued	2016	-
dc.identifier.issn	1942-2636	de_CH
dc.identifier.uri	https://digitalcollection.zhaw.ch/handle/11475/1601	-
dc.description.abstract	JavaScript is a common attack vector, used either to probe for known vulnerabilities in order to select a fitting exploit or to manipulate the Document Object Model (DOM) of a web page in a harmful way. The JavaScripts used in such attacks are often obfuscated to make them hard to detect using signature-based approaches. On the other hand, since the only legitimate reason to obfuscate a script is to protect intellectual property, there are not many scripts that are both benign and obfuscated. A detector that can reliably detect obfuscated JavaScripts would therefore be a valuable tool in fighting JavaScript-based attacks. In this paper, we compare the performance of nine different classifiers with respect to correctly classifying obfuscated and non-obfuscated scripts. For our experiments, we use a data set of regular, minified, and obfuscated samples from jsDelivr and the Alexa top 5000 websites and a set of malicious samples from MELANI. We find that the best of these classifiers, the boosted decision tree classifier, classifies obfuscated and non-obfuscated scripts very well, with precision and recall rates of around 99 percent. The boosted decision tree classifier is then used to assess how well this approach can cope with scripts obfuscated by an obfuscator not present in our training set. The results show that while it may work for some obfuscators, it is still critical to have as many different obfuscators in the training set as possible. Finally, we describe the results from experiments to classify malicious obfuscated scripts when no such scripts are included in the training set. Depending on the set of features used, it is possible to detect about half of those scripts, even though those samples do not seem to use any of the obfuscators used in our training set.	de_CH
dc.language.iso	en	de_CH
dc.publisher	IARIA	de_CH
dc.relation.ispartof	International Journal on Advances in Security	de_CH
dc.rights	Licence according to publishing contract	de_CH
dc.subject	Machine Learning	de_CH
dc.subject	JavaScript obfuscation	de_CH
dc.subject.ddc	006: Special computer methods	de_CH
dc.title	Detecting obfuscated JavaScripts from known and unknown obfuscators using machine learning	de_CH
dc.type	Article in scientific journal	de_CH
dcterms.type	Text	de_CH
zhaw.departement	School of Engineering	de_CH
zhaw.organisationalunit	Institut für Informatik (InIT)	de_CH
dc.identifier.doi	10.21256/zhaw-1537	-
zhaw.funding.eu	No	de_CH
zhaw.issue	3/4	de_CH
zhaw.originated.zhaw	Yes	de_CH
zhaw.pages.end	206	de_CH
zhaw.pages.start	196	de_CH
zhaw.publication.status	publishedVersion	de_CH
zhaw.volume	9	de_CH
zhaw.publication.review	Peer review (publication)	de_CH
zhaw.webfeed	Information Security	de_CH
Appears in collections:	Publikationen School of Engineering
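
The evaluation described in the abstract (precision and recall for the "obfuscated" class, and testing against an obfuscator absent from the training set) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label strings, and the obfuscator names "tool_x" and "tool_y" are hypothetical and chosen only for illustration.

```python
def precision_recall(y_true, y_pred, positive="obfuscated"):
    """Return (precision, recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def hold_out_obfuscator(samples, held_out):
    """Split samples so that one obfuscator never appears in training.

    samples: list of (script_name, label, obfuscator_or_None) tuples.
    Returns (train, test), where test holds only the held-out obfuscator.
    """
    train = [s for s in samples if s[2] != held_out]
    test = [s for s in samples if s[2] == held_out]
    return train, test


# Hypothetical toy data, not taken from the paper.
y_true = ["obfuscated", "obfuscated", "regular", "regular", "obfuscated"]
y_pred = ["obfuscated", "regular", "regular", "obfuscated", "obfuscated"]
precision, recall = precision_recall(y_true, y_pred)

samples = [
    ("a.js", "obfuscated", "tool_x"),
    ("b.js", "obfuscated", "tool_y"),
    ("c.js", "regular", None),
    ("d.js", "obfuscated", "tool_y"),
]
train, test = hold_out_obfuscator(samples, "tool_y")
```

On the held-out test split, recall indicates how many scripts from the unseen obfuscator a classifier trained on the remaining samples still flags.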

Files in This Item:

File	Description	Size	Format
2017_Tellenbach_Detecting_obfuscated_JavaScripts_Advance_in_Securtiy.pdf		210.91 kB	Adobe PDF

Tellenbach, B., Paganoni, S., & Rennhard, M. (2016). Detecting obfuscated JavaScripts from known and unknown obfuscators using machine learning. International Journal on Advances in Security, 9(3/4), 196–206. https://doi.org/10.21256/zhaw-1537

Tellenbach, B., Paganoni, S. and Rennhard, M. (2016) ‘Detecting obfuscated JavaScripts from known and unknown obfuscators using machine learning’, International Journal on Advances in Security, 9(3/4), pp. 196–206. Available at: https://doi.org/10.21256/zhaw-1537.

B. Tellenbach, S. Paganoni, and M. Rennhard, “Detecting obfuscated JavaScripts from known and unknown obfuscators using machine learning,” International Journal on Advances in Security, vol. 9, no. 3/4, pp. 196–206, 2016, doi: 10.21256/zhaw-1537.

TELLENBACH, Bernhard, Sergio PAGANONI and Marc RENNHARD, 2016. Detecting obfuscated JavaScripts from known and unknown obfuscators using machine learning. International Journal on Advances in Security. 2016. Vol. 9, no. 3/4, pp. 196–206. DOI 10.21256/zhaw-1537

Tellenbach, Bernhard, Sergio Paganoni, and Marc Rennhard. 2016. “Detecting Obfuscated JavaScripts from Known and Unknown Obfuscators Using Machine Learning.” International Journal on Advances in Security 9 (3/4): 196–206. https://doi.org/10.21256/zhaw-1537.

Tellenbach, Bernhard, et al. “Detecting Obfuscated JavaScripts from Known and Unknown Obfuscators Using Machine Learning.” International Journal on Advances in Security, vol. 9, no. 3/4, 2016, pp. 196–206, https://doi.org/10.21256/zhaw-1537.