Trace and detect adversarial attacks on CNNs using feature response maps

Amirian, Mohammadreza; Schwenker, Friedhelm; Stadelmann, Thilo

doi:10.1007/978-3-319-99978-4_27

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-3863

Full metadata record

DC Field	Value	Language
dc.contributor.author	Amirian, Mohammadreza	-
dc.contributor.author	Schwenker, Friedhelm	-
dc.contributor.author	Stadelmann, Thilo	-
dc.date.accessioned	2018-07-13T07:08:44Z	-
dc.date.available	2018-07-13T07:08:44Z	-
dc.date.issued	2018	-
dc.identifier.isbn	978-3-319-99977-7	de_CH
dc.identifier.isbn	978-3-319-99978-4	de_CH
dc.identifier.uri	https://digitalcollection.zhaw.ch/handle/11475/8027	-
dc.description.abstract	The existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. The attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer – they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers – “feature responses” to a given input – have been helpful to visualize for a human “debugger” what the CNN “looks at” while computing its output. In this work, we propose a novel detection method for adversarial examples to prevent attacks. We do so by tracking adversarial perturbations in feature responses, allowing for automatic detection using average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet.	de_CH
dc.language.iso	en	de_CH
dc.publisher	Springer	de_CH
dc.relation.ispartofseries	Lecture Notes in Computer Science	de_CH
dc.rights	Licence according to publishing contract	de_CH
dc.subject	Model interpretability	de_CH
dc.subject	Feature visualization	de_CH
dc.subject	Diagnostic	de_CH
dc.subject.ddc	005: Computer programming, programs and data	de_CH
dc.title	Trace and detect adversarial attacks on CNNs using feature response maps	de_CH
dc.type	Conference: Paper	de_CH
dcterms.type	Text	de_CH
zhaw.departement	School of Engineering	de_CH
zhaw.organisationalunit	Institut für Informatik (InIT)	de_CH
dc.identifier.doi	10.1007/978-3-319-99978-4_27	de_CH
dc.identifier.doi	10.21256/zhaw-3863	-
zhaw.conference.details	8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR), Siena, Italy, 19-21 September 2018	de_CH
zhaw.funding.eu	No	de_CH
zhaw.originated.zhaw	Yes	de_CH
zhaw.pages.end	358	de_CH
zhaw.pages.start	346	de_CH
zhaw.publication.status	acceptedVersion	de_CH
zhaw.series.number	11081	de_CH
zhaw.publication.review	Peer review (publication)	de_CH
zhaw.title.proceedings	Artificial Neural Networks in Pattern Recognition	de_CH
zhaw.webfeed	Datalab	de_CH
zhaw.webfeed	Information Engineering	de_CH
zhaw.webfeed	Machine Perception and Cognition	de_CH
zhaw.funding.zhaw	QualitAI - Quality control of industrial products via deep learning on images	de_CH
Appears in collections:	Publikationen School of Engineering
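
The abstract above describes detecting adversarial examples via the average local spatial entropy of feature response maps. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the feature response is already available as a 2-D grayscale array, and it approximates local entropy by the Shannon entropy of a quantized intensity histogram in each sliding window. The function names, the 3x3 window and the 8 histogram bins are illustrative assumptions that do not come from this record.

```python
import numpy as np


def local_entropy(patch, bins=8):
    """Shannon entropy (in bits) of the intensity histogram of one patch.

    Assumes patch values lie in [0, 1]. The bin count is an illustrative choice.
    """
    counts, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    probs = counts[counts > 0] / patch.size
    return float(-np.sum(probs * np.log2(probs)))


def average_local_spatial_entropy(response, window=3, bins=8):
    """Mean of the local entropies over all sliding windows of a 2-D map.

    `response` stands in for a grayscale feature response map (hypothetical
    input; how it is obtained from the CNN is outside this sketch).
    """
    response = np.asarray(response, dtype=np.float64)
    if response.ndim != 2 or min(response.shape) < window:
        raise ValueError("response must be 2-D and at least window x window")

    # Min-max normalize to [0, 1] so the histogram range is well defined.
    lo, hi = response.min(), response.max()
    norm = (response - lo) / (hi - lo) if hi > lo else np.zeros_like(response)

    height, width = norm.shape
    entropies = [
        local_entropy(norm[i:i + window, j:j + window], bins)
        for i in range(height - window + 1)
        for j in range(width - window + 1)
    ]
    return float(np.mean(entropies))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flat_map = np.ones((16, 16))    # uniform response: no local variation
    noisy_map = rng.random((16, 16))  # scattered response: high local variation
    print(average_local_spatial_entropy(flat_map))
    print(average_local_spatial_entropy(noisy_map))
```

Under these assumptions, a detector would compare the score of an input's feature response against a threshold chosen on held-out data; the threshold and the direction of the comparison would have to be determined empirically and are not specified here.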

Files in This Item:

File	Description	Size	Format
ANNPR_2018c.pdf	Accepted Version	2.95 MB	Adobe PDF

Amirian, M., Schwenker, F., & Stadelmann, T. (2018). Trace and detect adversarial attacks on CNNs using feature response maps [Conference paper]. Artificial Neural Networks in Pattern Recognition, 346–358. https://doi.org/10.1007/978-3-319-99978-4_27

Amirian, M., Schwenker, F. and Stadelmann, T. (2018) ‘Trace and detect adversarial attacks on CNNs using feature response maps’, in Artificial Neural Networks in Pattern Recognition. Springer, pp. 346–358. Available at: https://doi.org/10.1007/978-3-319-99978-4_27.

M. Amirian, F. Schwenker, and T. Stadelmann, “Trace and detect adversarial attacks on CNNs using feature response maps,” in Artificial Neural Networks in Pattern Recognition, 2018, pp. 346–358. doi: 10.1007/978-3-319-99978-4_27.

AMIRIAN, Mohammadreza, Friedhelm SCHWENKER and Thilo STADELMANN, 2018. Trace and detect adversarial attacks on CNNs using feature response maps. In: Artificial Neural Networks in Pattern Recognition. Conference paper. Springer. 2018. pp. 346–358. ISBN 978-3-319-99977-7

Amirian, Mohammadreza, Friedhelm Schwenker, and Thilo Stadelmann. 2018. “Trace and Detect Adversarial Attacks on CNNs Using Feature Response Maps.” Conference paper. In Artificial Neural Networks in Pattern Recognition, 346–58. Springer. https://doi.org/10.1007/978-3-319-99978-4_27.

Amirian, Mohammadreza, et al. “Trace and Detect Adversarial Attacks on CNNs Using Feature Response Maps.” Artificial Neural Networks in Pattern Recognition, Springer, 2018, pp. 346–58, https://doi.org/10.1007/978-3-319-99978-4_27.