Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-20419
Full metadata record
DC Field | Value | Language
dc.contributor.author | Roost, Dano | -
dc.contributor.author | Meier, Ralph | -
dc.contributor.author | Toffetti Carughi, Giovanni | -
dc.contributor.author | Stadelmann, Thilo | -
dc.date.accessioned | 2020-08-31T08:09:44Z | -
dc.date.available | 2020-08-31T08:09:44Z | -
dc.date.issued | 2020-08-31 | -
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/20419 | -
dc.description | Awarded with the Dr. Waldemar Jucker award 2020 of the GST | de_CH
dc.description.abstract | While vision in living beings is an active process where image acquisition and classification are intertwined to gradually refine perception, much of today’s computer vision is built on the inferior paradigm of episodic classification of i.i.d. samples. We aim at improved scene understanding for robots by taking the sequential nature of seeing over time into account. We present a supervised multi-task approach to answer questions about different aspects of a scene, such as the relationships between objects, their quantity, or their relative positions to the camera. For each question, we train a different output head that operates on input from one shared recurrent convolutional neural network which accumulates information over time steps. In parallel, we train an additional output head using reinforcement learning (RL) that uses the reduction in cumulative loss from the supervised heads as reward signal. It thereby learns to gradually improve the prediction confidence for, e.g., partially occluded objects by moving the camera to a more favourable angle with respect to these objects. We present preliminary results on simulated RGB-D image sequences that show superior performance of our RL-based approach in answering questions more quickly and more accurately than with static or random camera movement. | de_CH
dc.language.iso | en | de_CH
dc.publisher | University of Essex | de_CH
dc.rights | Licence according to publishing contract | de_CH
dc.subject | Active Vision | de_CH
dc.subject | Deep Learning | de_CH
dc.subject | Reinforcement Learning | de_CH
dc.subject | Neural Scene Understanding | de_CH
dc.subject | Robotic Grasping | de_CH
dc.subject | Computer Vision | de_CH
dc.subject.ddc | 006: Special computer methods | de_CH
dc.title | Combining reinforcement learning with supervised deep learning for neural active scene understanding | de_CH
dc.type | Conference: Paper | de_CH
dcterms.type | Text | de_CH
zhaw.departement | School of Engineering | de_CH
zhaw.organisationalunit | Institut für Informatik (InIT) | de_CH
dc.identifier.doi | 10.21256/zhaw-20419 | -
zhaw.conference.details | Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), online, 31 August - 4 September 2020 | de_CH
zhaw.funding.eu | No | de_CH
zhaw.originated.zhaw | Yes | de_CH
zhaw.publication.status | acceptedVersion | de_CH
zhaw.publication.review | Peer review (publication) | de_CH
zhaw.webfeed | Datalab | de_CH
zhaw.webfeed | Information Engineering | de_CH
zhaw.webfeed | ZHAW digital | de_CH
zhaw.webfeed | Machine Perception and Cognition | de_CH
zhaw.author.additional | No | de_CH
zhaw.display.portrait | Yes | de_CH
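
The abstract above describes a shared recurrent convolutional network feeding several supervised question heads plus an additional RL head whose reward is the reduction in the cumulative supervised loss. Below is a minimal PyTorch sketch of such a setup; all module names, layer sizes, the discrete camera-action space, and the reward function are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn


class ActiveSceneNet(nn.Module):
    """Shared recurrent convolutional trunk with per-question heads and a camera-policy head (illustrative)."""

    def __init__(self, answer_sizes=(10, 10, 10), num_camera_actions=6):
        super().__init__()
        # Shared convolutional encoder for a single RGB-D frame (4 input channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Recurrent core that accumulates information over successive views.
        self.core = nn.GRUCell(64, 128)
        # One supervised output head per question type (relations, counts, positions, ...).
        self.question_heads = nn.ModuleList(nn.Linear(128, n) for n in answer_sizes)
        # RL policy head over a discrete set of camera movements.
        self.policy_head = nn.Linear(128, num_camera_actions)

    def forward(self, frame, hidden):
        hidden = self.core(self.encoder(frame), hidden)
        answers = [head(hidden) for head in self.question_heads]
        return answers, self.policy_head(hidden), hidden


def loss_reduction_reward(prev_loss, answers, targets):
    """Reward the camera policy by how much the summed supervised loss shrank since the last view."""
    criterion = nn.CrossEntropyLoss()
    current_loss = sum(criterion(a, t) for a, t in zip(answers, targets))
    return (prev_loss - current_loss).detach(), current_loss

In a training loop one would unroll the network over the image sequence, update the question heads with the summed cross-entropy loss, and feed the loss-reduction reward into a standard policy-gradient update for the camera head, which is only a rough approximation of the setup the abstract outlines.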
Appears in collections: Publikationen School of Engineering

Files in This Item:
File | Description | Size | Format
2020_Roost_Combining_reinforcement_learning_with_supervised_deep_learning.pdf | Accepted Version | 1.52 MB | Adobe PDF
Roost, D., Meier, R., Toffetti Carughi, G., & Stadelmann, T. (2020, August 31). Combining reinforcement learning with supervised deep learning for neural active scene understanding. Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), Online, 31 August - 4 September 2020. https://doi.org/10.21256/zhaw-20419
Roost, D. et al. (2020) ‘Combining reinforcement learning with supervised deep learning for neural active scene understanding’, in Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), online, 31 August - 4 September 2020. University of Essex. Available at: https://doi.org/10.21256/zhaw-20419.
D. Roost, R. Meier, G. Toffetti Carughi, and T. Stadelmann, “Combining reinforcement learning with supervised deep learning for neural active scene understanding,” in Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), online, 31 August - 4 September 2020, Aug. 2020. doi: 10.21256/zhaw-20419.
ROOST, Dano, Ralph MEIER, Giovanni TOFFETTI CARUGHI und Thilo STADELMANN, 2020. Combining reinforcement learning with supervised deep learning for neural active scene understanding. In: Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), online, 31 August - 4 September 2020. Conference paper. University of Essex. 31 August 2020
Roost, Dano, Ralph Meier, Giovanni Toffetti Carughi, and Thilo Stadelmann. 2020. “Combining Reinforcement Learning with Supervised Deep Learning for Neural Active Scene Understanding.” Conference paper. In Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), Online, 31 August - 4 September 2020. University of Essex. https://doi.org/10.21256/zhaw-20419.
Roost, Dano, et al. “Combining Reinforcement Learning with Supervised Deep Learning for Neural Active Scene Understanding.” Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), Online, 31 August - 4 September 2020, University of Essex, 2020, https://doi.org/10.21256/zhaw-20419.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.