BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20260523T215140EDT-0938iWeiPP@132.216.98.100
DTSTAMP:20260524T015140Z
DESCRIPTION:Abstract\n\n \n\nExisting state-of-the-art recognition models achieve impressive performance but require a complete scene which may not always be available. For example\, sensing a complete scene at once is infeasible in applications such as aerial imaging. Further\, in applications such as disaster recovery\, imaging devices should be light\, inexpensive\, and energy-efficient\; thus\, they are often built using small field-of-view cameras that capture only a part of a scene at a time. In the above cases\, the imaging devices must scan the area sequentially. Moreover\, they must also prioritize the scanning of informative subregions for timely recognition.\n\nMany developed attention models that recognize a scene by observing it through small informative subregions called glimpses. However\, most models locate informative glimpses by glancing at a low-resolution gist of a complete scene\, which is unavailable in practice. In this thesis\, we develop sequential recognition models that locate and attend to informative glimpses without assessing a complete scene. Our sequential attention models predict the location of the next glimpse based solely on past glimpses. Our models achieve effective attention policies under partial observability by selecting subsequent glimpses that\, combined with past glimpses\, help the most in reasoning about the complete scene.\n\nWe present three attention models\, two for spatial and one for spatiotemporal recognition. The first is Probabilistic Attention Model (PAM). PAM uses Bayesian Optimal Experiment Design to attend to a glimpse with maximum expected information gain (EIG). It synthesizes features of the complete scene from past glimpses to estimate the EIG for yet unobserved regions. The second is Sequential Transformers Attention Model (STAM)\, which employs the one-step actor-critic algorithm to attend to a sequence of glimpses that produce class distribution consistent with the one produced using a complete scene. The third is Glimpse Transformer (GliTr). GliTr learns an effective attention mechanism for online action recognition by selecting glimpses with features and class distribution consistent with the corresponding complete video frames.\n\nThroughout the thesis\, we evaluate our models on multiple datasets and compare them with existing models. Our two key findings are as follows. First\, reasoning about the complete scene from partial observations helps in learning an effective attention policy under partial observability. Second\, while reducing the amount of sensing required for recognition\, our glimpse-based models achieve comparable or higher performance than the existing models that require complete scenes. The key takeaway is that one can attain good performance even using low-cost sensing devices and non-ideal imaging by automating the sensing process and compelling the recognition model to fill in the missing information.\n
DTSTART:20230613T180000Z
DTEND:20230613T200000Z
LOCATION:Room 603\, McConnell Engineering Building\, CA\, QC\, Montreal\, H3A 0E9\, 3480 rue University
SUMMARY:PhD defence of Samrudhdhi Rangrej – Visual Hard Attention Models Under Partial Observability
URL:/ece/channels/event/phd-defence-samrudhdhi-rangrej-visual-hard-attention-models-under-partial-observability-348659
END:VEVENT
END:VCALENDAR