BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20260611T040911EDT-9566mXVF53@132.216.98.100
DTSTAMP:20260611T080911Z
DESCRIPTION:Abstract\n\nThe structure of a neural network is one of the cri
 tical determinants of its performance and dictates how information flows. 
 The architecture determines the representational capacity\, inductive bias
 es\, and computational efficiency of a model. These factors directly dicta
 te how well the network learns from data\, how well it performs on a given
  task\, and how well it generalises to new examples.\n\nThe challenge of i
 dentifying a good neural network architecture has traditionally been solve
 d by manual design\, relying on expert intuition and costly trial-and-erro
 r empirical experimentation. Such handcrafted designs are commonly reused 
 for multiple applications to reduce the costs associated with structural d
 esign.\n\nWhile sharing the same architecture across a variety of tasks ma
 y bring potential benefits of native multimodality\, this choice can be in
 herently suboptimal for individual heterogeneous application requirements.
  As machine learning tasks grow more complex\, rigorous architectural desi
 gn is imperative to achieve robust generalisation and efficient utilisatio
 n of computational resources.\n\nThis thesis contributes several algorit
 hms for efficient architecture design and adaptation. It addresses both pa
 radigms for identifying structure: static\, where the goal is to identify 
 the most suitable fixed architecture for a given task\; and dynamic\, wher
 e the structure of a neural network is adjusted to identify a specialised 
 configuration for each data point of the task.\n\nIn the static paradigm\,
  two improvements are proposed in the field of Zero-Shot Neural Architectu
 re Search\, a domain where an architecture must be selected without traini
 ng any neural networks. The first contribution is a set of Zero-Shot ranki
 ng functions specifically designed for fast and memory-efficient evaluatio
 n of candidate architectures. They outperform state-of-the-art approaches 
 not only in terms of accuracy\, but also in terms of computational efficie
 ncy. The second contribution is a statistical comparison procedure designe
 d to achieve improved architecture search stability. This procedure is com
 patible with common search algorithms and effectively mitigates the variab
 ility of Zero-Shot ranking functions.\n\nIn the dynamic paradigm\,
  the thesis presents two novel sparse Mixture of Experts methods that effi
 ciently tackle the problem of expert specialisation. The first contributio
 n is a novel expert routing system that is designed to enforce the special
 isation of experts. The thesis demonstrates the benefits of the proposed s
 ystem by explaining how it can be used to achieve effective knowledge tran
 sfer from a teacher Graph Neural Network into a more efficient student Mix
 ture of Experts model\, outperforming existing Graph Neural Network knowle
 dge distillation approaches.\n\nThe second contribution is a Mixture of Ex
 perts that is specifically tailored to the graph domain. The thesis propos
 es a novel graph-structure-aware expert routing procedure that is used to 
 distribute inference tasks to a set of heterogeneous experts. This allows 
 the learning architecture to adapt to distinct graph patterns and exhibit 
 robustness across a wide variety of graph learning tasks.\n
DTSTART:20251001T173000Z
DTEND:20251001T193000Z
LOCATION:Room 603\, McConnell Engineering Building\, 3480 rue University\,
  Montreal\, QC\, H3A 0E9\, CA
SUMMARY:PhD defence of Pavel Rumiantsev – Efficient Algorithms for Automati
 c Structure Identification and Adaptation in Deep Neural Networks
URL:https://www.mcgill.ca/ece/channels/event/phd-defence-pavel-rumiantsev-e
 fficient-algorithms-automatic-structure-identification-and-adaptation-3677
 14
END:VEVENT
END:VCALENDAR
