Multilevel Architectures and Algorithms in Deep Learning


Project Description

The design of deep neural networks (DNNs) and their training are central issues in machine learning, and progress in these areas is one of the driving forces behind the success of these technologies. Nevertheless, tedious experimentation and human interaction are often still needed during the learning process to find an appropriate network structure and corresponding hyperparameters that yield the desired behavior of a DNN.

The strategic goal of the proposed project is to provide algorithmic means to improve this situation. Our methodological approach relies on well-established mathematical techniques: identify fundamental algorithmic quantities and construct a-posteriori estimates for them; identify and consistently exploit an appropriate topological framework for the given problem class; and establish a multilevel structure for DNNs to account for the fact that a DNN only realizes a discrete approximation of a continuous nonlinear mapping relating input to output data. Combining these ideas with novel algorithmic control strategies and preconditioning, we will establish a new class of adaptive multilevel algorithms for deep learning, which not only optimize a fixed DNN but also adaptively refine and extend the DNN architecture during the optimization loop, as sketched below. This concept is not restricted to a particular network architecture, and we will study feedforward neural networks, ResNets, and PINNs as relevant examples.
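To make the idea of architecture refinement during training concrete, the following is a minimal sketch in PyTorch, not the project's actual method: a small residual network is extended by zero-initialized residual blocks whenever training stalls, so that each insertion leaves the realized network function unchanged. The stall test (a simple loss plateau) is an assumption standing in for the sensitivity-based a-posteriori estimates developed in the project, and all names (AdaptiveResNet, insert_block) are illustrative.

# Illustrative sketch only; the insertion criterion here is a crude
# loss-plateau test, not the project's sensitivity-based estimate.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """x -> x + h * tanh(W x + b); zero-initialized, so the block
    acts as the identity at insertion time."""
    def __init__(self, width, h=1.0):
        super().__init__()
        self.h = h
        self.linear = nn.Linear(width, width)
        nn.init.zeros_(self.linear.weight)
        nn.init.zeros_(self.linear.bias)

    def forward(self, x):
        return x + self.h * torch.tanh(self.linear(x))

class AdaptiveResNet(nn.Module):
    def __init__(self, in_dim, width, out_dim):
        super().__init__()
        self.inp = nn.Linear(in_dim, width)
        self.blocks = nn.ModuleList([ResidualBlock(width)])
        self.out = nn.Linear(width, out_dim)

    def forward(self, x):
        x = torch.tanh(self.inp(x))
        for block in self.blocks:
            x = block(x)
        return self.out(x)

    def insert_block(self, position, width):
        # Refinement step: the new block starts as the identity,
        # so the network function is preserved at insertion.
        self.blocks.insert(position, ResidualBlock(width))

# Toy regression problem.
torch.manual_seed(0)
X = torch.rand(256, 2)
y = torch.sin(3 * X.sum(dim=1, keepdim=True))

model = AdaptiveResNet(in_dim=2, width=16, out_dim=1)
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

prev_loss = float("inf")
for epoch in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    # Every 50 epochs, check for a plateau and refine the architecture.
    if epoch % 50 == 49:
        if prev_loss - loss.item() < 1e-3:
            model.insert_block(len(model.blocks), width=16)
            # Rebuild the optimizer so it sees the new parameters.
            opt = torch.optim.SGD(model.parameters(), lr=1e-2)
        prev_loss = loss.item()

In the project's setting, the decision of when and where to insert a block, as well as the step-size control and preconditioning, would be driven by the a-posteriori estimates mentioned above rather than by a fixed schedule.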

Our integrated approach will thus be able to replace many of the current manual tuning techniques with algorithmic strategies based on a-posteriori estimates. Moreover, our algorithms will reduce the computational effort of training as well as the size of the resulting DNN compared to a manually designed counterpart, making deep learning more efficient in several respects. Finally, in the long run, our algorithmic approach has the potential to enhance the reliability and interpretability of the resulting trained DNN.

Associated Publications

  • Evelyn Herberg, Roland Herzog, Frederik Köhne, Leonie Kreis and Anton Schiela
    Sensitivity-based layer insertion for residual and feedforward neural networks, 2023
    bibtex
    @ONLINE{HerbergHerzogKoehneKreisSchiela:2023:1,
      AUTHOR = {Herberg, Evelyn and Herzog, Roland and Köhne, Frederik and Kreis, Leonie and Schiela, Anton},
      DATE = {2023-11},
      EPRINT = {2311.15995},
      EPRINTTYPE = {arXiv},
      TITLE = {Sensitivity-based layer insertion for residual and feedforward neural networks},
    }
  • Roland Herzog, Frederik Köhne, Leonie Kreis and Anton Schiela
    Frobenius-type norms and inner products of matrices and linear maps with applications to neural network training, 2023
    bibtex
    @ONLINE{HerzogKoehneKreisSchiela:2023:1,
      AUTHOR = {Herzog, Roland and Köhne, Frederik and Kreis, Leonie and Schiela, Anton},
      DATE = {2023-11},
      EPRINT = {2311.15419},
      EPRINTTYPE = {arXiv},
      TITLE = {Frobenius-type norms and inner products of matrices and linear maps with applications to neural network training},
    }
  • Frederik Köhne, Leonie Kreis, Anton Schiela and Roland Herzog
    Adaptive step sizes for preconditioned stochastic gradient descent, 2023
    bibtex
    @ONLINE{KoehneKreisSchielaHerzog:2023:1,
      AUTHOR = {Köhne, Frederik and Kreis, Leonie and Schiela, Anton and Herzog, Roland},
      DATE = {2023-11},
      EPRINT = {2311.16956},
      EPRINTTYPE = {arXiv},
      TITLE = {Adaptive step sizes for preconditioned stochastic gradient descent},
    }
  • Leonie Kreis
    Multilevel Training of Residual Neural Networks
    M.Sc. Thesis, Heidelberg University, 2022
    bibtex
    @THESIS{Kreis:2022:1,
      AUTHOR = {Kreis, Leonie},
      INSTITUTION = {Heidelberg University},
      DATE = {2022-11-29},
      TITLE = {Multilevel Training of Residual Neural Networks},
      TYPE = {M.Sc. Thesis},
    }
