A complete list of our publications is available on Google Scholar, please find below our highlighted work (all Open Access).
Dörrich, M., Fan, M., & Kist, A. M. (2023). Impact of Mixed Precision Techniques on Training and Inference Efficiency of Deep Neural Networks. IEEE Access.
Our brain works with fuzzy and low-precision logic. In this study, we investigate how mixed-precision techniques have an effect of the training and inference efficiency of encoder-decoder deep neural networks in a biomedical image segmentation task.
Kruse, E., Döllinger, M., Schützenberger, A., & Kist, A. M. (2023). GlottisNetV2: Temporal Glottal Midline Detection using Deep Convolutional Neural Networks. IEEE Journal of Translational Engineering in Health and Medicine.
Article / Code
Detecting the glottal midline accurately is crucial to assess quantitative parameters related to the symmetrical oscillation of the vocal folds. Here, we show how to use engineered neural networks to allow accurate midline detection simultaneously with glottal area segmentation.
Kist, A. M., Breininger, K., Dörrich, M., Dürr, S., Schützenberger, A., & Semmler, M. (2022). A single latent channel is sufficient for biomedical glottis segmentation. Scientific Reports, 12(1), 14292.
Article / Code / Data
In this paper, we show by mining an encoder-decoder deep neural network that a single latent channel image is sufficient for glottis segmentation. Further, we describe the function of the latent channel and how this affects downstream glottis segmentation.
Groh, R., Lei, Z., Martignetti, L., Li-Jessen, N. Y., & Kist, A. M. (2022). Efficient and Explainable Deep Neural Networks for Airway Symptom Detection in Support of Wearable Health Technology. Advanced Intelligent Systems, 2100284.
Article / Code
Airway symptom detection is crucial for monitoring chronic airway-related diseases. In this work, René is showing how data from neck surface accelerometers can be analyzed using a deep neural network in a computational constrained environment. He uses evolutionary neural architecture search to find an accurate, yet fast and deployable deep neural network for wearables.
Neubig, L., Groh, R., Kunduk, M., Larsen, D., Leonard, R., & Kist, A. M. (2022). Efficient Patient Orientation Detection in Videofluoroscopy Swallowing Studies. In Bildverarbeitung für die Medizin 2022 (pp. 129-134). Springer Vieweg, Wiesbaden.
Swallowing disorders are commonly examined using videofluoroscopy swallowing studies (VFSS). To comprehensively evaluate the swallowing process, a typical VFSS contains different patient orientations. Here, we show a systematic architectural scaling approach and found that an efficient ResNet18 variant is sufficient to classify a full VFSS recording of about 1800 frames in less than 14 s on conventional CPUs.
Commercially available systems for laryngeal high-speed videoendoscopy have not been further developed lately, are closed-source, and have only very limited analysis capacities. With OpenHSV, we provide a novel, award-winning, open hard- and software platform with DNN-powered online analysis.
Kist, A. M., and Michael Döllinger. Efficient biomedical image segmentation on Edge TPUs. Accepted as a short paper at Medical Imaging with Deep Learning (MIDL), 2021
We highlight at MIDL our work on semantic segmentation using Edge TPUs.
Kist, A. M., Zilker J., Döllinger M., & Semmler M. Feature-based image registration in structured light endoscopy. Accepted as full paper at Medical Imaging with Deep Learning (MIDL), 2021
Article / Code
Structured light endoscopy is a 3D-imaging method. However, the assignment of a projected laser grid to its reference is still tricky. We propose a Deep Learning-based image registration approach that achieves 91% accuracy on an ex vivo dataset.
Kist, A. M., Gómez P., Dubrovskiy D., Schlegel O., Kunduk M., Echternach M., Patel RR., Semmler M., Bohr C., Dürr S., Schützenberger A., & Döllinger M. A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis. J Speech Lang Hear R, 64 (6), 1889-1903.
The analysis of high-speed videoendoscopy data is crucial for voice quantification. In this paper, we describe the Glottis Analysis Tools (GAT). GAT has been actively developed in C# since 2010 and is used by dozens of labs worldwide.
Kist, A. M., and Michael Döllinger. Efficient Biomedical Image Segmentation on EdgeTPUs at Point of Care. IEEE Access 8 (2020): 139356-139366.
Deep neural networks are changing the way of biomedical diagnosis. For image segmentation, they can be very large and slow, especially on CPUs. In our recent study, we show that we can improve the glottis segmentation inference speed >79x fold by optimizing a popular biomedical segmentation network (U-Net) and porting it to the inexpensive EdgeTPU Hardware Accelerator.
Symmetry is important in vocal fold motion. The identification of the glottal midline is crucial to deriving symmetry from the glottal area. Here, we evaluate different approaches to determine the glottal midline and suggest a multi-task architecture, GlottisNet, that predicts both simultaneously, glottis segmentation and glottal midline.
Gómez, P.*, Kist, A. M.*, Schlegel, P., Berry, D. A., Chhetri, D. K., Dürr, S., … & Döllinger, M. (2020). BAGLS, a multihospital benchmark for automatic glottis segmentation. Scientific data, 7(1), 1-12.
Article / Code / Dataset
Glottis segmentation is a key component for analyzing the vocal fold vibrations. With BAGLS, we provide the first open, multihospital dataset for training and evaluating deep neural networks.
Kist, A. M., & Portugues, R. (2019). Optomotor swimming in larval zebrafish is driven by global whole-field visual motion and local light-dark transitions. Cell Reports, 29(3), 659-670.
Larval zebrafish swim when they perceive a whole-field moving stimulus. However, the underlying features that drive optomotor swimming remain elusive. Here, we show that larval zebrafish are predominantly driven by local light-dark transitions.