Researchers from the Changchun Institute of Optics, Fine Mechanics and Physics of the Chinese Academy of Sciences have developed a novel autofocus method that harnesses the power of deep learning to dynamically select regions of interest in grayscale images. The study was published in the journal Sensors.
Traditional autofocus methods can be divided into active and passive categories. Active focusing relies on external sensors, increasing costs and complexity. In contrast, passive focusing assesses image quality to control focus, but fixed focusing windows and evaluation functions often lead to focusing failures, especially in complex scenes.
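To make the contrast with the new method concrete, here is a minimal Python sketch of the kind of fixed-window sharpness metric (variance of a discrete Laplacian) that conventional passive autofocus typically relies on; the window coordinates and the choice of metric are illustrative assumptions, not details from the paper.

```python
import numpy as np

def laplacian_variance(image: np.ndarray, window=None) -> float:
    """Contrast-based sharpness score: variance of a discrete Laplacian.

    `window` is an optional (row_slice, col_slice) fixed focusing window;
    traditional passive autofocus evaluates only this region.
    """
    patch = image[window] if window is not None else image
    patch = patch.astype(np.float64)
    # 4-neighbour discrete Laplacian via shifted differences
    lap = (
        np.roll(patch, 1, axis=0) + np.roll(patch, -1, axis=0)
        + np.roll(patch, 1, axis=1) + np.roll(patch, -1, axis=1)
        - 4.0 * patch
    )
    return float(lap[1:-1, 1:-1].var())  # ignore wrap-around border

def pick_best_frame(focus_stack):
    """Classic passive autofocus: step the lens, keep the frame with the
    highest score inside a pre-selected window (here an arbitrary 100x100 box)."""
    scores = [laplacian_variance(f, (slice(100, 200), slice(100, 200)))
              for f in focus_stack]
    return int(np.argmax(scores))
```

Because the window is fixed, a bright spot or a low-texture region inside it can dominate the score, which is exactly the failure mode the dynamic region-of-interest approach is meant to avoid.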
Moreover, the lack of comprehensive datasets has hindered the widespread adoption of deep learning methods in autofocus. Traditional image-based autofocus solutions also suffer from issues such as misjudging bright light spots and focal breathing, in which changes in image magnification and light intensity during focusing can distort the sharpness evaluation.
In this study, the researchers took a three-step approach to these problems. First, they constructed a comprehensive dataset of grayscale image sequences with continuous focusing adjustments, capturing diverse scenes from simple to complex and at varying focal lengths. This dataset serves as a valuable resource for training and evaluating autofocus algorithms.
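The announcement does not describe the data format, but a focus-stack dataset of this kind could be wrapped for training roughly as follows; the array shapes, label definition, and class name are assumptions made for illustration.

```python
import torch
from torch.utils.data import Dataset

class FocusStackDataset(Dataset):
    """Hypothetical loader for a dataset like the one described: each sample
    is a sequence of grayscale frames captured while sweeping focus, labelled
    with the index of the sharpest frame."""

    def __init__(self, stacks, best_indices):
        # stacks: list of numpy arrays shaped (num_frames, H, W), values in [0, 255]
        # best_indices: ground-truth in-focus frame index for each stack
        self.stacks = stacks
        self.best_indices = best_indices

    def __len__(self):
        return len(self.stacks)

    def __getitem__(self, i):
        frames = torch.from_numpy(self.stacks[i]).float() / 255.0
        frames = frames.unsqueeze(1)          # (num_frames, 1, H, W), single channel
        label = torch.tensor(self.best_indices[i], dtype=torch.long)
        return frames, label
```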
Next, the researchers transformed the autofocus problem into an ordinal regression task, proposing two focusing strategies: full-stack search and single-frame prediction. These strategies enable the network to adaptively focus on salient regions within the frame, eliminating the need for pre-selected focusing windows.
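As a rough illustration of what the ordinal-regression framing and the two strategies might look like in code, the sketch below encodes the in-focus position as cumulative binary targets and contrasts full-stack search with single-frame prediction. The number of focus steps, the encoding, and the helper functions are hypothetical, not the paper's exact formulation.

```python
import numpy as np

NUM_POSITIONS = 50  # assumed number of discrete focus steps

def ordinal_targets(best_idx: int, num_positions: int = NUM_POSITIONS):
    """Ordinal-regression encoding: K-1 binary targets, where target[k] = 1
    iff the in-focus position lies beyond step k."""
    return (np.arange(num_positions - 1) < best_idx).astype(np.float32)

def decode_ordinal(probs: np.ndarray) -> int:
    """Predicted focus step = number of thresholds judged 'exceeded'."""
    return int((probs > 0.5).sum())

def full_stack_search(frames, score_fn):
    """Strategy 1: run the network on every frame of the focus sweep and
    keep the frame scored closest to 'in focus'."""
    scores = [score_fn(f) for f in frames]
    return int(np.argmax(scores))

def single_frame_prediction(frame, predict_fn, current_step: int):
    """Strategy 2: from a single defocused frame, predict how many steps
    (and in which direction) to move the lens."""
    predicted_best = decode_ordinal(predict_fn(frame))
    return predicted_best - current_step  # signed lens adjustment
```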
Finally, the researchers designed a MobileViT network equipped with a linear self-attention mechanism. This lightweight yet powerful network achieves dynamic autofocus with minimal computational cost, ensuring fast and accurate focusing.
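The announcement does not spell out the exact attention variant, but a generic linear-complexity self-attention block, which replaces quadratic softmax attention with a kernel feature map so the cost grows linearly with sequence length, could look like this PyTorch sketch; the dimensions, head count, and feature map are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearSelfAttention(nn.Module):
    """Linear self-attention: softmax(QK^T)V is approximated by
    phi(Q) (phi(K)^T V), computed in O(n) rather than O(n^2)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.heads = heads
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        h = self.heads
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # reshape to (batch, heads, tokens, head_dim)
        q, k, v = (t.reshape(b, n, h, d // h).transpose(1, 2) for t in (q, k, v))
        q = F.elu(q) + 1.0   # positive feature map
        k = F.elu(k) + 1.0
        kv = torch.einsum("bhnd,bhne->bhde", k, v)            # aggregate keys/values
        z = 1.0 / (torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum("bhnd,bhde,bhn->bhne", q, kv, z)   # normalised output
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.proj(out)

# e.g. LinearSelfAttention(64)(torch.randn(2, 196, 64)) -> shape (2, 196, 64)
```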
Experiments showed that the full-stack search strategy achieved a mean absolute error (MAE) of 0.094 with a focusing time of 27.8 milliseconds, while the single-frame prediction strategy achieved an MAE of 0.142 in just 27.5 milliseconds. These results underscore the superior performance of the deep learning-based autofocus method.
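For reference, the MAE reported here is simply the average absolute difference between predicted and ground-truth focus positions; a minimal sketch, assuming positions are measured in discrete focus steps:

```python
import numpy as np

def focus_mae(predicted_steps, true_steps):
    """Mean absolute error between predicted and ground-truth focus positions."""
    predicted = np.asarray(predicted_steps, dtype=float)
    true = np.asarray(true_steps, dtype=float)
    return float(np.mean(np.abs(predicted - true)))

# e.g. focus_mae([12, 30, 7], [12, 31, 7]) -> 0.333...
```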
This deep learning-based autofocus method underscores the potential of AI in enhancing traditional imaging technologies. Future research could explore the application of this method to color images and video sequences. In addition, optimizing the network architecture and focusing strategies could lead to even faster and more accurate focusing.
More information:
Yao Wang et al, Deep Learning-Based Dynamic Region of Interest Autofocus Method for Grayscale Image, Sensors (2024). DOI: 10.3390/s24134336
Provided by Chinese Academy of Sciences