Mixed reality headsets and applications are gaining popularity for entertainment and educational purposes. They can be used to acquire new skills or to assist users in operating complex devices. Detecting and localizing objects in the user’s vicinity is necessary to create convincing and effective tools. So far, holograms are mostly positioned on real objects using real-world image markers, which must be printed and placed manually and might break the mixed reality experience. Accurately positioning objects with imaging sensors without markers is challenging and remains an active research topic. In this study, we explore the effectiveness of the RGB, reflectivity and depth imaging sensors of the HoloLens mixed reality headset in natural, artificial and dim lighting setups on a new, manually designed dataset. We also examine how performance differs in the dim lighting setup and whether natural or artificial light influences the infrared-based imaging sensors. Our imaging subject is a musical keyboard in an indoor setting. We conclude that the RGB, reflectivity and depth imaging sensors are all candidates for computer vision tasks on the HoloLens mixed reality headset. However, no specialized detectors for the reflectivity and depth fisheye lenses were found in the literature. Using depth-aware and 6D pose estimation detectors requires prior experience and is time-consuming to implement. Various cloud-based computer vision portals, such as CustomVision.ai, enable training an object detector but offer limited metrics and export functionality. Based on the answers to the research questions, we conclude that the HoloLens imaging sensors are able to solve computer vision tasks in natural, artificial and dim lighting setups.
A reflectivity imaging sensor will outperform an RGB sensor for object detection tasks in natural, artificial and dim lighting setups, while more research is required on the performance of the depth imaging sensor. Imaging sensors, lighting setups and detectors should be planned carefully when implementing new mixed reality applications to ensure optimal performance.
A musical keyboard object detection dataset was created. To compare the HoloLens imaging sensors, the dataset consists of 50 randomly selected images per sensor per lighting setup, for a total of 450 samples (3 sensors × 3 lighting setups × 50 images). The samples include color, depth and reflectivity images recorded in natural, artificial and dim lighting setups. [Dataset]
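The sampling scheme described above can be illustrated with a minimal sketch. It is not the authors' actual tooling: the function name `select_samples`, the placeholder frame names, the pool of 200 frames per combination and the fixed seed are all assumptions made for illustration. Only the three sensors, the three lighting setups and the 50 images per combination come from the description.

```python
import random
from itertools import product

SENSORS = ("color", "depth", "reflectivity")
LIGHTING = ("natural", "artificial", "dim")
SAMPLES_PER_GROUP = 50


def select_samples(recordings, per_group=SAMPLES_PER_GROUP, seed=0):
    """Draw `per_group` random frames for every (sensor, lighting) pair.

    `recordings` maps (sensor, lighting) to a list of frame identifiers.
    This helper is hypothetical and only illustrates the sampling idea.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    selected = {}
    for sensor, lighting in product(SENSORS, LIGHTING):
        frames = recordings[(sensor, lighting)]
        # sample() draws without replacement, so no frame is picked twice
        selected[(sensor, lighting)] = rng.sample(frames, per_group)
    return selected


# Hypothetical recordings: 200 placeholder frame names per combination.
recordings = {
    (s, l): [f"{s}_{l}_{i:04d}.png" for i in range(200)]
    for s, l in product(SENSORS, LIGHTING)
}
dataset = select_samples(recordings)
total = sum(len(frames) for frames in dataset.values())
print(total)  # 3 sensors x 3 lighting setups x 50 images = 450
```

Sampling without replacement within each sensor and lighting combination keeps the nine groups balanced, which matters when detector performance is compared across sensors and lighting setups.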