This section describes the main aspects leading to the visualization of the VOs superimposed on the real world. The workflow of augmented-reality-enabled systems is shown in Figure 1: once the virtual model has been rendered, tracking and registration are the two basic steps. Together, they provide the correct spatial positioning of the VOs with respect to the real world
[10]. This is possible because tracking detects and measures the spatial characteristics of an object. In AR, specifically, tracking denotes the operations needed to determine the device's six degrees of freedom, i.e., its 3D position and orientation within the environment, which are required to compute the user's point of view in real time. Tracking can be performed both outdoors and indoors; researchers have focused on the latter. Two methods of indoor tracking can then be distinguished: outside-in and inside-out. In the outside-in method, the sensors are placed at a stationary location in the environment and sense the device's position, often resorting to marker-based systems [11].
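To make the notion of a six-degrees-of-freedom pose concrete, it can be written as a rigid transform: a 3x3 rotation matrix together with a 3D translation vector. The following minimal Python/NumPy sketch (all names and values are illustrative, not taken from any particular AR SDK) shows how such a pose maps a point from device coordinates into world coordinates:

```python
import numpy as np

def pose_matrix(rotation: np.ndarray, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Hypothetical tracked pose: a 90 degree rotation about the vertical (y) axis,
# with the device located 2 m from the world origin along z.
theta = np.deg2rad(90.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.0, 0.0, 2.0])

T_world_device = pose_matrix(R, t)

# A point expressed in device coordinates, mapped into world coordinates
# (homogeneous coordinates, hence the trailing 1).
p_device = np.array([0.1, 0.0, 0.5, 1.0])
p_world = T_world_device @ p_device
print(p_world[:3])
```

Re-estimating this transform for every frame is precisely what the tracking step must accomplish.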
In the inside-out method, the camera or the sensors are mounted on the device itself, which determines how its own position changes relative to the environment, as in head-mounted displays (HMDs). Inside-out tracking can be marker-based or marker-less. The marker-based vision technique, making use of optical sensors, measures the device pose by recognizing fiducial markers placed in the environment. This method can also hyperlink physical objects to web-based content using graphic tags or automatic identification technologies such as radio-frequency-identification (RFId) systems [12].
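As a sketch of how marker-based pose measurement can work, the snippet below assumes a fiducial detector (for instance an ArUco-style detector) has already returned the pixel coordinates of the four corners of a square marker of known size; the pose then follows from OpenCV's standard perspective-n-point solver. The corner coordinates and camera intrinsics here are made up for illustration:

```python
import cv2
import numpy as np

# Physical marker: a 5 cm square, corners expressed in the marker's own
# coordinate frame (the z = 0 plane), in the order returned by the detector.
side = 0.05
object_points = np.array([[-side / 2,  side / 2, 0.0],
                          [ side / 2,  side / 2, 0.0],
                          [ side / 2, -side / 2, 0.0],
                          [-side / 2, -side / 2, 0.0]], dtype=np.float32)

# Hypothetical pixel coordinates of those corners in the current camera frame.
image_points = np.array([[310.0, 220.0],
                         [410.0, 225.0],
                         [405.0, 320.0],
                         [305.0, 315.0]], dtype=np.float32)

# Intrinsics from a prior camera calibration (illustrative values).
camera_matrix = np.array([[800.0,   0.0, 320.0],
                          [  0.0, 800.0, 240.0],
                          [  0.0,   0.0,   1.0]])
dist_coeffs = np.zeros(5)  # assume negligible lens distortion

# Perspective-n-point: the camera pose relative to the marker, returned as a
# Rodrigues rotation vector and a translation vector.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points,
                              camera_matrix, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix, if needed downstream
print(ok, tvec.ravel())
```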
The marker-less method, conversely, does not require fiducial markers: it relies on the recognition of distinctive features naturally present in the environment, which are used to localize the device by means of computer vision and image-processing techniques.
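A minimal sketch of the marker-less idea, assuming OpenCV's ORB detector as the feature extractor (many alternatives exist, and real systems add outlier rejection and a pose estimator on top of the raw matches):

```python
import cv2
import numpy as np

# Synthetic stand-ins for a reference view and the current camera frame;
# in a real system these would be live images of the environment.
reference = np.zeros((480, 640), dtype=np.uint8)
cv2.rectangle(reference, (200, 150), (440, 330), 255, -1)
cv2.circle(reference, (320, 240), 40, 0, -1)
live = np.roll(reference, shift=(12, 25), axis=(0, 1))  # slight camera motion

# Detect and describe natural features in both images.
orb = cv2.ORB_create(nfeatures=500)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_live, des_live = orb.detectAndCompute(live, None)

# Brute-force Hamming matching, the usual choice for binary ORB descriptors.
# The matched keypoint pairs would then feed a pose (or homography) estimator.
if des_ref is not None and des_live is not None:
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_ref, des_live), key=lambda m: m.distance)
    print(len(matches), "tentative correspondences")
```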
Registration, in turn, involves the matching and alignment of the tracked spatial features obtained from the real world (RW) with the corresponding points of the VOs, so as to reach an optimal overlap between them [1]. The accuracy of this process determines how faithfully the virtual content is represented over the real world and hence the natural appearance of the augmented image [13].
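One common closed-form way to perform such a landmark-based alignment, assuming rigid anatomy, is the least-squares rigid transform obtained via SVD (the Kabsch/Horn solution); the sketch below is a generic illustration, not the method of any specific system cited here:

```python
import numpy as np

def rigid_registration(model_pts, world_pts):
    """Least-squares rotation R and translation t mapping model_pts onto
    world_pts; both are (N, 3) arrays of corresponding landmarks."""
    cm, cw = model_pts.mean(axis=0), world_pts.mean(axis=0)
    H = (model_pts - cm).T @ (world_pts - cw)           # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                                  # proper rotation
    t = cw - R @ cm
    return R, t

# Hypothetical landmarks on the 3D virtual model and the same anatomical
# points as measured by the tracker in the real world.
model = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                  [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
angle = np.deg2rad(30.0)
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0,            0.0,           1.0]])
world = model @ Rz.T + np.array([0.5, -0.2, 1.0])

R, t = rigid_registration(model, world)
residual = np.linalg.norm(model @ R.T + t - world)      # ~0 on exact data
print(residual)
```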
The registration phase is closely connected to the tracking one. Depending on how the two are accomplished, the overall process is defined as manual, fully automatic or semiautomatic. The manual process combines manual registration with manual tracking: landmarks are identified both on the model and on the patient, and the preoperative 3D model displayed on the operative monitor is then manually oriented and resized to match the real images. The fully automatic process is the most complex one, especially with soft tissues: since real-world objects change shape over time, the same deformation must be applied to the VOs, because any deformation during surgery, caused by events such as respiration, can make the real-time registration inaccurate and thus the overlap between the 3D VOs and the ROs imprecise. Finally, the semiautomatic process associates automatic tracking with manual registration: landmark structures are identified automatically, both on the 3D model and on the real structures, while the overlay of the model, with its orienting and resizing, is performed manually. This is what differentiates the automatic process from the semiautomatic one: the latter overlays the AR images on the real scene statically and manually, whereas the former makes the 3D virtual models dynamically match the actual structures [1][14][15][16].
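The distinction can be summarized in code. In the sketch below, detect_landmarks and render_overlay are hypothetical placeholders (not a real AR API), and rigid_registration is the landmark alignment sketched earlier: the static variant reuses a transform fixed once by the operator, as in the manual and semiautomatic processes, while the dynamic variant re-estimates it on every frame, as in the fully automatic process:

```python
# Placeholders standing in for real tracking and rendering components.
def detect_landmarks(frame):
    ...  # hypothetical: return (N, 3) landmark positions from the tracker

def render_overlay(frame, R, t):
    ...  # hypothetical: draw the VOs on the frame using pose (R, t)

def static_overlay(frames, R, t):
    """Manual/semiautomatic style: R and t were set once by the operator
    and are reused unchanged, so the overlay does not follow the anatomy."""
    for frame in frames:
        yield render_overlay(frame, R, t)

def dynamic_overlay(frames, model_landmarks):
    """Fully automatic style: landmarks are re-detected and the registration
    re-estimated on every frame, so the VOs track motion and deformation
    (e.g., due to respiration)."""
    for frame in frames:
        world_landmarks = detect_landmarks(frame)            # tracking
        R, t = rigid_registration(model_landmarks, world_landmarks)
        yield render_overlay(frame, R, t)
```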
For the visualization of the VOs onto the real world, several AR display technologies exist, usually classified into head, body and world devices, depending on where they are located
[7][17]. World devices are located in a fixed place. This category includes desktop displays used as AR displays, and projector-based displays. The former are equipped with a webcam, a virtual mirror showing the scene framed by the camera, and a virtual showcase that allows the user to see the scene alongside additional information. Projector-based displays cast virtual objects directly onto the surfaces of the corresponding real-world objects. Body devices usually refer to handheld Android-based platforms, such as tablets or mobile phones. These devices use the camera to capture the actual scene in real time, while onboard sensors (e.g., gyroscopes, accelerometers and magnetometers) determine their rotation. They usually resort to fiducial image targets for the tracking-registration phase [18].
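The following sketch illustrates that last step on a handheld device: given the pose of a fiducial image target (for example from cv2.solvePnP, as above) and the calibrated camera intrinsics, the vertices of a virtual object are projected into the live frame so that the object appears anchored to the target. All values and the synthetic frame are illustrative:

```python
import cv2
import numpy as np

# Pose of the image target relative to the camera (illustrative values,
# e.g., as estimated with cv2.solvePnP) and calibrated intrinsics.
rvec = np.array([0.0, 0.2, 0.0])
tvec = np.array([0.0, 0.0, 0.6])
camera_matrix = np.array([[800.0,   0.0, 320.0],
                          [  0.0, 800.0, 240.0],
                          [  0.0,   0.0,   1.0]])
dist_coeffs = np.zeros(5)

# Vertices of a simple virtual object: a 4 cm cube sitting on the target.
s = 0.04
cube = np.float32([[0, 0, 0], [s, 0, 0], [s, s, 0], [0, s, 0],
                   [0, 0, -s], [s, 0, -s], [s, s, -s], [0, s, -s]])

# Project the 3D vertices into image coordinates and draw the wireframe,
# so the cube appears locked to the real-world target in the camera view.
pts, _ = cv2.projectPoints(cube, rvec, tvec, camera_matrix, dist_coeffs)
frame = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for a live frame
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4),
         (0, 4), (1, 5), (2, 6), (3, 7)]
for a, b in edges:
    pa = tuple(int(v) for v in pts[a].ravel())
    pb = tuple(int(v) for v in pts[b].ravel())
    cv2.line(frame, pa, pb, (0, 255, 0), 2)
```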
Finally, HMDs are near-eye displays: wearable devices in the form of glasses, with the advantage of leaving the hands free to perform other tasks. HMDs are mainly of two types: video see-through and optical see-through. In video see-through devices, the user sees the external real world through a camera whose frames are combined with the VOs; the external environment is recorded in real time and the final images, with the VOs overlaid, are presented on the displays in front of the user's eyes. Optical see-through devices, instead, use an optical combiner or holographic waveguides as lenses: images transmitted by a projector are overlaid on the same lenses through which the real world remains directly visible. In this way, the user directly views reality augmented with the VOs overlaid onto it
[7][19].
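A video see-through pipeline therefore reduces, at its core, to compositing a rendered VO layer over each camera frame. Below is a minimal, purely illustrative OpenCV sketch of that compositing step; real devices use proper alpha channels and per-eye rendering:

```python
import cv2
import numpy as np

def composite_video_see_through(camera_frame, rendered_layer):
    """Replace camera pixels wherever the rendered VO layer has content:
    a minimal stand-in for the blending done in a video see-through HMD."""
    gray = cv2.cvtColor(rendered_layer, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 1, 255, cv2.THRESH_BINARY)
    background = cv2.bitwise_and(camera_frame, camera_frame,
                                 mask=cv2.bitwise_not(mask))
    foreground = cv2.bitwise_and(rendered_layer, rendered_layer, mask=mask)
    return cv2.add(background, foreground)

# Illustrative inputs: a grey "camera frame" and a layer with one green VO.
camera = np.full((480, 640, 3), 90, dtype=np.uint8)
layer = np.zeros_like(camera)
cv2.circle(layer, (320, 240), 60, (0, 255, 0), -1)
augmented = composite_video_see_through(camera, layer)
```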
Figure 2 shows an example of an HMD.