Design and Evaluation of Different Selection Metaphors for a Dissimilar Co-Embodied Avatar in Virtual Reality DIPLOMARBEIT zur Erlangung des akademischen Grades Diplom-Ingenieur im Rahmen des Studiums Visual Computing eingereicht von Gabriel Ratschiller, BSc Matrikelnummer 11778247 an der Fakultät für Informatik der Technischen Universität Wien Betreuung: Univ.Prof. Mag.rer.nat. Dr.techn. Hannes Kaufmann Mitwirkung: Univ.Ass. Hugo Brument, PhD Wien, 21. November 2024 Gabriel Ratschiller Hannes Kaufmann Technische Universität Wien A-1040 Wien Karlsplatz 13 Tel. +43-1-58801-0 www.tuwien.at Design and Evaluation of Different Selection Metaphors for a Dissimilar Co-Embodied Avatar in Virtual Reality DIPLOMA THESIS submitted in partial fulfillment of the requirements for the degree of Diplom-Ingenieur in Visual Computing by Gabriel Ratschiller, BSc Registration Number 11778247 to the Faculty of Informatics at the TU Wien Advisor: Univ.Prof. Mag.rer.nat. Dr.techn. Hannes Kaufmann Assistance: Univ.Ass. Hugo Brument, PhD Vienna, November 21, 2024 Gabriel Ratschiller Hannes Kaufmann Technische Universität Wien A-1040 Wien Karlsplatz 13 Tel. +43-1-58801-0 www.tuwien.at Erklärung zur Verfassung der Arbeit Gabriel Ratschiller, BSc Hiermit erkläre ich, dass ich diese Arbeit selbständig verfasst habe, dass ich die verwen- deten Quellen und Hilfsmittel vollständig angegeben habe und dass ich die Stellen der Arbeit – einschließlich Tabellen, Karten und Abbildungen –, die anderen Werken oder dem Internet im Wortlaut oder dem Sinn nach entnommen sind, auf jeden Fall unter Angabe der Quelle als Entlehnung kenntlich gemacht habe. Wien, 21. November 2024 Gabriel Ratschiller v Danksagung Ich möchte diese Gelegenheit nutzen, um mich bei allen, die mich während meines Studi- ums und insbesondere während meiner Diplomarbeit unterstützt haben, zu bedanken. Mein Dank gilt zunächst meinen überaus engagierten Betreuern, Hannes Kaufmann und Hugo Brument, die mich während der gesamten Zeit der Diplomarbeit mit wertvollem Input unterstützt haben. Der unkomplizierte E-Mail-Austausch und die regelmäßigen Meetings waren eine enorme Hilfe bei der Durchführung der Arbeit. Mein besonderer Dank gilt meinen wunderbaren Eltern, die nicht nur mein Studium finanziell unterstützt haben, sondern mir auch stets das Gefühl gegeben haben, das Richtige zu tun, und nie an meinem Erfolg gezweifelt haben. Des Weiteren danke ich meinem Bruder, der sich nie gescheut hat, mir die eine oder andere (mehr oder weniger) wertvolle Lebensweisheit mit auf den Weg zu geben. Ein besonderer Dank gilt auch meiner Freundin, die mich vom ersten Tag meiner Di- plomarbeit an unterstützt und an mich geglaubt hat. Zum Schluss möchte ich mich auch bei allen TeilnehmerInnen der Nutzerstudie bedanken. Diese Arbeit ist meinen beiden verstorbenen Omas gewidmet. vii Acknowledgements I would like to take this opportunity to thank everyone who has supported me during my studies and especially during my diploma thesis. First of all, I would like to thank my extremely dedicated supervisors, Hannes Kaufmann and Hugo Brument, who provided me with valuable input throughout the entire time I was writing my thesis. The uncomplicated exchange of emails and regular meetings were an enormous help in completing the thesis. My special thanks go to my wonderful parents, who not only supported my studies financially but also always made me feel that I was doing the right thing and never doubted my success. 
I would also like to thank my brother, who has never shied away from giving me some (more or less) valuable life wisdom along the way. Special thanks also go to my girlfriend, who supported and believed in me from day one of my diploma thesis. Finally, I would like to thank all the participants of the user study. This thesis is dedicated to my two late grandmothers. ix Kurzfassung Die virtuelle Realität (VR) ermöglicht immersive Erfahrungen, bei denen die Nutzer über Avatare mit computergenerierten Umgebungen interagieren. Diese Interaktion basiert auf dem “Embodiment“, dem psychologischen Gefühl, in einem virtuellen Körper zu “stecken“. Dieses “Gefühl der Verkörperung“ (Sense of Embodiment, SoE) trägt wesentlich dazu bei, dass die Nutzer die VR-Umgebung als realistisch empfinden und erleben. Herkömm- liche VR-Schnittstellen stützen sich jedoch häufig auf menschenbezogene Auswahl- und Manipulationsmetaphern (z. B. Hand oder Cursor), die sich möglicherweise nicht gut auf nicht-menschliche Avatare übertragen lassen, das sind Avatare, die sich (manchmal stark) von der menschlichen Anatomie unterscheiden. Vor allem “Co-Embodiment“-Szenarien, bei denen sich mehrere Nutzer die Kontrolle über einen einzigen Avatar teilen, erfordern geeignete Interaktionsmetaphern, da sich herkömmliche Metaphern oft als unzureichend erweisen. Diese Arbeit beschreibt den Entwurf, die Implementierung und die Evaluierung von Auswahl- und Manipulationsmetaphern in VR für einen nicht-menschlichen, ko-verkörperten Avatar. Eine Interaktionsplattform wurde innerhalb einer bestehenden Multi-User-VR- Umgebung mit der Unity3D Software entwickelt. Diese Plattform ermöglicht es den Nutzern, die Kontrolle über einen nicht-menschlichen Avatar zu teilen und mit virtuellen Objekten in der Umgebung zu interagieren. Es wurden verschiedene Interaktionsme- taphern entwickelt, die gut zu den Fähigkeiten des nicht-menschlichen Avatars passen. Insbesondere wird in dieser Arbeit untersucht, wie sich verschiedene Auswahl- und Interak- tionsmetaphern auf die Benutzererfahrung, das SoE und die Ko-Präsenz in VR auswirken. Darüber hinaus wurden die implementierten Interaktionsmetaphern in einer Benutzer- studie evaluiert, um zu verstehen, welche Metaphern geeignet sind und für zukünftige Entwicklungen der gemeinsamen Steuerung von ko-verkörperten, nicht-menschlichen Avataren in VR in Betracht gezogen werden können. Die Ergebnisse zeigten, dass es in der Tat Unterschiede zwischen den Interaktionsmetaphern in Bezug auf die Leistung, die Benutzererfahrung und das SoE gibt. Sie zeigten auch, dass die entwickelten Interakti- onsmetaphern für den nicht-menschlichen Avatar geeignet sind und eine Grundlage für zukünftige Weiterentwicklungen bilden. xi Abstract Virtual Reality (VR) enables immersive experiences in which users interact with computer- generated environments through avatars. This interaction is based on embodiment, the psychological feeling of “being in“ a virtual body. The user’s perception of realism and experience in the VR environment is strongly influenced by the “Sense of Embodiment“ (SoE). However, traditional VR interfaces often rely on human-centric selection and manipulation metaphors (e.g., hand or raycast) that may not translate well to dissimilar avatars, i.e., avatars that differ (sometimes greatly) from human anatomy. In particular, co-embodied scenarios, where multiple users share control of a single avatar, require appropriate selection and interaction metaphors, as traditional metaphors often prove inadequate. 
This thesis describes the design, implementation, and evaluation of selection and manipu- lation metaphors in VR for a dissimilar co-embodied avatar. An interaction platform was developed within an existing multi-user VR environment built using Unity3D software. This platform allows users to share control of a dissimilar avatar and interact with virtual objects in the environment. Several interaction metaphors have been developed that fit well with the capabilities of the dissimilar avatar. In particular, this thesis investigates how different interaction metaphors affect user experience, SoE, and co-presence in VR. Furthermore, the implemented interaction metaphors were evaluated in a user study in order to understand which metaphors are suitable and can be considered for future developments of shared control of co-embodied dissimilar avatars in VR. The results showed that there are indeed differences between the interaction metaphors in terms of performance, sense of agency, and SoE. They also showed that the developed interaction metaphors are suitable for the dissimilar avatar and form a basis for further development in the future. xiii Contents Kurzfassung xi Abstract xiii Contents xv 1 Introduction 1 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Aim of the Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2 Related Work 5 2.1 Virtual Reality and Avatars . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Body perception and Sense of Embodiment . . . . . . . . . . . . . . . 6 2.3 Dissimilar avatars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.4 Co-Embodiment in Virtual Reality . . . . . . . . . . . . . . . . . . . . 12 2.5 Inverse Kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.6 Selection Metaphors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 Avatar Design and Implementation 21 3.1 Overview and Requirements . . . . . . . . . . . . . . . . . . . . . . . . 21 3.2 Virtual Reality Environment . . . . . . . . . . . . . . . . . . . . . . . 22 3.3 Selection Metaphors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.4 FABRIK IK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.5 Evaluation Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4 Evaluation 43 4.1 Study Design and Hypotheses . . . . . . . . . . . . . . . . . . . . . . . 43 4.2 Technical Setup and Equipment . . . . . . . . . . . . . . . . . . . . . . 45 4.3 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 4.4 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 4.5 Data Collection and Analysis . . . . . . . . . . . . . . . . . . . . . . . 49 4.6 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 xv 5 Discussion 63 5.1 User Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 5.2 Sense of Embodiment and Agency . . . . . . . . . . . . . . . . . . . . 64 5.3 User preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 5.5 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 6 Conclusion 69 6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . 69 6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 List of Figures 71 List of Tables 73 Bibliography 75 Appendix 80 CHAPTER 1 Introduction Virtual Reality (VR) is a tool that allows users to immerse themselves in a computer- generated world and interact with the environment using a head-mounted display (HMD) and hand-held VR controllers. VR hardware and devices have grown more accessible to consumers in recent years and are widely used in both the commercial and consumer sectors. As VR hardware becomes more widely available, its use has expanded to include healthcare, education and entertainment [5]. One of the key success factors of VR is its ability to create experiences that cannot be created with traditional 2D screens. By combining the virtual and real worlds, users can immerse themselves in a fantastic experience and escape the physics and conventions of the real world. Instead of only viewing a virtual environment, VR creates the impression that one is actually a part of it. A key factor that defines whether a user is genuinely “immersed“ in the virtual world is their “Sense of Presence“ (SoP) [8]. SoP is defined as the user’s subjective feeling of “being there“ and is influenced by both user characteristics (e.g., previous VR experience, concentration, tendency to motion sickness, and expectations of the VR experience) and media characteristics (e.g., audiovisual presentation of the content, response time, or size of the visual field). The stronger the SoP, the more the user feels part of the virtual world. To give the user the feeling of being in and interacting with the virtual world, the user is often placed inside a virtual representation by replacing parts of their body with a virtual body. This virtual body, also known as an “avatar“, serves as a visual orientation aid for users who are wearing an HMD and cannot see their real body. This virtual representation therefore assists the user in interacting with the virtual environment and enables them to execute their own movements precisely. The user’s movements are mapped to the limbs of the virtual avatar and animated based on the user’s motions. The better the translation from the user’s motions to the animated virtual limbs is, i.e., the more realistic the virtual movements are, the more the user feels moving in the virtual environment and the stronger the immersion. This feeling of “being in“ the virtual body 1 1. Introduction is known as embodiment [26]. Every VR experience heavily relies on this “Sense of Embodiment“ (SoE), since a poor SoE due to inadequate animations and representations of the user’s movements in VR can ruin the illusion of presence in the virtual world. This can negatively affect the VR experience and even cause physical symptoms such as cybersickness, dizziness, or nausea [40, 10, 15, 27]. Traditionally, VR avatars represent the user’s physical body and allow the user to control them based on familiar movements. But this technology goes even beyond this paradigm. Avatars can take on completely different virtual forms from the user’s physical form. The spectrum ranges from subtle variations such as altered skin tones or facial expressions to more extreme modifications that transform the user into an animal, a fantasy creature, or beings with unusual body parts [7]. In addition to their unusual appearance, avatars can also differ from human avatars in how the user controls them and which virtual body parts are controlled with which user interactions. 
Such dissimilar avatars have been studied more and more frequently recently as they can be relevant for use in therapeutic, social, or educational applications [31, 49]. Another emerging topic is co-embodiment, which refers to a single virtual avatar controlled by multiple users or agents simultaneously [16]. Both users’ movements can be averaged and mapped to the avatar, or each user can control a part of the avatar individually. This collaborative and shared VR experience can be used in areas such as rehabilitation, skills training and support for disabled users [20]. Studies show that this form of VR can be particularly beneficial in a learner-teacher environment, where a teacher controls the same avatar as a student, creating a shared learning experience [29]. 1.1 Motivation With the ability to simulate all avatars in virtual worlds, creating experiences that deeply immerse the user and promote a high SoE is challenging. Avatars must have appropriate proportions, reflect the user’s movements well, and not make unexpected movements; otherwise, the VR experience could be significantly compromised. For this reason, much research has focused on SoE, particularly with human-like avatars [21, 31]. In recent years, however, a new area of research has emerged that focuses on dissimilar avatars. This field complements the research on human-like avatars with new ways of integrating non-human avatars into the use of VR. When designing interactions with objects and environments for such (human-like or dissimilar) avatars, there are challenges to overcome to make the experience as user- friendly as possible and to ensure a strong SoE. A key challenge lies in the design of selection metaphors, i.e., the methods used for object selection and interaction in VR. Research shows that a well-designed selection metaphor can significantly impact SoE and usability [12, 22]. When co-embodied dissimilar avatars are involved, the requirement for intuitive interaction metaphors that generate a strong SoE becomes even more relevant. Careful thought must 2 1.2. Aim of the Work be given to how co-embodied avatars interact with their surroundings in co-embodiment settings and how to design such co-embodiment scenarios. As the user suddenly no longer has their familiar physical form, novel selection metaphors need to be developed that use the virtual avatar’s physical capabilities to enable intuitive interactions. Traditional VR interfaces typically rely on human-centric hand or cursor metaphors for selection [2, 33]. While proven useful for human-like avatars, they may not work well with dissimilar avatars with different physical capabilities. Therefore, there is a need for intuitive selection metaphors in co-embodied scenarios with dissimilar avatars, especially because of the unusual freedom of movement and the potential for mutual influence between the users. This lack of intuitive selection methods can hinder the user experience and limit the possibility of co-embodied interaction with dissimilar avatars in VR, thus reducing the user’s SoE. Co-embodiment research is still in its early stages, especially when it comes to dissimilar avatars. Although co-embodiment studies have been conducted with human-like avatars, the study of dissimilar avatars is a novel area of investigation that introduces new design considerations and opportunities. We want to address this lack of research in our thesis. 
1.2 Aim of the Work This thesis investigates the design and evaluation of suitable selection and manipulation metaphors for a dissimilar avatar in a co-embodied interaction in VR. To achieve this, an experimental platform is developed within an existing multi-user VR framework with a dissimilar avatar with four arms and two heads. This allows users to share control of a dissimilar avatar and interact with virtual objects in the environment. We develop two interaction metaphors specifically tailored to the capabilities of the dissimilar avatar, whose movements are synchronized over the network to ensure a real-time collaborative interaction setting for both users. Further, we assess how these interaction metaphors can influence the user’s agency and SoE. To evaluate the effectiveness of these metaphors, we develop a series of interaction tasks in the VR environment. These tasks aim to assess the usability, efficacy, and user experience of each interaction metaphor. The goal is to evaluate how well the metaphors assist users in completing tasks that require them to select and manipulate virtual objects while co-embodied in the dissimilar avatar. Finally, a user study is conducted to collect objective and subjective data composed of performance metrics (such as task completion time), user preferences for the proposed selection metaphors, and user experience (such as sense of agency or SoE). By examining this data, we want to determine the most practical and efficient techniques for selecting co-embodied objects with dissimilar avatars and investigate the variables affecting SoE in this particular VR interaction setting. 3 1. Introduction 1.3 Methodology First, a structured literature review of existing research on selection and interaction metaphors in VR is conducted. The review focuses explicitly on metaphors relevant to dissimilar co-embodied avatars. Further, literature on the current state of the art about the influence of various selection metaphors on user agency and interaction within VR environments featuring co-embodied avatars is examined. In addition, the latest research in the fields of VR and SoE, dissimilar avatars, co-embodiment, and inverse kinematics is presented and discussed. Using the Unity3D engine, an experimental platform is designed and developed in which users co-embody and control a dissimilar avatar. To interact with objects in the VR environment with this avatar, two interaction metaphors for selecting and manipulating objects are designed and developed. These interaction metaphors are adapted to the conditions of the dissimilar avatar and exploit the strengths of the avatar’s morphology. Users then perform different tasks using different interaction metaphors. In addition, the movements when performing the tasks with the interaction metaphors are synchronized. To achieve this, both users are connected via the network and can control and interact with the avatar in real-time. The usability and user experience of the developed selection metaphors are evaluated with a user study, collecting objective and subjective data. The user study consists of measured user performance data when performing a selection task as well as questionnaires. The user performs a series of tasks in different body configurations, providing feedback on the sense of agency and the preferred interaction metaphor. 
The collected data on user performance (e.g., task completion time), as well as the subjective data gathered in the questionnaires, are then analyzed to identify the most usable and preferred selection metaphor for dissimilar co-embodied avatars. 1.4 Structure of the Thesis This thesis is structured as follows: Chapter 2 provides a comprehensive overview of the existing literature on the topics covered in this thesis. First, a general overview of VR and the SoE is given, followed by the state of the art on dissimilar avatars as well as co-embodiment in VR. Furthermore, a discussion about Inverse Kinematics (IK) is given before moving on to literature on selection metaphors for VR interaction. Chapter 3 describes the design and implementation of the avatar and the VR environment, including the developed interaction metaphors. It goes into detail about implementing the interaction system and explains how the interaction with the VR environment works. Finally, the design and implementation of the user study tasks are explained. In chapter 4, the user study conducted is described, including the design of the user study, the experimental procedures, and the data collection methods. The evaluated results of the data analysis are summarized. Chapter 5 discusses the user study results and highlights the limitations of the experimental platform described. Finally, chapter 6 summarizes the thesis and provides an outlook for future work. 4 CHAPTER 2 Related Work This chapter provides an overview of the state of the art of the concepts and technologies relevant to this thesis. First, we introduce VR and the role of avatars (section 2.1), then discuss the concept of body perception and SoE, and related influencing factors such as self-location, body ownership, and agency (section 2.2). We then present current research on dissimilar avatars and the idea behind exploring them (section 2.3), followed by an overview of co-embodiment in VR (section 2.4). In the section on Inverse Kinematics (IK) (section 2.5), we give an overview of the IK algorithm, which has also proven useful for animating the limbs of a dissimilar avatar. Finally, an overview of VR selection metaphors is given (section 2.6), with a focus on metaphors relevant for controlling dissimilar avatars. 2.1 Virtual Reality and Avatars VR technology allows users to enjoy interactive experiences and become completely immersed in a fictional world. Recent significant advances in the technology and the development of more powerful and affordable hardware have made it available to a wider range of consumers. Users can interact with virtual worlds and have fictional experiences that feel real thanks to wearable controllers and head-mounted displays (HMDs). Various industries, including gaming, education, training, entertainment, and even healthcare, are using VR applications [5]. Virtual representations of people are often one of the many building blocks of a VR experience. These representations can exist independently of the user in the virtual world, or they can be directly controlled by the user. In this case, it is called an “avatar“. An avatar is their virtual representation in the digital world. Avatars are an integral part of VR experiences and can be anything from realistic human-like figures to non-human beings and fantastical creatures [31, 23, 30]. They are versatile and allow users to express 5 2. 
Related Work their identity (reflecting their personal style if customized), interact with other users in virtual environments, and, most importantly, enhance immersion. 2.2 Body perception and Sense of Embodiment 2.2.1 Body perception An important concept in any VR experience is the concept of body perception. This describes the way people perceive and experience their own virtual bodies [38]. Depending on the characteristics of the virtual avatar and the expectations a person has when interacting with the virtual version of themselves, this feeling can be greatly affected by the use of VR technology and can vary in intensity [48]. Neyret et al. [38] investigated how people perceive and evaluate the shape of an avatar based on their own body size compared to a body based on their own body shape estimations and their ideal body in a VR environment. These three distinct avatars were made by the researchers, and the participants were required to observe and assess them once in the first-person perspective and once in the third-person perspective. The results of the study show that there is a difference in body perception depending on the setting the users are immersed in. For example, female participants in the study rated their real bodies as more attractive when it was viewed from the third-person perspective. Another finding is that viewing one’s own body from the third-person perspective helped to reduce body dissatisfaction. Another study looked at body weight perception in relation to self-perception and the perception of others [48]. Participants were asked to estimate the body weight of a virtual avatar once when they themselves embodied a photorealistic avatar and performed movements in front of a mirror and once when they watched an avatar controlled by another person perform the movements. It turned out that participants underestimated the weight of the virtual avatar when they controlled it themselves, compared to when they only observed another avatar. These results suggested that one’s own body perception in VR depends on how the virtual avatar is presented and how personalized it is. Furthermore, one’s own body perception also influences body perception in the virtual world. However, body perception can also be manipulated by giving people the feeling that an object that does not actually belong to their body is perceived as their own body. In the famous rubber hand illusion experiment, Botvinick et al. [6] showed that participants felt a sense of ownership towards a fake rubber hand after stimulating it visually and tactilely in synchrony with their own hidden hand. This shows that a high level of ownership can be an important factor in the success of virtual simulations and experiences. 2.2.2 Sense of Embodiment The “Sense of Embodiment“ (SoE), which refers to the feeling users have when using VR applications that the virtual body is their own and that they are in control of it, plays 6 2.2. Body perception and Sense of Embodiment an important role in the VR experience and is related to the topic of body perception in VR. Kilteni et al. [26] tried to generalize the definition of this concept of embodiment and described it as a combination of three sub-components: the sense of self-location (SL), the sense of agency (AG), and the sense of body ownership (BO). Self-location represents the feeling of “being in“ the virtual body, while agency refers to the feeling of being able to control and direct the actions of the virtual body. 
Finally, body ownership describes the feeling that an artificial limb or an entire body belongs to the person immersed in VR. As users often experience SoE unconsciously to a certain extent when being embodied in a virtual avatar, these three components of embodiment play a crucial role in the overall VR experience and immersion. There are several approaches to measuring SoE in users [26]. Measures for addressing the SoE can be:
• Questionnaires (for all three sub-components).
• Physiological response to threat, which can be measured using the “Skin Conductance Response” test, i.e., a short-term change in the electrical conductance of the skin in response to a threat (SL and BO).
• Estimation of body position (SL).
• Estimation of body parts’ size (BO).
• Proprioceptive estimations, by letting users assess their ability to move during appropriate movement tasks or through visual feedback, e.g., assess what they see in a mirror (BO).
Various factors can influence the SoE and its sub-components. The sense of self-location can be impaired if users have the feeling that they are no longer in the virtual body and have out-of-body experiences (OBE) [26]. Studies have also shown that self-location suffers when users look at the avatar from a third-person perspective rather than a first-person perspective [11]. The strength of the sense of agency, on the other hand, depends largely on how quickly the visual feedback is displayed after the execution of a movement. If there is a time delay between the user’s actions and the visual feedback, the sense of agency suffers and, as a result, the SoE [11]. The agency is not only influenced by the synchronicity between the user’s actions and the resulting visualization but can also be influenced by the embodiment of the tools that the user controls [26]. The sense of body ownership is influenced by a combination of bottom-up and top-down influences. Bottom-up information refers to visuo-tactile stimuli, i.e., stimuli that are transmitted from the sensory organs to the nervous system and brain. Top-down refers to the processing of sensory stimuli, e.g., seeing a virtual avatar and judging whether it represents one’s own body or not [26]. The SoE and the factors that influence it have been well studied.
Figure 2.1: The three body representations to test body ownership (first-person perspective): (a) no body visibility, (b) low body visibility, (c) medium body visibility [34].
Porssut et al. [40] investigated how reaching the articular limits of a virtual arm can negatively affect a person’s SoE in VR. The participants were asked to hold a physical cylinder mapped to a virtual cylinder and had to perform reaching movements with varying degrees of distortion between their real and virtual arms. The experimenter measured their perception of the distortion and their SoE while performing the tasks. The results showed that negative distortions (which hinder movement by mapping the virtual hand behind the real hand position) are more easily detected than positive distortions (where the virtual hand is ahead of the real hand, helping to reach objects). Further, reaching the articular limit reduced the participants’ SoE. The authors stated that this is because reaching the articular limit makes users more aware of a discrepancy between their real and virtual arm movements, which leads to a lower detection threshold for movement distortions and a reduced SoE.
A further study examined the impact of different virtual hand representations on the factors user performance, sense of agency, and sense of ownership when interacting with virtual objects and obstacles within a VR environment [33]. Participants had to complete a task that involved selecting and positioning a cube in a virtual environment having different virtual hand representations (sphere, controller, hand) and with different obstacle conditions. The authors found that the hand representation led to the strongest sense of ownership. The controller representation, on the other hand, performed best for precise positioning tasks, leading to good user performance. The results of the mentioned studies demonstrated that a realistic and comprehensible depiction and animation of the avatar in VR has a significant influence on the degree of immersion of the user. Furthermore, the realism of the virtual avatar’s representation is of greater importance than the number of visible limbs, as found out by Lugrin et al. [34]. They stated that the number of visible parts of an avatar’s body has a negligible effect on virtual body ownership, immersion, or performance, and that “sensorial immersion and a well-calibrated motion control“ [34] are more important for a strong immersion. 8 2.3. Dissimilar avatars (a) Human avatar. (b) Robot avatar. (c) Block-like avatar. Figure 2.2: Types of virtual avatars. (a) and (b) are anthropomorphic avatars, (c) is a non-anthropomorphic (dissimilar) avatar [24]. In the study, there were three body conditions tested: no visible body, low visible body (hands and forearms), and medium visible body (head, torso, arms) (see Figure 2.1). In the “No Body“ condition, virtual controllers were shown in the absence of a visible avatar body. In the “Low“ and “Medium“ conditions, they gradually increased the number of visible body parts of the avatar, with the “Medium“ condition having a similarity to a human body of about 50%. However, it should be noted that the “No Body“ condition also showed virtual controllers to assist the user in performing the tasks. Therefore, they noted, it may not have accurately tested the complete absence of a virtual body. 2.3 Dissimilar avatars Most studies on VR, and on SoE in particular, usually explored anthropomorphic avatars or avatars with a human representation and rarely non-human (dissimilar) representation. This is understandable, as studies have shown that users prefer anthropomorphic represen- tations of themselves in VR to non-anthropomorphic ones [24]. It makes little difference whether the virtual avatar looks like a human or only has human-like characteristics (e.g., a robot with the same number of limbs and body proportions as a human); these avatars are preferred by users over block-like avatars (Figure 2.2). Furthermore, research has indicated that the discrepancy between the physical appearance and the virtual avatar can impact self-perception, user experience, and task performance [24, 42]. However, in VR, there are no limits to the appearance of an avatar. Therefore, in recent years there has been an increasing amount of research exploring the potential of dissimilar avatars. The use of such avatars in VR raises a number of questions about the user’s SoE and the impact on the user experience and agency. Several studies have already explored avatars with structural differences. One of the early studies in 2013 investigated body ownership and control of a humanoid avatar that has a tail [45] (see Figure 2.3a). 
Participants were divided into two groups, where the first group could control the tail with hip movements, while the other group had a tail that could not be controlled and moved randomly. Participants with the controllable tail felt a greater sense of ownership and control over the virtual body with the extra appendage. 9 2. Related Work Even though it is not a normal part of the human body, they were also able to learn to control the avatar’s tail. Participants were also more anxious and tried to avoid danger to the avatar’s body and tail if they had a greater sense of control over the avatar. (a) Human avatar with an additional tail [45]. (b) Six digit virtual hand avatar [21]. (c) Full-body (FB) animal avatar [31]. Figure 2.3: Different avatar representation reported in the literature. Another study came to a similar conclusion, in which Hoyet et al. [21] investigated how users perceived and accepted controlling a virtual avatar with structural differences, specifically a six-digit virtual hand (see Figure 2.3b). According to the authors, partici- pants felt a great sense of agency and ownership over the entire six-digit virtual hand, and they were more receptive to the extra animated finger as a component of the hand if it was animated than they were to its rigid, not-animated state. In addition to avatars with additional limbs, avatars that deviate completely from a human-like representation, such as animals or mythical creatures, were also studied [31, 23, 30, 49] (see Figure 2.3c). Players reported high enjoyment and SoP when embodying animal avatars in the VR games and appreciated the unique abilities and perspectives offered by the dissimilar avatars, which allowed them to do things they cannot do with human-like avatars. In addition, the differences between full-body (the player’s posture is mapped onto the 10 2.3. Dissimilar avatars Figure 2.4: Proposed categorization system for dissimilar avatars applied to a virtual hand [7]. entire virtual body) and half-body (the player’s lower body is mapped onto all the limbs of an animal) control modes and third-person control modes of the animal-like avatar were investigated, with full-body and half-body control modes being effective in creating a sense of virtual body ownership for animal avatars and outperforming third-person control modes. Xu et al. [49] have shown that in addition to the high levels of SoE experienced by research participants when embodying animals, virtual animal embodiment can even evoke human empathy for the animals. To foster empathy, the researchers created a system called “iStrayPaws“ that simulated the lives of stray animals using a Virtual Reality Perspective-Taking (VRPT)-based methodology. Participants had to find shelter, food, and escape from mistreatment while being embodied in different stray animals. The results showed that using such a VRPT system can significantly increase empathy for stray animals compared to a narrative-based task. With such a wide variety of dissimilar avatar types, it makes sense to categorize these avatars. A framework for classifying dissimilar avatars has recently been published by Cheymol et al. [7]. They categorized them into three main groups: structural, volumetric, and superficial aspect dissimilarities. Figure 2.4 gives an overview of the proposed classification. Dissimilar avatar types that differ from the real human body in terms of skeletal structure or morphology are termed “structural dissimilar“. 
These differences include, for example, changes in the number of limbs. The term “volume dissimilarity“ describes avatar types with different body sizes and proportions. Finally, avatars with dissimilar surface characteristics, such as unusual skin color, texture, or material, are referred to as “superficial aspect dissimilar“. Based on this categorization, dissimilar avatars can be better studied, understood, and compared. If an avatar falls into one or more of these categories, this can produce a stronger or weaker SoE in the user, which must be taken into account when designing dissimilar avatars [7]. 11 2. Related Work (a) One-for-one. (b) One-for-all. (c) Re-embodiment. (d) Co-embodiment. Figure 2.5: Social presence configurations of agents [35]. 2.4 Co-Embodiment in Virtual Reality The concept of co-embodiment refers to the feeling of sharing a virtual body with one (or more) other users and controlling it together. Co-embodiment was first studied in experiments utilizing two voice assistants and their physical embodiment as a single entity, represented by the same car [35]. In their user enactment study, the authors examined how people perceive and respond to different configurations of social presence. In particular, they examined how the participants respond to virtual agents that are based on human models (with each agent bound to a single body), those that are designed as a universal system (where a single agent controls multiple bodies), those that make use of re-embodying (where an agent can move its social presence from one body to another), and those that are able to co-embody (where an agent can join another agent within a single body). Figure 2.5 illustrates the four configurations of social presence. Although the participants in the study were not physically co-embodied in a virtual agent, the study revealed some new insights and challenges regarding this new topic and laid the foundation for further co-embodiment research. In a later study, Fribourg et al. [16] began to investigate how the users’ agency changes in a co-embodied scenario. The users shared control over a virtual avatar with varying degrees of control (full, partial, or no control) based on a weighted average of both users’ movements. The participants had to use the virtual arm to complete a number of grabbing tasks. Additionally, they were asked to report their sense of control of the arm while performing the tasks. The results demonstrated that while participants are able to estimate their true level of control, they frequently overestimate their sense of agency when they are able to predict the movements of the avatar. These findings suggested that even partial co-embodiment can create a sense of shared agency and ownership of the virtual body. These findings were supported by other studies [50, 20], including Hapuarachchi et al. [20], who investigated a co-embodied scenario where each user controlled a different arm and analyzed their SoE. The study used a shared avatar where one participant controlled the left side and the other controlled the right side, both in the first-person perspective, and participants had to reach and touch targets that appeared randomly. The experiment was conducted under three conditions. Same target (a shared target), different targets: visible (each participant had a different target), and different targets: invisible (as in visible, but the partner’s target was hidden). 12 2.4. 
Co-Embodiment in Virtual Reality Figure 2.6: Virtual avatar controlled based on the weighted average of the teacher’s and learner’s movements in a co-embodiment scenario. [29]. The results suggested that the SoE towards the own arm was higher than for the arm controlled by the partner. In the case where the partner’s target was visible, this visual information can help to improve the SoE towards the arm controlled by the other user. This was because visibility helps to predict the partner’s actions and improve the SoE of the uncontrolled arm. Kodama et al. [28] went one step further and investigated the extent to which co- embodiment is suitable as a method for improving the sense of agency and the acquisition of motor skills. They used an avatar whose movement was controlled by a weighted average of the movements of multiple users (see Figure 2.6). The percentage that determines how much each user’s movements contribute to the avatar’s final virtual movement is referred to as “weight“. The idea was based on the fact that students can learn motor skills by observing and imitating their teachers’ movements. During the experiment, two conditions were tested: a static and an adjusted weight control. In the static weight control condition, the weights were set to a fixed value of 50% throughout, while in the adjusted weight control condition, the weights were initially set to 50% and then adjusted based on the learner’s performance. The results showed that although the weight adjustment method prevented a drop in learner performance, it also led to lower learning efficiency after the virtual co-embodiment ended. In another study, Kodama et al. [29] investigated how learning efficiency changed when the learner was located in different embodiment scenarios: virtual co-embodiment with the teacher, sharing the teacher’s first-person perspective, and learning alone. As a result, the efficiency of learning motor skills was improved by learning in the virtual co-embodiment scenario with a teacher compared to learning alone or by sharing the teacher’s first-person perspective. 13 2. Related Work 2.5 Inverse Kinematics In order to provide the user with the highest possible SoE in the application, avatars must be animated according to the movements of the user. This requires a simple but robust algorithm that maps the positions of the real limbs as accurately as possible to the limb positions of the virtual avatar. One algorithm to achieve this is called Inverse Kinematics (IK). In this algorithm, an articulated chain represents the users’ body, where each joint in the articulated chain represents a joint of the real body. In IK, each joint angle of an articulated chain is determined to achieve the desired position and orientation of the end effector. Only the position and orientation of the root joint and the end effector are known, and the remaining joint angles are calculated based on certain constraints. This makes it possible to procedurally animate an articulated chain in order to imitate realistic movements. A variety of IK methods are described in the literature, each with their own advantages and disadvantages. Aristidou et al. [3] described a number of frequently used IK techniques like Jacobian-based solvers, Newton-based solvers, Cyclic Coordinate Descent (CCD) solvers, Sequential Monte Carlo method (SMCM) or the Triangulation algorithm. Jacobian-based solvers. The Jacobian matrix is used to approximate the IK problem linearly. 
While this family of solvers produces smooth postures, they suffer from computational complexity and singularity problems. There are several more computationally efficient adaptations of this algorithm, but for real-time applications, this approach may be too computationally expensive [14].
Newton-based solvers. Newton’s methods approach the problem by formulating it as a minimization problem. Newton’s methods can be complex to implement and computationally expensive per iteration, but they can also be very effective. This approach results in a smooth motion with no sudden changes [17].
Cyclic Coordinate Descent (CCD) solvers. A heuristic iterative method that is known for its computational efficiency and ability to solve IK problems without complex matrix manipulation. Because it is designed for serial chains, CCD can be difficult to adapt to multiple end-effectors, and it can produce unrealistic animation and abrupt motion even with additional constraints [25].
Sequential Monte Carlo Method (SMCM) and particle filtering solvers. Statistical methods that represent the articulated chain as a collection of particles. Each particle has 3 degrees of freedom (DoF), and they are connected to each other by length constraints. To reconstruct the DoF of the final joints, particle positions and length constraints are used, both of which are calculated based on customizable constraints that are dynamically adjusted by various preconditions and parameters. These methods avoid the problem of matrix singularity but can be computationally intensive [9].
Triangulation algorithm. The triangulation algorithm uses the cosine rule to determine each joint angle. It starts at the root joint and moves outwards towards the end effector. The final motion can look unnatural, but achieving the goal only takes one single iteration. It is also limited to solving problems with a single end effector [37].
Figure 2.7: Full iteration of the FABRIK algorithm consisting of a forward iteration (a)-(d) and a backward iteration (e)-(f) [3].
In their work, Aristidou et al. [3] proposed a simple and lightweight algorithm to tackle the IK problem, called Forward And Backward Reaching Inverse Kinematics (FABRIK). It is an iterative method that computes the joint positions by iterating forward and backward through the chain using the previously computed joint positions. Since the problem can be reduced to computing a point on a line, it is a fast algorithm optimized for animating articulated chains in real-time. The algorithm starts by calculating the distances d_i between the joints to check if the target position t can be reached. If the target cannot be reached, the algorithm constructs a line pointing to the target and terminates; otherwise, a full two-stage iteration is employed. In the first stage, the position of the end effector p_n is set to the position of the target point (p'_n = t). Then the point p_{n-1} is recalculated based on the line passing through p_{n-1} and p'_n, taking into account the length constraints. This process is repeated for each remaining joint up to the root joint. As the position of the root joint may now deviate from its original position, in the second stage, the root joint must be moved back to its starting position, and the positioning process of all joints is repeated, this time backward to the end effector.
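To make these two passes concrete, the following is a compact C# sketch of a single-chain FABRIK solver. It uses Unity's Vector3 type for brevity; the class name, the tolerance and iteration parameters, and the absence of joint constraints are illustrative simplifications and do not reflect the actual implementation used for the avatar in chapter 3.

```csharp
using UnityEngine;

// Minimal single-chain FABRIK solver following the two-stage iteration described above.
public static class FabrikSolver
{
    // joints: positions from the root (index 0) to the end effector (index n-1).
    // The positions are modified in place so that the end effector approaches the target.
    public static void Solve(Vector3[] joints, Vector3 target,
                             float tolerance = 0.001f, int maxIterations = 20)
    {
        int n = joints.Length;
        var lengths = new float[n - 1];          // segment lengths d_i
        float totalLength = 0f;
        for (int i = 0; i < n - 1; i++)
        {
            lengths[i] = Vector3.Distance(joints[i], joints[i + 1]);
            totalLength += lengths[i];
        }

        Vector3 root = joints[0];

        // Target unreachable: stretch the chain along the line towards the target.
        if (Vector3.Distance(root, target) > totalLength)
        {
            for (int i = 0; i < n - 1; i++)
            {
                float r = Vector3.Distance(joints[i], target);
                joints[i + 1] = Vector3.Lerp(joints[i], target, lengths[i] / r);
            }
            return;
        }

        // Target reachable: alternate forward and backward passes.
        for (int iter = 0; iter < maxIterations; iter++)
        {
            if (Vector3.Distance(joints[n - 1], target) < tolerance)
                break;

            // Forward pass: place the end effector on the target and work back
            // towards the root, restoring each segment length.
            joints[n - 1] = target;
            for (int i = n - 2; i >= 0; i--)
            {
                Vector3 dir = (joints[i] - joints[i + 1]).normalized;
                joints[i] = joints[i + 1] + dir * lengths[i];
            }

            // Backward pass: put the root back in place and work towards the end effector.
            joints[0] = root;
            for (int i = 0; i < n - 1; i++)
            {
                Vector3 dir = (joints[i + 1] - joints[i]).normalized;
                joints[i + 1] = joints[i] + dir * lengths[i];
            }
        }
    }
}
```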
In this way, the position of the end effector moves closer to the target position with each iteration, and the algorithm stops when a certain threshold is reached. Figure 2.7 shows a full iteration of the FABRIK algorithm with a single target and four joints. This algorithm can also calculate articulated chains with more than one end effector.
The counterpart to IK is forward kinematics (FK), which calculates the position and orientation of the end effector of an articulated chain from the known angles of each individual chain joint. While IK is a complex problem, especially when calculating anatomically and analytically correct models, FK is relatively simple and inexpensive to calculate. However, FK is only suitable for certain problems because, as mentioned above, it requires already-known joint angles, which are usually not known in advance for a real-time animation algorithm.
2.6 Selection Metaphors
When users need to interact with the virtual environment through their avatars, the right choice of 3D interaction metaphors can greatly impact the user experience [22]. It is therefore important to carefully design these techniques to improve the user experience and SoE. Typically, 3D interaction is divided into four parts:
• Navigation: the movement of the user in the VR environment
• Selection: the action of pointing to an object and its validation
• Manipulation: the changing of the state of a previously selected object (usually position and rotation, but can also be size, color, etc.)
• System control: the interaction/dialogue between the user and the application through menus or functions
Each of these four parts describes a different area of interaction that the user can control and interact with in VR, whereby only certain parts are covered depending on the interaction metaphor. In this thesis, we focus on the selection and manipulation parts and compare two metaphors to assess the SoE and agency of the dissimilar avatar.
When designing selection and manipulation metaphors, a common strategy for improving user performance in terms of task completion times and error rates is the application of Fitts’ law [1, 36]. It states that the expected time required to acquire a target object is a function of the ratio between the distance to the target and its width. The most commonly used equation for Fitts’ law expresses it as a relationship between the width W of the target object, the distance A, and some regression coefficients that take into account the reaction time required by the user to locate the target and the performance of the task. Equation 2.1 shows the relationship between these factors.

T = a + b · log2((A + W) / W)    (2.1)

The logarithmic term log2((A + W) / W) is the index of difficulty (ID) and is given in bits, a unit of information. This is because the given equation is derived from an equation from information theory that models the transmission of information [36]. The amount of information transferred while performing a pointing task (in bits) can be considered as the difficulty of the task, where T is the movement time required to reach the target. A higher ID means a longer predicted movement time and therefore a more difficult task. Several factors can influence the ID, as studies have shown [19]. Thus, the angle of the movement has a significant influence on performance, as well as the dimension of the target size along the primary axis of movement, while the other two dimensions tend to have less influence.
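As a purely illustrative calculation (the target sizes and coefficients here are hypothetical and not taken from the user study described later), a target of width W = 0.1 m at a distance of A = 0.7 m has an index of difficulty of

ID = log2((A + W) / W) = log2((0.7 + 0.1) / 0.1) = log2(8) = 3 bits.

With regression coefficients of, for example, a = 0.3 s and b = 0.2 s/bit, Equation 2.1 then predicts a movement time of T = 0.3 + 0.2 · 3 = 0.9 s. Halving the target width to W = 0.05 m raises the ID to log2(15) ≈ 3.9 bits and lengthens the predicted movement time accordingly.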
This principle can be applied to the design of selection metaphors in VR interfaces to make target selection easier and more efficient. Studies have shown that the type of selection metaphor used can significantly impact user performance [1]. Depending on the characteristics of the selection metaphor (whether it is a free-space interaction, how good the visual feedback is, whether there are occlusions during the selection, etc.), they can influence the ID. An efficient selection metaphor combines both precision of target movement and ease and naturalness of interaction.
Various 3D object selection and manipulation techniques exist in VR. Depending on the type of interaction, they are divided into exocentric metaphors (where the user interacts from the outside, i.e., from a third-person perspective), such as World-in-Miniature or automatic scaling, and egocentric metaphors (where the user interacts from the inside, i.e., from a first-person perspective), such as virtual hand or ray casting [1]. In the following subsections, we discuss two selection and manipulation metaphors that are useful for the dissimilar avatar in the co-embodied setup.
2.6.1 Go-Go Interaction
The Go-Go interaction technique is a selection and manipulation metaphor for the interaction with near as well as distant objects in VR and was first introduced in 1996 by Poupyrev et al. [41]. The authors described the technique as a non-linear mapping between the user’s real arm movements and the movements of the virtual arm. At close range, the virtual arm precisely follows the user’s arm movements. As soon as the user moves the arm away from the body, the virtual arm grows non-linearly in order to reach distant objects without physically moving. This technique enables precise interaction and manipulation at close range as well as reaching distant objects.
A useful measure to categorize interaction techniques is the Control-Display (CD) ratio. It describes the ratio between the movement of the input device (control) and the resulting movement of the virtual object (display) [1]. If the ratio is 1:1, the virtual movement corresponds exactly to the physical movement. A ratio other than 1 means an “amplification effect”, where a small physical movement results in a larger virtual movement (CD ratio < 1) or a large physical movement results in a smaller virtual movement (CD ratio > 1).
The Go-Go interaction technique uses a dynamic CD ratio that depends on the distance of the virtual hand from the user’s body. Specifically, it uses a non-linear mapping function to both reach distant objects and work accurately at close range (Figure 2.8).
Figure 2.8: The mapping function F used in the Go-Go Interaction technique [41].
The threshold value D defines the point at which the function switches from linear to non-linear mapping. With linear mapping, the movements of the virtual hand follow the movements of the real hand. When the user extends the arm beyond the threshold, the virtual arm grows non-linearly. The threshold value is normally set at approximately 2/3 of the user’s arm length. The function F is defined as follows:

R_v = F(R_r) = R_r                      if R_r < D
               R_r + k(R_r − D)²        otherwise    (2.2)

Here R_v is the length of the virtual arm and is calculated using the function F(R_r), where R_r is the length of the real arm (i.e., the length of the vector pointing from the origin, located at the user’s body, to the user’s hand). If the user’s real arm is not stretched beyond the threshold (R_r < D), the virtual arm length R_v equals the real arm length R_r. Otherwise, the virtual arm length is calculated based on the real hand position and an “amplification factor” k(R_r − D)², where k is a coefficient that defines how much the virtual arm grows or shrinks.
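A minimal Unity C# sketch of this mapping, applied once per frame to a single hand, is shown below. The component and field names (GoGoHand, chestOrigin, etc.) and the parameter values are illustrative assumptions and do not correspond to the scripts of the platform described in chapter 3.

```csharp
using UnityEngine;

// Sketch of the Go-Go mapping: the virtual hand follows the real hand at close range
// and is extended non-linearly once the arm is stretched beyond the threshold D.
public class GoGoHand : MonoBehaviour
{
    public Transform chestOrigin;    // approximate origin of the arm (user's body)
    public Transform realHand;       // tracked controller position
    public Transform virtualHand;    // avatar hand driven by the mapping

    public float thresholdD = 0.45f; // roughly 2/3 of the arm length, in metres
    public float k = 0.6f;           // amplification coefficient

    void LateUpdate()
    {
        Vector3 offset = realHand.position - chestOrigin.position;
        float rReal = offset.magnitude;            // R_r
        float rVirtual = rReal;                    // linear region: R_v = R_r
        if (rReal >= thresholdD)                   // non-linear region
        {
            float d = rReal - thresholdD;
            rVirtual = rReal + k * d * d;          // R_v = R_r + k (R_r - D)^2
        }

        // Place the virtual hand along the same direction, at the remapped length.
        virtualHand.position = chestOrigin.position + offset.normalized * rVirtual;
        virtualHand.rotation = realHand.rotation;  // orientation is passed through unchanged
    }
}
```

Because only the length of the offset is remapped while its direction is kept, the virtual hand always stays on the line from the body through the real hand, which keeps the amplified reach predictable for the user.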
The Go-Go interaction technique, with its core principle of non-linear mapping and intuitive interaction, has influenced the development of various VR interaction techniques. Many modern VR systems and applications incorporate elements of non-linear mapping and virtual hand extensions to make user interaction intuitive and efficient [1]. For example, Auteri et al. [4] combined the Go-Go technique with PRISM, a velocity-based scaling technique to improve accuracy, to achieve precise object manipulation in 3D. Their user study showed that the hybrid Go-Go + PRISM interface led to an almost 2:1 improvement in accuracy over Go-Go alone in a task where a virtual object had to be aligned within a distant target.
2.6.2 Gaze-and-Pinch
Another interaction metaphor is the “Gaze-and-Pinch” technique described by Pfeuffer et al. [39]. This interaction metaphor is often used for the selection and manipulation of distant objects that are out of the user’s reach. In VR interactions, the user’s eyes often naturally point at the interaction targets, and therefore this method takes advantage of this natural behavior. When using “Gaze-and-Pinch”, the user’s gaze is tracked by eye-tracking cameras in the HMD, and the objects the user is looking at are selected. In addition, hand gestures (e.g., pinching the thumb and index finger together) are used to confirm the selection and manipulate the target object (see Figure 2.9).
Figure 2.9: The “Gaze-and-Pinch” interaction with one or two hands: look at an object, pinch to select it, manipulate it with hand gestures [39].
The authors conducted an informal user evaluation of this technique by creating an experimental UI system that allowed users to interact with various use cases showcasing different “Gaze-and-Pinch” variations. Users emphasized the potential of this new type of interaction and described the “Gaze-and-Pinch” technique as easy to use. Since a handheld controller is not necessary for this interaction method, it can be used in various applications. It does, however, require modern hardware and a reliable hand and eye tracking system.
CHAPTER 3
Avatar Design and Implementation
In this chapter, we present the design and implementation of the experimental platform, including the virtual environment and the interaction metaphors explicitly developed for the co-embodied, dissimilar avatar setup. Section 3.1 gives an overview of the requirements for the application, and section 3.2 explains the VR project created with Unity3D, the virtual environment developed, as well as the networked dissimilar avatar. Section 3.3 explains the implemented interaction metaphors. Section 3.5 then discusses the evaluation tasks implemented to evaluate the interaction metaphors.
3.1 Overview and Requirements
As explained above, an experimental platform is developed with the Unity3D software on top of an existing basic VR environment. To enable different VR HMDs to work with the application, we use the OpenXR plugin in Unity. The idea of the experimental platform is to allow several users to share control of a dissimilar avatar and to interact with virtual objects using specifically designed interaction metaphors (see Figure 3.1).
Therefore, the experimental platform is divided into multiple parts that have to be carefully designed and implemented: • Two interaction metaphors have to be developed for the dissimilar avatar for enabling interaction with virtual objects • The dissimilar avatar has to be animated according to the interaction metaphors • The dissimilar avatar has to be networked to support multi-user interaction • Evaluation tasks must be implemented to test and evaluate these interaction metaphors 21 3. Avatar Design and Implementation Figure 3.1: Left - Overview of the project’s architecture, including the co-embodied dissimilar avatar. Right - An example body configuration of the two users showing the limbs they control. We cover each of these parts in detail in the following subsections. For now, we briefly describe the requirements for each of the parts. Interaction metaphors. We implement two different interaction metaphors used for selecting and manipulating objects in VR. These two metaphors should be intuitive to understand and easy to control for the users. Additionally, they should work in a one-player setup as well as in a networked multiplayer setup. Each of the two interaction metaphors comes with its own requirements, discussed in the following section 3.3. Dissimilar avatar. The dissimilar avatar should be designed and adapted to work with two users simultaneously. It should benefit from shared control, and the two interaction metaphors should work well with the avatar. The avatar should be animated based on the interaction metaphor used and evoke a strong SoE in the users. Networked interaction. The selected and manipulated virtual objects, as well as the avatar’s animations, should be networked and displayed synchronously for both users in order to enable joint interaction within the VR environment. The interaction of a single player should still be possible even if no second user is connected. Evaluation tasks. The developed experimental platform should be evaluated in a user study, and for this, evaluation tasks should be implemented. These evaluation tasks should assess the user experience and agency of the users when interacting in the VR environment with the interaction metaphors. 3.2 Virtual Reality Environment This section gives an overview of the structure and components of the VR platform, describes how the user’s movements are networked, how the dissimilar avatar is animated, and deals with the implementation of the interaction metaphors and interaction tasks. 22 3.2. Virtual Reality Environment Figure 3.2: The interaction test scene in which the user can grab cabbages using the interaction metaphors (left - in third-person perspective, right - first-person perspective). 3.2.1 3D scene setup The components of the application are developed in a 3D scene within the Unity3D game engine. The virtual environment consists of an outdoor scene with objects that the participant can interact with. The main components of the virtual scene are the dissimilar avatar, a mirror, and three objects (cabbages). The mirror serves as a visual reference for the user to understand the physique of their virtual avatar and its movements when using the interaction metaphors. Figure 3.2 shows the general setup of the scene. As the avatar in this study is stationary and does not need to be able to move, no visible spatial boundaries are defined for the playing field. In the user study, however, a fence is added to help participants improve their spatial perception during interaction. 
The objects are there so that the interaction metaphors can be used to grab and move the objects. 3.2.2 Networking In order for two users to control the dissimilar avatar at the same time, their movements need to be synchronized over the network. The Photon Unity Networking 2 (PUN2) library 1, which is available in the Unity Asset Store, is used for this purpose. PUN uses the concept of rooms to create and manage multiplayer games. When a player joins a room, they can see and interact with all other players currently connected to the same room. If a player tries to join a room but no room exists yet, the server will create a room and give that player the master client role (a master client is unique to each room; it can perform special actions, such as starting the game or kicking other players out). All subsequent players who want to join are then client players. In our application, a connection between the players is established when the application is started using the “PhotonManager“ GameObject. It has a script attached in which the current player joins a random open room. As there can only be one room in our concept, it is not necessary to create a separate room with specific settings, but joining a random 1https://www.photonengine.com/pun 23 3. Avatar Design and Implementation room is perfectly fine. If this is the first player to connect to the application, there is no room yet, and a new one will be created. The second player to connect to the application will join this room as a client player. The order in which users connect to the server determines the numbering of players and the initial body configuration (i.e., which user controls which eye and arm), but this configuration can be changed anytime during the simulation. When a player joins the game by entering the room, a new “NetworkPlayer“ prefabricated component (a prefab, i.e., a blueprint of a GameObject that can be created any number of times and always has the same scripts and properties) is instantiated with the PhotonView.cs and the NetworkPlayer.cs scripts attached. See Figure 3.3 for the room creation and networking functionality. The NetworkPlayer.cs script contains the logic to record the movements of the current user and pass them on to the network. To achieve this, the script takes the local movements of the “Player“ GameObject, which is directly controlled by the HMD and controllers, and sends them to the remote instance of the application. These movements are then transferred to a local GameObject called “GhostPlayer“ within the application instance of the other user, whose movements, in turn, control the limbs of the avatar of the corresponding player in the local instance. In the local instance of a player, the movements of the “Player“ GameObject are recorded, and, in addition to being sent over the network using the network player script, the avatar’s limbs are moved accordingly. An overview of these movement transmissions is shown in Figure 3.4. 3.2.3 Dissimilar Avatar This subsection describes the dissimilar avatar, which can be embodied by two users simultaneously and allows users to interact non-isomorphically with the virtual environ- ment through novel interaction metaphors. This means that a total of two HMDs and two pairs of controllers must be tracked and that it must be possible to assign each of these tracked devices to a limb of the avatar. The avatar used for this purpose consists of four arms and two eyes. 
The avatar resembles an upright slug with two tentacles protruding from each side of the torso and two eye-stalks protruding from the upper part of the torso. The avatar also has a tail that emerges from the back of the torso. The left image in Figure 3.5 shows the avatar in its default pose. The avatar is rigged in order to animate it according to the user’s movements. A detailed explanation of how the animation works with the help of the IK algorithm can be found in section 3.4. Each of the tentacles is controlled by nine bones, with one bone at the end of the articulated chain serving as an anchor and not moving. The eyestalks are controlled by nine bones as well, with all bones movable. In addition, the avatar has bones in the torso and tail to animate the rest of the body. The image on the right of Figure 3.5 shows the avatar rigged with its bones. As mentioned above, the avatar can be controlled by two users simultaneously, with one user controlling the left or right eye and the upper or lower arms. The second user then controls the other limbs. These body configurations can be adjusted as required. In order 24 3.2. Virtual Reality Environment Figure 3.3: Overview of the room creation and networking logic when starting the application. 25 3. Avatar Design and Implementation Figure 3.4: Overview of the logic of the synchronized and networked movements of the players that drive the dissimilar avatar limbs. to control the limbs of the avatar, a structure is required that defines which parts of the avatar are controlled by which player. For this purpose we introduce the two classes PlayerStruct.cs and PlayerArmStruct.cs. These classes bundle all information about a player and its assignment to the avatar limbs, and these structures are used for movement mappings throughout the application. Listing 3.1 and listing 3.2 show the structure of these classes. 1 public class PlayerStruct 2 { 3 public InteractionMetaphor InteractionMetaphor { get; set; } 4 public Transform EyeTarget { get; set; } 5 public PlayerArmStruct LeftArm { get; set; } 6 public PlayerArmStruct RightArm { get; set; } 7 } Listing 3.1: The PlayerStruct class containing information about the player controlling one part of the avatar. The PlayerStruct.cs class contains the following information: the interaction metaphor the player is currently using; the eye target Transform, i.e., which of the two eyes the player is controlling; and a “PlayerArmStruct“ for the left and right arm. The PlayerArmStruct.cs class contains all essential information about the control of the tentacles, e.g., whether it is a left or right arm; the virtual hand Transform, which is used to store the position of the virtual (Go-Go) hand; a Transform containing the position of the real hand; a Transform containing the position and rotation of the tentacle 26 3.3. Selection Metaphors Figure 3.5: Left: Shaded model of the dissimilar avatar. Right: Transparent avatar with its rig and bones structure. tip (necessary to influence the tentacle tip according to the rotation of the real hand); a Transform for the arm target, i.e., the object that controls the effective position and rotation of the virtual tentacles based on the real or virtual hand movement; and a field containing the FABRIK IK code for this arm to achieve different behavior of the IK algorithm depending on the interaction metaphor. 
1 public class PlayerArmStruct 2 { 3 public bool Left { get; set; } 4 public Transform VirtualHand { get; set; } 5 public Transform RealHand { get; set; } 6 public Transform TentacleTip { get; set; } 7 public Transform ArmTarget { get; set; } 8 public FastIKFabrikStretch FabrikStretch { get; set; } 9 } Listing 3.2: The PlayerArmStruct class containing information about one arm that the player controls. 3.3 Selection Metaphors The core of this work is the implementation of suitable selection and interaction metaphors for the dissimilar avatar. In the following subsections, two interaction metaphors are discussed, which are later evaluated in a user study. These metaphors are suit- able for manipulating distant objects and can be used by two users simultaneously in a co-embodiment setup. To be able to assign the interaction metaphors to the user, we create the InteractionMetaphorSelector GameObject, which contains the script 27 3. Avatar Design and Implementation InteractionMetaphorSelector.cs and a PhotonView.cs script. The PhotonView.cs script is used to synchronize the changes that one user makes to their instance of the program in the InteractionMetaphorSelector GameObject with the other user’s program over the network. The InteractionMetaphorSelector.cs script has two drop-down menus, one for Player One (i.e., the master player) and one for Player Two (i.e., the client player), with different interaction metaphors to choose from (more details on the individual metaphors in the following subsections). In this way, the connected users can be assigned the desired metaphor, and their arms (depending on which one they control) will adopt the behavior required for the interaction metaphor. The selectable metaphors are “Default“, “GoGoInteraction“, “GazeAndManipulate“, and “GoGoPlusGazeAndManipulate“. With the default interaction, the movements of the user’s arms are mapped 1:1 to the avatar’s arms. The physical limitations of the avatar are respected, i.e., the user stretches their arm out very far, and the virtual arm is moved only as far as possible and will not stretch. This interaction metaphor can lead to a lower SoE, as the user observes a different behavior of the virtual arms than he would expect based on the movements of his own hands. The other three interaction metaphors are discussed in detail in the following subsections. 3.3.1 Go-Go Interaction The first of the two interaction metaphors that we implement and with which we eval- uate the usability of the dissimilar avatar in a co-embodied scenario is the so-called Go-Go interaction technique [41]. This is a non-linear scaling of the virtual arm or a non-linear mapping between the user’s movements and the resulting movement of the virtual arm. This makes it easier to reach distant objects in the virtual environ- ment. To achieve this mapping, a non-linear mapping function is used (see Equation 2.2). The MonoBehaviour script GoGoInteraction.cs is located on the character’s GameOb- ject so that the dissimilar avatar can grab objects using the Go-Go interaction technique. The parameters required by the script are the user’s arm length, a coefficient K, and two pivot Transforms, one for the left and one for the right arm. The arm length is required to determine the threshold value D of the mapping function. This value separates the linear from the non-linear mapping. 
The literature suggests that it should be set to 2/3 of the user's arm length to achieve the best possible Go-Go behavior (a compromise between good reachability of distant objects and no excessive arm distortion). The coefficient K determines how much the non-linear part of the mapping function increases, i.e., how much the arm grows when the user holds it further away from the body. Finally, the two pivot Transforms are needed to determine the grabbing direction and the length of the vector R⃗r pointing from the origin to the user's hand. The pivot points act as the origin in a user-centered coordinate system, with the origin at the user's chest. In our case, the pivots are located approximately at the position of the shoulders to obtain a suitable vector R⃗r, as this vector should represent the real arm as accurately as possible. If the pivot points were located exactly in the center of the avatar, this would result in an unnatural grab direction.

To calculate the Go-Go extension effect for a given arm and its corresponding pivot, we first need to determine the direction and distance in which the real arm is moving. This is done by subtracting the pivot position from the real hand position to obtain the vector handPivot. To normalize handPivot, it is divided by Rr, which is the length of the vector pointing from the pivot position to the hand position (see listing 1, lines 4 to 6). The equations used are the following:

handPivot = hand − pivot   (3.1)

Rr = √((hand.x − pivot.x)² + (hand.y − pivot.y)² + (hand.z − pivot.z)²)   (3.2)

handPivotNorm = handPivot / Rr   (3.3)

After calculating these values, we need to check whether the length Rr is above the threshold D (non-linear part of the mapping function) or below it (linear part of the mapping function). If the value is lower, the position of the virtual hand is simply mapped to the position of the real hand. If the value is higher, we need to calculate how far the virtual arm should extend. This is done by calculating the new length R′r (according to Equation 2.2) using the coefficient K. The virtual hand position is then calculated from the pivot, the normalized grabbing direction, and the calculated length. Figure 3.6 shows how the different extensions of the arm affect the position of the virtual arm.

3.3.2 Gaze-and-Manipulate

The second interaction metaphor is the so-called "Gaze-and-Manipulate" interaction technique. This metaphor is inspired by the "Gaze-and-Pinch" interaction technique [39]. The idea behind this interaction metaphor is to track the user's gaze and to trigger the selection and manipulation of objects by performing certain hand or finger movements. Often a "pinch" motion with the thumb and index finger is performed to manipulate objects. Since in our setup we cannot directly track the user's eye movements, we use the orientation of the head to check whether objects are being looked at. We also use the grip buttons on the VR controllers to select and manipulate objects instead of pinching motions with the fingers.

Similar to the GoGoInteraction.cs script, the GazeAndManipulate.cs script is located on the character's GameObject so that the dissimilar avatar can use this interaction metaphor to gaze at and retrieve objects. In our implementation, the objects that the user is looking at (i.e., turning their head at) are highlighted with a highlight

(a) No extension of the real arm: linear part of the mapping function.
(b) Slight extension of the real arm: still in the linear part of the mapping function. (c) Large extension of the real arm: non-linear part of the mapping function. Arm stretches accordingly. Figure 3.6: Go-Go interaction with different arm extensions. Left: real arm extension movement. Middle: movement of the virtual avatars’ arm. Right: First-person perspective of the virtual arm. The “wristband“ objects represent the positions of the real hands. material. Once the object is highlighted, it can be retrieved to the hand by pressing either the left or the right grip button (depending on the arm in which the user wants to retrieve the object), and once in reach, grabbed as usual. The parameters for the GazeAndManipulate.cs script are the smooth time and the gaze ray distance. The smooth time defines how fast the object will reach the hand, where a smaller value will reach the hand faster. The gaze ray distance parameter defines how far apart the individual rays are that are used to check whether an object is in the users’ field of view. Starting from the center of the screen, a total of nine rays are shot into the scene in a 3x3 grid (see Figure 3.7) to check whether an object is currently being viewed. If an object is hit by one of the rays, it is considered to be viewed and is highlighted. The larger the gaze ray distance parameter is, the less closely you have to look at objects to highlight them. The smaller the value, the closer the rays are to each other and the 30 3.3. Selection Metaphors closer and straighter you have to look at an object in order to interact with it. If the value is too high, it can happen that a distant object has space between the rays without being hit by the rays. Therefore, a compromise must be found between user-friendliness and guaranteeing the selection of objects by finding a suitable value. Figure 3.7: The nine rays that check for gazed objects. The gaze ray distance (GRD) defines the spacing between the individual rays. (Slightly tilted view to be able to see the rays.) When the user looks at an object and presses the grip button, the object is pulled towards the virtual avatar’s hand. To avoid lowering the SoE (as the object would just fly around), we have added an arm animation that grabs the distant object. There is a halo object for each of the avatar’s four tentacles, with a slightly transparent material. These halos have the Go-Go stretch logic implemented, meaning they can reach the distant object by stretching the mesh. During the retrieving animation, the halo always stays attached to the object, making it look as if a ghost tentacle is pulling the object towards the avatar’s hand. Once the object (and the halo arm) are close enough to the avatar’s hand, the halo mesh is turned off. Figure 3.8 shows what this animation looks like. 3.3.3 Go-Go plus Gaze-and-Manipulate In addition to the two interaction metaphors described above, we develop a third one. This is a combination of the two metaphors “Go-Go Interaction“ and “GazeAndManipulate“. In this metaphor, it is possible to grab objects with a stretched arm as in Go-Go, but also to benefit from the gaze mechanism. For example, an object can be brought to the stretched arm (which has been stretched by the Go-Go logic) using the gaze mechanism 31 3. Avatar Design and Implementation Figure 3.8: Animation of the arm halo retrieving the distant object. Once the object is close enough, the halo mesh is turned off. and then grabbed and manipulated as usual using Go-Go. 
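To make the gaze test used by Gaze-and-Manipulate (and by this combined variant) more concrete, the following C# sketch casts the 3x3 grid of rays from the head-mounted camera and returns the first grabbable object that is hit. The class and field names, the layer mask, and the way the rays are offset by the gaze ray distance are simplifying assumptions rather than the actual GazeAndManipulate.cs implementation.

using UnityEngine;

// Sketch of the 3x3 gaze-ray test, assuming the main camera represents the user's head.
// Field names and the grabbable layer are illustrative, not the thesis implementation.
public class GazeRaySketch : MonoBehaviour
{
    public Camera head;                   // HMD camera
    public float gazeRayDistance = 0.07f; // spacing between neighboring rays (GRD)
    public float maxDistance = 10f;       // how far into the scene the rays are cast
    public LayerMask grabbableMask;       // layer of the grabbable objects

    // Returns the first grabbable hit by any of the nine rays, or null if none is gazed at.
    public Transform FindGazedObject()
    {
        for (int row = -1; row <= 1; row++)
        {
            for (int col = -1; col <= 1; col++)
            {
                // Offset the ray origin on a 3x3 grid around the view center.
                Vector3 origin = head.transform.position
                               + head.transform.right * (col * gazeRayDistance)
                               + head.transform.up * (row * gazeRayDistance);

                if (Physics.Raycast(origin, head.transform.forward, out RaycastHit hit,
                                    maxDistance, grabbableMask))
                {
                    return hit.transform; // considered "viewed": highlight and allow retrieval
                }
            }
        }
        return null;
    }
}

In the combined metaphor, an object highlighted by this test can be retrieved (e.g., smoothed towards the hand with Vector3.SmoothDamp and the smooth time parameter) while the arm itself remains stretched by the Go-Go logic.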
We did not evaluate this metaphor in the user study because we explicitly wanted to analyze the effects of the two metaphors on the dissimilar co-embodied avatar individually and not the interaction between the two metaphors. However, combinations of metaphors can be further explored in future studies. 3.4 FABRIK IK A modified version of the FABRIK IK [3] (see section 2.5) algorithm is used to generate the character’s movements. The data from the tracked HMDs and controllers is transferred to the IK end effectors, which drive the articulated chains of the tentacles and eye-stalks. We adapt the FABRIK IK algorithm to take into account the movements of the arms when using the Go-Go interaction metaphor and adjust the joint constraints accordingly. Once the position and rotation of the virtual arm are determined using the calculations described in the interaction metaphor subsections above, the FABRIK IK code must take these results into account. The default behavior of the FABRIK IK logic is to rotate the individual bones according to the joints to reach the end effector. The length of the articulated chain, and therefore the length of the tentacles, is limited to the sum of all individual bone lengths. To achieve the desired Go-Go behavior, these lengths must be increased accordingly. We, therefore, adapt the FABRIK IK code so that, in addition to the default target (the user’s hand), it can also follow the virtual hand target (which corresponds to the newly calculated position and rotation). For this purpose, when defining the target (i.e., the end-effector in the articulated chain), it is checked whether the current arm is a virtual arm (in the case of arm halos of the Gaze-and- Manipulate interaction (see subsection 3.3.2 for a detailed explanation of the arm halo logic)), or whether the arm uses the Go-Go interaction. In the case of the virtual arm, the “RetrievingTarget“ (i.e., the object that the virtual halo arm follows) is set as the end-effector. In the case of the Go-Go interaction, the end effector is the Go-Go target, i.e., the position of the virtual hand target. If none of the cases apply, the default target is used as the end effector, i.e., the position of the user’s real hand. The check of these 32 3.4. FABRIK IK three cases is shown in listing 2. After the end effector has been defined, it is checked whether the current arm is an arm halo or a Go-Go-driven arm, and in both cases the “StretchIK“ code is executed. In the other cases, the “DefaultIK“ code is executed, which, as described above, targets the user’s real hand and does not stretch the bones. The “StretchIK“ logic uses a stretch algorithm to extend the arm beyond the standard bone lengths by calculating new bone lengths. The distance between the position of the end-effector and the position of the last bone in the articulated chain (that would be the tip of the tentacle) is calculated. Furthermore, the distance between the end-effector and the root bone (that is the first bone in the articulated chain and is approximately at shoulder level) is calculated. Using these two values, a new length is calculated for each bone (see Equations 3.4 to 3.6). 
LRemaining = √((target.x − tip.x)² + (target.y − tip.y)² + (target.z − tip.z)²)   (3.4)

LTotal = √((target.x − root.x)² + (target.y − root.y)² + (target.z − root.z)²)   (3.5)

LDesired = (LBones(i) / LComplete) · LRemaining   (3.6)

where

LComplete = Σ_{i=1}^{n} LBones(i)   (3.7)

The next step is to check whether the total length (from the root bone to the end-effector) is greater or less than the sum of the current bone lengths. Based on this, the newly calculated length is either added to or subtracted from the current bone length to create a "stretch" or "shrink" effect on the tentacle mesh. The new lengths of the bones are added together and saved as the new total length (see Equation 3.7). Once the length of each bone is determined, the position of each bone after the root bone is set. This is done by multiplying the new bone length by the grabbing direction and adding it to the position of the previous bone (see Equation 3.8):

BonePos(i) = BonePos(i−1) + (direction · LBones(i−1))   (3.8)

Refer to listing 3 for the complete "StretchIK" logic.

To give the user visual feedback about their real arm position, we have introduced a torus (or "wristband") for each arm that indicates where the user's hand is (see Figure 3.6). Especially in interactions that use the "stretch" logic, such as the Go-Go interaction (where the user suddenly no longer sees the virtual hand where their real hand is), it can be helpful for the user to have a visual indication of where their real hand is currently located.

3.5 Evaluation Task

In order to evaluate the SoE and the effectiveness of the developed interaction metaphors (details in chapter 4), we had to design an evaluation task that the users would have to perform in the user study. The requirements for these evaluation tasks were as follows:

1. For both interaction metaphors, all directions in the front half of the user's field of view must be covered.
2. The order of the trials must be randomized to minimize learning or predictable situations.
3. Intuitive grabbing situations must be implemented that direct the user's full attention to the interaction and do not distract from the task at hand.

Based on these requirements, an evaluation task system is developed that generates new random positions each time the user study system is started and spawns the objects at these positions. The number of objects that appear and the positions are freely configurable, but we choose the following setting for our user study: 40 grabbing actions, so-called 'sub-trials', should be performed for each body configuration. A sub-trial is the most atomic unit of this evaluation task system. It defines where the grabbable object spawns and which interaction metaphor the user must use to grab and collect this object. Further, it stores several pieces of information that are needed for the user study evaluation (e.g., the time it took the user to grab the object or how strongly the user felt a sense of agency). More specifically, a sub-trial contains the following information: • the spawning position (the position the grabbable object spawns in the scene) • the spawning angle (the angle as seen from the player's forward vector at which the grabbable object spawns) • the spawning distance (the distance from the player at which the grabbable objects spawn) • the time needed to grab the object (the time between the appearance of the grabbable object and its collection; measured for later evaluation) 34 3.5.
Evaluation Task • the arm to be used (whether the player should use the right or left arm for grabbing the object) • the agency question answer (the answer given by the user to the agency question of the 7-point Likert scale) • the interaction metaphor (which interaction metaphor the user should use to grab the object) The grabbable objects can appear in various positions, based on the configured angles and distances. We choose five angles and two distances for our user study. As we do not want the user to grab objects that are behind his body (see requirement 1), we choose the following angles: -90°, -45°, 0°, 45°, and 90°. This covers the user’s front field of view (180° in total). For the distances we choose a “medium“ distance (1 meter), and a “long“ distance (2 meters). For each angle, there are two possible spawning positions, so in total there are 10 spawning positions with this number of angles and distances. Figure 3.9 shows the possible spawning position. In addition, each spawning position has a small random vertical offset to prevent the user from being able to anticipate the exact grabbing position. Listing 4 shows the code that generates these positions based on the number of angles and distances specified. First, a new list is initialized, containing a Vector3 value to store the final position of the grabbable, an int value to store the spawn angle in degrees, and a string value to indicate whether it is the medium or long distance. The position is used to instantiate a new grabbable and place it in the scene, and the angle and distance are stored in the sub-trial object for later evaluation. We iterate over all given angles and convert them from degrees to radians using following equation: angleRad = angleDeg · π/180; (3.9) Then we iterate over all given distances and calculate the height of the grabbable using a fixed height value multiplied by a random factor. The final position of the grabbable is calculated using the converted angle in radians and the distance. Finally, the calculated position, angle, and distance are stored in the newly created list, which is returned after all angles and distances have been used to calculate the possible positions. In order to evaluate two interaction metaphors for both user arms, each sub-trial also contains information about which metaphor the user must use to grab the object, as well as which arm the user must use. For each position, both interaction metaphors have to be covered (as well as both arms), making a total of 40 sub-trials. Two sub-trials form a so-called ‘trial’, i.e., a collection of two sub-trials that have the same spawning position and require the same arm for grabbing the object. The difference between the sub-trials is that the first sub-trial requires the first interaction metaphor to grab the object, and the second requires the second metaphor. A trial also contains a 35 3. Avatar Design and Implementation Figure 3.9: The possible spawning positions of the objects for five angles and two distances (pink = medium distance, blue =long distance). unique name for later evaluation. To meet the second requirement, the order of the two sub-trials within a trial is randomized, so that for some positions the first metaphor is the Go-Go Interaction and for others the first interaction metaphor is Gaze-and-Manipulate. The order of the trials is also randomized to prevent the user from anticipating the positions. 
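Returning to the position generation described above, the following minimal sketch illustrates how the spawning positions could be computed, under the assumption that the player's forward vector defines the 0° direction. The method signature, the tuple layout, the distance labels, and the additive height jitter are illustrative assumptions and do not reproduce Listing 4.

using System.Collections.Generic;
using UnityEngine;

// Sketch of the spawn-position generation. Each entry bundles the final position, the
// spawn angle in degrees, and a distance label ("Medium"/"Long"), as described above.
public class SpawnPositionSketch
{
    public List<(Vector3 position, int angleDeg, string distanceLabel)> GeneratePositions(
        Transform player, int[] anglesDeg, float[] distances, float baseHeight, float heightJitter)
    {
        var result = new List<(Vector3, int, string)>();

        foreach (int angleDeg in anglesDeg)
        {
            // Convert the spawn angle from degrees to radians (Equation 3.9).
            float angleRad = angleDeg * Mathf.PI / 180f;

            for (int i = 0; i < distances.Length; i++)
            {
                // Small random vertical offset so the exact grab height cannot be anticipated.
                float height = baseHeight + Random.Range(-heightJitter, heightJitter);

                // Place the grabbable on a circle around the player, relative to its forward axis.
                Vector3 localOffset = new Vector3(
                    Mathf.Sin(angleRad) * distances[i],
                    height,
                    Mathf.Cos(angleRad) * distances[i]);

                Vector3 position = player.position + player.rotation * localOffset;
                string label = (i == 0) ? "Medium" : "Long";

                result.Add((position, angleDeg, label));
            }
        }
        return result;
    }
}

With the configuration used in the user study (angles of -90°, -45°, 0°, 45°, and 90°; distances of 1 meter and 2 meters), such a routine yields the ten spawning positions shown in Figure 3.9.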
For each sub-trial, a UI shows which interaction metaphor and which arm the user must use to grab the object (see Figure 3.10). If the user tries to grab the object with the wrong arm, it will not be possible to grab the object to avoid wrong data in the user study. After each sub-trial (after the user collects the object), a UI is displayed presenting the question “I felt that I could control the virtual arm as if it were my own arm“. A slider is displayed below the question, enabling the user to select a value on a Likert scale (see Figure 3.11) [32]. The scale contains seven values: (1) strongly disagree, (2) disagree, (3) 36 3.5. Evaluation Task Figure 3.10: The UI during each sub-trial to indicate the interaction metaphor and the arm to use. somewhat disagree, (4) neither agree nor disagree, (5) somewhat agree, (6) agree, and (7) strongly agree. The next subsection 3.5.1 explains how the user selects the desired value on the slider and confirms their choice. The answer is then saved in the “Agency question answer“ field of the sub-trial object. Once the user has: 1. grabbed an object with the first interaction metaphor 2. answered the agency question on the Likert scale UI 3. used the second interaction metaphor to grab the object on the same position 4. answered the second agency question a UI appears asking the user for the preferred metaphor for that position and the arm used (see Figure 3.12). This preference information is stored in the trial object in addition to the two sub-trials and will be used later for evaluation. Listing 5 shows the code for setting the preferred metaphor, waiting for user input, and saving the answer to the question in the test data object in the “PreferredInteractionMetaphor“ field. As we are 37 3. Avatar Design and Implementation Figure 3.11: The UI after each sub-trial with a 7-point Likert scale slider. randomly assigning the two metaphors to the two sub-trials, we need to check the current order of the metaphors; otherwise, we would save the wrong data. We then call the function SpawnNextSubTrial() to start the next trial, if there are any left. After a total of 20 trials, another UI appears informing the user that all trials for the current body configuration have been completed and that they can put the HMD down. Figure 3.13 shows the full procedure of the evaluation system for one body configuration. 3.5.1 UI Interaction In order to use the evaluation tasks in a user study, it must be possible to interact with the user interface using the VR controllers in order to answer the questions asked after each sub-trial. The requirement is that it should be possible to answer the questions about the agency and the preference of the interaction metaphors described above without mouse or keyboard input. The reason for this is that users should not put the HMD down to use the mouse and keyboard during the experiments, as this could destroy the immersion. Therefore, a simple and intuitive UI interaction logic was added. We use Unity’s new input system for this as it is versatile in the development of input logic. Additionally, controller inputs from various VR headsets can be used thanks to the OpenXR package. To select the desired value in the Likert slider UI panel or the preferred interaction metaphor in the metaphor selection UI panel, the joystick of the 38 3.5. Evaluation Task Figure 3.12: The UI after both sub-trials with a choice between the two interaction metaphors to indicate the preference. right controller can be used. 
The value of the x-axis of the joystick is being read (this corresponds to a left-right movement of the joystick) and converted to a boolean output. The following listing 6 shows the code for reading the joystick input. As we do not want to read out the input with every update frame but only with every change of direction of the joystick position, we must save the current joystick value (position) and compare it each time so that the selection logic (i.e., a value in the slider is selected or a preferred metaphor is selected) is not triggered with every frame. The selection logic is therefore only triggered in these cases: • Joystick idle position → Joystick moved to the right • Joystick idle position → Joystick moved to the left • Joystick left position → Joystick moved to the right • Joystick right position → Joystick moved to the left Once the desired value has been selected in the Likert slider UI, the selection can be confirmed with the primary button on the right controller, i.e., the “A“ button. Apart from selecting the desired values on the UI panels and confirming with the primary button, the user does not need any other UI interaction methods. 39 3. Avatar Design and Implementation 3.5.2 CSV Export In order to be able to use the data recorded during the trials for later evaluation, it must be converted into a meaningful format and exported. For this purpose, we use the UserStudyExporter.cs script, which converts the trial data into a .csv structure and exports it to a .csv file. Once all trials have been completed, the script automatically creates a file with the name “userstudydata_ddMMyy_HHmmss.csv“ (where the second part is the current date and time) inside the “Exports“ folder. The following data from the trials is saved in the file: • BodyConfig. The body configuration with which the user completes the trial (values: LeftEye-TopBody, RightEye-TopBody, LeftEye-BottomBody or RightEye- BottomBody). • Trial_Name The name that identifies the trial. The name is always as follows: “Trial_X“, where X is a number from 1 to 20 (since there are 20 trials). • SubTrial_Number. The number of the sub-trial (values: 1, 2). After two rows with the sub-trial data for the current trial object, a third line with SubTrial_Number 0 is added. In this third line, the preferred interaction metaphor selected by the user is stored. • Metaphor. The metaphor in use for the current sub-trial. Contains the value “—“ if it is the “‘preference“ row (values: GoGoInteraction, GazeAndManipulate). • Arm. The arm in use to grab the objects for the current sub-trial (values: Left, Right). • Angle. The angle at which the object for this sub-trial spawns (values: -90, -45, 0, 45, 90). • Distance. The distance at which the object for this sub-trial spawns (values: Medium, Long). • Time. The time it takes the user to grab this object. Contains the value “-1“ if it is the “‘preference“ row. • Answer. The answers given by the user during the execution of the task. For the sub-trial rows, this field stores the value of the 7-point Likert scale from 0 to 6 (see Figure 3.11 for the corresponding UI). For the “‘preference“ row, this value stores the preferred interaction metaphor (see Figure 3.12 for the corresponding UI). 40 3.5. Evaluation Task Figure 3.13: One full iteration of the evaluation system for one body configuration. Blue boxes: system actions. Green boxes: user actions. 
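As a concluding illustration of the export format described in section 3.5.2, the following sketch assembles the rows of the .csv file. The SubTrialData and TrialData classes, the comma delimiter, and the values written into the Arm, Angle, and Distance columns of the preference row are simplified assumptions rather than the UserStudyExporter.cs code.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Sketch of the CSV export. Only the column layout, the preference-row placeholders,
// and the file name follow the description above; everything else is illustrative.
public class SubTrialData
{
    public string Metaphor;
    public string Arm;
    public int Angle;
    public string Distance;
    public float Time;
    public int Answer;
}

public class TrialData
{
    public string Name;                // e.g. "Trial_7"
    public SubTrialData First, Second; // the two sub-trials of this trial
    public string PreferredMetaphor;   // answer to the preference question
}

public static class CsvExportSketch
{
    public static void Export(string bodyConfig, IEnumerable<TrialData> trials)
    {
        var sb = new StringBuilder();
        sb.AppendLine("BodyConfig,Trial_Name,SubTrial_Number,Metaphor,Arm,Angle,Distance,Time,Answer");

        foreach (var t in trials)
        {
            AppendSubTrial(sb, bodyConfig, t.Name, 1, t.First);
            AppendSubTrial(sb, bodyConfig, t.Name, 2, t.Second);

            // Third row (SubTrial_Number 0) stores the preferred metaphor, with "—" and -1
            // as placeholders for the Metaphor and Time columns.
            sb.AppendLine(string.Join(",", bodyConfig, t.Name, 0, "—", t.First.Arm,
                                      t.First.Angle, t.First.Distance, -1, t.PreferredMetaphor));
        }

        string fileName = "userstudydata_" + DateTime.Now.ToString("ddMMyy_HHmmss") + ".csv";
        Directory.CreateDirectory("Exports");
        File.WriteAllText(Path.Combine("Exports", fileName), sb.ToString());
    }

    static void AppendSubTrial(StringBuilder sb, string config, string trial, int number, SubTrialData s)
    {
        sb.AppendLine(string.Join(",", config, trial, number, s.Metaphor, s.Arm,
                                  s.Angle, s.Distance, s.Time, s.Answer));
    }
}

Writing one aggregated row (SubTrial_Number 0) per trial keeps the per-trial preference answer next to the two sub-trial rows it refers to.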
41 CHAPTER 4 Evaluation In order to evaluate the interaction metaphors designed for the dissimilar avatar, a user study was conducted with 20 participants (16 male, 3 female, 1 non-binary). The main goal of this study was to evaluate the usability and intuitiveness of the metaphors in the context of co-embodiment and to assess the SoE of the users while controlling the dissimilar avatar. In this study, the application was tested with one person at a time to focus on the interaction metaphors. Co-embodiment with two users interacting simultaneously was not the scope of this study and will be conducted in future works. 4.1 Study Design and Hypotheses 4.1.1 Design In the user study, the two interaction metaphors were tested in four different blocks of body configurations (i.e., which body part the user will control, including which eye and which pairs of arms (see Figure 3.1 for an example body configuration)). Each block consisted of 10 grabbing positions for both arms, in which both interaction metaphors had to be used (see Figure 3.9 for the setup of the grabbing position). In total, there were 40 grabbing to perform, which were randomized within a block. The blocks themselves were counterbalanced in the user study using a Balanced Latin Square design (see Table 4.1). In the user study, we had the following independent variables: • Body Configuration: LeftEye-TopBody, RightEye-TopBody, LeftEye-BottomBody, RightEye-BottomBody • Angles of grabbable spawning positions: -90°, -45°, 0°, 45° and 90° • Distances of grabbable spawning positions: medium (1 meter), long (2 meters) 43 4. Evaluation Participant # Block 1 Block 2 Block 3 Block 4 1 LeftEye TopBody RightEye TopBody RightEye BottomBody LeftEye BottomBody 2 RightEye TopBody LeftEye BottomBody LeftEye TopBody RightEye BottomBody 3 LeftEye BottomBody RightEye BottomBody RightEye TopBody LeftEye TopBody 4 RightEye BottomBody LeftEye TopBody LeftEye BottomBody RightEye TopBody Table 4.1: Order of the body configurations defined by a Balanced Latin Square design. For the fifth participant, the order of participant #1 was used again, and so on. • Interaction metaphor: Go-Go Interaction, Gaze-and-Manipulate Interaction • Arm: Left Arm, Right Arm For the spawning position of the grabbable, we chose a height of 1.2 meters (with a 0.2 random offset); for the Go-Go interaction, we chose an arms length of 0.75 meters; and for the Go-Go coefficient K, we chose the value 6. Regarding the parameters of the Gaze-and-Manipulate interaction, we chose the smooth time to be 0.05 sec and the gaze ray distance to be 0.07 meters. 4.1.2 Research Questions and Hypotheses The main aim of this work was to investigate how useful the developed interaction metaphors are for the dissimilar avatar and how strong the user’s sense of agency was through the use of the metaphors. Our main research questions are: • How do different interaction metaphors influence the user’s SoE and immersion in VR with a dissimilar avatar? • Which interaction techniques do users prefer for interacting with objects while being embodied in the dissimilar avatar? We hypothesized that in the selection tasks, the objects with an angle closer to zero (i.e., more frontal to the user) are easier to grab than the objects that are further to the side of the user. We also hypothesize that objects that are further away from the user’s body are faster to grab with the Gaze-and-Manipulate interaction metaphor than with the Go-Go interaction metaphor. 
Accordingly, we believe that the nearer objects will be grabbed faster with the Go-Go interaction. Therefore, we want to test the following hypotheses on “User Performance“: 44 4.2. Technical Setup and Equipment H1.1: Objects that are close to the avatar can be grabbed faster with Go-Go interaction than with Gaze-and-Manipulate. H1.2: Objects that are further away can be grabbed faster with Gaze-and-Manipulate than with Go-Go interaction. H1.3: Objects located at positions with a smaller angle relative to the player axis can be grabbed faster than objects located at a larger angle. A central component in the design of the interaction metaphors was to convey a high SoE and sense of agency. We hypothesize that SoE and agency would be affected differently depending on the trial configuration or body configuration. We test the following hypotheses on “Sense of Embodiment and Agency“: H2.1: The sense of agency will be influenced by the trial configuration (metaphor, distance). H2.2: The sense of agency will be influenced by the body configuration. H2.3: The sense of self-location will be influenced by the body configuration. We want to find out which interaction metaphor is more user-friendly and preferred by the users. Since the Go-Go interaction is intuitive to use and we assume that the Gaze-and-Manipulate interaction is more complicated to use, as well as the “teleportation“ of objects may seem unnatural to the user, our final hypothesis on “User Preference“ is as follows: H3: Users will favor Go-Go interaction over Gaze-and-Manipulate interaction. 4.2 Technical Setup and Equipment For the user study, a virtual scene was created in Unity3D that contained a defined area surrounded by a fence. This fence was used to make it clear to the user that they did not have to move and that the objects would not appear outside this boundary. In addition, this fence was incorporated as a tool to increase the user’s immersion in the scene and allow them to better perceive distance in space. Figure 4.1 shows the technical setup and the first-person perspective of the user performing a grabbing task. The user study was carried out at the VR Lab at TU Wien and for some participants at the authors’ home. The main play area for the user study was an area measuring approximately 2.5 meters by 1.5 meters. The study was conducted with an Oculus Quest 2 HMD and two Oculus Quest 2 Touch controllers, with the HMD connected to a desktop computer via cable. The user study was conducted on a PC with Intel Core i9 (3.5GHz) CPU, 32GB RAM, NVIDIA GeForce RTX 3090 GPU in the VR Lab at TU Wien. At the authors’ home, a PC with AMD Ryzen 5 (3.6GHz) CPU, 32GB RAM, and NVIDIA GeForce RTX 2060 GPU was used. 45 4. Evaluation Figure 4.1: Left: User wearing a HMD and controllers performing a grabbing task. Right: First-person perspective of the VR environment. 4.3 Participants Twenty participants aged between 21 and 40 were recruited for the study. All participants had normal or corrected-to-normal vision, and all but one were right-handed. Of these participants, 10 reported having experience with VR using an HMD (7 of those regularly (experience level 6), 3 of those reported experience level 5), 4 reported moderate experience (level 2-4), and 6 had little to no experience with VR (2 answered experience level 0, 4 experience level 1). The VR experience levels are shown in Figure 4.2a. 
Regarding video game experience (Figure 4.2b), 7 participants reported a level of 6 (regularly), 5 reported a level of 5, while 3 reported a level of 3, and 3 reported experience level 0 (never). Experience levels 1 and 4 were each reported once. 4.4 Procedure During the study, a second instance of the application was launched that connected to the first instance and took control of the limbs that the user was not controlling. Although these limbs were not moved, their presence simulated a co-embodied scenario. Throughout the evaluation, participants were closely supervised by an experimenter 46 4.4. Procedure (a) VR with HMD experience. (b) Video games experience. Figure 4.2: Participants’ experience levels with VR with an HMD and video games. seated nearby. The experimenter first provided instructions on the VR controls and then offered assistance during each trial, if needed. The user study was divided into several parts: 1. Before the first block of trials: A preliminary questionnaire was used to collect demographic data from the participants. 2. During the four blocks of trials: Participants were asked to perform tasks in a body configuration in the VR application. 3. After each of the four blocks of trials: Participants were asked to complete a post-block questionnaire to collect data about their perceived SoE. 4. After the last block of trials: A post-experiment questionnaire was used to collect data about the participants’ VR experience. Parts 2 and 3 were repeated four times, with a different body configuration chosen for each block in order to test all possible configurations. Between parts 2 and 3, participants were given some time to recover from the VR use while completing the post-block questionnaire. Figure 4.3 shows a flowchart of the different parts of the user study and the approximate time allocated to each part. To reduce the possible influence of learning effects on the measurement data, we used a Balanced Latin Square design to counterbalance the order of the body configurations, as shown in Table 4.1. This design ensured that each configuration was presented in every possible order, thus preventing systematic bias in our results. 47 4. Evaluation Figure 4.3: Procedure of the study with the approximate time allocated to each part. As the user study tasks described above could be carried out in a stationary position and the participants did not have to move around in the room, the user study did not require a lot of space. It was sufficient if the participants had enough space to fully extend their arms and possibly lean forward or to the side. This allowed the study to be conducted in spaces with limited room, such as the authors’ home or the VR lab at TU Wien. Despite the different locations, the same procedure was followed in order to obtain usable and comparable data. After being welcomed and told how much space they would need to perform the tasks to avoid any risk of injury or damage to the equipment, they were asked to complete the preliminary demographic questionnaire. The HMD and controller with the necessary controls were then explained to the participants, and the HMD was handed out. The test environment was started to give the participants the opportunity to get used to the VR experience. In order to familiarize the participants with the upcoming tasks and interaction possibilities, they were given the opportunity to try out both metaphors (Go-Go interaction, Gaze-and-Manipulate) by grabbing dummy objects placed in the scene. 
A mirror was also placed in the virtual scene so that participants could see the dissimilar avatar they were controlling and explore the capabilities and physical limitations of the avatar. This “warm-up round“ was carried out until they felt ready to start the user study. At the start of the user study, the initial body configuration for the virtual avatar was set, and the first of four blocks of trials began. Each trial consisted of a grabbing task (first randomized interaction metaphor), followed by a Likert-type scale question to capture the user’s sense of agency, followed by the second grabbing task (second interaction metaphor), followed by another Likert-type scale question, and finally a choice between the two interaction metaphors as to which one the user preferred for that cabbage 48 4.5. Data Collection and Analysis position and arm. This was repeated 20 times, for a total of 40 grabbing tasks (40 agency questions and 20 preferred metaphor questions). After one body configuration, participants were given a post-block questionnaire about the VR experience. After a short break, the described procedure was repeated three more times to collect data for all four body configurations. At the end of the experiment, the participants were given a post-experiment questionnaire in which they answered statements about the VR experience and the interaction metaphors and could optionally fill in a free-form text field for additional comments and remarks. 4.5 Data Collection and Analysis We collected both objective and subjective data in the user study. During the trials, objective data was collected in the form of task completion time. The post-block questionnaires and the post-experiment questionnaire collected subjective data in the form of 7-point Likert-type scale statements and open-ended questions. In the objective evaluation, we compared the interaction metaphors in terms of the time required to complete the tasks. In the subjective evaluation, we analyzed the answers to the Likert- type scale statements and open-ended questions about agency, SoE, and workload. The results of the objective and subjective evaluation are presented in the following section 4.6 and discussed in chapter 5. 4.5.1 Demographics In a preliminary questionnaire, participants were asked to provide demographic informa- tion such as gender and age and to rate their experience with VR headsets and video games on a scale of 0 to 6, with 0 (never) being the lowest and 6 (regularly) being the highest. They were also asked to rate their general well-being before the start of the user study on a scale of 0 to 10 (0 being how they felt when they entered, 10 being that they would like to stop the experiment). 4.5.2 Objective Data In the objective evaluation, the times measured during the execution of the evaluation tasks were analyzed. For each sub-trial (see section 3.5 for more details), the time it takes a user to grab and collect the grabbable object was measured. The time started when the grabbable object spawned and the user received information on the UI about which interaction metaphor and which arm to use to grab the object. The time was stopped as soon as the user had placed the object in the basket. The data was used to find out which interaction metaphor was used, which arm was used, and which grabbable object positions took users longer to collect the objects and in which setting they were faster. This information was then used to determine which interaction metaphor is objectively more efficient. 49 4. 
Evaluation ID Statement OW1 It felt like the virtual body was my body. OW2 It felt like the virtual body parts were my body parts. OW3 The virtual body felt like a human body. OW4 It felt like the virtual body belonged to me. AG1 The movements of the virtual body felt like they were my movements. AG2 I felt like I was controlling the movements of the virtual body. AG3 I felt like I was causing the movements of the virtual body. AG4 The movements of the virtual body were in sync with my own movements. CH1 I felt like the form or appearance of my own body had changed. CH2 I felt like the weight of my own body had changed. CH3 I felt like the size (height) of my own body had changed. CH4 I felt like the width of my own body had changed. Table 4.2: First set of statements about ownership (OW), agency (AG) and change (CH) of the post-trial questionnaire. The participants answered on a 7-point Likert-type scale indicating the extent to which the statement applied to them during the trial (1=strongly disagree, 7=strongly agree). Statements taken from [43]. 4.5.3 Subjective Data In addition to the preliminary questionnaire, participants were given post-block question- naires and one post-experiment questionnaire to complete. These questionnaires included questions about the VR experience, agency and SoE, and workload. Participants were asked to complete a post-block questionnaire after each of the four blocks of the body configurations. After the last run, a short post-experiment was given to the partici- pants to complete. The preliminary, post-block, and post-experiment questionnaires can be found in Appendix G. The post-experiment questionnaire used a mixed-methods approach, including both Likert-type scale questions and open-ended questions with free-text responses, to gather subjective feedback from participants. Our user study focused primarily on the collection of subjective data in the form of 7- point Likert-type statements (where the user indicated the extent to which the statement applied to them during the trial) and in the form of preference questions. The subjective data was divided into three different categories, depending on when they were collected during the user study: post-trial, post-block and post-experiment data. Post-trial data includes the 7-point Likert scale agency statement data after each grabbing sub-trial and the preferred metaphor question after each trial. Post-block data, on the other hand, refers to the embodiment questionnaires after each of the four body configuration blocks. Finally, post-experiment data includes 7-point Likert scale statements about the VR experience as well as a free-form text field for comments. In the analysis, these three types of data categories were distinguished and analyzed individually. For the post-block data analysis, two sets of statements were evaluated after each of 50 4.5. Data Collection and Analysis ID Statement SL1 I felt as if my body was located in the center of the virtual body. SL2 I felt as if my body was located to the left of the virtual body. SL3 I felt as if my body was located to the right of the virtual body. SL4 I felt as if my head was located in the center of the virtual body. SL5 I felt as if my head was located to the left of the virtual body. SL6 I felt as if my head was located to the right of the virtual body. SL7 I felt as if my arms were where I saw the upper arms of the virtual body to be. SL8 I felt as if my arms were where I saw the lower arms of the virtual body to be. 
SL9 I felt as if my arms were to the left from where I saw the arms of the virtual body to be. SL10 I felt as if my arms were to the right from where I saw the arms of the virtual body to be. SL11 It was easy to grab the cabbages. Table 4.3: Second set of statements about self-location (SL) of the post-trial questionnaire. The participants answered on a 7-point Likert-type scale indicating the extent to which the statement applied to them during the trial (1=strongly disagree, 7=strongly agree). Custom statements by the authors. the four body configurations. The first set consisted of statements about the sense of ownership, agency, and change experienced by the participants while performing the tasks. The second set consisted of statements about self-location. Table 4.2 shows the statements from the first set, and Table 4.3 shows the statements from the second set.

To be able to make a meaningful statement about each individual factor ("ownership", "agency", "change"), the results have to be aggregated and grouped accordingly. This was done by calculating a score for each factor, which is the mean of the scores of its four statements. Specifically, this means that the scores for statements OW1 to OW4 were added together and divided by four to give an overall score for the factor "ownership". The same was done for the factors "agency" and "change" (see Equations 4.1, 4.2 and 4.3) [43]. We then compared the results of the individual factors between the different body configurations.

Scoring "Ownership" = (OW1 + OW2 + OW3 + OW4) / 4   (4.1)

Scoring "Agency" = (AG1 + AG2 + AG3 + AG4) / 4   (4.2)

Scoring "Change" = (CH1 + CH2 + CH3 + CH4) / 4   (4.3)

ID Statement P1 I liked my virtual body. P2 My virtual body was disturbing. P3 It was easy to interact with the Go-Go interaction metaphor. P4 It was easy to interact with the GazeAndManipulate interaction metaphor. Table 4.4: Statements of the post-experiment questionnaire about the VR experience and interaction metaphor usability. The participants answered on a 7-point Likert-type scale indicating the extent to which the statement applied to them during the trial (1=strongly disagree, 7=strongly agree). Custom statements by the authors.

For the post-experiment data analysis, statements about the VR experience and the usability of the two interaction metaphors, as well as additional comments, were analyzed. The statements are listed in Table 4.4.

4.5.4 Statistics

During the user study, we recorded a total of 3200 sub-trials (1600 trials) from 20 users. Some of the sub-trials took an unreasonably long time, especially the first few, because the participants often did not look in all directions to find the object and therefore took longer to find it. Therefore, we used the "Interquartile Range" (IQR) method to detect and remove outliers from the data. To do this, we calculated the IQR for each independent variable and removed all data points that were more than 1.5 * IQR below the 25th percentile (Q1) or above the 75th percentile (Q3). This method ensured that extreme outliers in the data points did not skew the overall result in one direction or the other. 196 outliers were identified and discarded, resulting in a total of 3004 sub-trials (1502 trials).

Regarding statistical analyses, we proceeded with the following pipeline. For normally distributed metrics, assessed using the Shapiro-Wilk test, we used an analysis of variance (ANOVA) with repeated-measures factors.
Regarding statistical analyses, we proceeded with the following pipeline. For normally distributed metrics, assessed using the Shapiro-Wilk test, we ran analyses of variance (ANOVA) with repeated-measures factors. Greenhouse-Geisser adjustments were applied to the degrees of freedom when the sphericity assumption was violated. For metrics that deviated from a normal distribution, we used the non-parametric Aligned Rank Transform (ART) test [47]. The post-hoc analysis involved pairwise t-tests with Bonferroni corrections for the normally distributed dependent variables or the multifactor contrast test procedure presented in [13] for the non-normally distributed ones. Tests are reported as significant for p-values lower than 0.05.

4.6 Results

In this section, we present the results of the data evaluation. We first present the results from the objective data evaluation, followed by the data from the subjective data evaluation (divided into post-trial, post-block, and post-experiment data).

4.6.1 Objective Data

Task Completion Time - Figure 4.4 shows how long users took on average to grab and collect objects for the independent variables "distance" and "angle". We compared the four independent variables "body configuration", "arm", "distance", and "angle". The results indicated an effect of the selection technique on completion time (F(1,3000) = 33.64, p < 0.001), where participants were faster with Go-Go (4.82±2.05) than with Gaze-and-Manipulate (5.22±2.31). For the independent variables "body configuration" and "arm", no significant differences in task completion time were observed. However, for the independent variable "distance" we could see that a smaller distance leads to shorter reaching times for both metaphors (F(1,3000) = 116.51, p < 0.001). Participants were faster to grab closer cabbages (4.68±2.12) than farther ones (5.37±2.21). Similarly, a significant difference in task completion time was found based on the angle of the object (F(4,3000) = 180.04, p < 0.001). Cabbages were grabbed faster when they were in front of the users at angle 0° (3.79±1.64), followed by -45° (4.71±1.96) and 45° (4.90±2.18), followed by 90° (5.79±2.14) and -90° (5.99±2.25).

Figure 4.4: Task completion times for both interaction metaphors for the independent variables "distance" (left) and "angle" (right).

4.6.2 Subjective Data

Post-Trial Data

Agency - After each sub-trial, we collected and analyzed the responses to the 7-point Likert statement. This statement referred to the experienced agency during the grabbing task. The results can be observed in Figure 4.5. All values on the 7-point Likert scale were given at least once, with most responses in the upper range. The plot shows that 50% of the data lay between agency scores 5 and 7. We found a significant effect of the selection technique on the agency after grabbing a cabbage (F(1,3000) = 108.50, p < 0.001), where participants felt a higher sense of agency with Go-Go (4.81±1.05) than with Gaze-and-Manipulate (4.49±0.97). Moreover, we noticed an effect of the independent variable "angle" on agency after each trial (F(4,3000) = 17.71, p < 0.001), where users felt a higher agency for cabbages in front of them at angle 0° (4.83±0.91), followed by cabbages at -45° (4.72±0.98) and 45° (4.70±1.03), followed by cabbages at -90° (4.50±1.06) and 90° (4.50±1.10).

Figure 4.5: Scores for the agency statement displayed after each sub-trial for the independent variables "metaphors" (top left), "arm" (top right), "angle" (bottom left), and "distance" (bottom right).

Selection Metaphor Preference - The data about the users' preferences are shown in Figure 4.6. These data refer to the question about the preferred interaction metaphor, asked after each trial.
For the independent variables "distance" and "angle", a clear preference for the Go-Go interaction was seen. When the objects were positioned further away from the user, this preference was somewhat weaker but still clearly in favor of the Go-Go interaction. For the independent variable "angle", this preference for Go-Go was markedly stronger for the smaller angles than for the extreme angles (-90° and 90°). For objects positioned at extreme angles, there was only a small preference for Go-Go. When the objects were positioned directly in front of the user, Go-Go was preferred almost twice as often as Gaze. The analysis of the independent variable "body configuration and angle" confirmed that the Go-Go interaction metaphor was generally preferred, but two outliers are visible in the plot: for the extreme angle of -90°, the Gaze-and-Manipulate metaphor was preferred for the body configuration "Left Lower", and likewise for the angle 90° with the body configuration "Right Upper". Otherwise, a similar picture emerged, i.e., the Go-Go interaction was favored more strongly for the smaller angles than for the larger angles.

Figure 4.6: Preference scores for the independent variables "distance" (top left), "angle" (top right) and "body configuration and angle" (bottom).

Post-Block Data

This section presents the results of the post-block data analysis, which consisted of the summed scores of the factors of the first block of statements (ownership, agency, and change statements) and the individual scores of each statement of the second block (self-location statements) from the post-block questionnaire (refer to Appendix G for the complete questionnaire). The scores are shown in Figures 4.7, 4.8, and 4.9.

Ownership - The statements on body ownership were rated similarly for all body configurations (F(3,57) = 2.08, p = 0.13), with a median between 4 and 4.5 for all four configurations. For the "Left Upper" and "Right Upper" configurations, the sense of ownership was rated at least once with a score of 1, with "Left Upper" also receiving a score of 7 at least once (Figure 4.7a).

Agency - We found a significant effect of "body configuration" on the agency scores (F(3,57) = 3.86, p < 0.01). The median for all configurations was between 5.3 and 5.8 (Figure 4.7b), and post-hoc tests showed that participants reported the lowest agency with the "Right Lower" configuration (5.55±0.77), while the highest sense of agency was experienced with the "Right Upper" configuration (5.77±0.81).

Change - We found no significant effect of "body configuration" on the change scores (F(3,57) = 2.11, p = 0.10). In addition, each body configuration received a score of 1 at least once, and the sense of change was not rated 7 for any configuration (Figure 4.7c).

Figure 4.7: Boxplots of the ownership (a), agency (b) and change (c) statement scores (OW1-OW4, AG1-AG4, CH1-CH4) from the Virtual Embodiment Questionnaire [43].

SL1 - For this statement, we found no significant differences between the individual body configurations (F(3,57) = 0.86, p = 0.46). The median of all four configurations was 4.5 or 5 (Figure 4.8a).

SL2 - For this statement, we found statistically significant differences between the different body configurations (F(3,57) = 6.30, p < 0.001). Post-hoc tests showed that the scores were higher for "Left Lower" (4.25±2.12) and "Left Upper" (3.80±1.76) compared to "Right Upper" (2.40±1.39) and "Right Lower" (2.55±1.50). The analysis clearly shows that the participants rated this statement higher when they embodied the left eye than when they embodied the right eye (Figure 4.8b).
SL3 - We also found statistically significant differences between the different body configurations (F(3,57) = 6.30, p < 0.001). Post-hoc tests showed that the scores were higher for "Right Lower" (3.65±1.98) and "Right Upper" (3.70±1.83) compared to "Left Upper" (2.35±1.13) and "Left Lower" (2.10±1.20). The analysis showed that the participants rated this statement higher if they embodied the right eye compared to the left eye (Figure 4.8c).

SL4 - We found no significant differences (F(3,57) = 0.33, p = 0.80) between the body configurations for the statement that users located their heads in the center of the virtual body (Figure 4.8d).

SL5 - Similar to SL2 and SL3, we also found statistically significant differences between the body configurations (F(3,57) = 9.94, p < 0.001). Post-hoc tests showed that "Left Lower" (4.55±2.03) had higher scores than "Right Lower" (2.55±1.50) and "Right Upper" (2.05±0.88), and "Left Upper" (3.50±1.76) than "Right Upper" (Figure 4.8e).

SL6 - We found a significant effect of body configurations (F(3,57) = 8.49, p < 0.001). Post-hoc tests showed that scores for "Right Lower" (3.85±1.89) and "Right Upper" (3.85±1.78) were higher than for "Left Lower" (2.00±1.02) and "Left Upper" (2.55±1.39) (Figure 4.8f).

Figure 4.8: Boxplots of the self-location statement scores (SL1-SL6) grouped by body configuration.

SL7 - This statement referred to the location of the arms and whether the participant saw them where the upper arms of the virtual body were or where the lower arms of the virtual body were. Again, we found a statistically significant difference between the different body configurations (F(3,57) = 5.69, p < 0.001). This statement was rated higher when participants controlled the upper arms (5.20±1.28) compared to the lower arms (3.95±1.78) (Figure 4.9a).

SL8 - The same applied here as for SL7, with the difference that the statement referred to the location of the lower arms. We found a significant effect of body configurations (F(3,57) = 4.87, p < 0.001). The scores were higher if the participants performed the grabbing tasks with the lower arms (3.85±1.85) than with the upper arms (2.55±1.46) (Figure 4.9b).

SL9 - For this statement, we did not find a significant effect of body configurations (F(3,57) = 2.35, p = 0.08). The statement referred to the arms being to the left of where participants saw the arms of the virtual body (Figure 4.9c).

SL10 - We could not find any significant differences for this statement (F(3,57) = 0.13, p = 0.93) (Figure 4.9d).

SL11 - We also found no differences between the individual body configurations for this statement. The statement that it was easy to grab the cabbages was consistently rated highly, with a median of 6 for all body configurations (Figure 4.9e).

Figure 4.9: Boxplots of the self-location statement scores (SL7-SL11).

Post-Experiment Data

This section summarizes the results of the post-experiment data analysis, which consisted of the scores of the statements of the post-experiment questionnaire (the full questionnaire
can be found in Appendix G) and a summary of the additional comments made by the participants. The results are shown in Figure 4.10.

Figure 4.10: Scores for the post-experiment questionnaire statements.

Post-Experiment Questions

Participants largely reported that they liked the virtual avatar (P1) (4.55±1.73), but some found it disturbing (P2) (3.15±1.54), possibly due to its inhuman appearance. When asked if it was easy to interact with the Go-Go interaction (P3), the majority gave a very high score (6.00±0.97). However, when asked about the ease of interacting with the Gaze interaction metaphor (P4), participants tended to give a lower score (4.80±1.57).

Participant Feedback

Some participants stated that they found the Go-Go interaction intuitive and comfortable. In addition, some stated that they preferred the Go-Go interaction for interacting with close objects and the Gaze interaction for interacting with distant objects. In terms of the Gaze interaction, some participants commented that it would be more convenient if the same button was used for retrieving and grabbing an object rather than two separate buttons. One participant said that a crosshair would be helpful for aiming at objects and that gaze would work better if eye movements were tracked by the HMD. Regarding the avatar, it was noted that the shoulder height of the avatar should be adjusted to the real height of the participant in order to improve the agency. It was also noted that a third option in the user interface for selecting the preferred metaphor would be good, as sometimes both interaction metaphors felt the same in terms of agency and participants were forced to choose one of them.

CHAPTER 5
Discussion

In this chapter, we summarize and discuss the results of the user study. We test the proposed hypotheses by interpreting the collected data on task performance, user experience, and SoE. In Section 5.4, we draw conclusions about the usability of the dissimilar avatar in a co-embodied scenario and the effectiveness of the two interaction metaphors, and in Section 5.5 we point out the limitations of our work.

5.1 User Performance

5.1.1 H1.1

The results showed that the time taken by users to grab the objects with the two interaction metaphors differed only slightly for the medium and long distances (see Figure 4.4). The median for all positions and both interaction metaphors was between 4 and 5.3 seconds. Nevertheless, it could be seen that for both the long and medium distances, the time taken by the users to grab the object with the Go-Go interaction was slightly shorter. This may be due to the fact that the Go-Go interaction does not require any additional button clicks, whereas the Gaze-and-Manipulate interaction requires a button click to retrieve the object to the hand. Although the literature suggested that gaze-based interaction was faster than hand-based selection with a mouse [44] and provided more subjective immersion than mouse-based interaction [18], we did not observe such tendencies. However, studies have shown that the Go-Go interaction (in combination with the PRISM technique) was faster when interacting with objects at close range (objects 0.6 meters away), whereas gaze-based interactions were faster when interacting with objects at a longer distance (objects 3+ meters away) [46]. This was partly consistent with our results, as our long-distance setup used a distance of 2 meters, which may have been too short for efficient gaze interactions.
We therefore hypothesize that we obtained these results because the objects were too close to exploit the full potential of the Gaze-and-Manipulate interaction. In summary, our first hypothesis (H1.1) is validated; objects that are close to the avatar can be grabbed faster with the Go-Go interaction than with Gaze-and-Manipulate.

5.1.2 H1.2

The results also showed that task completion times for the Go-Go interaction were slightly shorter than for the Gaze-and-Manipulate interaction for both the medium and long positions (see Figure 4.4). This could be due to the fact that in the Gaze-and-Manipulate interaction, the user first has to aim at the object before they can retrieve it, which can take a short moment if the object is not directly being looked at. In summary, these results do not support our second hypothesis (H1.2). Objects that are further away cannot be grabbed faster with Gaze-and-Manipulate than with the Go-Go interaction; instead, they can be grabbed slightly faster with the Go-Go interaction.

5.1.3 H1.3

The results confirmed that the time it took participants to grab and collect objects depended on the position at which the object was spawned. While there was no significant difference in task completion time between a positive angle and its mirrored negative value, there was indeed a difference in time between smaller and larger angles (regardless of whether the angle was positive or negative) (see Figure 4.4). The measured time was smallest for the 0° angle, followed by the 45°/-45° angles. The longest time was measured for the 90°/-90° angles. As Fitts' law states, the time required to reach an object depends on the distance to and the width of the target, in addition to the reaction time required by the user to localize the target (see Section 2.6 for a detailed explanation). We assume that users could locate objects in front of them more quickly than objects to the side, as they did not have to "search" for the object. Therefore, the total time required to complete the task was shorter. Thus, our third hypothesis (H1.3) is validated; objects located at positions with a smaller angle relative to the player axis can be grabbed faster than objects located at a larger angle.
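For reference, a common Shannon formulation of Fitts' law (cf. [36] and Section 2.6) expresses the predicted movement time MT as a function of the target distance D and target width W; the constants a and b are empirically fitted and were not estimated in this study:

MT = a + b \log_2\left(\frac{D}{W} + 1\right)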
5.2 Sense of Embodiment and Agency

5.2.1 H2.1

We tested whether the different trial configurations (metaphor, arm, angle, distance) had an effect on the sense of agency. We found no significant effects for the independent variables "arm" and "distance" (see Figure 4.5). For the variable "metaphor", however, the results showed that Go-Go conveyed a stronger sense of agency than Gaze-and-Manipulate (see Figure 4.5). The same was true for the variable "angle": the smaller the angle, the greater the sense of agency. Participants rated agency lowest at the extreme angles (-90°/90°) (see Figure 4.5). We believe that Go-Go created a stronger sense of agency because the movements of the avatar and objects in this metaphor were completely controlled by the user, as opposed to Gaze, where sometimes a "virtual arm halo" moved the object. Since studies have shown that the sense of agency is based on, among other things, efferent motor signals, reafferent feedback signals, and action intentions, which were not fully present in the case of the arm halo because the user did not actively control it, agency may have suffered [10].

Regarding the independent variable "angle", we hypothesize that the smaller angles had higher agency scores because it felt more realistic for users to grab the objects from the front rather than from the side. The fact that the dissimilar avatar did not turn to the side as soon as the user turned around may have given users a strange sense of embodiment, which affected agency. This could also be seen from the fact that the task completion time was shorter for the smaller angles, which meant that it was easier for users to grab the object overall. We conclude that our hypothesis (H2.1) is partially validated, as only certain trial configurations significantly influence the sense of agency.

5.2.2 H2.2

Furthermore, we tested whether the sense of agency was influenced by the body configuration. The results showed that there was a significant effect of the independent variable "body configuration" on the sense of agency (see Figure 4.7b). This could be due to the fact that the upper arms were more in line with the actual shoulders, resulting in a more human-like embodiment. For the lower arms, the sense of agency may have suffered because the virtual arms were positioned too far down the body, resulting in too much difference between the positions of the real and virtual shoulders. Although the body configuration influenced the users' agency, there were no significant results in terms of task performance. On average, users were similarly fast at collecting the objects with both the upper and the lower arms. This could be due to the fact that users were very familiar with the movements of the virtual avatar due to the high agency, and therefore the body configuration did not hinder them much in performing the movements, resulting in similarly good performance for all configurations. This leads to the conclusion that our hypothesis (H2.2), according to which the body configuration influences agency, is validated.

5.2.3 H2.3

Regarding our third hypothesis (H2.3), we found an influence of the independent variable "body configuration" on the sense of self-location (see Figures 4.8b, 4.8c, 4.8e, 4.8f, 4.9a, and 4.9b). The statements about the position of the arms and head were rated higher when the participants were in the body configurations mentioned in the statements. For example, SL2 ("I felt as if my body was located to the left of the virtual body." [43]) was rated higher for the body configurations "Left Upper" and "Left Lower" than for "Right Upper" and "Right Lower". This showed that users perceived the body configurations they embodied exactly as we intended and that the dissimilar avatar provided a consistent sense of self-location to the users. Therefore, our hypothesis (H2.3) is validated; the body configuration influences the sense of self-location.

5.3 User Preferences

5.3.1 H3

Finally, the results showed that there was a clear overall preference for one interaction metaphor. At both medium and long distances, participants preferred the Go-Go interaction over the Gaze-and-Manipulate interaction (see Figure 4.6). The same applied to the different angles of the positions at which the objects were spawned. While a clear preference for the Go-Go interaction could be observed at the smaller angles (at the 0° angle the number of responses in favor of the Go-Go interaction was almost twice as high as for the Gaze-and-Manipulate interaction), the larger angles (90°/-90°) no longer showed such a clear preference. In addition, when we looked at the factor "body configuration and angle", we could see that for the extreme angles there was even a preference for the Gaze-and-Manipulate interaction over the Go-Go interaction (see Figure 4.6). One reason for this could be that the Go-Go stretch logic behaved differently depending on the direction in which the arm was stretched. Because the pivot point of the stretch logic was fixed, the virtual arm was stretched more when the real arm was moved forward than when it was moved sideways. It is therefore possible that objects spawned at a larger angle could not be reached with Go-Go as easily, and without the user having to lean sideways, as with the Gaze-and-Manipulate interaction. In summary, our hypothesis (H3) is validated by the results; objects located at positions with a smaller angle relative to the player axis are preferentially grabbed with Go-Go rather than with Gaze-and-Manipulate.
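One possible direction for the "adaptive pivot" idea raised here (and taken up again in Section 5.4) would be to let the Go-Go pivot follow the current reach direction instead of staying at a fixed point on the torso. The following C# sketch is not part of the implemented platform; it only illustrates the idea under our own assumptions, and the field names (shoulder, realHand, backOffset) are hypothetical.

using UnityEngine;

// Hypothetical sketch: keeps the Go-Go pivot at a constant offset behind the
// shoulder, opposite to the current reach direction.
public class AdaptiveGoGoPivot : MonoBehaviour
{
    public Transform shoulder;       // anchor of the controlled tentacle (assumption)
    public Transform realHand;       // tracked controller position (assumption)
    public float backOffset = 0.25f; // distance of the pivot behind the shoulder, in meters

    public Vector3 ComputePivot()
    {
        // Horizontal direction from the shoulder towards the real hand.
        Vector3 reachDir = realHand.position - shoulder.position;
        reachDir.y = 0f;
        if (reachDir.sqrMagnitude < 1e-6f)
            return shoulder.position - shoulder.forward * backOffset;
        reachDir.Normalize();

        // Place the pivot behind the shoulder along the opposite reach direction, so the
        // hand-to-pivot distance depends mainly on arm extension rather than reach direction.
        return shoulder.position - reachDir * backOffset;
    }
}

With such a pivot, the real-hand-to-pivot distance (and thus the amount of stretch) would depend mainly on how far the arm is extended, not on whether the user reaches forward or to the side.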
5.4 Discussion

To summarize, the developed interaction metaphors were an effective complement to the dissimilar avatar and, according to the results of the user study, clearly promoted an SoE. In line with Xu et al. [49], who found that embodying and controlling a dissimilar avatar, such as a stray animal, elicited a strong SoE in users, we found that our dissimilar avatar also conveyed a decent SoE. It turned out that the Go-Go interaction was overall slightly more efficient and user-friendly than the Gaze-and-Manipulate interaction metaphor. Especially for close objects, Go-Go was an effective way to use the physical characteristics of the dissimilar avatar and to enrich the interaction with it. Gaze-and-Manipulate, on the other hand, had its strengths when interacting with distant objects, which was consistent with the results of previous work [46]. Therefore, we recommend considering the Go-Go interaction metaphor whenever interaction with close objects is required and the avatar allows a non-linear distortion of the arms. In the course of the user study, we made interesting discoveries that could be useful for the future development of the interaction with the dissimilar avatar, such as that the avatar should possibly rotate with the user to achieve a higher sense of agency (H2.1), or a Go-Go interaction logic with adaptive pivots (H3). In addition, when designing a co-embodied, dissimilar avatar, attention should be paid to where the arms are positioned on the virtual body. In the case of our dissimilar avatar, users perceived a stronger SoE with the upper arms, which can inform the design of future dissimilar avatars. The shoulder height of the avatar should be adjusted to the real height of the participant in order to improve the SoE and agency. Overall, we believe that with this work and the results obtained, we have created an important basis for further research on this topic.

5.5 Limitations

While the developed interaction metaphors aim to provide an efficient way to interact with (distant) VR objects, there are also limitations in the described setting. A major limitation is that the avatar does not move or react to the user's walking movements. A system would need to be developed that effectively transfers both users' walking movements to the avatar without severely compromising the SoE or even creating an "out-of-body" experience. Although the existing VR environment provides a basic solution for navigation, it would need to be reconciled with the two interaction metaphors in a future iteration to realize the full potential of co-embodiment.

Furthermore, in our evaluation task, we only evaluated how efficient the interaction metaphors for the dissimilar avatar are for grabbing objects located in front of the user. We only tested a maximum field of view of 180°. However, it would be interesting to find out how the usability and SoE of users would behave with a 360° setup, i.e., if objects spawned not only in front of the user but also behind them, forcing them to turn around.

Another limitation concerns the Gaze-and-Manipulate interaction metaphor. The selection of objects was found to be difficult and imprecise by some participants in the user study. In our implementation, we use the direction in which the user moves their head to determine the direction of gaze, which only works well if the user aligns their head precisely with the object and rotates their head accordingly. However, this can lead to neck pain during prolonged interaction if objects are positioned very far to the side of the user. One solution would be to track the much more accurate and natural eye movements instead of the head movement and select objects based on the actual direction of gaze. In this way, the selection of objects would be much easier for the user.
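To make this limitation concrete, the head-based gaze direction described above can be thought of as a ray cast from the HMD camera along its forward vector. The platform itself casts nine parallel rays (see Figure 3.7), so the single-ray C# sketch below is only a simplified illustration, and the field names (hmdCamera, grabbableLayer, maxGazeDistance) are our own assumptions.

using UnityEngine;

// Simplified single-ray version of head-based gaze selection; the actual
// implementation uses nine parallel rays (see Figure 3.7).
public class HeadGazeSelector : MonoBehaviour
{
    public Camera hmdCamera;           // main VR camera, i.e., the tracked head pose
    public float maxGazeDistance = 10f;
    public LayerMask grabbableLayer;   // layer containing the grabbable objects

    public GameObject GetGazedObject()
    {
        // The gaze direction is derived from the head orientation only.
        Ray gazeRay = new Ray(hmdCamera.transform.position, hmdCamera.transform.forward);
        if (Physics.Raycast(gazeRay, out RaycastHit hit, maxGazeDistance, grabbableLayer))
            return hit.collider.gameObject;
        return null; // nothing is currently being looked at
    }
}

Because the ray direction comes solely from the head pose, an object is only hit when the head itself points at it, which is exactly why eye tracking would make the selection more comfortable.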
As there is no standard embodiment questionnaire for the use of co-embodied dissimilar avatars, we had to use a questionnaire developed for human avatars for the post-experiment questionnaires. This is a limitation insofar as the embodiment of a dissimilar avatar in a co-embodied scenario is a novel sensation, and the questions of the Virtual Embodiment Questionnaire [43] may not optimally capture the perceived agency and SoE.

CHAPTER 6
Conclusion

6.1 Summary

In this thesis, we have presented an experimental platform where different interaction metaphors can be used to enable co-embodied experiences with a dissimilar avatar in VR. This area of research is still largely unexplored and offers exciting opportunities in the design of new ways of interacting with avatars, as well as the possibility of creating novel experiences with co-embodied, dissimilar avatars that can be used, for example, in collaborative learning environments or therapeutic applications. For the development, we used the Unity3D engine and the Photon Unity Networking 2 (PUN2) library 1 for real-time synchronization of both users' movements. Two users can launch the application and simultaneously control a dissimilar avatar in the form of an upright-standing slug. The dissimilar avatar has four tentacles and two eye stalks. The body configuration, i.e., which user controls which limbs, can be set freely. Furthermore, the users can use two different selection and manipulation metaphors to grab, view, and move objects in the virtual environment. The use of a non-linear mapping technique in the form of the Go-Go interaction technique and a gaze-based interaction metaphor in the form of the Gaze-and-Manipulate interaction technique allows the users to interact with distant virtual objects and manipulate them in an efficient and intuitive way. By conducting a user study, the proposed experimental platform was evaluated in terms of user experience, task performance, and SoE. By comparing the two developed interaction metaphors, the usability and effectiveness of each technique were evaluated, as well as the co-embodiment experience when using a dissimilar avatar. The results showed that both tested interaction metaphors evoked a high sense of agency among the users.
Furthermore, the usability of both techniques was tested and resulted in consistently positive feedback from the participants. There was a slight preference 1https://www.photonengine.com/pun 69 6. Conclusion for the Go-Go interaction, and it was shown that both interaction techniques have the potential to enrich user interaction with a dissimilar avatar and that especially the Go-Go interaction metaphor has a great potential for co-embodied interaction. 6.2 Future Work The experimental platform presented here can be further developed in the future. One possible improvement would be to investigate situations in which two users simultaneously control the dissimilar avatar and interact with the environment. This was not investigated in the user study conducted; instead, only a second instance of the application was launched, and the limbs of the second instance were set to a static position. Although this gave the participants a feeling of co-embodiment, it would be even more immersive if these limbs also moved as a result of user interaction. As described in the “Limitations“ section 5.5, the Gaze-and-Manipulate selection logic relies on the user’s head movements. However, it would be possible to determine the direction of gaze much more accurately with real eye movement detection. This is another possible improvement for future updates of the experimental platform. This would require appropriate hardware, namely HMDs that can record and interpret eye movements, such as the “HTC Vive Pro Eye“ 2 or the “Meta Quest Pro“ 3 headset. Overall, we have created a foundation for future development of interaction in a co- embodied environment with a dissimilar avatar. The avatar can be made even more user-friendly, and the topic of co-embodiment for a dissimilar avatar can be further investigated by developing additional interaction metaphors and working on the co- navigation of the avatar. 2https://www.vive.com/sea/product/vive-pro-eye/overview/ 3https://www.meta.com/at/en/quest/quest-pro/ 70 List of Figures 2.1 The three body representations to test body ownership (first-person perspec- tive) [34]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Types of virtual avatars. (a) and (b) are anthropomorphic avatars, (c) is a non-anthropomorphic (dissimilar) avatar [24]. . . . . . . . . . . . . . . . . 9 2.3 Different avatar representation reported in the literature. . . . . . . . . . 10 2.4 Proposed categorization system for dissimilar avatars applied to a virtual hand [7]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.5 Social presence configurations of agents [35]. . . . . . . . . . . . . . . . . . 12 2.6 Virtual avatar controlled based on the weighted average of the teacher’s and learner’s movements in a co-embodiment scenario. [29]. . . . . . . . . . . 13 2.7 Full iteration of the FABRIK algorithm consisting of a forward iteration (a)-(d) and a backward iteration (e)-(f) [3]. . . . . . . . . . . . . . . . . . 15 2.8 The mapping function F used in the Go-Go Interaction technique [41]. . . 18 2.9 The “Gaze-and-Pinch“ interaction with one or two hands: look at an object, pinch to select it, manipulate it with hand gestures [39]. . . . . . . . . . . 19 3.1 Left - Overview of the project’s architecture, including the co-embodied dissimilar avatar. Right - An example body configuration of the two users showing the limbs they control. . . . . . . . . . . . . . . . . . . . . . . . . 
22 3.2 The interaction test scene in which the user can grab cabbages using the interaction metaphors (left - in third-person perspective, right - first-person perspective). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.3 Overview of the room creation and networking logic when starting the appli- cation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.4 Overview of the logic of the synchronized and networked movements of the players that drive the dissimilar avatar limbs. . . . . . . . . . . . . . . . . 26 3.5 Left: Shaded model of the dissimilar avatar. Right: Transparent avatar with its rig and bones structure. . . . . . . . . . . . . . . . . . . . . . . . . . . 27 3.6 Go-Go interaction with different arm extensions. Left: real arm extension movement. Middle: movement of the virtual avatars’ arm. Right: First- person perspective of the virtual arm. The “wristband“ objects represent the positions of the real hands. . . . . . . . . . . . . . . . . . . . . . . . . . . 30 71 3.7 The nine rays that check for gazed objects. The gaze ray distance (GRD) defines the spacing between the individual rays. (Slightly tilted view to be able to see the rays.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.8 Animation of the arm halo retrieving the distant object. Once the object is close enough, the halo mesh is turned off. . . . . . . . . . . . . . . . . . . 32 3.9 The possible spawning positions of the objects for five angles and two distances (pink = medium distance, blue =long distance). . . . . . . . . . . . . . . . 36 3.10 The UI during each sub-trial to indicate the interaction metaphor and the arm to use. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 3.11 The UI after each sub-trial with a 7-point Likert scale slider. . . . . . . . 38 3.12 The UI after both sub-trials with a choice between the two interaction metaphors to indicate the preference. . . . . . . . . . . . . . . . . . . . . . 39 3.13 One full iteration of the evaluation system for one body configuration. Blue boxes: system actions. Green boxes: user actions. . . . . . . . . . . . . . . 41 4.1 Left: User wearing a HMD and controllers performing a grabbing task. Right: First-person perspective of the VR environment. . . . . . . . . . . . . . . 46 4.2 Participants’ experience levels with VR with an HMD and video games. . 47 4.3 Procedure of the study with the approximate time allocated to each part. 48 4.4 Task completion times for both interaction metaphors for the independent variables “distance“ (left) and “angle“ (right). . . . . . . . . . . . . . . . . 53 4.5 Scores for the agency statement displayed after each sub-trial for the inde- pendent variables “metaphors“ (top left), “arm“ (top right), “angle“ (bottom left), and “distance“ (bottom right). . . . . . . . . . . . . . . . . . . . . . 54 4.6 Preference scores for the independent variables “distance“ (top left), “angle“ (top right) and “body configuration and angle“ (bottom). . . . . . . . . . 56 4.7 Boxplots of the ownership statements (OW1-OW4), agency statements (AG1- AG4) and change statements (CH1-CH4) scores from the Virtual Embodiment Questionnaire [43]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 4.8 Boxplots of the self-location statements scores (SL1-SL6) grouped by body configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 4.9 Boxplots of the self-location statements scores (SL7-SL11). . . . . . . . . 
59 4.10 Scores for the post-experiment questionnaire statements. . . . . . . . . . . 61 72 List of Tables 4.1 Order of the body configurations defined by a Balanced Latin Square design. For the fifth participant, the order of participant #1 was used again, and so on. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.2 First set of statements about ownership (OW), agency (AG) and change (CH) of the post-trial questionnaire. The participants answered on a 7-point Likert-type scale indicating the extent to which the statement applied to them during the trial (1=strongly disagree, 7=strongly agree). Statements taken from [43]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 4.3 Second set of statements about self-location (SL) of the post-trial questionnaire. The participants answered on a 7-point Likert-type scale indicating the extent to which the statement applied to them during the trial (1=strongly disagree, 7=strongly agree). Custom statements by the authors. . . . . . . . . . . . 51 4.4 Statements of the post-experiment questionnaire about the VR experience and interaction metaphor usability. The participants answered on a 7-point Likert-type scale indicating the extent to which the statement applied to them during the trial (1=strongly disagree, 7=strongly agree). Custom statements by the authors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 73 Bibliography [1] Ferran Argelaguet and Carlos Andujar. A survey of 3d object selection techniques for virtual environments. Computers & Graphics, 37(3):121–136, 2013. [2] Ferran Argelaguet, Ludovic Hoyet, Michaël Trico, and Anatole Lécuyer. The role of interaction in virtual embodiment: Effects of the virtual hand representation. In 2016 IEEE virtual reality (VR), pages 3–10. IEEE, 2016. [3] Andreas Aristidou and Joan Lasenby. Fabrik: A fast, iterative solver for the inverse kinematics problem. Graphical Models, 73(5):243–260, 2011. [4] Chris Auteri, Mark Guerra, and Scott Frees. Increasing precision for extended reach 3d manipulation. International Journal of Virtual Reality, 12(1):66–73, 2013. [5] Samantha Bond, Deepika R Laddu, Cemal Ozemek, Carl J Lavie, and Ross Arena. Exergaming and virtual reality for health: implications for cardiac rehabilitation. Current Problems in Cardiology, 46(3):100472, 2021. [6] Matthew Botvinick and Jonathan Cohen. Rubber hands ‘feel’touch that eyes see. Nature, 391(6669):756–756, 1998. [7] Antonin Cheymol, Anatole Lécuyer, Jean-Marie Normand, Ferran Argelaguet, et al. Beyond my real body: Characterization, impacts, applications and perspectives of “dissimilar” avatars in virtual reality. IEEE Transactions on Visualization and Computer Graphics, 2023. [8] Carlos Coelho, Jennifer Tichon, Trevor J Hine, Guy Wallis, and Giuseppe Riva. Media presence and inner presence: the sense of presence in virtual reality technologies. From communication to presence: Cognition, emotions and culture towards the ultimate communicative experience, 11:25–45, 2006. [9] Nicolas Courty and Elise Arnaud. Inverse kinematics using sequential monte carlo methods. In International Conference on Articulated Motion and Deformable Objects, pages 1–10. Springer, 2008. [10] Nicole David, Albert Newen, and Kai Vogeley. The “sense of agency” and its underlying cognitive and neural mechanisms. Consciousness and cognition, 17(2):523– 534, 2008. 75 [11] Diane Dewez, Rebecca Fribourg, Ferran Argelaguet, Ludovic Hoyet, Daniel Mestre, Mel Slater, and Anatole Lécuyer. 
Influence of personality traits and body awareness on the sense of embodiment in virtual reality. In 2019 IEEE international symposium on mixed and augmented reality (ISMAR), pages 123–134. IEEE, 2019. [12] Diane Dewez, Ludovic Hoyet, Anatole Lécuyer, and Ferran Argelaguet Sanz. Towards “avatar-friendly” 3d manipulation techniques: Bridging the gap between sense of embodiment and interaction in virtual reality. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pages 1–14, 2021. [13] Lisa A Elkin, Matthew Kay, James J Higgins, and Jacob O Wobbrock. An aligned rank transform procedure for multifactor contrast tests. In The 34th annual ACM symposium on user interface software and technology, pages 754–768, 2021. [14] Guoxin Fang, Yingjun Tian, Zhi-Xin Yang, Jo MP Geraedts, and Charlie CL Wang. Efficient jacobian-based inverse kinematics with sim-to-real transfer of soft robots by learning. IEEE/ASME Transactions on Mechatronics, 27(6):5296–5306, 2022. [15] Chlöé Farrer, M Bouchereau, Marc Jeannerod, and Nicolas Franck. Effect of distorted visual feedback on the sense of agency. Behavioural neurology, 19(1-2):53–57, 2008. [16] Rebecca Fribourg, Nami Ogawa, Ludovic Hoyet, Ferran Argelaguet, Takuji Narumi, Michitaka Hirose, and Anatole Lécuyer. Virtual co-embodiment: evaluation of the sense of agency while sharing the control of a virtual body among two individuals. IEEE Transactions on Visualization and Computer Graphics, 27(10):4023–4038, 2020. [17] Andrew Goldenberg, Beno Benhabib, and Robert Fenton. A complete general- ized solution to the inverse kinematics of robots. IEEE Journal on Robotics and Automation, 1(1):14–20, 1985. [18] Teresia Gowases, Roman Bednarik, and Markku Tukiainen. Gaze vs. mouse in games: The effects on user experience. In Proceedings of the International Conference on Advanced Learning Technologies, Open Contents & Standards (ICCE), pages 773–777, 2008. [19] Tovi Grossman and Ravin Balakrishnan. Pointing at trivariate targets in 3d envi- ronments. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 447–454, 2004. [20] Harin Hapuarachchi and Michiteru Kitazaki. Knowing the intention behind limb movements of a partner increases embodiment towards the limb of joint avatar. Scientific Reports, 12(1):11453, 2022. [21] Ludovic Hoyet, Ferran Argelaguet, Corentin Nicole, and Anatole Lécuyer. “wow! i have six fingers!”: Would you accept structural changes of your hand in vr? Frontiers in Robotics and AI, 3:27, 2016. 76 [22] Hamid Hrimech, Leila Alem, and Frederic Merienne. How 3d interaction metaphors affect user experience in collaborative virtual environment. Advances in Human- Computer Interaction, 2011(1):172318, 2011. [23] Yu Jiang, Zhipeng Li, Mufei He, David Lindlbauer, and Yukang Yan. Handavatar: Embodying non-humanoid virtual avatars through hands. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pages 1–17, 2023. [24] Dominic Kao. The effects of anthropomorphic avatars vs. non-anthropomorphic avatars in a jumping game. In Proceedings of the 14th international conference on the foundations of digital games, pages 1–5, 2019. [25] Ben Kenwright. Inverse kinematics–cyclic coordinate descent (ccd). Journal of Graphics Tools, 16(4):177–217, 2012. [26] Konstantina Kilteni, Raphaela Groten, and Mel Slater. The sense of embodiment in virtual reality. Presence: Teleoperators and Virtual Environments, 21(4):373–387, 2012. 
[27] Chang-Seop Kim, Myeongul Jung, So-Yeon Kim, Kwanguk Kim, et al. Controlling the sense of embodiment for virtual avatar applications: methods and empirical study. JMIR Serious Games, 8(3):e21879, 2020. [28] Daiki Kodama, Takato Mizuho, Yuji Hatada, Takuji Narumi, and Michitaka Hirose. Effect of weight adjustment in virtual co-embodiment during collaborative training. In Proceedings of the Augmented Humans International Conference 2023, pages 86–97, 2023. [29] Daiki Kodama, Takato Mizuho, Yuji Hatada, Takuji Narumi, and Michitaka Hirose. Effects of collaborative training using virtual co-embodiment on motor skill learning. IEEE Transactions on Visualization and Computer Graphics, 29(5):2304–2314, 2023. [30] Andrey Krekhov, Sebastian Cmentowski, Katharina Emmerich, and Jens Krüger. Beyond human: Animals as an escape from stereotype avatars in virtual reality games. In Proceedings of the annual symposium on computer-human interaction in play, pages 439–451, 2019. [31] Andrey Krekhov, Sebastian Cmentowski, and Jens Krüger. Vr animals: Surreal body ownership in virtual reality games. In Proceedings of the 2018 Annual Symposium on Computer-Human Interaction in Play Companion Extended Abstracts, pages 503–511, 2018. [32] R. Likert. A Technique for the Measurement of Attitudes. Number Nr. 136-165 in A Technique for the Measurement of Attitudes. Columbia university, 1932. [33] Christos Lougiakis, Akrivi Katifori, Maria Roussou, and Ioannis-Panagiotis Ioannidis. Effects of virtual hand representation on interaction and embodiment in hmd-based 77 virtual environments using controllers. In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pages 510–518. IEEE, 2020. [34] Jean-Luc Lugrin, Maximilian Ertl, Philipp Krop, Richard Klüpfel, Sebastian Stier- storfer, Bianka Weisz, Maximilian Rück, Johann Schmitt, Nina Schmidt, and Marc Erich Latoschik. Any “body” there? avatar visibility effects in a virtual reality game. In 2018 IEEE conference on virtual reality and 3D user interfaces (VR), pages 17–24. IEEE, 2018. [35] Michal Luria, Samantha Reig, Xiang Zhi Tan, Aaron Steinfeld, Jodi Forlizzi, and John Zimmerman. Re-embodiment and co-embodiment: Exploration of social presence for robots and conversational agents. In Proceedings of the 2019 on Designing Interactive Systems Conference, pages 633–644, 2019. [36] I Scott MacKenzie. Fitts’ law as a research and design tool in human-computer interaction. Human-computer interaction, 7(1):91–139, 1992. [37] Ramakrishnan Mukundan. A robust inverse kinematics algorithm for animating a joint chain. International Journal of Computer Applications in Technology, 34(4):303– 308, 2009. [38] Solène Neyret, Anna I Bellido Rivas, Xavi Navarro, and Mel Slater. Which body would you like to have? the impact of embodied perspective on body perception and body evaluation in immersive virtual reality. Frontiers in Robotics and AI, 7:492886, 2020. [39] Ken Pfeuffer, Benedikt Mayer, Diako Mardanbegi, and Hans Gellersen. Gaze+ pinch interaction in virtual reality. In Proceedings of the 5th symposium on spatial user interaction, pages 99–108, 2017. [40] Thibault Porssut, Olaf Blanke, Bruno Herbelin, and Ronan Boulic. Reaching articular limits can negatively impact embodiment in virtual reality. Plos one, 17(3):e0255554, 2022. [41] Ivan Poupyrev, Mark Billinghurst, Suzanne Weghorst, and Tadao Ichikawa. The go-go interaction technique: non-linear mapping for direct manipulation in vr. 
In Proceedings of the 9th annual ACM symposium on User interface software and technology, pages 79–80, 1996. [42] Anna Samira Praetorius and Daniel Görlich. How avatars influence user behavior: A review on the proteus effect in virtual environments and video games. In Proceedings of the 15th International Conference on the Foundations of Digital Games, pages 1–9, 2020. [43] Daniel Roth and Marc Erich Latoschik. Construction of the virtual embodiment questionnaire (veq). IEEE Transactions on Visualization and Computer Graphics, 26(12):3546–3556, 2020. 78 [44] Linda E Sibert and Robert JK Jacob. Evaluation of eye gaze interaction. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems, pages 281–288, 2000. [45] William Steptoe, Anthony Steed, and Mel Slater. Human tails: ownership and control of extended humanoid avatars. IEEE transactions on visualization and computer graphics, 19(4):583–590, 2013. [46] Matthias Weise, Raphael Zender, and Ulrike Lucke. How can i grab that? solving issues of interaction in vr by choosing suitable selection and manipulation techniques. i-com, 19(2):67–85, 2020. [47] Jacob O Wobbrock, Leah Findlater, Darren Gergle, and James J Higgins. The aligned rank transform for nonparametric factorial analyses using only anova procedures. In Proc. of the ACM SIGCHI conference on human factors in computing systems, pages 143–146, 2011. [48] Erik Wolf, Nathalie Merdan, Nina Dölinger, David Mal, Carolin Wienrich, Mario Botsch, and Marc Erich Latoschik. The embodiment of photorealistic avatars influences female body weight perception in virtual reality. In 2021 IEEE Virtual Reality and 3D User Interfaces (VR), pages 65–74. IEEE, 2021. [49] Yao Xu, Ding Ding, Yongxin Chen, Zhuying Li, and Xiangyu Xu. istraypaws: Immersing in a stray animal’s world through first-person vr to bridge human-animal empathy. In 30th ACM Symposium on Virtual Reality Software and Technology, pages 1–11, 2024. [50] Laura Zapparoli, Eraldo Paulesu, Marika Mariano, Alessia Ravani, and Lucia M Sacheli. The sense of agency in joint actions: A theory-driven meta-analysis. Cortex, 148:99–120, 2022. 79 Appendix Appendix A 1 private void MapHandPosition(Vector3 pivot, PlayerArmStruct arm) 2 { 3 // calculate the direction and distance between real arm position and pivot 4 Vector3 handPivot = arm.RealHand.position - pivot; 5 float R_r = Vector3.Distance(arm.RealHand.position, pivot); 6 Vector3 handPivot_norm = handPivot/R_r; 7 8 if (R_r >= _D) 9 { 10 // non-linear part of the mapping function 11 float R_r_ = R_r + _coeffK * Mathf.Pow(R_r - _D, 2); 12 13 // set the new position 14 arm.VirtualHand.position = pivot + handPivot_norm * R_r_; 15 } 16 else 17 { 18 // lienar part of the mapping function 19 arm.VirtualHand.position = arm.RealHand.position; 20 } 21 22 // set the new rotation 23 arm.VirtualHand.rotation = arm.TentacleTip.rotation; 24 } Listing 1: The MapHandPosition function that maps the real hand position to the Go-Go hand position based on a non-linear mapping function. 
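Reading the two branches of Listing 1 together, the mapping it implements corresponds to the non-linear Go-Go function of Poupyrev et al. [41] (cf. Figure 2.8). Written out, with R_r the real hand-to-pivot distance, R_v the resulting virtual hand-to-pivot distance, D the linear threshold (_D in the code), and k the non-linearity coefficient (_coeffK):

R_v = \begin{cases} R_r & \text{if } R_r < D \\ R_r + k\,(R_r - D)^2 & \text{if } R_r \geq D \end{cases}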
80 Appendix B 1 Vector3 targetPosition; 2 Quaternion targetRotation = GetRotationRootSpace(Target); 3 4 // check which target has to be used based on the selected interaction metaphor 5 if (_isVirtualArm && RetrievingTarget != null) 6 { 7 // this is the character halo representation 8 targetPosition = GetPositionRootSpace(RetrievingTarget); 9 } 10 else if (_characterController.ArmUsesGoGoInteraction(GoGoTarget)) 11 { 12 // follow the go-go target (stretch) 13 targetPosition = GetPositionRootSpace(GoGoTarget); 14 } 15 else 16 { 17 // default interaction (no stretch) 18 targetPosition = GetPositionRootSpace(Target); 19 } Listing 2: Check for end-effector targets in the FABRIK IK logic. Appendix C 1 private void StretchIK(Vector3 targetPos) 2 { 3 var direction = (targetPos - Positions[0]).normalized; 4 5 // calculate ’tentacle tip/end-effector’ and ’root bone/end-effector’- distances 6 float remainingLen = (targetPos - Positions[Positions.Length - 1]). magnitude; 7 float totLength = (targetPos - Positions[0]).sqrMagnitude; 8 9 for (int i = 0; i < BonesLength.Length; i++) 10 { 11 // calculate the desired length of the current bone 12 float desiredLength = (BonesLength[i] / CompleteLength) * remainingLen; 13 14 // check if the arm should stretch or shrink 15 bool stretch = totLength > CompleteLength * CompleteLength; 16 if (stretch) 17 BonesLength[i] += desiredLength; // Stretch 18 else 19 BonesLength[i] -= desiredLength; // Shrink 20 } 21 22 // update complete length variable 23 CompleteLength = BonesLength.Sum(l => l); 24 25 //set everything after root 26 for (int i = 1; i < Positions.Length; i++) 27 Positions[i] = Positions[i - 1] + direction * BonesLength[i - 1]; 28 } Listing 3: The logic that stretches or shrinks the bones. Appendix D 1 private List<(Vector3, int, string)> CalculatePositions() 2 { 3 List<(Vector3, int, string)> pos = new List<(Vector3, int, string)>(); 4 System.Random rnd = new System.Random(); 5 6 foreach (float angleDeg in _anglesInDeg) 7 { 8 float angleRad = angleDeg * Mathf.PI / 180f; 9 10 foreach (float dist in _distances) 11 { 12 float yPos = _height + ((float)rnd.NextDouble() * (2 * _randomHeightOffset) - _randomHeightOffset); 13 pos.Add((new Vector3(dist * -Mathf.Cos(angleRad), yPos, dist * -Mathf. Sin(angleRad)), (int)angleDeg, dist.Equals(_distanceMedium) ? "Medium" : "Long")); 14 } 15 } 16 return pos; 17 } Listing 4: Calculating grabbable positions based on the angles and distances chosen. Appendix E 1 private IEnumerator ShowPreferredMetaphorPanel() 2 { 3 // ask question which metaphor was better 4 _textManager.SetPreferredTechniquePanel(UserStudyData[_currentTrial. TrialName].SubtrialOne.Metaphor, UserStudyData[_currentTrial.TrialName]. SubtrialTwo.Metaphor); 5 6 while(!_textManager.PreferredValueSelected) 7 yield return new WaitForEndOfFrame(); 8 9 UserStudyData[_currentTrial.TrialName].PreferredInteractionMetaphor = 10 _textManager.GetPreferredMetaphorValue() == 1 11 ? UserStudyData[_currentTrial.TrialName].SubtrialOne.Metaphor 12 : UserStudyData[_currentTrial.TrialName].SubtrialTwo.Metaphor; 13 14 SpawnNextSubTrial(); 15 } Listing 5: The ShowPreferredMetaphorPanel function to obtain the user’s preference. 
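Since ShowPreferredMetaphorPanel in Listing 5 returns an IEnumerator, it runs as a Unity coroutine. In the surrounding evaluation logic it would be started roughly as follows; the exact call site is not shown in the appendix and is an assumption:

// started from the evaluation controller once both sub-trials are finished
StartCoroutine(ShowPreferredMetaphorPanel());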
Appendix F 1 private bool GetOneShotDirectionValueFromJoystick(InputAction.CallbackContext context, bool left, ref bool joystickPerformed, float threshold) 2 { 3 float xValue = context.ReadValue().x; 4 if (joystickPerformed && Mathf.Abs(xValue) < 0.01f) 5 { 6 // reset joystickPerformed when near the center 7 joystickPerformed = false; 8 return false; 9 } 10 11 if (left && xValue < -threshold && !joystickPerformed) 12 { 13 joystickPerformed = true; 14 return true; 15 } 16 17 if (!left && xValue > threshold && !joystickPerformed) 18 { 19 joystickPerformed = true; 20 return true; 21 } 22 23 return false; 24 } Listing 6: Reading the joystick input and converting it to a “one shot“ boolean value. Appendix G * Erforderlich Dissimilar Avatar Selection Metaphor user study form Demographics Der Wert muss eine Zahl sein. UserID * 1. Woman Man Non-binary Prefer not to disclose Prefer to self describe What is your gender? * 2. Please self describe your gender3. Geben Sie eine Zahl größer als 17 ein. Age * 4. Left Right Both What is your dominant hand * 5. What is your arm length (filled by experimenter) * 6. Please answer the following questions * 7. 0 Never 1 2 3 4 Die Zahl muss zwischen 0 und 10 liegen On a scale of 0–10, 0 being how you felt coming in, 10 is that you want to stop the experiment, where are you now? * 8. Have you experienced virtual reality with a head mounted display? Have you experienced videogames? Condition 1 Left Right Eye Condition tested (filled by experimenter) * 9. Upper Lower Body Condition tested (filled by experimenter) * 10. Die Zahl muss zwischen 0 und 10 liegen On a scale of 0–10, 0 being how you felt coming in, 10 is that you want to stop the experiment, where are you now? * 11. Please read each statement and answer on a 1 to 7 scale indicating how much each statement applied to you during the experiment. There are no right or wrong answers. Please answer spontaneously and intuitively. Scale example: 1–strongly disagree, 4–neither agree nor disagree, 7–strongly agree. * 12. Strongly disagree Disagree Somewhat disagree Neither agree nor disagree Somewhat a It felt like the virtual body was my body. It felt like the virtual body parts were my body parts. The virtual body felt like a human body. It felt like the virtual body belonged to me. The movements of the virtual body felt like they were my movements. I felt like I was controlling the movements of the virtual body. I felt like I was causing the movements of the virtual body. The movements of the virtual body were in sync with my own movements I felt like the form or appearance of my own body had changed. I felt like the weight of my own body had changed. I felt like the size (height) of my own body had changed. I felt like the width of my own body had changed. Please read each statement and answer on a 1 to 7 scale indicating how much each statement applied to you during the experiment. There are no right or wrong answers. Please answer spontaneously and intuitively. Scale example: 1–strongly disagree, 4–neither agree nor disagree, 7–strongly agree. * 13. Strongly disagree Disagree Somewhat disagree Neither agree nor disagree Somewhat a I felt as if my body was located in the center of the virtual body. I felt as if my body was located to the left of the virtual body. I felt as if my body was located to the right of the virtual body. I felt as if my head was located in the center of the virtual body. I felt as if my head was located to the left of the virtual body. 
I felt as if I my head located to the right of the virtual body. I felt as if my arms were where I saw the upper arms of the virtual body to be. I felt as if my arms were where I saw the lower arms of the virtual body to be. I felt as if my arms were to the left from where I saw the arms of the virtual body to be. I felt as if my arms were to the right from where I saw the arms of the virtual body to be. It was easy to grab the cabbages. Dieser Inhalt wurde von Microsoft weder erstellt noch gebilligt. Die von Ihnen übermittelten Daten werden an den Formulareigentümer gesendet. Microsoft Forms Post experiment questionnaire Die Zahl muss zwischen 0 und 10 liegen On a scale of 0–10, 0 being how you felt coming in, 10 is that you want to stop the experiment, where are you now? * 29. Please answer the following questions *30. Strongly disagree Disagree Somewhat disgree Neither agree or disagree Somewhat a Do you have any additional comments?31. I liked my virtual body. My virtual body was disturbing. It was easy to interact with the Gogo technique. It was easy to interact with the Gaze and Pinch technique.