International Conference on Artificial Reality and Telexistence / Eurographics Symposium on Virtual Environments (2024)
S. Hasegawa, N. Sakata and V. Sundstedt (Editors)

Towards Environment- and Task-Independent Locomotion Prediction for Haptic VR

Shokoofeh Varzandeh†1, Khrystyna Vasylevska†‡2, Emanuel Vonach2 and Hannes Kaufmann2

1Amirkabir University of Technology, Iran   2TU Wien, Austria

† These authors contributed equally.
‡ khrystyna.vasylevska@tuwien.ac.at

Abstract
The use of robots presenting physical props has significantly enhanced the haptic experience in virtual reality. Autonomous mobile robots made haptic interaction in large walkable virtual environments feasible but brought new challenges. For effective operation, a mobile robot must not only track the user but also predict her future position for the next several seconds to be able to plan and navigate in the common space safely and in a timely manner. This paper presents a novel environment- and task-independent concept for locomotion-based prediction of the user position within a chosen range. Our approach supports the dynamic placement of haptic content with minimum restrictions. We validate it based on a real use case by making predictions within a range of 2 m to 4 m or 2 s to 5 s. We also discuss the adaptation to arbitrary space sizes and configurations with minimal real data collection. Finally, we suggest optimal utilization strategies and discuss the limitations of our approach.

CCS Concepts
• Human-centered computing → Virtual reality; Interaction techniques;

1. Introduction

Recent technological developments have led to an exciting combination of previously independent technologies. One of them brings together synthetic visual experiences in virtual reality (VR) and the abilities of robots to create encountered-type haptic devices (ETHDs) that make VR content tangible [YHK96, MHSM∗21]. This integration enhances the realism of the simulated worlds by providing physical props for interaction, supporting the illusion with haptic stimuli. Recently, this concept was extended to mobile robots, making it applicable to large walkable VR environments [SHZ∗20, MVVK23]. However, collocating a mobile robot with a VR user blindfolded by the headset raises safety concerns and requires high system reliability. At the same time, the simulation realism should not suffer, and the haptic objects should already be in place when the user reaches them. Consequently, the core challenges in human-robot interaction in VR revolve around safety and response time [MML21]. The robot should be aware of the user and maintain a safe distance from her, especially during locomotion. That, in turn, might delay the serving of a haptic prop, which increases the robot's response time. This poses a challenge, as planning and navigation take time, especially if the next object for interaction is not known in advance. Many existing ETHDs tackle this by predefining and optimizing the positions of the haptic objects [VGK17] or by utilizing a specific task for the user to create a time gap between the haptic interactions for the robot to move. Support of unrestricted haptic interaction with a number of arbitrarily or dynamically placed objects is still problematic due to these requirements. One way to address this is to anticipate which haptic object or group of objects the user will interact with next. This way, the robot can already navigate to the predicted position before the user arrives.
In VR scenes with predetermined interaction locations, such as a museum, an algorithm can be trained to predict in real-time [DMAH24] to provide more time for the robot to respond. Yet, anticipating interactions becomes challenging in a large VR environment with dynamic content placement, like a large architectural studio during an unpredictable creative process. This work presents a more universal space-independent real-time prediction concept that supports unrestricted content placement. We discuss the specifics of our probabilistic model and evaluate which features work best for it. We also suggest a better data alignment method for our prediction approach. Furthermore, we demonstrate that the retraining can be performed with minimal losses of accuracy using synthetic data, minimizing the need for real user data collection after each change. In addition, we discuss how our concept can be adjusted for the specifics of a given space, users, and other requirements to achieve high variability and scalability.

2. Related Work

Human locomotion prediction involves forecasting a person's future positions, trajectories, or actions based on current and past movement patterns. Many researchers explored the use of Gaussian Processes (GPs) or Gaussian Mixture Models (GMMs) to recognize or predict such movement patterns. While GPs predict future points based on the relation to existing data, GMMs can be used to find how data is clustered. Tay and Laugier [TL08] developed a framework using GMMs and GPs to predict the movement of dynamic objects in familiar scenes. Kim et al. [KLE11] focused on creating continuous dense flow fields from sparsely collected vector sequences. Yoo et al. [YYY∗16] aimed to identify prevalent patterns within a scene and their concurrent occurrence propensities using a mixture of topics and GMMs. They clustered observed movement tracks into distinct groups, representing typical patterns that co-occur with a significant likelihood, and predictions were based on the most dominant pattern group. Makansi et al. [MIÇB19] presented a mixture density network architecture, which generates a spectrum of possible future positions at fixed intervals and then fits a mixture of Gaussian or Laplace distributions to these predictions. Carvalho et al. [CVPK19] leveraged large databases of observed trajectories and combined the concepts of localized movement patterns and clustering by representing each cluster with a linear vector field over a space map. All these methodologies focus on generalizing statistical data within a specific environment, but the final results are space-bound and not universal.

In contrast, location-agnostic approaches match observed partial trajectories to a library of prototype paths, which offers the flexibility to be employed in any free space. Hermes et al. [HWSK09] predicted vehicular paths by comparing the observed trajectory to a collection of patterns using a rotation-invariant distance metric. Keller et al.
[KHG11] introduced a probabilistic hierarchical trajectory matching approach that employs a probabilistic tree of sampled human movement snippets to locate a matching sub-sequence. Trautman and Krause [TK10] demonstrated the use of GPs for predicting individual trajectories, with an interaction potential that adjusts the trajectory set based on the proximity of people at each moment in time. Later, they integrated goal information into the model [TMMK13], adding the desired destination as a training point within the GP. Xiao et al. [XWF15] categorized sample paths into pre-set motion classes and standardized them by aligning their starting points and extending along a common axis. Although these approaches offer more flexibility regarding the environment, they require a large collection of general movement patterns or need to be tailored for a specific task.

Dynamic Time Warping (DTW) is widely used to analyze the similarity between two movement paths. It can be employed to build robust path prediction models by finding an optimal alignment to historical path variations. Unhelkar et al. [UPSS15] used DTW to build a prediction model for human motion trajectories to navigate mobile robots safely in the same environment. Pérez-D'Arpino and Shah [PS15] anticipated human hand-reaching motions employing DTW for safe cooperation with a robotic arm. In order to reduce the computational complexity of DTW, Choi et al. [CCLJ20] presented a constrained DTW technique only considering alignments in a limited window. However, DTW has significant computational costs, resulting in a trade-off between flexibility and real-time requirements. In contrast, our proposed method simplifies the alignment process, which makes it more robust against noise and deviations at lower computational cost, and is suitable for real-time applications.

Alternatively, researchers employ unsupervised learning methods or Convolutional Neural Networks (CNNs) to derive patterns directly from data for prediction. Käfer et al. [KHW∗10] introduced a method based on a coupled Hidden Markov Model for concurrent vehicle trajectory estimation at crossroads. Luber et al. [LSSA12] explored the joint interactions between pairs of pedestrians, employing social dynamics to learn motion prototypes based on observed relative motion in public spaces. Their methodology employed an unsupervised clustering technique to predict the most likely paths for two individuals approaching a point of interaction. Su et al. [SZDZ17] put forward an approach harnessing a social-aware Long Short-Term Memory (LSTM) network as a crowd descriptor, which was then integrated with a deep GP to forecast a comprehensive distribution over future pathways for all individuals in a crowd. Nikhil and Tran Morris [NM18] proposed an approach using CNNs to map an input trajectory of a specified length to an entire future path. Mao et al. [MLSL19] treat the human pose as a graph to train a CNN for up to 1 s motion prediction in trajectory space. Chai et al. [CSBA20] adopted a different strategy by using a fixed set of "anchor" trajectories, which are state sequences clustered from training data and represent possible future behavior modes. These anchors serve as inputs to a CNN that infers mid-level scene features and predicts a discrete distribution over the anchors. The model also calculates offsets from the anchor waypoints and uncertainties to produce a Gaussian mixture at each time step. Wang et al. [WMS21]
train a neural network with pose data to predict the position of a walking human 0.5 s in advance. Guo et al. [GDS∗23] reduce the parameter set for a neural network to only 0.14 million for 1 s human pose prediction. These prediction methods reflect a shift away from strictly sequential models towards frameworks that accommodate the complex and dynamic nature of motion in real-world environments. However, they can lack interpretability of their learned models, and adaptation to varying environments, goals, or users might require retraining with massive amounts of training data.

Unlike previous solutions, we strive to create a scalable and robust prediction approach. It supports large virtual and real spaces of different shapes and arbitrary placement of interactive objects. One of the use cases is a mobile robot facilitating a creative process by providing the haptic interaction for a freely walking user, as in Mortezapoor et al. [MVVK23]. In such a scenario, haptically interactive objects might vary in number and be relocated at any time. Supporting such an unrestricted yet realistic scenario is still challenging. Related works presented above often employed computationally heavy models, relying on video training, separate analysis of video parts (e.g., trees, cars, pedestrians), or object detection in streaming data. In contrast, our proposed concept is user-oriented and adaptable to the specifics of the task, users, sizes and shapes of haptic objects and spaces. It is based on short trajectories and is more universal and lightweight, requiring fewer features. This makes it suitable for real-time use in encountered-type haptic VR.

3. Prediction Concept

We propose to detach the prediction from the environment and make predictions within a dedicated area around the user. This prediction area should be sized to meet the requirements for the time or distances at which the predictions should be made. Since we focus on the prediction without knowledge of the environment, we need to account for rapid changes in the user's heading within the VE in all possible directions. This suggests a circular prediction area around the user for the general case, as shown in Figure 1. Should the user's activity be limited by the nature of the task, the circular area can be reduced to a sector-based area. The prediction area should move with the user but take into account the nearby haptic objects. For prediction, we split the area into sectors roughly sized to the haptic objects, as shown in Figure 1 b. The center of the arc of each sector is then marked by a prediction target. The number of sectors determines the spatial precision of the prediction. We assume that each person may consider multiple movement directions, prioritizing them based on surroundings and preferences. Therefore, we estimate the likelihood of all prediction targets simultaneously to anticipate possible changes in behavior as soon as possible. Since the prediction area should react to all the haptic objects in the user's proximity, the resulting trajectories might not always start from the center of the prediction area. Therefore, we introduce a tolerance zone in the prediction area's center (see Figure 1 a).
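To illustrate the resulting layout, the short sketch below places one prediction target at the center of each sector's arc on the boundary of a circular prediction area. The radius and the number of sectors are free parameters here (the concrete values used in our setup are given in section 7); the helper that maps a position to its nearest target is purely illustrative and not part of our Unity implementation.

```python
import numpy as np

def target_layout(radius, n_targets):
    """2D positions of the prediction targets, one per sector, each placed
    at the center of its sector's arc on the prediction-area boundary."""
    sector = 2.0 * np.pi / n_targets              # angular width of one sector
    angles = np.arange(n_targets) * sector + sector / 2.0
    return np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)

def nearest_target(position, targets):
    """Index of the target whose direction best matches a position given
    relative to the prediction-area center (illustrative helper)."""
    heading = np.arctan2(position[1], position[0])
    target_angles = np.arctan2(targets[:, 1], targets[:, 0])
    diff = np.angle(np.exp(1j * (target_angles - heading)))  # wrap to (-pi, pi]
    return int(np.argmin(np.abs(diff)))

# Example: 16 sectors of 22.5 degrees on a prediction area with a 3 m radius.
targets = target_layout(radius=3.0, n_targets=16)
```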
The tolerance zone allows better fitting of the boundary of the prediction area to the haptic objects and facilitates organic locomotion within the prediction area, including changes in direction. When no haptic objects are nearby, the prediction area simply moves with the user; it aligns with the haptic objects once they come into close proximity. Once the user's prediction boundary gets near a haptic object, the area's position is gradually adjusted to match one of the targets with the object, and a prediction for possible interaction with the object can be easily made. If multiple objects are nearby, the priority of the alignment is decided by the distance to the object and the user's current heading. Using GMM-based probabilistic learning on trajectories within the prediction area, we can estimate the probability of interaction with each object within range. It is also important to decrease the influence of possible signal noise and increase the prediction's overall accuracy. Therefore, our prediction algorithm considers both the user's current and recent motions.

4. Training

A set of targets T = {T_1, T_2, ..., T_16} is located on the boundary of the prediction area. The entire set of trajectories to a target T_j is represented as X_j = {x_1, x_2, ..., x_M}, with M being the total number of trajectories. Each trajectory x_i from the set X_j is described by a sequence of features per time step k ∈ {1, 2, ..., K_i}: x_i = {f_1, f_2, ..., f_{K_i}}, where K_i is the total number of time steps (frames) in the trajectory, and f_k is the feature vector of time step k. The trajectory data comprises the ID of the trajectory and a collection of feature vectors f_k consisting of the following data: 2D position vector p_k ∈ R^2, head yaw rotation ψ_head,k ∈ [0, 2π), body yaw rotation ψ_body,k ∈ [0, 2π), and 2D velocity vector v_k ∈ R^2.

Depending on the dataset, each trajectory might differ in the number of time steps K_i due to framerate variation, differences in user behavior, and average path length. Therefore, we need to align all trajectories before training. In our case, we calculated the average number of time steps K_mean and fixed it at a mean value of 372 based on all the trajectories for all the targets in our collected data. Should an individual trajectory's K_i be shorter or longer, we employ linear interpolation to proportionally resample the trajectory to fit K_mean. Then, we calculate the 2D velocity vector using a first-order derivative estimate with a finite difference equation based on the Taylor expansion with the fourth-order five-point backward stencil [Tay16].

Figure 1: Prediction area: (a) its structure and (b) visualization within a virtual reconstruction of a real workspace with symbolic haptic objects (cyan) and a user (yellow).

In the training phase, we calculate the mean feature vector µ_j[k] as follows:

$$\mu_j[k] = \frac{1}{M_j} \sum_{i=1}^{M_j} \mathbf{f}_i[k], \qquad (1)$$

where M_j is the number of trajectories belonging to target T_j, and f_i[k] is the feature vector at time step k for trajectory x_i leading to target T_j. Similarly, the covariance matrix of the feature vectors at time step k per target T_j is calculated as:

$$\Sigma_j[k] = \frac{1}{M_j - 1} \sum_{i=1}^{M_j} \left(\mathbf{f}_i[k] - \mu_j[k]\right)\left(\mathbf{f}_i[k] - \mu_j[k]\right)^{\top}, \qquad (2)$$

where Σ_j[k] is the covariance matrix, µ_j[k] is the mean of the feature vectors, M_j is the number of trajectories belonging to target T_j, and f_i[k] is the feature vector of trajectory x_i.
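The training statistics above are simple to compute once all trajectories share the same length. As an illustration, the following sketch resamples a trajectory to K_mean by linear interpolation, estimates the 2D velocity with the five-point backward stencil mentioned above, and evaluates Equations 1 and 2 per time step for one target. It is a simplified NumPy stand-in for our Unity C# implementation; array shapes and names are illustrative.

```python
import numpy as np

K_MEAN = 372  # mean number of time steps over all recorded trajectories

def resample(traj, k_mean=K_MEAN):
    """Linearly resample a (K_i, N_f) feature trajectory to k_mean steps."""
    k_i, n_f = traj.shape
    src = np.linspace(0.0, 1.0, k_i)
    dst = np.linspace(0.0, 1.0, k_mean)
    return np.stack([np.interp(dst, src, traj[:, f]) for f in range(n_f)], axis=1)

def backward_velocity(pos, dt):
    """2D velocity from positions via the fourth-order five-point backward
    stencil [Tay16]; the first four steps are left at zero."""
    v = np.zeros_like(pos)
    for k in range(4, len(pos)):
        v[k] = (25 * pos[k] - 48 * pos[k - 1] + 36 * pos[k - 2]
                - 16 * pos[k - 3] + 3 * pos[k - 4]) / (12.0 * dt)
    return v

def fit_target_model(trajectories):
    """Per-time-step mean (Eq. 1) and covariance (Eq. 2) for one target T_j.

    trajectories: list of (K_i, N_f) arrays that all lead to the same target.
    Returns mu of shape (K_MEAN, N_f) and sigma of shape (K_MEAN, N_f, N_f).
    """
    data = np.stack([resample(t) for t in trajectories])  # (M_j, K_MEAN, N_f)
    mu = data.mean(axis=0)                                # Equation 1
    centered = data - mu                                  # broadcast over M_j
    sigma = np.einsum('mkf,mkg->kfg', centered, centered) / (data.shape[0] - 1)
    return mu, sigma                                      # Equation 2
```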
Note that for the rotation features, we consider the difference relative to the corresponding target's orientation in the mean and covariance calculation. Therefore, every rotation difference has values between -180 and 180 degrees. This way, we fit a Gaussian distribution for each target at every time step, utilizing the data of the recorded trajectories according to Equation 1 and Equation 2. The model training results in 16 trained GMMs, one model per target.

5. Alignment for Prediction

Our model continuously analyzes the user's tracked motion in real-time to infer her intention. The aim is to determine which target she is most likely heading for. For this, we need to find a method to align the current trajectory with the stored distributions. The most straightforward approach would be to use the Euclidean distance, where we need to find a time step with minimal distance for each GMM. However, since the prediction area is circular, we took this into consideration and investigated a second approach to alignment. Thus, we implemented both methods and compare them in section 11.

Euclidean Alignment Method. Here, we align streaming position data with a reference time step sequence of µ_j by finding the minimum Euclidean distance in the reference sequence for each point in the streaming data. For each point in the streaming 2D position data p_user = (p_1, p_2), we calculate the Euclidean distance to every point in the position feature p_i(µ_j[k]) from the mean reference sequence of each target µ_j. For k = 1 to K_mean,

$$d_k = \sqrt{\left(p_1 - p_1(\mu_j[k])\right)^2 + \left(p_2 - p_2(\mu_j[k])\right)^2}, \qquad \mathrm{eucl\_index}_j = \arg\min_k (d_k) \qquad (3)$$

The point k in the reference sequence with the minimum Euclidean distance d_k is used as the alignment index eucl_index_j for target T_j.

Circular Alignment with Radius-Based Search. We propose a method that relies on the circular nature of the prediction area to align streaming data with a reference sequence. Each new data point in the streaming data is used to calculate the distance between the center of the circle and the user's 2D position p_user = (p_1, p_2), resulting in a radius r. For each target, we take the point in the position feature p_i(µ_j[k]) of the mean reference sequence µ_j with a minimum distance d_k to the calculated radius. For k = 1 to K_mean,

$$d_k = \left| \, r - \sqrt{p_1^2(\mu_j[k]) + p_2^2(\mu_j[k])} \, \right|, \qquad \mathrm{circ\_index}_j = \arg\min_k (d_k) \qquad (4)$$

The time step k of minimum d_k is then used as the circular alignment index circ_index_j for target T_j.

6. Probability Inference

After the alignment, we employ GMMs to identify the target with the highest probability of being the next destination. For that, we calculate the log posterior to identify the target that best matches the user's observed trajectory x_o. We utilize a Bayesian approach [DW12] to determine the most probable target T_j based on the observed trajectory x_o[1:K_o], as shown in Equation 5:

$$P(T_j \mid x_o[1{:}K_o]) \propto P(T_j) \cdot P(x_o[1{:}K_o] \mid T_j), \qquad (5)$$

where P(T_j) is the prior probability of the target T_j. We use a uniform prior for all targets. P(x_o[1:K_o] | T_j) is the likelihood of observing the trajectory x_o[1:K_o] given the target T_j. The likelihood term can be calculated as:

$$P(x_o[1{:}K_o] \mid T_j) = \left( \prod_{k=1}^{K_o} \mathcal{N}\!\left(x_o[k];\, \mu_j[k], \Sigma_j[k]\right) \right)^{1/K_o}. \qquad (6)$$
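Evaluating this likelihood requires, for every incoming data point, the alignment index from section 5. As an illustration, both searches (Equations 3 and 4) reduce to nearest-index lookups over the stored mean position sequence; the NumPy sketch below assumes the 2D position components of µ_j are available as an array of shape (K_mean, 2) and uses illustrative names rather than our actual implementation.

```python
import numpy as np

def euclidean_alignment_index(p_user, ref_pos):
    """Equation 3: reference time step whose mean position is closest
    to the user's current 2D position."""
    d = np.linalg.norm(ref_pos - np.asarray(p_user), axis=1)
    return int(np.argmin(d))

def circular_alignment_index(p_user, ref_pos):
    """Equation 4: reference time step whose distance from the prediction-area
    center best matches the user's current radius (radius-based search)."""
    r_user = np.linalg.norm(p_user)
    r_ref = np.linalg.norm(ref_pos, axis=1)
    return int(np.argmin(np.abs(r_ref - r_user)))
```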
Then we can compute the product from Equation 6 as a logarithm for each target T_j at the time step k = K_o as expressed here:

$$\frac{1}{K_o} \sum_{k=1}^{K_o} \left[ -\frac{N_f}{2}\log(2\pi) - \frac{1}{2}\log\left|\Sigma_j[k]\right| - \frac{1}{2}\,\delta[k]^{\top}\,\Sigma_j^{-1}[k]\,\delta[k] \right], \qquad (7)$$

where N_f is the number of features used, and δ[k] represents the difference between the observation x_o[k] and the mean µ_j[k] at a particular time step k, as determined by the alignment process. This estimation is executed at runtime for each frame over the retrospective data points from previous frames. To predict short-term future behavior, we also consider the recent data points. We implement this approach by applying weighted scaling to the probabilities, dividing by $2^{(n-1-j)/10+1}$, where n is the total number of data points and j is the point's index in the sequence from oldest to newest. Then, the resulting probabilities are summed up for each target. The target with the highest likelihood is identified as the best target corresponding to the observed trajectory.

7. Integration and Validation

For integration and validation of our proposed approach, we utilized a real use case for large-area haptic interaction with large objects in VR served by a robot. Therefore, the parameters of the prediction area were decided based on the real environment and the targeted prediction time and precision. Consequently, real training and evaluation data were collected for this configuration. We discuss alternative integration scenarios and reuse of the trained algorithm in section 12.

Our test space, sized 12 m by 13 m, contains obstacles that divide the space into two equal, interconnected rooms with a width of 6 m, as can be seen from the reconstruction in Figure 1 b. The robot's response time range is 2-5 s. Therefore, we chose the prediction area with the maximum possible radius r_predict = 3 m that fits within the room. This allows us to achieve an acceptable prediction accuracy within the 2-3 s needed for the robot to arrive. Our sample haptic elements have a 1 m² footprint and are spread throughout the workspace. Therefore, we split our prediction area into 16 sectors with 1.2 m spacing between the prediction targets, each covering an angle of 22.5°. That is sufficient since the haptic objects are comparable in size to the user and distributed throughout the space to allow the user free navigation between them. Similarly, we defined the tolerance zone to have a 1 m radius. The movement of the prediction area along the direction of the user's heading or towards the objects in proximity is limited to 0.07 m per frame to minimize the impact on the user's relative trajectory. The fitting happens when the distance between the person and the object is between r_predict − 0.8 m and r_predict + 0.5 m, and the distance between the target's center and the haptic object is within 1-2 r_predict. This condition ensures fitting to multiple objects. To handle multiple nearby objects and expand the prediction window, the circle moves to fit the objects roughly within the user's heading direction. Finally, the retrospective window of the prediction algorithm is set to the last 50 frames (approx. 0.7 s at 75 fps). We implemented the training and inference as regular Unity C# scripts. For the evaluation, we ensured precisely timed recording and accurate replay to reflect the real framerate.

8. Technical Setup and VR Environment

We used a Windows 10 PC with an Intel i9-9900K CPU, NVIDIA RTX 2080Ti GPU, and 32 GB RAM for the evaluation and VR rendering.
The user was provided visual input via the HTC Vive Pro head-mounted display (HMD) with a standard wireless module and a power bank. The tracking employed 4 HTC Vive v.2 base stations covering the 6.5 m by 6.5 m tracking area. The head was tracked for position and orientation with the HMD. The user also wore one additional HTC Vive v.2 tracker on the tailbone to provide body orientation and two more on each foot for collecting additional data (step length and width). We used Unity 3D 2022.1.24f with OpenXR support for VR rendering and motion tracking. The virtual reconstruction of the real workspace with the user inside the prediction area is shown in Figure 1.

Figure 2: Data collection: (a) virtual cabin positioned next to the prediction area targets, (b) experimenter's top view on the user's path, (c) a user during the data collection.

9. Real Data Collection

To train our prediction model, we invited 24 volunteers (12 female, 12 male) to collect locomotion data. Participants ranged in age from 19 to 42 years (Mean = 28.83, SD = 5.48). We used the setup described in section 8 for the data collection. The recording was done for a stationary prediction area within the correctly registered workspace reconstruction. Note that for the training, we aim to collect a range of trajectories from 2 m to 4 m, because trajectories < 2 m are too close to the direct hand interaction range to position a robot in time. During actual prediction, the prediction area moves with the user and responds to her actions. Therefore, the collected data also applies to cases with trajectories between different objects due to prediction area realignment. Inspired by Unhelkar et al. [UPSS15], we chose to record the positions and orientations of the HMD and trackers with timestamps at approximately 75 Hz.

Procedure and Task. Each participant was informed of the purpose of the data collection, what data would be recorded, and the possible outcomes of the VR exposure. That was followed by signing the informed consent and filling out the general questionnaire and the Kennedy SSQ questionnaire [KLBL93]. Participants were also informed that they could pause or discontinue their participation at any moment. Next, the participants were given the task to find and enter a red cabin (as shown in Figure 2 a), then stay there for 3 s. This triggered the cabin's relocation to a new position that alternated between a random prediction target position and a random pose within the tolerance zone. The participants were instructed to continue chasing after the cabin in the same manner. To keep the participants motivated, we gamified the task by granting the participants a random piece of a puzzle picture for each visit to the red cabin. Typical resulting trajectories are shown in Figure 2 b.

Data Preprocessing. We recorded a total of 1166 trajectories from all 24 participants for all the targets. Due to occasional issues with the wireless connection, the data had to be preprocessed. Some recorded trajectories were incomplete and thus had to be discarded, and some had interruptions.
We identified the parts of the same trajectory by ID. If the distance between the parts was less than 0.1 m, we combined them into a continuous trajectory. Also, the need to search for the next cabin resulted in participants turning at the beginning and the end of trajectories, creating hook-like trajectory ends as in Figure 3 a. As these hooks are not part of the trajectory but rather a task artifact, we filtered these parts out of the trajectory data. We excluded trajectory beginnings where the head rotation exceeded 60° from the target direction (see Figure 3 b) and endings where the rotation exceeded a 15° angle away from the target. The filtering did not change the general flow and shape of the trajectories. Figure 3 c shows the final result. After that, the positions were converted to the 2D XZ plane relative to the center of the prediction area, and we calculated the velocity. The yaw rotations of the HMD and tailbone tracker are given in degrees relative to the forward vector of the non-rotating prediction area.

Figure 3: The data preprocessing: (a) raw data, (b) filtering a hook at the beginning of the trajectory, (c) filtering a hook at the end of the trajectory to obtain the final trajectory data.

10. Synthetic Data Generation

Previously, there was a need to collect new user data for each significant change in the environment or task. However, since our prediction area is environment-independent and is oriented only to targets in proximity, we saw the possibility of minimizing this effort. Based on the filtered real training data, we simulated the user's path and tested whether our synthetic data could be used to train the GMMs to predict for a real user with sufficient reliability. Although real human motion has multiple complex details and limitations, a simplified modeling is appropriate due to our feature selection and the averaging of the feature data (as discussed in subsection 11.3). We modeled movement data by introducing stochastic variations to a straightforward path and incorporating intentional semi-randomness into each trajectory. The initial sequence of waypoints is a random selection of the starting position within the tolerance zone and the final position associated with the prediction target T_j. This path is then broken into segments roughly equal to double the average step length of 45 cm in VR, based on our observations and prior findings [LJKM∗17]. A new waypoint is calculated for each 90 cm segment of the path, deviating from the original direction to one side, mimicking the average distance between the feet of 30 cm. This deviation simulates the user's weight shift during walking. The shift's side is randomized for each trajectory. New points are inserted along the path to form a target-oriented trajectory roughly resembling human locomotion behavior. The path is then refined regarding behavior and velocity using the built-in Unity navigation agent. Our settings, with double the average speed and average acceleration values with active auto-braking, compensate for the speed reduction due to the high number of waypoints. This results in a gradual and continuous trajectory.

Figure 4: Standard deviations of head rotation (blue dots) along the path to target approximated with a fitted function (red line).
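To make the base path construction concrete, the sketch below generates the zig-zag waypoint sequence from a random start inside the tolerance zone to a given target position. Only the waypoints are produced here; the subsequent smoothing and velocity shaping is delegated to Unity's navigation agent in our implementation. The exact magnitude of the lateral offset and the alternation of sides are illustrative assumptions, as are all names.

```python
import numpy as np

def synthetic_waypoints(target_pos, tolerance_radius=1.0, segment_len=0.90,
                        lateral_offset=0.15, rng=None):
    """Zig-zag waypoints from a random start in the tolerance zone to a target,
    imitating the lateral weight shift of walking (magnitudes are assumptions)."""
    if rng is None:
        rng = np.random.default_rng()
    # Random start uniformly distributed over the tolerance zone.
    ang = rng.uniform(0.0, 2.0 * np.pi)
    rad = tolerance_radius * np.sqrt(rng.uniform())
    start = np.array([rad * np.cos(ang), rad * np.sin(ang)])
    goal = np.asarray(target_pos, dtype=float)

    direction = goal - start
    length = np.linalg.norm(direction)
    direction = direction / length
    normal = np.array([-direction[1], direction[0]])    # lateral (sideways) axis

    n_seg = max(int(length // segment_len), 1)
    side = rng.choice([-1.0, 1.0])                      # randomized first side
    waypoints = [start]
    for i in range(1, n_seg):
        point = start + direction * (i * segment_len)   # every ~90 cm (two steps)
        waypoints.append(point + side * lateral_offset * normal)
        side = -side                                    # alternate the weight shift
    waypoints.append(goal)
    return np.array(waypoints)
```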
For large prediction areas or spaces with many obstacles, non-linear base trajectories might be beneficial, requiring only minor modifications.

Stochastic Modeling of Head Rotation. Using real-world data, we analyzed the head rotation as a function of the distance to the target. The calculated mean and standard deviation for each 4 mm interval (1000 intervals per maximal trajectory length of 4 m) showed the mean value tending to zero. Consequently, we fit a sinusoidal function (Equation 8) with a linear attenuation trend to the standard deviation, as shown in Figure 4:

$$f(d) = A\sin(\omega d + \phi) + Bd + C, \qquad (8)$$

where A = 2.4299, ω = −3.5986, φ = 11.5808, B = 4.9929, and C = 7.8498. This enables stochastic modeling of randomized head rotation, reflecting the behavioral uncertainty at the beginning of the trajectory where the goal selection is not finalized. Thus, our simulated user rotates its head relative to the body with realistic variability for human-like behavior.

Finally, we take advantage of our circular prediction area to simplify the synthesis of a large trajectory dataset. For that, we deploy Unity's NavAgent to a single target to generate a data pool. Then, we bootstrap it to create a subset for each target and rotate the data accordingly. Thereby, we can adapt the synthetic data to any number of targets for the same prediction area. This approach ensures time savings and data variation between the targets.

11. Evaluation

Our evaluation investigates the minimum required dataset size for effective training, compares the effects of two different alignment methods on prediction accuracy, and assesses the training with synthetic data. We evaluated our approach with two types of data: real data collected from the real participants and synthetic data that approximates the real behavior. After preprocessing the real data, we obtained 1088 trajectories in total, resulting in 68 trajectories for each of the 16 targets. From it, we formed a training dataset with 48 trajectories and a testing dataset with 20 trajectories per target. For the synthetic dataset, we generated 2000 trajectories and made a training dataset by bootstrapping 100 trajectories per target.

Figure 5: Influence of the different training dataset sizes (5, 10, 25, 50) per target on the resulting trained means (black lines) and variance (shaded areas) of position (top) and rotation (bottom) vectors.

11.1. Minimum Training Dataset Size

We addressed the question "How big should the training dataset be?" using our synthetic training dataset due to the unlimited data availability and compared the stability of the mean and variance vectors for all time steps. For the comparison, we chose the sample sizes of 5, 10, 25, 50, and 100 trajectories per target and focused on the head position and orientation data. The comparison results for 5 to 50 trajectories are shown in Figure 5. As can be noticed, the shaded variance areas become better separated and homogeneously distributed as the number of training trajectories per target increases. In contrast, small training samples caused these areas to overlap and even create gaps, suggesting that some neighboring targets are more similar than others. Similarly, the means in both rotational and positional data become more stable and distinct with an increased size of the dataset. This leads us to the conclusion that the minimum size of the dataset should be more than 25 unique trajectories per target.
Moreover, if the number of targets on the boundary of the prediction area is increased, this number should also be proportionally increased.

11.2. Alignment Comparison

To determine the best alignment method, we compare the Euclidean Alignment with nearest point search and the Circular Alignment with radius-based search on various combinations of features. Since both methods rely on the head position, this feature is present in all combinations. For this part of the evaluation, we train and test exclusively on the real dataset. We evaluate the prediction accuracy (highest probability match to the assigned target) relative to the distance to the target, as it is a more stable reference between subjects than time. The results of the alignment methods comparison with different feature combinations are shown in Figure 6.

Figure 6: Prediction accuracy of Circular and Euclidean Alignments with different feature combinations relative to the distance to the target.

Overall, we observed that the velocity v feature does not contribute to the prediction performance. The likely reason is the strong variation in magnitude and acceleration in the data. Therefore, velocity is not suitable for our specific context. Unhelkar et al. [UPSS15] previously made a similar observation. Our backward stencil was also half the size of theirs, which smooths the estimate but does not affect the outcome.

Circular Alignment. The prediction performs best when using the head position p and rotation ψ_head together with the body rotation ψ_body feature combination (see Figure 6 a). It achieves an accuracy of over 75% from the very beginning and steadily increases from there. We see that the {p, ψ_head} and {p, ψ_head, ψ_body} feature sets perform similarly. The slightly better performance for {p, ψ_head, ψ_body} suggests that the more stable body rotation helps to reduce the impact of possible natural head rotations on the results. Also, the less accurate results for {p, ψ_body} compared to {p, ψ_head} demonstrate the importance of the head fixation on the target at the beginning of the trajectory.

Euclidean Alignment. While the Euclidean Alignment method also shows some promising results (see Figure 6 b), it underperforms compared to the Circular Alignment method, resulting in a lower starting accuracy below 65% and overall steeper slopes, reaching the highest prediction precision only 1.4 m away from the target. Moreover, there is not a single feature combination that steadily performs well from the beginning of the trajectory to its end. In the beginning, the best performance is achieved by the {p, v, ψ_head, ψ_body} and {p, v, ψ_head} sets. However, later, the {p} and {p, ψ_head} sets perform much better.

11.3. Synthetic Training Viability for Prediction

For this evaluation, we trained the algorithm exclusively on the synthetic data and tested the prediction accuracy with the real testing dataset. Our synthetic dataset for this evaluation contained 48 trajectories per target (a total of 768 trajectories). The testing dataset was the same as in the previous subsection. We employed the Circular Alignment method, as it performs best, and tested the same feature combinations to see how they compare. The results are presented in Figure 7. In this case, we can see a slight change in the performance of the feature sets.
The best prediction results were obtained for the feature vector {p, ψ_head}. That might be explained by the stronger coupling of the body rotation with the position in the synthetic data, whereas in the real dataset, the torso rotation has more variance. However, the {p, ψ_head, ψ_body} feature set performs only slightly worse than with the real dataset, achieving over 70% prediction accuracy within the first meter of the trajectory. The slight decrease in accuracy at the end of the trajectories is likely due to the preemptive turn-around behavior in the testing dataset that was discussed in section 9, Data Preprocessing. Since this behavior is a task artifact, it was not modeled in the synthetic dataset.

Figure 7: Prediction accuracy for the synthetic training dataset and real testing dataset with Circular Alignment.

To better understand the results for the real and the synthetic training datasets, we looked at the two best feature vector candidates and computed the mean accuracy per dataset. For the real training, we achieved 93.4% and 93.8% mean accuracy over the entire distance to a target for {p, ψ_head} and {p, ψ_head, ψ_body}, respectively. For the synthetic data, the results were 86.8% and 81.1%, respectively. The results show 7-13% lower performance for synthetic data. These numbers might be improved if the training dataset were a mix of real and synthetic data. Potentially, the synthetic data generation can also be improved if the body rotation is modeled similarly to the head rotation. That, in turn, might also improve the accuracy. Ultimately, the synthetic data can be used to train prediction models like GMMs with a limited feature vector. The synthetic generation might also be beneficial for underrepresented groups of people where the collection of real data is difficult for the participants or the size of the real dataset is too small. This way, the prediction models might become more inclusive and flexible and reduce the bias for underrepresented groups of users.

12. Discussion

In this paper, we presented a user-oriented prediction approach that is not dependent on the environment and can be used for various tasks with different goals. Unlike previous solutions, our method does not need large datasets to generalize, as other GMM approaches like [CVPK19], CNN training [LSSA12], or location-agnostic solutions [KHG11] do. Our approach also does not require detailed knowledge of the environment [TL08, UPSS15] or the human body pose [WMS21], since we aim to predict the user's intended goal. Although inspired by [UPSS15], we use a lighter set of GMMs that does not require multi-threading, instead of the computationally intensive instances of DTW. Consequently, with multi-threading, there is a possibility of running several instances of the predictive algorithm. This can be used to make the prediction more inclusive. For example, one instance can focus on healthy adults, and the other will focus on a user group with different behavioral patterns, such as users in a wheelchair or people with ADHD, for whom the head rotation might not be a good predictor. Additionally, the instances can be focused on different ranges, for example, interaction with differently sized objects.
In this case, the larger circle will predict the general direction, a large object, or a group of smaller objects. If there are small objects at the interaction location, we speculate that the close-range interaction (< 2 m) could be handled with a smaller prediction area. This would be similar to the solution in [PS15] but with lighter GMMs. For instance, we can differentiate between locomotion and hand interaction. Depending on the specifics of the interaction or environment, it is also possible to adapt the prediction area further, for example, by reducing the circular area to a 180° sector or adding additional prediction targets. This way, we can avoid unnecessary computation behind the user or account for the density of the objects of interest.

The short training time for our algorithm suggests that a single instance of the algorithm might be retrained at runtime if several datasets are available. This offers the possibility of individual-based retraining after 15-20 minutes. In this case, our approach for the synthetic data, with the rotation of the trajectory data and bootstrapping, might help create a training-ready dataset. We chose the prediction area's radius based on the prediction time requirements and our environment's size. However, from a practical view, our prediction approach can be reused in other environments thanks to its space-independent circular design. It might be adapted or scaled and retrained on the corresponding data to meet other requirements, such as prediction time or other ranges.

Naturally, our approach has limitations: the user's position off the center of the prediction area and possible changes in the locomotion direction are countered mainly by the distribution of the trajectories' starts within the tolerance zone and its size. However, as the density of the haptic objects in the proximity increases, there might be cases when the prediction cannot be made in time. Because the fitting process prioritizes the objects in the heading direction, some objects might end up deep within the prediction area and close to the user, on the sides, or behind her. Should the user change direction towards one of these objects, there might be up to approximately a second of considerably higher uncertainty of the prediction until the prediction targets and haptic objects realign. There is also a chance she will reach the object faster than a reliable estimate can be made. Or, if there are neighboring objects, there can be confusion for this short time. Furthermore, the use of synthetic data will always lead to an accuracy loss. However, our average drop of 10% accuracy will decrease with the improvement of the simulation in the future. Finally, there is still a practical interrelation between the size of the prediction area and the size of the space it is deployed in. In particular, that is true when the prediction area is much larger than the space itself. In this case, reusing the trained area is not recommended, and the requirements should be reviewed.

13. Conclusion

ETHDs presenting physical props can enhance the realism of haptic VR tremendously. However, they bring new challenges concerning safety and response time, which may require the ability to predict the user's locomotion and interaction targets ahead of time. In this work, we proposed a novel prediction approach for haptic interaction, employing a circular predictive area around the user, which makes our method both more universal and real-time capable.
We describe the implementation, training, and performance of our approach, as well as an innovative technique to increase adaptability and scalability by employing synthetic data for training. In our evaluation, we showed that a training set of more than 25 trajectories per target could produce acceptable accuracy in our test scenario. However, a larger and more diverse dataset would perform considerably better. We also presented a Circular Alignment method for trajectories, which proves to be an ideal match for our approach compared to a Euclidean Alignment. We evaluated the ideal feature combination for our algorithm and the viability of using synthetic data for training compared to real user data. With training data based on 48 trajectories per target collected from real users, our algorithm showed a prediction accuracy of almost 80% within the first meter and up to 95% two meters from the target.

Acknowledgements

This work was funded by the Austrian Science Fund, grant F77 (SFB "Advanced Computational Design," SP 5). Special thanks to Alexander Schallhart and Mohammad Ghazanfahri for their help.

References

[CCLJ20] CHOI W., CHO J., LEE S., JUNG Y.: Fast Constrained Dynamic Time Warping for Similarity Measure of Time Series Data. IEEE Access 8 (2020), 222841-222858. doi:10.1109/ACCESS.2020.3043839.

[CSBA20] CHAI Y., SAPP B., BANSAL M., ANGUELOV D.: MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction. In Proceedings of the Conference on Robot Learning (May 2020), PMLR, pp. 86-99. URL: https://proceedings.mlr.press/v100/chai20a.html.

[CVPK19] CARVALHO F., VEJDEMO-JOHANSSON M., POKORNY F. T., KRAGIC D.: Long-term Prediction of Motion Trajectories Using Path Homology Clusters. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (Nov. 2019), pp. 765-772. doi:10.1109/IROS40897.2019.8968125.

[DMAH24] DOHAN M., MU M., AJIT S., HILL G.: Real-walk modelling: deep learning model for user mobility in virtual reality. Multimedia Systems 30, 1 (Jan. 2024), 44. doi:10.1007/s00530-023-01200-z.

[DW12] DONG S., WILLIAMS B.: Learning and Recognition of Hybrid Manipulation Motions in Variable Environments Using Probabilistic Flow Tubes. International Journal of Social Robotics 4, 4 (Nov. 2012), 357-368. doi:10.1007/s12369-012-0155-x.

[GDS∗23] GUO W., DU Y., SHEN X., LEPETIT V., ALAMEDA-PINEDA X., MORENO-NOGUER F.: Back to MLP: A Simple Baseline for Human Motion Prediction. In 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (Jan. 2023), pp. 4798-4808. doi:10.1109/WACV56688.2023.00479.

[HWSK09] HERMES C., WOHLER C., SCHENK K., KUMMERT F.: Long-term vehicle motion prediction. In 2009 IEEE Intelligent Vehicles Symposium (June 2009), pp. 652-657. doi:10.1109/IVS.2009.5164354.

[KHG11] KELLER C. G., HERMES C., GAVRILA D. M.: Will the Pedestrian Cross?
Probabilistic Path Prediction Based on Learned Motion Features. In Pattern Recognition (Berlin, Heidelberg, 2011), Mester R., Felsberg M., (Eds.), Springer, pp. 386-395. doi:10.1007/978-3-642-23123-0_39.

[KHW∗10] KÄFER E., HERMES C., WÖHLER C., RITTER H., KUMMERT F.: Recognition of situation classes at road intersections. In 2010 IEEE International Conference on Robotics and Automation (May 2010), pp. 3960-3965. doi:10.1109/ROBOT.2010.5509919.

[KLBL93] KENNEDY R. S., LANE N. E., BERBAUM K. S., LILIENTHAL M. G.: Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness. The International Journal of Aviation Psychology 3, 3 (1993), 203-220.

[KLE11] KIM K., LEE D., ESSA I.: Gaussian process regression flow for analysis of motion trajectories. In IEEE International Conference on Computer Vision (Nov. 2011), pp. 1164-1171. doi:10.1109/ICCV.2011.6126365.

[LJKM∗17] LAVIOLA JR. J. J., KRUIJFF E., MCMAHAN R. P., BOWMAN D. A., POUPYREV I.: 3D User Interfaces: Theory and Practice. Addison-Wesley, 2017.

[LSSA12] LUBER M., SPINELLO L., SILVA J., ARRAS K.: Socially-aware robot navigation: A learning approach. In IEEE/RSJ International Conference on Intelligent Robots and Systems (Oct. 2012), pp. 902-907. doi:10.1109/IROS.2012.6385716.

[MHSM∗21] MERCADO V. R., HOWARD T., SI-MOHAMMED H., ARGELAGUET F., LÉCUYER A.: Alfred: the Haptic Butler - On-Demand Tangibles for Object Manipulation in Virtual Reality using an ETHD. In 2021 IEEE World Haptics Conference (WHC) (July 2021), pp. 373-378. doi:10.1109/WHC49131.2021.9517250.

[MIÇB19] MAKANSI O., ILG E., ÇIÇEK Ö., BROX T.: Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2019), pp. 7137-7146. doi:10.1109/CVPR.2019.00731.

[MLSL19] MAO W., LIU M., SALZMANN M., LI H.: Learning Trajectory Dependencies for Human Motion Prediction. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (Oct. 2019), pp. 9488-9496. doi:10.1109/ICCV.2019.00958.

[MML21] MERCADO V. R., MARCHAL M., LÉCUYER A.: "Haptics On-Demand": A Survey on Encountered-Type Haptic Displays. IEEE Transactions on Haptics 14, 3 (July 2021), 449-464. doi:10.1109/TOH.2021.3061150.

[MVVK23] MORTEZAPOOR S., VASYLEVSKA K., VONACH E., KAUFMANN H.: CoboDeck: A Large-Scale Haptic VR System Using a Collaborative Mobile Robot. In IEEE Conference on Virtual Reality (2023).

[NM18] NIKHIL N., MORRIS B.: Convolutional Neural Network for Trajectory Prediction. In European Conference on Computer Vision (ECCV) Workshops (2018).

[PS15] PEREZ-D'ARPINO C., SHAH J. A.: Fast target prediction of human reaching motion for cooperative human-robot manipulation tasks using time series classification. In 2015 IEEE International Conference on Robotics and Automation (ICRA) (Seattle, WA, USA, May 2015), IEEE, pp. 6175-6182. doi:10.1109/ICRA.2015.7140066.

[SHZ∗20] SUZUKI R., HEDAYATI H., ZHENG C., BOHN J. L., SZAFIR D., DO E. Y.-L., GROSS M. D., LEITHINGER D.: RoomShift: Room-scale Dynamic Haptics for VR with Furniture-moving Swarm Robots.
In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (New York, NY, USA, Apr. 2020), CHI '20, Association for Computing Machinery, pp. 1-11. doi:10.1145/3313831.3376523.

[SZDZ17] SU H., ZHU J., DONG Y., ZHANG B.: Forecast the Plausible Paths in Crowd Scenes. In International Joint Conference on Artificial Intelligence (Aug. 2017), pp. 2772-2778. doi:10.24963/ijcai.2017/386.

[Tay16] TAYLOR C. R.: Finite difference coefficients calculator. https://web.media.mit.edu/~crtaylor/calculator.html, 2016.

[TK10] TRAUTMAN P., KRAUSE A.: Unfreezing the robot: Navigation in dense, interacting crowds. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (Oct. 2010), pp. 797-803. doi:10.1109/IROS.2010.5654369.

[TL08] TAY M. K. C., LAUGIER C.: Modelling Smooth Paths Using Gaussian Processes. In Field and Service Robotics: Results of the 6th International Conference, Laugier C., Siegwart R., (Eds.). Springer, Berlin, Heidelberg, 2008, pp. 381-390. doi:10.1007/978-3-540-75404-6_36.

[TMMK13] TRAUTMAN P., MA J., MURRAY R., KRAUSE A.: Robot navigation in dense human crowds: The case for cooperation. In IEEE International Conference on Robotics and Automation (May 2013), pp. 2153-2160. doi:10.1109/ICRA.2013.6630866.

[UPSS15] UNHELKAR V. V., PÉREZ-D'ARPINO C., STIRLING L., SHAH J. A.: Human-robot co-navigation using anticipatory indicators of human walking motion. In 2015 IEEE International Conference on Robotics and Automation (ICRA) (May 2015), pp. 6183-6190. doi:10.1109/ICRA.2015.7140067.

[VGK17] VONACH E., GATTERER C., KAUFMANN H.: VRRobot: Robot Actuated Props in an Infinite Virtual Environment. In Proceedings of IEEE Virtual Reality 2017 (Los Angeles, CA, USA, 2017), IEEE, pp. 74-83. doi:10.1109/VR.2017.7892233.

[WMS21] WANG A., MAKINO Y., SHINODA H.: Machine Learning-based Human-Following System: Following the Predicted Position of a Walking Human. In 2021 IEEE International Conference on Robotics and Automation (ICRA) (May 2021), pp. 4502-4508. doi:10.1109/ICRA48506.2021.9561691.

[XWF15] XIAO S., WANG Z., FOLKESSON J.: Unsupervised robot learning to predict person motion. In IEEE International Conference on Robotics and Automation (USA, June 2015), vol. 2015, IEEE. doi:10.1109/ICRA.2015.7139254.

[YHK96] YOKOKOHJI Y., HOLLIS R. L., KANADE T.: What You Can See Is What You Can Feel - Development of a Visual/Haptic Interface to Virtual Environment. In Proceedings of the IEEE 1996 Virtual Reality Annual International Symposium (1996), pp. 46-53.

[YYY∗16] YOO Y., YUN K., YUN S., HONG J., JEONG H., CHOI J. Y.: Visual Path Prediction in Complex Scenes with Crowded Moving Objects. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Las Vegas, NV, USA, June 2016), IEEE, pp. 2668-2677. doi:10.1109/CVPR.2016.292.

© 2024 The Authors. Proceedings published by Eurographics - The European Association for Computer Graphics.