This page demonstrates how to use our R code and apply our optimization methods (Tsandilas and Dragicevic 2022) to the gesture elicitation study of Bailly et al. (2013).

The study

Bailly et al. (2013) investigated gestural shortcuts for their Métamorphe keyboard. Métamorphe is a keyboard with actuated keys that can sense user gestures, such as pull, twist, and push sideways. In this study, 20 participants each suggested a keyboard shortcut for each of 42 referents on a Métamorphe mockup. Proposing a shortcut required choosing (i) a key and (ii) the gesture applied to the key. The participants produced a total of 71 unique keys, 27 unique gestures, and 358 unique combinations of keys and gestures. Our elicited sign set is the set of those 358 unique combinations (our signs).

The optimization problem

Our goal is to find an optimal set of mappings between signs (combinations of keys and gestures) and referents. We denote this optimal sign mapping set as \(\pi_{opt}\). Now, to define the optimization problem, we require:

  1. An objective function. We aim to maximize guessability (Wobbrock et al. 2005), where guessability \(G(\pi)\) can be defined as the probability that the sign of a gesture performed by a random user for a random referent belongs to the sign mapping set \(\pi\).

  2. A set of design constraints. We require each referent to be mapped to exactly one sign, and no sign to be mapped to more than one referent.
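To make the objective and the constraint concrete, here is a minimal sketch on made-up data. It assumes that, on a sample, guessability is estimated as the share of (participant, referent) proposals that match the sign mapped to the referent. The helper `toy_guessability` and the toy table are illustrative only and are not part of the gelicopt code.

```r
# Made-up proposals: rows are referents, columns are participants
proposals <- data.frame(
  P1 = c("C-top", "V-top"),
  P2 = c("C-top", "P-top"),
  P3 = c("X-top", "V-top"),
  row.names = c("Copy", "Paste"),
  stringsAsFactors = FALSE
)

# A candidate sign mapping set: exactly one sign per referent
mapping <- c(Copy = "C-top", Paste = "V-top")

# Design constraint: no sign is shared by two referents
stopifnot(anyDuplicated(mapping) == 0)

# Hypothetical helper: share of proposals that match the mapped sign
toy_guessability <- function(mapping, proposals) {
  # The mapping vector is recycled down the columns,
  # so every cell in row i is compared with the sign of referent i
  hits <- as.matrix(proposals) == mapping[row.names(proposals)]
  mean(hits)
}

toy_guessability(mapping, proposals)  # 4 of the 6 proposals match
```

Here the mapping matches two of the three proposals for each referent, so the estimated guessability is 4/6.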

Ideally, we should find a sign mapping set that is optimal for the full population of users. In practice, however, we can only provide an estimate \(\widehat\pi_{opt}\) of the optimal mapping set based on a sample, i.e., the participants of the study. Since \(\widehat\pi_{opt}\) is not necessarily optimal, that is, \(G(\widehat\pi_{opt}) \le G(\pi_{opt})\), we would like to:

  1. Quantify our uncertainty (or confidence) about \(\pi_{opt}\) given the current sample

  2. Determine a sample size such that \(G(\widehat\pi_{opt})\) is close enough to \(G(\pi_{opt})\)

Read the dataset

This is the original dataset as provided by the authors.

As a first step, we need to read the data:

data <- read.csv("datasets/bailly2013.csv", stringsAsFactors = FALSE)

# Starting at column 2, each participant occupies five consecutive columns,
# where the first captures the key and the second captures the gesture
keys <- data[, seq(2, ncol(data), by = 5)] # These are participants' proposals of keys
gestures <- data[, seq(3, ncol(data), by = 5)] # These are participants' proposals of key gestures

# Also build a table with keys + gestures combined.
keys_gestures <- data.frame(keys)
for (r in seq_len(nrow(keys_gestures))) {
    keys_gestures[r, ] <- paste0(keys_gestures[r, ], "-", gestures[r, ])
}

# Replace the column names by participant IDs
pIDs <- paste0("P", seq_len(ncol(keys)))
names(keys_gestures) <- pIDs

# Replace the row names by referent IDs
row.names(keys_gestures) <- data$cmd

The resulting keys_gestures data frame is as follows:

P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15 P16 P17 P18 P19 P20
Accept Y-pull Y-top F9-towards enter-top enter-top A-top Y-top enter-top enter-towards Y-top A-pull X-left Y-top esc-towards enter-top A-top +-top M-LR A-top Y-LR
Align bottom A-towards A-towards shift-towards A-towards A-towards B-towards A-towards A-top+towards –towards A-towards B-towards A-towards A-towards F6-towards J-towards A-towards win-towards J-towards O-top A-towards
Align justify A-pull A-LR shift-pull shiftR-LR A-away J-top A-FB J-top |-left+right(wiggle) J-right J-pull J-push A-top F1-pull J-pull A-pull +-LR J-LR J-right A-pull
Align left A-left A-left shift-left A-left A-left A-left A-left A-top+left [-left A-left L-left A-left A-left F6-left J-left A-left menu-left J-left [-left A-left
Align middle A-FB A-FB shift-top A-FB A-top D-towards A-LR A-top +-pull A-FB M-FB A-FB A-FB F6-top J-top A-top win-LR J-top %-top A-FB
Align right A-right A-right shift-right A-right A-right A-right A-right A-top+right ]-right A-right R-right A-right A-right F6-right J-right A-right win-right J-right ]-right A-right
Align top A-away A-away shift-away A-away A-away T-top A-away A-top+away –right A-away T-away A-away A-away F6-away J-away A-away win-away J-away O-pull A-away
Close X-top O-pull esc-LR X-away backspace-CCW C-top backspace-top esc-top F4-away X-top C-LR C-top O-towards esc-top F4-top C-top win-CCW P-CCW spacebar-top C-top
Copy C-top C-top C-towards capslock-FB C-away C-towards C-top C-top C-towards C-LR C-top C-FB C-top C-top C-top C-LR +-top-double K-FB C-top C-top
Cut C-away X-top X-top capslock-LR X-away C-away C-LR X-top C-away C-pull X-top C-LR X-top backspace-LR X-top C-pull –left K-pull C-away X-top
Decrease volume V-CCW P-towards spacebar-CCW ctrlR-CCW –away V-towards <-towards spacebar-towards 7-CCW V-CCW V-towards V-CCW –top –CCW –top –CCW win-CCW ?-left V-towards V-CCW
Delete D-pull backspace-top backspace-top D-away backspace-top D-top D-top backspace-top backspace-pull D-top D-pull D-top backspace-top backspace-top backspace-LR backspace-top –right U-top D-top D-CCW
Duplicate C-directional D-directional 6-CW 2-LR +-away D-away D-CW C-top+directional 2-pull D-top(double) D-top D-pull D-top V-FB shiftL-top D-LR menu-top-double U-right D-pull D-top
Enlarge D-CW ^-away +-pull E-towards +-pull E-top enter-right +-away +-CCW L-pull E-pull E-left S-CW F12-pull A-CW +-right +-CW U-pull O-CW S-CW
Find F-top F-top F-top F-towards F-away F-top F-top F-top F-FB F-CCW F-left F-top F-top F-top F-top F-top +-FB L-top F-top F-top
Find next F-right F-right tab-right enter-right >-CW F-away F-right F-right >-right F-towards F-towards F-CW F-left F-CW F-right F-right +-away L-towards F-right F-right
Find previous F-left F-left tab-left backspace-CCW <-CCW P-right backspace-left F-left <-left F-away F-away F-CCW F-right F-CCW F-left F-left win-left L-away F-left F-left
Help 1-away esc-top H-pull ?-top ?-away H-top ?-top F1-top menu-pull H-pull H-top H-left H-top F5-FB F9-top menu-FB +-pull M-pull H-pull H-top
Increase volume V-CW P-away spacebar-CW ctrlR-CW +-away V-top >-away spacebar-away 7-CW V-CW V-away V-CW +-top +-CW +-top +-CW win-CW ?-right V-away V-CW
Insert I-top I-towards shift-top enter-top I-away I-top P-LR ^-away I-left+right(wiggle) I-towards I-top I-left I-top I-FB I-top I-towards –top-double U-CW I-top I-FB
Maximize M-away tab-away 9-away S-towards +-CW M-top enter-right M-away F1-pull D-pull M-left M-pull +-CW F9-CW M-pull +-top +-CW M-pull spacebar-pull M-pull
Menu access win-top esc-towards menu-top menu-top M-away Q-top menu-LR menu-top menu-top M-LR M-top M-top menu-top F5-pull F10-top menu-top menu-FB M-away menu-top M-CW
Minimize M-towards tab-towards 1-towards M-away –CCW M-towards enter-left M-towards F1-LR D-top M-left M-CCW –CCW F9-CCW M-top –top –CCW M-top spacebar-top M-FB
Move a little Q-left N-directional F1-directional tab-LR </>-left/right L-towards M-directional ctrl-directional tab-directional B-directional >-top M-left/right N-directional F12-directional+LR comma-directional M-right win-right U-pull O-left(pulse)/right(pulse) J-directional
Move a lot Q-away M-directional F12-directional tab-FB </>-pull L-away M-directional ctrl-directional capslock-directional L-directional >-pull M-left/right M-directional F9-directional+LR .-directional M-LR –right U-FB O-left(long)/right(long) L-directional
Next >-top tab-right N-right 2-right >-top N-top >-right spacebar-right backspace-right H-right N-right N-right >-top ~-right F-CW N-right win-right L-right >-right N-left/right
Open O-top O-top enter-top enter-pull O-away O-top O-top enter-top F4-towards O-pull O-top O-top O-away O-pull enter-top O-pull menu-CW M-CW spacebar-pull O-top
Pan G-left/right P-directional tab-left/right spacebar-FB Shift-left/right P-left/right M-directional win-directional shift-directional S-directional V-CW O-left/right V-directional F6-directional+LR ?-directional tab-directional menu-FB L-left spacebar-left/right P-CW/CCW
Paste C-towards V-towards V-top 1-top V-away P-top P-LR V-top +-towards V-top V-top V-left V-top V-top V-top P-top –top-double K-LR P-top V-top
Pause V-top P-top spacebar-top >-top P-towards P-towards spacebar-top spacebar-top doublequote-FB P-top P-pull L-LR spacebar-top ~-pull spacebar-top P-away –top ?-push =-towards P-FB
Play V-top P-top spacebar-top >-right P-away Y-top spacebar-top spacebar-top enter-LR/FB P-LR P-top L-top enter-top ~-top spacebar-top P-towards +-top ?-CW =-away T-top
Previous <-top tab-left P-left 1-left <-top X-top backspace-left spacebar-left backspace-left H-left P-left P-left <-top ~-left F-CCW P-left win-left L-left <-left P-left
Reject X-pull N-top F8-away R-away backspace-top J-away N-top esc-top X-towards N-FB R-pull R-top N-top esc-away backspace-top R-top –CCW M-CCW R-top R-top
Rotate R-CW/CCW R-CW/CCW menu-CW/CCW R-FB R-CW/CCW R-top R-CW/CCW ctrl-CW/CCW 0-CW/CCW R-CW/CCW R-CW R-CW/CCW R-CW/CCW F12-CW/CCW ^-CW/CCW R-CW/CCW win-FB+CW U-CW/CCW R-CW/CCW R-CW/CCW
Save S-top S-pull S-top ctrl-top S-away S-top S-top S-top S-FB S-top S-away S-right S-top S-top S-top S-top menu-top P-left S-top S-top
Save all S-FB S-FB F4-pull ctrlL-pull S-CW L-top S-LR S-top+away S-pull A-top S-left S-top S-pull S-LR S-pull S-left +-CW P-top S-LR H-CW
Save as S-pull S-top F5-top ctrlL-pull S-right S-left S-top S-towards S-towards S-pull S-right S-left S-right S-FB S-directional S-right win-top-double P-right F1-top S-pull
Shrink D-CCW ^-towards –LR S-away –top K-top enter-left –towards +-CCW S-FB E-top S-top S-CCW F12-top A-CCW –left –CCW U-CCW O-CCW S-CCW
Task switch tab-left/right tab-CW/CCW win-towards tab-pull >-CW/CCW W-top T-left/right tab-left/right tab-CW/CCW S-left W-top T-left W-left/right F5-LR tab-top win-right menu-right M-LR spacebar-CW/CCW T-towards
Undo U-top Z-top backspace-CCW shiftR-CCW backspace-left U-top backspace-top <-top U-pull Z-CCW U-top U-left Z-top backspace-away Z-top backspace-left win-CCW P-left U-pull U-CCW
Zoom in Z-away Z-away +-top Z-towards Z-CW Z-top enter-right win-top Z-right Z-pull Z-CW Z-top Z-CW F9-pull Q-top Z-CW win-CW L-CW Z-CW Z-FB
Zoom out Z-towards Z-towards –pull Z-away Z-CCW Z-away enter-left win-pull Z-left Z-top Z-CCW Z-pull Z-CCW F9-top Q-pull Z-CCW win-CCW L-CCW Z-CCW Z-pull
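The column layout that the reading code relies on can be checked on a small made-up table. The sketch below uses hypothetical column names (only `cmd` appears in the original code) to show which columns the two `seq()` calls select, and how the number of unique signs can be counted once keys and gestures are combined.

```r
# Made-up file layout: one cmd column, then five columns per participant.
# All column names except cmd are hypothetical.
toy <- data.frame(
  cmd = c("Copy", "Paste"),
  p1_key = c("C", "V"), p1_gesture = c("top", "top"),
  p1_a = NA, p1_b = NA, p1_c = NA,
  p2_key = c("C", "P"), p2_gesture = c("pull", "top"),
  p2_a = NA, p2_b = NA, p2_c = NA,
  stringsAsFactors = FALSE
)

key_cols <- seq(2, ncol(toy), by = 5)      # columns 2 and 7
gesture_cols <- seq(3, ncol(toy), by = 5)  # columns 3 and 8

# Combine keys and gestures cell by cell
# (a vectorized alternative to looping over rows)
toy_signs <- as.data.frame(
  matrix(
    paste0(as.matrix(toy[, key_cols]), "-", as.matrix(toy[, gesture_cols])),
    nrow = nrow(toy),
    dimnames = list(toy$cmd, paste0("P", seq_along(key_cols)))
  ),
  stringsAsFactors = FALSE
)

# Count the unique signs across all referents and participants
n_unique_signs <- length(unique(unlist(toy_signs)))
n_unique_signs
```

Applied to the real keys_gestures data frame, the same `length(unique(unlist(...)))` expression should yield the number of unique key and gesture combinations.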

Estimate the optimal sign mapping set

We use a sign mapper that is based on the Hungarian algorithm to solve the optimization problem.

source("gelicopt/sign-mappers.R") # Implementation of various sign mappers

mappings <- hungarian_sign_mapper(keys_gestures)

The resulting mappings data frame is the following:

refname ref sign
Accept 1 Y-top
Align bottom 2 A-towards
Align justify 3 A-pull
Align left 4 A-left
Align middle 5 A-FB
Align right 6 A-right
Align top 7 A-away
Close 8 esc-top
Copy 9 C-top
Cut 10 X-top
Decrease volume 11 V-CCW
Delete 12 backspace-top
Duplicate 13 D-top
Enlarge 14 +-pull
Find 15 F-top
Find next 16 F-right
Find previous 17 F-left
Help 18 H-top
Increase volume 19 V-CW
Insert 20 I-top
Maximize 21 M-pull
Menu access 22 menu-top
Minimize 23 –CCW
Move a little 24 N-directional
Move a lot 25 M-directional
Next 26 N-right
Open 27 O-top
Pan 28 L-left
Paste 29 V-top
Pause 30 spacebar-top
Play 31 P-top
Previous 32 P-left
Reject 33 R-top
Rotate 34 R-CW/CCW
Save 35 S-top
Save all 36 S-pull
Save as 37 S-right
Shrink 38 S-CCW
Task switch 39 tab-left/right
Undo 40 U-top
Zoom in 41 Z-CW
Zoom out 42 Z-CCW
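To see why a one-to-one assignment differs from picking the most frequent sign for each referent independently, consider the following sketch on made-up data. It is not the Hungarian algorithm: it simply enumerates every one-to-one assignment, which is feasible only for tiny problems, but it solves the same kind of assignment problem that the Hungarian algorithm solves efficiently. All names are illustrative, and the toy case assumes as many signs as referents.

```r
# Made-up proposals: rows are referents, columns are participants
proposals <- data.frame(
  P1 = c("C-top", "C-top", "X-top"),
  P2 = c("C-top", "V-top", "C-top"),
  P3 = c("C-top", "C-top", "X-top"),
  row.names = c("Copy", "Paste", "Cut"),
  stringsAsFactors = FALSE
)

# counts[r, s] = number of participants who proposed sign s for referent r
signs <- sort(unique(unlist(proposals)))
counts <- sapply(signs, function(s) rowSums(proposals == s))

# All orderings of a vector (fine for 3 items; grows factorially)
permutations <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (p in permutations(x[-i])) out[[length(out) + 1]] <- c(x[i], p)
  }
  out
}

# Score every assignment (referent i gets sign p[i]) and keep the best one
perms <- permutations(seq_along(signs))
scores <- vapply(perms, function(p) {
  sum(counts[cbind(seq_len(nrow(counts)), p)])
}, numeric(1))
best <- perms[[which.max(scores)]]

toy_mapping <- data.frame(
  refname = rownames(counts),
  sign = signs[best],
  stringsAsFactors = FALSE
)
toy_mapping
```

C-top is the most frequent proposal for both Copy and Paste, so choosing signs independently would violate the uniqueness constraint. The best one-to-one assignment gives C-top to Copy and falls back to V-top for Paste, matching 6 of the 9 proposals.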


As we explained earlier, these mappings (\(\widehat\pi_{opt}\)) are not necessarily optimal for the full user population. To quantify our confidence about the optimal mappings, we use the bootstrap method, where we iteratively run the optimization algorithm after sampling with replacement from the original sample of the 20 participants:

source("gelicopt/inference.R") # Implementation of bootstrapping used for inference

# Create the bootstrap distribution. By default, it contains R = 200 resamples.
# Here, we set R to 300.
boot_samples <- bootstrap(keys_gestures, hungarian_sign_mapper, R = 300)

# For each referent, get only signs with at least 10% confidence
signs <- getSigns(boot_samples, conf.min = .1)

# Flatten the signs and their confidence scores for presentation purposes
confidence_res <- data.frame(ref = row.names(keys_gestures), signConfidence = flatten(signs))

The resulting confidence_res data frame is the following:

ref signConfidence
Accept enter-top:0.40, Y-top:0.38, A-top:0.14
Align bottom A-towards:1
Align justify A-pull:0.42, J-top:0.13, J-pull:0.12, J-right:0.11
Align left A-left:1
Align middle A-FB:0.88
Align right A-right:1
Align top A-away:1
Close esc-top:0.27
Copy C-top:0.91
Cut X-top:0.79, C-away:0.17
Decrease volume V-CCW:0.50, V-towards:0.22
Delete backspace-top:0.77, D-top:0.21
Duplicate D-top:0.45, D-pull:0.20
Enlarge +-pull:0.27, S-CW:0.18
Find F-top:1
Find next F-right:0.96
Find previous F-left:0.93
Help H-top:0.46, H-pull:0.23, ?-top:0.12
Increase volume V-CW:0.53, +-top:0.11
Insert I-top:0.74, I-towards:0.17
Maximize M-pull:0.47, +-CW:0.33
Menu access menu-top:0.91
Minimize –CCW:0.40, M-towards:0.29, M-top:0.13
Move a little N-directional:0.19, win-right:0.13
Move a lot M-directional:0.44, L-directional:0.17
Next N-right:0.47, >-top:0.30, >-right:0.12
Open O-top:0.87
Pan menu-FB:0.15
Paste V-top:0.93
Pause spacebar-top:0.69, P-towards:0.12
Play P-top:0.15, spacebar-top:0.14, +-top:0.12
Previous P-left:0.63, <-top:0.20
Reject R-top:0.52, N-top:0.33
Rotate R-CW/CCW:0.99
Save S-top:1
Save all S-pull:0.30, S-LR:0.28, S-left:0.12, S-FB:0.10
Save as S-right:0.50, S-pull:0.22, S-towards:0.12
Shrink S-CCW:0.19
Task switch tab-left/right:0.24, tab-CW/CCW:0.15, W-top:0.13
Undo U-top:0.34, Z-top:0.24, backspace-left:0.17
Zoom in Z-CW:0.69
Zoom out Z-CCW:0.73


The value next to each sign (in a range from 0 to 1) expresses our confidence that the sign is the optimal one for the given referent, and thus belongs to \(\pi_{opt}\). Here, we don’t show signs for which the confidence score is lower than 10%.
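The way such confidence scores arise can be sketched on made-up data. The code below resamples participants with replacement, reruns a mapper on each resample, and reports how often each sign is selected. To stay short, it uses a stand-in mapper that picks the most frequent sign per referent rather than the Hungarian mapper, and all names are hypothetical. It illustrates the general bootstrap idea, not necessarily the exact procedure in gelicopt/inference.R.

```r
set.seed(1)  # for reproducibility of the resampling

# Made-up proposals: rows are referents, columns are participants
proposals <- data.frame(
  P1 = c("C-top", "S-top"),
  P2 = c("C-top", "S-top"),
  P3 = c("X-top", "S-top"),
  P4 = c("C-top", "S-top"),
  P5 = c("X-top", "S-top"),
  row.names = c("Copy", "Save"),
  stringsAsFactors = FALSE
)

# Stand-in mapper (NOT the Hungarian mapper): most frequent sign per referent
modal_mapper <- function(p) {
  apply(p, 1, function(row) names(which.max(table(row))))
}

# Resample participants (columns) with replacement and rerun the mapper
R <- 200
picks <- replicate(R, {
  resample <- proposals[, sample(ncol(proposals), replace = TRUE), drop = FALSE]
  modal_mapper(resample)
})

# Confidence of a sign = share of resamples in which the mapper selected it
toy_confidence <- lapply(row.names(proposals), function(ref) {
  chosen <- picks[ref, ]
  conf <- vapply(unique(chosen), function(s) mean(chosen == s), numeric(1))
  sort(conf, decreasing = TRUE)
})
names(toy_confidence) <- row.names(proposals)
toy_confidence
```

Because every participant proposed S-top for Save, that sign is selected in every resample and gets a confidence of 1. For Copy, the confidence is split between C-top and X-top, depending on which participants happen to be drawn.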

Determine an adequate sample size

Bailly et al. (2013) recruited 20 participants, which is the most frequent sample size for gesture elicitation studies (Villarreal-Narvaez et al. 2020). But is this number a good choice? Would results be the same or similar if a larger sample were used? We will try to investigate this question.

Let us first evaluate the guessability of the optimal mappings \(\widehat\pi_{opt}\) that we found earlier:

source("gelicopt/guessability.R") # Implementation of guessability functions

guess <- guessability(mappings, keys_gestures)
cat("Guessability =", guess, "\n")
## Guessability = 0.2785714

Unfortunately, this guessability score \(\widehat G(\widehat\pi_{opt}) = 27.9\%\) has been evaluated on the actual sample on which it was previously optimized (trained), so there is a risk of overfitting. This training guessability value generally overestimates the true guessability \(G(\widehat\pi_{opt})\) that we would like to assess and compare to \(G(\pi_{opt})\). To evaluate this overfitting problem, we perform cross-validation using the Leave-One-Out Cross-Validation (LOOCV) method:

guess <- guessability.loocv(keys_gestures, hungarian_sign_mapper)
cat("LOOCV Guessability =", guess, "\n")
## LOOCV Guessability = 0.222619
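The idea behind leave-one-out cross-validation can be sketched on made-up data: each participant is held out in turn, the mapping is optimized on the remaining participants, and guessability is measured on the held-out participant only. The sketch uses a stand-in mapper (most frequent sign per referent, ties broken alphabetically) instead of the Hungarian mapper, and hypothetical helper names. It is an assumption that guessability.loocv follows the same scheme.

```r
# Made-up proposals: rows are referents, columns are participants
proposals <- data.frame(
  P1 = c("C-top", "V-top", "S-top"),
  P2 = c("C-top", "P-top", "S-top"),
  P3 = c("X-top", "V-top", "S-top"),
  P4 = c("K-top", "P-top", "S-top"),
  row.names = c("Copy", "Paste", "Save"),
  stringsAsFactors = FALSE
)

# Stand-in mapper (NOT the Hungarian mapper): most frequent sign per referent
modal_mapper <- function(p) {
  apply(p, 1, function(row) names(which.max(table(row))))
}

# Share of proposals that match the mapped sign
toy_guessability <- function(mapping, p) {
  mean(as.matrix(p) == mapping[row.names(p)])
}

# Training score: optimize and evaluate on the same participants
train <- toy_guessability(modal_mapper(proposals), proposals)

# LOOCV score: optimize without participant i, evaluate on participant i
loocv <- mean(vapply(seq_len(ncol(proposals)), function(i) {
  mapping <- modal_mapper(proposals[, -i, drop = FALSE])
  toy_guessability(mapping, proposals[, i, drop = FALSE])
}, numeric(1)))

cat("Training =", train, " LOOCV =", loocv, "\n")
```

On this toy table the training score is 8/12 while the cross-validation score is 0.5, which reproduces in miniature the gap between the two scores reported above.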

To better understand how the training and cross-validation guessability scores converge, we will calculate them for an increasing number of participants (\(n = 3, 4, \ldots, 20\)):

gevolution <- guessability.evolution(keys_gestures, hungarian_sign_mapper)

Based on these scores, we can also try to predict the evolution of the guessability error \(\epsilon_G = G(\pi_{opt}) - G(\widehat\pi_{opt})\). If this error is low enough, a larger sample size is not likely to result in meaningful improvements. We use a simple linear model that we train with data from simulated gesture elicitation studies (optimized with the Hungarian algorithm).

source("gelicopt/prediction.R") # Code for building and using a prediction model of guessability error

# We provide data from simulated studies that can be used to train our model
trainingDatasets <- c("training/synthetic-hungarian-1.csv", "training/synthetic-hungarian-2.csv")

# We will train a simple linear model with two predictors: the size of the sample and the difference between the training and cross-validation score 
lmodel <- buildModel(trainingDatasets) 

predicted <- predictError(lmodel, gevolution) 
gevolution$fit <- predicted[,"fit"] # Best guess
gevolution$lwr <- predicted[,"lwr"] # Lower bound of 95% prediction interval
gevolution$upr <- predicted[,"upr"] # Upper bound of 95% prediction interval
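The fit, lwr, and upr columns are the names that base R's `predict()` returns for a linear model when a prediction interval is requested. As a rough sketch of that kind of model, and under the assumption (not confirmed here) that buildModel and predictError work along similar lines, the following code fits a linear model with the two predictors on made-up data. The variable names and the synthetic numbers are purely illustrative.

```r
set.seed(42)

# Made-up training data standing in for the simulated studies:
#   n     = sample size
#   gap   = training minus cross-validation guessability
#   error = guessability error observed in the simulation
n <- rep(seq(5, 40, by = 5), each = 10)
gap <- 0.6 / n + rnorm(length(n), sd = 0.01)
error <- 0.01 + 0.4 * gap - 0.0002 * n + rnorm(length(n), sd = 0.005)
sim <- data.frame(n = n, gap = gap, error = error)

# Linear model with the two predictors named in the text
toy_model <- lm(error ~ n + gap, data = sim)

# Predict the error for new (n, gap) pairs with a 95% prediction interval
new_studies <- data.frame(n = c(10, 20, 30), gap = c(0.08, 0.05, 0.03))
toy_pred <- predict(toy_model, newdata = new_studies,
                    interval = "prediction", level = 0.95)
toy_pred  # matrix with columns fit, lwr, upr
```

A prediction interval is wider than a confidence interval for the mean, because it also accounts for the variability of an individual new study around the fitted line.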

We can now plot the evolving training and cross-validation guessability scores, their difference, and the predicted guessability error:

library(ggplot2) # We will use ggplot2 to plot the training and cross-validation curves
library(gridExtra) # To plot the two graphs side by side

plot <- ggplot(gevolution, aes(x = N)) +
  geom_line(aes(y = validation*100), color = "orange") +
  geom_line(aes(y = training*100), color = "steelblue") +
  ylab("Guessability (%)") + xlab("Participants") +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black")) +
  annotate("text", x = 9, y = 33, label = "training curve", size = 3, hjust = 0) +
  annotate("text", x = 9, y = 17, label = "cross-validation curve", size = 3, hjust = 0) +
  scale_x_continuous(expand = c(0, 0), breaks = seq(0, 22, 2), limits = c(0, 22)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 40))

plot_error <- ggplot(gevolution, aes(x = N)) +
  geom_line(aes(y = (training - validation)*100), color = "chartreuse4") +
  geom_line(aes(y = fit*100), color = "blue") +
  geom_line(linetype = 2, aes(y = lwr*100), color = "blue") +
  geom_line(linetype = 2, aes(y = upr*100), color = "blue") +
  ylab("Guessability Error (%)") + xlab("Participants") +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.line = element_line(colour = "black")) +
  annotate("text", x = 9, y = 13, label = "training minus cross-validation curve", size = 3, hjust = 0) +
  annotate("text", x = 4, y = 9.5, label = "predicted error", size = 3, hjust = 0) +
  scale_x_continuous(expand = c(0, 0), breaks = seq(0, 22, 2), limits = c(0, 22)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 25))

grid.arrange(plot, plot_error, ncol=2)

The predicted guessability error is expected to be lower than \(4.5\%\), with a best guess of around \(2.5\%\); this number corresponds to a relative error of around \(10\%\) with respect to the expected population guessability. The error is decreasing slowly, but the cross-validation curve still fluctuates. A sensible decision could be to opt for a larger sample size, e.g., \(n = 30\). However, the investigators may decide that the cost of recruiting additional participants outweighs any benefits from potential increases in guessability scores. In all cases, plotting the above graphs can help other researchers better interpret their results.
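As a rough check of the relative error, assume that the population guessability \(G(\pi_{opt})\) can be approximated by adding the best-guess error to the cross-validation estimate of \(G(\widehat\pi_{opt})\) reported earlier (about \(0.223\)). This is one plausible reading of the figures rather than a formula stated above:

\[
\frac{\epsilon_G}{G(\pi_{opt})} \approx \frac{\epsilon_G}{G(\widehat\pi_{opt}) + \epsilon_G} \approx \frac{0.025}{0.223 + 0.025} \approx 0.10
\]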

In the above analysis, we considered the full set of 42 referents. However, you might decide to concentrate on a smaller set of referents and adapt the optimization accordingly. Finally, if you are interested in learning about how to conduct agreement analysis on the above dataset, please refer to this page.

References

Bailly, Gilles, Thomas Pietrzak, Jonathan Deber, and Daniel J. Wigdor. 2013. “Métamorphe: Augmenting Hotkey Usage with Actuated Keys.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 563–72. CHI ’13. New York, NY, USA: ACM. https://doi.org/10.1145/2470654.2470734.

Tsandilas, Theophanis, and Pierre Dragicevic. 2022. “Gesture Elicitation as a Computational Optimization Problem.” In CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3491102.3501942.

Villarreal-Narvaez, Santiago, Jean Vanderdonckt, Radu-Daniel Vatavu, and Jacob O. Wobbrock. 2020. “A Systematic Review of Gesture Elicitation Studies: What Can We Learn from 216 Studies?” In Proceedings of the 2020 ACM Designing Interactive Systems Conference, 855–72. DIS ’20. New York, NY, USA: ACM. https://doi.org/10.1145/3357236.3395511.

Wobbrock, Jacob O., Htet Htet Aung, Brandon Rothrock, and Brad A. Myers. 2005. “Maximizing the Guessability of Symbolic Input.” In CHI ’05 Extended Abstracts on Human Factors in Computing Systems, 1869–72. CHI EA ’05. New York, NY, USA: ACM. https://doi.org/10.1145/1056808.1057043.