Wednesday, March 31, 2010

Whack gestures: inexact and inattentive interaction with mobile devices

Authors:
Scott E. Hudson
Chris Harrison
Beverly L. Harrison
Anthony LaMarca

Summary:
Goal: Inexact and inattentive interaction, i.e., interaction that does not require visual attention and does not disrupt the user's current activity.

Whack gestures are a small set of gestures built from combinations of whacks/taps and wiggles/shakes. An ad hoc recognition engine is set up to recognize the gestures, and a user study with 11 participants was performed to check the accuracy of the recognition. A rough sketch of what such a detector might look like is given below.
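As an illustration of what an ad hoc whack/wiggle detector could look like (the thresholds and structure below are my assumptions, not the paper's recognizer):

import numpy as np

# Illustrative detector over a short window of 3-axis accelerometer samples.
# Threshold and crossing counts are made-up values, not the paper's.
WHACK_THRESHOLD = 2.5     # g above the resting magnitude counts as a whack spike
WIGGLE_CROSSINGS = 6      # sign changes needed to call a window a wiggle

def detect_whack(samples, rest=1.0):
    """samples: (n, 3) accelerometer readings; a whack is a brief, large spike."""
    mags = np.linalg.norm(np.asarray(samples, dtype=float), axis=1)
    return bool(np.any(mags - rest > WHACK_THRESHOLD))

def detect_wiggle(samples):
    """A wiggle is a sustained back-and-forth oscillation on one axis."""
    axis = np.asarray(samples, dtype=float)[:, 0]
    axis = axis - axis.mean()                      # remove gravity/bias
    crossings = int(np.sum(np.diff(np.sign(axis)) != 0))
    return crossings >= WIGGLE_CROSSINGS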

Discussion:
I was totally taken aback by the evaluation. I was expecting a qualitative evaluation of the system, such as a user experience evaluation, not a recognizer accuracy evaluation. It would have been more interesting if they had evaluated the user experience in a concrete scenario.

Comments: Murat

Gameplay issues in the design of spatial 3D gestures for video games

Authors:
John Payne
Paul Keir
Jocelyn Elgoyhen
Mairghread McLundie
Martin Naef
Martyn Horner
Paul Anderson

Summary:
While gestures have been implemented effectively in 2D, 3D spatial gestures present their own problems: how to present 3D gesture feedback, differences in user performance, how to instruct users and help them learn gestures, and what semiotics are familiar for 3D gestures. The authors created a device called 3motion and built several games to evaluate the usefulness of gestures in gaming scenarios. They identified intuitive gestures, clear user feedback, and effective semiotics as the important factors.

Discussion:
I did not realize that this paper predates the Wii being on the market. More elaborate research on user feedback and on recognizing the features of good feedback would be helpful; in fact, each topic identified as important deserves its own elaborate study.

Comments: Drew, franck

Wiizards: 3D gesture recognition for game play input

Authors:
Louis Kratz
Matthew Smith
Frank J. Lee

Summary:
Goal: gesture recognition from 3D accelerometer data from the Wiimote, with tolerance for fluid gestures and a more immersive gaming experience.

The paper evaluates an HMM for recognizing Wiimote accelerometer gestures. The effect of the amount of training data and the number of HMM states on recognition speed and accuracy is analyzed. The average correctness of the HMM without training data is also measured, at around 50%. The HMM performed well at 15 states: 10 training samples per gesture yielded 80% accuracy, and 20 samples per gesture yielded 95% accuracy. The results showed that more states meant more training time and slower recognition. A sketch of this kind of per-gesture HMM setup is given below.
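As a rough illustration of this kind of setup (not the authors' implementation), one HMM per gesture class can be trained on accelerometer traces and the class with the highest likelihood chosen at recognition time; the sketch below assumes the third-party hmmlearn library.

import numpy as np
from hmmlearn import hmm   # third-party library; one possible HMM implementation

# One HMM per gesture class; each training sample is an (n_frames, 3) array of
# accelerometer readings.

def train_gesture_model(samples, n_states=15):
    """Fit one HMM to all training samples of a single gesture."""
    X = np.vstack(samples)
    lengths = [len(s) for s in samples]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    return model

def classify(models, sample):
    """Pick the gesture whose HMM gives the highest log-likelihood."""
    return max(models, key=lambda name: models[name].score(sample))

Raising n_states or adding more samples per gesture trades training and scoring time against accuracy, which is exactly the trade-off the paper measures.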

Discussion:
I do not see what is novel about this system. Almost all classifiers need some form of training, so why not think about making the training process interesting, or making training part of the game?

Comments: Franck, Kevin

Device agnostic 3D gesture recognition using hidden Markov models

Authors:
Anthony Whitehead
Kaitlyn Fox

Summary:
Goal: test the accuracy of 3D gesture recognition when the 3D space is divided into discrete subspaces.

The 3D space is divided into 27, 64, and 125 cubes. The collected data is then used to train an HMM and tested on a gesture set, measuring the number of training examples needed and the recognition speed. The 27-state HMM achieved around 800 recognitions per second, while a single recognition took seconds for the 125-state HMM. The 27-state HMM also needed less training data; more states in the HMM meant more training data. A sketch of the cube quantization is shown below.
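A minimal sketch of the cube quantization idea, assuming accelerometer-style (x, y, z) samples and made-up axis bounds: each sample is mapped to the index of the cube it falls in, so 3, 4, or 5 divisions per axis give 27, 64, or 125 discrete symbols for the HMM.

import numpy as np

def quantize(samples, n_per_axis=3, lo=-2.0, hi=2.0):
    """samples: (n, 3) array -> (n,) array of cube indices in [0, n_per_axis**3).
    lo/hi are assumed axis bounds; values outside are clipped into the edge cubes."""
    bins = np.linspace(lo, hi, n_per_axis + 1)[1:-1]          # interior bin edges
    idx = np.stack([np.digitize(samples[:, a], bins) for a in range(3)], axis=1)
    return idx[:, 0] * n_per_axis**2 + idx[:, 1] * n_per_axis + idx[:, 2]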

Discussion:
I am confused about the relationship between the states and the segmentation of the space: by "states", do they mean the actual number of states in the HMM or the number of segments? The interesting part, though, is dividing the 3D space into cubes.

Comments: Franck,

Monday, March 15, 2010

Sensory Hand

Author:
Vernon B. Mountcastle

Summary:

* Movement is the facilitating agent for complex tactile experiences. The brain integrates successive patterns of input to create a perceptual whole, with intra-cortical processing times of 80-100 ms for each pattern.
* Mechanical oscillations can be delivered to the skin. Humans can sense vibration over a frequency range of roughly 5-600 Hz. Frequencies in the range of 5-50 Hz evoke a well-localized sensation at the stimulus site; increasing the frequency gradually changes the sensory experience to a deep, spreading, and poorly localized hum.
* Weber's law - the ratio of the just-perceivable change in a tactile stimulus to the magnitude of the stimulus is constant.
* Fechner's law - extending Weber's law, Fechner quantified sensation as a logarithmic function of stimulus magnitude (see the formulas below).
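In standard notation (the usual textbook forms, not formulas quoted from the chapter), with I the stimulus magnitude, \Delta I the just-noticeable change, S the perceived sensation, and k, I_0 constants:

\frac{\Delta I}{I} = k \qquad \text{(Weber's law)}

S = k \, \ln\!\left(\frac{I}{I_0}\right) \qquad \text{(Fechner's law)}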

Chapter 11

Localization error is minimal at the finger pad and gradually increases toward the proximal phalanges and the palm.
(Schady & Torebjork, 1983) studied point localization, and Wheat et al. ran a tactile localization experiment on the hand, measuring the separation threshold needed to distinguish a point stimulus from a sphere of the same radius. Human receptiveness also increases with time: sensitivity to a stimulus increased over roughly 300 ms.

Movement and direction - stimuli reaching the hand are sensed through scanning movements that combine lateral motion and skin stretch.

Flutter vibration - experiments on detecting changes in frequency and on change-detection thresholds.


Tuesday, March 9, 2010

An architecture for gesture-based control of mobile robots

Authors:
Iba, S.
Weghe, J. M. V.
Paredis, C. J. J.
Khosla, P. K.

Summary:
Goal: a gesture recognition system for interacting with mobile robots - more precisely, gesture spotting with an HMM.
Hardware used: CyberGlove + 6-DOF location sensor.
Feature set: the 18 sensor values are reduced to a 10-dimensional vector, and pairing each dimension with its derivative brings the vector to 20 dimensions. This 20-dimensional vector is coded into a 32-bit integer codeword, which is then fed to the HMM for recognition. The HMM is trained with 5,000 postures drawn from the full hand posture space, and the observation sequence is restricted to obtain better quantization. A wait state is included to provide a way to reject invalid gestures. A toy version of the codeword encoding is sketched below.
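A toy illustration of packing a 20-dimensional vector (10 reduced glove features plus their derivatives) into a 32-bit codeword. The paper's actual quantization scheme is not reproduced here; the one-bit-per-dimension coding below is purely my assumption.

import numpy as np

def encode_codeword(features, derivatives, thresholds):
    """features, derivatives, thresholds: length-10 arrays -> int codeword."""
    bits = np.concatenate([
        (features > thresholds).astype(np.uint32),    # bits 0-9: feature above threshold?
        (derivatives > 0).astype(np.uint32),          # bits 10-19: feature increasing?
    ])
    codeword = 0
    for i, b in enumerate(bits):
        codeword |= int(b) << i                       # fits comfortably in 32 bits
    return codeword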

Gesture set: opening, opened, closing, pointing, waving left, and waving right. These gestures carry different semantics in the local and global modes of robot control.

Discussion:
Interesting extension to HMM for rejecting invalid gestures. Are gestures better than joystick controls while interacting with robots?

Comments: Drew, Franck

Human-centered interaction with documents

Authors:
Andreas Dengel
Stefan Agne
Bertin Klein
Achim Ebert
Matthias Deller

Summary:
Goal: Combining 3D visualization techniques and hand gesture recognition to improve interaction with documents.
Visualization:
A well-designed, virtual-reality-like graphical document explorer in 3D. Documents are represented as books in a bookcase. A search bar is invoked with a gesture, and the documents related to the query are displayed with a higher semantic zoom and in greater detail than in the bookcase. The system provides two modes of visualization. Plane mode shows a matrix of documents with color coding: yellowness for the age of a document, an animated pulsing behavior to indicate importance, thickness to show document size, and a thumbnail showing its first page.
Cluster mode - a 3D visualization of related documents, with relations shown as colored flashing lines connecting the documents.
Gesture recognition - a gesture recognition subsystem that allows gestures to be defined, takes in training data, and produces recognition results. It runs two threads - data collection and the gesture manager. The data collection thread generates events based on glove movements (glove moved, posture changed, ...), and the gesture manager receives those events and is responsible for recognizing gestures. A sketch of this producer/consumer split is given below.
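A minimal sketch of that two-thread structure, with hypothetical glove and recognizer interfaces standing in for the real ones:

import queue
import threading

# A data collection thread feeds a gesture manager through a queue.
events = queue.Queue()

def data_collection(glove):
    """Poll the glove and emit ('glove_moved', data) or ('posture_changed', data) events."""
    while True:
        events.put(glove.poll())              # hypothetical: returns an (event_kind, data) pair

def gesture_manager(recognizer):
    """Consume events, accumulate a movement trace, and classify it at posture changes."""
    trace = []
    while True:
        kind, data = events.get()
        trace.append(data)
        if kind == "posture_changed":         # treat a posture change as a segment boundary
            print(recognizer.classify(trace)) # hypothetical recognizer call
            trace = []

# threading.Thread(target=data_collection, args=(glove,), daemon=True).start()
# threading.Thread(target=gesture_manager, args=(recognizer,), daemon=True).start()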

Discussion:
The document-moving gesture and the pointing gesture were intuitive; the other gestures, for opening a document and returning it to the grid, did not seem to be. The paper claims to support several ways of interacting with a document but does not say what they are. There is no user study. I could not understand some of the visualization concepts used in the system, such as the relation visualization, the occlusion caused by the links, and the green box used to resolve occlusion. A systematic user study would have helped to identify the real problems.
I had hoped this paper would give some ideas about which gestures users tend to use in 3D environments.

Comments: Franck

Monday, March 1, 2010

Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes

Authors:
Jacob O. Wobbrock
Andrew D. Wilson
Yang Li

Summary:
Goal:
1. be resilient to variations in sampling due to movement speed or sensing;
2. support optional and configurable rotation, scale, and position invariance;
3. require no advanced mathematical techniques (e.g., matrix inversions, derivatives, integrals);
4. be easily written in few lines of code;
5. be fast enough for interactive purposes (no lag);
6. allow developers and application end-users to “teach” it new gestures with only one example;
7. return an N-best list with sensible [0..1] scores that are independent of the number of input points;
8. provide recognition rates that are competitive with more complex algorithms previously used in HCI to recognize the types of gestures

A class library contains templates; a template/gesture is a sequence of points. A user-entered gesture is first resampled, rotated to its indicative angle, and scaled to match the template. A mean-squared-error score at the best angle is then calculated against each class template, and an N-best list is generated from these scores. A compact sketch of these steps is given below.
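A compact re-implementation sketch of those core steps (resample, rotate to the indicative angle, scale, score). It omits the golden-section search over candidate rotations that the full $1 algorithm performs, and the constants are commonly used defaults rather than anything mandated by the paper.

import numpy as np

N, SIZE = 64, 250.0   # number of resampled points and reference square size

def resample(pts, n=N):
    """Resample a stroke to n points spaced evenly along its arc length."""
    pts = np.asarray(pts, dtype=float)
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = np.linspace(0, d[-1], n)
    return np.column_stack([np.interp(t, d, pts[:, 0]), np.interp(t, d, pts[:, 1])])

def normalize(pts):
    """Rotate so the indicative angle is zero, scale to a reference square, center."""
    c = pts.mean(axis=0)
    ang = np.arctan2(pts[0, 1] - c[1], pts[0, 0] - c[0])
    rot = np.array([[np.cos(-ang), -np.sin(-ang)], [np.sin(-ang), np.cos(-ang)]])
    pts = (pts - c) @ rot.T
    span = pts.max(axis=0) - pts.min(axis=0)
    pts = pts * (SIZE / span)                 # non-uniform scale to SIZE x SIZE
    return pts - pts.mean(axis=0)

def score(candidate, template):
    """Similarity score in [0, 1]; higher is better. Inputs are raw point lists."""
    a, b = normalize(resample(candidate)), normalize(resample(template))
    d = np.mean(np.linalg.norm(a - b, axis=1))
    return 1.0 - d / (0.5 * np.sqrt(2 * SIZE ** 2))

Scoring a candidate against every stored template and sorting by score yields the N-best list described above.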

User study: data was collected from 10 users (4,800 gestures in total); each user was asked to draw 16 gestures at slow, medium, and fast speeds.
The recognizer achieves 97% accuracy with 1 training sample per class and 99.5% accuracy with 3+ training samples per class.

Limitations: $1 cannot distinguish gestures whose identities depend on specific orientations, aspect ratios, or locations. 1D gestures such as horizontal and vertical lines cannot be recognized reliably, since they are distorted by the non-uniform scaling. Gestures also cannot be differentiated based on speed.

Discussion:
A pretty extensive user study that analyzes the effect of the number of training samples per template, execution speed, accuracy, and so on. I do not understand how qualitative data can be collected for a classifier, but the authors did collect qualitative data on which gestures users liked. The analysis of the recognizer's limitations was interesting, and the authors also suggest solutions for removing some of them.

Comments: Peschel, Franck

The $3 recognizer: simple 3D gesture recognition on mobile devices

Authors:
Sven Kratz
Michael Rohs

Summary:
Goal: build a simple 3D gesture recognition system that is easy to implement and requires little training data.
A gesture recognizer for 3D gestures was built by extending Wobbrock's $1 recognizer for 2D touch gestures. It is easy to implement and requires as few as 5 samples per class for good recognition results. The system uses raw acceleration data: the change in acceleration is computed, and the sequence of acceleration deltas represents a gesture trace. A class library contains a list of gesture traces. The user-entered gesture is normalized by resampling, rotation to the indicative angle, and rescaling, and the MSE at the best angle is computed against each class to generate a score table. To reduce false positives, a threshold-based heuristic is applied to the sorted score table; if the threshold condition does not hold, the gesture is labeled as unclassified. The rejection heuristic is sketched below.
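A small sketch of the two ingredients that distinguish $3 from $1 as described above: turning raw acceleration into a delta trace, and rejecting weak matches as unclassified. The specific margin rule below is my own stand-in, not the paper's published heuristic.

import numpy as np

def delta_trace(accel):
    """accel: (n, 3) raw accelerometer samples -> (n-1, 3) acceleration deltas."""
    return np.diff(np.asarray(accel, dtype=float), axis=0)

def classify(score_table, margin=1.1):
    """score_table: {class_name: score}, higher is better.
    Accept the best class only if it beats the runner-up by a clear margin."""
    ranked = sorted(score_table.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] < margin * ranked[1][1]:
        return "unclassified"
    return ranked[0][0]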

Discussion:
Very similar to the $1 recognizer: easy to implement and easy to train. Providing accuracy data would have made a more compelling case for the classifier. Introducing the "unclassified" tag is an improvement over the $1 recognizer.

Comment: Franck

Office Activity Recognition using Hand Posture Cues

Authors:
Brandon Paulson
Tracy Hammond

Summary:
Goal: activity recognition based on hand postures - using hand posture to determine which object is being interacted with, and examining how user-dependent the interaction style is.
Activity recognition can help establish the context of the user's interaction. Activity theory holds that activities have objectives and are accomplished using tools and objects; therefore, by identifying the object the user is interacting with, information about the activity can be extracted.
Previous work: recognizing movement related activities - vision based, wearable accelerometers
object interaction - RFID tags on objects with tag reader in hand.
Grasp types - vision based and glove input data
Implementation: a CyberGlove II with 22 sensors is used. A 1-NN classifier distinguishes 12 different activities. A user study with 8 users was conducted. User-independent testing produced very low accuracy (average 62%). User-dependent testing with 2 training and 3 testing samples produced 78% accuracy, while 4 training samples produced 94% accuracy. User-independent postures showed a lot of variation; user-dependent postures were better recognized, but confusion occurred between typing on the keyboard and the phone, and among circular grips on objects like a mug, drawer, telephone, and stapler. A minimal 1-NN sketch over the glove's sensor vector is shown below.
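A minimal 1-nearest-neighbor sketch over 22-dimensional glove sensor vectors, assuming plain Euclidean distance; this shows the generic classifier, not the authors' exact feature handling.

import numpy as np

def predict(train_X, train_y, sample):
    """train_X: (n, 22) posture vectors, train_y: n activity labels, sample: (22,)."""
    dists = np.linalg.norm(np.asarray(train_X, dtype=float) - np.asarray(sample, dtype=float), axis=1)
    return train_y[int(np.argmin(dists))]   # label of the single closest training posture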

Discussion:
An interesting method for identifying activity. It was not clear whether each activity was captured separately or performed as part of a sequence. It would be interesting to see how the system scales to other activities. I would also like to see an example application where the context information is used; I am a little confused about how this data can be used.

Comments: Paul
Drew

American sign language recognition in game development for deaf children

Authors:
Helene Brashear
Valerie Henderson
Kwang-Hyun Park
Harley Hamilton
Seungyon Lee
Thad Starner

Summary:
Goal: design an ASL gesture recognition system.
A Wizard-of-Oz user study was conducted with 5 children to collect gesture data: 541 phrase samples and 1,959 sign samples over a 22-word vocabulary. The system uses a camera and wrist-mounted accelerometers to capture hand motion, and an HMM is trained to recognize the gestures. The feature vector combines the vision and accelerometer data: x, y, z accelerometer values for both hands, the change in the hand's center position between frames, mass, eccentricity, the lengths of the major and minor axes, the angle of the major axis, and its direction as an x, y offset. The user-dependent model produced 93.39% accuracy and the user-independent model 86.28% accuracy. A sketch of assembling such a per-frame feature vector is shown below.
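An illustrative sketch of assembling one per-frame feature vector from the quantities listed above; the field names and ordering are my own, not the paper's.

import numpy as np

def frame_features(accel_left, accel_right, d_center, blob):
    """accel_*: (3,) accelerometer readings; d_center: (2,) change in hand center
    between frames; blob: dict of vision features for the tracked hand."""
    return np.concatenate([
        accel_left, accel_right, d_center,
        [blob["mass"], blob["eccentricity"],
         blob["major_len"], blob["minor_len"], blob["angle"]],
        blob["direction"],                    # (2,) x, y offset of the major axis
    ])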

Discussion:
The data was pruned, and only the gestures that seemed correct were selected for testing. I do not think that is a sound practice for testing a recognizer. Also, the vocabulary of words used for testing was small.

Comment:

An empirical evaluation of touch and tangible interfaces for tabletop displays

Authors:
Aurlien Lucchi
Patrick Jermann
Guillaume Zufferey
Pierre Dillenbourg

Summary:
Goal: evaluation of touch and tangible interfaces on tabletop displays.
This is a comprehensive user study with 40 users and a set of activities comparing touch and tangible interfaces. Tangible interfaces are hard to measure with Fitts's law alone. A top-projected tabletop system was built to support both interfaces: the table can track multi-finger interaction, and a camera mounted above the table tracks the tagged tangible objects. The toolbar of the touch interface was replaced by the tangible objects. Users were asked to model a warehouse building using shelves and walls; the virtual wall in the touch interface was rescalable while the tangible one was not. Activities like translation and rotation have natural gestures in the tangible condition, while selection has no meaning there. Addition, removal, and adjustment were performed physically in the tangible interface, whereas toolbar icons were used in the touch interface. An introductory video explained the available actions, and the users' actions were videotaped and logged. Three variables were used to compare the interfaces: overall completion time, completion time of each action, and accuracy. Overall completion time favored the tangible interface. Comparing individual activities showed that scaling, translation, and deletion favor the touch interface while addition favors the tangible interface. Touch was slightly less accurate. User preferences show that the tangible interface was easier to learn, while the touch interface made users more stressed and irritated.
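For reference, the usual Shannon formulation of Fitts's law (a standard textbook form, not quoted from the paper), where MT is movement time, D the distance to the target, W the target width, and a, b empirically fitted constants:

MT = a + b \log_2\!\left(\frac{D}{W} + 1\right)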

Discussion:
The tangible interface seems to be a more intuitive way to interact, but its rigidity and the difficulty of editing may hurt the user experience. I am impressed by the user study: it is expensive, systematic, and has both quantitative and qualitative data.
It would have been interesting if the authors had done a follow-up study to see whether users retained what they learned.

Comments: