
SpeedDating.P4.Analysis of Results

Speed Dating
Clint Cope | Aaron Levisohn | Matt McKeon | Arvind Venkataramani


The analysis of the results reveals some gaffes that we committed, as well as some interesting facts about how users approach the system that have a bearing on its redesign.

Note: Most of the bar charts on this page show averages of responses on a six-point Likert scale. We have normalized these plots so that the neutral response is at zero. Thus, if respondents disagreed with a statement on average, the bar appears below the x-axis.
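
The normalization described in the note can be sketched as follows. This is a minimal illustration, not the script we used to produce the charts: it assumes responses were coded 1 (strongly disagree) through 6 (strongly agree), so that the neutral point falls at the midpoint, 3.5. The names `NEUTRAL` and `normalized_mean` are hypothetical.

```python
from statistics import mean

# Assumed coding of the six-point scale (not stated in the survey materials):
# 1 = strongly disagree ... 6 = strongly agree.
SCALE_MIN = 1
SCALE_MAX = 6
NEUTRAL = (SCALE_MIN + SCALE_MAX) / 2  # midpoint of the scale, 3.5


def normalized_mean(responses):
    """Average the responses and shift them so the neutral point is zero.

    A negative result means respondents disagreed on average, so the bar
    would be drawn below the x-axis.
    """
    if not responses:
        raise ValueError("at least one response is required")
    for r in responses:
        if not SCALE_MIN <= r <= SCALE_MAX:
            raise ValueError(f"response {r} is outside the scale")
    return mean(responses) - NEUTRAL
```

Under this coding, a statement that draws responses of 1, 2 and 3 plots at -1.5, while one that draws an even split of 3s and 4s plots at zero.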

Overall Ratings

Overall, the system was found to be both usable and useful.
Uploaded Image: overall-ratings.png

Handheld

We found that the handheld had No Learnability Whatsoever. However, once they had learned it, users thought the handheld was very easy to use, and most either liked using it or thought they would in a real speed-dating scenario. Overall, they strongly agreed that using the handheld system was enjoyable.

Prompts & Categories
The average responses to our first three handheld prompts ("Looks?", "Smarts?", and "Intrigue?") were approximately the same. However, only half as many participants responded affirmatively to the "Chemistry?" prompt. This may reflect the experimental design, in which our participants were rating fictional characters without any real dating experience. It may also reflect the vagueness of the prompt: some participants commented that they would like to see a greater variety of characteristics to rate. Other observations:
Uploaded Image: handheld-buttons.png Uploaded Image: handheld-voice.png Uploaded Image: handheld-misc.png

Voice Recording
There was a marked gender difference in the usage of voice memos. Every male participant recorded multiple voice memos, while no female participant did so. The survey responses showed the same gender difference in attitude towards voice recording (one female participant did not answer the voice recording questions on the survey because, she noted, she didn't use the function). This may be an attribute effect, in that female-gendered attitudes towards relationships may inhibit voicing evaluations in a semi-public context. It may also be a selection effect, due to our small experimental population. Finally, it may be a researcher effect, in that two male investigators were present and in close proximity to the participant when voice memos were being recorded. Our presence may have made female participants uncomfortable making compatibility statements about even fictional characters, while male participants may have been encouraged by a male atmosphere. Some other lessons learned were:
Uploaded Image: handheld-voice-gendered.png

Environmental Elements

These features were found to be useful and somewhat enjoyable, though marginally distracting. The chart below illustrates this.
Uploaded Image: system-environment.png

Conversation Lights
There was a mixed response to this feature. Overall, users thought it was not distracting, but were divided as to its utility.

Our analysis of the data from the conversation lights was inconclusive. The plot at left shows the graphs generated from participants' usage of the conversation lights during the experiment. While there were both strongly positive and strongly negative responses from our participants regarding the lights, correlating conversation patterns with their reactions fails to show any systematic covariance. However, we did discover that our timing of the lights was inappropriate for a conversation pattern that approximates that of a speed date (limited duration, open-ended, and social in nature). While the timing was fine-grained enough, it resulted in large plots that could not reasonably be shrunk for a computer display without losing critical details. We would need to experiment further to find the right balance between maintaining an appropriate granularity of data and optimizing for display.


Lava lamp

Coasters
The coasters were a hit! People liked the association between the coasters and the dates. They also had other observations:
Uploaded Image: coasters-recall.png

Website

Layout, navigation etc
Users wanted more assistance in comparing the dates, for instance by viewing all the graphs/photos at the same time, or by viewing only those that were rated highly enough. Otherwise, they found the website layout and organization of information satisfactory, and felt confident making decisions based on the information presented.
Uploaded Image: website-overall.png

Conversation graphs

Photo morphs
Not successful at all for the following reasons:
Uploaded Image: website-others.png


Environment vs. Website

Uploaded Image: LightsVsWebsite.png
This figure compares our participants' ratings of the conversation plots on the website with their ratings of the real-time conversation feedback in the environment. The yellow bars represent their evaluation of the lights' utility, while the yellow lines represent their evaluation of the website plots' utility. The green bars represent their enjoyment / engagement with the lights in the environment, while the green lines represent the same for the website plots.

Note that the distance between the green and yellow bars tends, on average, to match the disparity between the green and yellow lines. This means that the difference between our users' ratings of the lights in the environment and of the plots on the website remains constant, on average, across both categories of utility and engagement. There are several possible explanations. First, it is possible that we fail to help users who encounter the lights connect the real-time feedback with the visualizations that are generated from it. Second, it's possible that real-time feedback is simply less useful than the conversational record, and that we should try omitting real-time feedback to see if it makes an appreciable difference in the usability of the website plots. Third, there may be an ordering effect in the experiment, where users' impression of the plots on the website is colored by the "Aha!" moment they have when they recognize the connection to the conversation lights. Fourth, there may be a "Halo Effect" in the survey, where a participant's response to an engagement question is colored by their response to the corresponding utility question.
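
The comparison behind this observation can be sketched as below. This is an illustration under assumptions rather than our actual analysis: it assumes each participant's ratings are available as paired lists (environment lights and website plots) for each category, and the function names and the `tolerance` threshold are hypothetical.

```python
from statistics import mean


def mean_gap(environment, website):
    """Mean per-participant difference, website rating minus lights rating.

    Both lists must hold one rating per participant, in the same order.
    """
    if len(environment) != len(website):
        raise ValueError("ratings must be paired per participant")
    return mean(w - e for e, w in zip(environment, website))


def gaps_match(utility_gap, engagement_gap, tolerance=0.5):
    """True if the two gaps are within `tolerance` scale points.

    The tolerance is an arbitrary illustrative threshold, not a
    statistical test.
    """
    return abs(utility_gap - engagement_gap) <= tolerance
```

If `mean_gap` returns roughly the same value for the utility ratings as for the engagement ratings, the lights-versus-website difference is constant across the two categories, which is the pattern described above. With a population as small as ours, such a check is only suggestive.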