-
Towards Understanding Human Mistakes of Programming by Example: An Online User
Study
IUI '17 Proceedings of the 22nd International Conference on Intelligent User
Interfaces
Tak Yeon Lee, Casey Dugan, and Benjamin B. Bederson
Programming-by-Example (PBE) enables users to create programs without writing a line of code.
However, there is little research on people's ability to accomplish complex tasks by providing
examples, which is the key to successful PBE solutions. This paper presents an online user
study reporting how well people decompose complex tasks and disambiguate sub-tasks. Our
findings suggest that disambiguation and decomposition are difficult for inexperienced users. We
identify seven types of mistakes users made, and suggest new opportunities for
actionable feedback based on unsuccessful examples, with design implications for future PBE
systems.
-
The Human Touch: How Non-expert Users Perceive, Interpret, and Fix Topic Models
International Journal of Human-Computer Studies, Volume 105, September 2017
Lee, T.Y., Smith, A., Seppi, K., Elmqvist, N., Boyd-Graber, J., and Findlater,
L.
Topic modeling is a common tool for understanding large bodies of text, but is typically
provided as a "take it or leave it" proposition. Incorporating human knowledge in unsupervised
learning is a promising approach to create high-quality topic models. Existing interactive
systems and modeling algorithms support a wide range of refinement operations to express
feedback. However, these systems' interactions are primarily driven by algorithmic convenience,
ignoring users who may lack expertise in topic modeling. To better understand how non-expert
users understand, assess, and refine topics, we conducted two user studies—an in-person
interview study and an online crowdsourced study. These studies demonstrate a disconnect between
what non-expert users want and the complex, low-level operations that current interactive
systems support. In particular, our findings include: (1) analysis of how non-expert users
perceive topic models; (2) characterization of primary refinement operations expected by
non-expert users and ordered by relative preference; (3) further evidence of the benefits of
supporting users in directly refining a topic model; (4) design implications for future
human-in-the-loop topic modeling interfaces.
-
Evaluating Visual Representations for Topic Understanding and Their Effects on
Manually Generated Labels
Transactions of the Association for Computational Linguistics, 2016.
Alison Smith, Tak Yeon Lee, Forough Poursabzi-Sangdeh, Leah Findlater, Jordan
Boyd-Graber, and Niklas Elmqvist
Probabilistic topic models are important tools for indexing, summarizing, and analyzing large
document collections by their themes. However, promoting end-user understanding of topics
remains an open research problem. We compare labels that users generate with four topic
visualization techniques (word lists, word lists with bars, word clouds, and network graphs), as
well as automatically generated labels, on how well downstream users believe each label
describes the corresponding documents. Our study has two phases: a labeling phase where
users label visualized topics and a validation phase where new users select which labels best
describe the topics' documents. Although all visualizations produce similar quality labels,
simple visualizations like word lists allow users to quickly understand topics, while complex
visualizations take longer but expose multi-word expressions that simpler visualizations
obscure. Automatic labels lag behind user-created labels, but our dataset of manually labeled
topics suggests preferred linguistic patterns (e.g., hypernyms, phrases) that can improve
automatic topic labeling algorithms.
-
Human-Centered and Interactive: Expanding the Impact of Topic Models
Human-Centered Machine Learning workshop, ACM Conference on Human Factors in
Computing Systems (CHI 2016).
Alison Smith, Tak Yeon Lee, Forough Poursabzi-Sangdeh, Jordan Boyd-Graber,
Kevin Seppi, Niklas Elmqvist, and Leah Findlater
Statistical topic modeling is a common tool for summarizing the themes in a document corpus. Due
to the complexity of topic modeling algorithms, however, their results are not accessible to
non-expert users. Recent work in interactive topic modeling looks to incorporate the user into
the inference loop, for example, by allowing them to view a model and then update it by
specifying important words and words that should be ignored. However, the majority of
interactive topic modeling work has been performed without fully understanding the needs of the
end user and does not adequately consider challenges that arise in interactive machine learning.
In this paper, we outline a subset of interactive machine learning design challenges with
specific considerations for interactive topic modeling. For each challenge, we propose solutions
based on prior work and our own preliminary findings, and identify open questions to guide
future work.
-
CTArcade: Computational Thinking with Games in School Age Children
International Journal of Child-Computer Interaction
Tak Yeon Lee, Matthew Louis Mauriello, June Ahn, and Benjamin B. Bederson
We believe that children as young as ten can directly benefit from
opportunities to engage in computational thinking. One approach to provide these opportunities
is to focus on social game play. Understanding game play is common across a range of media and
ages. Children can begin by solving puzzles on paper, continue on game boards, and ultimately
complete their solutions on computers. Through this process, learners can be guided through
increasingly complex algorithmic thinking activities that are built from their tacit knowledge
and excitement about game play. This paper describes our approach to teaching computational
thinking skills without traditional programming—but instead by building on children's existing
game playing interest and skills. We built a system called CTArcade, with an initial game
(Tic-Tac-Toe), which we evaluated with 18 children aged 10–15. The study shows that our
particular approach helped young children to better articulate algorithmic thinking patterns,
which were tacitly present when they played naturally on paper, but not explicitly apparent to
them until they used the CTArcade interface.
-
Experiments on Motivational Feedback for Crowdsourced Workers
International AAAI Conference on Weblogs and Social Media (ICWSM 2013) [20%
acceptance rate]
Tak Yeon Lee, Casey Dugan, Werner Geyer, Tristan Ratchford, Jamie Rasmussen, N.
Sadat Shami, Stela Lupushor
This paper examines the relationship between motivational design and its
longitudinal effects on crowdsourcing systems. In the context of a company internal web site
that crowdsources the identification of Twitter accounts owned by company employees, we designed
and investigated the effects of various motivational features, including individual and social
achievements and gamification. Our 6-month experiment with 437 users allowed us to compare the
features in terms of both quantity and quality of the work produced by participants over time.
While we found that gamification can increase workers' motivation overall, the combination of
motivational features also matters. Specifically, gamified social achievement is the
best-performing design over a longer period of time. Mixing individual and social achievements turns
out to be less effective and can even encourage users to game the system.
-
CTArcade: Learning Computational Thinking While Training Virtual Characters
Through Game Play
CHI '12 Extended Abstracts on Human Factors in Computing Systems, May 05-10,
2012, Austin, Texas, USA
Tak Yeon Lee, Matthew Louis Mauriello, John Ingraham, Awalin Sopan, June Ahn,
Benjamin B. Bederson
In this paper we describe CTArcade, a web application framework that seeks to
engage users through game play resulting in the improvement of computational thinking (CT)
skills. Our formative study indicates that CT skills are employed when children are asked to
define strategies of common games such as Connect Four. In CTArcade, users can train their own
virtual characters while playing games with them. Trained characters then play matches against
other virtual characters. By reviewing the matches played, users can improve their game
characters. A basic usability evaluation of the system helped define plans for improving
CTArcade and assessing its design goals.
-
TreeCovery: Coordinated dual treemap visualization for exploring the Recovery Act
Government Information Quarterly (December 2011) doi:10.1016/j.giq.2011.07.004
Rios, M., Sharma, P., Lee, T.Y., Schwartz, R., and Shneiderman, B.
The American Recovery and Reinvestment Act dedicated $787 billion to stimulate
the U.S. economy and mandated the release of the data describing the exact distribution of that
money. The dataset is a large and complex one; one of its distinguishing features is its
bi-hierarchical structure, arising from the distribution of money through agencies to specific
projects and the natural aggregation of awards based on location. To offer a comprehensive
overview of the data, a visualization must incorporate both these hierarchies. We present
TreeCovery, a tool that accomplishes this through the use of two coordinated treemaps. The tool
includes a number of innovative features, including coordinated zooming and filtering and a
proportional highlighting technique across the two trees. TreeCovery was designed to facilitate
data exploration, and initial user studies suggest that it will be helpful in insight
generation. The Recovery Accountability and Transparency Board (RATB) has tested TreeCovery and
is considering incorporating the concept into its visual analytics.
-
Optimizing Display Advertisements Based on Historic User Trails
SIGIR 2011 Workshop: Internet Advertising (IA2011)
Gupta, N., Khurana, U., Lee, T.Y., and Nawathe, S.
Effective online display advertising requires a dynamic selection of the
advertisement to be displayed when a web page is fetched. As the goal of displaying
advertisements is to engage users and obtain clicks, the advertisement with the highest
probability of being clicked should be displayed. In this paper we address the problem of finding the
most suitable display advertisement option for a user given his/her current browsing session.
Using historical browsing session information, we mine the association of different
advertisement views, engagements and clicks, and apply Bayesian models to find the likelihood of
an advertisement to be clicked given a specific set of events that describe a user session. A
major challenge in training the model for optimum precision is the sparsity of click events;
hence, we propose treating advertisement engagements, like clicks, as success events to train
the model more effectively. Our technique significantly outperforms the baseline of using
prior probabilities for selecting advertisements.