Emerson Murphy-Hill interviews me (part 1)

About a month ago, Emerson Murphy-Hill (currently a post-doc at UBC) asked if he could interview me about the challenges of doing HCI research about Software Engineering (and vice versa). I’ll post our interview in two parts: the first, listed here, covers HCI and software engineering research and the second covers publishing in HCI venues.

Q: What are the biggest differences between HCI and software research?

Both pursuits are very problem-driven: we want things that work, that demonstrably solve issues, and move practice and our understanding of practice forward. Not only that, but both pursuits are largely interested in solving the same problem: we want to find ways of creating software and technology that achieves its requirements and ultimately serves customer and user needs.

Where HCI and software research differ is in their methods. Most HCI research focuses on understanding the context being designed for and using this understanding to drive innovation. Software research often works the other way around, seeking innovative technological approaches to well-established problems, but not often doing formative research to discover new problems. The other difference between the two pursuits is the breadth of their methodological toolboxes. HCI researchers will use whatever method is appropriate for a research question, whether that means controlled experiments, interviews, ethnographic field work, or any number of theoretical frameworks and formalisms. Software research tends to be more restricted methodologically, focusing mainly on first order logic and quantitative empiricism. In my view, this restricted set of epistemological tools prevents software researchers from seeing the larger context of the problems we try to solve.

Q: It sounds like you are suggesting that software research could make more of an effort to do formative research to find problems. Are there any areas in software research that you think that we don’t truly understand the nature of the problems?

I think that software research, as a whole, severely underestimates the role of communication, coordination, and management in successful software projects. The importance of this factor has been claimed for decades and studied infrequently for nearly 20 years, but it plays such a minor role in most software engineering research. This is surprising, since most other engineering disciplines focus heavily on the actual human process of engineering different goods.

Another area that is understudied is the notion of requirements and what they actually mean. Ultimately, the role of software in humanity is to support humanity, but more often than not, software engineers let the medium, rather than humanity, dictate the design. There are interesting connections between Requirements Engineering and HCI, in that both seek to elicit requirements, but using different methods. I’ve seen very little work that bridges these different approaches to software design. Generally, most software research focuses on “getting the design right” rather than “getting the right design.”

Q: I noticed that you mentioned that software research has begun to use quantitative empiricism, but did not mention qualitative empiricism. Do we not use both?

In my experience, software engineering researchers are highly skeptical of qualitative methods. It is quite rare to see a paper at a top conference that uses qualitative methods exclusively. I’ve personally received reviews that suggested I convert my qualitative observations into numbers in order to make them more objective (which, epistemologically, is both ineffective and naive). Unfortunately, there are some questions for which quantification is insufficient. For example, if you want to know how a software development team selects which bugs to address first, what do you measure? Some objects of study are processes and activities, with no single measurable dimension.

I understand why the community is skeptical; we come from quantitative traditions, and most software engineering researchers have only a vague idea of what sociologists and anthropologists do. It’s unfortunate that at the moment, to publish research about inherently qualitative phenomena, one has to create artificial and unhelpful measurements of those phenomena to make the work palatable.

Q: Do you have any recommended reading for doing research in HCI?

There’s no small set of reading that would suffice. HCI spans over 50 years, 20 conferences, and at least 20 journals. CHI, the leading HCI conference, is the second largest ACM conference, second only to SIGGRAPH. Deciding what to read can be daunting and there’s really no way to reduce its complexity.

Instead, I’d suggest that learning to do research in HCI is more about choosing which methods you want to excel at. Personally, I focus on user interface design, evaluation, and empirical research, and that only covers a small subset of the kinds of methods that people use in HCI.

I can recommend some books, which offer some perspective on the mindset of HCI researchers. For example, Bill Buxton’s Sketching User Experiences [2] is a fantastic look at what it means to design systems and how finding the right design (the HCI part) can greatly simplify getting the design right (the software engineering part). I very much subscribe to his perspective (which isn’t surprising, since Bill unofficially advised my advisor, Brad Myers, at Toronto).

Q: At what point in the research process should researchers consider what venue to submit to? Should the venue influence how you conduct your research?

There are definitely more experienced people to ask than me! But perhaps I have a fresh perspective on the issue, as I straddle the boundary between the HCI and software engineering worlds. Ideally, researchers would select important, interesting problems and publish the work when it’s done. The venue should only matter once one knows what the contribution is and which communities would appreciate it.

Unfortunately, the conference culture in both HCI and software engineering tends to incentivize work of limited and conservative scope. This has been discussed across a variety of venues, including HCI articles and conference panels, as well as several ICSE keynotes and papers. On top of that, a lot of good work gets rejected because it’s not yet fully formed. I believe that journals, with their multiple rounds of review and lack of deadlines, offer a healthier process with which to vet and disseminate academic research.

Q: Tasks used in HCI studies often appear to require little domain expertise and can be conducted in a short amount of time whereas software studies often require substantial domain expertise and can be difficult to structure to complete in a short amount of time. Is this statement true in your experience and if so, how have you managed the issue?

I don’t think this is a fair characterization of HCI research. In the past, HCI focused a lot on novice tasks, partly because user interfaces were so bad; there is also a subset of HCI research that focuses on input techniques, which are more amenable to experimentation because of the more limited variance in human motor performance. But in the past few decades, there’s been a broad focus in HCI on supporting experts and expert teamwork in a variety of domains. Designing studies of these activities is just as difficult as, if not more difficult than, designing studies to evaluate software tools. This is one reason why HCI has adopted so many other kinds of methodologies: one can’t design a controlled experiment to learn how first responders use cell phones to coordinate. We face the same challenges when designing controlled experiments to learn about coordination in software teams.

I deal with this challenge in my own work in a few ways. First, like researchers in all other empirical fields, I carefully design my measurements, stating their limitations and potential confounds, and then move forward despite the threats. The ultimate product of any empirical work is not the one perfectly designed study, but a large collection of studies that repeatedly demonstrate consistent and convergent results across a variety of contexts and with a variety of operationalizations. There is still an attitude in software engineering research that a single study should suffice; we need to move away from that view and start to plan for decades of study and experimentation on fundamental issues.

Another way that I deal with this challenge is to design studies that explain what my tools do for people and how they do it. For example, I’m designing a study at the moment with James Fogarty and Kayur Patel to evaluate how their integrated classifier development environment helps developers find bugs. The goal of the study is less about showing a difference in success (since success in the real world depends on too many other factors) and more about explaining what the tool does differently that contributes to developers’ success. To do this, we’re asking participants to verbally state changes in their goals, and associating these shifts in goals with the use of different parts of the tool. This way, the study result is not “participants were more successful,” but “participants were more successful because they spent more time confirming fewer hypotheses.” This is the kind of knowledge that helps design other debugging tools.

Q: So do you think that any parts of HCI or software research will have a lasting impact?

Well, this is a controversial topic within HCI, but I personally believe that there is fundamental HCI research and then there are applications of HCI methods (which are actually the methods of other communities, such as cognitive psychology, anthropology, and design). My body of work, for example, is largely an application of HCI methods to the problems of software engineering practice. I view the core areas of HCI as input and output devices and anything else having to do with feedback and interactivity. This latter category has had, and will continue to have, lasting impact.

Software engineering, like HCI, has made several foundational contributions to practice, such as version control, limited forms of model checking, compilers, debuggers, and development environments. However, many of the coordination, planning, and management aspects of software engineering have moved along largely without the help of research. I think the challenge for software engineering research is to recognize that many of the fundamental challenges in practice are human challenges, and that many basic software engineering tools must be designed with these challenges in mind.

One philosophical issue surrounding the future of both applications-driven HCI research and software engineering is whether the domains we study and design for are moving targets. Psychology, medicine, and the natural sciences operate under the assumption that people and nature don’t change in their fundamental nature (or at least not very quickly). This makes it possible to advance knowledge with empirical study over the course of 100 years. Can we make the same assumptions about the nature of coordination in software development? Are there really fundamental, unchanging aspects of software engineering practice, or are all of the challenges we observe ephemeral? This is an open question that neither HCI nor software research has begun to address.

Q: You mention that doing HCI studies is hard. How might one get started doing an empirical evaluation for the first time, considering both the need to get useful results and the high likelihood of making a mistake?

To really get good at empirical evaluation, a lot of things are necessary. First, find an expert at empirical evaluation who is interested in applying their skills outside of their content area. These might be statisticians, experimental psychologists, or researchers in policy departments. Second, get a good book about epistemology: there’s no end of gentle introductions to the power and perils of measurement. I recommend The Numbers Game [1] for an intuitive sense of the complexity of measuring things. The key thing is to learn to be extremely skeptical about the validity, reliability, and semantics of measurement.

The rest of the challenge is knowing your audience. Do you really need an experimental study to support your claims? Or would finding one person to adopt your tool for a week suffice? Do you really need to demonstrate causality, or are there other more pressing questions that might be interesting to investigate? There are lots of ways to gain confidence that your design choices were good by some measure.
