Review of Grudin’s “From Tool to Partner: The Evolution of HCI”

Last week there was far too much news I didn’t want to hear, so instead of reading news, I read Jonathan Grudin’s new book on the history of HCI. (On my phone. On buses. In the dark. In five-minute spurts!) Since you probably haven’t read it yet, I’ll do my best to summarize it here and tell you what I thought.

First, Grudin tackles a lot in this book. He synthesizes no less than the history of AI, HCI, Information Science, and Human Factors, trying to show how these fields emerged and intersected but rarely engaged each other, despite all of their immense interest in the interaction between people and computing. It’s a massive amount of history about fields emerging at the beginning of the digital age, and so the scope can be overwhelming.

Grudin does a reasonable job covering this scope, organizing the book chronologically, but bouncing between different fields, presenting big claims about the assumptions, ideas, and lenses that shaped what the different fields investigated. Throughout, he’s seeking to explain why these fields studied what they did, how that led to their ultimate lack of intermixing, and how that resulted in different fields’ differential impact on practice.

One of the big ideas in the book is the difference between the kind of discretionary computer use that happens in consumer settings and the compulsory use that happens in organizational contexts. Grudin theorizes that this is the primary reason why information systems and LIS withered while HCI flowered: discretionary use became the dominant, visible change in the world, bringing computing to every facet of life and giving HCI a mountain of interesting, diverse things to study, thereby broadening its methods and perspectives. The world of compulsory use inside organizations, meanwhile, moved more slowly, constrained by the difficulty of studying whole organizations and their glacial adoption of consumer trends.

One of the more interesting, perhaps implied, ideas is that the other fields of Management Information Systems, Human Factors, and Library & Information Sciences, despite withering, still have a wealth of knowledge to share about people’s interactions with computers, but the lack of disciplinary intermingling prevented that knowledge from informing some of the big changes that occurred in computing. Google, with its roots as an NSF-funded Library & Information Sciences digital libraries project, was one of the few exceptions. Look at the impact that emerged from its interdisciplinary foundations. What would the world look like if our major shifts in computing had been informed by all of these fields instead of primarily computer science?

Zooming to present day, Grudin isn’t sure what to make of the iSchool movement, which seeks to embrace some of these interdisciplinary threads that never quite connected through history. These fields are finding their way together after decades apart, with faculty from computing backgrounds like myself mingling with faculty from these other fields. Will we find ways to combine our disciplinary perspectives into new, more powerful ideas that will shape our computational futures? Or is it too late, with computer science shaping the conversation, but narrowly? I suppose that’s literally up to me and my colleagues to decide.

Grudin is convinced there’s plenty more runway to find out. He predicts a future that goes well beyond interaction, to human-computer integration. In fact, he predicts that future is now, and that we’re only just beginning to figure out how to reason about interactions that infuse computing into our everyday decisions and communications. He predicts that understanding people and communication will be key to that, and that interdisciplinary perspectives on communication and information will be key to progress.

Aside from Grudin’s overarching thesis, the book is full of interesting little twists, turns, and origin stories in the history of computing, all told through the lens of interaction. If you’re interested in the history of computing from a research perspective, this is a great entry point to its rich and recent past. I also found it helpful in contextualizing my own epistemologies, my own training, and my interactions with my colleagues at my own Information School. If you find yourself in an interdisciplinary setting, I highly recommend it.

The only critique I’ll make is that the book wanders. It doesn’t wander in a particularly frustrating or unhelpful way. It feels more like wandering through a zoo, constantly pulled forward by an interesting bird or a lumbering primate. By the end, you feel like you’ve seen much of the biodiversity in the world, but you’re not quite sure you’ve seen it all, and it all seems a bit artificial. Maybe it’s not possible to recreate a history faithfully, or in a way that feels faithful. Maybe the best we can do is a menagerie.

How I applied learning sciences to undergraduate design education

I’m no fan of student evaluations. They’re fraught with gender bias, age bias, and all kinds of construct validity issues. They certainly are not good measures of learning outcomes or teaching quality. At their best, they are good indicators of an instructor’s success at creating a coherent, engaging experience, which is important to learning. And engagement is no small feat in a world that increasingly frames colleges as businesses and students as customers, compelling students to constantly question the value of what they’re learning to their career paths.

Since we nevertheless gather student evaluations every quarter at the University of Washington, I do use them to track my own progress at engaging students in learning. And I’ve usually done pretty well on whatever they’re measuring. Take, for example, my Design Methods course, which is basically an introduction to HCI and Design methods for undergraduates. Since I started teaching it about eight years ago, I’ve generally earned anywhere from a 4.0 to 4.6, which is generally considered by faculty to be excellent. At the University of Washington, these scores are the median of all students’ averages across four prompts rated on a scale from very poor to excellent (how was the course, how was the content, how were the instructor’s contributions, and how effective was the instructor at teaching). So my generally high scores mean that most of my students believe I can engage them, believe I can explain things to them, and believe that I have sufficient expertise on design. None of this means I can actually do these things well, but pre-tenure, that was good enough for me.

On sabbatical last year, however, I began to read learning sciences and education research more deeply, partly because I’ve been doing more computing education research, and partly because I wanted to become a better teacher. What I found was that while my teaching was adequate, it was far from ideal. While reading through the book How People Learn, I found countless opportunities to produce better learning outcomes, usually without significantly more effort (and sometimes with less effort!).

The source of most of these opportunities was a simpler, but more robust theory of learning. In essence, I learned from learning sciences that effective, efficient learning requires three things:

  • A clear sense of the knowledge to be taught
  • Deliberate practice of that knowledge (meaning practice with immediate, targeted feedback)
  • Attention, and therefore motivation, directed at that practice

That’s it. I learned that the complexity isn’t so much in learning (humans seem to do that quite naturally) but in setting up conditions that predispose people to learning. Getting students motivated, and therefore attending to practice, is hard. And designing effective deliberate practice is hard, often because we don’t know exactly what we’re teaching or what’s hard about what we’re teaching. It’s also hard to scale targeted, immediate feedback to individual learners.

Given these basics, I spent part of my sabbatical redesigning my Design Methods course to achieve better learning outcomes. Here are a few of the things I did, applying the theory above.

One of the first and easiest things I did was share with students my theory of learning, to frame how I was engaging with them. I taught Carol Dweck’s work on theories of intelligence, explaining that every student has beliefs about where ability comes from, and that those beliefs actually mediate how much people learn. I encouraged them to adopt a growth mindset, remembering that all ability comes from deliberate practice, and that the class would be structured to give them that practice. I also told them that as much as it was my job to structure an environment conducive to learning, it would only happen if they engaged, believed in their ability to learn, and listened closely to the feedback I provided.

Next, I tackled the problem of motivating students. I’ve always had some model in my head of what my undergraduates care about, but that model was always based on a few close relationships with undergraduate researchers, generic surveys, or student feedback in evaluations. None of these worked well in providing substantial insight into what motivates my students. To solve this, I spent the first day of class asking students to write a brief essay in class answering the question “Why are you in college and what does design have to do with it?” Then, rather than reading them privately, I had students share them with each other in small groups, and then construct an elaborate whiteboard diagram of their life trajectories and how design fit into them. What we learned was that because my course was a required course, most had little intrinsic motivation to learn design, but they were curious about it and thought it might be useful. Most also had very concrete life goals, including specific careers, visions for where they would live and how much money they needed to make to live there, and what kinds of friends and family they wanted. For most of them, school was a tool for getting them to those futures.

I used this model of my students’ motivations to shape a third pedagogical practice: at the beginning of every class and every in-class activity, I explicitly stated how I thought the day’s activity would contribute to their life goals. Devising these links was not easy and couldn’t be done in advance; I was constantly updating my model of what was motivating my students so I could come up with a single justification that would work best with the whole class. For example, the day I taught heuristic evaluation, I said something to the effect of: “So we’ve talked a lot about UX designers in class so far and how their general responsibility is to envision seamless user interface designs. Some of you want this job, others of you will be working with UX designers to implement their visions. How will they know if their design is good? And how can they know in just a few days, which is the time scale that many designers have to work at? There’s one method invented back in the 1990s that tried to solve this problem. We’re going to talk about it today, learn its strengths and weaknesses, and discuss when it makes sense to use it.” Note that this kind of justification is essentially the same justification that Jakob Nielsen used in his book Usability Engineering. I just needed to link his motivation to students’ individual aspirations.

Another challenge was in motivating students to learn the declarative knowledge about HCI and Design, such as important methods, concepts, histories, and ideas in design. How could I motivate students to read about these things? In addition to using the same strategy above (simply explaining how it linked to students’ own goals), I designed a series of reading exercises that aimed to be frictionless, but also engaging. Twice a week, students would read a short blog-post length chapter that I personally wrote as an introduction to a subject in design. They were short enough that students would read them, but deeply linked, so that throughout the reading, there were multiple follow-up readings students could do to deepen their knowledge. Then, to motivate students to read them, I held a reading quiz at the beginning of class to verify that they had read it (which had the added benefit of getting them to show up to class on time). I also required a brief summary of a reading of their own choosing, selected from the readings I linked to, or from any other reading, podcast, or video on the web that concerned the same subject. After the reading quiz, students engaged in “think-pair-share”, turning to a few of their neighbors and explaining what they read and what they found interesting about it. Then, after a few minutes of sharing, I asked for students to voluntarily share the most interesting readings they heard about from their peers. In just about 20 minutes of class, we covered a range of readings, many of which were entirely new to me. I had to be ready to rapidly synthesize and relate the topics they raised to the subject of the day, but this kept me engaged as well. It also reinforced every day that I genuinely did have the expertise to be teaching the subject.

After the reading period, we would engage in an in-class activity. I explained to students that our time together was precious, because it was the only time that we could actually do design together (as design is rarely done alone). For each topic, I carefully designed an activity with a very specific form of deliberate practice, always beginning with a justification and ending with a reflection that tied together the practice they engaged in with feedback on what they did right and wrong in their practice. My role in these activities was to facilitate and closely observe so I could provide this feedback. One example of an activity was a 90-minute usability testing activity in which teams of two designed a paper prototype alarm clock interface, designed a task to verify its usability, and conducted a series of usability tests with their classmates. The rules governing this activity were carefully designed to mimic the kind of usability tests that people run in industry, but also to reveal the fundamental scholarly questions behind usability testing (namely, how reliable the knowledge they produce is). I tried to design each of these activities to feel like a game, with some clear notion of the rules and definition of winning, while aligning these with authentic ideas in practice and making their authenticity clear to the students.

The result of these readings and activities was that every day, students got to come to class to share what they learned in their selected readings, learn from each other, and then engage with each other with my help to acquire a skill that would help them get closer to their life goals. Almost all students came on time, excited about class, and many left craving more time to go into more depth (which we never had).

I can say with some certainty (both from student evaluations and my own observations) that students were engaged: my median student evaluation score was a 4.9/5.0, the highest I’ve ever received across eight years of teaching and twenty-five courses, and the highest I’ve ever seen amongst my colleagues. Unfortunately, what I still can’t say is whether they learned any better. We simply don’t know how to measure design skill with any reliability or validity. And so I take it on faith, given what we know about learning, that as a natural byproduct of deeper, more sustained engagement, the students practiced the content I gave them more, and more deliberately.

Now I just have to figure out if it’s the right content! And if students’ perceptions of my teaching skills have anything to do with the quality of their learning. And how to figure out what they’ve learned about design. And a million other unanswered questions about design education!

Assessment is a computing education grand challenge

How do you know what someone knows about computing?

This question is foundational and pops up everywhere. It arises in classrooms, where teachers need to be able to accurately determine what a student has learned, both to help them learn better (through formative assessments) and to establish a record of how well they’ve learned it (through summative assessments). But it also arises in professional settings such as hiring: when an applicant says they “know” Java, what does that actually mean? What is it predictive of? Surely there are better ways for an employer to assess how well someone knows a programming language than self-report or having passed a class at a university. We don’t even know how well these indicators actually predict ability.

Isn’t this just a matter of writing tests? It turns out that writing good tests is very difficult. It’s not enough to write an exam that asks people to define concepts and solve problems. If the wording of the questions is off, people may get the answers wrong even though they know the material, or get them right even though they don’t. These are examples of poor test validity, where the test measures something other than the knowledge one is trying to assess. Some tests aren’t reliable, in that using the test repeatedly produces different results for the same individual in different settings. Reliability issues can arise from ambiguous wording, ill-defined concepts, or poorly constructed definitions of correct answers that lead to unreliable scoring.
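To make the reliability idea concrete, here’s a small sketch (mine, not from any standard instrument) of Cronbach’s alpha, one common internal-consistency estimate; the four-item quiz scores below are purely hypothetical.

```python
import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha: an internal-consistency reliability estimate.

    item_scores is a list of per-item score lists, one inner list per
    question, aligned by test-taker (columns are test-takers).
    """
    k = len(item_scores)
    item_vars = sum(statistics.pvariance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_var = statistics.pvariance(totals)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# A hypothetical 4-item quiz taken by 5 people (1 = correct, 0 = not).
items = [
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
]
print(round(cronbach_alpha(items), 2))  # prints: 0.75
```

A common rule of thumb treats alpha below about 0.7 as too unreliable for high-stakes use, but note that alpha says nothing about validity: a quiz can be perfectly consistent and still measure the wrong thing.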

Making a reliable, valid test is a considerable amount of work. Several of the students from Mark Guzdial’s lab have spent a substantial portion of their time as doctoral students developing reliable, valid tests for measuring how well students can mentally simulate (or trace) the behavior of simple imperative programs (see the FCS1 and SCS1). Even after their rigorous efforts, these assessments are hard to reproduce and sensitive to overuse, making it difficult to scale these efforts to other concepts or other languages.

The implications of unreliable, low validity tests can be severe. Bad tests in introductory programming classes can fail students that actually know quite a lot or pass students that know quite little. This poor signal can trickle down to employers, who might use courses, grades, and other credentials as an indicator of ability. And because tests are garbage-in, garbage-out, all of this happens without a teacher or employer ever really knowing, producing a garbled, sometimes overconfident sense of what students know.

I’ve seen these problems as a student myself. I remember graduating back in 2002 with my undergraduate degree in CS, with many of my high-performing peers admitting that despite all of their high grades, they still couldn’t sit down in front of an empty code editor and write a program to solve a problem. Sure, they solved lots of problems in class with the help of peers, TAs, and highly scaffolded assignments within the scope of problems their teachers had discussed. But they often didn’t know why their solutions worked. I remember getting partial credit for regurgitating partial solutions that I really didn’t understand, resulting in inflated grades that miscommunicated the level of my understanding.

Does it matter that developers understand the code that they write if the code still works? If correct code stayed correct and code could be “correct enough,” this might not matter. Unfortunately, correctness matters: programs receive unexpected inputs and developers have to debug, and developers can only do this well with a deep understanding of the semantics of a program’s execution. Moreover, this deep understanding likely would have prevented some of these defects from occurring in the first place.

All this said, there are some people who obviously develop a deep, nuanced, accurate understanding of computation. These people are our best programmers, our computer science faculty, and others who’ve likely devoted their life to eradicating every misconception about computation from their mind through incredible amounts of deliberate practice. I suspect these individuals aren’t confined to the limits of assessment because they’ve learned to self-assess their knowledge. In fact, computing might be unique in that people can actually test their understanding of computation by carefully probing program behavior, using the computer itself as a source of feedback about their understanding. Perhaps this is how people are able to develop robust understandings of computation despite the failures of assessment. This might also explain why CS teachers appear to believe that some students “get it” and some don’t: what’s really going on is that some have an insatiable curiosity about how computers behave, and use that to fuel a limitless quest for more robust knowledge of computing.
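As a small illustration of that kind of probing (my example, not one from the original text): a learner who suspects they misunderstand Python’s mutable default arguments can write a tiny program, predict its output, and let the interpreter confirm or refute the prediction.

```python
# Probe a suspected misconception: is a mutable default argument
# re-created on every call, or created once and shared across calls?
def append_item(item, bucket=[]):
    bucket.append(item)
    return bucket

first = append_item(1)
second = append_item(2)

# A common (wrong) prediction: first == [1] and second == [2].
# The interpreter corrects it: the default list is created once,
# so both calls mutate (and return) the same object.
print(first, second)    # prints: [1, 2] [1, 2]
print(first is second)  # prints: True
```

The point isn’t this particular pitfall, but the habit: every surprising output is immediate, targeted feedback on one’s own mental model, which is exactly the deliberate practice that formal assessments struggle to provide.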

Because knowing what computing knowledge is in someone’s head is so hard and so important, I believe it’s a grand challenge of computing education research. If we don’t discover reliable, valid, scalable, replicable ways of knowing what people know about computing—or find a way to give more people an insatiable curiosity about computing—we’ll continue to overlook deficiencies in knowledge, producing defective, unreliable code. It’s up to researchers to make these discoveries and up to society to fund them.

Two truths

My first presidential election as an eligible voter was back in 2000. I was one of those annoying Nader supporters who found Gore boring and soft, and preferred Nader’s rage. My feelings on Bush, of course, were a different matter entirely: he seemed stupid, feckless, ignorant. He couldn’t form coherent sentences. Most of all, his disinterest in the truth was frustrating and discouraging. How could I vote for a man that willfully ignored reality?

The years and the wars and the lies dragged on. Americans died, the country split, and cable news helped. Truthiness came to life. Republicans got better at twisting reality into a story that fit their goals, and using words to hide reality: the Clean Air Act, No Child Left Behind, Mission Accomplished. It was 1984 in 2004, hiding lies behind propaganda, falsehoods as reality.

My generation saw Obama as the answer. He talked about the hard truths and what we might do about them. He acknowledged the messy complex realities of our country and sought to implement pragmatic, incremental remedies. The Affordable Care Act was a pure expression of pragmatism: not quite everything we wanted, but a bit better, with a bevy of little changes that aimed to make things a little bit better for some people. Not great, but better. The incrementalist in me swooned.

Of course, all this time, there was another truth, a competing truth, that writhed under Obama’s rule. This truth was an account of the world based not in science, evidence, or logic, but grounded in emotion, experience, and faith. These were truths that accepted science when it was compatible with faith, rather than accepting faith when it was compatible with science. This was an America that was tired of the elites—the economists, the scientists, the secular urban progressives—who claimed truth as their own and rejected anyone with a different epistemology as not only wrong, but also bigoted, ignorant, and backwards.

The country segregated itself by these truths, with the secular elite seeking diversity, inclusion, and progress in the cities, and the rural faithful seeking homogeneity, privacy, and stability in the countryside. We sorted ourselves: not only geographically and economically, but epistemologically. And when the housing bubble burst, it was the isolated rural who hurt most, losing not only their fragile local economies, but also the small trickle of wealth from the growing urban centers upstream. Rural America watched urban America only grow wealthier and more powerful. The secular truth became a cause of suffering, and the religious truth the only cure.

The segregation of urban and rural America, and the segregation of the truths that came with it, left rural America voiceless. The centers of media and journalism were in the cities. With newspapers’ declining revenues going to Silicon Valley, there was even less reason to drive to the country and report on rural America, especially as cities grew and became the center of American vitality. Rural America was not only abandoned by the economy, and by the elites, but by their only remaining voice in the public sphere, the media. Alone, abandoned, and isolated, a justified hate of cities, of sciences, of progress, and of the media festered.

Trump did not cause this. He exploited it. He spoke to a rural America that had been ignored for years and promised to restore everything that had been lost over the past twenty years. He described a truth that everyone in rural America knew: America was falling apart, or at least their America was, and no one else seemed to know it, not the media, not Democrats, and not even Republicans. Something had to be done to restore it.

Who should it be? Certainly not the establishment. The only way to solve a problem is to accept that it exists, and Trump was the only one who did. The lies, the bluster, the hate, the insults, the misogyny: none of these were desirable things. They certainly weren’t Christian behavior. But when it comes down to restoring faith and restoring livelihood, the latter has to come first. Faith is for the fed.

All the while, me and my secular urban elite friends were oblivious. The economy was growing (in cities), fewer people were in poverty (in cities), more people had jobs (in cities), more people had health insurance (in cities), and violent crime was down (in cities). America’s migration to urban centers combined with the steady improvement of cities masked the decline of small American towns. Our aggregate statistics obscured the opposing forces of our economy. The very tools of our secular truth failed us, while our most human senses, our emotions, were blotted out by distance.

Truth matters. It still does. But more importantly, all truths matter. The truths we discover with our minds, but also the truths we discover with our hearts. Our secular urban methods of science and data can only see part of reality because we only answer the questions we ask. We didn’t ask what was happening in rural America. No one did.

Now we know the answer. And now we have to accept that hidden inside our economic recovery was a tragic economic decline. And if we accept the scientific reality behind this decline, we’ll know that it was scientific and technological progress that caused it, centralizing, automating, and digitizing human activity to a degree that place and people no longer mattered, just information. Secular urban progressives robbed rural America of its vitality with science. And now it’s the secular urban progressives who must restore it, making rural America great again.

My SPLASH 2016 Keynote

Above is a practice version of my SPLASH 2016 keynote. If you don’t want to watch the whole 40 minutes, you can read my slides (100 MB of images!). If you don’t want to read the slides, here’s a super condensed version of my argument:

  • The mathematical view of programming languages is powerful and productive
  • However, that view is also narrow, limiting the research questions that we ask
  • Moreover, it limits who participates in computing, because the dominant culture of computing only projects interest in the mathematical view.
  • If embracing other views is important for discovery and equity, what other views exist and how can we explore them?
  • These views include PL as power, interface, design, notation, language, and communication, but also other surprising lenses such as glue, legalese, infrastructure, and even a path out of poverty.
  • These views, however, also embody values, meaning that by investigating these metaphors for programming languages, we also embrace new values.
  • My work has explored many of these metaphors, including interface, notation, and communication, to great effect.
  • I suggest that every PL researcher consider these new views, but also accept them as valid alternative perspectives for PL research.

The response to the keynote was quite positive! People found the ideas interesting, provoking, thoughtful, and in a few cases, brilliant. It was such a privilege to have the attention of so many great programming language and software engineering researchers. I hope I’ve given them a few tools and ideas about how to consider their future work, and perhaps reconsider their past work.

What does $600K in NSF research funding buy?

Eight years into my career as a professor, I finally wrote my first NSF final report (this would have come earlier, but taking two years of leave to do a startup led to several no-cost extensions). Because science and technology have been under political attack in the United States for quite a while now, this seemed like a great time to step back and consider what NSF funding actually buys America.

The project was funded out of the now canceled “Computing Education for the 21st Century” program at NSF. With my collaborators Margaret Burnett and Catherine Law, I wrote a proposal to investigate whether framing programming as a game would equitably engage learners in more productive learning than other approaches to learning to code. We were one of several teams to be funded, in the amount of $600,000 for three years of research.

Here’s what we did with that money:

  • We designed, implemented, and deployed the Gidget game
  • We designed, built, and deployed the Idea Garden into the Gidget game
  • We designed and built the Idea Garden for JavaScript and the Cloud9 IDE
  • We designed, built, and evaluated a Problem Solving Tutor (not yet published)
  • We designed, built, and evaluated a Programming Language Tutor (not yet published)
  • We studied the role of data representation on engagement
  • We studied the effect of in­-game assessments on engagement
  • We studied the effect of Gidget on attitudes toward learning to code
  • We studied the role of the design principles incorporated into the game on learning
  • We studied the learning gains in the game relative to two other learning paradigms
  • We studied the role of the Idea Garden in engagement and learning
  • We studied the role of self­-regulation in programming problem solving
  • We studied the effect of self-­regulation instruction on programming problem solving
  • We studied the effect of the problem solving tutor on problem solving productivity (not yet published)
  • We conducted a pedagogical analysis of online coding tutorials (to appear)
  • We held four summer camps in Oregon and Washington based on Gidget for high school students, reaching over 80 rural and female high school teens.
  • We held four Gidget open house sessions during CS Education Week (2013­-2016)
  • We held a one day workshop with 25 Native American teen girls.
  • We taught an Upward Bound Web Design course to 11 diverse high school students.
  • We disseminated the results to several Microsoft product teams building computing education products.
  • We served on advisory boards to two NSF­-funded computing education teams
  • We served on a panel on Women in Computing at the Educause conference.
  • We served on a panel on Women in HCI at CHI.
  • We held a workshop at CHI about gender-­inclusiveness issues in software.
  • We disseminated results to
  • We created an extensive Computing Education Research FAQ.
  • We attended a Dagstuhl workshop on assessment in computing education.

Across all of these activities, we taught over 10,000 people how to code via Gidget, ranging from ages 13-80, half of them girls and women. We trained 4 post docs, 6 Ph.D. students, 2 masters students, 18 undergraduates and 6 high schoolers in how to do computer science and computing education research. We produced Gidget, an online game for learning to code, that will be available for at least the next decade as a public resource. We published about 20 research papers, contributing an evidence base for better teaching and learning of computing through online learning technologies.

Is all of this worth $600K? One way to judge this is to quantify how much all of this education cost. If we look just at the 36 students we mentored, the public spent an average of $16K per student to teach them rigorous research skills. If we include the 10,000 players of Gidget, plus the future players of Gidget over the next decade, the public spent an average of $2 per player to teach them a bit about computer programming and potentially engage them in future learning. From this perspective, the grant was one big education subsidy, promoting the development of highly skilled STEM workers.
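For what it’s worth, this back-of-envelope arithmetic can be sketched in a few lines of Python. Note that the 300,000-player total below is my assumption, back-derived from the $2-per-player figure; the post itself only states 10,000 players to date:

```python
# Back-of-envelope cost-per-learner estimates, using the figures in the post.
grant = 600_000          # total grant, in dollars
students_mentored = 36   # students trained in rigorous research

per_student = grant / students_mentored
print(f"${per_student:,.0f} per mentored student")  # -> $16,667

# Hypothetical: a $2-per-player average implies ~300,000 total players
# once a decade of future Gidget players is included (600K / 300K = 2).
projected_players = 300_000
per_player = grant / projected_players
print(f"${per_player:.2f} per player")  # -> $2.00
```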

Another way to judge this is to anticipate the future impact of this training. Many of the Gidget players will be more likely to pursue STEM education because of playing the game (according to our research), which may have a net impact on the growth of the economy. Of the 36 students we trained in research who have graduated, many are faculty, UX researchers, and software engineers, filling much-needed jobs in industry. The downstream impact of all of this training may be to fill unmet needs in the economy, allowing it to grow more efficiently.

Of course, the other important way to assess the return on investment of the work is to predict the long-term impact of the knowledge we produced. We’ve already disseminated the work to, which is reaching tens of millions of learners in high school through their curriculum. Our research essentially serves to ensure that the learning those students are already doing is more effective than it would have been otherwise. That ultimately means a better, smarter, more effective STEM workforce in the future, which in turn impacts the growth and productivity of the U.S. economy.

Across the 243 million U.S. taxpayers, each contributed a median of about 1/10th of one penny for this research to happen. What’s the return of that fraction of a penny? Will every American be more than 1/10th of a penny richer in 20 years because of our work?
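As a sanity check on that fraction-of-a-penny figure, the mean contribution is easy to compute from the post’s numbers (the post cites a median, which would be lower than this mean, since tax contributions are skewed toward high earners):

```python
# Mean per-taxpayer cost of the grant, using the figures in the post.
grant = 600_000            # total grant, in dollars
taxpayers = 243_000_000    # U.S. taxpayers cited in the post

mean_cents = grant / taxpayers * 100
print(f"{mean_cents:.2f} cents per taxpayer")  # -> 0.25 cents
```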

Clearly, this exercise in trying to model and predict the impact of science funding is hopelessly fraught with reductive ideas about science. It even plays into the framing that House Republicans have used to attack science, accepting the premise that NSF is an investment in America, as opposed to something more idealistic, such as the betterment and survival of humanity. But in reflecting on all of these activities and the actual impacts they’ve had on the world already, I find the sheer scale of potential impact to be compelling in its own right and well worth the price. I can’t wait to see in 10 years what these impacts might be!

The invisibility of prior knowledge

When you watch an Olympic sprinter run 50 yards in 5 seconds, what’s your first thought?

  1. That must have taken an incredible amount of practice, or
  2. Wow, that is some incredible DNA.

Now, we know both nature and nurture matter. But in watching sprinters, we see nurture matter because we can see sprinters practice. Olympics broadcasts show us hours of practice. We see their coach. We know that Nike has sponsored their thousands and thousands of hours and hundreds of pairs of shoes. We know that as much as someone might be born with a genetic head start, the only way to really get to the top is to practice more and better than anyone else in the world.

And yet, for other kinds of human ability, people rarely, if ever, consider the role of practice, instead attributing ability to genetics. This is especially true in software. People assume Bill Gates must have been a genius. The news frames Mark Zuckerberg as a boy prodigy. Hacker culture of the ’90s, and to a large extent still today, divides people up into “real” coders and posers, treating computing as if it is something natural, innate, inborn, and gifted to a privileged few.

The reality, of course, is that the majority of variation in ability in computing and every other field is due to practice, not genetics. As K. Anders Ericsson found over years of study, most of the variation in expert performance is explained by how well and how much people practice a skill. Coding (clearly) isn’t something people are born knowing how to do, nor is it likely something people are born with a predisposition for. It is something people learn, and it is our experiences, our other skills, and our environment that develop and sustain the motivation to learn, and likely our predispositions to learn.

Why do people gravitate so easily to theories of ability grounded in genetics rather than practice? I think it’s because practice, and in particular, the prior knowledge that practice produces, is invisible. When you meet someone, you can’t see what they know, how well they know it, how many years they’ve been practicing it, or how well they’ve been practicing it. In fact, even when scientists like myself try really, really hard to measure what someone knows, we struggle immensely to reliably and accurately capture ability. It’s really only in a narrow set of domains, like sports, where we’ve created elaborate systems of structured measurement in order to quantify ability.

This invisibility of prior knowledge, and the attribution of ability to innate qualities rather than practiced skill, has many consequences throughout software engineering and computing education. When a company tries to hire someone, it has very weak measurements of what an engineer knows, and has to rely on pretty pathetic coding tests that likely correlate little with actual skill. Worse yet, when a CS1 teacher gets a classroom of new students, they often attribute success in the class not to the quality of the practice they have provided to students, or to the vast differences in practice that students engaged in prior to class, but instead divide students up into those who “get it” and those who don’t. Because hiring managers and teachers can’t see into each person’s mind, they can’t comprehend the vast complexity of prior knowledge that shapes each individual’s behavior.

Because of these challenges, measuring knowledge of computing really is one of the most pressing and important endeavors of computing education research. We need robust, precise instruments that tell us what someone knows and can do. We need the decathlon of coding, helping us observe ability across a range of skills, each event finely tuned to reveal the practice that lurks beneath someone’s cognition. Only then will we have any clue how to support and develop skills in computing and know that we’re doing it successfully.

So far, the closest thing to this in computing education research is the series of language-independent assessments of CS1 program tracing skills that have come out of Mark Guzdial’s lab. These represent great progress, but we need so much more. My former Ph.D. student Paul Li did a study of software engineering expertise, finding dozens of attributes of engineering skill, none of which we know how to adequately measure. Just as lenses revolutionized astronomy and biology, we need instruments that allow us to peer into people’s computational thinking to revolutionize the learning of computing.

Ready to help? Come do a Ph.D. with me or the dozens of other faculty in the world trying to see invisible things in people’s heads. Let’s figure out how to transform humanity’s understanding and utilization of computing by seeing what people know about it.

A defense of sabbatical

This is my last day of sabbatical. I should be preparing for class next week, drafting the sections of that grant I’m helping on, writing meta reviews for that conference, and finding an instructor for that masters program I chair. But the only thing really on my mind is how wonderfully pivotal the last 6 months of paid professional leave have been to my role as a researcher and a teacher.

Sabbatical in academia has a long and turbulent history. It’s not really in great shape right now. Many private universities still guarantee a full year of sabbatical to tenure-track faculty, but others have abandoned it entirely, or have eroded it so heavily that it’s given out sparingly. My university is somewhere in the middle: I can get two-thirds of my salary for 9 months every 7 years if I’m in good standing. That’s pretty solid, and our university is committed to keeping it that way.

But is it really worth it? Isn’t sabbatical just a year of paid vacation? Why should a university, especially a public university, release professors from teaching and service, only to have them sit on a beach for 9 months pretending to think hard about research? What does the world get for this investment?

There might be a few beach sabbaticals, but most of my colleagues don’t do that. Mine certainly wasn’t like that. Instead, mine was open time to chart the course of the next 7 years of my professional life in research, teaching, and even service. This is a huge privilege—who in industry gets to do that?—but I also think it’s essential to the job.

Here’s why. As a scholar, my job is to think about the coming decades of human civilization. My discoveries should stand the test of time. Some of them might not even be relevant for a couple of decades. And the discoveries that I teach students should be robust to change as well, preparing students not just for today, but for the next two to three decades of their careers. Even service, which is about running the university and the global research enterprise, can be about the future, investing in new academic programs, improvements to peer review, the efficiency of federal funding, and the political forces that cause it to rise and fall. Every single one of these jobs has a time horizon of at least 1 year out, if not 5, 10, or even 100 years out.

Most of faculty life leaves no room for this type of thinking. Getting that grant, publishing that paper, advising that doctoral student, teaching that class: there is so much detail in each of these that the long view on discovery and learning is a mere backdrop to the daily work. Sabbatical is a time to deal directly with these longer-term concerns, framing and structuring a professor’s next 6 years of work.

When I started my sabbatical this past January, that’s how I set out to use it. My highest-level goal was to figure out what my mission would be for the next six years and how I would accomplish it. My lower-level goals involved choosing the work I’d do this year to prepare me to accomplish my mission. I took 6 months of leave at 75% salary, plus I had my three months of summer. I funded the rest of my salary from our NSF grants.

The mission I chose was this: advance and mature what we know about computing education to prepare the world to understand the disruptive changes coming from computing. I wrote about this on this blog, so I won’t go into detail about it here. Instead, I want to share the results of my sabbatical year, to give you a sense of how I spent all of this tuition and taxpayer subsidy. Here we go:

  • I wrote a retrospective on my three years as a startup CTO and cofounder (in review soon!). This has transformed how I understand the software industry, how I teach about it, and how I see my efforts in computing education relating to industry.
  • I wrote an NSF proposal, framing my lab’s work on how to teach programming problem solving. Even though it was rejected, it’s helped us map fundamental questions about the nature of problem solving in programming and how to teach it.
  • I attended a week long Dagstuhl workshop on computing education research, deepening my relationships with researchers in this growing field. That week was pivotal in connecting me to the small world of computing education researchers, which has empowered me to become an advocate for the field in other areas of HCI, software engineering, and policy making.
  • I started several new student projects, spanning evaluations of coding tutorials, programming language tutors, and problem solving tutors, investigations into coding bootcamps, and studies of equity in computing education. I think we’re doing some really exciting, powerful work, and I can’t wait to share it with the world in the coming years.
  • I recruited new Ph.D. students to my lab from computer science, information science and education, developing a new pipeline.
  • I wrote a now widely trafficked FAQ on computing education research, which has been important not only in helping students see a path to participating in research, but also in helping my colleagues in HCI, Software Engineering, and Programming Languages understand its importance.
  • I recruited three undergraduate researchers to my lab to support the new projects, mentoring them on graduate school, research, and software engineering careers.
  • I redesigned my faculty website to better convey my focus, contributions, and impact in research for the next seven years.
  • I developed new collaborations with our new iSchool faculty Jason Yip, Katie Davis, and Negin Dahya, learning about new research areas of learning science, identity, and education. This has stretched my expertise, teaching me not only about new foundations of learning and education, but it has made me a more effective and inclusive teacher.
  • I partnered with a large group of computing education researchers to plan a big NSF project on advanced learning technologies for computing education. Here’s to hoping we get the funding to fuel those disruptive innovations I mentioned above.
  • I started working with Richard Ladner’s AccessComputing project to further equity in access to computing education amongst people with diverse physical and cognitive abilities. This connected me to yet another research community of accessibility researchers, and led to some exciting work by my student Amanda Swearngin on web accessibility that has the power to bring the web to everyone regardless of physical ability.
  • I read and wrote a lot about privilege, searching for the underlying privileges in my own life that allowed me to succeed academically. This has been emotionally draining, but empowering, helping me to see my role as a teacher from a more structural, systems view, and making me more excited about the leadership roles I’ll inevitably take on as senior faculty.
  • I attended CHI 2016 in San Jose, reconnecting with the research community after two years of startup life, reminding me of how wonderfully diverse and interdisciplinary our community is.
  • I taught my first high school computer science course to understand more about the challenges of teaching CS electives in schools, and formed mentoring relationships with 11 South Seattle teens that I hope will reshape their paths toward college.
  • I went to the Snowbird conference to advocate for computing education research, connecting with hundreds of chairs and deans. This connected me to dozens of leaders at universities around North America, but also gave me a larger view of the barriers that computing education research will face in maturing.
  • I went to ICER 2016 and strengthened my ties with the growing computing education research community, finding partners in my long-term mission.
  • I connected with, Microsoft, and Google efforts in computing education, disseminating my research and others’ to their product design and policy efforts.
  • I redesigned my design thinking course, INFO 360, to incorporate everything I learned about learning this year. It’ll be a better class, while taking less time to teach. It will also be a solid foundation for the other sections of the course, and perhaps HCI courses around the world.
  • I got married, went on a beautiful honeymoon to Croatia and Slovenia, and bought a house in Seattle’s crazy housing market. Don’t worry, no public dollars in any of those.

What’s the ROI of my six months of two-thirds salary to the university and the public? Was all of the above worth ~$80K in salary, benefits, and guest faculty in my absence? I think so. In the short term, I mentored and taught dozens of students and made important discoveries that have already impacted efforts at, Microsoft, and Google. And like I claimed above, this is a long-term investment. Because I had this sabbatical, in 10 years, you’ll start to see more effective and inclusive teaching of computer science, which means a more computing-literate humanity and a more effective workforce of software engineers. I hope this will mean a better, safer world that creates computing infrastructures and institutions that reflect all of us, rather than just the privileged few. I predict that the $80,000 the university spent on my time will easily return at least 10x in economic productivity over the next decade.

Yes, not every sabbatical is like this. There are some faculty who take extended vacations and are paid a small portion of their salary to do so. Sometimes, rest is what busy professors need to be great researchers and teachers. That said, I think that every sabbatical has the potential to be extremely valuable to society, and it’s a professor’s responsibility to make it so.

Disagree? I want to hear from you!

ICER 2016 trip report

ICER 2016 (the ACM International Computing Education Research conference) just ended a few hours ago and I’m enjoying a quiet Sunday afternoon in Melbourne, reflecting on what I learned. Since you probably didn’t get to attend, here’s my synthesis of what I found notable.

First, a meta comment. As I’ve noted in past years, I still find ICER to consistently be the most inclusive, participatory, rigorous, and constructive computer science research conference I attend (specifically relative to my multiple experiences at CHI, UIST, CSCW, ICSE, FSE, OOPSLA, SIGCSE, and VL/HCC). There are a few specific norms that I think are responsible for this:

  • Attendees sit at round tables and after each talk, discuss the presentation with each other for 5 min. After, people ask questions that emerged. This filters out nasty and boring questions, but can also lead to powerful exchange of expertise and interesting new ideas. It also forces attendees to pay attention, lest they lose social capital from having nothing to say.
  • Session chairs explicitly celebrate new attendees, first-time speakers, and first-time authors publicly, creating a welcoming spirit for newcomers.
  • The program committee regularly accepts excellent replications, creating a tone of scientific progress rather than a cult of identity.
  • The conference gives two awards: one for rigor and one for provocative risk taking, incentivizing both kinds of research necessary for progress
  • The conference has a culture of senior faculty as mentors to the community, not to just their students. All doctoral consortium all the time.
  • The end of each conference is an open session for getting feedback on research designs for ongoing work and new ideas, creating a community feeling to discovery.

There are always things to improve about a conference, but most of the things I’d improve about ICER are things that other conferences should do too: shorter talks, more networking, move to a revise and resubmit journal-style model, allow authors of journal papers to present, and find ways to include authors of rejected work.

Now, to the content. The program was diverse and rigorous. A number of papers further reinforced that perceptions, stereotypes, and beliefs are powerful forces not only in engagement in computing education, but also in learning. Shitanshu Mishra showed that this is true in India as well, where culture creates even stronger stereotypes around the primacy of computer science as the “best” degree to pursue. Kara Behnke provided some nice evidence that AP CS Principles, with its focus on the relationship between CS and the world, powerfully changed these perceptions, and reshaped the conceptions of computing that students took into their post-secondary studies, whereas Sabastian Ericson talked about the role of 1-year industry experiences during college in students’ reframing of the skills they were acquiring in school. This year’s best paper award went to Alex Lishinski et al., who provided convincing evidence that self-efficacy is reshaped by course performance, and that this reshaping occurs differently for men and women, and for exams and projects.

Many papers investigated problem solving through different lenses. Our own paper explored student self-regulation, which we found was infrequent and shallow, but predictive of problem solving success. Efthimia Aivaloglou presented an investigation into Scratch programs, replicating prior work that showed that most Scratch programs are simple, lack much use of programming constructs, and lack conditionals almost entirely, suggesting that the majority of use lacks many of the features of programming found in other learning contexts. Several also presented more evidence that students don’t understand programming language semantics very well, learn programming language constructs at very different rates (and sometimes not at all), but that subgoal labeled examples and peer assistance can be powerful methods for increasing problem solving success and learning.

My favorite paper, and the paper that won the John Henry award for provocative risk-taking in research, was a study by Elizabeth Patitsas et al., which gathered convincing evidence from almost two decades of intro CS grades at UBC that despite broad beliefs among faculty that grades are bimodal because of the existence of a “geek gene,” there is virtually no empirical evidence of this. In fact, grades were quite normally distributed. Elizabeth also found evidence that the more teachers believed in a “geek gene,” the more likely they were to label distributions as bimodal (even when those distributions were not bimodal, statistically). This is strong evidence that not only do students’ prior beliefs powerfully shape learning, but teachers’ prior beliefs (and confirmation bias in particular) powerfully shape teachers’ attitudes toward their students.

The conference continues to raise the bar on rigor, and continues to grow in submissions and accepted papers. It’s exciting to be part of such a dedicated, talented community. If you’re thinking about doing some computing education research, or already are and are publishing it elsewhere, consider attending ICER 2017 in Tacoma next year, August 17-20. My lab is planning several submissions, and UW is likely to bring a large number of students and faculty. I hope to see you there!

Textbooks are awesome

I’ve been writing a lot about big ideas like research, policy, and expertise lately. Today, I’d like to take a step down from the big ideas and lightheartedly discuss something real, tangible, and ubiquitous that I love: textbooks.

Yeah, those big, heavy, out-of-date, expensive printed books that contain a substantial portion of all of the human knowledge ever discovered. Not those e-books, not these e-textbooks, and definitely not websites masquerading as textbooks. I’m talking about the pile of books in millions of students’ bags, desks, and bookshelves.

Awesome? How are textbooks anything but terrible, let alone awesome? Let me count the ways:

  • You can read at your own pace. Slides, videos, and other time-based media out of your control are a pain to navigate. Textbooks move at exactly the pace you want to move.
  • You can see everything all at once. All the content is there, always. There are no animations, transitions, or other segmentation of content that you have to wait for, and so you can browse it.
  • You can access any piece of content at any time. Want to skip ahead to chapter 5? Go for it! No need to wait for the professor to get to week 7 of class, for the e-book to “unlock” the next section, or for that slide transition to finish animating. Satisfy your curiosity now.
  • You can memorize the location of any content. No need to recall which day a teacher discussed an idea and ask for their slides. No need to search for that video you remember on Khan Academy and try to find that segment that was particularly instructive. In fact, there’s probably still a stain from that meatball sub on the page that will give you a subtle cue of the places you’ve already been and what content it was related to.
  • The screens are HUGE. Some of these buggers are up to 16″, which is basically like having a large laptop to view your content.
  • The screens are very high contrast. You can read in pretty much any light except for no light. There’s no glare. And that brings me to…
  • The battery life is infinite. Crank that screen brightness up all the way, cuz this thing will last forever. (Unless you drop it in the toilet).
  • The information density is off the charts when compared to slides, handouts, and whiteboards. You can fit text, images, diagrams, commentaries, citations, and a million other kinds of content that these other media are terrible at supporting.
  • You can zoom into content by just moving your head. No buttons, no keyboard shortcuts to memorize, no awkwardly standing up in the middle of class to get closer to the projector screen. It’s like a VR headset: you just move your body and it works.

Yes, the Internet has some advantages. It has a lot more information and you can carry it around more easily. It can also be updated more easily, which is handy, since science and knowledge are always evolving. But isn’t a textbook plus a smartphone even better? Imagine an augmented reality textbook application that allows you to see edits from newer editions or interactive content in diagrams. Imagine social commentary on textbooks for your current page. Imagine scanning a citation and seeing the research paper the statement was based on.

Who does research on these things? Where’s my augmented reality textbook app? Do any of these features better promote learning? Where is my textbook hoverboard?