What five years of early career research funding buys the world

Whenever I close out a grant, I like to reflect on what I achieved with the money. Well, to be clear, NSF likes me to do that too, in the form of a final report of project outcomes. And as it should be: the average American gave me a tenth of a penny to do some research. What did it buy them?

This particular grant was my CAREER award, granted in 2009. This grant is given out to a select few faculty each year who “have the potential to serve as academic role models in research and education and to lead advances in the mission of their department or organization.” Really though, it’s an award given out for important research by new faculty.

In my final outcomes report, I described my work like this (note that these reports are intended for the general public and aren’t supposed to require any expertise to understand):

When software companies release software, there are only a few ways for them to learn about problems that users experience. They can wait for users to report problems, which leads to large amounts of unstructured text that is difficult to aggregate and analyze. They can also automatically monitor for easily detectable problems such as crashes, errors, and performance issues. The broad set of usability and usefulness issues that arise, however, are difficult to monitor and aggregate, making it difficult for teams to improve software for users.

I then went on to summarize the discoveries and impact of the work:

Across the seven years of the project, we made numerous discoveries about this problem. We learned how developers, designers, and product managers evolve software, finding that many ignore feedback that comes through technical support channels, that feedback from users often comes from highly technical users, and that when developers do engage with user feedback, they often view it as an irrelevant minority opinion. We also found that when developers discuss these issues, they tend to ignore evidence, relying instead on anecdote, speculation, and hyperbole. We also discovered that the most expert software engineers are more rational and evidence-based in their decision making and assessment of feedback, relying on objective data sources to inform their product decisions. However, we also found that expert engineers require substantial interpersonal skills to persuade less experienced developers who rely on less objective decision-making practices.

We invented many approaches to address these problems. One was a way for users to request help while using software without having to express their problem. It dynamically creates a repository of frequently asked questions, predicts which questions a user will have based on their context, and provides structured data to software teams about which questions users have and where. This data can then be used to make more evidence-based decisions about how to improve software. In addition to this, we invented new algorithms for mining software feedback from technical support forums and for automatically detecting usability problems in software without even having to release software.
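
To make the question-prediction idea concrete, here is a minimal sketch of context-based FAQ ranking, assuming a simple bag-of-words similarity between a page's visible text and previously asked questions. The page names, questions, and scoring are my own invention for illustration, not a description of the actual product:

```python
from collections import Counter
from math import sqrt

# A hypothetical FAQ repository: (page where the question was asked, question text).
faq = [
    ("checkout", "Why was my credit card declined?"),
    ("checkout", "Can I change my shipping address after ordering?"),
    ("pricing", "Is there a discount for annual billing?"),
]

def bag_of_words(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def predict_questions(page_id, page_text, top_n=2):
    """Rank stored questions by whether they were asked on this page and by
    how similar their wording is to the page's visible text."""
    context = bag_of_words(page_text)
    scored = []
    for source_page, question in faq:
        score = cosine(context, bag_of_words(question))
        if source_page == page_id:
            score += 1.0  # strongly prefer questions previously asked on the same page
        scored.append((score, question))
    return [q for _, q in sorted(scored, reverse=True)[:top_n]]

print(predict_questions("checkout", "Enter your credit card and shipping address to complete your order"))
```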

Were all of the facts above worth the $600K that I received over 7 years (including 2 years of “no cost extensions” while I was on leave)? When they’re summarized as they are above, it’s hard to judge, since facts alone probably aren’t the most valuable thing to anyone in the general public—they’re more useful to us academics trying to build larger truths about software engineering. The questionable value of intermediate scientific discoveries is why NSF also requires reports to describe “broader impacts”. I described mine like this:

We disseminated this work in diverse ways. We co-founded a software startup called AnswerDash that sells the help technology, raising venture capital and to date creating dozens of jobs, while increasing the sales of numerous companies and indirectly creating more jobs. At the time of this grant’s expiration, over 10 million people have used the product to seek help. We also shared our discoveries through multiple articles in the popular press, through a webinar reaching over 30,000 software engineers, and through a podcast reaching over 10,000 software engineers. The PI also developed a new software engineering course and wrote a free online book to support the course, which summarizes the forty-year history of research on human aspects of software engineering. The grant also supported the professional development of the PI, directly supported the research of four doctoral students (two of whom are now faculty), and trained over a dozen undergraduates in research, several of whom pursued graduate degrees.

Tech transfer? Teaching 40,000 software engineers? A new course and a new textbook? Those are pretty good, right?

Then there are the things that the general public wouldn’t really care about at all, but that I care about as an academic:

  • The grant supported the general research infrastructure at the University of Washington, including buildings, electricity, staff, and other expenses associated with the research. This is called “overhead”, and while it’s generally supposed to cover research-related expenses, it supports highly coupled resources like buildings, which inadvertently also support the educational mission of the university.
  • My lab published 25 papers with the funding, spanning HCI, Software Engineering, and Computing Education venues. Four of those papers received best paper awards.
  • These papers have already been cited over 350 times by other researchers in the world, impacting the ideas and directions of other researchers.
  • I was invited to give my first keynote at SPLASH 2016, which challenged me to think bigger about programming languages and equity.

Most importantly, because I wasn’t spending as much time fundraising all of these years, I was able to focus on becoming a better teacher, a better researcher, a better mentor, and a better leader. Without the support of the CAREER grant, there’s no way I’d have achieved the level of success and impact that I have at this point in my career. And there’s no way I’d be in a position to resume frantic fundraising now without failing at my teaching, mentorship, and leadership duties. Because of the grant, I’m a more productive, effective, prolific, and impactful public intellectual, which ultimately helps the hundreds of students I teach every year be more productive, effective, and impactful people.

All that cost the average American a tenth of a penny (and given our current tax brackets, more like a penny for upper-middle-class Americans and basically nothing for everyone else). Is the world that much better for its investment?

In this case, clearly yes: my co-founders (my colleague Jake Wobbrock and our former Ph.D. student Parmit Chilana) and I convinced a venture capitalist to invest $2.54 million in a local U.S. company that created more than two dozen jobs instead of investing somewhere else in the world. Even if you don’t care about anything above except for direct financial returns, that’s a $1,940,000 profit on a $600,000 investment—a 323% return!
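
For what it’s worth, that return figure is just the venture investment net of the grant, expressed as a fraction of the grant:

$$\$2{,}540{,}000 - \$600{,}000 = \$1{,}940{,}000, \qquad \frac{1{,}940{,}000}{600{,}000} \approx 3.23 \approx 323\%$$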

Take that, Trump-kins. Research beats the stock market when done right.

A glimpse into state-level CS education policy implementation

This past Tuesday I had the privilege of attending the Washington State Computer Science Leadership Team, a group of leaders in the state of Washington responsible for devising and implementing K-12 CS education policy. From a computing education research perspective, it was both an exciting chance to observe a state trying to systematically implement significant changes to public education and a unique opportunity to help shape policy by disseminating computing education research findings.

The meeting was held in Facebook Seattle’s current offices in Westlake, which is interesting in its own right. The Facebook leaders who sponsored the space have a clear interest in strong local computing education, but they represented one piece of a much larger public/private partnership in state education policy. The room had STEM education representatives from more than a dozen of our state’s school district offices, some directly representing districts and others representing public educational services that serve multiple districts. There were also educational non-profits such as Washington STEM, Code.org, Pacific Northwest National Labs, the Pacific Science Center, the University of Washington (Stuart Reges and myself), Seattle Pacific University, and for-profit organizations like Facebook.

The meeting itself was a mix of updates and planning. The updates were both exciting and intimidating. There are districts like Bellevue Schools doing very impressive things to incorporate CS teacher training and CS courses with small amounts of resources. And then there were the scary numbers: only about 10% of Washington state schools offer some form of computing education, and less than 1% of Washington state students are engaging in it. That’s pretty far from universal access and even further from universal engagement. So far, the vast majority of teachers were unfamiliar with the CSTA and its curriculum frameworks. The scale of the dissemination effort required for all of this is astounding, even in a relatively small state like Washington, with only about 1 million students.

Because I had to leave for afternoon meetings, I missed the afternoon planning, but I had plenty of chances in the morning for conversations with attendees, laying the foundation for research dissemination with several folks. The interesting challenge from a research perspective is finding ways to pitch research in light of all of the other existing challenges in this massive change, such as money for teacher training and salaries, curriculum, and other resources. Discoveries have to be incredibly clear, concise, and adoptable to have any chance of uptake amidst all of this other change. All that said, researchers should be involved at every level: in policy planning, policy implementation, teacher training, curriculum framework development, and technology design. These efforts can be a great way of disseminating research, but also of discovering new research opportunities.

Review of Grudin’s “From Tool to Partner: The Evolution of HCI”

Last week there was far too much news I don’t want to hear, so instead of reading news, I read Jonathan Grudin’s new book on the history of HCI. (On my phone. On buses. In the dark. In five minute spurts!). Since you probably haven’t read it yet, I’ll do my best to summarize it here and tell you what I thought.

First, Grudin tackles a lot in this book. He synthesizes no less than the history of AI, HCI, Information Science, and Human Factors, trying to show how these fields emerged and intersected but rarely engaged each other, despite all of their immense interest in the interaction between people and computing. It’s a massive amount of history about fields emerging at the beginning of the digital age, and so the scope can be overwhelming.

Grudin does a reasonable job covering this scope, organizing the book chronologically, but bouncing between different fields, presenting big claims about the assumptions, ideas, and lenses that shaped what the different fields investigated. Throughout, he’s seeking to explain why these fields studied what they did, how that led to their ultimate lack of intermixing, and how that resulted in different fields’ differential impact on practice.

One of the big ideas in the book is the difference between the kind of discretionary computer use that happens in consumer settings and compulsory use that happens in organizational contexts. Grudin theorizes that this is the primary reason why information systems and LIS withered while HCI flowered: discretionary use just became the dominant, visible change in the world, bringing computing to every facet of life, giving HCI a mountain of interesting, diverse things to study and therefore broadening its methods and perspectives, while the world of compulsory use inside of organizations moved more slowly, constrained by the difficulty of studying whole organizations and their glacial adoption of consumer trends.

One of the more interesting, perhaps implied, ideas is that these other fields of Management Information Systems, Human Factors, and Library & Information Sciences, despite withering, still have a wealth of knowledge to share about people’s interactions with computers, but the lack of disciplinary intermingling prevented that knowledge from informing some of the big changes that occurred in computing. Google, with its roots as an NSF-funded Library & Information Sciences digital libraries project, was one of the few exceptions. Look at the impact that emerged from its interdisciplinary foundations. What would the world look like if our major shifts in computing had been informed by all of these fields instead of primarily computer science?

Zooming to present day, Grudin isn’t sure what to make of the iSchool movement, which seeks to embrace some of these interdisciplinary threads that never quite connected through history. These fields are finding their way together after decades apart, with faculty from computing backgrounds like myself mingling with faculty from these other fields. Will we find ways to combine our disciplinary perspectives into new, more powerful ideas that will shape our computational futures? Or is it too late, with computer science shaping the conversation, but narrowly? I suppose that’s literally up to me and my colleagues to decide.

Grudin is convinced there’s plenty more runway to find out. He predicts a future that goes well beyond interaction, to human-computer integration. In fact, he predicts that future is now, and that we’re only just beginning to figure out how to reason about interactions that infuse computing into our everyday decisions and communications. He predicts that understanding people and communication will be key to that, and that interdisciplinary perspectives on communication and information will be key to progress.

Aside from Grudin’s overarching thesis, the book is full of interesting little twists, turns, and origin stories in the history of computing, all told through the lens of interaction. If you’re interested in the history of computing from a research perspective, this is a great entry point to its rich and recent past. I also found it helpful in contextualizing my own epistemologies, my own training, and my interactions with my colleagues at my own Information School. If you find yourself in an interdisciplinary setting, I highly recommend it.

The only critique I’ll make is that the book wanders. It doesn’t wander in a particularly frustrating or unhelpful way. It feels more like wandering through a zoo, constantly pulled forward by an interesting bird or a lumbering primate. By the end, you feel like you’ve seen much of the biodiversity in the world, but you’re not quite sure you’ve seen it all, and it all seems a bit artificial. Maybe it’s not possible to recreate a history faithfully, or in a way that feels faithful. Maybe the best we can do is a menagerie.

How I applied learning sciences to undergraduate design education

I’m no fan of student evaluations. They’re fraught with gender bias, age bias, and all kinds of construct validity issues. They certainly are not good measures of learning outcomes or teaching quality. At their best, they are good indicators of an instructor’s success at creating a coherent, engaging experience, which is important to learning. And engagement is no small feat in a world that increasingly frames colleges as businesses and students as customers, compelling students to constantly question the value of what they’re learning to their career paths.

Since we nevertheless gather student evaluations every quarter at the University of Washington, I do use them to track my own progress at engaging students in learning. And I’ve usually done pretty well on whatever they’re measuring. Take, for example, my Design Methods course, which is basically an introduction to HCI and Design methods for undergraduates. Since I started teaching it about eight years ago, I’ve generally earned anywhere from a 4.0 to 4.6, which is generally considered by faculty to be excellent. At the University of Washington, these scores are the median of all students’ averages across four prompts on a scale of very poor to excellent (how was the course, how was the content, how were the instructor’s contributions, and how effective was the instructor at teaching). So my generally high scores mean that most of my students believe I can engage them, believe I can explain things to them, and believe that I have sufficient expertise on design. None of this means I can actually do these things well, but pre-tenure, that was good enough for me.
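
For readers unfamiliar with that scoring scheme, here is a small sketch of the “median of per-student averages” computation as I read it above; the ratings are invented and the 0-to-5 coding of the scale is my assumption:

```python
from statistics import median

# Each row is one student's ratings of the four prompts (course, content,
# instructor's contributions, teaching effectiveness) on a 0-5 scale,
# where 5 means "excellent". These particular numbers are made up.
ratings = [
    [5, 4, 5, 4],
    [4, 4, 4, 5],
    [3, 4, 4, 4],
]

per_student_average = [sum(r) / len(r) for r in ratings]  # [4.5, 4.25, 3.75]
course_score = median(per_student_average)                # 4.25
print(course_score)
```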

On sabbatical last year, however, I began to read learning sciences and education research more deeply, partly because I’ve been doing more computing education research, and partly because I wanted to become a better teacher. What I found was that while my teaching was adequate, it was far from ideal. While reading through the book How People Learn, I found countless opportunities to produce better learning outcomes, usually without significantly more effort (and sometimes with less effort!).

The source of most of these opportunities was a simpler, but more robust theory of learning. In essence, I learned from learning sciences that effective, efficient learning requires three things:

  • A clear sense of the knowledge to be taught.
  • Deliberate practice of that knowledge (meaning practice with immediate, targeted feedback)
  • Attention, and therefore motivation, focused on that practice.

That’s it. I learned that the complexity isn’t so much in learning (humans seem to do that quite naturally) but in setting up conditions that predispose people to learning. Getting students motivated, and therefore attending to practice, is hard. And designing effective deliberate practice is hard, often because we don’t know exactly what we’re teaching or what’s hard about what we’re teaching. It’s also hard to scale targeted, immediate feedback to individual learners.

Given these basics, I spent part of my sabbatical trying to redesign my Design Methods course to achieve better learning outcomes. Here are a few of the things I did, applying the theory above.

One of the first and easiest things I did was share with students my theory of learning, to frame how I was engaging with them. I taught Carol Dweck’s work on theories of intelligence, explaining that every student has beliefs about where ability comes from, and that those beliefs actually mediate how much people learn. I encouraged them to adopt a growth mindset, remembering that all ability comes from deliberate practice, and that the class would be structured to give them that practice. I also told them that as much as it was my job to structure an environment conducive to learning, it would only happen if they engaged, believed in their ability to learn, and listened closely to the feedback I provided.

Next, I tackled the problem of motivating students. I’ve always had some model in my head of what my undergraduates care about, but that model was always based on a few close relationships with undergraduate researchers, generic surveys, or student feedback in evaluations. None of these work that well in providing substantial insight into what motivates my students. To solve this, I spent the first day of class asking students to write a brief essay in class answering the question “Why are you in college and what does design have to do with it?” Then, rather than reading them privately, I had students share them with each other in small groups, and then construct an elaborate whiteboard diagram of their life trajectories and how design fit into it. What we learned was that because my course was a required course, most had little intrinsic motivation to learn design, but they were curious about it and thought it might be useful. Most also had very concrete life goals, including specific careers, visions for where they would live and how much money they needed to make to live there, and what kinds of friends and family they wanted. For most of them, school was a tool for getting them to those futures.

I used this model of my students’ motivations to shape a third pedagogical practice: at the beginning of every class and every in-class activity, I explicitly stated how I thought the day’s activity would contribute to their life goals. Devising these links was not easy and couldn’t be done in advance; I was constantly updating my model of what was motivating my students so I could come up with a single justification that would work best with the whole class. For example, the day I taught heuristic evaluation, I said something to the effect of: “So we’ve talked a lot about UX designers in class so far and how their general responsibility is to envision seamless user interface designs. Some of you want this job, others of you will be working with UX designers to implement their visions. How will they know if their design is good? And how can they know in just a few days, which is the time scale that many designers have to work at? There’s one method invented back in the 1990s that tried to solve this problem. We’re going to talk about it today, learn its strengths and weaknesses, and discuss when it makes sense to use it.” Note that this kind of justification is essentially the same justification that Jakob Nielsen used in his book Usability Engineering. I just needed to link his motivation to students’ individual aspirations.

Another challenge was in motivating students to learn the declarative knowledge about HCI and Design, such as important methods, concepts, histories, and ideas in design. How could I motivate students to read about these things? In addition to using the same strategy above (simply explaining how it linked to students’ own goals), I designed a series of reading exercises that aimed to be frictionless, but also engaging. Twice a week, students would read a short blog-post length chapter that I personally wrote as an introduction to a subject in design. They were short enough that students would read them, but deeply linked, so that throughout the reading, there were multiple follow-up readings students could do to deepen their knowledge. Then, to motivate students to read them, I held a reading quiz at the beginning of class to verify that they had read it (which had the added benefit of getting them to show up to class on time). I also required a brief summary of a reading of their own choosing, selected from the readings I linked to or from any other reading, podcast, or video on the web that concerned the same subject. After the reading quiz, students engaged in “think-pair-share”, turning to a few of their neighbors and explaining what they read and what they found interesting about it. Then, after a few minutes of sharing, I asked students to voluntarily share the most interesting readings they heard about from their peers. In just about 20 minutes of class, we covered a range of readings, many of which were entirely new to me. I had to be ready to rapidly synthesize and relate the topics they raised to the subject of the day, but this kept me engaged as well. It also reinforced every day that I genuinely did have the expertise to be teaching the subject.

After the reading period, we would engage in an in-class activity. I explained to students that our time together was precious, because it was the only time that we could actually do design together (as design is rarely done alone). For each topic, I carefully designed an activity with a very specific form of deliberate practice, always beginning with a justification and ending with a reflection that tied together the practice they engaged in with feedback on what they did right and wrong in their practice. My role in these activities was to facilitate and closely observe so I could provide this feedback. One example of an activity was a 90-minute usability testing activity in which teams of two designed a paper prototype alarm clock interface, designed a task to verify its usability, and conducted a series of usability tests with their classmates. The rules governing this activity were carefully designed to mimic the kind of usability tests that people run in industry, but also to reveal the fundamental scholarly questions behind usability testing (namely, how reliable the knowledge it produces is). I tried to design each of these activities to feel like a game, with some clear notion of the rules and definition of winning, while aligning these with authentic ideas in practice and making their authenticity clear to the students.

The result of these readings and activities was that every day, students got to come to class to share what they learned in their selected readings, learn from each other, and then engage with each other with my help to acquire a skill that would help them get closer to their life goals. Almost all students came on time, excited about class, and many left craving more time to go into more depth (which we never had).

I can say with some certainty (both from student evaluations and my own observations) that students were engaged: my median student evaluation score was a 4.9/5.0, the highest I’ve ever received across eight years of teaching and twenty-five courses and the highest I’ve ever seen amongst my colleagues. Unfortunately, what I still can’t say is that they learned any better. We simply don’t know how to measure design skill with any reliability or validity. And so I take it on faith that, given what we know about learning, the students, as a natural byproduct of deeper, more sustained engagement, practiced the content I gave them more often and more deliberately.

Now I just have to figure out if it’s the right content! And if students’ perceptions of my teaching skills have anything to do with the quality of their learning. And how to figure out what they’ve learned about design. And a million other unanswered questions about design education!

Assessment is a computing education grand challenge

How do you know what someone knows about computing?

This question is foundational and pops up everywhere. It arises in classrooms, where teachers need to be able to accurately determine what a student has learned, both to help them learn better (through formative assessments) and to establish a record of how well they’ve learned it (a summative assessment). But it also arises in professional settings such as hiring: when an applicant says they “know” Java, what does that actually mean? What is it predictive of? Surely there are better ways for an employer to know how well someone knows a programming language than self-report or having passed a class at a university. We don’t even know how well these indicators actually predict ability.

Isn’t this just a matter of writing tests? It turns out that writing good tests is very difficult. It’s not enough to write an exam that asks people to define concepts and solve problems. If the wording of the questions is off, people may get the answers wrong even though they know the answer, or even get the answers right even though they don’t. These are examples of poor test validity, where the test measures something other than the knowledge one is trying to assess. Some tests aren’t reliable, in that using the test repeatedly produces different results for the same individual in different settings. Reliability issues can arise from ambiguous wording, ill-defined concepts, or poorly constructed definitions of correct answers leading to unreliable scoring.
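
As one concrete illustration of the reliability side (not a description of any particular instrument), test-retest reliability is often estimated by correlating scores from two sittings of the same test by the same people; the scores below are invented:

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation, a common estimate of test-retest reliability."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Six students take the same test twice, a week apart (invented scores).
first_sitting  = [12, 18, 9, 15, 20, 11]
second_sitting = [13, 17, 8, 16, 19, 14]

# A value near 1.0 suggests the test ranks people consistently across sittings.
print(round(pearson(first_sitting, second_sitting), 2))
```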

Making a reliable, valid test is a considerable amount of work. Several of the students from Mark Guzdial‘s lab have spent a substantial portion of their time as doctoral students developing reliable, valid tests for measuring how well students can mentally simulate (or trace) the behavior of simple imperative programs (see the FCS1 and SCS1). Even after their rigorous efforts, these assessments are hard to reproduce and sensitive to overuse, making it difficult to scale these efforts to other concepts or other languages.

The implications of unreliable, low validity tests can be severe. Bad tests in introductory programming classes can fail students that actually know quite a lot or pass students that know quite little. This poor signal can trickle down to employers, who might use courses, grades, and other credentials as an indicator of ability. And because tests are garbage-in, garbage-out, all of this happens without a teacher or employer ever really knowing, producing a garbled, sometimes overconfident sense of what students know.

I’ve seen these problems as a student myself. I remember graduating back in 2002 with my undergraduate degree in CS, with many of my high-performing peers admitting that despite all of their high grades, they still couldn’t sit down in front of an empty code editor and write a program to solve a problem. Sure, they solved lots of problems in class with the help of peers, TAs, and highly scaffolded assignments within the scope of problems their teachers had discussed. But they often didn’t know why their solutions worked. I remember getting partial credit for regurgitating partial solutions that I really didn’t understand, resulting in inflated grades that miscommunicated the level of my understanding.

Does it matter that developers understand the code that they write if the code still works? If correct code stayed correct and code could be “correct enough,” this might not matter. Unfortunately, correctness matters: programs receive unexpected inputs and developers have to debug, and developers can only do this well with a deep understanding of the semantics of a program’s execution. Moreover, this deep understanding likely would have prevented some of these defects from occurring in the first place.

All this said, there are some people who obviously develop a deep, nuanced, accurate understanding of computation. These people are our best programmers, our computer science faculty, and others who’ve likely devoted their lives to eradicating every misconception about computation from their minds through incredible amounts of deliberate practice. I suspect these individuals aren’t confined to the limits of assessment because they’ve learned to self-assess their knowledge. In fact, computing might be unique in that people can actually test their understanding of computation by carefully probing program behavior, using the computer itself as a source of feedback about their understanding. Perhaps this is how people are able to develop robust understandings of computation despite the failures of assessment. This might also explain why CS teachers appear to believe that some students “get it” and some don’t: what’s really going on is that some have an insatiable curiosity about how computers behave, and use that to fuel a limitless quest for more robust knowledge of computing.
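
The “computer as a source of feedback” idea is easy to act on: write down a prediction about what a snippet will do, run it, and compare. A tiny self-quiz in that style might look like this (my own example, probing a common misconception about list aliasing in Python):

```python
# Step 1: commit to a prediction before running anything.
prediction = "b is a copy, so a will still be [1, 2, 3]"

# Step 2: run the code in question.
a = [1, 2, 3]
b = a          # no copy is made: b is another name for the same list
b.append(4)

# Step 3: compare the prediction to what actually happened.
print("predicted:", prediction)
print("actual:   ", a)  # [1, 2, 3, 4], so the prediction (and mental model) needs revising
```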

Because knowing what computing knowledge is in someone’s head is so hard and so important, I believe it’s a grand challenge of computing education research. If we don’t discover reliable, valid, scalable, replicable ways of knowing what people know about computing—or find a way to give more people an insatiable curiosity about computing—we’ll continue to overlook deficiencies in knowledge, producing defective, unreliable code. It’s up to researchers to make these discoveries and up to society to fund them.

Two truths

My first presidential election as an eligible voter was back in 2000. I was one of those annoying Nader supporters who found Gore boring and soft, and preferred Nader’s rage. My feelings on Bush, of course, were a different matter entirely: he seemed stupid, feckless, ignorant. He couldn’t form coherent sentences. Most of all, his disinterest in the truth was frustrating and discouraging. How could I vote for a man that willfully ignored reality?

The years and the wars and the lies dragged on. Americans died, the country split, and cable news helped. Truthiness came to life. Republicans got better at twisting reality into a story that fit their goals, and using words to hide reality: the Clean Air Act, No Child Left Behind, Mission Accomplished. It was 1984 in 2004, hiding lies behind propaganda, falsehoods as reality.

My generation saw Obama as the answer. He talked about the hard truths and what we might do about them. He acknowledged the messy complex realities of our country and sought to implement pragmatic, incremental remedies. The Affordable Care Act was a pure expression of pragmatism: not quite everything we wanted, but a bit better, with a bevy of little changes that aimed to make things a little bit better for some people. Not great, but better. The incrementalist in me swooned.

Of course, all this time, there was another truth, a competing truth, that writhed under Obama’s rule. This truth was an account of the world that was not based in science, in evidence, or in logic, but a truth grounded in emotion, experience, and faith. These were truths that accepted science when it was compatible with faith, rather than accepting faith when it was compatible with science. This was an America that was tired of the elites—the economists, the scientists, the secular urban progressives—who claimed truth as their own and rejected anyone with a different epistemology as not only wrong, but also bigoted, ignorant, and backwards.

The country segregated itself by these truths, with the secular elite seeking diversity, inclusion, and progress in the cities, and the rural faithful seeking homogeneity, privacy, and stability in the countryside. We sorted ourselves: not only geographically and economically, but epistemologically. And when the housing bubble burst, it was the isolated rural who hurt most, losing not only their fragile local economies, but also the small trickle of wealth from the growing urban centers upstream. Rural America watched urban America only grow wealthier and more powerful. The secular truth became a cause of suffering, and the religious truth the only cure.

The segregation of urban and rural America, and the segregation of the truths that came with it, left rural America voiceless. The centers of media and journalism were in the cities. With newspapers’ declining revenues going to Silicon Valley, there was even less reason to drive to the country and report on rural America, especially as cities grew and became the center of American vitality. Rural America was not only abandoned by the economy and by the elites, but also by its only remaining voice in the public sphere, the media. Alone, abandoned, and isolated, rural America festered with a justified hate of cities, of science, of progress, and of the media.

Trump did not cause this. He exploited it. He spoke to a rural America that had been ignored for years and promised to restore everything that had been lost over the past twenty years. He described a truth that everyone in rural America knew: America was falling apart, or at least their America was, and no one else seemed to know it, not the media, not Democrats, and not even Republicans. Something had to be done to restore it.

Who should it be? Certainly not the establishment. The only way to solve a problem is to accept that it exists, and Trump was the only one who did. The lies, the bluster, the hate, the insults, the misogyny, none of these were desirable things. They certainly weren’t Christian behavior. But when it comes down to restoring faith and restoring livelihood, the latter has to come first. Faith is for the fed.

All the while, my secular urban elite friends and I were oblivious. The economy was growing (in cities), fewer people were in poverty (in cities), more people had jobs (in cities), more people had health insurance (in cities), and violent crime was down (in cities). America’s migration to urban centers combined with the steady improvement of cities masked the decline of small American towns. Our aggregate statistics obscured the opposing forces of our economy. The very tools of our secular truth failed us, while our most human senses, our emotions, were blotted out by distance.

Truth matters. It still does. But more importantly, all truths matter. The truths we discover with our minds, but also the truths we discover with our hearts. Our secular urban methods of science and data can only see part of reality because we only answer the questions we ask. We didn’t ask what was happening in rural America. No one did.

Now we know the answer. And now we have to accept that hidden inside our economic recovery was a tragic economic decline. And if we accept the scientific reality behind this decline, we’ll know that it was scientific and technological progress that caused it, centralizing, automating, and digitizing human activity to a degree that place and people no longer mattered, just information. Secular urban progressives robbed rural America of its vitality with science. And now it’s the secular urban progressives who must restore it, making rural America great again.

My SPLASH 2016 Keynote

Above is a practice version of my SPLASH 2016 keynote. If you don’t want to watch the whole 40 minutes, you can read my slides (100 MB of images!). If you don’t want to read the slides, here’s a super condensed version of my argument:

  • The mathematical view of programming languages is powerful and productive
  • However, that view is also narrow, limiting the research questions that we ask
  • Moreover, it limits who participates in computing, because the dominant culture of computing only projects interest in the mathematical view.
  • If embracing other views is important for discovery and equity, what other views exist and how can we explore them?
  • These views include PL as power, interface, design, notation, language, and communication, but also other surprising lenses such as glue, legalese, infrastructure, and even a path out of poverty.
  • These views, however, also embody values, meaning that by investigating these metaphors for programming languages, we also embrace new values.
  • My work has explored many of these metaphors, including interface, notation, and communication, to great effect.
  • I suggest that every PL researcher consider these new views, but also accept them as valid alternative perspectives for PL research.

The response to the keynote was quite positive! People found the ideas interesting, provocative, thoughtful, and in a few cases, brilliant. It was such a privilege to have the attention of so many great programming language and software engineering researchers. I hope I’ve given them a few tools and ideas about how to consider their future work, and perhaps reconsider their past work.

What does $600K in NSF research funding buy?

Eight years into my career as a professor, I finally wrote my first NSF final report (this would have come earlier, but taking two years of leave to do a startup led to several no cost extensions). Because science and technology have been under political attack in the United States for quite a while now, this seemed like a great time to step back and consider what NSF funding actually buys America.

The project was funded out of the now canceled “Computing Education for the 21st Century” program at NSF. With my collaborators Margaret Burnett and Catherine Law, I wrote a proposal to investigate whether framing programming as a game would equitably engage learners in more productive learning than other approaches to learning to code. We were one of several teams to be funded, in the amount of $600,000 for three years of research.

Here’s what we did with that money:

  • We designed, implemented, and deployed the Gidget game
  • We designed, built, and deployed the Idea Garden into the Gidget game
  • We designed and built the Idea Garden for JavaScript and the Cloud9 IDE
  • We designed, built, and evaluated a Problem Solving Tutor (not yet published)
  • We designed, built, and evaluated a Programming Language Tutor (not yet published)
  • We studied the role of data representation on engagement
  • We studied the effect of in-game assessments on engagement
  • We studied the effect of Gidget on attitudes toward learning to code
  • We studied the role of the design principles incorporated into the game on learning
  • We studied the learning gains in the game relative to two other learning paradigms
  • We studied the role of the Idea Garden in engagement and learning
  • We studied the role of self-regulation in programming problem solving
  • We studied the effect of self-regulation instruction on programming problem solving
  • We studied the effect of the problem solving tutor on problem solving productivity (not yet published)
  • We conducted a pedagogical analysis of online coding tutorials (to appear)
  • We held four summer camps in Oregon and Washington based on Gidget for high school students, reaching over 80 rural and female high school teens.
  • We held four Gidget open house sessions during CS Education Week (2013-2016)
  • We held a one day workshop with 25 Native American teen girls.
  • We taught an Upward Bound Web Design course to 11 diverse high school students.
  • We disseminated the results to several Microsoft product teams building computing education products.
  • We served on advisory boards to two NSF-funded computing education teams
  • We served on a panel on Women in Computing at the Educause conference.
  • We served on a panel on Women in HCI at CHI.
  • We held a workshop at CHI about gender-inclusiveness issues in software.
  • We disseminated results to code.org.
  • We created an extensive Computing Education Research FAQ.
  • We attended a Dagstuhl workshop on assessment in computing education.

Across all of these activities, we taught over 10,000 people how to code via Gidget, ranging from ages 13-80, half of them girls and women. We trained 4 post docs, 6 Ph.D. students, 2 masters students, 18 undergraduates and 6 high schoolers in how to do computer science and computing education research. We produced Gidget, an online game for learning to code, that will be available for at least the next decade as a public resource. We published about 20 research papers, contributing an evidence base for better teaching and learning of computing through online learning technologies.

Is all of this worth $600K? One way to judge this is to quantify how much all of this education cost. If we look just at the 36 students we mentored, the public spent an average of $16K per student to teach them rigorous research skills. If we include the 10,000 players of Gidget, plus the future players of Gidget over the next decade, the public spent an average of $2 per player to teach them a bit about computer programming and potentially engage them in future learning. From this perspective, the grant was one big education subsidy, promoting the development of highly-skilled STEM workers.
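
The rough arithmetic behind those figures, with the caveat that the $2-per-player number implies an assumed total of roughly 300,000 eventual players over that decade:

$$\frac{\$600{,}000}{36\ \text{students}} \approx \$16{,}700\ \text{per student}, \qquad \frac{\$600{,}000}{\$2\ \text{per player}} = 300{,}000\ \text{players}$$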

Another way to judge this is to anticipate the future impact of this training. Many of the Gidget players will be more likely to pursue STEM education because of playing the game (according to our research), which may have a net impact on the growth of the economy. Of the 36 students we trained in research who have graduated, many are faculty, UX researchers, and software engineers, filling much-needed jobs in industry. The downstream impact of all of this training may be to fill unmet needs in the economy, allowing it to grow more efficiently.

Of course, the other important way to assess the return on investment of the work is to predict the long-term impact of the knowledge we produced. We’ve already disseminated the work to code.org, which is reaching tens of millions of learners in high school through their curriculum. Our research essentially serves to ensure that the learning those students are already doing is more effective than it would have been otherwise. That ultimately means a better, smarter, more effective STEM workforce in the future, which ultimately impacts the growth and productivity of the U.S. economy.

Across the 243 million U.S. taxpayers, each contributed a median of about 1/10th of one penny for this research to happen. What’s the return of that fraction of a penny? Will every American be more than 1/10th of a penny richer in 20 years because of our work?

Clearly, this exercise in trying to model and predict the impact of science funding is hopelessly fraught with reductive ideas about science. It even plays into the framing that House Republicans have used to attack science, accepting the premise that NSF is an investment in America, as opposed to something more idealistic, such as the betterment and survival of humanity. But in reflecting on all of these activities and the actual impacts they’ve had on the world already, I find the sheer scale of potential impact to be compelling in its own right and well worth the price. I can’t wait to see in 10 years what these impacts might be!

The invisibility of prior knowledge

When you watch an Olympic sprinter run 50 yards in 5 seconds, what’s your first thought?

  1. That must have taken an incredible amount of practice, or
  2. Wow, that is some incredible DNA.

Now, we know both nature and nurture matter. But in watching sprinters, we see nurture matter because we can see sprinters practice. Olympics broadcasts show us hours of practice. We see their coach. We know that Nike has sponsored their thousands and thousands of hours and hundreds of pairs of shoes. We know that as much as someone might be born with a genetic head start, the only way to really get to the top is to practice more and better than anyone else in the world.

And yet, for other kinds of human ability, few people, if any, consider the role of practice, instead attributing ability to genetics. This is especially true in software. People assume Bill Gates must have been a genius. The news frames Mark Zuckerberg like a boy prodigy. Hacker culture of the 90s, and to a large extent still today, divides people up into “real” coders and posers, treating computing as if it is something natural, innate, inborn, and gifted to a privileged few.

The reality, of course, is that the majority of variation in ability in computing and every other field is due to practice, not genetics. As K. Anders Ericsson studied for years, most of the variation in expert performance is explained by how well and how much people practice a skill. Coding (clearly) isn’t something people are born knowing how to do, nor is it likely something people are born with a predisposition for. It is something people learn, and it is our experiences, our other skills, and our environment that develop and sustain the motivation to learn, and that are likely the source of any predispositions to learn.

Why do people gravitate so easily to theories of ability grounded in genetics rather than practice? I think it’s because practice, and in particular, the prior knowledge that practice produces, is invisible. When you meet someone, you can’t see what they know, how well they know it, how many years they’ve been practicing it, or how well they’ve been practicing it. In fact, even when scientists like myself try really, really hard to measure what someone knows, we struggle immensely to reliably and accurately capture ability. It’s really only in a narrow set of domains, like sports, where we’ve created elaborate systems of structured measurement in order to quantify ability.

This invisibility of prior knowledge, and the attribution of ability to innate qualities rather than practiced skill, has many consequences throughout software engineering and computing education. When a company tries to hire someone, they have very weak measurements of what an engineer knows, and have to rely on pretty pathetic coding tests that likely correlate little with actual skill. Worse yet, when a CS1 teacher gets a classroom of new students, they often attribute success in the class not to the quality of the practice they have provided to students, or to the vast differences in practice that students engaged in prior to class, but instead divide students up into those who “get it” and those who don’t. Because hiring managers and teachers can’t see into each person’s mind, they can’t comprehend the vast complexity of prior knowledge that shapes each individual’s behavior.

Because of these challenges, measuring knowledge of computing really is one of the most pressing and important endeavors of computing education research. We need robust, precise instruments that tell us what someone knows and can do. We need the decathlon of coding, helping us observe ability across a range of skills, each event finely tuned to reveal the practice that lurks beneath someone’s cognition. Only then will we have any clue how to support and develop skills in computing and know that we’re doing it successfully.

So far, the closest thing to this in computing education research are the series of language independent assessments of CS1 program tracing skills that have come out of Mark Guzdial‘s lab. These are great progress, but we need so much more. My former Ph.D. student Paul Li did a study of software engineering expertise, finding dozens of attributes of engineering skill, none of which we know how to adequately measure. Just as lenses revolutionized astronomy and biology, we need instruments that allow us to peer into people’s computational thinking to revolutionize the learning of computing.

Ready to help? Come do a Ph.D. with me or the dozens of other faculty in the world trying to see invisible things in people’s heads. Let’s figure out how to transform humanity’s understanding and utilization of computing by seeing what people know about it.

A defense of sabbatical

This is my last day of sabbatical. I should be preparing for class next week, drafting the sections of that grant I’m helping on, writing meta reviews for that conference, and finding an instructor for that masters program I chair. But the only thing really on my mind is how wonderfully pivotal the last 6 months of paid professional leave have been to my role as a researcher and a teacher.

Sabbatical in academia has a long and turbulent history. It’s not really in great shape right now. Many private universities still guarantee a full year of sabbatical to tenure-track faculty, but others have abandoned it entirely, or have eroded it so heavily that it’s given out sparingly. My university is somewhere in the middle: I can get two-thirds of my salary for 9 months every 7 years if I’m in good standing. That’s pretty solid, and our university is committed to keeping it that way.

But is it really worth it? Isn’t sabbatical just a year of paid vacation? Why should a university, especially a public university, release professors from teaching and service, only to have them sit on a beach for 9 months pretending to think hard about research? What does the world get for this investment?

There might be a few beach sabbaticals, but most of my colleagues don’t do that. Mine certainly wasn’t like that. Instead, mine was open time to chart the course of the next 7 years of my professional life in research, teaching, and even service. This is a huge privilege—who in industry gets to do that?—but I also think it’s essential to the job.

Here’s why. As a scholar, my job is to think about the coming decades of human civilization. My discoveries should stand the test of time. Some of them might not even be relevant for a couple of decades. And the discoveries that I teach students should be robust to change as well, preparing students not just for today, but for the next two to three decades of their careers. Even service, which is about running the university and the global research enterprise, can be about the future, investing in new academic programs, improvements to peer review, the efficiency of federal funding, and the political forces that cause it to rise and fall. Every single one of these jobs has a time horizon of at least 1 year out, if not 5, 10, or even 100 years out.

Most of faculty life leaves no room for this type of thinking. Getting that grant, publishing that paper, advising that doctoral student, teaching that class: there is so much detail in each of these, the long view on discovery and learning is a mere backdrop to the daily work. Sabbatical is a time to deal directly with these longer term concerns, framing and structuring a professor’s next 6 years of work.

When I started my sabbatical this past January, that’s how I set out to use it. My highest level goal was to figure out what my mission would be for the next six years and how I would accomplish it. My lower level goals involved choosing the work I’d do this year to help prepare me to accomplish my mission. I took 6 months of leave at 75% salary, plus I had my three months of summer. I funded the rest of my salary out of our NSF grants.

The mission I chose was this: advance and mature what we know about computing education to prepare the world to understand the disruptive changes coming from computing. I wrote about this on this blog, so I won’t go into detail about it here. Instead, I want to share the results of my sabbatical year, to give you a sense of how I spent all of this tuition and taxpayer subsidy. Here we go:

  • I wrote a retrospective on my three years as a startup CTO and cofounder (in review soon!). This has transformed how I understand the software industry, how I teach about it, and how I see my efforts in computing education relating to industry.
  • I wrote an NSF proposal, framing my lab’s work on how to teach programming problem solving. Even though it was rejected, it’s helped us map fundamental questions about the nature of problem solving in programming and how to teach it.
  • I attended a week long Dagstuhl workshop on computing education research, deepening my relationships with researchers in this growing field. That week was pivotal in connecting me to the small world of computing education researchers, which has empowered me to become an advocate for the field in other areas of HCI, software engineering, and policy making.
  • I started several new student projects, including evaluations of coding tutorials, programming language tutors, problem solving tutors, investigations into coding bootcamps, and studies of equity in computing education. I think we’re doing some really exciting, powerful work, and can’t wait to share it with the world in the coming years.
  • I recruited new Ph.D. students to my lab from computer science, information science and education, developing a new pipeline.
  • I wrote a now widely trafficked FAQ on computing education research, which has been important not only in helping students see a path to participating in research, but also in helping my colleagues in HCI, Software Engineering, and Programming Languages understand its importance.
  • I recruited three undergraduate researchers to my lab to support the new projects, mentoring them on graduate school, research, and software engineering careers.
  • I redesigned my faculty website to better convey my focus, contributions, and impact in research for the next seven years.
  • I developed new collaborations with our new iSchool faculty Jason Yip, Katie Davis, and Negin Dahya, learning about new research areas of learning science, identity, and education. This has stretched my expertise, teaching me about new foundations of learning and education, and it has made me a more effective and inclusive teacher.
  • I partnered with a large group of computing education researchers to plan a big NSF project on advanced learning technologies for computing education. Here’s to hoping we get the funding to fuel those disruptive innovations I mentioned above.
  • I started working with Richard Ladner‘s AccessComputing project to further equity in access to computing education amongst people with diverse physical and cognitive abilities. This connected me to yet another research community of accessibility researchers, and led to some exciting work by my student Amanda Swearngin on web accessibility that has the power to bring the web to everyone regardless of physical ability.
  • I read and wrote a lot about privilege, searching for the underlying privileges in my own life that allowed me to succeed academically. This has been emotionally draining, but empowering, helping me to see my role as a teacher from a more structural, systems view, and making me more excited about the leadership roles I’ll inevitably take on as senior faculty.
  • I attended CHI 2016 in San Jose, reconnecting with the research community after two years of startup life, reminding me of how wonderfully diverse and interdisciplinary our community is.
  • I taught my first high school computer science course to understand more about the challenges of teaching CS electives in schools, and formed 11 mentoring relationships with South Seattle teens that I hope will reshape their paths toward college.
  • I went to the Snowbird conference to advocate for computing education research, connecting with hundreds of chairs and deans. This connected me to dozens of leaders at universities around North America, but also gave me a larger view of the barriers that computing education research will face in maturing.
  • I went to ICER 2016 and strengthened my ties with the growing computing education research community, finding partners in my long-term mission.
  • I connected with code.org, Microsoft, and Google efforts in computing education, disseminating my research and others’ to their product design and policy efforts.
  • I redesigned my design thinking course, INFO 360, to incorporate everything I learned about learning this year. It’ll be a better class, while taking less time to teach. It will also be a solid foundation for the other sections of the course, and perhaps HCI courses around the world.
  • I got married, went on a beautiful honeymoon to Croatia and Slovenia, and bought a house in Seattle’s crazy housing market. Don’t worry, no public dollars in any of those.

What’s the ROI of my six months of two-thirds salary to the university and the public? Was all of the above worth ~$80K in salary, benefits, and guest faculty in my absence? I think so. In the short term, I mentored and taught dozens of students and made important discoveries that have already impacted efforts at code.org, Microsoft, and Google. And like I claimed above, this is a long term investment. Because I had this sabbatical, in 10 years, you’ll start to see more effective and inclusive teaching of computer science, which means a more computing-literate humanity and a more effective workforce of software engineers. I hope this will mean a better, safer world that creates computing infrastructures and institutions that reflect all of us, rather than just the privileged few. I predict that the $80,000 the university spent on my time will easily return at least 10x in economic productivity over the next decade.

Yes, not every sabbatical is like this. There are some faculty that have extended vacations and get paid a small portion of their salary to do so. Sometimes, rest is what busy professors need to be great researchers and teachers. That said, I think that every sabbatical has the potential to be extremely valuable to society, and it’s a professor’s responsibility to make it so.

Disagree? I want to hear from you!