The black hole of software engineering research

Over the last couple of years as a startup CTO, I’ve made a point of regularly bringing software engineering research into practice. Whether it’s been bleeding edge tools, or major discoveries about process, expertise, or bug triage, it’s been an exciting chance to show professional engineers a glimpse of what academics can bring to practice.

The results have been mixed. While we’ve managed to incorporate much of the best evidence into our tools and practices, most of what I present just isn’t relevant, isn’t believed, or isn’t ready. I’ve demoed exciting tools from research, but my team has found them mostly useless, since they aren’t production ready. I’ve referred to empirical studies that strongly suggest the adoption of particular practices, but experience, anecdote, and context have usually won out over evidence. And honestly, many of our engineering problems simply aren’t the problems that software engineering researchers are investigating.

Why is this?

I think the issue is more than just improving the quality and relevance of research. In fact, I think it’s a system-level issue in the interaction between academia and industry. Here’s my argument:

  • Developers aren’t aware of software engineering research.
  • Why aren’t they aware? Most explicit awareness of research findings comes through coursework, and most computer science students take very little coursework in software engineering.
  • Why don’t they take a lot of software engineering? Software engineering is usually a single required course, or even just an elective. There also aren’t a large number of software engineering master’s programs, through which much of the research might be disseminated.
  • Why are there so few courses? Developers don’t need a professional master’s degree in order to get high-paying engineering jobs (unlike other fields, like HCI, where professional master’s programs are a dominant way to teach practitioners the latest research and engage them in the academic community). This means less need for software engineering faculty, and fewer software engineering Ph.D. students.
  • Why don’t students need coursework to get jobs? There’s huge demand for engineers, even complete novice ones, and many of them know enough about software engineering practice through open source and self-guided projects to quickly learn software engineering skills on the job.
  • Why is it sufficient to learn on the job? Most software engineering research focuses on advanced automated tools for testing and verification. While this is part of software engineering practice, there are many other aspects of software engineering that researchers don’t investigate, limiting the relevance of the research.
  • Why don’t software engineering researchers investigate more relevant things? Many of the problems in software engineering aren’t technical problems, but people problems. There aren’t a lot of faculty or Ph.D. students with the expertise to study these people problems, and many CS departments don’t view social science on software engineering as computer science research.
  • Why don’t faculty and Ph.D. students have the expertise to study the people problems? Faculty and Ph.D. students ultimately come from undergraduate programs that inspire students to pursue a research area. Because there aren’t that many opportunities to learn about software engineering research, there aren’t that many Ph.D. students that pursue software engineering research.

The effect of this vicious cycle? There are few venues for disseminating software engineering research discoveries, and few reasons for engineers to study the research themselves.

How do we break the cycle? Here are a few ideas:

  1. Software engineering courses need to present more research. Show off the cool things we invent and discover!
  2. Present more relevant research. Show the work that changes how engineers do their job.
  3. Present and offer opportunities to engage in research. We need more REU students!

This obviously won’t solve all of the problems above, but it’s a start. At the University of Washington, I think we do pretty well with 1) and 3). I occasionally teach a course in our Informatics program on software engineering that does a good job with 2). But there’s so much more we could be doing in terms of dissemination and impact.

23 thoughts on “The black hole of software engineering research”

  1. Only posting a year later… but this is one of the first pages I found when searching for the topic and I have opinions dammit 🙂 !

    Some remarks:

    * I think the best comparison here might be to *medicine*, go read some papers of trauma orthopedics. Lots of this sort of stuff in very pragmatic low level policy in the context of an institution. You get papers on things like what sort of screw you should use.

    * Lots of people here seem to mention “people problems”; I would be inclined to call these practical problems. The idea of badly researched and ill-founded sociology being used within companies strikes me as a little scary.

    * Small teams working on changing idiosyncratic work isn’t really the best friend of robust quantitative method.

    * Software engineering *does* have an “academic” literature: google, blogs, hacker news, and comments sections, books and github, stackoverflow, conferences and conference talks. It has its downsides compared to other forms of professional literature: most of the pieces written tend to be “case series”, and you don’t have journals and a peer review process to ensure notability and impose standards. In other ways it’s very laudable: open-access, very well indexed by google, high-readership.

    * The line between literature and code can be a little fuzzy. Best practices can be implemented as code and shared on github, or provided on websites. Witness git, wikis, code reviews, testing, linting, code coverage analysis. The ease with which other people’s best practices can be adopted with code is *insane* compared to other fields.

    * You might be surprised to know that many businesses can be bad at implementing best practice…

    * I think your focus on getting the academic research into education, might be slightly misplaced. It has a whiff of paternalism to it. The problem rather strikes me as academic research completely failing to interact with a field of endeavor with a *very* active communication process.

    • Wonderful points! Thanks for sharing your thoughts.

      I think one of the most important of your points is the importance of the massive, global conversation amongst developers online. I don’t know any academics who dismiss this conversation as unimportant or irrelevant. They see it as essential to reflective practice and, in many ways, the origin of many (but not all) of the ideas in software engineering.

      In interesting ways, the tension between research and practice isn’t about differences in questions, it’s about differences in methods, and ultimately epistemology (how we decide we know something to be true). In all of my experiences in industry, the tolerance for uncertainty is appropriately high: there are decisions to make, software to build, people to train, people to fire. There’s no time to wait for certainty. In (the idealized) academia, tolerance for uncertainty is quite low. We will spend decades developing confidence in some claim before we’ll declare it true. In my view, both of these different tolerances are critical to practice: one to ensure movement, the other to ensure progress. It’s when academia tries to be applied and relevant that it fails, and when industry tries to declare truth that it fails.

      Education is where we teach the biggest of ideas, the truest of things. It’s not my ideas that are paternalistic, but education itself. How could it not be? Students consent to having their thoughts shaped by others. It’s a benevolent, consensual form of brainwashing 🙂

  2. Maybe we should follow a model close to the one used by medicine. One group develops a “drug” (in our case, a tool, a framework, a method, an approach) and publishes a paper, using students or professional software developers as volunteers. Then another research group evaluates the “drug” in real life. Only after this second phase could a “drug” be “released to the public” (i.e., disseminated to real software developers).

  3. “While we’ve managed to incorporate much of the best evidence into our tools and practices, most of what I present just isn’t relevant, isn’t believed, or isn’t ready. I’ve demoed exciting tools from research, but my team has found them mostly useless, since they aren’t production ready. I’ve referred to empirical studies that strongly suggest the adoption of particular practices”

    Please provide details in 1 or more articles of the above

    • Happily:

      We’ve adopted the best evidence-based practices from agile such as pair programming (Williams, L., Kessler, R. R., Cunningham, W., & Jeffries, R. (2000). Strengthening the case for pair programming. IEEE Software, (4), 19-25.)

      I’ve discussed my work on software engineering expertise to inform our hiring and retention practices. (Li, P. L., Ko, A. J., & Zhu, J. (2015, May). What makes a great software engineer?. In Proceedings of the 37th International Conference on Software Engineering-Volume 1 (pp. 700-710). IEEE Press. http://faculty.washington.edu/ajko/papers/Li2015GreatEngineers.pdf)

      We rely heavily on static analysis and code coverage tools based on decades of software engineering research to find functional defects, vulnerabilities, and style inconsistencies (e.g., Ayewah, N., Hovemeyer, D., Morgenthaler, J. D., Penix, J., & Pugh, W. (2008). Using static analysis to find bugs. IEEE Software, 25(5), 22-29).

      We invest heavily in carefully planned remote work, based on evidence that it does not lead to substantial differences in defect rates (Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista, Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall and Brendan Murphy, International Conference on Software Engineering).

      Those are just a few examples.

  4. This is a very interesting idea. As a software developer, I’d love to be exposed to some of the current literature on software engineering. Do you have any recommendations for recent papers and/or research where a practitioner might get started?

  5. Nice post. I agree with pretty much everything you say.

    I’m often distressed about the quality of the empirical studies I see in software engineering. First, they so often ignore context, and context is everything in software as there are so many domain/organizational/product variables.

    Second, they are often defined and carried out by grad students who ‘study’ the engineers from the outside. It would seem to me to be a much better model (both academically, and for the organizations involved) for the grad students to work as part of the team. They’d understand context better, get some real development skills and experience, and I hope design and conduct better studies!

    And of course, so many empirical studies produce conflicting results. I wonder why 🙂

  6. A major aspect of this problem is the general volatility of the technical side of software engineering. Software engineers are so often working against the upstream current of change related to the languages, frameworks and tools that time to understand the movement of software engineering practices (pragmatic ones at that) is close to nil. Just consider all that is happening in the container and dependency management spaces – architectural requirements are driving things way faster than things can be researched.

    I sort of blame “agile” and “Agile” for this too. Not that there wasn’t or hasn’t been great upside to the pragmatic practices coming from there – but many software developing organizations still haven’t really figured out how to go from there. Or worse, are still struggling with agile and delivering (in a well understood operations environment).

    I do think the tools landscape is getting better for software engineering – wholly attributable to the commercial space – not academia. So it is interesting to consider why there aren’t Software Engineering PhD programs – at least not many. Or are those problem areas (research ideas) better suited for commercial opportunities?

    • To be clear, there are many software engineering labs in many Computer Science departments (I’m part of one: UW PLSE). Also, many of the commercial tools, especially the SaaS applications out there for things like devops and continuous integration, are based on long term open source projects that leverage much of the software engineering research done in the 80’s and 90’s. It’s not that software engineering research has no impact, but that it’s sparse and slow. I believe this is because academics (like myself), don’t spend a lot of time disseminating our results.

  7. I’ve seen this so often: some academic who thinks they know everything about something, but lacks real-world practice. Those engineers probably have good reasons to behave like they do, and you likely were never interested in the fact that they solve problems all the time but make less fuss about it. For them a problem is just to be fixed; it’s not their thesis. Engineers are high-level workers, while academics are high-level dreamers. There is not much wrong with that, because an academic mostly lives from patents, and often not for a company’s customers. Have you ever been in the middle of the night solving things at a customer site? Engineers do this lots of times.
    I think it’s important for an academic to stay in the dream area, i.e. come up with improvements, e.g. new solar cell types, new programming languages, new chips.
    But don’t get into the lane of engineers; you can make tools for engineers.
    But if your tools are unusable take a helicopter view upon what’s your goal.

  8. My one cent,

    I would also add that it is rare that students get to work on large collaborative projects in undergraduate or graduate school. It is an experience which you only get at companies during internships. So it is almost impossible to understand and know the best practices in software engineering outside of academia.

  9. You can even go further and indict much of “CS Systems Research” as not that relevant to practice. Academics often focus on topics that they can write papers about, not whether their results affect the outside world.

    For example there have been tens of thousands of papers on functional languages, type theory etc. and very little of it is much use in practice since programming without state seems to miss something essential!

    If you really look at where innovations come from – it almost always begins with the practitioners who are situated to see the upcoming problems first.

    The greatest problem of real-world practitioners: how to deal with the burgeoning complexity of systems gets little academic attention. It even gets little economic attention since SW tools have been declared unprofitable and not-a-fit market for new companies to enter (just use free tools). See John Maynard Keynes talking about problems like these as “obvious” failures of capitalism.

    If someone could figure out a way to make SW tools profitable and a source for new companies, then we might see things develop differently.

    • > For example there have been tens of thousands of papers on functional languages, type theory etc. and very little of it is much use in practice since programming without state seems to miss something essential!

      Straw man alert!

  10. There was a paper posted on Hacker News a while back by Dijkstra. He argued that computer science is fundamentally different from other sciences. In short, we’re learning as we go, and only after something has been established do we sit down and formalize rules around it.

    In electrical engineering, the physical properties of copper are known, and to not study them when trying to build something means ruin. By contrast, what are the properties of software written to be consumed over the web, that is highly scalable, available, and replicable, at the scale of Twitter? It’s a brand new problem.

    The problem I see is, how does academia keep up and be a relevant force in the ecosystem of software development?

  11. Hi,

    My analysis of software engineering (vs. computer science) in academics is that it is largely solving problems irrelevant to actual software development. Let me hit a few high points:

    * Pressman et al “Software Engineering” was largely pointless as a textbook; it helped me hit a couple points in internship interviews. My understanding at the time (2005) was that Pressman was the leading text. *Waste* of time. Tons of software development models, none relating to any team I’ve ever been on – or heard about being on.

    * Bugs don’t matter much in practice outside of specialized fields. Any academic work towards minimizing bugs is going to founder on the rock of disinterest. This blog post (http://www.drmaciver.com/2015/10/the-economics-of-software-correctness/) goes into the situation. Depressing, but, true. Verification, debugging, etc, those are all subjects that are largely “meh”. As an example, research-grade debuggers from the 80s had features that Visual Studio & IntelliJ still don’t have. Le sigh. Verification? That requires math, and math is hard.

    * The academic software engineering / reengineering tools tend towards the “nifty trick regarding code base” variety; they don’t seem to have meaningful value. Aggravating to hear, I’m sure, and I’m sorry, but I can’t say otherwise at present. 🙁

    * software engineers are hostile to academics as a rule. Anecdotes always win over data. As someone with a MS in computer science… this grinds my gears, constantly.

    * software engineering isn’t computer science, it’s social science around the management of creating unique artifacts.

    If you want to grab coffee and talk more, send me an email; I live near UW and work near Pioneer Square.

    Regards,
    Paul

  12. I think this is an issue with the low status of software engineers. If my boss discourages me from running unit tests it doesn’t matter if I can prove that they are effective in preventing bugs. If the executives decide that users can live with an ineffective user experience/interface, then it doesn’t matter if I find 10 papers that tell me that font sizes should be based on user preference or that the flow of an application should work a certain way. More painfully, using a modeling language like Alloy or TLA+ rarely happens so all that research that uses those and explores them might as well be defunded since it’s not relevant to industry 🙁

    If we were treated as professionals with a higher status, the way a doctor, lawyer, or professional engineer would be treated, then the research could be more directly applied.

    I agree that we need more social science related research in the field; however, I think the ACM is doing a good job providing the journals and conferences for that kind of research. There are whole conferences and magazines covering effective computer science teaching methods, for example.

  13. I think you’re right that the hardest problems in software engineering are people problems, which haven’t traditionally been seen as research, but I think 2015 is a great time to bring hard analysis techniques to soft skills.

    For example, heart rate variability is known to predict episodes of stress. Thanks to Fitbit, Apple Watch, and others, we can now measure HRV continuously. https://en.wikipedia.org/wiki/Heart_rate_variability

    Let’s say you strapped Fitbits to an entire team of software developers, correlated HRV to their GitHub log, bash history, PagerDuty outages, and Slack channels—could you identify the major causes of stress? Likely communication breakdowns? How does a person’s emotional state correspond to their rate of bugs? Can you tell when a team has “gelled”? Is C++ more stressful than Javascript, or the other way around?

    I think if the software engineering field had rigorously proven ways to lower a team’s stress and make people happier and more productive, a lot of engineering managers’ ears might perk up. At least, mine sure would!

    • There’s an increasing amount of work on the people problems and human factors of software engineering. I do a lot of it; so do several researchers at North Carolina State University. There are lots of great little labs developing their capacity to study what actually happens in software engineering and invent tools, methods, and processes that have real impact on software engineering.

  14. This is part of why I think it’s so important for software engineering researchers to have actual development experience in industry — not just internships in research labs, but actual time working on a development team. It’s really the only way to find out what real software development is like, and I don’t see a better way to ground one’s research in the problems of the real world.

    • I couldn’t agree more. Even as someone who studies actual software development pretty closely, working as a CTO/product manager/developer for the past three years has taught me a lot of things, particularly about the dominance of individuals and personality in process and decision making.
