Wednesday, August 9, 2017

On Contempt Culture, a Reply to Aurynn Shaw

I saw an interesting presentation on contempt culture by Aurynn Shaw, delivered this year at PyCon; a recording was shared on LinkedIn.  I worked with Aurynn on projects back when she worked for Command Prompt.  You can watch the video below:



Unfortunately, comments on a social media network are not sufficient for discussing nuance, so I decided to put this blog post together.  In my view she is very right about a lot of things, but there are some major areas where I disagree, and I wanted to explain what I see as an alternative to what she rightly condemns.

To start out, I think she is very much right that there often exists a sort of tribalism in tech, with people condemning each other's tools, whether it be Perl vs PHP (her example) or vi vs emacs, and I think that can be harmful.  The comments here are aimed at fostering the sort of inclusive and nuanced conversation that is needed.

The Basic Problem

Every programming culture has norms, and groups outside those norms tend to be condemned in some way or another.  There are a number of reasons for this.  One is competition and another is seeking approval in one's in-group.  I think one could take her points further and argue that in part it is an effort to improve the standing of one's group relative to others around it.

Probably the best example we can come up with in the PostgreSQL world is the way MySQL is looked at.  A typical attitude is that everyone should be using PostgreSQL and therefore people choosing MySQL are optimising for the wrong things.

But where I would start to break with Aurynn's analysis would be when we contrast how we look at MySQL with how we look at Oracle.  Oracle, too, has some major oversights (empty string being null if it is a varchar, no transactional DDL, etc).  Almost all of us may dislike the software and the company.  But people who work with Oracle still have prestige.  So bashing tools isn't quite the same thing as bashing the people who use them.  Part of it, no doubt, is that Oracle is more established, is an older player in the market, and therefore there is a natural degree of prestige that comes from working with the product.  But the question I have is what can we learn from that?

Some time ago, I wrote the most popular blog post in the history of this blog.  It was a look at the differences in design between MySQL and PostgreSQL and was syndicated on DZone, featured on Hacker News, and otherwise reached a fairly large audience.  In general, aside from a couple of historical errors, the PostgreSQL-using audience loved the piece.  What surprised me, though, was that the MySQL users also loved the piece.  In fact one comment that appeared (I think on Reddit) said that I had expressed why MySQL was better.

The positive outpouring from MySQL users, I think, came from the fact that I sympathetically looked at what MySQL was designed to do and what market it was designed for (applications that effectively own the database), describing how some things I considered misfeatures actually could be useful in that environment, but also being brutally honest about the tradeoffs.

Applying This to Programming Language Debates

Before I start discussing this topic, it is worth a quick tour of my experience as a software developer.

The first language I ever worked with was BASIC on a C64.  I then dabbled in Logo and some other languages, but the first language I taught myself professionally was PHP.  From there I taught myself some very basic Perl, Python, and C.  For a few years I worked with PHP and bash scripting, only to fall into doing Perl development by accident.  I also became mildly proficient in JavaScript.

My PostgreSQL experience grew out of my Perl experience.  About 3 years ago I was asked to start teaching Python courses, and I rose to this challenge.  Around the same time, I had a small project where we used Java.  I quickly found myself teaching Java as well, and I now feel moderately capable in that language.  I am now teaching myself Haskell (something I think I could not have done before really mastering Python).  So I have worked with a lot of languages, and I can pick up new ones with ease.  Part of the reason is that I generally seek to understand a language as a product of its own history and the need it was intended to address.

As we all know, different programming languages are associated with stereotypes.  I would argue that stereotypes are usually imperfect understandings that out-group people have of in-group dynamics, so dismissing stereotypes is often as bad as simply accepting them.

PHP as a Case Study, Compared to C

I would like to start with an example of PHP, since this is the one specifically addressed in the talk and it is a language I have some significant experience writing software in.

PHP often is seen to be insecure because it is easy to write insecure software in the language.  Of course it is easy to write insecure software in any language, but certain vulnerabilities are a particular problem in PHP due to lexical structure and (sometimes) standard library issues.

Lexically, the big issue with PHP is the fact that the language is designed to be a preprocessor for SGML files (and in fact the name now stands for PHP: Hypertext Preprocessor).  For this reason, PHP is easy to embed in SGML processing instruction (PI) tags, so you can write a PHP template as a piece of valid HTML.  This is a great feature, but it makes cross-site scripting particularly easy to overlook, because output is written straight into the markup unless the programmer remembers to escape it.  A lot of the standard library in the 1990s also had really odd behaviour, though much of this has been corrected.
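To make the point concrete, here is a minimal sketch of the pattern.  It is written in Python rather than PHP purely for illustration, and the function names (render_unsafe, render_safe) and the sample payload are hypothetical, not taken from any real application.  The only thing it is meant to show is how small the visible difference is between output that is vulnerable and output that is not.

```python
import html


def render_unsafe(comment: str) -> str:
    # Hypothetical template: the user-supplied value is dropped straight
    # into the markup, much as a template might echo a request value.
    return "<p>" + comment + "</p>"


def render_safe(comment: str) -> str:
    # Same template, but the value is HTML-escaped first, so any markup
    # in it is shown as text instead of being interpreted by the browser.
    return "<p>" + html.escape(comment) + "</p>"


# Assumed example input: a comment that carries a script tag.
payload = "<script>alert(1)</script>"

print(render_unsafe(payload))  # the script tag reaches the page intact
print(render_safe(payload))    # the script tag is neutralised
```

The two functions differ by one call, and both produce a page that looks fine with ordinary input, which is why this kind of mistake is so easy to miss in review.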

Aurynn is right to point to the fact that these were exacerbated by a flood of new programmers during the rise of PHP, but one thing she does not discuss in the talk is how software and internet security were also changing during that time.  In essence, the late 1990s saw the rise of SSH (20k users in 1995 to over 2M in 2000), the end of transmission of passwords across the internet in plain text, the rise of concern about SQL injection and XSS, and so forth.  PHP's basic features were in place just before this really got going.  Add to this a new developer community, and you have a recipe for security problems.  Of course, PHP has since outgrown a lot of this, and PHP developers today have codified best practices to deal with a lot of the current threats.

If we contrast this with C as a programming language, C has even more glaring lexical issues regarding security, from double free bug possibilities to buffer overruns.  C, however, is a very unforgiving language, and consequently it doesn't tend to be a language that has a large novice developer community.  At the same time, a whole lot of security issues come out of software written in C.

Conclusions

There is no such thing as a perfect tool (database, programming language, etc).  As we grow as professionals, part of that process is learning to better use the strengths of the technologies we work with and part of it is learning to overcome the oversights and problems of the tools as well.

Nor does it follow that a programmer who primarily uses a tool with real oversights has poor judgment.  Rather, this process of learning can have the opposite impact.  C programmers tend to be very knowledgeable because they have to be.  The same is true for JavaScript programmers, for very different reasons.  And one doesn't have to validate all language design decisions in order to respect others.

Instead of attacking developers of other languages, my recommendation is, when you see a problem, to point it out neutrally and respectfully, not from a position of superiority but from one of respectful assistance.  It also helps to understand that what may seem like poor decisions in the design of a language may in fact have real benefits in some cases.

For example, Java as a language encourages mediocrity of code. It is very easy to become a mediocre Java developer.  But once you understand Java as a language, this becomes a feature because it means that the barrier to understanding and debugging (and hence maintaining!) code is reduced, and once you understand that you can put emphasis instead on design and tooling.    This, of course, also has costs since it is easy for legacy patterns to emerge in the tooling (JavaBeans for example) but it allows some really amazing frameworks, such as Spring.

At the other extreme, JavaScript is a language characterised by shortcuts taken during the initial design stage (because of time constraints).  Some of those cause real problems, but others make hard things possible.  JavaScript also makes it very easy to be a bad JavaScript programmer.  But perhaps for this reason I have found that professional JavaScript programmers tend to be extremely knowledgeable and have had to work very hard to master software development in the language, and they usually bring to the table great insights into computing problems generally.

So what I would recommend that people take away is the idea that in fact we do grow out of hardship, and that problems in tools are overcome over time.  So for that reason discussing real shortcomings of tools while at the same time respecting communities and their ability to grow and overcome problems is important.

5 comments:

  1. Any thoughts on the contempt culture that exists between DBAs and developers?

    1. See my comment below for a general overview. However having thought about it overnight now, I have a more specific response.

      I am ruthlessly pragmatic when it comes to decisions of software and systems engineering. I am also very conservative when it comes to design -- believing it is better to stick with what has worked rather than try out the new and the cool.

      One of the things that is often missed I think is the fact that databases fill a bunch of different roles and it is easy, once you have a hammer, to try to use the same hammer everywhere. An ERP system is a classical example of where a single database with multiple apps makes most sense (though app developers try hard to make this not the case) but commercial software often relies on the idea that access through software means commercial control, license fees, and all of that. Similarly there are cases where you *do* want to force things to go through an application for data validation reasons (even if that application is a series of stored procedures).

      I think to a large extent two things are required. The first is to look at the problems and solutions. The second is to understand that we have a bunch of different approaches and limits, and often part of the problem is that application developers don't really understand what the database can and cannot do for them (truth be told, sometimes database administrators get this wrong too).

      I think that, like a lot of political problems, an important aspect is making sure that you actually get everyone together and talking about new features etc and how these will impact various groups. And one aspect is that if the database offers services to the application, one thing we as database administrators need to do is see what we can do to help.

  2. That's an interesting question, actually. And I wonder how much of that drives people to try things like Mongo where most of your data management problems are handed back up to the developers. But I have seen very little in this area that is more striking than the contempt between "application owns the database" folks and "database offers applications an information service." And I think that also drives a lot of the arguments between MySQL and PostgreSQL, as well as between db people and, say, ddd people.

  3. -- And I wonder how much of that drives people to try things like Mongo where most of your data management problems are handed back up to the developers.

    well. in my experience it's not "handed back" but ripped out of relational control by COBOL/VSAM coders, often working in more recent languages, who seek to perpetuate the smart-code/dumb-data protocol of 1955.

    1. I have my doubts as to whether anyone can effectively use map-reduce or aggregation functionalities of Mongo without having some pretty good relational experience first. But one is still defining a pipeline in Javascript that gets run in the MongoDB instances so it isn't really smart code/dumb data in the same way.
