AstroTRENDS: Weasel words

Weasel image credit: Cliff

I added a bunch of new keywords to AstroTRENDS, mostly suggested by friends and people in the community who had read my Facebook post.

A thought I had yesterday: has the astronomical literature become more speculative, and perhaps less committed to audacious claims, in recent times? This hypothesis is difficult to test by merely querying ADS for abstract keywords. A natural-language processing analysis of the full text would probably serve it better, although this is just my uninformed speculation.

A much simpler way is to search for so-called “weasel words” (such a funny way of describing them, from a non-native speaker's point of view!). Matthew Might (a CS professor at the University of Utah) has a really interesting article about the different abuses of language that are common among technical writers, and he created some automated tools for detecting them. It’s a great read. (There’s even an Emacs minor mode called writegood based on his recommendations, which I will be testing for sure.) Although I don’t necessarily agree with strict adherence to all of his points, there are certainly some great pieces of advice there.

Taking his post as a reference, I added a new “weasel words” pseudo-keyword to AstroTRENDS. The “weasel words” keyword shows the result of an ADS query of refereed abstracts containing the following boolean expression:

Could OR Possibly OR Might OR Maybe OR Perhaps OR Quite OR Fairly OR Various OR Very OR Several OR Exceedingly OR Vastly OR Interestingly OR Surprisingly OR Remarkably OR Clearly OR Significantly OR Substantially OR Relatively OR Completely OR Extremely
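As a rough illustration (not the code behind AstroTRENDS), the expression above can be assembled from a plain word list, and the same list can be used to test a single abstract locally. The names `WEASEL_WORDS`, `build_or_query` and `contains_weasel` are made up for this sketch, and the whole-word, case-insensitive matching is an assumption; the way ADS actually matches words may differ.

```python
import re

# Words copied from the boolean expression above.
WEASEL_WORDS = [
    "Could", "Possibly", "Might", "Maybe", "Perhaps", "Quite", "Fairly",
    "Various", "Very", "Several", "Exceedingly", "Vastly", "Interestingly",
    "Surprisingly", "Remarkably", "Clearly", "Significantly",
    "Substantially", "Relatively", "Completely", "Extremely",
]


def build_or_query(words):
    """Join the words into a single boolean OR expression."""
    return " OR ".join(words)


# Assumption: whole-word, case-insensitive matching on the abstract text.
_WEASEL_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in WEASEL_WORDS) + r")\b",
    re.IGNORECASE,
)


def contains_weasel(abstract):
    """Return True if the abstract contains at least one listed word."""
    return _WEASEL_RE.search(abstract) is not None


query = build_or_query(WEASEL_WORDS)
print(query)
```

The word boundaries keep "Might" from matching inside a longer word such as "mighty".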

We can easily disagree on whether using these words in an abstract constitutes “weaseling”, or has any sort of nefarious purpose (I certainly pepper my writing with more than my fair share of those). It is still an interesting exercise to verify whether usage of those words has increased over time. The following plot shows the fraction of articles containing those words (i.e. number of articles containing the words normalized by the total article count) each year.
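The normalization described above is just a per-year division. A minimal sketch, assuming the yearly counts are already available as two dictionaries keyed by year (the function name `yearly_fraction` and the input layout are assumptions for illustration, not how AstroTRENDS stores its data):

```python
def yearly_fraction(matching, total):
    """Fraction of articles per year whose abstract matched the query.

    matching: dict mapping year -> number of matching articles
    total:    dict mapping year -> total number of articles
    Years with no articles at all are skipped to avoid dividing by zero.
    """
    fractions = {}
    for year, n_total in sorted(total.items()):
        if n_total > 0:
            fractions[year] = matching.get(year, 0) / n_total
    return fractions
```

Dividing by the yearly total is what makes different years comparable, since the raw number of published articles also changes with time.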

[Figure: fraction of articles whose abstracts contain the weasel words, by year]

Keeping all the caveats above in mind, there is a definite upward trend that looks roughly linear by eye. I’m not sure whether it has to do with simple evolution of language and style, less boastful writing, an accident of fate/bug on my part, or some other factor.
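One way to go beyond "linear by eye" would be an ordinary least-squares fit of the yearly fraction against the year. The sketch below is a hypothetical helper (`fit_line` is not part of AstroTRENDS) written in plain Python; it says nothing about what the real data would give.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Here `xs` would be the years and `ys` the corresponding fractions; a positive slope corresponds to the upward trend, and the residuals around the fitted line would show how linear it really is.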

This is of course a super-shallow analysis, and the question deserves far more insight than I have offered in this post, but it’s still intriguing. I tried to altavista whether this is well known, but I have come up empty-handed so far. Any ideas?

You can play with the interactive plot itself by clicking this link.

UPDATE: Ben Weiner made a really good point on the Facebook astronomer group. He suggests that an alternative explanation could simply be that abstracts have become, on average, more verbose with time, which would explain the higher frequency of fluffy adjectives and adverbs. This could be checked with a control set of non-weasel words… which I will definitely try.
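A sketch of how the control-set check could look, assuming the yearly fraction has been computed once for the weasel words and once for a control list of neutral words (the choice of control words is left open here, and `ratio_to_control` is a hypothetical name):

```python
def ratio_to_control(weasel_fraction, control_fraction):
    """Per-year ratio of the weasel-word fraction to the control fraction.

    Both arguments map year -> fraction of articles matching that word set.
    Years with a missing or zero control fraction are skipped.
    """
    ratios = {}
    for year in sorted(weasel_fraction):
        control = control_fraction.get(year)
        if control:
            ratios[year] = weasel_fraction[year] / control
    return ratios
```

If the ratio stays roughly flat over the years, growing verbosity alone could account for the trend; if it rises, the weasel words are becoming more common faster than the control words.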


How did this post do with writegood-mode? Pretty nicely… but I got a grade of “11” on Hemingway, with about 9 out of 24 sentences being hard to read. Oh well.

4 thoughts on “AstroTRENDS: Weasel words”

    1. Hey Michael,

      I love ADS2 and have been using it for a bit, though I did not try full-text search. For this data I only did the very lightest keyword searching, using ADS Classic. An API like the one offered by arXiv would be really awesome!
