Wordsmithing the State of the Union

George Washington's first state of the union address, manuscript notes, January 1790

Wikimedia Commons

(January 20, 2015)

How likely is President Barack Obama to invoke the concepts “freedom,” “health,” or “budget” in his 2015 State of the Union address?

Some data-crunching by an NEH grantee gives us a clue. Ben Schmidt is a historian at Northeastern University and a member of the HathiTrust + Bookworm Project, an NEH grant-funded effort to apply big data visual analytics to the 3.9 billion pages of digitized materials within the HathiTrust Digital Library. He applied the Bookworm text analysis tool to the 224 State of the Union addresses delivered since George Washington to see how the language employed by the presidents reflects historical concerns and circumstances.

The results, published in The Atlantic, show that presidential references to the idea of “freedom” didn’t take off as a rhetorical staple until F.D.R.’s 1941 State of the Union address, and that since then the word has most frequently been employed by Republican orators. By contrast, in the first century of the Republic, it was far more common to hear presidents wax eloquent about the concept of “public.” Usage of that word in the State of the Union address peaked with Martin Van Buren but largely fell out of use among 20th-century presidents. Schmidt suggests the shift might be explained by the fact that presidents now commonly address the public directly, instead of exhorting Congress on the public’s behalf.

You can compare the language used by various presidential administrations using the interactive chart published online at The Atlantic, and also explore an interactive visual representation of the 1,410 different places in the world referenced by presidents in State of the Union addresses. Both were produced using the Bookworm digital analysis tool.

Bookworm, an open-source text analysis and visualization tool, enables custom analysis of language usage trends across massive repositories of digitized texts. It has also been applied in the past to Chronicling America, the online database of historic American newspapers supported by the National Endowment for the Humanities and the Library of Congress.

Media Contacts:
Paula Wasley:

Funding information

The HathiTrust + Bookworm Project is supported by a 2014 NEH Digital Humanities Implementation Grant. Directed by J. Stephen Downie at the University of Illinois at Urbana-Champaign, in collaboration with partners at Indiana University, Northeastern University, and Baylor College of Medicine, the project seeks to enhance the Bookworm analytical tool by applying it to the 3.9 billion pages of digitized materials held in the HathiTrust Digital Library. The project will provide computational access to the HathiTrust corpus and enable open-source improvements to Bookworm code to increase the tool's utility for other large text projects.