2022
-
The One About Tools
Remember when we would get all up in arms about whether or not you needed to code in order to participate in DH? Here’s my take: No.* No, because all of our interactions with computing machines are mediated - you are not closer to the “truth” by being able to work in a programming language. That said, 1. you do need to engage with the computational processes (I’m over the “I’ve had an idea for a website / database / tool; who will build it for me?”), and 2. eventually the easy-to-use tools will fail you or won’t do what you want, and then you’ll be learning code. So participate with the easy-to-use tools, because eventually, if you stick with it, the fun questions will require you to learn more.
But I didn’t start there. It has always been important to me to learn the code. This has been driven in part by necessity (in 2011, there were few tools that were not either prohibitively expensive or overly simplistic) and in part by a desire to understand, as far as I could, the underlying process. Code has been my route into computational thinking, into learning how to split large problems into discrete, tractable, iterable chunks. Has it taken a long time to gain momentum this way? Yes. Do I regret it? Not for a minute. And I want others to have the opportunity to experience learning how to think through their ideas and their research practices computationally.
-
The Tie that Binds for HistoInformatics2021
On September 30th, I participated in the HistoInformatics2021 Workshop, connected to the ACM/IEEE-CS Joint Conference on Digital Libraries. The workshop focused on a range of issues around data and informatics in historical research, and I am very grateful for the opportunity to learn from the other papers and for the questions and feedback I received.
My paper focused on ways historians might use Katherine Bode’s model of Scholarly Editions of Literary Systems for computational text analysis in historical research.
The formal paper is published at http://ceur-ws.org/Vol-2981/.
-
Omeka for the Student Scholar
With software such as WordPress, Wix, and Weebly, it is relatively easy to create a website and publish content on the web. These tools help businesses and individuals draw attention to their products, services, and ideas, without having to know too much about the technologies that make up the internet.
Omeka is another software program for creating web content, but one built by historians for historical content. It is used to build sites such as the Humboldt Redwoods Project and the Colored Conventions Project. These sites feature rich collections of images and documents, provide information about those items, and place those items within their historical contexts.
-
Reviving the Blog
The end of 2020 is as good a time as any to finally turn my attention back to this blog and try to revive the practice of recording work in progress. Since 2018, I have taught a summer DH course at UC Berkeley, finished and defended the dissertation, birthed twin boys, endured two years (and counting) of sleep deprivation, moved from Portland, Ore., to Tuscaloosa, Ala., and completed my first semester of teaching at the University of Alabama. Also in there: a couple of conference presentations, two chapters in edited collections in the works, and a couple of short articles. It has been a whirlwind.
-
Using pyLDAvis with Mallet
Update: I think I have finally figured out the trick to smoothing for the document-topic matrix - really, I was making things more difficult than I needed to. Because I optimized the topic distribution, the alpha value IS the list of values, which can be added to the matrix of document-topic counts without any additional trouble or manipulation. I’ve updated the code and the visualization below to reflect this change. As a result, there is more variation in topic size (which makes sense) and the numbers match the doc-topics output from MALLET (which I should have been using to verify my results all along).
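For anyone following along with the code, here is a minimal sketch of that smoothing step, assuming the doc-topics counts from MALLET are already in a NumPy array and that alpha holds the optimized per-topic values; the variable names and the example numbers are mine, not the notebook’s:

```python
import numpy as np

# Hypothetical inputs: a documents-by-topics matrix of raw topic counts
# (as in MALLET's doc-topics output) and the per-topic alpha values
# produced by hyperparameter optimization (one value per topic).
doc_topic_counts = np.array([[12., 0., 3.],
                             [1., 8., 2.]])
alpha = np.array([0.05, 0.12, 0.03])

# Because the optimized alpha is already a list of per-topic values, the
# smoothing is simple element-wise addition, broadcast across each row.
smoothed = doc_topic_counts + alpha

# Normalize each row so every document gets a proper topic distribution,
# the form pyLDAvis expects for its doc_topic_dists argument.
doc_topic_dists = smoothed / smoothed.sum(axis=1, keepdims=True)
```
-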
Ways to Compute Topics over Time, Part 4
This is part of a series of technical essays documenting the computational analysis that undergirds my dissertation, A Gospel of Health and Salvation. For an overview of the dissertation project, you can read the current project description at jeriwieringa.com. You can access the Jupyter notebooks on Github.
This is the last in a series of posts which constitute a “lit review” of sorts, documenting the range of methods scholars are using to compute the distribution of topics over time. The strategies I am considering are:
- Average of topic weights per year (First Post)
- Smoothing or regression analysis (Second Post)
- Prevalence of the top topic per year (Third Post)
- Proportion of total weights per year (Final Post)
To explore a range of strategies for computing and visualizing topics over time from a standard LDA model, I am using a model I created from my dissertation materials. You can download the files needed to follow along from https://www.dropbox.com/s/9uf6kzkm1t12v6x/2017-06-21.zip?dl=0.
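As a rough illustration of how two of these strategies differ, here is a short pandas sketch; the long-format dataframe (one row per document-topic pair, with year, topic, and weight columns) is an assumed layout for the example, not the structure used in the notebooks:

```python
import pandas as pd

# Hypothetical long-format topic weights: one row per document-topic pair.
df = pd.DataFrame({
    "year":   [1890, 1890, 1891, 1891],
    "topic":  [0, 1, 0, 1],
    "weight": [0.7, 0.3, 0.4, 0.6],
})

# First strategy: average of topic weights per year.
avg_per_year = df.groupby(["year", "topic"])["weight"].mean().unstack()

# Final strategy: each topic's share of the total weight assigned in a year.
totals = df.groupby(["year", "topic"])["weight"].sum().unstack()
proportion_per_year = totals.div(totals.sum(axis=1), axis=0)
```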
-
Ways to Compute Topics over Time, Part 3
This is part of a series of technical essays documenting the computational analysis that undergirds my dissertation, A Gospel of Health and Salvation. For an overview of the dissertation project, you can read the current project description at jeriwieringa.com. You can access the Jupyter notebooks on Github.
-
Ways to Compute Topics over Time, Part 2
This is part of a series of technical essays documenting the computational analysis that undergirds my dissertation, A Gospel of Health and Salvation. For an overview of the dissertation project, you can read the current project description at jeriwieringa.com. You can access the Jupyter notebooks on Github.
-
Ways to Compute Topics over Time, Part 1
This is part of a series of technical essays documenting the computational analysis that undergirds my dissertation, A Gospel of Health and Salvation. For an overview of the dissertation project, you can read the current project description at jeriwieringa.com. You can access the Jupyter notebooks on Github.
-
Know Your Sources (Part 2)
This is part of a series of technical essays documenting the computational analysis that undergirds my dissertation, A Gospel of Health and Salvation. For an overview of the dissertation project, you can read the current project description at jeriwieringa.com. You can access the Jupyter notebooks on Github.