Critical Computing & Cultural Data: Building Principles Through Published Research

By now you have created your initial datasets and started to consider how they might be augmented at scale. But before you dive deeper into computational methods for your final project, it’s essential to understand the broader landscape of how scholars use computation with cultural data and to develop a critical eye for when and why computation is actually worthwhile.

In this group assignment, you will research published peer-reviewed work that applies computational methods to cultural data in your area of focus. Your goal is not just to understand what others have done, but to evaluate it critically: Does this computational method actually capture the cultural complexity of the data, or does it flatten it? Is computation being used to make a meaningful argument, or is it just generating output? When is augmentation through computation worthwhile, and when is it unnecessary or even harmful?

The goal for this group assignment is to help you start thinking through the next steps for the semester-long project and start considering how your group will develop a set of collective principles for the final project.

Individual Component: Deep-Dive Into Computational Methods in Published Research

Each group member will be responsible for finding one peer-reviewed article that applies computational methods to data in your cultural area of focus. Your goal is to understand the method and critically evaluate how well it serves the research. Each person should read their article and form their own assessment before group discussion.

Finding Your Article

Your article should meet the following criteria:

Peer-reviewed and published in a reputable scholarly venue (journal, conference, or edited volume)
Published in the last 15 years (2011–2026)
Includes publicly available code (on GitHub, supplementary materials, or a repository like Papers with Code)
Applies computational methods to cultural data relevant to your group’s focus (music, literature, social media, archives, visual culture, etc.). The methods could be for augmenting the data (e.g., cleaning, structuring, enriching) or for analyzing the data (e.g., finding patterns, making arguments).

Pay attention to the disciplinary orientation of the work—is it Digital Humanities? Computational Social Science? Information Science? Critical Data Studies? The discipline shapes how computation is framed and valued.

Where to Search

You can use the following resources to find articles, but feel free to explore other venues as well:

Papers with Code — Great for recent work with code
Google Scholar — Search for your topic + computational methods
Journal of Cultural Analytics — Central venue for computational cultural analysis
Journal of Digital History — Historical work using computation
Journal of Computational Literary Studies — Literary data and text analysis
Reviews in DH — Peer-reviewed digital humanities work

If you use large aggregators like Google Scholar or Semantic Scholar to find articles, make sure that what you are looking at is actually peer-reviewed and has code publicly available. It is acceptable if the article does not release its data, but it must release its code.

Article Summary

Include: Author(s), title, publication venue, year, DOI/URL, and link to code repository.

Part 1: AI Summary (1–2 paragraphs)

Use an AI tool (ChatGPT, Claude, etc.) to generate a 1–2 paragraph summary of your article. Paste the AI-generated summary here along with the exact prompt you used to generate it; it will serve as your baseline. This is not meant to replace your reading, but to give you a starting point for your critical assessment. You will critique this AI summary in Part 3.

Part 2: Your Critical Assessment (1–2 paragraphs)

Now read your article carefully and write your own assessment that addresses these key questions:

What is the Data?

What cultural data is being analyzed? (e.g., poems, tweets, film scripts, museum catalogs)
Where does it come from? How was it collected, gathered, or created? What is the scale?
What might be missing? What aspects of the cultural phenomenon does the data fail to capture?
How well does the data represent the cultural phenomena? Does it capture complexity or flatten important dimensions?

How is Computation Used and Why?

What computational methods are being used? Focus on what the method does conceptually.
What is the primary purpose? Is computation augmenting the data (cleaning, enriching), analyzing it (patterns, arguments), sharing it (storytelling, visualization), or some combination?
How do the article’s goals shape the data? How does the data shape the article’s claims?
Is computation necessary here? Would the same insights be possible without it?

Part 3: What AI Missed (1 paragraph)

Compare your reading to the AI summary. What did the AI get right? More importantly, what did it miss or gloss over when describing how computation is used with the data? Did the AI oversimplify? Miss nuance about the tradeoffs? Make assumptions that weren’t warranted? Or do you think the AI summary was actually pretty good at capturing the computational method and its relationship to the data? This section is a chance to critically evaluate how well AI can understand and summarize computational methods in cultural research.

Bonus: You are encouraged but not required to explore the code repository associated with your article. If you do, include a brief reflection on what you learned from exploring the code that reading the article alone didn’t teach you.

Group Component: Charting Broader Horizons Across Computational Methods & Cultural Data

Once all group members have researched their articles, you will come together to do something that neither AI nor any individual researcher can do alone: read across a set of papers and ask what they collectively reveal about how computation gets used with cultural data in your area of focus.

You will have time to work on this in class, but I also encourage you to work asynchronously. The goal is not a polished document but genuine collaborative analysis—what do these papers, taken together, actually teach us?

Group Process

Step 1: Share Your Critical Findings

Each group member should give a brief overview of their article to the group—what the data is, what computational method was used, how well it served the research, and what the AI summary got right or wrong. Do this before you try to synthesize anything. The point is to build a shared map of the landscape before you start drawing conclusions.

As you share, take notes. You are looking for both convergences (things multiple articles do similarly) and divergences (places where articles make meaningfully different choices).

Step 2: Map How Computation Is Being Used

Look across your articles through one specific lens: what role is computation playing? Most computational work with cultural data falls somewhere on this rough spectrum:

Augmentation — computation helps produce or improve the data itself (cleaning, structuring, enriching, scaling up manual work)
Analysis — computation finds patterns, tests hypotheses, or makes arguments from the data
Communication — computation drives how the work is shared or experienced (interactive visualizations, scrollytelling, public-facing tools)

These categories are not mutually exclusive, but they represent meaningfully different orientations toward what computation is for. As a group, place each article on this spectrum and discuss: does the role computation plays match how the authors describe their goals? Are there cases where computation is doing one thing but being presented as another?

Step 3: Find Trends, Divergences, and Silences

Now go deeper. Look across your articles and try to identify:

Trends — What do most of these papers have in common? Shared assumptions about scale, evidence, which methods are worth using?
Divergences — Where do papers make genuinely different choices? What is at stake in those differences?
Silences — What do almost none of the papers address? What questions get avoided or assumed away? Common silences include: who defines the categories a model is trained on; what the data excludes and why; whether the method was actually necessary; what happens to nuance at scale.

Note where your group agrees and where you genuinely disagree; both are worth documenting.

Submission & Grading

Create a single Markdown file in your group’s GitHub repository, titled critical-computational-cultural-data-research.md, that documents your collective analysis. You can place it anywhere in the repository, though ideally in a dedicated folder for group work. It should include:

Individual article summaries — each member commits their own (AI summary + critical assessment + what AI missed)
Group landscape mapping — the group should detail where each article falls on the augmentation-analysis-communication spectrum, with specific examples from the articles to justify placements. It should also include any relevant trends, divergences, and silences that the group identified across the articles, with specific examples to support those observations. The mapping should note which group members contributed to each section and who agrees or disagrees on each point. There is no set length, and you are welcome to use tables or diagrams if that helps, but the mapping should be clear and well-organized. If you use AI to help with any part of it, say so, and include the prompt and your critique of the AI-generated content.

This document does not need to be long or polished. The goal is that someone reading it can follow your group’s thinking as you develop your assessments of the landscape.

Both components are Pass/Fail. For the Individual Component, you need to complete all four sections (bibliographic info, AI summary, your critical assessment, what AI missed) and commit them yourself to get full marks. For the Group Component, you need to contribute explicitly to the document and discussion, and your contribution must be noted in the document for you to get full marks.

Finally, we will again be doing short 10-minute in-class presentations. Rather than detailing the full assignment, you should select a subset of your landscape analysis that you think is most relevant to the class and/or that you would like feedback on. Everyone should be prepared to answer questions, even if not everyone presents.

Guidelines & Reminders

Dividing Labor

Each person is responsible for finding and summarizing at least one article
Everyone should participate in the group debate and writing
Document your labor clearly in GitHub — Use commits, comments, and a brief “Contributors” section so it’s clear who did what

Using AI as a Critical Research Tool

AI plays an intentional role in this assignment: not to replace your thinking, but to teach you how to use it responsibly.

In the individual component:
Use AI to generate a draft summary of your article (this is efficient).
Then critique it: What did AI get right? What did it miss about how computation is used with the data? This forces you to read carefully and think critically.

In the group component:
You can use AI to help with specific comparative tasks if useful, for instance by asking it to summarize how two articles differ on a particular dimension, or to draft a synthesis paragraph that you then critique and revise.
But the core analytical work (identifying trends, naming divergences, and especially finding silences) requires your own reading and judgment, so be sure to double-check AI-generated content.

Why This Matters

Your final project will ask you to make decisions about how (or whether) to use computation with your cultural data. Those decisions should be informed by understanding the broader research landscape: what choices others have made, what tensions they’ve navigated, what tradeoffs they’ve accepted.

By studying published research, you’re learning to ask critical questions: When does computation add value? When does it obscure more than it reveals? What gets lost when we automate? What does computation help us see that we couldn’t see before?

The landscape analysis you do here is groundwork. You’re not yet deciding what you will do, but you’re building the knowledge base that will allow you to make those decisions wisely.

Resources

Key Journals & Publications

Example Projects (Not exhaustive—find your own!)

Lincoln A. Mullen, America’s Public Bible: A Commentary — Using computation to track biblical references across American texts
Melanie Walsh & Maria Antoniak, The Goodreads Classics — Analyzing how classic literature is discussed on Goodreads
Ben Lee, Newspaper Navigator — Using computer vision to make historical newspapers searchable by visual content
Ruth Ahnert, Sebastian Ahnert, & Kim Albrecht, Tudor Networks — Network analysis of Tudor-era correspondence
Amanda Henley, Matthew Jansen, et al., On The Books: Jim Crow and Algorithms of Resistance — Text mining historical legal documents to uncover systemic racism

Search Tools

Papers with Code — Find papers with public code
Google Scholar
Semantic Scholar
JSTOR, Web of Science, WorldCat (check if your institution provides access)

If you have questions, reach out to the instructors on Slack. Happy researching!