IS310 - Culture As Data Spring 2026 – GETting Cultural Data From APIs

So far in class we have briefly mentioned APIs (for example, the Spotify API), but haven’t yet discussed what they are or how to use them. This week we will start to work through the basics of using APIs to get data from the web.

What is an API?

API stands for Application Programming Interface, but what does that mean exactly?

According to Wikipedia:

An application programming interface is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build or use such a connection or interface is called an API specification.

But while that is technically correct, it probably leaves you with more questions than answers.

What is an API?

API Workflow

In this figure, you’ll notice that we have a web browser that is using the “internet cloud” to make requests and get responses from an API that is connected to a web server and a database. This might seem complex, but we’ve seen a similar relationship when we talked about how the web works. We learned previously that web browsers use HTTP to make requests to and get responses from web servers, and this is essentially the same concept. APIs are just a way to get data from a server, similar to how we get web pages from a server, but instead of the data being stored in a webpage (aka an HTML document), it is stored in a database. For example, when you use a weather app on your phone, that app is using an API to get the weather data from a server, and the same is true of most other apps.

Databases might also sound intimidating, but they are just a way to store data, similar to .csv or .json files. The difference is that databases are designed to store large amounts of data and to allow that data to be accessed and manipulated quickly, usually using something called SQL (Structured Query Language), which is a programming language that allows you to interact with databases.

Rather than visiting a URL in our browser, we can use an API to send a similar request to a server from our own code. To help us understand APIs, let’s explore a bit of their origins and development.

Origins of APIs

Web 2.0

History of APIs

From the chapter “A Brief History of APIs” by Jakob Jünger

Post-API Age & The End of the Social Web

Tweetdeck Interface

The growth in these platforms then led to the rise of third-party applications that used these APIs. For example, you might have heard of TweetDeck, an incredibly popular Twitter client that allowed users to manage multiple Twitter accounts and schedule tweets, and which Twitter bought in 2011. TweetDeck was built using the Twitter API, which allowed developers to access data about tweets, users, and trends.

This era of APIs was characterized by a proliferation of third-party applications and platforms, as well as a push for standardization. Google introduced the OpenSocial initiative in 2007 to try to standardize API interactions, but it largely failed, with most platforms retaining unique APIs. Facebook and Twitter in particular prioritized control over data access and usage through increasingly strict terms of service. Part of this focus on control was due to rising data privacy concerns, especially after the Facebook–Cambridge Analytica data scandal came to light in 2018. For those unfamiliar, investigations by The New York Times and The Guardian revealed that the political consulting firm Cambridge Analytica had harvested data from millions of Facebook users without their consent and used it to target political ads during the 2016 US presidential election.

Many Users of APIs

API Users from Amelia Acker and Adam Kreisberg’s article “Social Media Data Archives in an API-Driven World”

Understanding this history is helpful because it gives us a sense of the many users of APIs, as summarized in the table.

This table comes from Amelia Acker and Adam Kreisberg’s article “Social Media Data Archives in an API-Driven World”, which explores how APIs are limiting access to social media data and the implications of this for archiving and research, a topic we’ve discussed in class.[^4] You’ll notice that in this table they identify a number of different users of APIs: individual account holders on a platform (for example, if you’ve ever tried to export your social media data); developers who build applications that use APIs; developer researchers, who use APIs to access data for their research, especially those who study these platforms; and lastly, developer stewards, who are often archivists and preservation specialists using APIs to collect and preserve data. While these groupings are not definitive or exhaustive, they give us a sense of the many different users of APIs and the many different ways APIs are used. However, it is critical to understand that very few APIs are designed to serve all of these groups. Instead, the majority of APIs are designed for a single group of users, often developers building applications, since that is the most profitable use of APIs.

Working with APIs

The One API

If we go to the about page https://the-one-api.dev/about, we can learn that the project was created in 2019 by Ulrike Exner and Mateusz Kikmunter, who are both developers.

Making an API Request

In your is310-coding-assignments folder, create a new folder called api-getting-data and then create a new script called first_api_script.py. In this script, import the requests library and then create a variable called url that holds the URL we will use for the LOTR API.

import requests

url = 'https://the-one-api.dev/v2/book'

How could we use requests with this url? How could we check that the request worked?

Making an API Request

response = requests.get(url)
print(response.status_code)

Hopefully we are all seeing 200 responses, but if you are seeing a 401 or 403 response, you might need to authenticate with the API. We will discuss this more in the next section, but for now, let’s try to print out the response.
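As a rough guide to reading status codes, the sketch below maps a few common HTTP codes to plain-language meanings. The helper name describe_status and the wording of each message are our own illustrative choices, not something provided by requests or The One API.

```python
# Hypothetical lookup table: a few common HTTP status codes and what they
# usually mean. The wording is ours, not from any library.
STATUS_MEANINGS = {
    200: "OK: the request worked",
    401: "Unauthorized: the API needs a valid key",
    403: "Forbidden: this key is not allowed to access the resource",
    404: "Not Found: check the URL or endpoint name",
    429: "Too Many Requests: slow down and try again later",
}


def describe_status(status_code):
    """Return a short explanation for a status code (hypothetical helper)."""
    return STATUS_MEANINGS.get(
        status_code, f"Unrecognized status code: {status_code}"
    )


print(describe_status(200))
print(describe_status(404))
```

In a real script, we could pass response.status_code to this helper right after calling requests.get.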

Getting JSON

In our web scraping lesson, we used the .text attribute to print out the response, but for APIs, we often use the .json() method, which converts the response into Python data. Let’s try that out.

print(response.json())

You should see something like the following:

{
    'docs': 
        [
            {
                '_id': '5cf5805fb53e011a64671582',
                'name': 'The Fellowship Of The Ring'
            },
            {
                '_id': '5cf58077b53e011a64671583', 
                'name': 'The Two Towers'
            },
            {
                '_id': '5cf58080b53e011a64671584', 
                'name': 'The Return Of The King'
            }
        ],
    'total': 3,
    'limit': 1000,
    'offset': 0,
    'page': 1,
    'pages': 1
}

JavaScript Object Notation

Breaking Down JSON

Based on this data, we can see that The One API has returned data about the three books in the Lord of the Rings series. Each book has an _id and a name. The books themselves are returned in a list for the key docs, and then there are some other keys that provide information about the data that was returned, including total, limit, offset, page, and pages. Total tells us how many items were returned, limit tells us how many items could be returned per page, offset tells us where in the data we are (similar to indexing), page tells us what page we are on, and pages tells us how many pages of data there are.
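To make this structure concrete, here is a minimal sketch that treats the book response as a regular Python dictionary. The data is copied from the output shown in these notes rather than fetched live, and the variable names books_response and book_names are our own.

```python
# Sample data copied from the /book response shown in these notes,
# so this sketch runs without a network connection.
books_response = {
    "docs": [
        {"_id": "5cf5805fb53e011a64671582", "name": "The Fellowship Of The Ring"},
        {"_id": "5cf58077b53e011a64671583", "name": "The Two Towers"},
        {"_id": "5cf58080b53e011a64671584", "name": "The Return Of The King"},
    ],
    "total": 3,
    "limit": 1000,
    "offset": 0,
    "page": 1,
    "pages": 1,
}

# "docs" holds a list of dictionaries, one per book.
book_names = [book["name"] for book in books_response["docs"]]
print(book_names)

# The remaining keys describe the result set rather than the books themselves.
print(f"Page {books_response['page']} of {books_response['pages']}")
print(f"Items returned: {books_response['total']}")
```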


Authentication & Endpoints

Using Endpoints

Book URL in Browser

Using Endpoints?

Characters URL in Browser

Using Endpoints?

url = 'https://the-one-api.dev/v2/character'
response = requests.get(url)
print(response.status_code)

This should return a 401 status code, which means that we are not authorized to access this data. This is because the LOTR API requires us to authenticate before we can access data about characters or quotes. This is a common feature of APIs, as it allows the API to track who is accessing their data and to limit access to certain users.

Authentication

API Keys

api_key = "API KEY HERE"
url = 'https://the-one-api.dev/v2/character'
# Headers must be a dictionary: the header name is the key, the token is the value
authorization_headers = {
    'Authorization': 'Bearer ' + api_key
}

API Keys

Now let’s update our requests.get method to include these headers.

response = requests.get(url, headers=authorization_headers)
print(response.status_code)
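Since requests expects headers as a dictionary of name and value pairs, one way to avoid typos is to build that dictionary in a small function. This is only a sketch: build_auth_headers is a hypothetical helper name, and it assumes the API uses the Bearer token format shown above.

```python
def build_auth_headers(api_key):
    """Build a headers dictionary for a Bearer-token API (hypothetical helper)."""
    if not api_key:
        raise ValueError("An API key is required.")
    # The header name is the dictionary key; the value is "Bearer " plus the key.
    return {"Authorization": f"Bearer {api_key}"}


headers = build_auth_headers("API KEY HERE")
print(headers)

# With the requests library, this dictionary would be passed along as:
# response = requests.get(url, headers=headers)
```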

Storing API Keys: Environment Variables

To create an environment variable, you open your terminal and type the following, replacing YOUR_API_KEY_HERE with your API key.

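For Macs/WSL (this sets the variable for the current terminal session only):

export THE_ONE_API_KEY="YOUR_API_KEY_HERE"

For Windows/PowerShell (setx saves the variable for future sessions, so open a new terminal before using it):

setx THE_ONE_API_KEY "YOUR_API_KEY_HERE"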

Storing API Keys: Environment Variables

Now you can access these environment variables in your Python script by using the os library.

import os
the_one_api_key = os.environ['THE_ONE_API_KEY']
print(the_one_api_key)
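If the environment variable has not been set, os.environ['THE_ONE_API_KEY'] raises a KeyError, which can be confusing. The sketch below shows one possible way to give a friendlier message. The function name load_api_key and its environ parameter are our own assumptions, added so the example can be run with a stand-in dictionary.

```python
import os


def load_api_key(name, environ=os.environ):
    """Look up an API key by name and fail with a readable message (hypothetical helper)."""
    key = environ.get(name)
    if not key:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Set it in your terminal and restart your editor or terminal."
        )
    return key


# Demonstration with a stand-in dictionary instead of the real environment.
demo_environ = {"THE_ONE_API_KEY": "YOUR_API_KEY_HERE"}
print(load_api_key("THE_ONE_API_KEY", demo_environ))
```

In a real script we would call load_api_key("THE_ONE_API_KEY") with no second argument so that it reads from the actual environment.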

`apikey` Library

This is a bit more work though, and one easier option is to use a Python library for storing API keys, called apikey https://github.com/ulf1/apikey.

In your terminal, type pip install "apikey>=0.2.4" to install the library. Then import it into your script and write:

import apikey

apikey.save("THE_ONE_API_KEY", "YOUR_API_KEY_HERE")

the_one_api_key = apikey.load("THE_ONE_API_KEY")

Making API Requests

Now that we have our API key stored securely, let’s try to get data about characters from The One API. We can do this by updating our url variable to include the /character endpoint.

the_one_api_key = apikey.load("THE_ONE_API_KEY")
authorization_headers = {
    'Authorization': 'Bearer ' + the_one_api_key
}
url = 'https://the-one-api.dev/v2/character'
response = requests.get(url, headers=authorization_headers)
if response.status_code == 200:
    print(response.json())
else:
    print(response.status_code)

More Data

{'docs': [{'_id': '5cd99d4bde30eff6ebccfbbe',
   'name': 'Adanel',
   'wikiUrl': 'http://lotr.wikia.com//wiki/Adanel',
   'race': 'Human',
   'birth': None,
   'gender': 'Female',
   'death': None,
   'hair': None,
   'height': None,
   'realm': None,
   'spouse': 'Belemir'},
  {'_id': '5cd99d4bde30eff6ebccfbbf',
   'name': 'Adrahil I',
   'wikiUrl': 'http://lotr.wikia.com//wiki/Adrahil_I',
   'race': 'Human',
   'birth': 'Before ,TA 1944',
   'gender': 'Male',
   'death': 'Late ,Third Age',
   'hair': None,
   'height': None,
   'realm': None,
   'spouse': None},
  {'_id': '5cd99d4bde30eff6ebccfbc0',
   'name': 'Adrahil II',
   'wikiUrl': 'http://lotr.wikia.com//wiki/Adrahil_II',
...
 'total': 933,
 'limit': 1000,
 'offset': 0,
 'page': 1,
 'pages': 1}
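The total, limit, and pages values are related: with 933 characters and a limit of 1000 per page, everything fits on one page. The sketch below shows that arithmetic, assuming the API simply divides the total by the limit and rounds up. The function name pages_needed is our own.

```python
import math


def pages_needed(total, limit):
    """Number of pages required to see `total` items at `limit` items per page."""
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    return math.ceil(total / limit)


print(pages_needed(933, 1000))  # 1, matching the 'pages' value in the response
print(pages_needed(933, 100))   # 10, if we asked for only 100 characters per page
```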

Processing JSON With Python

Since the API response is in JSON format, calling .json() gives us a Python dictionary, so we can work with it like any other dictionary. For example, we can see all the keys in the response by using the .keys() method.

response.json().keys()

This should show the following:

dict_keys(['docs', 'total', 'limit', 'offset', 'page', 'pages'])

How would we see the total number of characters?

Processing JSON With Python

We can access the total key to see how many characters are in the database.

response.json()['total']

This should return 933, which means that there are 933 characters in the database. We can also loop through the docs key to see each character.

for character in response.json()['docs']:
    print(character)

Processing JSON With Python

So if we only wanted to see the data about a certain character, like Galadriel, we could loop through the characters and print out the data for Galadriel.

for character in response.json()['docs']:
    if character['name'] == 'Galadriel':
        print(character)

Which would return the following data:

{'_id': '5cd99d4bde30eff6ebccfd06', 'name': 'Galadriel', 'wikiUrl': 'http://lotr.wikia.com//wiki/Galadriel', 'race': 'Elf', 'birth': 'YT 1362', 'gender': 'Female', 'death': 'Still alive: Departed over the sea on ,September 29 ,3021', 'hair': 'Golden', 'height': 'Tall', 'realm': 'Eregion,Lothlórien,Caras Galadhon', 'spouse': 'Celeborn'}
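The same looping pattern can be generalized to filter on any field. Below is a minimal sketch using a few trimmed character records copied from the output in these notes; find_by_field is a hypothetical helper name, not part of any library.

```python
# A few character records trimmed from the API output shown in these notes.
characters = [
    {"name": "Adanel", "race": "Human", "gender": "Female", "spouse": "Belemir"},
    {"name": "Adrahil I", "race": "Human", "gender": "Male", "spouse": None},
    {"name": "Galadriel", "race": "Elf", "gender": "Female", "spouse": "Celeborn"},
]


def find_by_field(docs, field, value):
    """Return every record whose `field` equals `value` (hypothetical helper)."""
    # .get() avoids a KeyError if a record is missing the field entirely.
    return [doc for doc in docs if doc.get(field) == value]


humans = find_by_field(characters, "race", "Human")
print([character["name"] for character in humans])
```

With a live response, we could pass response.json()['docs'] as the first argument instead of the sample list.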

Query Parameters

While we can do this using Python, we could also change our URL to only get data about Galadriel. We can do this by adding a query parameter to our URL.

url = 'https://the-one-api.dev/v2/character?name=Galadriel'
response = requests.get(url, headers=authorization_headers)
print(response.json())

Query Parameters

Returning to the API’s documentation https://the-one-api.dev/documentation#5, we can see that we can use query parameters for sorting, filtering, and paginating data from The One API.

url = 'https://the-one-api.dev/v2/character?name!=Galadriel&race=Elf'
response = requests.get(url, headers=authorization_headers)
print(f"Total number of elves besides Galadriel: {response.json()['total']}")
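Typing query strings by hand gets error-prone as we add more parameters. As a sketch, we could build simple key=value filters with Python's built-in urlencode function. The helper name build_url is our own, the limit parameter is taken from the API's pagination documentation, and this version only handles equality filters, so a negation like name!=Galadriel would still need to be written by hand.

```python
from urllib.parse import urlencode

BASE_URL = "https://the-one-api.dev/v2"


def build_url(endpoint, **filters):
    """Join an endpoint and simple key=value filters into a URL (hypothetical helper)."""
    url = f"{BASE_URL}/{endpoint}"
    # urlencode turns {"race": "Elf", "limit": 5} into "race=Elf&limit=5"
    # and escapes any characters that are not safe in a URL.
    query = urlencode(filters)
    return f"{url}?{query}" if query else url


print(build_url("character", race="Elf", limit=5))
# https://the-one-api.dev/v2/character?race=Elf&limit=5
```

The requests library can also do this step for us through its params argument, for example requests.get(url, params={'race': 'Elf'}, headers=authorization_headers).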

Query Parameters

Finally, we can also use the _id of each item returned from the API to get more specific data. For example, if we wanted to get data about all the movie quotes of Galadriel, we could first get the _id of Galadriel and then use that _id to get data about her quotes.

url = 'https://the-one-api.dev/v2/character?name=Galadriel'
response = requests.get(url, headers=authorization_headers)
galadriel_id = response.json()['docs'][0]['_id']
quote_url = f'https://the-one-api.dev/v2/character/{galadriel_id}/quote'
response = requests.get(quote_url, headers=authorization_headers)
print(response.json())

`time` & Rate Limiting

import time

url = 'https://the-one-api.dev/v2/character?name=Galadriel'
response = requests.get(url, headers=authorization_headers)
galadriel_id = response.json()['docs'][0]['_id']
quote_url = f'https://the-one-api.dev/v2/character/{galadriel_id}/quote'
response = requests.get(quote_url, headers=authorization_headers)
print(response.json())
time.sleep(10)
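Many APIs limit how many requests we can make in a given window of time, and time.sleep lets us pause between requests so we stay under that limit. The sketch below shows one way to space out several requests. The names fetch_politely and fake_fetch are hypothetical, the delay value is arbitrary rather than the API's documented limit, and a stand-in function replaces the real network call so the example runs offline.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between calls (hypothetical helper)."""
    results = []
    for index, url in enumerate(urls):
        # Pause before every request except the first one.
        if index > 0:
            time.sleep(delay_seconds)
        results.append(fetch(url))
    return results


def fake_fetch(url):
    """Stand-in for a real request so this sketch runs without a network."""
    return {"url": url, "status": 200}


demo_urls = [
    "https://the-one-api.dev/v2/book",
    "https://the-one-api.dev/v2/movie",
]
demo_results = fetch_politely(demo_urls, fake_fetch, delay_seconds=0.1)
print(demo_results)
```

With the requests library, the fetch argument could be something like lambda url: requests.get(url, headers=authorization_headers).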

Python API Wrappers

So far we have been using the requests library to make our API calls, which is usually how you should work with APIs. However, occasionally, developers will create Python libraries to work with APIs, which can make working with APIs easier. These libraries are called API wrappers, and they are essentially Python libraries that provide a set of functions to work with an API.

NRH-LOTR

Indeed, a developer named Nathanial Hapeman has created one for The One API, which you can see here https://pypi.org/project/nrh-lotr/0.0.3/. If you want to try out this library, all you have to do is type pip install nrh-lotr==0.0.3 in your terminal.

LOTR Library

NRH-LOTR Structure

We can also see how the library is organized if we inspect it after installing it:

LOTR Library

NRH-LOTR Structure

And you can see how it’s making requests to the API by looking at the source code.

LOTR Library Source

Fixing the Source Code

In our class on Thursday, we ran into an issue with installing and using the nrh-lotr library, with this issue appearing in the terminal:

What Went Wrong?

Pydantic is a data validation library that checks whether the data you receive from an API matches the types you declared in your model. When the API returned a value for rottenTomatoesScore of 66.33333333 (a decimal), but the model was expecting an int (whole number), Pydantic rejected it. You can’t fit a float into an integer without losing information, so Pydantic threw a validation error.

The Fix

The library developer assumed Rotten Tomatoes scores would be integers, but the actual API returns percentage scores with decimal values. The solution was simple: change the type annotation from int to float to allow decimal numbers:

class Movie(pydantic.BaseModel):
    """A LotR movie title."""

    id: str = pydantic.Field(None, alias="_id")
    name: str
    runtimeInMinutes: int
    budgetInMillions: int
    boxOfficeRevenueInMillions: float # Changed from int to float
    academyAwardNominations: int
    academyAwardWins: int
    rottenTomatoesScore: float  # Changed from int to float

Using NRH-LOTR

# First grab an api key from: https://the-one-api.dev/documentation#3
# Then put it in an env var like: `export API_KEY=SOME_API_KEY`
# Or insert it directly into the LOTR class as depicted below
from lotr import LOTR, Movie, Quote

# Movie basics.
lotr = LOTR("YOUR_API_KEY")
# lotr = LOTR() # if using env var
movies = lotr.movies(limit=5)

Using NRH-LOTR

However, if you scroll down further in the documentation, you’ll notice that this library only makes requests for the following endpoints:

/movie
/movie/{id}
/movie/{id}/quote
/quote
/quote/{id}

Such limited functionality means that we couldn’t use this library to get data about characters or books, which is a major limitation. These types of limitations are common when working with API wrappers, as they are often created by developers who are not affiliated with the API itself and who may not have the time or resources to create a full-featured library.

Europeana Digital Library

Rather than use this library, we’re going to try one that has been designed for the Europeana digital library https://www.europeana.eu/en.

This library was first proposed in 2005 by six European heads of state who signed a letter asking the EU to support the development of a European Digital Library and it was initially released in 2008.[^5] It provides access to millions of digitized items from European museums, libraries, archives, and galleries, and is a great resource for researchers, students, and the general public. The library has undergone a transformation since its initial release, with the first few years focusing on encouraging heritage institutions to provide their digital material to the platform.

The newer version, released in 2016, has more of a focus on encouraging reuse in different ways and curating the collection through exhibitions and collections.

Europeana API Example

Europeana Items

Europeana Postcard

Sofia Kretzulescu - 1852 - National Heritage Institute, Bucharest, Romania - CC BY-SA. https://www.europeana.eu/item/1190/INP_postcards_6415?utm_source=api&utm_medium=api&utm_campaign=bkV8GDrrp

Europeana Metadata

Europeana Postcard Metadata

Europeana APIs

Europeana API Key

To do this, we first need to get an API key from Europeana, which you can do by signing up for an account here https://pro.europeana.eu/pages/get-api.

Europeana API

`pyeuropeana`

You can see the documentation for this library here https://rd-europeana-python-api.readthedocs.io/en/stable/index.html and the GitHub repository here https://github.com/europeana/rd-europeana-python-api/tree/master.

Quickstart

If we go to the Quickstart page in the documentation https://rd-europeana-python-api.readthedocs.io/en/stable/usage.html#, we can see that we can install the library with the following command:

pip install pyeuropeana

Authentication with `pyeuropeana`

Then if we scroll down to Authentication we can start to see how we can use the library to authenticate with the Europeana API.

Europeana Library

Authentication with `pyeuropeana`

import apikey
import os

apikey.save("EUROPEANA_API_KEY", "YOUR_API_KEY_HERE")
europeana_api_key = apikey.load("EUROPEANA_API_KEY")
os.environ['EUROPEANA_API_KEY'] = europeana_api_key

Getting errors? Check out these steps on the lesson page

`pyeuropeana` Functionality

This is the example from the documentation:

import pyeuropeana.apis as apis
import pyeuropeana.utils as utils

# use this function to search our collections
result = apis.search(
   query = '*',
   qf = '(skos_concept:"http://data.europeana.eu/concept/base/48" AND TYPE:IMAGE)',
   reusability = 'open AND permission',
   media = True,
   thumbnail = True,
   landingpage = True,
   colourpalette = '#0000FF',
   theme = 'photography',
   sort = 'europeana_id',
   profile = 'rich',
   rows = 1000,
   ) # this gives you full response metadata along with cultural heritage object metadata

# use this utility function to transform a subset of the cultural heritage object metadata
# into a readable Pandas DataFrame
dataframe = utils.search2df(result)

`pyeuropeana` Functionality

Let’s try a simpler example first though:

import pyeuropeana.apis as apis

response = apis.search(query="Galadriel")
print(response)

Which should show the following output:

{'apikey': 'tondflanrino',
 'success': True,
 'requestNumber': 999,
 'itemsCount': 0,
 'totalResults': 3,
 'items': [{'completeness': 0,
   'country': ['Italy'],
   'dataProvider': ['Internet Culturale'],
   'dcCreator': ['http://data.europeana.eu/agent/159584',
    'Karunesh',
    'Karunesh'],
   'dcCreatorLangAware': {'def': ['http://data.europeana.eu/agent/159584',
     'Karunesh'],
    'en': ['Karunesh']},
   'dcTitleLangAware': {'def': ['Galadriel'],
    'en': ['Galadriel'],
    'it': ['Galadriel']},
   'edmConcept': ['http://data.europeana.eu/concept/soundgenres/Music'],
   'edmConceptLabel': [{'def': 'Musik'},
    {'def': 'Music'},
    {'def': 'Musica'},
    {'def': 'Muzyka'},
    {'def': 'Musique'},
    {'def': 'Música'}],
   'edmConceptPrefLabelLangAware': {'de': ['Musik'],
...
  'profile': None,
  'rows': 12,
  'cursor': '*',
  'callback': None,
  'facet': None}}

`pyeuropeana` Search API

We can also see in the Search API documentation what keys we should expect for our initial search request https://europeana.atlassian.net/wiki/spaces/EF/pages/2385739812/Search+API+Documentation#Response, as well as what data should return for each item https://europeana.atlassian.net/wiki/spaces/EF/pages/2385739812/Search+API+Documentation#Metadata-Sets. For example, if we wanted to see the item in the browser we would just need to use the guid key.

response['items'][0]['guid']
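Building on this, we could loop over every item and pull out just a title and a link. The sketch below uses a trimmed stand-in for a search response: the title and country come from the output shown in these notes, but the guid value is a placeholder we made up, and summarize_items is a hypothetical helper name.

```python
# Trimmed stand-in for a Search API response. The guid is a made-up
# placeholder, not a real Europeana link.
sample_response = {
    "success": True,
    "totalResults": 3,
    "items": [
        {
            "country": ["Italy"],
            "dcTitleLangAware": {"def": ["Galadriel"], "en": ["Galadriel"]},
            "guid": "https://www.europeana.eu/item/PLACEHOLDER",
        },
    ],
}


def summarize_items(response):
    """Collect a title and link for each item (hypothetical helper)."""
    summaries = []
    for item in response.get("items", []):
        titles = item.get("dcTitleLangAware", {})
        # Prefer the English title, then the default one, then a fallback label.
        title = (titles.get("en") or titles.get("def") or ["(untitled)"])[0]
        summaries.append({"title": title, "link": item.get("guid")})
    return summaries


print(summarize_items(sample_response))
```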

`pyeuropeana` Entity API

We could also try out the Entity API, which allows us to get more information about a specific entity in the collection. According to the api’s documentation https://europeana.atlassian.net/wiki/spaces/EF/pages/2324561923/Entity+API+Documentation, the Europeana collection has these types of entities:

a person (or “agent”), for instance Lili Boulanger or Claude Debussy;
a topic (or “concept”) like Art Nouveau, migration or Musique Concrète
a place, for instance Perpignan, Bratislava or Arnhem
a time period, for instance the 21st century

In the pyeuropeana documentation, there’s a section of tutorials, including one for using the Entity API https://rd-europeana-python-api.readthedocs.io/en/stable/tutorials_source/entity_api_tutorial.html.

`pyeuropeana` Entity API

According to this tutorial, we can use the entity.suggest method to get suggestions for a specific entity. For example, if we wanted to get suggestions for the entity Galadriel, we could use the following code:

response = apis.entity.suggest(
    text = 'Galadriel',
    TYPE = 'agent',
)
print(response)

Did it work?

`pyeuropeana` Entity API

You will likely see that there are no results for this query. We could try a different entity, like Tolkien, and see if we get any results.

response = apis.entity.suggest(
    text = 'Tolkien',
    TYPE = 'agent',
)
print(response)

This should give the following output:

{'@context': ['https://www.w3.org/ns/ldp.jsonld',
  'http://www.europeana.eu/schemas/context/entity.jsonld'],
 'type': 'ResultPage',
 'total': 3,
 'items': [{'id': 'http://data.europeana.eu/agent/60065',
   'type': 'Agent',
   'isShownBy': {'id': 'http://pbc.gda.pl/Content/20559/03.mp3',
    'type': 'WebResource',
    'source': 'http://data.europeana.eu/item/0940417/_nnbqsf5',
    'thumbnail': 'https://api.europeana.eu/api/v2/thumbnail-by-url.json?uri=http%3A%2F%2Fpbc.gda.pl%2FContent%2F20559%2F03.mp3&type=SOUND'},
   'prefLabel': {'en': 'J. R. R. Tolkien'},
   'altLabel': {'en': ['J-R-R Tolkien',
     'Tolkien',
     'John Ronald Reuel Tolkien',
     'John Tolkien',
     'J.R.R Tolkien',
     'J.R.R. Tolkien',
     'John R. R. Tolkien']},
   'dateOfBirth': '1892-01-03',
   'dateOfDeath': '1973-09-02'},
  {'id': 'http://data.europeana.eu/agent/60339',
   'type': 'Agent',
   'prefLabel': {'en': 'Christopher Tolkien'},
   'altLabel': {'en': ['Christopher John Reuel Tolkien',
     'Christopher Reuel Tolkien',
...
   'dateOfDeath': '2020-01-16'},
  {'id': 'http://data.europeana.eu/agent/79852',
   'type': 'Agent',
   'prefLabel': {'en': 'Tim Tolkien'},
   'dateOfBirth': '1962-09-01'}]}
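Notice that not every entity has the same keys: Tim Tolkien has no dateOfDeath, for example. The sketch below shows one way to handle missing keys with .get(), using two of the three items from the output above as sample data. The function name describe_agents is our own.

```python
# Two of the three items from the Entity API output above, trimmed to a few keys.
sample_entities = {
    "type": "ResultPage",
    "items": [
        {
            "id": "http://data.europeana.eu/agent/60065",
            "type": "Agent",
            "prefLabel": {"en": "J. R. R. Tolkien"},
            "dateOfBirth": "1892-01-03",
            "dateOfDeath": "1973-09-02",
        },
        {
            "id": "http://data.europeana.eu/agent/79852",
            "type": "Agent",
            "prefLabel": {"en": "Tim Tolkien"},
            "dateOfBirth": "1962-09-01",
        },
    ],
}


def describe_agents(response):
    """Build one readable line per agent, tolerating missing keys (hypothetical helper)."""
    lines = []
    for item in response.get("items", []):
        name = item.get("prefLabel", {}).get("en", "(no English label)")
        born = item.get("dateOfBirth", "unknown")
        died = item.get("dateOfDeath", "unknown")
        lines.append(f"{name} ({born} to {died})")
    return lines


for line in describe_agents(sample_entities):
    print(line)
```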

`pyeuropeana` Entity API

We could also look for concepts or places, like Literature or London, to see what results we get. However, to do that we need to change the TYPE parameter to concept or place.

response_concept = apis.entity.suggest(
    text = 'Literature',
    TYPE = 'concept',
)

response_place = apis.entity.suggest(
    text = 'London',
    TYPE = 'place',
)

GETting Culture Across APIs Homework

Time for the next homework assignment where you put this all together!
