Aim

To take the correspondence of Pope Gregory VII (1073-1085) and tabulate it (by date, addressee, etc.) to allow visualisations, particularly maps.

Creating the data

The letters are recorded in 9 (?) published registers, available in PDF. The major work is to systematically record the details of each letter in a spreadsheet, looking something like this:

##    year  month  day  ...   recipient Location of pope clerical/secular
## 0  1073  April   23  ...      abbot              Rome                c
## 1  1073  April   23  ...      prince             Rome                s
## 2  1073  April   26  ...  archbishop             Rome                c
## 3  1073  April   29  ...      bishop             Rome                c
## 4  1073  April   28  ...     duchess             Rome                s
## 
## [5 rows x 9 columns]

This tabulation creates a resource from which multiple possible analyses and visualisations can be built.
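
Because the spreadsheet stores year, month name and day in separate columns, most analyses will first need a single sortable date per letter. Here is a minimal sketch, assuming the column names shown above and full English month names; the `rows` list and the `letter_date` helper are illustrative, not part of our code. It uses Python's standard `datetime.date` because pandas' default nanosecond timestamps cannot represent years before 1677.

```python
from datetime import date

# Month names as they appear in the "month" column (assumed English, in full).
MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

def letter_date(row):
    """Combine the year, month and day columns into one sortable date."""
    return date(int(row["year"]), MONTHS[row["month"]], int(row["day"]))

# Illustrative rows copied from the sample table above, in register order.
rows = [
    {"year": 1073, "month": "April", "day": 23, "recipient": "abbot"},
    {"year": 1073, "month": "April", "day": 29, "recipient": "bishop"},
    {"year": 1073, "month": "April", "day": 28, "recipient": "duchess"},
]

# Sort by date rather than by position in the register.
by_date = sorted(rows, key=letter_date)
print([r["recipient"] for r in by_date])
```

Register order and date order can differ: in the sample, the letter of 28 April is recorded after the letter of 29 April. `datetime.date` uses the proleptic Gregorian calendar, so it should be fine for sorting and filtering, but weekdays computed from it would not match the Julian calendar in use at the time.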

Consistent labelling is essential to create a coherent data set. It is worth thinking about what items we want to ‘score’ about each letter before tabulating the data.

There are multiple letter 4s (“Version of the previous letter”). Do we want to explicitly code duplicates?
Do we want to record recipient locations as in the letters, or in the modern names (e.g. Constantinople or Istanbul)?
Only city names in the “place” column, no country

Adding location information

Next we need to identify each letter with a recipient location (as longitude and latitude).

Open questions

Do we want the option to identify recipient locations as regions (e.g. “The Judges of Sardinia”)?
Who (and how) is going to check the geocoding?

These choices may not need to be made until after the letters have been tabulated. Tom may be able to perform some magic to pull longitude and latitude for most of the recipient locations off the internet (using some kind of Wikidata query? Not sure yet).
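
However the coordinates are obtained, one way to attach them and to see which places still need a human check is to join the letters against a lookup table of place names. The sketch below is an assumption about how this could work, not a description of what we did: the `gazetteer` and `letters` tables are hypothetical, the Rome coordinates are those used for the pope's location in this document, and the other entries are illustrative.

```python
import pandas as pd

# Hypothetical gazetteer: one row per place name, checked by hand.
gazetteer = pd.DataFrame({
    "place": ["Rome", "Florence", "Ravenna"],
    "latitude": [41.9028, 43.7696, 44.4184],
    "longitude": [12.4964, 11.2558, 12.2035],
})

# Illustrative letters; "Sardinia" stands in for a region with no single point.
letters = pd.DataFrame({
    "recipient": ["bishop", "archbishop", "judges"],
    "place": ["Florence", "Ravenna", "Sardinia"],
})

# A left join keeps every letter; validate= raises if a place is listed twice.
geocoded = letters.merge(gazetteer, on="place", how="left",
                         validate="many_to_one")

# Places still lacking coordinates are the ones a person needs to look at.
unmatched = geocoded.loc[geocoded["latitude"].isna(), "place"].unique().tolist()
print(unmatched)
```

Keeping the gazetteer as its own table means each place is geocoded and checked once, however many letters were sent there.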

Anyway, after tabulation and location coding, we have something like this (done by hand in this case):

##    year  month  day  Book  ... latitude longitude latitude_pope longitude_pope
## 0  1073  April   23     1  ...  41.4916   13.8159       41.9028        12.4964
## 1  1073  April   23     1  ...  40.6824   14.7681       41.9028        12.4964
## 2  1073  April   26     1  ...  44.4184   12.2035       41.9028        12.4964
## 3  1073  April   29     1  ...  43.7696   11.2558       41.9028        12.4964
## 4  1073  April   28     1  ...  44.5751   10.4551       41.9028        12.4964
## 
## [5 rows x 13 columns]

Creating A Map

We have tabulated and geocoded the information from the first book of letters (~80 letters, April 1073 to April 1074). Already this allows us to play around with some possible visualisations. First, a map.
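
One possible way to hand the geocoded rows to a mapping tool is to export them as GeoJSON, with one line per letter running from the pope's location to the recipient's. This is a sketch under that assumption and is not the code behind our map; the `letter_to_feature` helper is hypothetical, and the single example row uses values from the sample tables above.

```python
import json

def letter_to_feature(row):
    """One line per letter, from the pope's location to the recipient's."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are [longitude, latitude], in that order.
            "coordinates": [
                [row["longitude_pope"], row["latitude_pope"]],
                [row["longitude"], row["latitude"]],
            ],
        },
        "properties": {"year": row["year"], "recipient": row["recipient"]},
    }

# Illustrative row using values from the geocoded sample above.
rows = [{"year": 1073, "recipient": "bishop",
         "latitude": 43.7696, "longitude": 11.2558,
         "latitude_pope": 41.9028, "longitude_pope": 12.4964}]

collection = {"type": "FeatureCollection",
              "features": [letter_to_feature(r) for r in rows]}
geojson_text = json.dumps(collection, indent=2)
```

The coordinate order is the detail most worth checking: our spreadsheet lists latitude before longitude, whereas GeoJSON expects longitude first.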


Other analyses

Now that we have the data, additional analyses become possible.

For example, we can see the most frequent recipient types. Here is a list of all recipient types, along with how often each appears in book 1. For fun I am also showing the code which generates the list.

df_locs['recipient'].value_counts()

## bishop        23
## archbishop    10
## duke           9
## abbot          7
## countess       6
## king           5
## people         3
## count          2
## prince         2
## queen          1
## princes        1
## cleric         1
## nobles         1
## baron          1
## barons         1
## judges         1
## emperor        1
## knight         1
## duchess        1
## judge          1
## canons         1
## monks          1
## Name: recipient, dtype: int64

Note how “baron” and “barons” are counted as separate entries. It is possible to catch these things after data tabulation, but it is easier if we anticipate them and enter the original data so that recipients we consider the same are tabulated as such.

In this case, you could imagine having a single column “recipient type” and another “recipient number”, so that “baron” and “barons” were both recorded as “baron” (under recipient type) and as “solo” and “multiple” respectively (under recipient number).
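
If the labels have already been entered, the same split can be applied afterwards with a lookup table. Here is a minimal sketch; the `RECIPIENT_CODES` table, the `code_recipient` helper and the two new column names are hypothetical, and the rule that unlisted labels count as “solo” is an assumption that would need checking against plural-only labels such as “monks” or “canons”.

```python
import pandas as pd

# Hypothetical lookup from the label as entered to (type, number).
# Only a few labels from the book 1 list are included here.
RECIPIENT_CODES = {
    "baron": ("baron", "solo"),
    "barons": ("baron", "multiple"),
    "judge": ("judge", "solo"),
    "judges": ("judge", "multiple"),
    "prince": ("prince", "solo"),
    "princes": ("prince", "multiple"),
}

def code_recipient(label):
    """Return (recipient type, recipient number) for a label as entered."""
    # Assumption: labels missing from the lookup are single recipients.
    return RECIPIENT_CODES.get(label, (label, "solo"))

df = pd.DataFrame({"recipient": ["baron", "barons", "judges", "prince", "bishop"]})

codes = [code_recipient(label) for label in df["recipient"]]
df["recipient_type"] = [rtype for rtype, _ in codes]
df["recipient_number"] = [number for _, number in codes]

# "baron" and "barons" now fall under the same recipient type.
print(df["recipient_type"].value_counts())
```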

Here are the 15 most frequent recipient places:


df_locs['place'].value_counts().head(15)

## Prague      6
## Canossa     6
## Cluny       3
## Reims       3
## Poitiers    3
## Pavia       3
## Milan       3
## Carthage    2
## Cagliari    2
## Bouillon    2
## London      2
## Lyon        2
## Genoa       1
## Beauvais    1
## Salzburg    1
## Name: place, dtype: int64


Here is a map of the numerical average of all the longitudes and latitudes of the recipients. It isn’t clear what sense, if any, this makes, but it is suggestive of a kind of “center of gravity” of the targets of papal correspondence.
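
To make the calculation concrete, the sketch below computes that plain average for the five geocoded sample rows shown earlier. It also computes an alternative that we did not use, averaging the points as unit vectors on a sphere, as a way of checking how much the flat average distorts. Both function names are hypothetical.

```python
import math

def arithmetic_centre(points):
    """Plain average of the latitudes and of the longitudes, in degrees."""
    n = len(points)
    return (sum(lat for lat, _ in points) / n,
            sum(lon for _, lon in points) / n)

def spherical_centre(points):
    """Average the points as 3D unit vectors, then convert back to degrees."""
    x = y = z = 0.0
    for lat, lon in points:
        phi, lam = math.radians(lat), math.radians(lon)
        x += math.cos(phi) * math.cos(lam)
        y += math.cos(phi) * math.sin(lam)
        z += math.sin(phi)
    centre_lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    centre_lon = math.degrees(math.atan2(y, x))
    return centre_lat, centre_lon

# (latitude, longitude) pairs taken from the geocoded sample rows above.
points = [(41.4916, 13.8159), (40.6824, 14.7681), (44.4184, 12.2035),
          (43.7696, 11.2558), (44.5751, 10.4551)]

print(arithmetic_centre(points))
print(spherical_centre(points))
```

For recipients clustered in one part of Europe the two results should be close. They would diverge more for widely scattered points, which is one reason to treat the flat average as suggestive rather than exact.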


Any suggestions for other possible analyses we can anticipate making?


Colophon

This is an HTML page served by GitHub Pages, generated from an RMarkdown document which uses the reticulate package to allow embedding of Python code. The files are all held in the GitHub repository: github.com/tomstafford/PapalCorrespondence/

Mapping Papal Letters

Tom Stafford, Charles West

May 2020
