The aim is to take the correspondence of Pope Gregory VII (1073-1085) and tabulate it (by date, addressee, etc.) so as to allow visualisations, particularly maps.
The letters are recorded in 9 (?) published registers, available in PDF. The major work is to systematically record the details of each letter in a spreadsheet, looking something like this:
## year month day ... recipient Location of pope clerical/secular
## 0 1073 April 23 ... abbot Rome c
## 1 1073 April 23 ... prince Rome s
## 2 1073 April 26 ... archbishop Rome c
## 3 1073 April 29 ... bishop Rome c
## 4 1073 April 28 ... duchess Rome s
##
## [5 rows x 9 columns]
This tabulation creates a resource from which multiple possible analyses and visualisations can be built.
Consistent labelling is essential to create a coherent data set. It is worth thinking about which items we want to ‘score’ for each letter before tabulating the data.
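One way to enforce consistent labelling is to check each scored column against an agreed list of allowed values as soon as the spreadsheet is loaded. The following is a minimal sketch, assuming the spreadsheet is exported as CSV. The column names (`pope_location`, `clerical_secular`), the sample rows and the helper `find_bad_labels` are illustrative assumptions, not the project's actual files or code.

```python
import io
import pandas as pd

# Illustrative CSV text; the real spreadsheet's columns and values may differ.
# The third row deliberately uses "C" instead of "c" to show a labelling slip.
csv_text = """year,month,day,recipient,pope_location,clerical_secular
1073,April,23,abbot,Rome,c
1073,April,23,prince,Rome,s
1073,April,26,archbishop,Rome,C
"""

# Assumed controlled vocabulary for each column we want to check.
ALLOWED = {
    "clerical_secular": {"c", "s"},
    "month": {
        "January", "February", "March", "April", "May", "June", "July",
        "August", "September", "October", "November", "December",
    },
}

def find_bad_labels(df, allowed):
    """Return the rows whose value in any checked column is not allowed."""
    bad = pd.Series(False, index=df.index)
    for column, values in allowed.items():
        bad |= ~df[column].isin(values)
    return df[bad]

letters = pd.read_csv(io.StringIO(csv_text))
bad_rows = find_bad_labels(letters, ALLOWED)
print(bad_rows)
```

Running a check like this after each data-entry session would surface stray capitals or misspellings while the source letter is still to hand.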
Next we need to assign each letter a recipient location (as latitude and longitude).
Open questions
These choices may not need to be made until after the letters have been tabulated. Tom may be able to perform some magic to pull latitude and longitude for most of the recipient locations off the internet (using some kind of Wikidata query? Not sure yet).
Anyway, after tabulation and location coding, we have something like this (done by hand in this case):
## year month day Book ... latitude longitude latitude_pope longitude_pope
## 0 1073 April 23 1 ... 41.4916 13.8159 41.9028 12.4964
## 1 1073 April 23 1 ... 40.6824 14.7681 41.9028 12.4964
## 2 1073 April 26 1 ... 44.4184 12.2035 41.9028 12.4964
## 3 1073 April 29 1 ... 43.7696 11.2558 41.9028 12.4964
## 4 1073 April 28 1 ... 44.5751 10.4551 41.9028 12.4964
##
## [5 rows x 13 columns]
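However the coordinates are obtained, attaching them to the letters can be done as a table join. The sketch below assumes a hypothetical `gazetteer` table with one row per place. The sample rows are made up for illustration and the coordinates are approximate. The merge flags any place that still lacks coordinates.

```python
import pandas as pd

# Letters with a place name but no coordinates yet (illustrative rows only).
letters = pd.DataFrame({
    "recipient": ["duchess", "bishop", "abbot"],
    "place": ["Canossa", "Prague", "Cluny"],
})

# Hypothetical gazetteer: one row per place, with approximate coordinates.
gazetteer = pd.DataFrame({
    "place": ["Canossa", "Prague"],
    "latitude": [44.58, 50.09],
    "longitude": [10.46, 14.42],
})

# A left join keeps every letter. `indicator=True` adds a `_merge` column
# showing which rows found a match; `validate` raises an error if the
# gazetteer lists a place more than once.
coded = letters.merge(
    gazetteer, on="place", how="left", indicator=True, validate="many_to_one"
)
missing = coded.loc[coded["_merge"] == "left_only", "place"].tolist()
print(missing)
```

The `missing` list gives the places that still need to be looked up by hand or by query.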
We have tabulated and geocoded the information from the first book of letters (~80 letters, April 1073 to April 1074). Already this allows us to play around with some possible visualisations. First, a map.
But now that we have the data, additional analyses become possible.
One is to find the most frequent recipient types. Here is a list of all recipient types, along with how often each appears in book 1. For fun I am also showing the code which generates the list.
df_locs['recipient'].value_counts()
## bishop 23
## archbishop 10
## duke 9
## abbot 7
## countess 6
## king 5
## people 3
## count 2
## prince 2
## queen 1
## princes 1
## cleric 1
## nobles 1
## baron 1
## barons 1
## judges 1
## emperor 1
## knight 1
## duchess 1
## judge 1
## canons 1
## monks 1
## Name: recipient, dtype: int64
Note how “baron” and “barons” are counted as separate entries. It is possible to catch these things after tabulation, but it is easier to anticipate them and enter the original data so that letters we consider to be of the same kind are tabulated as such.
In this case, you could imagine having one column, “recipient type”, and another, “recipient number”, so that “baron” and “barons” are both recorded as “baron” under recipient type, and as “solo” and “multiple” respectively under recipient number.
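If the data has already been entered with mixed singular and plural labels, the same split can be applied afterwards. This is a minimal sketch using a hand-written lookup table. The mapping entries, the `"unknown"` fallback and the helper `split_recipient` are assumptions for illustration. An explicit table is used rather than stripping a trailing “s”, which would mangle labels such as “duchess” or “countess”.

```python
import pandas as pd

# Hand-written mapping from raw label to (type, number). Entries are assumed
# for illustration and would need extending to cover the full data set.
RECIPIENT_MAP = {
    "baron": ("baron", "solo"),
    "barons": ("baron", "multiple"),
    "judge": ("judge", "solo"),
    "judges": ("judge", "multiple"),
    "prince": ("prince", "solo"),
    "princes": ("prince", "multiple"),
    "duchess": ("duchess", "solo"),
}

def split_recipient(label):
    """Look up a raw label; unmapped labels are kept and marked for review."""
    return RECIPIENT_MAP.get(label, (label, "unknown"))

raw = pd.Series(["baron", "barons", "judges", "duchess", "monks"], name="recipient")
split = pd.DataFrame(
    raw.map(split_recipient).tolist(),
    columns=["recipient_type", "recipient_number"],
)
type_counts = split["recipient_type"].value_counts()
print(type_counts)
```

With this split, “baron” and “barons” are counted together under one type, and any label missing from the table shows up with the number “unknown” so it can be reviewed by hand.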
Here are the 15 most frequent recipient places:
df_locs['place'].value_counts().head(15)
## Prague 6
## Canossa 6
## Cluny 3
## Reims 3
## Poitiers 3
## Pavia 3
## Milan 3
## Carthage 2
## Cagliari 2
## Bouillon 2
## London 2
## Lyon 2
## Genoa 1
## Beauvais 1
## Salzburg 1
## Name: place, dtype: int64
Here is a map of the numerical average of the longitudes and latitudes of all the recipients. It isn’t clear what sense, if any, this makes, but it is suggestive of a kind of “centre of gravity” of the targets of papal correspondence.
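The average itself is a one-line calculation. As a sketch, here it is applied to only the five recipient coordinates shown in the geocoded table earlier, not the full book, so the result is not the point plotted on the map. The name `df_head` is an assumption for illustration.

```python
import pandas as pd

# Recipient coordinates from the five rows of the geocoded table shown earlier.
df_head = pd.DataFrame({
    "latitude": [41.4916, 40.6824, 44.4184, 43.7696, 44.5751],
    "longitude": [13.8159, 14.7681, 12.2035, 11.2558, 10.4551],
})

# Plain arithmetic mean of each column, treating the coordinates as flat.
centre = df_head[["latitude", "longitude"]].mean()
print(centre.round(4))
```

A plain mean treats latitude and longitude as flat coordinates. That is a rough approximation, and it gets rougher the more widely the recipients are spread.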
Any suggestions for other possible analyses we can anticipate making?
This is an HTML page served by GitHub Pages, generated from an RMarkdown document which uses the reticulate package to allow embedding of Python code. The files are all held in the GitHub repository: github.com/tomstafford/PapalCorrespondence/