Midterm: Origins of Tate Artists

Nithin Poreddy

Bar Chart of 1930-1949 Tate Art Collection Data

Introduction

I wanted to explore where the artists in the Tate Art Collection who were born between 1930 and 1949 were originally from. Because the Tate collection is housed in the UK, I excluded all artists originally from the UK; I was more interested in which other countries had the most artists represented in the collection. I used RawGraphs.io to create a bar chart and look for any trends or spikes in the data that could be interesting to analyze in the fields of history and art history.

Sources

I used the tate-art-data-1930-1949.csv data set found in the shared Google Drive, which lists the artists born between 1930 and 1949 who contributed to the collection and provides links to their works. I used OpenRefine and Excel for data cleaning and organization and RawGraphs.io for data visualization.

Process

I first uploaded the data into OpenRefine to remove the variables I wasn’t planning to use for analysis and visualization. I removed all of the artists from regions in the UK, since they were outside the scope of my question, as stated above. In terms of columns, I kept “placeofBirth” and “name” because those were the variables I was most interested in, and I also kept “yearofBirth” and “gender” in case I wanted to do any further comparisons. I followed the OpenRefine activity we did in class to make sure the data was organized and machine-readable. I also tried to group artists by country of origin rather than by city or town, but I couldn’t work out how to do this in OpenRefine.
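
As a rough illustration of the filtering step described above, here is a minimal Python sketch rather than what I actually ran (I did this step in OpenRefine). It assumes that each “placeofBirth” value ends with the country name and that UK birthplaces end with "United Kingdom"; the sample rows and the drop_uk_artists helper are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; the real file is tate-art-data-1930-1949.csv.
SAMPLE_CSV = """name,gender,yearofBirth,placeofBirth
Artist A,Female,1931,"London, United Kingdom"
Artist B,Male,1940,"New York, United States"
Artist C,Male,1935,"Edinburgh, United Kingdom"
Artist D,Female,1948,"Paris, France"
"""

KEEP_COLUMNS = ["name", "gender", "yearofBirth", "placeofBirth"]
UK_LABEL = "United Kingdom"  # assumption: how UK birthplaces are labelled


def drop_uk_artists(rows):
    """Keep only the chosen columns and skip rows whose birthplace ends with the UK label."""
    kept = []
    for row in rows:
        place = (row.get("placeofBirth") or "").strip()
        if place.endswith(UK_LABEL):
            continue
        kept.append({column: row.get(column, "") for column in KEEP_COLUMNS})
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
non_uk = drop_uk_artists(rows)
print([row["name"] for row in non_uk])
```

With the real file, the inline sample would be replaced by opening the CSV itself, and the UK label would need to match however the data actually records it.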

I then exported the edited data to Excel to see if I could fix the data there. Looking at the data, I realized that I would have to manually change each entry to list only the country, because the graphing software would treat each city or town as a separate value even when they were located in the same country. I went ahead and did that, and I also cleared any data points with missing information. Finally, I changed the file type of the Excel spreadsheet so that it would work with the graphing software, which resulted in some data loss.
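
The manual city-to-country change could probably be automated. The sketch below is one possible approach, not what I did: it assumes each birthplace is written as "City, Country" and simply keeps the text after the last comma, dropping empty entries. The function names and sample values are hypothetical.

```python
def country_of(place):
    """Return the text after the last comma, or the whole value if there is no comma."""
    if place is None:
        return ""
    return place.rsplit(",", 1)[-1].strip()


def to_countries(places):
    """Map each birthplace to a country and drop entries with no usable value."""
    countries = [country_of(place) for place in places]
    return [country for country in countries if country]


# Hypothetical values for illustration only.
places = ["Paris, France", "Lyon, France", "Toronto, Canada", "", None, "Italy"]
print(to_countries(places))
```

Any entry that does not follow the assumed pattern would still need to be checked by hand.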

Presentation

I saved everything and uploaded the data to RawGraphs.io to create a bar chart. I looked on Flourish for other graphical displays, but none of them worked with what I was trying to visualize, and a bar graph was the best fit for my question anyway. After creating the graph, I adjusted its width so that all the x-axis labels were readable, increased the padding to create more separation between the columns, and changed the color of the columns. For some reason, RawGraphs would not let me add a y-axis label, but there is at least an x-axis label (placeofBirth). The y values are the number of artists from each country. Because the graph is so big, downloading and embedding it into this webpage reduced its quality, and it isn’t very readable. Making the graph bigger doesn’t help either. I tried different methods, such as downloading it as a different file type and even trying other visualization tools, but nothing seemed to solve the issue. The countries with the most contributors were the US, Canada, France, Germany (Deutschland), and Italy.
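
For reference, the y values in the chart are simple counts per country. A minimal sketch of that tally in Python, using made-up country values rather than my actual data, might look like this:

```python
from collections import Counter

# Hypothetical country values, one per artist.
countries = [
    "United States", "France", "United States",
    "Canada", "France", "United States",
]

counts = Counter(countries)

# most_common() lists the countries from the highest count to the lowest.
for country, n in counts.most_common():
    print(f"{country}: {n}")
```

Sorting by count like this would also make it easy to chart only the top few countries, which could be one way to get a more compact graph.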

Significance

This project taught me a lot about what really goes into creating the data visualizations used in bigger digital humanities projects. With my approach, the key step was condensing the dataset down to the variables I actually wanted to use. That step took the most time, because I had to make sure all the variables and values were labelled appropriately so that the visualization would come out the way I wanted. However, it is essential for focusing on the idea one wants to study, and it makes the visualization itself a lot easier.

I also want to note that even though my visualization is just a simple bar chart, it remains one of the best ways to display the topic I was analyzing. I’m sure there are more aesthetic ways to model this data, but I still feel that it is important to value readability and ease of interpretation for the general public over appearance. I know my own graph isn’t very readable because of its size and resolution; I mean this as a general point. A balance between the two can often be achieved, but when it can’t, I think choosing easy interpretation is better. For my own project, I could definitely work on finding visualization tools that can model what I want to study and produce a higher-resolution, more compact graph.

In relation to the digital humanities in general, my project focuses on the origins of artists who’ve impacted the world of art through their contributions to the Tate Art Collection. I’m not using any statistical analysis or studying anything scientific; instead, I’m charting the distribution of the origins of the Tate collection artists born between 1930 and 1949. I aimed to create an easy-to-understand graph of a historical aspect of the Tate Art Collection, which aligns with one of the main objectives of the digital humanities: using innovative tools to create interpretable visual representations of historical data pertaining to the arts, history, and the humanities.