This class is oversubscribed for Spring 2023, and only a limited number of slots are available.
Priority will be given to students who pre-registered for the class and then to students who contact the instructor (rahmed@mit.edu) to express interest before the first week of classes. There will be a limited number of spaces available for students who did not pre-register.
In order to participate in the lottery, you must attend the first day of class (M 2/6, 2:30-4 pm). We will distribute a questionnaire and determine enrollment based on answers to the questionnaire and a lottery.
Computers allow scholars and artists to study and play with media such as texts, images, audio, and numerical datasets at unprecedented scale and speed. These affordances open a world of opportunity for cultural production: artists can sketch, remix, and make on machines, and an individual scholar can access and analyze a larger and more varied set of cultural artifacts than ever before.
But what does it mean to model, create, or analyze these media on a computer? The humanities and arts are built on the fundamental understanding that nothing is binary, but computers only understand 1s and 0s!
What happens when we digitally encode culture?
This course explores this question both in the technical sense of how we represent these media as bits on a hard drive and by considering the consequences of doing so. Students will learn the history and current practice of digitally encoding text, images, audio, and tabular datasets, along with the cultural and social issues implicit in these systems. They will apply computational methods for manipulating and analyzing encoded media, drawing from a wide range of practices including computational linguistics, audio processing, computer vision, and machine learning. In doing this work, students will confront underlying issues of what is lost and gained when we encode culture, and equip themselves to think critically about their own computing work.
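As a small taste of what "representing media as bits" can look like, here is a minimal Python sketch. It is an illustration only, not course material: the helper names `to_bits` and `describe` are hypothetical, and UTF-8 is assumed as the encoding. It shows the bits behind a few characters, and what happens when a character is forced into an encoding (ASCII) that has no place for it.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as eight binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


def describe(ch: str) -> str:
    """Show a character, its Unicode code point, and its UTF-8 bits."""
    return f"{ch!r}  U+{ord(ch):04X}  {to_bits(ch.encode('utf-8'))}"


if __name__ == "__main__":
    # A plain Latin letter, an accented letter, and a musical symbol.
    for ch in "Aé♪":
        print(describe(ch))

    # ASCII has no slot for "é": with errors="replace", Python
    # substitutes "?" and the original character is lost.
    print("café".encode("ascii", errors="replace"))
```

Notice that "A" fits in a single byte while "é" needs two in UTF-8, and that the ASCII version of "café" silently loses its accent. That kind of loss is one concrete example of the trade-offs the course description refers to.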
After taking this course, you should be able to:
Lecture 1: What are close and distant reading? (Guest lecturer from Literature)
Lecture 2: History and current practices of character encoding. ASCII. Unicode.
Lab 1: Working with text in Python programs. Character encoding “by hand.” Text manipulation with standard string library. Basic text analysis with standard library. Weird art moment: Unicode art.
Lecture 3: Machine reading. Introduction to the Python Natural Language Toolkit.
Lab 2: Text analysis using NLTK. Visualization of results using matplotlib.
Lecture 4: How do musicians listen? (Guest lecturer from Music)
Lecture 5: What is digital audio? WAV Files. Sampling.
Lab 3: Exploring audio with NumPy. Audio manipulation. Weird art moment: mangling audio in NumPy.
Lecture 6: Metadata. Compression. Introduction to librosa.
Lab 4: Audio analysis using librosa.
Lecture 7: How do historians view photos? (Guest lecturer from History)
Lecture 8: What is a digital image?
Lab 5: Image processing. Filters and edge detection. Weird art moment: image mangling through bit manipulation and terrible masks.
Lecture 9: Object detection and image similarity. Introduction to TensorFlow.
Lab 6: Image analysis using TensorFlow.
Lecture 10: How do social scientists collect and analyze data? (Guest lecturer from Social Sciences)
Lecture 11: History and current practices of structured data. CSV. Excel. XML. JSON. Databases.
Lab 7: Loading and manipulating structured data using Python standard library. Visualization in matplotlib. Make and parse your own data format.
Lecture 12: Gathering data: web scraping, using web APIs. k-nearest neighbors and logistic regression.
Lab 8: Create a full data analysis pipeline, from scraping to statistics to visualization.