Research projects affiliated with CDHU:
- The Norse Perception of the World: A Mapping and Analysis of Foreign Place Names in Medieval Swedish and Danish Texts
- From Dust to Dawn: Multilingual Grammar Extraction from Grammars Project
- CHRONOS - Chronology of Roots and Nodes of Family Trees: Fine-tuning the Instruments of Linguistic Dating
- Swedish Caribbean Colonialism 1784–1878: Caribbean Colonialism 1784–1878. Integrating, Classifying, Publishing and Investigating Dispersed Swedish Colonial Archives
- Gender and Work (GaW)
- Patterns of Popularity: Towards a Holistic Understanding of Contemporary Bestselling Fiction
- Around the village ring street: interdisciplinary research and historical visualization, the cultural heritage site of Ekeby village
- Everlasting Runes – a Research Platform for Sweden’s Runic Inscriptions
- GIS for language study
- Digitala Birgitta: Att tillgängliggöra heliga Birgittas fornsvenska texter
- Svensk dramadialog under tre sekler
- Automatic Decryption of Historical Manuscript
- The GIFT project
- Från närläsning till fjärrläsning
- From Quill to Bytes (q2b)
- The News Evaluator
- CApturing Paradata for documenTing data creation and Use for the REsearch of the future (CAPTURE)
- GLoW - Geomapping Landscapes of Writing
- Memories for life: Materiality and Memory of Ancient Near Eastern Inscribed Private Objects
- A Database of Turkic Runiform Inscriptions
- (Re)constructing a Bible. A new approach to unedited Biblical manuscripts as sources for the early history of the Karaim language
- Swedish Diachronic Corpus
- Urdar. A research infrastructure for archaeological excavation data
- Cultural Evolution of Texts
- Quantifying Culture: AI and Heritage Collections
AROUND THE VILLAGE RING STREET: INTERDISCIPLINARY RESEARCH AND HISTORICAL VISUALIZATION, THE CULTURAL HERITAGE SITE OF EKEBY VILLAGE
The project is a collaboration between Uppsala University, Upplandsmuseet and the Institute for Languages and Folklore. Research will focus on the cultural heritage site of Ekeby, combining archaeology and the history of society, buildings, and place-names, with the site's many time layers as a common hub. The aim is to create the conditions for well-founded narratives of the place, its farms, crofts, agriculture and inhabitants from prehistory through the Middle Ages and to the present. The history of the hamlet's uniquely preserved 19th-century setting, and earlier eras, will be mediated through a digital 3D visualization of Ekeby's 18th-century appearance, information signs, web and print.
Contact: Rosemarie Fiebranz
The purpose of this project is to make a holistic analysis of contemporary bestsellers by combining different methods (distant and close, inductive and deductive, probabilistic and historic-empiric), materials (text data, metadata, reader data), and theoretical perspectives (publishing studies, computational criticism, media theory). Three formats for bestsellers – hardbound, paperback, audiobooks – will be analysed and discussed. By using literary materials as key empirical ground in a study of contemporary publishing, the ambition is to bridge the gap between studies of cultural production and content. The project brings the scale of statistical text analysis to book history and at the same time provides a concrete interpretational frame of publishing studies to literary distant readings. The analysis of reader behaviour data will give unique knowledge on reading patterns in the contemporary book trade.
Traditionally, researchers have studied the diversity of the world's languages by reading and comparing grammatical descriptions manually. Nowadays, a large number of linguistic descriptions and books are easily available in digital formats. Reading them all for a wider-level comparison and analysis is far beyond any individual's capabilities. Text technology, i.e. computer-based text management in natural language, is now powerful enough to potentially be used to harvest facts at different levels of detail within a given domain (in this case, information on world languages). In this project we want to use a collection of 9000 digitized grammatical descriptions covering over a thousand languages in order to significantly expand the ability to make major language comparisons. For this purpose, the project will develop methodologies to enable computers to read grammatical descriptions and automatically extract information ("linguistic facts"). We will explore and develop a notion of "language profile", which is a structured digital collection and representation of a language encapsulating all available knowledge about that language extracted from various sources.
Contact: Harald Hammarström
CHRONOS – CHRONOLOGY OF ROOTS AND NODES OF FAMILY TREES: FINE-TUNING THE INSTRUMENTS OF LINGUISTIC DATING
The project aims to explore the possibilities of dating ancestral stages of language families by a systematic and careful study of the stability and replacement patterns of different types of linguistic data. Toward this aim, the project has chosen to look at the language families Indo-European (Eurasia) as well as Arawakan and Tupí (South America). The ability to date ancestral linguistic stages would be a revolutionary step forward for understanding language and population history, parallel to carbon-14 dating in archaeology. Even a less precise method than carbon-14 would be a significant achievement for linking ancestral linguistic stages to other disciplines such as archaeology, genetics and geoclimatology. Classic glottochronology, which assumes a constant rate of lexical replacement, has long been discredited. However, even if lexical replacement rates are not constant, they are also not totally random. Moreover, other aspects of language may show tighter regularity than lexicon since lexicon is amenable to conscious manipulation; indeed, word taboo is one of the reasons for accelerated lexical replacement, but so far there has been little research into grammatical chronology, which is an important aim of this project. Now, however, the time is ripe for a systematic investigation into linguistic dating. Far more data is available, and large linguistic databases have become practical to use. Similarly, methodological advances, often imported from biology, presently allow for a more thorough exploitation of the data. The project involves collaboration with Lund University. The project is funded by the Marcus and Amalia Wallenberg foundation (MaW 2017.0050).
THE NORSE PERCEPTION OF THE WORLD: A MAPPING AND ANALYSIS OF FOREIGN PLACE NAMES IN MEDIEVAL SWEDISH AND DANISH TEXTS
East Norse (Old Swedish and Old Danish) literature is a mine of information on how foreign lands were visualised in the Middle Ages: What places were written about and where? Are some places more popular in certain text types or at certain times? How do place names link different texts? Is there a shared concept of spatiality? How is space gendered?
Geohumanities, the spatialisation of literary studies, and cognitive mapping are growing fields within digital humanities, but the study of spatial thinking and knowledge in medieval Scandinavia and its development as an area of enquiry are hampered by a dearth of information on place names in literary texts. Any research aiming to uncover what pre-modern Scandinavians understood about places abroad requires as a minimum an index of foreign place names in East Norse literature. Yet to date no such index exists.
Contact: Alexandra Petrulevich
The research platform Everlasting Runes will present Sweden’s runic inscriptions in a new way and give new possibilities to work with runic material. The aim is to show, in one and the same place on the Internet, all the country’s runic inscriptions in text and image, and simultaneously to provide a large and varied collection of documentation and original sources for further research. The research platform will link the published parts of the series Sveriges runinskrifter with the Scandinavian Runic-text Database and make it possible to use both these sources together. In connection with the project three research tasks will also be carried out, using the material and contributing to the design of the platform: “Runic inscriptions of Medelpad”, “The islands in the Baltic Sea”, and “Otto von Friesen as a runologist”. Everlasting Runes is a collaboration between the National Heritage Board and Uppsala University. The name is a translation of the Old Norse word ǣvinrūnaʀ, which is attested twice in the sources: once carved in stone on the Malt Stone in Jutland and once written on parchment in the Eddic poem Rígsþula.
Contact: Marco Bianchi
Contact: Alexandra Petrulevich
Contact: Marco Bianchi
More about the project Digitala Birgitta
Contact: Carin Östman
Thousands of encrypted manuscripts are found in archives all over Europe, documents that are not yet available for historical research. Examples of such materials are diplomatic and military correspondence and intelligence reports, magical and scientific writings, private letters and diaries, as well as manuscripts related to secret societies. Many scholars and scientists are working on some of these documents in a completely uncoordinated fashion, and from different and complementary areas such as history, linguistics, philology, computer science, and computational linguistics, all with their own point of view, purpose and methods. They encounter the same or similar problems when confronted with encrypted documents. Whereas various algorithms and tools have been developed to decipher the most common forms of encryption, most of these are not suitable to deal with historical, hand-written encrypted documents that don’t use standardized methods, are often hybrid in nature, and are not available in machine-readable form. The aim of the project is to bring the expertise of these different disciplines together, to digitize and process the historical encrypted sources and release these through a database with information about provenance and other facts of relevance. We focus on the development of software tools for automatic or semi-automatic analysis and decryption of various types of encrypted documents, by employing linguistic universals and formal methods.
Contact: Beata Megyesi
Museums serve as our collective memory, preserving and interpreting our shared culture and identity. The central challenge of the GIFT project is to create designs that facilitate meaningful interpersonal experiences. GIFT focusses on hybrid experiences, realised through mixed reality designs that overlay physical visits with digital content as a way to complement, challenge or reframe the experience of museum visits.
The department of Informatics and Media participates in the project in the role of theory lead. Based primarily in discourse theory and pragmatic design theory, our role is to document and articulate the knowledge contributions that emerge from the practical design and evaluation work within the project, and that help us understand how hybrid museum experiences are designed and experienced. Within the project, we also experiment with a range of methods to make rather abstract theories accessible to the kind of interdisciplinary design teams that typically take on the challenge of creating these kinds of museum experiences.
Contact: Annika Waern
More about The GIFT Project
Contact: Johan Svedjedal
This cross disciplinary initiative takes its point of departure in the analysis of handwritten text manuscripts using computational methods from image analysis and linguistics. It sets out to develop a manuscript analysis technology providing automatic tools for large-scale transcription, linguistic analysis, digital paleography and generic data mining of historical manuscripts. Our mission is to develop technology that will push the digital horizon back in time, by enabling digital analysis of handwritten historical materials for both researchers and the public.
Gender and Work (GaW) is a combined research and digitisation project hosted by the Department of History at Uppsala University since 2008. The aim of the project is to increase knowledge about the work of both men and women in the past (1550–1880).
In the project we have gathered and classified thousands of fragments of information from a variety of historical sources that describe the ways people sustained and provided for themselves. This information has been stored in a unique database that has been made accessible for researchers, students, and the general public. The database and the research project use the verb-oriented method which in turn is inspired by time-use studies (advocated by the UN).
SWEGRAM aims to provide a tool for text analysis in Swedish and English. You can upload one or several texts and annotate them at different linguistic levels with morphological and syntactic information. The annotated texts can then be used to extract statistics about the text properties with respect to text length, number of words, readability measures, part-of-speech, and much more.
Contact: Beáta Megyesi
Young people today receive news mainly through digital media, the same channels in which destructive movements try to spread fear and prejudice with fake news. To deal with the social challenges of fake news and fact resistance, we have developed the digital tool News Evaluator.
The tool supports young people's reviews of news feeds and is also a user-friendly database where reviewed news can be explored by young people themselves. Design and content are based on current research on young people's difficulties in reviewing news online and the importance of their own active processing of their news feeds. The tool creates a dynamic interface between education and research, and stimulates students to engage actively by comparing their own evaluations with other students' reviews. The current information on credible and fake news in young people's media streams gathered in the tool can be used in teaching, research and journalism. In the first step of the project, 6,000 young people tested the tool, and 3,500 young people did so before the 2018 elections.
In the next step we will:
- refine the design of the tool and database
- supplement the tool with a self-test in digital source criticism
- evidence-test the possibilities and limitations of the tool
- explore opportunities to give the public access to the tool
- further develop the tool for national and international dissemination.
Contact: Thomas Nygren
CAPTURE investigates what information about the creation and use of research data (that is, paradata) is needed and how to capture enough of that information to make the data reusable in the future. The wickedness of the problem lies in the practical impossibility of documenting and keeping everything and the difficulty of determining how to capture just enough. The empirical focus of CAPTURE is archaeological and cultural heritage data, which stands out by its extreme heterogeneity and rapid accumulation due to the scale of ongoing development-led archaeological fieldwork. Within and beyond this specific context, CAPTURE develops an in-depth understanding of how paradata is being created and used today, elicits methods for capturing paradata, tests new methods in field trials, and synthesises the findings in a reference model to inform the capturing of paradata and enable data-intensive research using heterogeneous research data stemming from diverse origins.
Contact: Isto Huvila
Establishing a Database of Turkic Runiform Inscriptions is one of the major tasks of a recently initiated interdisciplinary research network at Uppsala University. The research network includes philologists, linguists and archaeologists, and aims to document, describe and analyse the runestone inscriptions of Eurasia.
Contact: László Károly
GEOMAPPING LANDSCAPES OF WRITING (GLOW): LARGE-SCALE SPATIAL ANALYSIS OF THE CUNEIFORM CORPUS (C. 3400 BCE TO 100 CE)
Cuneiform is one of the oldest scripts in human history and among the largest bodies of historical documentation from the ancient world. Rough estimates suggest that the total word count of all cuneiform records exceeds those of Egypt and Rome by a considerable margin. Cuneiform writing was widely used across the Middle East for over three millennia, from c. 3400 BCE to 100 CE. Written primarily on clay, cuneiform texts are preserved in larger numbers than virtually any other type of written media. This project assembles and analyses a full digital record of this corpus, drawing on recent advances in digital humanities and geospatial data mapping. As a first quantifiable and corpus-wide study of one of the greatest corpora of historical records from the ancient world, it will provide a benchmark example of the application of digital and spatial computing tools to the study of writing in early human history.
The project develops a more complete understanding of ancient Near Eastern inscribed objects commissioned by private individuals using cuneiform writing between ca 2800 BCE–100 CE. The objects were set up for the sake of remembrance, most commonly in cultic contexts. In the early Near Eastern urbanities, people sought to establish a presence before gods to ensure divine favour. The combined strengths of material objects and inscriptions lent permanence to the symbolic act of gift-giving, establishing lasting ties between humans and the divine. The aim is to identify and highlight the personal perspective in inscribed objects commissioned by private individuals and the relationship- and value-creating potential that come to the fore in such objects. Objects are approached by means of a materiality profiling, combining analyses concerning, e.g., archaeological context, content and finish of the text, along with physical characteristics and production techniques.
Contact: Jakob Andersson
(RE)CONSTRUCTING A BIBLE. A NEW APPROACH TO UNEDITED BIBLICAL MANUSCRIPTS AS SOURCES FOR THE EARLY HISTORY OF THE KARAIM LANGUAGE
Eastern European Karaims are the sole representatives of Karaite Judaism in Europe. Their native tongue is a severely endangered Turkic vernacular listed on the UNESCO Atlas of the World’s Languages in Danger. Due to many historical events, including World War II and the Soviet era, the cultural heritage of this intriguing ethnic minority suffered great losses.
The project will construct a digital edition of the entire Karaim Bible, based almost exclusively on unedited texts in Hebrew script (15th–20th cc). It will contain the first ever comprehensive Karaim translation of the Hebrew Bible intelligible to present-day native speakers. As a complex scientific instrument, it will deliver the first linguistic and palaeographic descriptions of the oldest, still unedited records of Karaim as well as reconstruct the way in which the Karaim Bible was created.
Combining traditional and computer-aided research methods will provide essential data on the early history of the Karaim language and ethnicity.
Michał Németh at Jagiellonian University is the principal investigator of the project. László Károly, as partner at Uppsala University, will be responsible for creating an online platform including a dynamic digital edition of Karaim Bible translations interconnected with a lexicographic database.
In the long run, the platform at UU will be capable of providing digital editions, a corpus based lexicographic database, and other related software components to support the scholarly study of other Middle Turkic literary languages as well.
Within the framework of Clarin, we are developing a Swedish diachronic corpus, containing texts from Old Swedish (13th century) to Contemporary Swedish. Our primary goal is to create a resource to be used for research in historical linguistics, for studies on how the Swedish language has changed over time, but the corpus can also be an important resource for other researchers in the humanities. We aim for a balanced corpus, with texts from different genres and time periods. Furthermore, the corpus should have a structure similar to diachronic corpora available for other languages, in order to enable comparative studies both within the Swedish language and between Swedish and other languages. In addition, we will add linguistic annotation to the texts in the corpus, enabling more sophisticated search queries. We also find it very important to take input from the intended users into account in the corpus building process, and we have therefore at an early stage sent out a survey to researchers in historical linguistics, in order to learn more about their needs and views on the contents and structure of the upcoming Swedish diachronic corpus.
Contact: Eva Pettersson
SWEDISH CARIBBEAN COLONIALISM 1784–1878: CARIBBEAN COLONIALISM 1784–1878. INTEGRATING, CLASSIFYING, PUBLISHING AND INVESTIGATING DISPERSED SWEDISH COLONIAL ARCHIVES
Sweden became a slave nation when the Caribbean island of Saint Barthélemy was taken into possession in 1785. The island was sold to France in 1878 and the entire Swedish government archive was left on the island. This Swedish archive, the Fonds Suédois de Saint Barthélemy, contains c. 300,000 manuscript pages and is held in the French colonial archives in Aix-en-Provence. It is closed to both researchers and the public. The project is based on the successful digitization of this archive (2011–2016) and makes this the largest Swedish colonial archive available via the Internet, together with other collections of Swedish Caribbean documents in the Swedish National Archives and foreign collections.
The rich heritage legacy from archaeological excavations in Sweden is largely inaccessible for data-driven research. The Urdar project will ensure that born-digital documentation from excavations will not be lost to posterity and that it will be findable for researchers through linked data and open archives. Semantic linking of field documentation and research data will enable information to be optimized for Digital Humanities and the sciences. This will contribute to interdisciplinary research as well as strengthen the position of archaeology in academic research. Digital excavation documentation is a prime resource for exploring long-term perspectives in many different fields of research. Urdar will incorporate the FAIR principles, ensuring the results from field archaeology are primed for incorporation in a wider European framework of archaeological infrastructures through the use of common open standards and formats.
Contact: Daniel Löwenborg
How is cultural knowledge passed down through generations? Which processes promote the fidelity of transmission of written or oral texts over longer or shorter times? And are there regularities in the processes of change that they undergo? This project takes a mixed methods approach to analyse how religious and instructional texts are passed down through time. Among the religious texts represented are the ancient liturgies of Zoroastrianism. These originate from oral traditions dating back about 2500 years. The Apophthegmata patrum, collections of sayings of the Christian church fathers, likewise belong to a tradition of many centuries. These writings have been copied, edited and translated over and over again. Instructional texts are collected from, among others, a corpus of cookbooks that span several centuries. By examining how these types of texts change, the research project will contribute to an in-depth understanding of how cultural knowledge develops and is renegotiated over time. The research project brings together researchers with expertise in different types of text traditions with researchers working within computer science and phylogenetic frameworks. This unique collaboration is expected to contribute to the development of new methods for phylogenetic network analysis of linguistic and cultural evolution.
Contact: Michael Dunn
The project aims to unlock the future potential of AI for the management and curation of cultural heritage collections. A synthesis of AI methods and critical scholarship can co-produce diverse and more nuanced perspectives on heritage collections, thus reaching the public of the future. By developing theoretical and technological knowledge, the project’s concrete aims are: 1) to map and explore the current practices and experiences, as well as anticipated futures, of GLAM digitalisation in Sweden; 2) to investigate how AI/ML-generated descriptions of art and heritage can be enhanced in meaningful ways; 3) to analyse AI/ML methods’ and tools’ compliance with FAIR and international data standards, as well as their reflection of and engagement with diversity and ethics; 4) to explore how we can connect AI to qualitative aspects of the examined material where critical and ethical theories meet with algorithms and mathematics. The project’s key research questions therefore are: RQ1: What is the current and evolving state of the art of GLAM digitalisation in Sweden? RQ2: How can AI/ML-generated descriptions of heritage collections be made more nuanced and meaningful? RQ3: How do AI/ML methods and tools comply with FAIR and international data standards? RQ4: How can issues of a qualitative nature (bias, diversity and ethics) be connected to the quantitative nature of AI/ML algorithms used for GLAM digitalisation? The knowledge developed will enable GLAMs in Sweden to present cultural heritage framed by diversity and inclusion, responsive to future audiences.
The project utilizes case studies that span time and space (across continents) and come in a number of formats, mainly photography, prints and textual archives. These include photographic and archival collections (mid-19th to early 20th century) of artefacts and people from African and Asian countries collected by Swedish ethnographers, and photographic archives of the early Swedish archaeological expeditions to Egypt, Greece, and Cyprus, including pictures of antiksamlingen (objects collected by Swedish professors of archaeology, with some minimal accompanying information such as index cards giving the date bought or collected). These are digitized and curated by Swedish stakeholders such as the National Museums of World Culture (the National Ethnographic Museum and the Museum of Mediterranean and Near Eastern Antiquities); the Uppsala Museum Gustavianum, with an emphasis on Mediterranean collections; the Svenskt Diplomatarium archives (1100–1523 CE) at the National Archives of Sweden (Riksarkivet); and the National Heritage Board of Sweden (Riksantikvarieämbetet). The project is generously financed for five years by the WASP-HS (Wallenberg AI, Autonomous Systems and Software Program - Humanities and Society) initiative of the Wallenberg Foundations (https://wasp-hs.org).