Research Unit Data Linking
One of the central ideas of the Cluster of Excellence is to combine research in the humanities with research in the natural sciences for the study of written artefacts. Scientists working in artefact profiling analyse the materials of written artefacts with a range of state-of-the-art technologies, generating different kinds of associated data (e.g., X-ray data, spectral data). Supported by such materials science data and various computational processes, researchers in the humanities can draw new conclusions from scientific analyses and produce new results in written artefact research, which are published in volumes and journal articles.
With artificial intelligence (AI) technology, the results of humanities research can in turn be made available automatically as data for computational processes in future investigations, e.g., machine-readable dictionaries of old languages derived from manuscripts, or databases describing both the content and the material of ancient inscriptions. In the humanities, artefact description data are often provided in the form of annotations, which describe the content of artefacts symbolically. We therefore also refer to annotations as symbolic content descriptions; each is located on the digitisation of an artefact with respect to a suitable reference system. Intelligent agents known from AI can support researchers in the humanities in their daily work by deriving symbolic content descriptions, and thus allow for new scientific insights with less effort. In this way, data stemming from humanities publications are automatically linked to future humanities research problems, so that AI technology supports humanities researchers in their subsequent scientific work.
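To make the notion of a symbolic content description more concrete, the following minimal sketch shows one way such an annotation could be represented as data. It is an illustration only, not the Cluster's actual data model: the names `Region`, `Annotation`, and `annotations_at`, the identifier `ms-001`, the sample values, and the choice of pixel coordinates as the reference system are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Rectangle in pixel coordinates of a digitised image.

    Pixel coordinates are an assumed reference system for this sketch.
    """
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        """Return True if the point (px, py) lies inside the rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass(frozen=True)
class Annotation:
    """Hypothetical symbolic content description located on a digitisation."""
    artefact_id: str   # identifier of the digitised artefact (made up here)
    region: Region     # where on the digitisation the description applies
    label: str         # kind of description, e.g. content or material
    value: str         # the symbolic description itself


def annotations_at(annotations, artefact_id, px, py):
    """Return all annotations of one artefact whose region covers a point."""
    return [a for a in annotations
            if a.artefact_id == artefact_id and a.region.contains(px, py)]


# Invented sample data: one content annotation and one material annotation.
example = [
    Annotation("ms-001", Region(10, 20, 100, 30), "transcription", "incipit"),
    Annotation("ms-001", Region(200, 20, 50, 50), "material", "iron-gall ink"),
]

hits = annotations_at(example, "ms-001", 15, 25)
```

Keeping the location separate from the description means that content-related and material-related annotations can share the same reference system, which is what allows them to be linked to one another.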
Yet the next frontier of AI is not just technological but also humanistic and ethical. AI will enhance the work of human actors rather than replace them, and extensive research programmes are set up to make intelligent agents explicable, legible, and predictive. The Cluster of Excellence contributes to these research goals. In the process of recording and interpreting results, linking services can be combined with other services to find relevant data automatically and even to evaluate the relevance of documents on the web for particular research problems. In the same spirit, researchers from the natural sciences can benefit from support services that show in which contexts their data might be relevant and in which contexts their data are already used.
To derive symbolic content descriptions and to build corpora (e.g., of related publications) automatically, intelligent agents need to acquire domain models from (sparse) data. In particular, learning technology developed in artificial intelligence derives such models from annotated data or publications. Creating annotations is a central working method of humanities scholars (e.g., for manuscripts and inscriptions). At present, annotations are often still laboriously created by hand, but in the future this will be done much more effectively with AI technology. These annotations can then be used to make machines learn "the right thing" at the right time, so that humanities researchers can train intelligent agents. In other words, humanities researchers can control the content of the domain models derived and used by intelligent agents by compiling the annotated data and/or the publications from which annotations are extracted. Our data linking architecture supports this process, and in this respect the Cluster is also at the forefront of AI.