Faculty of Science
UNIverse - Public Research Portal

Databases and Information Systems

Projects & Collaborations

44 found

Polypheny-DDI

Research Project  | 3 Project Members

In recent years, data-driven research has established itself as the fourth pillar in the spectrum of scientific methods, alongside theory, empirical research, and computer-based simulation. In various scientific disciplines, increasingly large amounts of data, both structured and unstructured, are being generated, or existing data collections that were originally isolated from each other are being linked in order to gain new insights. The process of generating knowledge from raw data is called Data Science or Data Analytics. The entire data analytics pipeline is quite complex, and most work focuses on the actual data analysis (using machine learning or statistical methods), while largely neglecting the other elements of the pipeline. This is particularly the case for all aspects related to data management, storage, processing, and retrieval, even though these challenges actually play an essential role. A Distributed Data Infrastructure (DDI) supports a large variety of data management features as demanded by the data analytics pipeline. However, DDIs are usually very heterogeneous in terms of data models, access characteristics, and performance expectations. In addition, DDIs for integrating, continuously updating, and querying data from various heterogeneous applications need to overcome the inherent structural heterogeneity and fragmentation. Recently, polystore databases have gained attention because they help overcome these limitations by allowing data to be stored in one system, yet in different formats and data models, and by offering one joint query language. In past work, we have developed Polypheny-DB, a distributed polystore that integrates several different data models and heterogeneous data stores. Polypheny-DB goes beyond most existing polystores and even supports data accesses with mixed workloads (e.g., OLTP and OLAP). However, polystores are limited to rather simple object models, static data, and exact queries.
When individual data items follow a complex inherent structure and consist of several heterogeneous parts between which dedicated constraints exist, when the access goes beyond exact Boolean queries, when data is not static but continuously produced, and/or when objects need to be preserved in multiple versions, then polystores quickly reach their limits. At the same time, these are typical requirements for data management within a data analytics pipeline. Examples are scientific instruments that continuously produce new data as data streams; social network analysis that requires support for complex object models including structured and unstructured content; data produced by imaging devices that requires sophisticated similarity search support; or frequently changing objects that are subject to time-dependent analyses. The objective of the Polypheny-DDI project is to seamlessly combine the functionality of a polystore database with that of a distributed data infrastructure to meet the requirements of data science applications. It will focus on (i) supporting complex composite object models and enforcing constraints between the constituent parts; (ii) supporting similarity search in multimedia content; and (iii) supporting continuous data streams and temporal/multiversion data.
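
To illustrate the difference between exact Boolean queries and similarity search mentioned in the abstract, the following is a minimal Python sketch. It is not Polypheny-DB code and does not use its API; the toy collection `ITEMS`, the attribute names, and the functions `exact_query` and `similarity_query` are hypothetical and chosen only for illustration. Cosine similarity is assumed as the ranking measure; a real system may use other distance functions and index structures.

```python
import math

# Hypothetical toy collection: each item carries one structured attribute
# and one feature vector (e.g., as extracted from an image).
ITEMS = [
    {"id": "img1", "modality": "MRI", "features": [0.9, 0.1, 0.0]},
    {"id": "img2", "modality": "CT", "features": [0.8, 0.2, 0.1]},
    {"id": "img3", "modality": "MRI", "features": [0.0, 0.1, 0.9]},
]


def exact_query(items, attribute, value):
    """Boolean query: an item either satisfies the predicate or it does not."""
    return [item["id"] for item in items if item[attribute] == value]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_query(items, query_vector, k):
    """Similarity search: rank all items by closeness to the query, return the top k ids."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(item["features"], query_vector),
        reverse=True,
    )
    return [item["id"] for item in ranked[:k]]
```

In this sketch, `exact_query(ITEMS, "modality", "MRI")` returns an unordered set of matches, whereas `similarity_query(ITEMS, [1.0, 0.0, 0.0], 2)` returns a ranked list in which every item is a candidate and only its position differs.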

Video for Scientific Outreach of the Research Network Responsible Digital Society

Research Networks of the University of Basel  | 8 Project Members

The research network "Responsible Digital Society" promotes interdisciplinary exchange and cooperative research in the field of digital transformation in a variety of ways.

In the area of research, the network creates forums for regular scientific exchange and supports the coordination of interdisciplinary research proposals. In the area of promoting young researchers, the network organizes summer and winter schools for them. In the area of networking, the network promotes regular exchanges with industrial partners in the region. In the area of outreach, the network strengthens the public dialogue by organizing colloquia and panel discussions on digitization with guests from various disciplines.

DRES - Distributed Retrieval Evaluation Server

Research Project  | 3 Project Members

Evaluation campaigns for interactive multimedia retrieval, such as the Video Browser Showdown (VBS) or the Lifelog Search Challenge (LSC), have so far imposed constraints on both simultaneity and locality of all participants, requiring them to solve the same tasks in the same place, at the same time, and under the same conditions. These constraints are in contrast to other evaluation campaigns that do not focus on interactivity, where participants can process the tasks in any place at any time. In this work, we are designing and implementing an evaluation scheme for interactive retrieval evaluation that relaxes both simultaneity and locality constraints, enabling participation from any place at any time within a predefined time frame. This scheme, as implemented in the Distributed Retrieval Evaluation Server (DRES), enables novel ways of conducting interactive retrieval evaluation and bridges the gap between interactive campaigns and non-interactive ones.
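
One way to picture the relaxed simultaneity constraint is to measure each participant's time against their own task start rather than against a shared clock. The following Python sketch is an assumption made for illustration only; it is not DRES code, and the function `accept_submission` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def accept_submission(window_start, window_end, task_start, submitted_at, task_duration):
    """Return the elapsed seconds if the submission counts, otherwise None.

    Assumed rules (hypothetical, not taken from DRES):
    - a participant may start a task at any time inside the evaluation window;
    - the time limit is measured from that participant's own task start.
    """
    # The task must be started within the predefined time frame.
    if not (window_start <= task_start <= window_end):
        return None
    elapsed = submitted_at - task_start
    # Reject submissions made before the start or after the per-task time limit.
    if elapsed < timedelta(0) or elapsed > task_duration:
        return None
    return elapsed.total_seconds()


# Example: a one-week window and a five-minute task.
window_start = datetime(2021, 6, 1)
window_end = datetime(2021, 6, 7)
limit = timedelta(minutes=5)

# Two participants work on different days; each is timed individually.
on_time = accept_submission(
    window_start, window_end,
    datetime(2021, 6, 3, 10, 0), datetime(2021, 6, 3, 10, 2, 30), limit,
)
too_late = accept_submission(
    window_start, window_end,
    datetime(2021, 6, 5, 22, 0), datetime(2021, 6, 5, 22, 6), limit,
)
```

Under these assumptions, both participants face the same conditions (the same task and time limit) without having to be present at the same time or place.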

Participatory Knowledge Practices in Analog and Digital Image Archives

Research Project  | 2 Project Members

The common goal of this project is to design a visual interface with machine learning-based tools to make it easy to annotate, contextualize, organize, and link both images and their meta-information, to deliberately encourage the participatory use of archives. In a series of workshops and interviews with both academic and non-academic users, along with archivists and database specialists, the project will analyze the new demands of digital (and process-oriented) knowledge production in order to achieve these goals. In their own rubric, Citizen Archive, academic and non-academic users from the existing networks and partners of the Swiss Society for Folklore Studies (SSFS; Schweizerische Gesellschaft für Volkskunde, SGV) will receive a series of Calls for Images inviting them to upload and annotate current photographs as comments on historical images; this will further foster the contextualization of the archival material. In turn, these digital additions will have to be supplied with metadata and contextual knowledge. Such crowdsourced analysis of the context of images and collections will enrich the metadata of the material and thus also make image searching and information retrieval more effective. Along with the design of the participatory digital image archive, this four-year research project will describe the transformation of analog archives into digital archives from the perspective of technology, communication, and the anthropology of knowledge. The common goal is the analysis and systematic description of historical and contemporary archiving practices: the generation, organization, storage, and communication of knowledge. The complex interplay of participants, epistemological orders, and the genesis and graphical representation of information and knowledge in such practices will be studied in connection with three collections from the photo archive of the Swiss Society for Folklore Studies.
In previous research, these areas were mostly considered separately rather than from an interdisciplinary, cross-domain, and application-oriented perspective that can capture such interplay. In contrast, the proposed project's interdisciplinary collaboration between digital humanities, cultural anthropology, and design research will serve our goal of increasing, improving, and imparting knowledge of analog and especially digital image archives and of ways to use them. As its common primary outputs, the project will produce not only the visual interface discussed above and a dynamic storage infrastructure, but also a handbook with guidelines for the future development of participatory archives, as well as six dissertations and several scientific papers in the various disciplines.

Fake News Detection and Content Categorization

Research Project  | 1 Project Member

The main innovative content of the planned work is the scalable classification and contextualization of news items, enriched with the detection and analysis of disinformation based on multi-modal cues (i.e., taking different types of media into account). Differentiating opinionated journalism from objective news reporting, real stories from rumors, and hoaxes from maliciously crafted disinformation has become a major challenge and has led to new terms like fake news, post-truth, alternative facts, and "truthiness". Fact-checking platforms like FactCheck.org or PolitiFact.com employ journalists and volunteers to manually verify articles and speeches as they are being published. However, the task of manually verifying and labeling news articles is arduous, expensive, prone to error or personal bias, and does not scale. None of the many projects that aim at automatically classifying and analyzing news items jointly considers text and embedded multimedia content.

Bitcoin Full Node Analysis

Research Project  | 2 Project Members

The popularity and rapid adoption of Bitcoin have significantly changed both the financial and digital landscape. First, from a financial point of view, it has introduced a novel and, in particular, democratic monetary instrument. Second, from a computer science perspective, it comes with a large-scale, inherently distributed, and highly fault-tolerant peer-to-peer infrastructure, without any centralized control. While the former aspect, the financial perspective, has been analyzed in detail, the latter aspect, the computer science perspective, is rather unexplored, especially from a quantitative point of view. In this project, we present the results of an analysis of a big data collection on all advertised Bitcoin full nodes worldwide, harvested over a period of more than seven years, from September 4th 2014 until October 31st 2021.