ubi:ldspace Linked Data Lifecycle Management Framework and Infrastructure
Linked Data technologies define methods and provide tools for publishing structured data so that it can be interlinked. They add value to applications that need not only access to data but also awareness of the relationships among data, so that they can query that data and draw inferences using common vocabularies. Building on standard Web technologies such as HTTP, RDF and URIs, Linked Data technologies share information in a way that can be read automatically by computers, enabling data from different sources to be connected and queried.
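As a minimal illustration of this idea (a sketch that is not taken from ubi:ldspace), data can be modelled as subject-predicate-object triples whose identifiers are URIs, and relationships can then be followed with simple pattern matching. The `example.org` resources and the helper functions `match` and `friends_of_friends` are hypothetical; the `foaf:knows` and `foaf:name` properties come from the public FOAF vocabulary.

```python
# Hypothetical example data: everything under example.org is a placeholder.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"  # common vocabulary shared across sources

triples = {
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "bob", FOAF + "knows", EX + "carol"),
    (EX + "alice", FOAF + "name", "Alice"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def friends_of_friends(triples, person):
    """Follow foaf:knows twice: a tiny inference over explicit relationships."""
    result = set()
    for _, _, friend in match(triples, s=person, p=FOAF + "knows"):
        for _, _, fof in match(triples, s=friend, p=FOAF + "knows"):
            if fof != person:
                result.add(fof)
    return sorted(result)


if __name__ == "__main__":
    print(friends_of_friends(triples, EX + "alice"))
```

Because both sources in such a setup would use the same property URI for "knows", the second-hop lookup works regardless of where each triple was originally published.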
To create Linked Data, technologies must be available for expressing data in a common format (RDF) and for either converting existing databases (relational, XML, HTML, etc.) or accessing them on the fly. It is also important to be able to set up query endpoints to access that data more conveniently. In this context, the UBITECH R&D team has developed a framework and infrastructure, called ubi:ldspace, for the collection, transformation, interconnection and publication of data coming from structured, semi-structured or unstructured data sources found in public bodies or private companies. ubi:ldspace also supports semi-automated linkage and interfacing with third-party data sources and with data coming from the Linked Open Data cloud. Incorporating a toolset for managing the lifecycle of the Linked Data created, ubi:ldspace supports data curation, provenance and quality assurance processes and allows the dynamic definition and deployment of access policies.
ubi:ldspace helps organizations and enterprises publish their data in a structured and formal way, creating their own Web of Data and Private Linked Data Cloud and providing authenticated access to their partners through a unified query interface. As a result, ubi:ldspace promotes the exploitation and utilization of interconnected data by software applications and encourages the creation of Linked Data Applications.
ubi:ldspace provides organizations and enterprises with an integrated platform that allows them to manage the lifecycle (aggregation, extraction, curation, linkage, publication, quality assurance and provenance) of private and public Linked Data. By creating private Linked Data Clouds with unified query interfaces, ubi:ldspace enables third-party applications to access, utilize and capitalize on high-quality, accurate, timely and formally structured data.
The Linked Data lifecycle, as defined and supported by ubi:ldspace, incorporates the following five steps:
(1) the RDF Data Extraction, which allows the semi-automated extraction of RDF triples from structured, semi-structured and unstructured data sources that are available in the information systems of public or private organizations and enterprises.
(2) the RDF Data Linkage, which enables the semi-automated generation of links between the RDF triples and other data sources, including RDF or otherwise structured data from third-party organizations, Linked Open Data and SPARQL endpoints.
(3) the Data Curation, Provenance and Quality Assurance, which filters the data according to the adopted quality assurance policies and tracks the origin and evolution of both the data and the third-party data sources interlinked with it, preserving the links generated among them.
(4) the Linked Data Publication, which is responsible for publishing the semantically interlinked and structured data sources, either as triples (basic graphs), as quads (named graphs), or through a programmable interface (e.g. RESTful API, SPARQL endpoint). All these publication mechanisms are incorporated in the Unified Query Interface of ubi:ldspace, which provides unique and homogenized access to the generated private linked data cloud of the organization.
(5) the Data Access Authentication, which incorporates a mechanism that enforces a set of authorized access policies on top of the published Linked Data. This mechanism allows the data owners to define how open (public) or private they wish their data to be upon publication.
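The five steps above can be sketched end to end in a few lines of Python. This is a toy illustration under stated assumptions, not the ubi:ldspace implementation: the function names (`extract`, `link`, `curate`, `to_nquads`, `visible`), the exact-label matching rule, the "drop empty literals" quality policy, the role-to-graph access policy and all `example.org` / `external.example` URIs are hypothetical. Only the `rdfs:label` and `owl:sameAs` property URIs are real, standard vocabulary terms.

```python
EX = "http://example.org/"  # hypothetical base URI for the organization
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
PRIVATE = EX + "graph/private"  # hypothetical named graphs
PUBLIC = EX + "graph/public"


def extract(rows, graph):
    """Step 1: turn records (dicts) into quads (subject, predicate, object, graph)."""
    return [(f"{EX}company/{row['id']}", RDFS_LABEL, row["name"], graph) for row in rows]


def curate(quads):
    """Step 3: toy quality policy (an assumption): drop quads with an empty object."""
    return [q for q in quads if q[2].strip()]


def link(quads, external_labels, graph):
    """Step 2: naive linkage by exact, case-insensitive label match."""
    index = {label.lower(): uri for uri, label in external_labels.items()}
    return [
        (s, OWL_SAME_AS, index[o.lower()], graph)
        for s, p, o, _ in quads
        if p == RDFS_LABEL and o.lower() in index
    ]


def to_nquads(quads):
    """Step 4: serialize as N-Quads lines; objects starting with http:// are treated as IRIs."""
    lines = []
    for s, p, o, g in quads:
        if o.startswith("http://"):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} <{g}> .")
    return lines


def visible(quads, policy, role):
    """Step 5: keep only quads whose named graph the given role may read."""
    allowed = policy.get(role, set())
    return [q for q in quads if q[3] in allowed]


# Hypothetical input records and third-party labels.
rows = [{"id": 1, "name": "Acme"}, {"id": 2, "name": ""}]
external = {"http://external.example/resource/Acme": "ACME"}
policy = {"partner": {PRIVATE, PUBLIC}, "anonymous": {PUBLIC}}

data = curate(extract(rows, PRIVATE))
links = link(data, external, PUBLIC)
published = data + links

if __name__ == "__main__":
    for line in to_nquads(visible(published, policy, "partner")):
        print(line)
```

Placing each statement in a named graph is what lets the access policy in this sketch work at graph granularity: the label stays in the private graph for partners, while the link to the external resource is readable by anyone.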