
First release of SemanticXO!

Here it is: the first fully featured release of SemanticXO! Use it in your activities to store and share any kind of structured information with other XOs. The installation procedure is easy and only requires an XO-1 running version 12.1.0 of the operating system. Go to the Git repository and download the files “setup.sh” and “semanticxo.tar.gz” to somewhere on the XO (these files are in the directory “patch_my_xo”). Then log in as root and execute “sh setup.sh setup”. The installation package will copy the API onto the XO, set up the triple store and install two demo activities. Once the procedure is complete, reboot the XO to activate everything.

The XO after the installation of SemanticXO

There are two demo activities, which are described in more detail on the project page. Under the hood, SemanticXO provides an API to store named graphs containing the description of one or several resources. These named graphs are marked with an author name, a modification date and, optionally, a list of other devices (identified by their URI) to share the graph with. This data is used by a graph replication daemon which, every 5 minutes, browses the network using Avahi, finds other triple stores, and downloads a copy of the graphs that are shared with it. The data backend of the mailing activity provides a good example of how the API is used.
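To make the replication rule concrete, here is a minimal Python sketch of the selection step only. It is not the SemanticXO code: the `NamedGraph` class, its field names and the `graphs_to_fetch` function are hypothetical, the Avahi discovery is left out, and skipping graphs whose local copy is already up to date is an assumption on top of what is described above.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class NamedGraph:
    """Hypothetical stand-in for the metadata attached to a named graph."""
    uri: str
    author: str
    modified: datetime
    shared_with: set = field(default_factory=set)  # URIs of the devices to share with


def graphs_to_fetch(remote_graphs, my_uri, local_graphs):
    """Pick the remote graphs this device should copy.

    remote_graphs: iterable of NamedGraph advertised by another triple store.
    my_uri: URI identifying this device.
    local_graphs: dict mapping graph URI to the NamedGraph already stored locally.
    """
    wanted = []
    for graph in remote_graphs:
        # Only graphs explicitly shared with this device are replicated.
        if my_uri not in graph.shared_with:
            continue
        local = local_graphs.get(graph.uri)
        # Assumption: a copy is refreshed only if the remote version is newer.
        if local is None or local.modified < graph.modified:
            wanted.append(graph)
    return wanted
```

A daemon would call something like `graphs_to_fetch` once per discovered triple store on every 5-minute round, then download each graph in the returned list.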


Exposing API data as Linked Data

The Institute of Development Studies (IDS) is a UK-based institute specialising in development research, teaching and communications. As part of their activities, they provide an API to query their knowledge services data set, comprising more than 32k abstracts or summaries of development research documents related to 8k development organisations, almost 30 themes and 225 countries and territories.

A month ago, Victor de Boer and I got a grant from IDS to investigate exposing their data as RDF and building some client applications that make use of the enriched data. We aimed at using the API as it is and creating 5-star Linked Data by linking the created resources to other resources on the Web. The outcome is the IDSWrapper, which is now freely accessible, both as HTML and as RDF. Although this is still a work in progress, this wrapper already shows some of the advantages of publishing the data as Linked Data.

Enriched data through linkage

When you query for a document, the API tells you the language in which this document is written, for instance “English”. The wrapper replaces this information with a reference to the matching resource in Lexvo. The property “language” is also replaced by the equivalent property defined in Dublin Core, which is commonly used to denote the language a given document is written in. For the data consumer, Lexvo provides alternate spellings of the language name in different languages. Instead of just knowing that the language is named “English”, the data consumer, after dereferencing the data from Lexvo, will know that this language is also known as “Anglais” in French or “Engelsk” in Danish.
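As an illustration of this substitution, here is a small Python sketch. It is not the wrapper's code: the `enrich_language` function and the hard-coded lookup table are hypothetical (the real wrapper queries Lexvo instead), and using the Dublin Core Terms `language` property rather than another Dublin Core variant is an assumption.

```python
# Assumption: the Dublin Core Terms property is the one used for the language.
DCTERMS_LANGUAGE = "http://purl.org/dc/terms/language"

# Hypothetical static table; the wrapper looks these up in Lexvo instead.
LEXVO_BY_NAME = {
    "English": "http://lexvo.org/id/iso639-3/eng",
    "French": "http://lexvo.org/id/iso639-3/fra",
}


def enrich_language(record):
    """Return a copy of an API record with its language turned into a link.

    The plain "language" key holding a name such as "English" is replaced
    by the Dublin Core property pointing at the matching Lexvo resource.
    Records with an unknown language name are returned unchanged.
    """
    name = record.get("language")
    lexvo_uri = LEXVO_BY_NAME.get(name)
    if lexvo_uri is None:
        return dict(record)
    enriched = {key: value for key, value in record.items() if key != "language"}
    enriched[DCTERMS_LANGUAGE] = lexvo_uri
    return enriched
```

For example, `enrich_language({"title": "Some report", "language": "English"})` yields a record whose language is the Lexvo URI for English, which a consumer can then dereference to get the localised names.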

Part of the description of a document

Links can also be established with other resources to enrich the results provided. For instance, the information provided by IDS about the countries is enriched with a link to their equivalent in Geonames. That provides localised names for the countries as well as geographical coordinates.

Part of the description of the resource "Gambia"

Similarly, the description of themes is linked with their equivalent in DBpedia to benefit from the structured information extracted from their Wikipedia page. Thanks to that link, the data consumer gets access to some extra information such as pointers to related documents.

Part of the description of the theme "Food security"

Besides, the resources exposed are also internally linked. The API provides an identifier for the region a given document is related to. In the wrapper, this identifier is turned into the URI corresponding to the relevant resource.

Example of internal link in the description of a document

Integration on the data publisher side

All of these links are established by the wrapper, using either SPARQL requests (for DBpedia) or calls to data APIs (for Lexvo and Geonames). This is something any client application could do, obviously, but one advantage of publishing Linked Data is that part of the data integration work is done server side, by the person who has the most information about what the data is about. A data consumer just has to use the links already there instead of having to figure out a way to establish them.
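To give an idea of what such a SPARQL request can look like, here is a hedged Python sketch that only builds the query and the request URL. The exact query used by the wrapper is not shown in this post, so matching a theme on its English `rdfs:label` is an assumption, and the function names are hypothetical. The endpoint is the public DBpedia one, and the `format` parameter is specific to its Virtuoso server.

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def dbpedia_label_query(label, lang="en"):
    """Build a SPARQL query finding a DBpedia resource by its exact label."""
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?resource WHERE {\n"
        f'  ?resource rdfs:label "{escaped}"@{lang} .\n'
        "}\n"
        "LIMIT 1"
    )


def dbpedia_request_url(label, endpoint=DBPEDIA_ENDPOINT):
    """Return the GET URL that submits the label query and asks for JSON results."""
    params = urlencode({
        "query": dbpedia_label_query(label),
        "format": "application/sparql-results+json",
    })
    return f"{endpoint}?{params}"
```

Calling `dbpedia_request_url("Food security")` gives a URL that can be fetched with any HTTP client; the resource returned, if any, is the candidate to link the theme to.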

A single data model

Another advantage for a data consumer is that all the data published by the wrapper, as well as all the connected data sets, are published in RDF. That is one single data model to consume. A simple HTTP GET asking for RDF content returns structured data for the content exposed by the wrapper as well as for the data from DBpedia, Lexvo and Geonames. There is no need to worry about different data formats returned by different APIs.
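In practice, "asking for RDF content" means sending an Accept header with the GET request. The sketch below uses only the Python standard library; the function names are hypothetical, and RDF/XML is just one of the RDF serialisations a server may offer.

```python
from urllib.request import Request, urlopen


def rdf_request(uri, mime="application/rdf+xml"):
    """Prepare a GET request asking the server for an RDF serialisation."""
    return Request(uri, headers={"Accept": mime})


def fetch_rdf(uri, mime="application/rdf+xml"):
    """Dereference a resource URI and return the RDF document as text."""
    with urlopen(rdf_request(uri, mime), timeout=10) as response:
        return response.read().decode("utf-8")
```

The same `fetch_rdf` call works whether the URI points at the wrapper, DBpedia, Lexvo or Geonames, which is the point of having a single data model.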

Next steps

We are implementing more linking services and also working on making the code more generic. Our goal, which is only partially fulfilled for now, is to have a generic tool that only requires an ontology for the data set to expose it as Linked Data. The code is freely available on GitHub. Watch us there to stay up to date with the evolution of the project 😉
