CKAN->network exporter for the LOD Cloud

The LOD cloud as rendered by Gephi

One year ago, we posted a first network model of the LOD cloud on the LarkC blog. Network analysis software can highlight aspects of the cloud that are not directly visible otherwise, in particular the presence of dense sub-groups and of several hubs, whereas in the classical picture DBpedia is easily perceived as the only hub.

Computing network measures such as centralities, the clustering coefficient or the average path length can reveal much more about the content of a graph and the interplay of its nodes. As shown since that blog post, this information can be used to appreciate the evolution of the Web of Data and to devise actions to improve it (see the WoD analysis page for more information about our research on this topic). Unfortunately, the picture provided by Richard and Anja on lod-cloud.net cannot be fed directly into network analysis software, which expects a .net file or CSV files instead. Fortunately, thanks to the very nice API of CKAN.net, it is easy to write a script that generates such files. We made such a script and thought it would be a good idea to share it 🙂
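To make the measures mentioned above a bit more concrete, here is a minimal Python sketch that computes degree centrality, the local clustering coefficient and the average path length. It is not taken from our script: the function names and the four-node toy graph are made up for illustration, the graph is treated as undirected, and a real analysis would rather rely on Gephi or a dedicated library.

```python
from collections import deque

# Toy undirected graph; node names are placeholders, not real LOD cloud data.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]


def build_adjacency(edges):
    """Turn an edge list into a dict mapping each node to its set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def clustering_coefficient(adj, node):
    """Share of the pairs of neighbours of `node` that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2 * links / (k * (k - 1))


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


adj = build_adjacency(edges)
print(degree_centrality(adj))
print({node: clustering_coefficient(adj, node) for node in adj})
print(average_path_length(adj))
```

In this toy graph, "c" plays the role of a hub: it is linked to every other node, while "d" hangs off it with a clustering coefficient of zero.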

The script is hosted on GitHub. It produces a “.net” file in the Pajek format and two CSV files, one for the nodes and one for the edges. These CSV files can then easily be imported into Gephi, for instance, or any other software of your choice. We also made a dump of the cloud as of today and packaged the resulting files.
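For readers curious about what these files look like, the following Python sketch serialises a small directed graph into the three outputs described above. It is an illustration only and not the code of the script on GitHub: the function names, the placeholder dataset names and the choice of the "Id"/"Label" and "Source"/"Target"/"Type" column headers (the ones Gephi's spreadsheet import recognises) are assumptions, and the step that fetches datasets and links from CKAN is left out.

```python
import csv
import io

# Placeholder data: hypothetical dataset names and links, not a real dump.
nodes = ["dataset-a", "dataset-b", "dataset-c"]
edges = [("dataset-a", "dataset-b"), ("dataset-c", "dataset-b")]


def to_pajek(nodes, edges):
    """Serialise the graph as Pajek .net text (1-based vertex ids, directed arcs)."""
    index = {name: i for i, name in enumerate(nodes, start=1)}
    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{i} "{name}"' for name, i in index.items()]
    lines.append("*Arcs")
    lines += [f"{index[source]} {index[target]}" for source, target in edges]
    return "\n".join(lines) + "\n"


def to_nodes_csv(nodes):
    """Node table: one row per dataset, using its name as both id and label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Id", "Label"])
    for name in nodes:
        writer.writerow([name, name])
    return buffer.getvalue()


def to_edges_csv(edges):
    """Edge table: one row per link between two datasets."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type"])
    for source, target in edges:
        writer.writerow([source, target, "Directed"])
    return buffer.getvalue()


print(to_pajek(nodes, edges))
print(to_nodes_csv(nodes))
print(to_edges_csv(edges))
```

Writing each returned string to a file (for instance a .net file and two .csv files, with names of your choice) gives something that Pajek or Gephi should be able to open.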

Have fun analysing the graph and let us know if you find something interesting 😉