LD2SD, Helios and Linked Data / SemWeb extracted from development forges

Last week we received a visitor from DERI, PhD student Aftab Iqbal, who is researching how to integrate facts about software development tools into Linked Data, in order to provide interesting “semantic mashups” of data in IDEs like Eclipse (see his slides). His choice of ontologies is quite interesting, as are the results, which are integrated in an Eclipse plugin made available to developers.

This approach is quite similar to the one we practice in the core of the Helios platform (still under development), which integrates data coming from different FLOSS ALM tools in order to create dashboards offering a consolidated view of the software (maintenance) process.
The main difference may be that Helios does this internally, inside a “self-contained” platform, whereas the potential of LD2SD as presented by Aftab is to do the same on the Web of Linked Data.

Also in Helios, there are other quite interesting contributions made for the Mandriva distribution (linked to projects like Scribo and Nepomuk, in which Mandriva also participates), in the form of doc4.mandriva.org. This time the aim is to aggregate facts not at the “project” level, but for a meta-project (a GNU/Linux distribution). See Stéphane Laurière’s slides for details.

Within the COCLICO project, we’re also experimenting with producing RDFa data about software development projects hosted in FusionForge instances (see our progress tracked through this FusionForge feature request). The first candidates are the projects’ DOAP profiles and the developers’ FOAF ones, with lots of SIOC to glue it all together, and of course other information relating to a Forge ontology that we’re proposing in COCLICO.
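To give an idea of what such RDFa output could look like, here is a minimal Python sketch that renders a project page fragment annotated with DOAP and FOAF terms. It is not the FusionForge code: the function name `project_rdfa` and the example.org URIs are made up for illustration, and only the DOAP and FOAF namespace URIs and terms (`doap:Project`, `doap:name`, `doap:developer`, `foaf:Person`, `foaf:name`) come from the published vocabularies.

```python
from html import escape

# Namespace URIs of the public DOAP and FOAF vocabularies.
PREFIXES = {
    "doap": "http://usefulinc.com/ns/doap#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def project_rdfa(project_uri, name, developers):
    """Render one project as an HTML fragment annotated with RDFa.

    `developers` is a list of (person_uri, full_name) pairs.
    This helper is hypothetical, not part of FusionForge.
    """
    xmlns = " ".join(
        f'xmlns:{prefix}="{uri}"' for prefix, uri in PREFIXES.items()
    )
    lines = [
        f'<div {xmlns} about="{escape(project_uri)}" typeof="doap:Project">',
        f'  <h1 property="doap:name">{escape(name)}</h1>',
        "  <ul>",
    ]
    for person_uri, full_name in developers:
        # rel/resource links the project to the person;
        # the inner span describes the person itself.
        lines.append(
            f'    <li rel="doap:developer" resource="{escape(person_uri)}">'
            f'<span about="{escape(person_uri)}" typeof="foaf:Person" '
            f'property="foaf:name">{escape(full_name)}</span></li>'
        )
    lines += ["  </ul>", "</div>"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(project_rdfa(
        "http://forge.example.org/projects/demo",
        "Demo project",
        [("http://forge.example.org/users/alice#me", "Alice Example")],
    ))
```

An RDFa-aware parser reading this fragment should find that the project is a `doap:Project` with a name and one developer, who is in turn a `foaf:Person`. SIOC terms for forum posts or user accounts could be added to the same markup in the same way.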

With the recent announcements that Mylyn is investing a lot in OSLC, which is based on RDF, and with the Semantic Desktop starting to emerge (mainly in KDE) on top of Nepomuk, all this holds great promise for a Semantic future.

Very interesting presentations this morning at OWF about the future of the Semantic Desktop

This morning I attended the OWF session on the future of the Semantic Desktop, with excellent presentations by Stefan Decker (DERI) on the concepts of the Semantic Web and the Social Semantic Desktop, by the Zeitgeist project guys (Seif Lotfy and Alexander Gabriel), and finally by Sebastian Trüg demonstrating the Nepomuk semantic desktop components in KDE.

It was a good occasion to meet these people (together with Henry Story) and talk a little bit about our efforts in the area of bugtracking and Semantic Web, and to discuss the future of the Baetle ontology, and do more teasing for fetchbugs4.me.

I hope that some day we will integrate the models and tools, so that bugs filed on bugtrackers can be referenced and manipulated with Desktop tools through interoperable APIs and common ontologies. More work ahead of us in Helios 🙂

2 presentations about Helios, Semantic Web, bugs, etc. at RMLL 2009

In the “Development” track of the recent LSM/RMLL 2009, Stéphane Laurière and I presented two related talks about the use of Semantic Web technology in Open Source project development.

Stéphane presented SWIM: Semantic Web enabled Issue Manager, which integrates Semantic Web techniques into the Mandriva community support site and on the desktop. It’s based on results of the Nepomuk, Helios and Scribo projects.

I also presented Tracking bugs on the (Semantic) Web, which explores the use of Semantic Web techniques (RDF) as a means to make bugtrackers interoperable, so that bugs can be tracked at the scale of the whole Semantic Web. This is also based on the work we do in the Helios project.
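As a rough illustration of what interoperable bugtrackers would allow (this is a sketch, not the code behind the talk): suppose each tracker publishes RDF triples stating that one of its bugs is the same issue as a bug in another tracker. Merging those triples then lets a tool follow one bug across trackers. The property URI `SAME_ISSUE`, the function `linked_bugs` and the example.org bug URIs below are all hypothetical.

```python
# Hypothetical property URI, used only for this sketch.
SAME_ISSUE = "http://example.org/bugs#sameIssueAs"


def linked_bugs(triples, start):
    """Return every bug URI reachable from `start` through
    SAME_ISSUE links, followed in either direction."""
    neighbours = {}
    for subject, predicate, obj in triples:
        if predicate == SAME_ISSUE:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)
    seen, todo = {start}, [start]
    while todo:
        for bug in neighbours.get(todo.pop(), ()):
            if bug not in seen:
                seen.add(bug)
                todo.append(bug)
    return seen


if __name__ == "__main__":
    # Triples as they might be harvested from two different trackers.
    distro_a = [("http://a.example.org/bug/12",
                 SAME_ISSUE,
                 "http://upstream.example.org/bug/7")]
    distro_b = [("http://b.example.org/bug/99",
                 SAME_ISSUE,
                 "http://upstream.example.org/bug/7")]
    for uri in sorted(linked_bugs(distro_a + distro_b,
                                  "http://a.example.org/bug/12")):
        print(uri)
```

Neither distribution's tracker needs to know about the other here: the link between their two bugs only appears once both sets of triples are merged, because they both point at the same upstream bug URI.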

Enjoy the slides attached to the linked pages above.

UDD, SWIM, Flossmetrics: facts databases about libre software distributions… going Semantic?

I attended the recent FOSDEM 2009 (great as always), where a number of presentations caught my interest.

First @DebianRoom, where Lucas presented UDD, the Universal Debian Database. This database groups facts about the Debian project, to make it easier to write queries about what’s happening in the distribution. This is for instance very helpful for QA tasks, like counting bugs with certain characteristics, or comparing packages in various ways.
Note also a very interesting complementary presentation by Enrico on DDE: Debian Data Export (edit: see also his post on DDE), showing ways to offer services to query UDD.

Another presentation, @CrossDesktopRoom, introduced the Flossmetrics database, which is collected from many libre software projects by extracting project data from the hosting forges. This is very interesting, in particular because the data is becoming available, and because the large number of projects covered allows researchers to compare them in many ways.

Maybe Flossmetrics could benefit from data coming from the Debian UDD… or vice versa? I think contacts have been made to consider a potential future interchange between the two.

A general criticism I could make of these two databases is that their schema (the layout of tables and columns, as well as any relations between them) and the code of the data “harvesters” are the only way to understand the real meaning of the data. There’s not much semantics. Sometimes this is for known reasons: as explained by the UDD developers, there’s actually a lot of inconsistency between some of the Debian tools already, and it still manages to deliver 😉

I’m thinking of a way to produce similar databases of facts (or the results of queries on them) with Semantic Web standards, to try and convey some commonly agreed semantics, hence fostering interoperability between these databases, and maybe allowing comparison of facts relating to different projects. Edit: actually, this idea has been in the air for a year, as readers of a previous post may remember. See in particular the paper by James Howison presented last year at WOPDASD 2008.
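To make the idea a bit more concrete, here is a minimal Python sketch of how rows from a relational table of package facts could be re-published as RDF in N-Triples syntax. The column names (`package`, `version`, `open_bugs`, `maintainer`) and both example.org namespaces are invented for the example; they are not the actual UDD or Flossmetrics schema, nor an existing ontology.

```python
# Hypothetical namespaces: neither is a published vocabulary.
VOCAB = "http://example.org/distro-facts#"
BASE = "http://example.org/debian/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def literal(value):
    """Serialize a Python value as an N-Triples literal."""
    if isinstance(value, int):
        return f'"{value}"^^<{XSD_INTEGER}>'
    text = (str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n"))
    return f'"{text}"'


def rows_to_ntriples(rows):
    """Turn rows of a package table (dicts with assumed column
    names) into a list of N-Triples lines."""
    out = []
    for row in rows:
        subject = f"<{BASE}package/{row['package']}>"
        out.append(f"{subject} <{RDF_TYPE}> <{VOCAB}Package> .")
        # Plain columns become literal-valued properties.
        for column in ("version", "open_bugs"):
            out.append(
                f"{subject} <{VOCAB}{column}> {literal(row[column])} ."
            )
        # A foreign key becomes a link to another resource.
        out.append(
            f"{subject} <{VOCAB}maintainer> "
            f"<{BASE}person/{row['maintainer']}> ."
        )
    return out


if __name__ == "__main__":
    rows = [{"package": "soprano", "version": "2.2",
             "open_bugs": 3, "maintainer": "jdoe"}]
    print("\n".join(rows_to_ntriples(rows)))
```

The point of the exercise is that each column gets a URI that can be documented, and ideally shared between databases, instead of its meaning living only in the schema and in the harvester code.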

It happens that Mandriva, as a follow-up to the Nepomuk project, is indeed trying to set up such a database (called SWIM at the moment. Edit: it has been renamed to MEPHISTO. See my other post on it.), using RDF ontologies to store facts and annotations about its distribution (more details here). In the HELIOS project, we’ll certainly investigate the use of such techniques to manipulate this kind of data, like bugs for instance.

I’m thinking about providing access to UDD through a SWIM-like service, so that we could imagine things like more linking of facts about packages, people, bugs and such between Mandriva and Debian, for instance.

Note that at FOSDEM there were also interesting presentations on these kinds of semantic techniques, both relating to outcomes of the Nepomuk project: one about the integration of KDE 4.2 in Debian, where tools like Soprano were mentioned, and another about Tracker in Gnome (which I didn’t attend), covering the same kind of technology on the Gnome side.

The future seems semantic, somehow… so we have a lot of work ahead of us. More to come.