Linked data: new ontologies website

Hello, I'm Sofia Angeletou and I'm the Data Architect for the Linked Data Platform (LDP), which builds the BBC's services for creating and publishing linked data.

I'm going to talk to you about our new ontologies site, which we released last week and where you can find the ontologies that the BBC uses to support news prototypes and, soon, BBC and Radio programmes.

What is it and why are we doing it?

The owner of the Linked Data Platform has described how we have expanded the reach of linked data within the BBC to more audience-facing products, and has presented our ambition to use linked data as the glue for the plethora of content the BBC produces. As a direct result, more models are being built to support additional functionality and to cover new and diverse domains of interest.

/ontologies is a human-friendly view of the data models in the Linked Data Platform and is meant to give a comprehensive understanding of which ontologies the BBC uses, why and how. It is provided for members of the public and anyone who wants to get a better understanding of the BBC's Linked Data.

Those of you who have visited before will immediately notice the different look, but the main changes lie beneath the presentation.

The previous /ontologies did not reflect the BBC's work with Linked Data, and was updated in an ad hoc manner. It had grown organically as a result of various projects and hosted the schemas used by applications for publishing; yet these applications were not built on a semantic stack and did not use a triplestore for data manipulation (the exception being the Sport ontology, which has been used in the Linked Data Platform for bbc.co.uk/sport since 2010).

The Linked Data Platform team is now responsible for /ontologies as part of its wider goal of supporting the appropriate use and management of ontologies. We want the BBC to continue being part of the Linked Open Data (LOD) ecosystem. To this end we have decided that /ontologies should reflect our work with Linked Data, be more open and transparent, and make the first step towards opening up the BBC's data, as explained by the BBC's Director of Strategy and Digital, James Purnell.

What does it contain and how does it work?

The models published in /ontologies live in our triplestore and are the basis for the Linked Data services offered to the clients of the LDP. They fall into three main categories: models about the content, the reference data and the applications.

The content model is the ontology used for content metadata, such as an ID from the originating CMS and the date the content was published. It is used to associate creative works with the things they are about and to link to the human-readable view of the creative work.
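As a rough illustration of the kind of statements such a content model captures, the sketch below holds one creative work as plain subject-predicate-object triples in Python. Every URI and property name in it (for example `ex:identifier` and `ex:about`) is a hypothetical placeholder chosen for this example, not a term taken from the BBC ontologies.

```python
# Hypothetical triples describing one creative work. Every name below is
# an illustrative placeholder, not an actual term from a BBC ontology.
WORK = "ex:creativework/123"

triples = [
    (WORK, "ex:identifier", "cms-123"),                  # ID from the originating CMS
    (WORK, "ex:datePublished", "2014-01-01"),            # made-up publication date
    (WORK, "ex:about", "ex:thing/some-football-team"),   # the thing the work is about
    (WORK, "ex:humanReadableView", "http://example.org/articles/123"),
]


def objects_of(subject, predicate, data):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in data if s == subject and p == predicate]


about = objects_of(WORK, "ex:about", triples)
```

Asking for the `ex:about` values of a work is the basic lookup that lets a page about a thing list the content created about it.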

The domain ontologies are used to describe the things for which the BBC creates content: the reference data. These include an ontology that describes the main things the BBC talks about (People, Places, Events, Organisations and generic Topics), the Sport ontology, which supports BBC Sport, an ontology that supports BBC Education, and an ontology built in collaboration with other media organisations, which supports the News connected stories pilot propositions. These ontologies tend to be owned by product teams (e.g. News or Sport) and not the LDP team.

Last, but not least, there are application ontologies which encode application logic: one that supports the LDP's interaction with content management systems, one that helps us manage and audit data, and one that describes the products, web documents and platforms for which the BBC produces content.

The new /ontologies is hosted on Amazon AWS using the BBC's cloud tools and uses the LDP APIs to obtain the ontologies from the live triplestore. This means that the ontologies you see are the exact same models used to support the BBC's live websites. Aside from a minimal amount of manual documentation, all ontologies are self-documenting, so the term documentation changes dynamically as the ontologies change in the store.
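To make the self-documenting idea concrete, here is a minimal Python sketch that renders a term's documentation from annotations stored with the ontology itself. It assumes the annotations are carried by `rdfs:label` and `rdfs:comment`, which are standard RDFS properties; the `ex:Person` term, its description and the `document_term` helper are invented for illustration and say nothing about how the LDP APIs actually work.

```python
# Minimal sketch of "self-documenting" terms: the documentation is rendered
# from annotations stored alongside the ontology, so it changes whenever the
# stored triples change. The ex: term and its text are hypothetical.
ontology = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Person", "rdfs:label", "Person"),
    ("ex:Person", "rdfs:comment", "An individual the content is about."),
]


def document_term(term, triples):
    """Build a one-line description of `term` from its label and comment."""
    annotations = {p: o for s, p, o in triples if s == term}
    label = annotations.get("rdfs:label", term)
    comment = annotations.get("rdfs:comment", "(no description)")
    return f"{label}: {comment}"


doc = document_term("ex:Person", ontology)
```

Because the text is derived from the triples on every call, editing the comment in the store is all it takes to update the published description.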

Another important feature is that all ontology changes will also be public. Each new version contains a change reason which justifies its existence. All previous versions are available and one can easily find the delta of the models.
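One simple way to think about the delta between two versions is as a set difference over their triples. The Python sketch below shows that idea under stated assumptions: the two versions and their `ex:` terms are hypothetical, and this is not a description of how the site itself computes differences.

```python
# Two hypothetical versions of the same ontology, each a set of triples.
v1 = {
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Team", "rdf:type", "owl:Class"),
}
v2 = {
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Organisation", "rdf:type", "owl:Class"),
}


def delta(old, new):
    """Return the (added, removed) triples between two versions."""
    return new - old, old - new


added, removed = delta(v1, v2)
```

A plain set difference like this is only reliable when neither version contains blank nodes, since those have no stable identifier to compare.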

The reason we did this is twofold. On the one hand, we work in an agile manner and our ontologies change very frequently; on the other hand, we want to have an up-to-date public-facing view of our ontologies. So we chose to publish all changes in order to avoid the maintenance overhead and the risk of having internal and external versions of the same ontology. The changes are usually raised by our clients to support additional functionality in their products, or by the LDP team to improve our services.

We have often been asked why we built our own models given that there's already an enormous number of existing vocabularies out there. The main reason is that small and controlled models are easier to change. Our approach to ontology building is iterative and incremental, reflecting the same agile approach we take to building our products. Although at times it's tempting to model the world, we try to keep the ontologies as minimal as possible, modelling only what's needed to meet particular requirements. We embrace fast failure over perfect first time, and the impact this has on our ontologies is that we will actively deprecate and ultimately remove terms that don't work for us. Frequently we have a long backlog of change requests that are often dependencies for functionality that needs to be rolled out immediately. Engaging with a large community to clarify the semantics of a widely adopted model is often not viable when speed is of the essence. In addition, the nuances of each use case might not be appropriately represented in an open vocabulary. Our primary goal is to deliver functionality for the live site quickly; alignment with the LOD world is a secondary goal, which we address by providing mappings from our models to the popular vocabularies in the LOD cloud.
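Linking a small local model to widely used vocabularies can itself be expressed as a handful of triples. In the Python sketch below, `owl:equivalentClass`, `rdfs:subClassOf`, `foaf:Person` and `schema:Place` are real terms, but the `ex:` classes and the specific links between them are hypothetical and only illustrate the shape such links might take.

```python
# Hypothetical links from a local model to external vocabularies.
# The ex: names and these particular links are illustrative only.
mappings = [
    ("ex:Person", "owl:equivalentClass", "foaf:Person"),
    ("ex:Place", "rdfs:subClassOf", "schema:Place"),
]


def external_terms(local_term, triples):
    """List the external terms that a local term is linked to."""
    return [o for s, _, o in triples if s == local_term]


targets = external_terms("ex:Person", mappings)
```

Keeping these links separate from the model means the local terms can change quickly while the alignment is updated at its own pace.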

What next?

As I previously mentioned, we view opening up our ontologies as the first step to opening up our data. The immediate next steps include providing more data about how the ontologies are used, for example through links to instance data: which people, places, football teams and more does the BBC create content about? What would you like to see next?

Sofia Angeletou is the Data Architect for the Linked Data Platform, BBC Future Media
