Thursday, June 30, 2011

OWL-izing the Cognitive Atlas

Recently I have spent some time turning the Cognitive Atlas knowledge base into a full-blown ontology.  This effort came about as I was responding to reviews of a paper describing the project (submitted to Frontiers in Neuroinformatics).  We made the choice at the outset of the project to store the knowledge base in a custom relational database, rather than using an ontology language as the basic storage format, because we wanted to retain the flexibility to store things that are not necessarily ontology-friendly.  We always said that we would later come up with a way to translate the knowledge into a formal ontology, but I began to get a bit itchy about our ability to actually do this.  So, over the last few weeks I have created a pipeline for turning the database into an OWL ontology.  I have made the code and ontology available at https://github.com/poldrack/cogat in case anyone is interested in having a look.  I am a complete newbie when it comes to ontologies, so any comments on the structure of the ontology are very welcome. (We would also welcome comments on the content of the ontology, but if you don't like the content then you really should just go into the Cognitive Atlas site and change it yourself!)

I started with RDF dumps of the concepts and tasks, which are available from the SPARQL endpoint page.  Assertions about relations in the database are not currently available from the endpoint, so I got a custom dump of those as well.  I then wrote a Python script to finesse all of this into an OWL ontology.  I have to say that this is a great way to learn about the structure of OWL!  I used Protégé to help validate my OWL and examine the contents, which was really useful.
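For readers curious what such a conversion step might look like, here is a minimal sketch using only the Python standard library.  It is not the actual script from the repository: the base IRI, the `build_ontology` function, the dictionary layout of the concepts, and the choice to treat every relation as `rdfs:subClassOf` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/cogat.owl"  # hypothetical IRI, not the real one

# Register prefixes so the serialized XML uses rdf:, rdfs: and owl:.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def build_ontology(concepts, parents):
    """Build an RDF/XML tree with one owl:Class per concept.

    concepts: list of {"id": ..., "name": ...} dicts (assumed layout).
    parents:  dict mapping a concept id to a list of parent concept ids,
              standing in for the relation assertions from the database.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": BASE})
    for concept in concepts:
        cls = ET.SubElement(
            root,
            f"{{{OWL}}}Class",
            {f"{{{RDF}}}about": f"{BASE}#{concept['id']}"},
        )
        label = ET.SubElement(cls, f"{{{RDFS}}}label")
        label.text = concept["name"]
        # Assumption: every relation is rendered as a subclass axiom.
        for parent_id in parents.get(concept["id"], []):
            ET.SubElement(
                cls,
                f"{{{RDFS}}}subClassOf",
                {f"{{{RDF}}}resource": f"{BASE}#{parent_id}"},
            )
    return root


if __name__ == "__main__":
    demo = build_ontology(
        [
            {"id": "memory", "name": "memory"},
            {"id": "working_memory", "name": "working memory"},
        ],
        {"working_memory": ["memory"]},
    )
    print(ET.tostring(demo, encoding="unicode"))
```

The printed RDF/XML can be saved to a `.owl` file and opened in Protégé to check that the classes and hierarchy look as expected.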

Part of the work in setting up this ontology was deciding how to align it with other ontologies, particularly with CogPO. I had a long meeting at OHBM2011 with Jessica Turner and Angie Laird, during which we hashed out a set of classes and relations that will serve to structure the task representation in both of our ontologies.  For our representation of tasks, we are inheriting the BehavioralExperimentalParadigm class from CogPO. 
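To make the inheritance concrete, here is a small sketch of how a single task could be declared as a subclass of the CogPO class, written out as Turtle.  The two namespace IRIs are placeholders rather than the real CogPO or Cognitive Atlas IRIs, `task_as_turtle` is a hypothetical helper, and modelling each task as its own OWL class is an assumption of this sketch.

```python
COGPO = "http://example.org/cogpo.owl#"  # placeholder, not the real CogPO IRI
COGAT = "http://example.org/cogat.owl#"  # placeholder, not the real IRI


def task_as_turtle(task_id, label):
    """Return Turtle declaring one task as a subclass of the CogPO paradigm class.

    task_id is assumed to be a simple identifier (letters, digits,
    underscores) that is valid as a Turtle local name.
    """
    # Escape backslashes and double quotes so the label is a valid literal.
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join(
        [
            "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
            "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
            f"@prefix cogpo: <{COGPO}> .",
            f"@prefix cogat: <{COGAT}> .",
            "",
            f"cogat:{task_id} a owl:Class ;",
            f'    rdfs:label "{safe_label}" ;',
            "    rdfs:subClassOf cogpo:BehavioralExperimentalParadigm .",
        ]
    )


if __name__ == "__main__":
    print(task_as_turtle("stroop_task", "Stroop task"))
```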

I have also registered the Cognitive Atlas with the NCBO BioPortal site, which I hope will give it increased visibility.  In the long run, I plan to create a pipeline by which the ontology will be automatically generated and updated directly from the database, but for now I will do regular manual updates.