Latest News
Nepomuk Virtual Folders - The Next Level
Well, maybe "The Next Level" is overstating it but I improved the query API a lot. Not only can we now properly handle all sorts of literal comparisons but we can also use plain SPARQL queries. The latter allow some nice stuff like "Recent Files".
For anyone interested, the "Recent Files" virtual folder is coded using the following SPARQL query:
select ?r where {
?r a <http://freedesktop.org/standards/xesam/1.0/core#File> .
?r <http://freedesktop.org/standards/xesam/1.0/core#sourceModified> ?date .
} ORDER BY DESC(?date) LIMIT 10
A very simple query that just selects the 10 most recent files.
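For readers without a Nepomuk store at hand, the effect of ORDER BY DESC(?date) LIMIT 10 can be sketched in plain C++. This is only an illustration: the FileEntry struct, the recentFiles function and the integer timestamp are assumptions made for this sketch and are not part of Nepomuk or Soprano.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one query row: a file URI plus its modification time.
struct FileEntry {
    std::string uri;
    long long modified; // seconds since the epoch (assumed representation)
};

// Return at most `limit` entries, newest first,
// mirroring "ORDER BY DESC(?date) LIMIT n" from the SPARQL query.
std::vector<FileEntry> recentFiles(std::vector<FileEntry> files, std::size_t limit)
{
    const std::size_t n = std::min(limit, files.size());
    // Only the first n positions need to be sorted.
    std::partial_sort(files.begin(),
                      files.begin() + static_cast<std::ptrdiff_t>(n),
                      files.end(),
                      [](const FileEntry& a, const FileEntry& b) {
                          return a.modified > b.modified;
                      });
    files.resize(n);
    return files;
}
```

Calling recentFiles(entries, 10) gives the same kind of result as the query above: the ten most recently modified entries, or fewer if there are not that many.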
Also nice are the folders listing all files modified today or yesterday. Anyway, time for a little screenshot and for me to code some query creation GUI (and maybe nested virtual folders).
Soprano 2.0.98 alpha1 released
KDE 4.1 is getting close and so is Soprano 2.1. To be ready for the first KDE 4.1 alpha I hereby declare Soprano 2.0.98 released. Get it while it's warm and enjoy a preview of the new features:
- SignalCacheModel to restrict the number of emitted statementsAdded and statementsRemoved signals in a certain timeframe.
- Raptor serializer now supports all raptor serializer factory names which are mapped to Soprano user serialization types.
- Changed mimetype of N-Triples to "application/n-triples"
We Don't Search...
Virtual Folders in KDE
You can tag files, you can annotate them, Strigi indexes your files, and I showed how to create new information types and things, but so far you could not really use any of it. I suspect you want to find these things again by searching for them. Well, I don't think we should search. We should simply find!
OK, enough of the bragging. Introducing Nepomuk virtual folders.
Step 1 - Predefined Virtual Folders
These are the default predefined folders. An example (and honestly the only one I have defined so far) is the folder that lists all music files (actually all music files indexed by Strigi, so yes, Strigi needs to be running for this to work best).
Step 2 - On-the-fly Virtual Folders
We can also define our virtual folders on the fly by simply using the nepomuksearch:/ protocol in combination with the query (I personally would prefer to hide the protocol in a nice GUI but that is polishing for the future). If we for example want to list all resources related to Nepomuk:
We get files tagged with tag Nepomuk, we get folders that contain the term "Nepomuk" (indexed via Strigi again) and we also get the tags themselves. The latter is most interesting as the tags are Nepomuk resources which do not relate to any file. KIO can list them anyway and even displays the correct type. The display of the correct type which is read from the Nepomuk store was achieved by a very small patch to KFileItem which I hope will be accepted. It basically treats the "application/x-nepomuk-resource" mimetype as a special case very much like "application/x-desktop". But instead of reading the type from the desktop file it is read via Nepomuk::Resource::resourceType. Simple but effective.
Well, off to a better example, a real example. Let's say we wrote down some notes using KWrite and saved them in some folder anywhere in the home directory:
Now instead of clicking through the directory hierarchy in search of the file we simply access it using a Nepomuk virtual folder. Be aware that the word "nepomuk" is part of the text itself while the word "notes" is part of the file name.
As a last little example we reuse the information created in my last blog: my friend Tudor. He can of course also be accessed through a virtual folder:
Again the type information we created ourselves is used in the KIO slave. Well, that is it for today. I am off to work some more on the folders, making the query language more expressive (at some point we will need variables in there!), and maybe creating a virtual folder service. The latter could then be used not only for KIO virtual folders but also in KMail or some other application to list whatever resources one likes.
Blog title plagiarism: "Will the real Nepomuk please stand up!"
Now what is that supposed to mean? The "real" Nepomuk? Well, you did not actually think that I would introduce an RDF store into KDE just to save some tags and ratings? No, the "real" motivation goes way beyond that and it is time to hint at it.
Today I committed the PIMOShell to the Nepomuk playground (To the right you see the PIMOShell main window showing all xesam:Music resources).
The PIMOShell is a metadata maintenance and debugging tool which I will now use to give a glimpse of the "big picture".
But first a few words on PIMO: PIMO, the Personal Information Model Ontology, forms the basis for all custom, user-created classes (types) and properties. It defines basic stuff like an Agent or a Location and is intended to be extended by the user in any way he or she likes (more information on PIMO).
Let us dive into an example:
By using the context menu in the upper left class list PIMOShell allows us to create new classes with a nice little dialog. Basic information like a label and an optional icon and description can be added directly. We create a new class "Friend" which is a special kind of Person: a close friend.
Once the class is created it shows up as a new subclass of pimo:Person which in turn is a subclass of pimo:Agent. It can now be used like any other class in the system. That essentially means that we can create instances:
Again we can set basic properties like the label and the image. We create an instance of Tudor, my friend from DERI, Galway. Once we did so, PIMOShell lists Tudor as a new instance of Friend:
Now Tudor has been created as a new RDF resource in the Nepomuk storage. And that means we can query him using the simple Nepomuk query client:
When searching for all resources of type "Friend" we find Tudor. Nice. 
But simply creating new classes is not fun enough. To categorize friends we might be interested in what relates us to them. Like a mutual interest. Thus, we create a new property:
As you can see the new property will be for our new class "Friend". And once it is created we can change its value in the lower pane:
The mutual interest that I share with Tudor is complaining about bad food (just for the sake of the example of course).
Again the new information allows us to find Tudor:
I personally think this is quite cool. Of course the PIMOShell is not intended for the end user. But it gives an idea of the possibilities. A PIM application might categorize according to user-created classes with additional fields, saved text documents might be typed, we can organize arbitrary data in a powerful way. All we need is a deeper application integration. And this is where you come in. I hope.
The Last Bug...
We probably all know the situation: I finally fixed the last bug in Soprano. Yes, I know, there probably is no such thing as the last bug. But it feels good to lie to myself in this case. The redesigned Nepomuk server is done and it works smoothly.
First of all, the Nepomuk server is no longer a KDED module. So no more 90% CPU load for KDED. It was too heavyweight for KDED anyway. As I recently learned, KDED was never intended to be a general purpose service daemon but a manager for small and stable modules. So now the Nepomuk server has its own service management including dependency handling. Each Nepomuk service runs in its own child process and can be controlled through D-Bus, either through the Nepomuk server's service manager or through the process's own interface directly (the idea of course is not new: I took the ProcessControl class from Akonadi, thanks guys).
This has several advantages:
- Implementing a Nepomuk service is still as simple as writing a KDED module except that you derive from Nepomuk::Service instead and use the NEPOMUK_EXPORT_SERVICE macro.
- The Nepomuk repository (storage) is a service itself like any other which makes for a much cleaner design.
- Strigi is also handled through a Nepomuk service. Again: clean.
- A buggy service will never bring down the whole system.
- Services can perform as many blocking operations as they want. When implementing a service you don't have to care about threading or asynchronous calls (unless the service itself has to stay responsive).
- The dependency handling automatically delays starting of services until the Nepomuk storage is ready (in case it needs to perform some longer initialization like converting to a new backend)
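The dependency handling from the last point can be pictured as a depth-first ordering: a service is only started once everything it depends on has been started. The following is a minimal sketch in plain C++, not the actual Nepomuk service manager; the DependencyMap type, the startOrder function and the service names used with it are assumptions made for illustration.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// service name -> names of the services it depends on (hypothetical layout)
using DependencyMap = std::map<std::string, std::vector<std::string>>;

namespace {
// Depth-first visit: emit all dependencies of `name` before `name` itself.
void visit(const std::string& name, const DependencyMap& deps,
           std::set<std::string>& started, std::set<std::string>& inProgress,
           std::vector<std::string>& order)
{
    if (started.count(name))
        return; // already scheduled
    if (!inProgress.insert(name).second)
        throw std::runtime_error("dependency cycle at " + name);
    const auto it = deps.find(name);
    if (it != deps.end()) {
        for (const std::string& dep : it->second)
            visit(dep, deps, started, inProgress, order);
    }
    inProgress.erase(name);
    started.insert(name);
    order.push_back(name);
}
}

// Compute an order in which every service comes after its dependencies.
std::vector<std::string> startOrder(const DependencyMap& deps)
{
    std::set<std::string> started, inProgress;
    std::vector<std::string> order;
    for (const auto& entry : deps)
        visit(entry.first, deps, started, inProgress, order);
    return order;
}
```

With a map in which a "strigi" entry depends on a "storage" entry, startOrder places "storage" first, which is the behaviour described above: nothing that needs the storage is started before the storage itself.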
All in all I am very pleased with the new design. I will commit all of it next Monday since we have the new Nepomuk::Service class.
And in case you are wondering why we need a Nepomuk service manager: at the moment we have a total of six Nepomuk services:
- The data storage
- Strigi
- The ontology loader which makes sure ontologies are up to date (and will soon support importing new ontologies from the web)
- The file watch service which monitors file move and delete operations and updates the metadata accordingly (still in playground)
- The alignment service which tries to optimize the data stored in Nepomuk (still in playground and very simple, at some point it will do stuff like merging duplicate contact information and so on)
- The Nepomuk Social Query Daemon which I blogged about before (in playground and more a proof-of-concept but still....)
One future service will be the search service which provides a nice search interface to all KDE applications (that do not want to use low level SPARQL stuff). Work on something in this direction has begun as part of Akonadi.
Nepomuk Performance and GUI goodies
Some words on performance
Nepomuk performance has always been a bit of a problem. Not least, this was due to the D-Bus communication with the Nepomuk server that took place all the time. Don't get me wrong, D-Bus is pretty fast, but you always get the overhead of marshalling the messages and routing them through the D-Bus daemon.
So with the new QLocalServer and QLocalSocket in Qt 4.4, which also work on Windows, I re-enabled the Soprano local socket communication which is a lot faster.
Now the Nepomuk server provides two interfaces: the good old and very easy to use D-Bus interface and the fast binary local socket interface. (The latter is barely documented since it is only intended for Soprano itself through Soprano::Client::LocalSocketClient).
To use the new interface one could of course create an instance of LocalSocketClient but that is not recommended for two reasons:
- The path to the socket would be hardcoded in the application
- The local socket communication does not support signals
That is why libnepomuk provides a fancy Soprano::Model implementation that handles all this transparently. It executes all commands through the socket interface while listening for signals via D-Bus. At the moment this Model can be accessed through
Soprano::Model* mainModel = Nepomuk::ResourceManager::instance()->mainModel();
Internally a new class called Nepomuk::MainModel is used. That one is not public API yet but might become public at some point to allow developers to use the interface without creating a ResourceManager instance.
Apart from that I also did some small optimizations in Soprano which I tracked down using valgrind. Very cool, I think that is actually the first time I was able to improve performance based on valgrind results.
GUI stuff
Ok, enough of the internal implementation details. Let's have a look at some nice GUI improvements that I would like feedback on. The target is Dolphin which has been providing Nepomuk annotations for quite a while now. The problem always was that I created ugly prototype widgets which I never improved (except for the rating one). Yesterday evening I changed that and committed a tagcloud and a new commenting widget to Dolphin. Now it still looks a bit cluttered but IMHO much better than before.
The basic idea for both is to display the comment and the tags read-only and allow the user to edit them in a fancy popup which appears after clicking on the appropriate button. In the case of the tagcloud we get another tagcloud which shows all available tags, with the assigned ones selected.
Please go ahead and test it and let me know how it can be improved. This does include the look and feel as well as the layout of the sidebar (spacing and margins and alignment and stuff).
Me Nepomuk, You Nepomuk, too?
Now that the Nepomuk project review is done I can get back to promoting Nepomuk features and possibilities. Today I will show how existing Nepomuk and Soprano technologies can be combined to provide very simple "Social" capabilities.
In a previous blog entry I presented the Nepomuk search client which allows searching the Nepomuk data store based on installed types and properties. Now how about taking that, wrapping it in the simple Soprano TCP server/client system and announcing it via Avahi? That would allow us to query our buddies' Nepomuk data. I did exactly that and the result is two little tools with very fancy names: The Nepomuk Social Query Daemon and Client.
The nsqd, the Nepomuk Social Query Daemon, is a kded module which provides an Avahi service that is then found by the nsqclient, the Nepomuk Social Query Client, which allows performing queries on remote data in the exact same way as the normal search client does. Easy.
Now as this is just a showcase tool there is close to no security except for read-only access. Thus, once the nsqd is running everyone able to open a connection to the server port is able to read your data. So there is room for improvement. 
The nsqd and nsqclient can be found (like all experimental Nepomuk stuff) in the KDE svn playground module. Due to the security issues the nsqd is not started by default but has to be started manually:
qdbus org.kde.kded /kded org.kde.kded.loadModule nsqd
Implementation
If you are not interested in hacking details just stop reading now.
Let's look at some details. The simplest thing first: the read-only access. I did this by implementing a read-only Soprano Model (which I actually moved to Soprano::Util since it has a clean API and seems usable beyond this example). This is actually pretty simple: I just derived it from Soprano::Model and made all writing methods fail with a "permission denied" error. (I did not derive from Soprano::FilterModel since then one could easily access the parent model and write to it anyway.)
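The read-only wrapper idea can be shown without Soprano. The sketch below is plain C++ with a made-up, much smaller interface: Model, MemoryModel, ReadOnlyModel, Statement and the Error enum are all assumptions for illustration and do not reproduce the real Soprano classes. Writes are rejected with an error code, reads are forwarded, and the wrapped model is deliberately not exposed.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Statement { std::string subject, predicate, object; };

enum class Error { None, PermissionDenied };

// Hypothetical minimal model interface (not the Soprano::Model API).
class Model {
public:
    virtual ~Model() = default;
    virtual Error addStatement(const Statement& s) = 0;
    virtual Error removeStatement(const Statement& s) = 0;
    virtual std::vector<Statement> listStatements() const = 0;
};

// Simple writable in-memory model used as the wrapped store.
class MemoryModel : public Model {
public:
    Error addStatement(const Statement& s) override {
        m_data.push_back(s);
        return Error::None;
    }
    Error removeStatement(const Statement& s) override {
        m_data.erase(std::remove_if(m_data.begin(), m_data.end(),
                         [&s](const Statement& o) {
                             return o.subject == s.subject &&
                                    o.predicate == s.predicate &&
                                    o.object == s.object;
                         }),
                     m_data.end());
        return Error::None;
    }
    std::vector<Statement> listStatements() const override { return m_data; }
private:
    std::vector<Statement> m_data;
};

// Forwards reads, rejects every write.
class ReadOnlyModel : public Model {
public:
    explicit ReadOnlyModel(const Model& wrapped) : m_wrapped(wrapped) {}
    Error addStatement(const Statement&) override { return Error::PermissionDenied; }
    Error removeStatement(const Statement&) override { return Error::PermissionDenied; }
    std::vector<Statement> listStatements() const override {
        return m_wrapped.listStatements();
    }
private:
    const Model& m_wrapped; // no accessor: callers cannot reach the writable model
};
```

The missing accessor for the wrapped model is the point of the design choice mentioned above: a filter-style wrapper that hands out its parent would let a client bypass the write protection.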
Exposing the local Nepomuk store through TCP was simple, too. Soprano already comes with a simple binary TCP server/client implementation (which I used before the cleaner D-Bus one). So all I had to do was create a Soprano::Server::ServerCore implementation which forwards all calls to the local Nepomuk server.
This is actually pretty simple. In our new ServerCore subclass we create a connection to the Nepomuk server:
SopranoForwardingCore::SopranoForwardingCore( QObject* parent )
    : ServerCore( parent )
{
    m_client = new Soprano::Client::DBusClient( "org.kde.NepomukServer", this );
    [...]
Then we have a cache for the models which store the wrapping ReadOnlyModel instances and simply forward:
Soprano::Model* SopranoForwardingCore::model( const QString& name )
{
    if ( m_models.contains( name ) ) {
        return m_models[name];
    }
    else {
        if ( Soprano::Model* model = m_client->createModel( name ) ) {
            Soprano::Util::ReadOnlyModel* roModel = new Soprano::Util::ReadOnlyModel( model );
            model->setParent( roModel ); // memory management
            m_models.insert( name, roModel );
            return roModel;
        }
    }
    return 0;
}
Last but not least we start the server core, i.e. make it listen on some port and promote the service through Avahi:
m_serverCore = new SopranoForwardingCore( this );
m_serverCore->listen( 0 );
m_dnssdService = new DNSSD::PublicService( "Nepomuk Social Query Service",
                                           "_nepomuk._tcp",
                                           m_serverCore->serverPort() );
m_dnssdService->publish();
Now the nsqclient can discover our service through Avahi and connect to it using Soprano::Client::TcpClient:
Soprano::Model* createModel( DNSSD::RemoteService::Ptr service, const QString& name )
{
    // resolve the host announced via Avahi (blocking lookup)
    QHostInfo hostInfo = QHostInfo::fromName( service->hostName() );
    if ( hostInfo.addresses().isEmpty() ) {
        return 0; // host could not be resolved
    }
    Soprano::Client::TcpClient* sopranoTcpClient = new Soprano::Client::TcpClient();
    sopranoTcpClient->connect( hostInfo.addresses().first(), service->port() );
    return sopranoTcpClient->createModel( name );
}
Again: easy.
Ok, that is all for today. I feel I am getting too technical again anyway.
Soprano 2.0.3 released
One could ask why there are so many bugfix releases for Soprano these days. Well, the reason is simple: we are in the process of preparing the Nepomuk project review. That includes a lot of testing. :)
There are only two fixes but one of them seemed important enough for a new release.
- Fixed a string caching bug in LiteralValue which resulted in invalid string representations when assigning a QDate, QTime, QDateTime, or QByteArray via operator=
Generic and nice-looking ratings all over KDE (wouldn't that be nice)
I just committed the finalized KRatingPainter to svn trunk which allows painting a rating value using any QPainter. I think it is quite nice since it lets you specify the alignment, a spacing, a custom icon, the maximum rating, a hover rating, and so forth. And I think it would be great if this class (and its easy-usage widget companion KRatingWidget) were used throughout KDE whenever we want to display a rating value. Although it is part of the Nepomuk lib at the moment, it has no real dependency on it: the rating is a simple integer value.
A little test application shows the rating widget in action:
nepomuk.kde.org online
I am proud to announce that nepomuk.kde.org has finally gone online. I owe a big thank you to pinheiro who not only designed the new Nepomuk icon but also did the webpage layout. I also want to thank Luke Parry who adapted pinheiro's design for Drupal. Last but not least Dirk Mueller went through the trouble of actually setting up Drupal.
So please check it out and especially take a look at the beefed up documentation section: I wrote a bunch of new tutorials for techbase.
Soprano 2.0.2 released
It has not been long since 2.0.1 but 2.0.2 introduces an important change (not really a fix) in the MutexModel.
MutexModel in ReadWriteMultiThreading mode now allows multiple read operations from the same thread at the same time even if a write operation is waiting. This fixes a deadlock in the Nepomuk Strigi backend from KDE.
The source package is available as of now.
Soprano 2.0.1 released
The Soprano team is proud to announce the release of Soprano 2.0.1. This maintenance release fixes a number of issues with 2.0:
- Fixed method statementCount in Sesame2 backend
- Redland backend: Always encode strings as xsd:string rather than rdfs:Literal values to match the Soprano guidelines.
- Always set a dummy base URI in the raptor serializer
- Fixed formatting of dateTime values.
- Fixed NRL namespace
- Fixed NAO namespace
- Fixed plugin loading on Mac OS X
Nepomuk Appendix A - RDF for Dummies in a Nutshell
In my previous posts I used some terms that probably need explaining. The following descriptions should not be used as basis for any exam and may very well scare some academic semantic web professionals, but they get me through the day. And I think they are sufficient to understand most of what is going on with Nepomuk data in KDE.
RDF - The Resource Description Framework describes a way of storing data. While "classical" databases are based on tables, RDF data consists of triples and only triples. Each triple, called a statement, consists of
subject - predicate - object
The subject is a resource, the predicate is a relation, and the object is either another resource or a literal value. A literal can be a string or an integer or a double or any other type defined by XML Schema (actually it is even possible to define custom literal types). Since RDF was born as a web technology all resources and relations are identified by their unique URI. (Meaning they have a namespace often ending in a # and a name. Typically abbreviations such as foo:bar are used.) Thus, a dataset in RDF is basically a graph where resources are the nodes, predicates the links, and literals act as leaves.
RDF defines one important default property: rdf:type, which assigns a type to a resource.
RDFS - The RDF Schema defines a set of resources and properties extending RDF. This extension basically makes it possible to define ontologies. RDFS defines the two important classes rdfs:Resource and rdfs:Class which introduce the distinction between instances and types, as well as properties to define type hierarchies: rdfs:subClassOf and rdfs:subPropertyOf, and rdfs:domain and rdfs:range to specify details when defining properties.
This allows to create new classes and properties much like in object oriented programming. For example:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foo: <http://foo.bar/types#> .
foo:Human rdf:type rdfs:Class .
foo:Woman rdf:type rdfs:Class .
foo:Woman rdfs:subClassOf foo:Human .
foo:isMotherOf rdf:type rdf:Property .
foo:isMotherOf rdfs:domain foo:Woman .
foo:isMotherOf rdfs:range foo:Human .
foo:Mary rdf:type foo:Woman .
foo:Mary foo:isMotherOf foo:Carl .
A simple example of how to define an ontology in RDFS (using the Turtle language). The last two important predicates in RDFS are rdfs:label and rdfs:comment which define human readable names and comments for any resource (the labels are used for matching fields and grouping results in my previous blog on search).
NRL - The Nepomuk Representation Language was developed in Nepomuk to further extend on RDFS. I will not go into detail and explain everything about NRL but keep to what is important with respect to KDE at the moment.
Most importantly NRL changes triples to quadruples where the fourth "parameter" is another resource defining the graph in which the statement is stored (may be empty which means to store in the "default graph"). This graph (or context as it is called in Soprano) is just another resource which groups a set of statements and allows to "attach" information to this set. NRL defines a set of graph types of which two are important here: nrl:InstanceBase and nrl:Ontology. The first one defines graphs that contain instances and the second one, well you guessed it, defines graphs that contain types and predicates.
To make this clearer let's extend our example with NRL stuff:
@prefix foo: <http://foo.bar/types#> .
foo:graph1 rdf:type nrl:Ontology .
foo:graph2 rdf:type nrl:InstanceBase .
foo:Human rdf:type rdfs:Class foo:graph1 .
foo:Woman rdf:type rdfs:Class foo:graph1 .
foo:Woman rdfs:subClassOf foo:Human foo:graph1 .
foo:isMotherOf rdf:type rdf:Property foo:graph1 .
foo:isMotherOf rdfs:domain foo:Woman foo:graph1 .
foo:isMotherOf rdfs:range foo:Human foo:graph1 .
foo:Mary rdf:type foo:Woman foo:graph2 .
foo:Mary foo:isMotherOf foo:Carl foo:graph2 .
But making a distinction between ontology and instance resources is not all we gain from contexts.
NAO - The Nepomuk Annotation Ontology defines resource types and properties you have already encountered in KDE: nao:Tag or nao:rating. But it also defines nao:created, a property that assigns an xsd:dateTime literal to a resource, in our case a graph. This way we store information about when a piece of information was inserted into the Nepomuk repository.
foo:graph1 nao:created "2008-02-12T14:43.022Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
SPARQL - The Query Language for RDF is what we use to query the RDF repository. Its syntax has been designed close to SQL but since it is quite young it is by far not as powerful yet.
Anyway, this is what a simple query that retrieves the mother of Carl looks like:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foo: <http://foo.bar/types#>
select ?r where { ?r foo:isMotherOf foo:Carl . }
Or if we take NRL into account:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foo: <http://foo.bar/types#>
prefix nrl: <http://semanticdesktop.org/ontologies/2007/08/15/nrl#>
select ?r where { graph ?g { ?r foo:isMotherOf foo:Carl . } . ?g rdf:type nrl:InstanceBase . }
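To make the graph restriction more tangible, here is a small plain C++ sketch that evaluates the same idea over an in-memory list of quadruples. It is an illustration only and has nothing to do with how Soprano executes SPARQL; the Quad struct, the match and mothersOf functions, and the convention that an empty string acts as a wildcard are assumptions of this sketch.

```cpp
#include <string>
#include <vector>

// One statement plus the graph (context) it is stored in.
// An empty graph name stands for the default graph.
struct Quad { std::string subject, predicate, object, graph; };

// In a pattern, an empty string is treated as a wildcard (sketch convention).
bool matches(const std::string& pattern, const std::string& value)
{
    return pattern.empty() || pattern == value;
}

// Return all quads that fit the given pattern.
std::vector<Quad> match(const std::vector<Quad>& store, const Quad& pattern)
{
    std::vector<Quad> hits;
    for (const Quad& q : store) {
        if (matches(pattern.subject, q.subject) &&
            matches(pattern.predicate, q.predicate) &&
            matches(pattern.object, q.object) &&
            matches(pattern.graph, q.graph))
            hits.push_back(q);
    }
    return hits;
}

// All ?r with "?r foo:isMotherOf child" stored in a graph ?g
// for which "?g rdf:type nrl:InstanceBase" holds.
std::vector<std::string> mothersOf(const std::vector<Quad>& store,
                                   const std::string& child)
{
    std::vector<std::string> result;
    for (const Quad& q : match(store, {"", "foo:isMotherOf", child, ""})) {
        if (q.graph.empty())
            continue; // default graph carries no type information
        if (!match(store, {q.graph, "rdf:type", "nrl:InstanceBase", ""}).empty())
            result.push_back(q.subject);
    }
    return result;
}
```

Fed with the quadruples from the NRL example, mothersOf(store, "foo:Carl") yields foo:Mary, because the matching statement lives in foo:graph2, which is typed as nrl:InstanceBase.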
I think this is enough for today. I hope this blog entry helps in understanding the inner workings of Nepomuk better. Let me just give one more hint: Soprano (the RDF storage solution we use in KDE) comes with static QUrl objects for most of the common resource URIs. You find them in the Soprano::Vocabulary namespace.
Fetch, Nepomuk, fetch!
Search - a very important topic when it comes to data in general. The same is true for metadata and all that is Nepomuk. I blogged about the virtual folders idea for KMail which will be realized through Nepomuk. But before that there is the "simple" desktop search. We know it from systems like Beagle or Strigi. With Nepomuk, however, a lot more is possible. We are just getting started.
Let me give a quick glance at what I am doing regarding search. Now that Strigi analyzes files and the Nepomuk extensions to Dolphin allow tagging and commenting on files, we surely want to reuse that information. On the list of simple ways to exploit the data in the Nepomuk store, search is No. 2 (No. 1 being a simple display of it). We want the desktop search to handle manual metadata like tags and automatically gathered metadata alike.
Well, that is possible and I am doing it already in playground:
Now isn't that nice? You can combine searching for tags with other metadata searches. So far so good. It gets better: Nepomuk is based on RDF/S/NRL ontologies. Thus, each metadata type and field is defined by an RDF resource. In most cases (for example Xesam) these come with proper rdfs:label definitions. Thus, Nepomuk can not only group the results automatically (see the File, Image, or Music groups) but can also handle search fields generically. What does that mean? Well, it means that when searching for "hastag:nepomuk", "hastag" will be matched to nao:hasTag automatically. The same would be true for "tag" as we are doing a fulltext search on field names. And even better: if the ontologies are translated (RDF supports language tags after all) you can search the same fields using your native language and the results will be grouped in your native language (I could use some help on setting up a translation system like the one for desktop files here). It all happens generically without any hardcoded mapping. Pretty cool, isn't it?
OK, so much for the outer shell. Let's dive into the code for a bit. (But please keep in mind that I have plans to wrap this into a nice search service soon which allows most application developers to perform their simple day-to-day queries without knowing much SPARQL.)
If we want to find the proper field to match in a field:value query we can do as follows:
QString field = getFieldNameWhateverFooBarBlaBla();
QString query = QString( "select ?p where { "
                         "?p <%1> <%2> . "
                         "?p <%3> \"%4\"^^<%5> . }" )
                .arg( Soprano::Vocabulary::RDF::type().toString() )
                .arg( Soprano::Vocabulary::RDF::Property().toString() )
                .arg( Soprano::Vocabulary::RDFS::label().toString() )
                .arg( field )
                .arg( Soprano::Vocabulary::XMLSchema::string().toString() );
Soprano::QueryResultIterator labelHits = model->executeQuery( query, Soprano::Query::QueryLanguageSparql );
This will give us all direct hits for a property (field) label. However, in most cases users will enter a slight variation of the actual label. Thus, we use a fuzzier search:
QString query = QString( "select ?p where { "
                         "?p <%1> <%2> . "
                         "?p <%3> ?label . "
                         "FILTER(REGEX(STR(?label),'%4','i')) . }" )
                .arg( Soprano::Vocabulary::RDF::type().toString() )
                .arg( Soprano::Vocabulary::RDF::Property().toString() )
                .arg( Soprano::Vocabulary::RDFS::label().toString() )
                .arg( field );
The regular expression simply filters all properties with a label that matches our field string.
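What the FILTER does can be tried out in isolation. The sketch below uses std::regex on a plain map from property URI to label. The matchProperties function and the sample labels used with it are made up for illustration. Like the query above, it uses the field text as the pattern without escaping it, so regex metacharacters in user input would need extra care in real code.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Return the URIs of all properties whose label contains `field`,
// ignoring case, similar to FILTER(REGEX(STR(?label), field, 'i')).
// `labels` maps a property URI to its rdfs:label (hypothetical input).
std::vector<std::string> matchProperties(
    const std::map<std::string, std::string>& labels,
    const std::string& field)
{
    // The field is used as the pattern as-is (not escaped), see note above.
    const std::regex pattern(field, std::regex::ECMAScript | std::regex::icase);
    std::vector<std::string> hits;
    for (const auto& entry : labels) {
        if (std::regex_search(entry.second, pattern))
            hits.push_back(entry.first);
    }
    return hits;
}
```

A search for "tag" against a label such as "has tag" would then return the corresponding property URI, which can be used as the field in the actual value query.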
And then it gets a bit tricky as there is one problem left in Soprano: The RDF storage solutions we use (Redland or Sesame2) do not have performant full-text search indexes. Thus, for Soprano I implemented a wrapper that uses a CLucene index to provide a fast full-text index on all literal RDF triples (The Nepomuk server already uses it so there is no need to instantiate it on the client side). I have plans to hide this transparently under a nice Soprano query API but so far we do not have that. As a result we have to perform full-text queries and "normal" SPARQL queries separately (as always I need help implementing this).
Let's say we got a field URI from our previous search and stored it in fieldUri.
QString value = getSearchValueWhateverFooBarBlaBla();
Soprano::QueryResultIterator hits = model->executeQuery( fieldUri.toString() + ':' + value,
Soprano::Query::QueryLanguageUser,
"lucene" );
And as a result we get all the resources that match the query.
This is just a small excerpt of what I am doing in the search client and what will soon be done in the search service but it should give you an idea of how things need to be done ATM. More complex queries are of course possible but the blog entry is already too long as it is. 
Akonadi and Nepomuk - Holding Hands in Osnabrück
Last weekend I was invited to the KDE-PIM meeting in Osnabrück to represent Nepomuk. First of all I have to say: thanks a lot for inviting me, guys. The meeting was a lot of fun (although staying awake got harder during the course of the three days you crazy work-maniacs!) and it was great to see known faces again and meet new nice people. As they have during the last years Intevation hosted the event and I want to give a quick thanks to them, too.
So much for the introduction. Let's dive into the good stuff now. The main focus of the meeting was the plans for KDE 4.1 and the integration of Akonadi. However, the part that is most interesting to me is the Nepomuk integration. And this is where I was very pleasantly surprised. I did not have to do any convincing or arguing at all. It was obvious that Nepomuk would be the solution for search in Akonadi. And not only that. The understanding of the concepts was flawless.
So what are the plans for Akonadi-Nepomuk integration?
- Tagging in KDE-PIM: The most obvious integration at the moment is without a doubt the replacement of categories in KDE-PIM with Nepomuk tags. This would relate PIM resources with tagged files (and of course any other resource type in the future).
- Akonadi Agents to push data into Nepomuk: Akonadi has the concept of agents. Agents are plugins (although running in their own process) that act on changed data in the Akonadi store. In this case the agents will gather changed data and push it into the Nepomuk storage so it gets searchable and indexed properly. Tobias König already started a first agent which handles contact data, meaning that it converts the Akonadi items into NCO resources which are then stored into Nepomuk.
- Virtual folders in KMail: KMail will combine the current static folder layout with virtual folders based on live searches. A virtual folder selects a set of emails based on a Nepomuk query. This can turn out to be very powerful since one can define queries that do simple things like "select all emails that contain picture attachments" or more complex stuff like "select all emails that were sent by someone who participated in events tagged with 'KDE-PIM'" or even very fuzzy ones like "select all emails relating to a certain topic". For this to work Tobias and I started to create a higher level query interface. Although it is currently possible to do these queries, one has to do so by using the Soprano SPARQL query interface which may be too much for many applications.
While this is by no means a complete list it shows the direction Nepomuk integration will take in KDE-PIM. A fact I am very happy about.
So much for the high level report about the KDE-PIM meeting. More technical details about the implementation and the problems that still have to be solved later...
A little bit of tagging
For many Nepomuk is a rather abstract thing. So I will not try to explain it as a project again. I will just show what I have been up to. Randomly...
Tagging and KIO
We all know KIO and we all love it (At least I think we all do, right?). Now it was pretty obvious to create a KIO slave that allows to navigate the Nepomuk tags as folders. Writing this was not that hard. Just listing the tags and doing some plain magic around that:
QList<Nepomuk::Tag> tags = Nepomuk::Tag::allTags();
foreach( Nepomuk::Tag tag, tags ) {
    doMagic( tag );
}
So no problem there:
However, now that we can browse the tags in Dolphin, we can also rate them in Dolphin. And this is where the trouble starts: The tag URLs used in the tags KIO slave differ from their original resource URIs (remember: Nepomuk uses RDF for storage and, thus, each resource has a unique URI). The original URI looks like nepomuk://foobar while the tags KIO slave of course uses tags:/<tag name>. This is a problem since now Dolphin will store ratings and comments for the tags under the tags URI and not the original one. (this is due to the fact that KIO does not allow to have different URLs for navigation and identification, maybe this could be tackled in KDE 4.x or 5.0?)
So what to do? The simple answer is called alignment. At least that is what we call it in Nepomuk. It references a service that aligns multiple resources that actually refer to the same entity. In general this can become arbitrarily complicated. In our case, however, we can use the brute force way and simply replace tag URIs.
So now we have a kded module in playground/base/nepomuk-kde (BTW: this is where all the experimental stuff happens) that does exactly that. To give you an idea of how something like this looks, here is a bit of code:
QString query = QString( "select distinct ?tag ?name where { "
                         "?tag a <%1> . "
                         "?tag <%2> ?name . "
                         "FILTER(!REGEX(STR(?tag),'^tags:/')) . }" )
                .arg( Soprano::Vocabulary::NAO::Tag().toString() )
                .arg( Nepomuk::Resource::labelUri() );
QList<Soprano::BindingSet> tagsToChange = sopranoModel()->executeQuery( query, Soprano::Query::QueryLanguageSparql ).allBindings();
foreach( Soprano::BindingSet set, tagsToChange ) {
    QUrl oldUri = set["tag"].uri();
    QString name = set["name"].toString();
    QUrl newUri( "tags:/" + name );

    // rewrite all statements that have the tag as subject
    QList<Soprano::Statement> tagStatements = sopranoModel()->listStatements( oldUri, Soprano::Node(), Soprano::Node() ).allStatements();
    foreach( Soprano::Statement s, tagStatements ) {
        sopranoModel()->removeStatement( s );
        s.setSubject( newUri );
        sopranoModel()->addStatement( s );
    }

    // rewrite all statements that have the tag as object
    tagStatements = sopranoModel()->listStatements( Soprano::Node(), Soprano::Node(), oldUri ).allStatements();
    foreach( Soprano::Statement s, tagStatements ) {
        sopranoModel()->removeStatement( s );
        s.setObject( newUri );
        sopranoModel()->addStatement( s );
    }
}
This is basically how you do more advanced Nepomuk data handling. Using Soprano + SPARQL. Sadly SPARQL does not officially support update queries yet but it is a pretty new technology and we will get there.
Well, that's pretty much it. It changes the tag URIs and that results in a merge of the tag annotations with the original resource. It is a bit simple but does the job. And as a side effect: when you open a search result that is a tag you directly get to the tags KIO slave and thus to the tagged resources. Fun, eh? Ok, more on search next time.
Finally I got my kdedevelopers blog running
Why I failed before: I have no idea. Maybe the password mails got lost in the spam filter or I was just blind. Who knows. What matters is that I am not using blogger, which clee will like to hear (although I will be missing the WYSIWYG editor). Anyway, I got it now and I can start blogging about Nepomuk. Late, I know. I always tried to keep away from it but in the end, today there is no way around it. It simply seems the best way to inform people about my work.
Well then, let me toy with the special syntax a bit: bold text, italic text. Don't I just blow you away with my HTML skills? I know, it is amazing!
Ok, enough of this nonsense now. Next entry will be about Nepomuk. I promise.