Random Thoughts and Musings - by Vishesh Handa
Virtuoso problems no more
The Nepomuk team has been working really hard to fix the problems with virtuoso consuming too much memory and often just going bat crazy. And now finally we've figured it out.
It wasn't the obvious solution, but we think it's going to work out very well. From now on virtuoso won't be shown in ksysguard.
Since we won't be able to see virtuoso using up our memory and CPU, it obviously won't be doing it. Now, I get that this logic is a little brain-dead, cause in Linux we have other ways of monitoring our processes as well.
Fortunately, Trueg is in the process of patching up top, and I'm going to be contacting the kernel people to see if we can remove virtuoso's PID from /proc.
Problem solved! :)
Nepomuk Test Framework
Back in October 2010, I was trying to write automated tests for Nepomuk Backup. That turned out to be a huge disaster, but the test suite still had some pretty good stuff in it.
Most of it was based on George Goldberg's Telepathy Test Lib.
Over the last month, I've finally moved it to git, cleaned it up and started using it for real tests.
Why do we need a testing framework?
The Nepomuk Architecture is extremely decentralized. We have one central storage service which handles virtuoso, and other different services for monitoring files, indexing them, performing queries and so on.
These services or plugins often require other services to be running and form a dependency chain. In order to properly test any of them, we need the dependencies to be satisfied. That's where it starts to get messy, especially because Nepomuk primarily uses local sockets and DBus to communicate.
Right now we have a lot of tests that test individual classes from the services, but nothing that tests whether files actually get reindexed after they are modified. Such things have always been tested manually.
Now, with this testing framework, we can launch a separate KDE and DBus session and run tests.
Another reason why we really need this is that from KDE 4.8.1 Nepomuk::Resource also uses DBus in order to write back any of its changes. That effectively kills all of its current unit tests.
And we need unit tests!
Source Code
The code is still in my scratch repo, and there are very few tests, but we're getting there. It has already helped me replicate a nasty PIM Feeder bug, so Yaye! :)
Add Download Metadata in Nepomuk
I end up using wget a lot. I know there are better alternatives, but wget's simplicity has won me over. Plus, with applications like Firefox, I'm not always sure I'll be able to continue the download. That's important when I'm downloading big files, but for small files, it really doesn't matter.
A couple of days back, Martin and I were chatting about storing the metadata of a downloaded file in Nepomuk. I knew it wouldn't be hard. The ontology is already in place, so it was just a matter of pushing the data into Nepomuk.
I estimated that it would take us around half an hour to code a simple prototype. Yesterday we finally decided to do it.
It took around an hour. :)
So, we present to you -
Nepomuk Add Download Metadata (NADM)
Yes, the name is weird. In fact deciding on the name was the hardest part. The Nepomuk code was just around 30 lines -
NADM is a simple executable which when presented with the file and download url will attach the corresponding metadata. So, people who prefer commands can just call it after wget, or write a script to do it automatically.
People who use Firefox can head over to Martin's blog and look at the cool stuff he implemented.
Source Code: kde:scratch/vhanda/nepomuk-add-download-metadata
Notably v0.4
I meant to release a new version of Notably on Friday, but I got sidetracked with some stuff. Plus, I've been spending a lot of time on designing the UI for this release, which I think isn't a good idea. Notably is still not quite mature, and I think right now features are more important than polish.
Last week, I showcased some tagging UIs. They aren't yet ready to be deployed in KDE, as they need to be polished quite a bit. Plus, there is a lot of scope for collaboration when designing UIs.
Changes
Revamped UI
I've gotten rid of most of the custom KWin code. I'd initially wanted my application to look quite different, with a blurred background and fixed size. But that would be locking the user into a fixed interface.
Notably now looks and behaves more like a KDE application. (No more blurred background)
Better Sidebar
Most of the code improvements have been in the sidebar, which now acts as a proper menu and allows navigation.
Experimental Widgets
Some brand new widgets:
Tag Widget
I showcased the new Tag Widget I was working on a couple of days ago. Since then, I've improved the code to make it more maintainable. Unfortunately, it still needs a lot of work.
Tag Cloud
Creating a Tag Cloud turned out to be a greater challenge than I expected. Right now it's implemented with some basic HTML in a QTextBrowser. I'm still experimenting with some custom layout code. Let's see how it goes.
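A minimal sketch of the "basic HTML" approach, in Python for brevity rather than the C++ Notably is written in. The function name, the point sizes and the linear scaling are all assumptions made for illustration, not Notably's actual code. The idea is just that a tag's font size grows with how often it is used, and the resulting fragment is something a rich-text widget such as QTextBrowser could display.

```python
from html import escape


def tag_cloud_html(tag_counts, min_pt=9, max_pt=24):
    """Return an HTML fragment in which more frequent tags get a larger font.

    tag_counts maps a tag name to the number of notes carrying that tag.
    min_pt and max_pt are hypothetical font-size bounds, in points.
    """
    if not tag_counts:
        return ""
    lowest = min(tag_counts.values())
    highest = max(tag_counts.values())
    spread = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    parts = []
    for tag in sorted(tag_counts):
        # Linearly map the count onto the [min_pt, max_pt] font-size range.
        size = min_pt + (tag_counts[tag] - lowest) * (max_pt - min_pt) // spread
        parts.append('<span style="font-size:%dpt">%s</span>' % (size, escape(tag)))
    return " ".join(parts)
```

Sorting the tags alphabetically keeps the cloud stable between refreshes, so only the sizes change as tags are added.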
Tag Browsing
You can browse your notes based on the tags they have been given. This will eventually have to be expanded to allow multiple facets - like tags, dates and so on. Implementing it on the Nepomuk side is fairly simple, but I'm not sure about the interface.
After a couple more releases, when I've gotten most of the main features down, I'll start on polishing it up and moving it to extragear :)
Source Code: kde:notably
Virtuoso going crazy?
There have been cases of virtuoso going a little crazy and consuming a lot of CPU cycles. It's extremely frustrating. However, it's even more annoying when you have no idea what's wrong.
Most of the bug reports we get just say that virtuoso is consuming too much CPU, and that isn't the least bit helpful. So, here is a short guide to figure out what query is causing virtuoso to go crazy.
Listing Queries
Nepomuk contains a query service which is used to cache queries and to execute them asynchronously. We can use it at any point to figure out which queries are being executed.
$ qdbus org.kde.nepomuk.services.nepomukqueryservice
/
/nepomukqueryservice
/nepomukqueryservice/query1
/nepomukqueryservice/query4
/servicecontrol
Each of the /nepomukqueryservice/query[n] represents one query.
Getting the SPARQL Query
$ qdbus org.kde.nepomuk.services.nepomukqueryservice /nepomukqueryservice/query4 queryString
And you'll get something like this -
select distinct ?r ?v2 where { { ?r a
<http://www.semanticdesktop.org/ontologies/2007/11/01/pimo#Note> . ?r
<http://www.semanticdesktop.org/ontologies/2007/08/15/nao#created> ?v2 . }
. ?r <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#userVisible>
?v1 . FILTER(?v1>0) . } ORDER BY DESC ( ?v2 )
This query is extremely important, because without it finding the cause is nearly impossible.
Killing queries
$ qdbus org.kde.nepomuk.services.nepomukqueryservice /nepomukqueryservice/query4 close
This will end the query.
When/If you find virtuoso consuming too much CPU, list out all the queries and close each of them one by one. The moment virtuoso gets better, you'll have your culprit.
That's the query you should post in the bug report.
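The list, inspect and close steps above can be strung together in a small script. This is only a sketch in Python: the helper names (query_paths, close_one_by_one) are made up for illustration, and it assumes nothing beyond the three qdbus invocations shown in this post. It prints each query's SPARQL before closing it, since the string can no longer be fetched once the query is gone.

```python
import re
import subprocess

SERVICE = "org.kde.nepomuk.services.nepomukqueryservice"
# Only paths of the form /nepomukqueryservice/query<n> represent queries.
QUERY_PATH = re.compile(r"^/nepomukqueryservice/query\d+$")


def query_paths(listing):
    """Pick the per-query object paths out of the output of `qdbus SERVICE`."""
    return [line.strip() for line in listing.splitlines()
            if QUERY_PATH.match(line.strip())]


def qdbus(*args):
    """Run qdbus with the given arguments and return its standard output."""
    result = subprocess.run(["qdbus", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def close_one_by_one(run=qdbus, ask=input):
    """Print each running query, close it, then wait for the user.

    `run` and `ask` are parameters only so the function can be tried out
    without a running Nepomuk session.
    """
    for path in query_paths(run(SERVICE)):
        print(path)
        print(run(SERVICE, path, "queryString"))
        run(SERVICE, path, "close")
        ask("Closed. Check virtuoso's CPU usage, then press Enter to continue... ")
```

Calling close_one_by_one() from a terminal walks through the queries; the last query printed before virtuoso calms down is the one to paste into the bug report.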
A better Tagging Widget
A long long time ago, a very simple tagging widget was implemented. We always thought - "Eh! This is temporary. We'll come up with a better one later." But that never happened.
There is a lot of code in Nepomuk. However most of it is backend stuff which does absolutely marvelous things behind the scenes - Auto duplicate merging, type checking with respect to the ontologies, caching and lots more. We, however, lack good UIs.
So, if you're a UI designer looking for a challenge, look at Nepomuk. We have a lot of data.
Anyway, enough promotion! Unlike yesterday, I won't be pointing you towards the source (though it isn't that hard to find). I'll just be showcasing some screenshots. You'll get to try out the tagging widget and whatever-is-in-store-for-tomorrow on Friday.
This was originally implemented with a QListView in flow mode with a custom delegate for tags. Getting it to automatically resize was a pain, and I was missing out on a lot of effects. Eventually, a couple of hours back, someone at #qt pointed me towards Flow Layouts.
I'm in the process of rewriting the old item delegate code into a widget-based one. Minus minor variations it should look the same.
As last time, if someone can make a nice mockup, I'll be more than happy to implement it :)
Nepomuk Tag Manager
Welcome to Nepomuk Tag Week! Well, not really, since it's not an official thing. I've just been working a lot with tags lately, and this week I'm going to be spamming you with some tag related updates (One for every day of the week, minus Monday)
I thought I'd start with something small - Tag Management.
We've been badly needing a UI to allow the users to modify, merge and delete their tags. You could always delete tags using the conventional "Add Tag" dialog, but this way you can do batch deletes.
I'm not much of a UI designer so the interface is quite bare. I'm hoping that someone can come up with a beautiful mockup, which I can then implement.
And with this I can close BUG 258323.
Source Code: kde:scratch/vhanda/nepomuktagmanager
Update -
I've added a Filter bar, merged the "Rename Tag" and "Merge Tags" button, and double clicking on a tag now opens it in the file browser.
Chat logs in Nepomuk
Prototyping is fun. You don't need to care about proper libraries. Your code can be absolutely horrible, cause "Hey! It's just a prototype!"
Yesterday, I started the process of importing my entire gTalk chat history into Nepomuk. It turned out to be a lot simpler than I thought it would be.
Step 1: Get the chat logs
Gmail fortunately allows you to export your chat logs via IMAP. They don't implement the traditional XEP-0136 for fetching archived messages. But at least, unlike Facebook, they provide a mechanism.
I ended up using getmail for importing all chat logs.
getmailrc
[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
mailboxes = ("[Gmail]/Chats",)
username = *****@gmail.com
password = ********
[destination]
type = Maildir
path = ~/Chats/
I originally wanted to use offlineimap but they seem to have a problem fetching the Chats in GMail.
Step 2: Write a parser
The chat logs are presented in a custom XML format encapsulated in the email. The content was in the traditional quoted-printable format, as most emails are. Writing a parser didn't take too long. Plus, with the new Nepomuk Datamanagement APIs, pushing them into Nepomuk was even simpler.
Ideally, this should be implemented as a Strigi analyzer, so that it becomes a part of Nepomuk's Indexing framework. But hey! It's a prototype!
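To give an idea of what such a parser involves, here is a rough sketch in Python. It is not the prototype's code: the element and attribute names (message, from, body) are placeholders, because the real layout of the chat XML isn't reproduced in this post, and it assumes the XML sits in the first non-multipart part of the email. The only real work is undoing the quoted-printable encoding and walking the XML.

```python
import email
import xml.etree.ElementTree as ET


def chat_messages(raw_email):
    """Return (sender, text) pairs from one chat log stored as an email.

    The <message from="..."><body>...</body></message> layout is a
    hypothetical stand-in for the real chat log format.
    """
    message = email.message_from_string(raw_email)
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        # decode=True undoes the quoted-printable transfer encoding
        # and hands back the raw bytes of the XML document.
        xml_bytes = part.get_payload(decode=True)
        root = ET.fromstring(xml_bytes)
        return [(node.get("from"), node.findtext("body", default=""))
                for node in root.iter("message")]
    return []
```

Each file getmail drops into the Maildir can be read as text and passed to chat_messages(); the resulting pairs are what would then be pushed into Nepomuk.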
What's the point of having your chat logs in Nepomuk?
Well, for one, the Telepathians can use this to show chat logs. We'll need a better way of importing them, though: manually calling nepomuk-chat-feeder obviously isn't an option, so we'll have to find a proper way of fetching chat logs.
The second, more personal, use is that I finally have a usable dataset to determine important people in my life - based on the chat frequency and timings. AFAIK Facebook internally uses a combination of likes, comments, chat history and stalking to determine how important a person is to you, and accordingly place them higher in the auto-completion list and chat sidebar.
This obviously has many other applications like altering the chat list based on the people you converse with when you're doing one activity.
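As a toy illustration of ranking people by chat frequency and timings, here is a sketch in Python. The scoring rule is an assumption picked for simplicity (every message counts once, and messages from the last 30 days count three times), and the function and parameter names are made up. It is not how Facebook or Nepomuk does it.

```python
from collections import Counter
from datetime import datetime, timedelta


def rank_contacts(messages, now, recent_days=30, recent_weight=3):
    """Rank contacts by message count, counting recent messages more heavily.

    messages is an iterable of (contact, sent_at) pairs, where sent_at is a
    datetime. recent_days and recent_weight are hypothetical tuning knobs.
    """
    cutoff = now - timedelta(days=recent_days)
    scores = Counter()
    for contact, sent_at in messages:
        scores[contact] += recent_weight if sent_at >= cutoff else 1
    # most_common() sorts by score, highest first.
    return [contact for contact, _ in scores.most_common()]
```

With the chat logs in Nepomuk, the (contact, timestamp) pairs could come straight out of a query, and the same scoring could be restricted to messages sent during one activity.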
Source Code: kde:scratch/vhanda/nepomuk-gtalk-chatlogs


