Revision 8d23f36...

Go back to digest for 20th November 2011

Optimization in KDE Base

Sebastian Trueg committed changes in [nepomuk-core/symlinkHandling] /storage:

Better duplicate merging in storeResources

This is an inefficient way of doing this, but it's better than not
having duplicate merging at all.

The problem is finding duplicates in the provided graph when those
resources do not already exist in the repository. Previously, we
hashed each of the resources, checked for duplicates, and merged
them in a single pass. Then we ran the whole identification & merging
code.

This suffered from the drawback that it would not recognize
duplicates of the form:

:a a nco:Contact ;
   nco:fullName "Peter" ;
   nco:hasEmailAddress :b .

:b a nco:EmailAddress ;
   nco:hasEmailAddress "" .

:c a nco:Contact ;
   nco:fullName "Peter" ;
   nco:hasEmailAddress :d .

:d a nco:EmailAddress ;
   nco:hasEmailAddress "" .

Here :d and :b would be found as duplicates, but :a and :c would not,
because their nco:hasEmailAddress values differ (:b versus :d) and
therefore their hashes differ.

This has been solved by looking for duplicates multiple times: once
:b and :d have been merged, :a and :c hash identically and are caught
by the next pass. The code is something like this:

do {
    // look for duplicates using hashing
    // merge the duplicates
} while( there are duplicates );

// continue with the identification and merging process
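
To make the loop concrete, here is a minimal standalone C++17 sketch of repeated duplicate merging. It is not the code from datamanagementmodel.cpp. The types Resource and Graph and the functions exampleGraph, mergeDuplicatesOnce and mergeDuplicates are hypothetical names invented for illustration. As a simplification, the sketch compares whole property sets instead of computing a hash, and it treats literals and resource references alike as plain strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified model: a resource is a set of (property, value)
// pairs, and a graph maps resource URIs to resources.
using Resource = std::set<std::pair<std::string, std::string>>;
using Graph = std::map<std::string, Resource>;

// Builds the four-resource example from the commit message.
Graph exampleGraph()
{
    return {
        {":a", {{"a", "nco:Contact"}, {"nco:fullName", "Peter"},
                {"nco:hasEmailAddress", ":b"}}},
        {":b", {{"a", "nco:EmailAddress"}, {"nco:hasEmailAddress", ""}}},
        {":c", {{"a", "nco:Contact"}, {"nco:fullName", "Peter"},
                {"nco:hasEmailAddress", ":d"}}},
        {":d", {{"a", "nco:EmailAddress"}, {"nco:hasEmailAddress", ""}}},
    };
}

// One pass: find resources with identical property sets, keep the first of
// each group, and rewrite references to the removed ones.
// Returns the number of resources that were merged away.
int mergeDuplicatesOnce(Graph& graph)
{
    // The property set itself stands in for the hash in this sketch.
    std::map<Resource, std::string> seen;
    std::map<std::string, std::string> replacedBy;
    for (const auto& [uri, res] : graph) {
        auto [it, inserted] = seen.emplace(res, uri);
        if (!inserted)
            replacedBy[uri] = it->second;
    }

    for (const auto& [dup, keep] : replacedBy) {
        assert(graph.count(keep) == 1); // a survivor is never removed itself
        graph.erase(dup);
    }

    // Point every reference to a removed resource at its survivor.
    for (auto& [uri, res] : graph) {
        Resource rewritten;
        for (const auto& [prop, value] : res) {
            auto it = replacedBy.find(value);
            rewritten.emplace(prop, it == replacedBy.end() ? value : it->second);
        }
        res = std::move(rewritten);
    }
    return static_cast<int>(replacedBy.size());
}

// Repeat until a pass finds nothing left to merge.
void mergeDuplicates(Graph& graph)
{
    while (mergeDuplicatesOnce(graph) > 0) {
    }
}
```

On the example graph, the first pass merges only :d into :b and rewrites the reference in :c. That makes :c identical to :a, so the second pass merges those two, and the third pass finds nothing and ends the loop.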

Also added a unit test.

File Changes

Modified 4 files
  • /storage
  •   services/datamanagementmodel.cpp
  •   services/test/datamanagementmodeltest.cpp
  •   services/test/datamanagementmodeltest.h
  •   services/test/qtest_dms.cpp