Posts Tagged subject=triplestore

Reference: “Triple Stores Aren’t”

From the blog of Eric Hellman

Triple Stores Aren’t

“…all the triple stores in serious use today use more than 3 columns to store the triples. Instead of triples, RDF atoms are now stored as 4-tuples, 5-tuples, 6-tuples or 7-tuples.

Is there anything harmful with the misnomerization of “triple”, enough for the community to try their best to start talking about “tuples”? I think there is. Linked Data is the best example of how a focus on the three-ness of triples can fool people into sub-optimal implementations. I heard this fear expressed several times during the conference, although not in those words. More than once, people expressed concern that once data had been extracted via SPARQL and gone into the Linked Data cloud, there was no way to determine where the data had come from, what its provenance was, or whether it could be trusted. He was absolutely correct, if the implementation was such that the raw triple was allowed to separate from its source. If there was a greater understanding of the un-three-ness of real RDF tuplestores, then implementers of linked data would be more careful not to obliterate the id information that could enable trust and provenance. I come away from the conference both excited by Linked Data and worried that the Linked Data promoters seemed to brush off this concern.”
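
To make the provenance point concrete, here is a minimal sketch in Python of a 4-tuple store, where the fourth column names the source graph. The `Quad` type, the `ex:`/`dc:` names and the example.org URLs are all made up for illustration; this is not how any particular triplestore lays out its columns.

```python
from collections import namedtuple

# Hypothetical 4-tuple: the usual triple plus the graph (source) it came from.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "graph"])

quads = [
    Quad("ex:book1", "dc:title", "Example Title", "http://example.org/source-a"),
    Quad("ex:book1", "dc:creator", "ex:alice", "http://example.org/source-a"),
    Quad("ex:book1", "dc:title", "Another Title", "http://example.org/source-b"),
]


def strip_to_triples(quads):
    """Drop the fourth column; after this, provenance cannot be recovered."""
    return {(q.subject, q.predicate, q.obj) for q in quads}


def sources_for(quads, subject, predicate, obj):
    """Return every source graph that asserts the given triple."""
    wanted = (subject, predicate, obj)
    return sorted(q.graph for q in quads if (q.subject, q.predicate, q.obj) == wanted)


triples = strip_to_triples(quads)
print(sources_for(quads, "ex:book1", "dc:title", "Another Title"))
```

Once the data has been reduced to `triples`, there is nothing left to ask about where "Another Title" came from; with the quads, `sources_for` can still answer.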


Reference: “A Reflection on the Structure and Process of the Web of Data”

A Reflection on the Structure and Process of the Web of Data

“What has been the sole territory of relational database technologies may soon be displaced by the use of RDF and the triple store. Moreover, because RDF is the common data model utilized by triple stores, it is possible to integrate data sets across different triple stores – across different RDF data providers. This integration is conveniently afforded by the URI and RDF as web standards and is a function foreign to the relational database domain. With the Web of Data, no longer is information isolated in individual inaccessible data silos, but instead is exposed in an open and interconnected environment – the web environment.”


Schema-less Databases, Transactions and Eventual Consistency

The value of schema-less databases seems to be a topic of emerging interest; see, for instance:

Is the Relational Database Doomed?

The article discusses the potential of key/value databases compared to relational databases (RDBs). The immediate answer to the inflammatory title is, of course, no. However, it seems increasingly clear that many people see significant value in schema-less database approaches, and that adoption is growing.

See also:

How FriendFeed uses MySQL to store schema-less data

There are many responses to the above post, so some reading is required, but it may be worth it. I found it interesting, though, that there were no comments on triplestores. I’m not quite ready to jump in on that yet. I’ll try to come back to it later and see what additional comments have been made.

In any case, performance is always an issue. Many approaches are taken to improve response time, and some form of replication is frequently involved, whether that means copying data into parallel systems or storing it in alternate forms or formats that have different access and update characteristics. This inevitably creates the problem of keeping the various manifestations of the data consistent. Maintaining consistency across parallel systems is therefore a recurrent theme, and one addressed in the following sources:

Eventually Consistent – Revisited

Discusses some of the problems of managing reads and writes while keeping everything consistent.

Sesame 3.0 Preview: An Open Source Framework for RDF Data

From a recent DevX.com article. Mentions the concept of “eventual consistency” in the “Transactions” section.

In our case, end-user results and the process of achieving consistency depend on the order in which one updates:

  • Files
  • Triplestores
  • Solr Indexes

The following references and quote are from an email exchange with Benjamin O’Steen [bosteen@gmail.com].

Writing to serialized data (files), and later updating Solr indices using JMS/AMQP [RabbitMQ] enables “indexes ‘eventually converging’ to the truth within seconds after the event (truth being whatever the data held on disc says is true.)”
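
The write-then-converge pattern described above can be sketched roughly as follows. This is only an illustration: an in-memory dict stands in for the files on disc, another dict for the Solr index, and Python's `queue.Queue` for the JMS/AMQP broker. The function names are hypothetical, and no real Solr or RabbitMQ calls are made.

```python
import queue

files = {}               # stand-in for serialized data on disc (the "truth")
index = {}               # stand-in for a Solr index
updates = queue.Queue()  # stand-in for a JMS/AMQP message queue


def write_document(doc_id, text):
    """Write to the file store first, then announce that doc_id changed."""
    files[doc_id] = text
    updates.put(doc_id)


def drain_updates():
    """Consumer: re-index each announced document from what is on disc now."""
    while not updates.empty():
        doc_id = updates.get()
        index[doc_id] = files[doc_id]
        updates.task_done()


write_document("doc1", "first version")
write_document("doc1", "second version")

stale = index.get("doc1")  # the index has not caught up yet
drain_updates()            # now the index converges on the file contents
```

Because the message carries only the document id and the consumer re-reads the file, the index ends up matching the file contents even if several updates to the same document are queued.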

Changes to an RDF document can be queued as a Talis changeset and later committed.
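
As a rough sketch of the queue-then-commit idea, a changeset can be thought of as a set of triples to remove plus a set of triples to add. The `ChangeSet` class and `commit` function below are hypothetical simplifications for illustration; they are not the Talis changeset vocabulary or any Talis API.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    """Hypothetical changeset: triples to remove, then triples to add."""
    removals: set = field(default_factory=set)
    additions: set = field(default_factory=set)


def commit(graph, pending):
    """Apply queued changesets to the graph in order, then empty the queue."""
    for change in pending:
        graph -= change.removals
        graph |= change.additions
    pending.clear()
    return graph


graph = {("ex:doc", "dc:title", "Old title")}
pending = [
    ChangeSet(
        removals={("ex:doc", "dc:title", "Old title")},
        additions={("ex:doc", "dc:title", "New title")},
    )
]

before = set(graph)  # queued changes are not visible until commit
commit(graph, pending)
```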

Note: This post originally addressed R/W contention in triplestores, and Solr as an approach to it, a topic posed by Declan Fleming.
