Channel #semsol: Logs

This is a public chat log generated from the #semsol IRC channel.

23:37:43 sid: Hi, I need to know: can we actually query a specific dataset at DBpedia by naming it in the FROM clause?
09:00:32 Hory: hey, I've gotten a series of triples into a PHP array with ARC, how can I transfer it to the (mysql) store?
09:18:52 Hory: $store->query('LOAD <http://example.com/home.html>'); :)
09:20:46 ^Jenny^: yes, right :)
09:22:41 kwijibo: morning KiYanWang
09:23:14 ^Jenny^: hi KiYanWang, kwijibo, Hory
09:23:29 kwijibo: hi ^Jenny^
09:54:52 Hory: if I have a website with a "regular" mysql structure, and I want to implement a SPARQL endpoint with ARC, should I adjust all tables to match ARC's store structure and modify my application accordingly or can I just mirror the data into the store, somehow, from time to time?
09:55:20 ^Jenny^: just like you want it
09:55:57 Hory: I'm not very experienced with this, I thought maybe one of these ways is better
09:56:02 Hory: or maybe there's another one
09:58:36 kwijibo: Hory: it depends what is best use of your time
09:59:17 Hory: well my application is pretty small and I could try to convert it to RDF store-like tables
09:59:19 kwijibo: to me it sounds like it is a choice between using RDF internally in your application, or simply publishing it
09:59:40 Hory: but wouldn't a RDF-like store be a performance problem?
10:00:00 Hory: you'd have to do a lot of JOINS to get data
10:00:09 kwijibo: changing it to use RDF internally would be more work - you would have to change all your form processing scripts and so on
10:00:11 Hory: and since the tables have a lot of rows..
10:00:33 kwijibo: Hory: yes, there is a performance - flexibility tradeoff
10:00:51 Hory: and if I were to just publish it, how could I do it?
10:01:18 kwijibo: Hory: either write some alternate templates for your views
10:01:25 kwijibo: or embed RDFa into the HTML view
10:02:28 kwijibo: I wouldn't say that using an RDF store (like ARC) is necessarily a performance problem though
10:02:49 kwijibo: it depends on the quantity and structure of your data
10:03:36 kwijibo: for many uses, the difference in performance will be pretty negligible
10:03:43 Hory: it's sort of like a book database
10:04:01 Hory: each book is linked to authors, genres, publications, series..
10:04:36 Hory: you can have 50 triples just for one book..
10:05:03 kwijibo: Hory: sounds like you would have a lot of joins anyway
10:05:24 Hory: yes but because the tables would be smaller it would be a big difference
10:05:33 ^Jenny^: that's exactly what SPARQL is meant to do, I think :)
10:06:13 Hory: if I keep the regular mysql structure then it's not possible to implement SPARQL on it? only "static" rdf exports for each book, for example?
10:06:39 ^Jenny^: why is there such a large difference? triples consist of subject, predicate and object and I guess subject always is the book that is described. the predicate would be your fieldname in a "normal" mysql database and the object the set it is linked to
10:08:11 ^Jenny^: I guess it's hard work to implement sparql on it and other people already have done this work for you if you use arc :)
10:08:24 kwijibo: Hory: there are SQL-to-SPARQL mappers that can help you do this; google for D2RQ
10:08:33 Hory: well for example if previously I wanted to join a book's genres to the book, I'd join an average-sized table (books) on a small table (genres)
10:09:48 Hory: with a triple store I have to join tables which contain all resources in them..
10:09:51 ^Jenny^: and now it's only one triple - "http://books.com/mybook" "http://books.com/has_genre" "scifi"
10:09:52 Hory: oh, so that's how they're called? :)
10:10:03 Hory: mappers
10:11:07 kwijibo: Hory: ^Jenny^ is kind of right: because the data is modelled differently, in instances like this there may be fewer joins
10:11:14 Hory: yes jenny but what if I want to get the author of each sci fi book?
10:11:18 kwijibo: DESCRIBE queries tend to be pretty performant
10:11:40 kwijibo: if you are doing a simple DESCRIBE <{$bookURI}>
10:12:21 kwijibo: then all that has to do is something like SELECT * FROM triples WHERE triples.s = "{$bookURI}"
10:13:29 Hory: this will give me a huge list of books..
10:13:46 kwijibo: no
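To make kwijibo's point concrete, here is a minimal sketch of a DESCRIBE-style lookup as one filtered SELECT. It assumes a hypothetical single table triples(s, p, o) in an in-memory SQLite database; this is not ARC's actual store layout, and a real DESCRIBE may return more than the direct triples. The describe() helper and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: a simplified one-table triple layout, not ARC's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("http://books.com/mybook", "http://books.com/has_genre", "scifi"),
        ("http://books.com/mybook", "http://books.com/has_author", "peter"),
        ("http://books.com/otherbook", "http://books.com/has_genre", "crime"),
    ],
)


def describe(conn, book_uri):
    """Hypothetical helper: return every (p, o) pair whose subject is book_uri."""
    rows = conn.execute(
        "SELECT p, o FROM triples WHERE s = ? ORDER BY p", (book_uri,)
    )
    return rows.fetchall()


# Only the two triples about mybook come back, not the whole table.
result = describe(conn, "http://books.com/mybook")
```

The WHERE clause on the subject column is what keeps the result small: the rows about otherbook are never returned.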
10:13:48 ^Jenny^: if authors are linked by "http://books.com/mybook" "http://books.com/has_author" "peter" and you want all scifi authors the query would be: SELECT ?x WHERE ?z "http://books.com/has_genre" "scifi" AND ?z "http://books.com/has_author" ?x
10:13:51 Hory: or rather, all of the triples
10:14:20 ^Jenny^: just written in proper SPARQL syntax, because AND is not that beautiful ;)
10:14:26 ^Jenny^: it's only here to aid understanding
10:14:28 Hory: Jenny, yes, but could you tell me how many mysql queries are involved to resolve that sparql query?
10:14:51 ^Jenny^: not exactly, no
10:14:54 Hory: that's what I'm trying to compare
10:15:03 Hory: with plain Mysql it would be one query..
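As a rough illustration of the comparison Hory is after: on a hypothetical single triples(s, p, o) table, the two-pattern "authors of sci-fi books" query can also be written as one SQL statement, a self-join of the table on the shared subject. This is a sketch under that assumption only; it does not show the SQL that ARC actually generates, and the table layout and sample data are invented.

```python
import sqlite3

# Assumption: a simplified one-table triple layout, not ARC's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("http://books.com/mybook", "http://books.com/has_genre", "scifi"),
        ("http://books.com/mybook", "http://books.com/has_author", "peter"),
        ("http://books.com/dune", "http://books.com/has_genre", "scifi"),
        ("http://books.com/dune", "http://books.com/has_author", "frank"),
        ("http://books.com/whodunit", "http://books.com/has_genre", "crime"),
        ("http://books.com/whodunit", "http://books.com/has_author", "agatha"),
    ],
)

# One triple pattern per table alias; the shared variable ?z becomes
# the join condition a.s = g.s.
SCIFI_AUTHORS_SQL = """
SELECT DISTINCT a.o AS author
FROM triples AS g
JOIN triples AS a ON a.s = g.s
WHERE g.p = 'http://books.com/has_genre'
  AND g.o = 'scifi'
  AND a.p = 'http://books.com/has_author'
ORDER BY author
"""

authors = [row[0] for row in conn.execute(SCIFI_AUTHORS_SQL)]
```

So the number of SQL queries can stay at one; the cost moves into joining a large table with itself, which is why indexes on the s, p and o columns matter far more here than in a conventional schema.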
10:15:09 ^Jenny^: but I guess the people who implemented this did a lot of performance-work
10:15:16 kwijibo: Hory: well, I would start with modelling your data as RDF first
10:15:32 kwijibo: then you can import it into a store and see what the performance is like
10:15:57 Hory: thanks
10:16:09 Hory: I thought maybe someone knew the underworkings of ARC :)
10:16:50 kwijibo: i do roughly, but bengee is the developer, he could tell you more precisely
10:17:05 kwijibo: I suspect performance would be really fine
10:17:19 kwijibo: not as great as if you optimised mySQL
10:17:41 Hory: that's normal..
10:17:43 kwijibo: but less work to write the queries - as ^Jenny^ says, SPARQL is a lot simpler - it abstracts over the joins
10:18:05 Hory: that's true too, I just don't want a dedicated server for it :)
10:18:13 ^Jenny^: Hory: the structure of the ARC store normally makes the SPARQL queries a lot more efficient
10:18:49 kwijibo: and I suspect that the performance of ARC's SPARQL is still better than Object->relational mappers for instance
10:19:11 kwijibo: like Rails or CakePHP use
10:19:31 Hory: I've installed OntoWiki recently
10:19:44 Hory: it's nice, except that it has about 30MB worth of source code
10:20:16 Hory: it's scary to me, since I've never written an app that uses more than 1MB of code :)
10:20:18 ^Jenny^: its not that much "onto" :/
10:20:53 kwijibo: Hory: probably because it is built on top of other code
10:21:07 Hory: yeah, zend framework and a lot of other stuff
10:21:25 Hory: but it takes a couple of seconds for a page to load on my computer
10:21:29 Hory: and I'm the only user :)
10:21:42 ^Jenny^: do you use xampp?
10:21:52 Hory: yes
10:22:16 ^Jenny^: it has poor performance on windows in the standard-configuration
10:22:21 kwijibo: Hory: you can get ARC to tell you what the underlying SQL query of a SPARQL query is as well
10:22:26 ^Jenny^: we recently had this problem, too
10:22:56 Hory: oh, nice..
10:41:15 kwijibo: Hory: if you haven't seen it - http://purl.org/ontology/bibo/ is the best ontology for books
10:41:44 Hory: wow, thanks
10:42:19 Hory: I tried implementing one in OWL and ran into the problem of not being able to link object properties to classes
10:42:47 kwijibo: ?
10:43:56 Hory: sorry, brb
10:47:17 Hory: well, basically, according to this 100 page tutorial most of the "entities" had to be classes
10:47:31 Hory: so if "sci-fi" was a class, I couldn't make a hasGenre property
10:47:46 Hory: because in OWL you can't link instances to classes as far as I understand
10:52:40 kwijibo: i don't understand
10:53:30 Anchakor: I wonder if you could do: :hasGenre rdfs:subPropertyOf rdf:type .
10:54:45 Anchakor: but anyway I would do it that "sci-fi" was an instance of class Genre and hasGenre would be normal property
10:55:34 kwijibo: yeah
10:56:12 Hory: yes, that's the solution I think
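The modelling Anchakor suggests can be sketched with plain (subject, predicate, object) tuples: "sci-fi" is an individual of a Genre class, and hasGenre is an ordinary object property from Book to Genre. The ex: names below are hypothetical and the helper functions are illustrative only, not part of any library.

```python
# Hypothetical vocabulary: every ex: name here is an assumption.
RDF_TYPE = "rdf:type"

triples = {
    ("ex:Genre", RDF_TYPE, "owl:Class"),
    ("ex:Book", RDF_TYPE, "owl:Class"),
    ("ex:hasGenre", RDF_TYPE, "owl:ObjectProperty"),
    ("ex:hasGenre", "rdfs:domain", "ex:Book"),
    ("ex:hasGenre", "rdfs:range", "ex:Genre"),
    # "sci-fi" is an instance of Genre, not a class of its own.
    ("ex:sci-fi", RDF_TYPE, "ex:Genre"),
    ("ex:mybook", RDF_TYPE, "ex:Book"),
    # Instance-to-instance link, which OWL object properties allow.
    ("ex:mybook", "ex:hasGenre", "ex:sci-fi"),
}


def instances_of(cls):
    """Return the sorted subjects that are typed as cls."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == cls)


def books_in_genre(genre):
    """Return the sorted subjects linked to genre via ex:hasGenre."""
    return sorted(s for s, p, o in triples if p == "ex:hasGenre" and o == genre)
```

With this shape, adding a new genre means adding one more individual rather than changing the class hierarchy.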
10:56:16 kwijibo: though the noun pattern is now generally preferred for property names
10:56:32 kwijibo: so, "genre" rather than "hasGenre"
10:56:44 kwijibo: but that's not really important
10:56:47 Anchakor: I prefer verbs :)
10:57:01 Hory: I saw that dbpedia uses "genre" too
10:57:32 kwijibo: i think sometimes verbs seem the most sensible - i can't think of a noun equivalent for :likes or :uses
10:57:58 kwijibo: but I don't like hasSomething or isSomethingOf patterns
10:58:31 Hory: those are better for showing which is a child of which
10:59:05 kwijibo: not really
10:59:37 kwijibo: everyone understands the <x> dc:title "Great Expectations"
11:00:06 kwijibo: so why not <x> dc:creator <y> ?
11:00:43 Anchakor: IMHO they are just identifiers - sooner or later annotation properties would be more important
11:01:24 ^Jenny^: many people just prefer predicates for "predicates" :)
11:02:00 Anchakor: noun identifiers have the problem that you don't know if they are shorthand for "isXXXof" or "hasXXX"
11:03:25 Anchakor: so for example <x> dc:creator <y> - does it mean <x> isCreatorOf <y>, or <x> hasCreator <y>?
11:05:20 ^Jenny^: there's a problem with inverses, too
11:05:28 Hory: dbpedia automatically appends "of" to an inverse property
11:05:40 Hory: but doesn't really list it as a distinct property
11:06:18 Hory: just links to the "has" property
11:06:23 ^Jenny^: some don't know how to build inverses and don't see that isCreatorOf is/can be an inverse to hasCreator
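A small sketch of the inverse relationship under discussion: given a declared pair of inverse properties, every hasCreator triple implies an isCreatorOf triple with subject and object swapped. In OWL such a pair would be declared with owl:inverseOf; the ex: property names, the INVERSES table and the with_inverses() helper are assumptions made for illustration.

```python
# Hypothetical inverse declarations (what owl:inverseOf would express).
INVERSES = {"ex:hasCreator": "ex:isCreatorOf"}


def with_inverses(triples):
    """Return the input triples plus the implied inverse triples."""
    expanded = set(triples)
    for s, p, o in triples:
        inverse = INVERSES.get(p)
        if inverse is not None:
            # Swap subject and object for the inverse direction.
            expanded.add((o, inverse, s))
    return expanded


expanded = with_inverses({("ex:book", "ex:hasCreator", "ex:dickens")})
```

This also shows why the direction of a bare noun like dc:creator has to be settled by its definition: only one of the two readings matches the stored triple.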