Channel #semsol: Logs

This is a public chat log generated from the #semsol IRC channel.

22:10:03 tuukkah: works great, thanks kwijibo!
22:10:18 kwijibo: np
22:10:59 tuukkah: one thing i'm wondering about the quads is how the graph uri is determined in case of redirects etc.
22:11:21 kwijibo: hmm
22:11:47 tuukkah: i'd like to delete old triples before loading new ones
22:12:47 kwijibo: from the same graph you mean?
22:12:52 tuukkah: yeah
22:13:06 kwijibo: ie, you only keep the current version of one document's triples
22:13:09 kwijibo: ?
22:13:10 tuukkah: maybe LOAD <$g> INTO <$g>
22:13:13 tuukkah: yeah
22:13:36 tuukkah: and before that, DELETE FROM <$g>
22:13:48 kwijibo: I think you would have to delete first yeah
22:15:33 kwijibo: tuukkah: I'm not 100%, but I think it will use the uri you give it, even if it redirects
22:15:45 kwijibo: I think the graph is determined directly from your query
22:16:00 kwijibo: rather than somewhere down the line
22:16:01 tuukkah: ok, then the INTO is superfluous
22:21:05 tuukkah: that has its problems too. non-canonical forms of the uri will stay in the store
22:21:36 tuukkah: i mean, if you update using a somewhat different uri for the same document
22:21:59 tuukkah: not the most important use case though
22:22:56 kwijibo: I suppose you would have to canonicalise it before you add it to the query
22:23:53 tuukkah: indeed
22:27:18 kwijibo: i dunno if you could actually do much canonicalisation
22:27:32 kwijibo: beyond casing the domain name maybe?
22:29:11 tuukkah: well, you could also check where the redirects point and if there's a content-location
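
The delete-then-load idea and the "casing the domain name" suggestion above can be sketched in a few lines of PHP. This is only an illustration under stated assumptions: `canonicalise_uri()` and `refresh_queries()` are hypothetical helper names, the sketch does not follow redirects or read Content-Location, and whether the store really uses the given URI as the graph name is something kwijibo was not sure of.

```php
<?php
// Hypothetical helper: lower-case the scheme and host of a URI. This is the
// only canonicalisation from the discussion that needs no network access.
function canonicalise_uri(string $uri): string
{
    $parts = parse_url($uri);
    // Leave anything we cannot parse, or that carries userinfo, untouched.
    if ($parts === false || !isset($parts['scheme'], $parts['host']) || isset($parts['user'])) {
        return $uri;
    }
    $out = strtolower($parts['scheme']) . '://' . strtolower($parts['host']);
    if (isset($parts['port'])) {
        $out .= ':' . $parts['port'];
    }
    $out .= $parts['path'] ?? '';
    if (isset($parts['query'])) {
        $out .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $out .= '#' . $parts['fragment'];
    }
    return $out;
}

// Hypothetical helper: build the two statements from the discussion,
// DELETE first and LOAD second, both using the canonical form of the URI.
function refresh_queries(string $uri): array
{
    $g = canonicalise_uri($uri);
    return ["DELETE FROM <$g>", "LOAD <$g>"];
}
```

The two strings returned by `refresh_queries()` would then be sent to the store one after the other, so that only the current version of a document's triples stays in its graph.
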
04:32:35 mattpardee: hi everyone, I'm getting acquainted with ARC and I'm using a MySQL database as an RDF store. I'm wondering if ARC is set up to handle multiple users accessing respective RDF data
04:33:11 mattpardee: so user A has loaded http://whatever.com/data.rdf and user B has started a session that's loaded http://blahblah.com/data2.rdf
04:33:33 mattpardee: as the data is stored in the database, is there a way to separate it on a per-user basis?
04:35:01 mattpardee: ultimately I'm attempting to create an address book interface for FOAF files
04:35:15 mattpardee: perhaps it's not the best approach programmatically to store individual users' data in a database
06:55:36 kwijibo: morning bengee, morning KiYanWang
06:55:48 KiYanWang: morning kwijibo, bengee
06:56:42 bengee: morning all :)
06:57:37 kwijibo: bengee: tuukkah found a small bug with (I think) the rdf/xml parser yesterday
06:58:14 bengee: 'k, I'll check the logs, thx
06:59:15 kwijibo: the s_type doesn't get set in this kind of serialisation: <f:knows rdf:type="#foo" rdf:resource="#bar"/>
07:01:50 kwijibo: what you up to lately KiYanWang ?
07:02:20 KiYanWang: designing some ontologies
07:02:29 KiYanWang: in fact publishing on vocab.org today
07:02:38 KiYanWang: then doing some work on Zephyr
07:02:39 kwijibo: cool
07:03:00 KiYanWang: kwijibo: iand spoke to me yesterday, says he may not go to eswc but i can go in his place
07:03:08 KiYanWang: kwijibo: need to figure out the logistics
07:03:10 KiYanWang: etc.
07:03:21 kwijibo: logistics of work, or travel?
07:03:55 KiYanWang: travel :p
07:04:05 KiYanWang: KiYanWang doesn't believe in logistics of work :p
07:04:27 KiYanWang: surely logistics and work are an oxymoron
07:05:37 kwijibo: kwijibo looks up logistics on wikipedia
07:06:51 kwijibo: KiYanWang: I don't have any accommodation or flights or anything booked yet either
07:06:55 KiYanWang: kwijibo: NOOOO use powerset.com instead
07:07:09 KiYanWang: kwijibo: yeah i know we need to get organised
07:07:26 KiYanWang: it doesnt help that the conference starts the day after our company one
07:07:48 KiYanWang: KiYanWang might end up ditching eswc .... refuse to travel on my own after what happened last time
07:07:54 kwijibo: shit - is it as close as that?
07:08:10 kwijibo: well, I'll be flying out from brum i reckon
07:08:32 kwijibo: although you don't *have* to take the same flight if you don't want to
07:08:47 kwijibo: tom will fly from brum too i imagine?
07:16:37 KiYanWang: kwijibo: ok ... tom is in the office today so ill try to organise some stuff with him
07:17:20 kwijibo: I'm just thinking it's gonna be well hot there
07:17:49 kwijibo: it's supposed to be like an african climate there or something
07:23:18 KiYanWang: kwijibo: take some sun block ... ( i dont need any:p )
07:23:49 kwijibo: what *none* ? *ever* ?
07:24:14 KiYanWang: kwijibo: it would have to be well into the upper 30's before i needed any
07:24:32 KiYanWang: kwijibo: i love having a natural tan :p plus i did live in South Africa for a while
07:24:37 KiYanWang: so im used to it
07:24:38 kwijibo: what about UV protection?
07:24:50 KiYanWang: kwijibo: me never had a problem
07:24:52 kwijibo: yeah, I imagined you would be used to the african climes bit :p
07:24:56 KiYanWang: :p
07:25:07 KiYanWang: i saw your tweet about climbing while we're out there
07:25:26 KiYanWang: i haven't done it for years but i'm willing to go with you and try
07:25:42 kwijibo: don't you get grumpy and sleepy in direct sunlight either :p?
07:27:53 KiYanWang: kwijibo: nope ... as long as there are bikini clad chicks around ... speaking of which make sure you pack that Mankini of urs
07:29:17 kwijibo: lol
07:29:24 kwijibo: kwijibo powersets mankini
07:30:13 KiYanWang: kwijibo: nooooooooooooooooooo you'll go blind
07:30:30 KiYanWang: http://www.google.co.uk/search?hl=en&q=mankini&btnG=Google+Search&meta=
07:31:23 kwijibo: kwijibo learning more than he really wanted to about bikinis
07:42:22 tuukkah: i'm getting 22 results for a query that ends limit 20. i wonder how that can be
07:43:09 kwijibo: what kind of query is it?
07:43:35 tuukkah: select distinct ... order by desc (?date) limit 20
07:43:58 kwijibo: dunno then :D
07:44:06 kwijibo: bengee?
07:46:56 tuukkah: btw, you might know what's the best way to write something like this in php: list($post, $p, $date, $d, $content, $c, $maker, $m, $name, $n, $depiction, $d) = array_values($result_row);
07:47:34 tuukkah: that is, i want each sparql select variable to be assigned to a php variable
07:48:09 kwijibo: ah
07:48:11 tuukkah: (this is what we do now, but it bugs out if a literal has a language code!)
07:48:20 kwijibo: extract($row)
07:50:05 kwijibo: tuukkah: what's wrong with the language code?
07:50:19 kwijibo: i mean, why does it cause a problem?
07:51:07 tuukkah: the result_row will have more elements and the variables will be assigned incorrectly. someone was wondering last night why their depiction is fr-FR =)
07:52:00 tuukkah: extract is what we want, i think
07:53:15 kwijibo: aren't the values in array values arrays as well?
07:53:59 tuukkah: nope
07:54:33 tuukkah: there's keys like "name", "name type", "name lang"
07:54:57 tuukkah: extract($r); is shorter to write anyway =)
07:54:59 kwijibo: ahh
07:55:06 kwijibo: hmm
07:55:20 kwijibo: i wonder if that will work if there is a space in the keys
07:55:29 tuukkah: oops :-)
07:55:53 tuukkah: "If the prefixed result is not a valid variable name, it is not imported into the symbol table."
07:56:39 kwijibo: you could run through the array and replace the spaces with underscores
07:57:01 tuukkah: indeed
07:58:46 kwijibo: foreach($row as $k => $v) { $kn = str_replace(' ','_', $k); $$kn = $v; }
07:59:27 kwijibo: woulda been good if extract just let you set a flag to do that
07:59:47 tuukkah: btw someone comments: "What they should say is that if _any_ of the results have invalid names, _none_ of the variables get extracted."
08:00:37 tuukkah: and for a bit cleaner result, use extract in the PREFIX mode
08:02:25 tuukkah: hmm, i see your clean-up code does the extract part too
08:03:07 kwijibo: yeah - might as well really
08:05:18 tuukkah: how would you change that to add a prefix to each var defined?
08:05:52 kwijibo: foreach($row as $k => $v) { $kn = 'prefix_'.str_replace(' ','_', $k); $$kn = $v; }
08:06:39 tuukkah: heh, silly me :-) thanks!
08:07:14 kwijibo: np ;)
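
The key clean-up discussed above can also be combined with `extract()` in its prefix mode, as tuukkah suggests. A minimal sketch, assuming result rows keyed like "name", "name type" and "name lang"; the `normalise_keys()` helper, the `r` prefix and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper: rename keys such as "name lang" to "name_lang" so that
// every key is a valid PHP variable name.
function normalise_keys(array $row): array
{
    $clean = [];
    foreach ($row as $key => $value) {
        $clean[str_replace(' ', '_', $key)] = $value;
    }
    return $clean;
}

// Sample row shaped like the keys mentioned in the discussion (values invented).
$row = ['name' => 'Alice', 'name type' => 'literal', 'name lang' => 'fr-FR'];

// EXTR_PREFIX_ALL with prefix "r" creates $r_name, $r_name_type and $r_name_lang.
extract(normalise_keys($row), EXTR_PREFIX_ALL, 'r');
```

Because every variable is looked up by name rather than by position, an extra "lang" key no longer shifts the other values the way the `list(...) = array_values(...)` version did.
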
08:35:09 bengee: tuukkah, ARC should parse rdf:type + rdf:resource
08:35:37 kwijibo: bengee - that was just my guess where it was tripping up really
08:36:07 bengee: bengee tries to reproduce the bug
08:36:13 tuukkah: bengee, did you see the uri to the error message? maybe you can make more sense of it
08:36:39 bengee: it says "Done" when I deref it, looks like Cloud tweaked his foaf file
08:37:10 kwijibo: i found another bug btw: http://rdfweb.org/foaf/corp/data/_corpdata.rdf results in: "missing stream in "readStream" via ARC2_Reader via ARC2_StoreLoadQueryHandler"
08:38:39 tuukkah: bengee, oh right, he just fixed it
08:39:24 bengee: but even when I re-add the rdf:type=skype, I get correct triples
08:39:44 tuukkah: yet another bug? loading http://iki.fi/asko.soukka/foaf.rdf results in uris like http://users.jyu.fi/~atsoukka/foaf.rdf#me because of the redirect
08:40:13 tuukkah: (whereas rapper returns uris like http://iki.fi/asko.soukka/foaf.rdf#me)
08:45:33 bengee: kwijibo, that foaf corp bug is interesting, will check things
08:46:59 bengee: tuukkah, the iki.fi/jyu.fi behaviour is correct in ARC, I'd say
08:47:18 kwijibo: bengee: re: the breslin bug, yes, the s p o triples are all correct, but the s_type is missing
08:47:29 tuukkah: bengee, may well be
08:47:31 bengee: not here
08:47:41 bengee: ah, s_type
08:47:43 bengee: sorry
08:48:01 tuukkah: well, it was clear to spot as the result was mysql syntax error
08:49:06 bengee: right, that's a bug
08:49:20 kwijibo: bengee: the foaf corp thing - seems to be loading at least some of the triples
08:49:41 kwijibo: not all of them though I think - there looks like more than 175 triples there
08:52:04 bengee: bengee fixes the s_type bug, that was a copy&paste error or some other stupidity
08:52:35 tuukkah: bengee, cheers!
08:52:47 bengee: thx for spotting it
08:59:20 bengee: kwijibo, there is a character error in the xml which makes the parser fail
09:00:26 bengee: it then tries again with an ISO-8859-1 header but somehow chokes then
09:00:58 bengee: gotta check that, at least the error message should be different
09:15:27 bengee: ok, fixed
09:15:31 tuukkah: here's a simple query where LIMIT 20 doesn't do what i'd expect: http://smob.sioc-project.org/server/sparql?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0Aselect+distinct+%3Fpost%0D%0Awhere+%7B%0D%0A++%3Fpost+foaf%3Amaker+%3Fmaker+.%0D%0A++%7B+%3Fmaker+foaf%3Adepiction+%3Fdepiction+%7D+union+%7B+%3Fmaker+foaf%3Aimg+%3Fdepiction+%7D%0D%0A%7D+LIMIT+20%0D%0A&output=plain&jsonp=&key=
09:15:55 bengee: heh, seems to be support day
09:16:25 tuukkah: every new user we get to smob seems to expose a new issue ;-)
09:24:36 kwijibo: it never rains but it pours
09:27:07 tuukkah: since i'm a php, sparql, mysql and arc newbie, i'm especially good at creating issues =)
09:28:53 bengee: it's tricky to translate sparql's order by to mysql union queries, but I guess I can at least get distinct to work
09:29:39 bengee: the LIMIT is currently only applied to each UNION branch
09:30:28 tuukkah: can't you apply it in the very end at the top level?
09:31:06 bengee: well, that's what it should do, but it's not that easy to tell the sql rewriter ;)
09:31:23 tuukkah: oh, i see :-)
09:31:44 bengee: but not that hard, either, so I'll do it now
09:32:19 tuukkah: these things are not urgent by the way, so if you prefer me to file an issue some other way, i can do that
09:32:46 bengee: order by is hard, and as LIMIT is/was rewritten together with ORDER, I obviously skipped it back then
09:33:32 bengee: it's ok, a broken LIMIT sucks
09:37:44 bengee: bengee discovers some commented-out sections, looks like I experimented with this already
09:39:29 bengee: ok, LIMIT is now always at the root level
09:39:46 bengee: bengee runs DAWG tests
09:56:58 bengee: bengee creates a dump from the smob endpoint via DESCRIBE ?s WHERE {?s ?p ?o}
09:57:36 bengee: ... and discovers another bug: lang'd literals are not properly XML-escaped
09:57:55 bengee: .. fixing
09:58:01 kwijibo: should they be?
09:58:19 bengee: & should be &amp; etc
09:58:41 kwijibo: only in rdf/xml though?
09:58:46 bengee: yes
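
The escaping bengee describes (& becoming &amp; in RDF/XML output) can be illustrated with PHP's built-in `htmlspecialchars()`. This is not ARC's actual fix, which is not shown in the log; `xml_escape_literal()` is a hypothetical helper name used only for this sketch.

```php
<?php
// Hypothetical helper: escape a literal value before writing it into RDF/XML.
// ENT_XML1 selects XML 1.0 entity rules; ENT_QUOTES also escapes both quote
// characters so the result is safe inside attribute values.
function xml_escape_literal(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

echo xml_escape_literal('Tom & Jerry <3'), "\n";
```

As kwijibo notes, this step belongs only in the RDF/XML serialiser; other serialisations have their own escaping rules.
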
10:00:01 kwijibo: see what we were talking about before re danja's blog's rdf/xml
10:00:52 kwijibo: how do we get it to reserialise again correctly?
10:05:25 bengee: ok, tuukkah, UNION + LIMIT is working correctly now, just tried it on a smob dump
10:06:48 tuukkah: wow, big thanks!
10:07:39 bengee: the next rev uses slightly changed internal structures, though, you might have some upgrade issues
10:08:02 tuukkah: what do you mean with internal? the sql schema?
10:08:15 bengee: the new rev is aligned with the Talis platform, instead of "lang" and "dt", it now uses "language" and "datatype" array keys
10:08:43 tuukkah: you mean in the responses to sparql select?
10:09:57 bengee: in many places, oh, "val", "dt" => "value", "datatype". "lang" is not changed
10:10:36 tuukkah: then i wouldn't describe the structures as internal, as they're in the public api =)
10:11:12 tuukkah: or does internal mean not exposed by the sparql endpoint?
10:11:13 bengee: arc has 2 internal structures (flat triple arrays, and resource indexes), they are adjusted
10:11:38 bengee: and to keep things consistent, I adjusted the other areas, too
10:12:15 bengee: see http://arc.semsol.org/community/arc-dev/archives/2008/01/PM-GA.20080131094910.4C65A.2.1D@semsol.com
10:16:00 tuukkah: maybe we're not affected since we currently only use SELECT - not resource indices, triples sets, or DESCRIBE and CONSTRUCT
10:17:38 bengee: the array key in SELECTs will change from "foo dt" to "foo datatype"
10:18:25 tuukkah: oh ok. but we don't use those either =)
10:18:33 bengee: good :)
10:18:49 tuukkah: i suppose you can say we're still very hacky
10:19:11 bengee: "evolving" ;)
10:19:25 tuukkah: not checking type or datatype ever
10:23:47 tuukkah: this reminds me of one of the worse hacks: if someone had more than one foaf:name or foaf:img, the posts would multiply by select distinct :-) can something like group by be used for that?
10:24:44 bengee: hm, worth a try
10:25:50 bengee: I'm often doing things in multiple steps, 1st a query to retrieve the resource IDs, then for each ID another query to get the details
10:29:28 tuukkah: or would this be enough: foreach ($rs['result']['rows'] as $row) $distinct_rows[$row['post']] = $row;
10:30:40 bengee: if you don't care for a specific name/img, then yes, I guess
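
tuukkah's one-liner for collapsing duplicate posts can be written out as a small function. A minimal sketch, assuming each row has a "post" key as in the line above; `distinct_by_post()` is a hypothetical name and the sample rows are invented.

```php
<?php
// Hypothetical helper: keep one row per post. Rows are indexed by their
// "post" value, so a later row with the same post overwrites an earlier one.
function distinct_by_post(array $rows): array
{
    $distinct = [];
    foreach ($rows as $row) {
        $distinct[$row['post']] = $row;
    }
    return array_values($distinct);
}

// Invented sample: the same post appears twice because its maker has two names.
$rows = [
    ['post' => 'http://example.org/post/1', 'name' => 'Alice'],
    ['post' => 'http://example.org/post/1', 'name' => 'Alice S.'],
    ['post' => 'http://example.org/post/2', 'name' => 'Bob'],
];
$unique = distinct_by_post($rows);
```

As bengee points out, this only fits when it does not matter which name or image survives: here the last row seen for each post wins.
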
10:34:37 bengee: kwijibo, danja's rdf/xml looks ok here, entity-wise
10:34:51 bengee: this is the fixed rdf/xml serialiser, though