This is a public chat log generated from the #semsol IRC channel.
08:12:58
Hi! Feature suggestion: a simple convenience method similar to ExecuteScalar in ADO.NET that returns a single value from the store. If named "queryScalar", it would directly return the object value of the first item in the result set.
09:23:47
peterkz, you can already do query('SELECT ...', 'row') to get the first result
09:25:01
e.g. $row = $store->query('SELECT ?myvar WHERE ...', 'row'); $val = $row['myvar'];
09:25:29
it's not the value directly, but close
09:26:08
Great! That is compact enough. I didn't catch that in the docs. On the other hand it is fairly trivial to do.
09:26:28
not sure if it's documented at all..
09:26:55
you can use 'rows', 'row', 'sql', or 'raw' I think
09:27:28
raw is handy for ASK queries as it directly gives you true|false
09:28:26
ah, it is documented on /docs/v2/store
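A minimal sketch of the convenience method requested above, built only on the query($q, 'row') call shown in the chat. The name queryScalar comes from the suggestion itself and is not part of ARC; FakeStore is a hypothetical stand-in so the snippet runs without a database, and it assumes the 'row' format yields an associative array keyed by variable name.

```php
<?php
// Hypothetical helper, not part of ARC: returns one variable's value from
// the first result row, or null when there is no such value.
function queryScalar($store, string $query, string $var)
{
    // 'row' asks the store for the first result row only.
    $row = $store->query($query, 'row');
    return (is_array($row) && array_key_exists($var, $row)) ? $row[$var] : null;
}

// Stand-in for a real store so the sketch is self-contained (assumption:
// a real store returns an array keyed by variable name for 'row').
class FakeStore
{
    public function query($q, $resultFormat = '')
    {
        return $resultFormat === 'row' ? ['name' => 'Alice'] : [];
    }
}

$name    = queryScalar(new FakeStore(), 'SELECT ?name WHERE { ?s ?p ?name }', 'name');
$missing = queryScalar(new FakeStore(), 'SELECT ?name WHERE { ?s ?p ?name }', 'age');
```

With a real store, the FakeStore instance would be replaced by the store object used in the chat example.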
12:44:37
Hi. If I LOAD a graph with more than 5000 triples I get multiple graphs with the same URI. A subsequent DELETE will only remove the first 5000 triples.
12:48:21
If I try to load an rdfxml file that has been through cwm, ARC2_getFormat categorizes it as xml. It seems to be the leading xml comments added by cwm that confuse it. As a fix I've had to add a line after the /* markup checks */
12:49:12
hi jesperll
12:49:25
hmm, what do you mean by "multiple graphs with the same uri"?
12:49:51
Oh, I forgot. You should start with something positive before complaining :) ARC is a clever little thing!
12:49:57
heh
12:50:21
I'll add the comment removal to the format detector
12:50:26
LOAD <bigfile.rdf>
12:50:39
and then
12:50:43
SELECT ?graph COUNT(?graph) AS ?count WHERE { GRAPH ?graph { ?s ?p ?o } } GROUP BY ?graph ORDER BY ?graph
12:51:32
should be LOAD <file:bigfile.rdf>
12:52:10
the count stuff is still a bit odd, it doesn't always translate easily to sql
12:52:39
I looked in MySQL and there seem to be two different ids for bigfile.rdf
12:53:21
DELETE FROM <file:bigfile.rdf> will only remove the first 5000. Even when run multiple times
12:53:48
that clearly shouldn't be
12:54:24
There is some batching in ARC2_StoreLoadQueryHandler that uses 5000 as limit
12:54:37
yeah
12:55:25
if it creates a new id every 5000 triples, that'd explain the bug
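The hypothesis above can be illustrated with a toy model. This is not ARC2_StoreLoadQueryHandler's code: ToyIdTable and loadInBatches are hypothetical names, and the only details taken from the chat are the batch size of 5000 and the idea that a fresh id per batch would give one graph URI several ids.

```php
<?php
// Illustrative only: a toy term-to-id table, not ARC's schema or code.
class ToyIdTable
{
    private $ids = [];   // term => id
    private $next = 1;

    // Buggy variant: always allocates a new id, never checks for an old one.
    public function allocate(string $term): int
    {
        return $this->next++;
    }

    // Safer variant: reuse the id if the term was seen before.
    public function lookupOrAllocate(string $term): int
    {
        if (!isset($this->ids[$term])) {
            $this->ids[$term] = $this->next++;
        }
        return $this->ids[$term];
    }
}

// Simulate loading $tripleCount triples in batches and return the distinct
// graph ids that end up attached to them.
function loadInBatches(ToyIdTable $table, string $graph, int $tripleCount, int $batchSize, bool $reuse): array
{
    $graphIds = [];
    for ($offset = 0; $offset < $tripleCount; $offset += $batchSize) {
        // The graph id is resolved once at the start of every batch.
        $id = $reuse ? $table->lookupOrAllocate($graph) : $table->allocate($graph);
        $graphIds[$id] = true;
    }
    return array_keys($graphIds);
}

// 12000 triples at 5000 per batch means three batches.
$buggy = loadInBatches(new ToyIdTable(), 'file:bigfile.rdf', 12000, 5000, false);
$fixed = loadInBatches(new ToyIdTable(), 'file:bigfile.rdf', 12000, 5000, true);
```

In the buggy run the same URI ends up under three ids, so a DELETE that resolves only one of them would leave the other batches behind, which matches the reported symptom.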
12:55:34
bengee checks code
12:55:42
that's probably it then
12:56:24
I've tested arc with large files, but possibly not the graphs
13:03:14
I'll debug it in a minute
13:05:31
Feature request: Would it be possible to ORDER BY an aggregate value such as COUNT?
13:05:38
SELECT ?s COUNT(?s) AS ?count WHERE { ?s ?p ?o } GROUP BY ?s ORDER BY ?count
13:05:49
hmm
13:08:01
that doesn't sound too hard for simple cases like aggregate vars
13:10:47
the grammar allows ORDER BY ?var IIRC, so I wouldn't have to change the parser
13:11:35
bengee wonders which SQL is currently produced from an aggregate var in ORDER BY
13:11:52
string(293) "SELECT G_0_0_0_0_0.g AS `graph`, COUNT(G_0_0_0_0_0.g) AS `count` FROM (test_triple T_0_0_0_0_0) JOIN test_g2t G_0_0_0_0_0 ON ( (T_0_0_0_0_0.t = G_0_0_0_0_0.t) )JOIN test_id2val V_0_0_0_0_0_g ON ( (G_0_0_0_0_0.g = V_0_0_0_0_0_g.id) ) GROUP BY G_0_0_0_0_0.g ORDER BY (V_0_0_0_0_0_g.val) "
13:12:32
ah, it ignores the aggregate flag
13:12:35
jesperll had that on the clipboard
13:12:47
heh, yeah, that was quick
13:13:14
if the sql rewriter gets so far, it might be almost trivial to add
13:30:01
ok, bug reproduced
13:36:39
d'oh
13:40:35
that's an embarrassing one, thanks a lot, jesperll
14:20:21
ok, got that one fixed
14:20:29
(I hope ;)
14:20:54
will have a look at "aggregate var in ORDER BY" now
14:25:43
Hi (again)! Is there an easy way to remove duplicate triples from the ARC store?
14:26:06
it should do that automatically
14:26:31
Ah, at insert time?
14:28:42
if it doesn't, that might be related to the bug jesperll found
14:28:50
the id cache was borken
14:30:36
ah, no, I don't have a primary key set on the triples table
14:32:09
dupes may still happen, I haven't fully tested the practicability of those dupe tests yet
14:35:07
there is a way to keep dupes from a single doc/graph from being added to the store, though
14:37:56
when you add "skip_dupes" => 1 to your config, the parsers will ignore duplicates at parse/insert time
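A sketch of where that option would sit, assuming the usual ARC2 configuration array. Only the "skip_dupes" => 1 entry comes from the chat; the database values are placeholders, and the other keys are the common ARC2 settings as far as I recall them, so check them against your own setup.

```php
<?php
// Placeholder values throughout; only 'skip_dupes' is taken from the chat.
$config = [
    'db_host'    => 'localhost',  // placeholder
    'db_name'    => 'my_db',      // placeholder
    'db_user'    => 'user',       // placeholder
    'db_pwd'     => 'secret',     // placeholder
    'store_name' => 'test',       // table prefix, e.g. test_triple
    'skip_dupes' => 1,            // ignore duplicates at parse/insert time
];
```

As noted in the chat, this only covers duplicates within a single document or graph at load time, not triples already in the store.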
14:39:56
once they are in the store, you'd need some CONSTRUCT + DELETE + INSERT juggling
14:43:08
Ok. Thank you again. Sorry to bother you so much...:-)
14:43:26
nah, that's what this channel is for
14:59:22
bengee adds support for "ORDER BY aggregate var" and runs DAWG test suite
15:07:06
'k, looks good. I'll bundle a new rev
15:13:56
excellent
15:26:00
ok, it's online now (rev 2008-01-19)
15:26:58
I hope the bug is fixed, the LOAD handler is still a bit in flux
15:40:34
oh, new bug, gotta replace the zip
15:48:10
ok, relaced
15:48:17
replaced even
22:07:50
darn, no bengee!
22:08:46
for the logs - my endpoint at http://sandbox.foaf-project.org/2008/foaf/ggg.php is suddenly just returning a blank page
22:08:50
nothing obvious in the logs
22:08:54
ho hum
