This is a public chat log generated from the #semsol IRC channel.
03:53:57
hello!
10:05:12
bengee switches from "CREATE tmp SELECT" to "CREATE tmp" + "INSERT INTO tmp SELECT"
10:05:53
hmmm
10:05:55
getting rid of the apparently problematic "ALTER tmp ADD _pos_" for queries that contain ORDER BY
10:06:52
with a separate CREATE, I can add the sort column before the table is filled
10:07:53
and mysql will hopefully stop hanging
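A rough sketch of the two-step approach described above: create the temp table with its sort column first, then fill it with INSERT INTO ... SELECT. It uses Python with SQLite as a stand-in for MySQL, and the table name `triples`, its columns, and the sample rows are hypothetical, not ARC's actual schema. It also assumes rows are inserted in ORDER BY order, which is how SQLite behaves in practice.

```python
import sqlite3

def build_sorted_tmp(conn):
    """Create the temp table with its sort column up front, then fill it."""
    cur = conn.cursor()
    # Step 1: a separate CREATE, so the position column exists before any rows
    # do and no later "ALTER ... ADD _pos_" is needed.
    cur.execute(
        "CREATE TEMPORARY TABLE tmp (_pos_ INTEGER PRIMARY KEY, s TEXT, o TEXT)"
    )
    # Step 2: fill the table; _pos_ is assigned as each sorted row arrives.
    cur.execute("INSERT INTO tmp (s, o) SELECT s, o FROM triples ORDER BY o")
    return cur.execute("SELECT _pos_, s, o FROM tmp ORDER BY _pos_").fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, o TEXT)")  # hypothetical source table
conn.executemany(
    "INSERT INTO triples VALUES (?, ?)",
    [("a", "zebra"), ("b", "apple"), ("c", "mango")],
)
rows = build_sorted_tmp(conn)
print(rows)
```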
10:08:22
weird
10:08:58
with Jena, variable predicates make queries slower
10:09:12
with arc i'm getting the opposite!
10:09:18
heh
10:09:19
can't be right
10:12:35
DESCRIBE ?s WHERE { ?s ?p ?o . FILTER(regex(?o, "machine")) } -> 5.82785010338 seconds
10:13:06
DESCRIBE ?s WHERE { ?s rdfs:label ?o . FILTER(regex(?o, "machine")) } --> emm, still not finished!
10:13:51
still not finished ...
10:14:00
last time it was 111 seconds
10:17:17
this time 364 seconds
10:18:46
bengee - any idea why that is?
10:22:03
might make sense to do a query(..., 'sql') and run an EXPLAIN against that
10:22:12
most probably index-related
10:24:26
ah
10:28:44
bengee: is there any docs about the indexing?
10:28:51
*are there
10:29:21
no
10:29:51
did you find out anything interesting?
10:38:58
not really familiar with the output of explain :(
10:41:02
the rdfs:label query has os,po in the possible keys column, whereas the ?p query has null
10:41:59
sorry, that should have been cid instead of os,po
10:44:54
hmm, no index on p, that's odd
10:45:17
ah, a variable p, ok
10:46:33
ok, then it perhaps does a table scan for ?p, but a (for some reason slow) index lookup for a given p
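To illustrate the difference between a table scan and an index lookup in a query plan, here is a small sketch in Python. It uses SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN, and the `triples` table and `idx_p` index are hypothetical, not ARC's actual schema. It only shows how to read a plan; it does not explain why the indexed lookup was slow in this case.

```python
import sqlite3

def plan(conn, sql):
    """Return the 'detail' strings from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")  # hypothetical
conn.execute("CREATE INDEX idx_p ON triples (p)")              # hypothetical

# Fixed predicate: the planner can look rows up through the index on p.
fixed_p = plan(
    conn, "SELECT s FROM triples WHERE p = 'rdfs:label' AND o LIKE '%machine%'"
)
# Variable predicate: nothing constrains p, so every row has to be read.
variable_p = plan(conn, "SELECT s FROM triples WHERE o LIKE '%machine%'")

print(fixed_p)
print(variable_p)
```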
10:47:25
did you try $store->optimizeTables()
10:47:45
maybe the index is fragmented
11:00:17
bengee: what's sparqlxmlresultsloader?
11:00:55
I don't think that exists already
11:01:19
mortenf tweeted about sending it to you :)
11:02:13
it'd be a streaming rdf loader from a predefined sparql xml result (e.g. g = graph, s = subject, p = predicate, etc)
11:02:26
for store replication
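The loader idea above could look roughly like the following. This is only a sketch in Python, not the (not yet existing) ARC component: the function name `iter_quads` and the sample data are hypothetical, and it assumes the result set uses the variable names g, s, p and o in the standard SPARQL XML results format.

```python
import io
import xml.etree.ElementTree as ET

SR = "{http://www.w3.org/2005/sparql-results#}"

def iter_quads(stream):
    """Yield (g, s, p, o) tuples, one per <result>, without loading the whole file."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != SR + "result":
            continue
        row = {}
        for binding in elem.findall(SR + "binding"):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            # Only the text is kept here; the term type is dropped for brevity.
            row[binding.get("name")] = value.text
        yield (row.get("g"), row.get("s"), row.get("p"), row.get("o"))
        elem.clear()  # release the finished result's children

SAMPLE = b"""<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="g"/><variable name="s"/>
    <variable name="p"/><variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="g"><uri>http://example.org/g</uri></binding>
      <binding name="s"><uri>http://example.org/s</uri></binding>
      <binding name="p"><uri>http://www.w3.org/2000/01/rdf-schema#label</uri></binding>
      <binding name="o"><literal>machine</literal></binding>
    </result>
  </results>
</sparql>"""

quads = list(iter_quads(io.BytesIO(SAMPLE)))
print(quads)
```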
11:06:01
bengee - optimising tables made the rdfs:label query a lot better
11:06:06
but the ?p query a lot worse
11:06:48
rdfs:label now takes only 15.0095608234 seconds
11:06:59
heh, but that's how it was supposed to be, no? ;)
11:07:03
but ?p now takes 76.0367071629 seconds
11:07:15
(whereas before it took 5-7 seconds)
11:07:41
yeah, I suppose it depends though - optimised for what ? :)
11:07:59
some query cache thing perhaps, too?
11:08:12
not sure
12:47:17
bengee mangs gets rid of the table locks for DELETE queries
12:47:40
heh, s/mangs/manages to/
12:55:52
bengee: I seem to be managing to stream large datasets into the platform now thanks to ARC's streaming parser :)
12:56:03
oh, cool :)
12:56:29
script hasn't finished yet - but it hasn't died yet either :) and the triples are definitely going in :)
12:56:42
yay
12:57:59
I have a question about the format detector - I pointed it at a local file (on mac) with a .rdf extension, containing rdf/xml, and it detected xml rather than rdf/xml
13:00:22
I'd need a snippet of the first 1000 chars to see why it's not working
13:00:44
(or 1st couple of lines)
21:54:19
Hi
21:54:26
Anybody here?
