Channel #semsol: Logs

This is a public chat log generated from the #semsol IRC channel.

09:41:13 mmmmmrob: bengee: started writing some notes: http://n2.talis.com/wiki/ARC2_Memory_usage
09:43:46 bengee: ah, cool
09:45:13 bengee: do you think some convenience method (such as parser->reset()) could make sense here?
09:47:05 bengee: I could also think about not setting the datatype and lang if they are not present
09:47:19 bengee: for a lower per-triple footprint
09:47:39 bengee: although we might need more isset() checks in other places then
09:50:43 kwijibo: bengee: sounds good - an additional benefit would be enabling reuse of the parser for different docs?
09:51:45 bengee: hmm, perhaps
11:00:49 mmmmmrob: bengee: yep, a parser->reset() to clear it would be nice
11:01:12 mmmmmrob: bengee: as would not setting the datatype and lang, but not too worried about those
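
[Illustration of the two ideas above: a reset() helper so a parser can be emptied and reused, and leaving out datatype/lang when they are absent. This is a minimal Python sketch, not ARC2's PHP code; the class name TinyParser and the keys o_datatype and o_lang are assumptions made for the example.]

```python
class TinyParser:
    """Hypothetical stand-in for an RDF parser; not ARC2's actual class."""

    def __init__(self):
        self.triples = []

    def add_triple(self, s, p, o, datatype=None, lang=None):
        triple = {"s": s, "p": p, "o": o}
        # Store the optional keys only when they carry a value,
        # which keeps the per-triple footprint smaller.
        if datatype:
            triple["o_datatype"] = datatype
        if lang:
            triple["o_lang"] = lang
        self.triples.append(triple)

    def reset(self):
        # Drop the accumulated triples so the memory can be reclaimed
        # and the same parser object can be reused for another document.
        self.triples = []


parser = TinyParser()
parser.add_triple("http://example.org/a", "http://example.org/name", "Alice", lang="en")
parser.add_triple("http://example.org/a", "http://example.org/age", "42")

# Consumers now need a presence check (the isset() concern raised above).
for t in parser.triples:
    print(t["o"], t.get("o_lang", "(no lang)"))

parser.reset()
```

[The trade-off is the one bengee mentions: every reader of a triple has to tolerate missing keys, here via dict.get().]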
11:01:53 mmmmmrob: bengee: on a more complex note, you could abbreviate the URIs with namespaces, but that would make lots of the graph handling have to change
11:02:40 bengee: yep, and don't let talis hear that ;)
11:03:20 kwijibo: mmmmmrob-- ;)
11:03:38 bengee: bengee likes the current rdf/json rdf/php approach
11:03:56 kwijibo: phew :)
11:06:06 mmmmmrob: kwijibo: eh? what did I get a karma hit for?
11:06:11 mmmmmrob: mmmmmrob is trying to help
11:06:32 kwijibo: ;) suggesting namespace abbreviations
11:06:38 mmmmmrob: ah
11:06:54 mmmmmrob: It was just a thought, that's why I added that it would be complex
11:07:16 kwijibo: thus undoing months of discussion and code alignment ;)
11:07:38 mmmmmrob: kwijibo: why would you not want to though? other than it makes things harder - for the internal arrays only, not for the simple Index or anything
11:08:31 mmmmmrob: I wouldn't want to if it were my code - I'd just say 'buy more memory'
11:08:42 mmmmmrob: but that's cause I'm lazy and like simpler code
11:08:46 kwijibo: spose - though what you save on memory you'd maybe lose on performance
11:08:53 mmmmmrob: true
11:09:22 mmmmmrob: you would have to do a lot of string manipulation inbound and outbound
11:09:29 mmmmmrob: probably kills it
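
[Illustration of the namespace abbreviation idea and the string work it implies in both directions. A minimal Python sketch under assumptions: the PREFIXES table and the function names compact and expand are made up for the example and are not part of ARC2.]

```python
# Hypothetical prefix table; a real one would be collected from the document.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def compact(uri):
    """Inbound: replace a known namespace with its short prefix."""
    for prefix, ns in PREFIXES.items():
        if uri.startswith(ns):
            return prefix + ":" + uri[len(ns):]
    return uri  # unknown namespace, keep the full URI


def expand(term):
    """Outbound: turn prefix:local back into the full URI."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term  # already a full URI, or an unknown prefix


short = compact("http://xmlns.com/foaf/0.1/name")
full = expand(short)
print(short, full)
```

[Every stored URI passes through compact() on the way in and expand() on the way out, which is the extra per-term string manipulation being weighed against the memory saved.]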
11:10:08 kwijibo: i wonder if you could do anything with constants or something
11:10:19 mmmmmrob: anyhoo, now we know the memory usage is linear and ~2k per triple and that we can discard the parser (helper method would be great) it's not too much of a concern
11:11:07 kwijibo: i guess the constant would just get converted to a string on assignment, i'm talking rubbish
11:12:51 bengee: I've just added a reset() method to the RDF parsers, and tweaked the getSimpleIndex builder to use in_array + an array as needle for dupe checks
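
[Illustration of a subject/predicate/object index with a duplicate check that compares whole object records, loosely analogous to PHP's in_array() with an array as the needle. A Python sketch only; the function name build_simple_index and the triple keys are assumptions, not ARC2's actual getSimpleIndex code.]

```python
def build_simple_index(triples):
    """Group triples as index[subject][predicate] = [object records]."""
    index = {}
    for t in triples:
        obj = {"value": t["o"], "type": t.get("o_type", "literal")}
        objects = index.setdefault(t["s"], {}).setdefault(t["p"], [])
        # Compare the whole record, so a repeated triple is stored once.
        if obj not in objects:
            objects.append(obj)
    return index


triples = [
    {"s": "http://example.org/a", "p": "http://example.org/name", "o": "Alice"},
    {"s": "http://example.org/a", "p": "http://example.org/name", "o": "Alice"},
    {"s": "http://example.org/a", "p": "http://example.org/name", "o": "Al"},
]
index = build_simple_index(triples)
print(len(index["http://example.org/a"]["http://example.org/name"]))
```

[The membership test is a linear scan of the objects already stored for that subject and predicate, so it stays cheap only while those lists are short.]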
11:13:36 bengee: thanks a lot for exporing this, mmmmmrob
11:13:42 bengee: exploring even
11:14:18 mmmmmrob: bengee: np, we're using it, so would like to contribute back if we can
11:14:48 mmmmmrob: maybe I can make up for all the mess^H^H^H^H help kwijibo's been giving you
11:15:14 mmmmmrob: mmmmmrob ducks
11:15:35 kwijibo: :D
11:21:58 bengee: ;)