This class provides convenient access to ARC's classes and methods.
Setup
Simply include the class file:
include_once("path/to/arc/ARC_api.php");
Instantiation
The class can be instantiated with an array of parameters:
- inc_path (the path to the ARC class files, relative to the calling PHP script)
- config (an array containing the detailed configuration parameters)
- config_path (instead of passing a config array directly, it is possible to specify a config_path from which ARC tries to load the configuration on demand. The file at config_path should be a PHP script that provides a function arc_get_api_config() which returns the array of configuration parameters)
$args = array(
  "inc_path"=>"code/arc/",
  "config"=>array(
    /* db */
    "db_host"=>"localhost",
    "db_name"=>"my_db1",
    "db_user"=>"user",
    "db_pwd"=>"secret",
    /* store prefix */
    "prefix"=>"intranet1",
    /* store */
    "store_type"=>"basic+",
    "id_type"=>"hash_int",
    "reversible_consolidation"=>true,
    "index_type"=>"advanced",
    "index_graph_iris"=>true,
    "index_words"=>false
  )
);
$api = new ARC_api($args);
Or, using a config_path:
$args = array(
  "inc_path"=>"path/to/arc/",
  "config_path"=>"sys/arc_config.php"
);
$api = new ARC_api($args);
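The file referenced by config_path only needs to define arc_get_api_config(). The following is a minimal sketch of such a file, assuming it lives at sys/arc_config.php and reusing the example values from this page; the database credentials and the prefix are placeholders, not recommended settings.

```php
<?php
// sys/arc_config.php (assumed location, matching the config_path example).
// ARC is expected to include this file on demand and call the function below.
function arc_get_api_config() {
    return array(
        /* db (placeholder credentials) */
        "db_host" => "localhost",
        "db_name" => "my_db1",
        "db_user" => "user",
        "db_pwd"  => "secret",
        /* store prefix */
        "prefix" => "intranet1",
        /* store */
        "store_type" => "basic+",
        "id_type" => "hash_int",
        "reversible_consolidation" => false,
        "index_type" => "advanced",
        "index_graph_iris" => true,
        "index_words" => false
    );
}
```

Keeping the configuration in its own file means the calling scripts carry no credentials and several scripts can share one store definition.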
Here is an example config array with commented store parameters:
$config = array(
  /* db */
  "db_host"=>"localhost",
  "db_name"=>"my_db1",
  "db_user"=>"user",
  "db_pwd"=>"secret",
  /* store prefix */
  "prefix"=>"intranet1",
  /* store */
  "store_type"=>"basic+", // (basic|basic+|split)
  "id_type"=>"hash_int", // (hash_int|hash_md5|hash_sha1|incr_int)
  "reversible_consolidation"=>false, // adds additional columns
  "index_type"=>"advanced", // (basic|advanced)
  "index_graph_iris"=>true, // add graph columns to indexes
  "index_words"=>false, // creates FULLTEXT index on values
  "charset"=>"utf8" // for MySQL, if supported
);
The possible parameters in more detail:
- A basic store uses a single triple table. Although it offers a way to move triple duplicates (from different graphs) to a separate duplicates table, it is not possible to query both tables in a single query. The basic store does not use MySQL's MERGE storage engine. It can be used on servers running an older MySQL version (e.g. 4.0.18) where MERGE tables are still too experimental, or when duplicates are not an issue.
- A basic+ store uses the same table layout as the basic store (a triple table, and a duplicates table), but will create MERGE tables on the two triple tables to allow GRAPH queries across both tables.
- A split store uses several triple tables to increase query and insert speed. By default, datatype properties are separated from object properties. Additionally, it's possible to specify so-called prop-tables to split out selected properties. Below is an example which specifies two prop-tables:
$rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
$foaf="http://xmlns.com/foaf/0.1/";
$pim="http://example.com/pim/pim#";
$config["prop_tables"] = array(
  array(
    "name"=>"type",
    "prop_type"=>"obj",
    "props"=>array($rdf."type")
  ),
  array(
    "name"=>"private",
    "prop_type"=>"obj",
    "props"=>array($foaf."mbox", $pim."fax")
  )
);
The prop_tables feature can significantly speed up store operations. For example, if you are building a social networking app, a dedicated table for a group of relation properties (foaf:knows, rel:friendOf, etc.) will lead to faster path queries. For an OWL or SKOS editor, you could specify a prop_table for rdfs:subClassOf, or for skos:broader and skos:narrower respectively, to accelerate the recursive generation of (sub-)tree structures.
However, this feature is not well-tested in MySQL. If you want to stay on the safe side, it's best to use the basic store only. If you use the split store, it is recommended to add a script which calls the __destruct() method to drop and re-create the MERGEd tables once in a while.
- The id_type defines if and how hashes for the normalized table layout are created: hash_int uses integers created from an md5 subset of a value, hash_md5 uses a 21-char version of a full md5 hash, hash_sha1 uses a 26-char version of a full sha1 hash, and incr_int does not use hashes, but incremented integers to identify values. The non-integer ids cause a larger index size but reduce the probability of colliding ids. incr_int avoids hash collisions completely, but leads to slower inserts, as a table look-up is needed for each ID creation.
- The advanced index_type uses some additional indexes to increase query performance.
- Reversible consolidation makes it possible to undo a smushing operation (in case some resource descriptions that were meant to be distinct were merged).
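The trade-off between hashed and incremented ids can be illustrated with a small sketch. Both helpers below are hypothetical and are not ARC's implementation: sketch_hash_int() assumes that the "md5 subset" is the first 8 hex digits of the hash, and SketchIncrIds uses an in-memory array where ARC would need a table look-up.

```php
<?php
// Hypothetical: derive an integer id from the first 8 hex digits (32 bits)
// of the md5 hash. No look-up is needed, but two values may collide.
function sketch_hash_int($value) {
    return hexdec(substr(md5($value), 0, 8));
}

// Hypothetical: assign incremented ids. Collisions are impossible, but every
// id creation needs a look-up to see whether the value is already known.
class SketchIncrIds {
    private $ids = array();
    private $next = 1;

    public function id_for($value) {
        if (!isset($this->ids[$value])) {
            $this->ids[$value] = $this->next++;
        }
        return $this->ids[$value];
    }
}

$incr = new SketchIncrIds();
$first  = $incr->id_for("http://xmlns.com/foaf/0.1/name");  // new value
$second = $incr->id_for("http://xmlns.com/foaf/0.1/mbox");  // new value
$again  = $incr->id_for("http://xmlns.com/foaf/0.1/name");  // known value
```

The hashed id can be computed from the value alone, which is why inserts are faster; the incremented id depends on what has been stored before.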
Methods
- db_connect()
-
$api->db_connect();
- __destruct
- Drop MERGEd tables. (PHP5 calls this automatically)
$api->__destruct();
- db_disconnect
-
$api->db_disconnect();
- store_exists
-
echo ($api->store_exists()) ? "it exists" : "it doesn't exist";
- create_store
- Creates tables, if they don't exist already.
$success=$api->create_store();
- delete_store
- Deletes all tables.
$success=$api->delete_store();
- reset_store
- Truncates the tables.
$success=$api->reset_store();
- add_data
- Streaming insert from the Web:
$args = array(
  "result_type"=>"json", // (plain|array|json|xml)
  "graph_iri"=>"http://www.planetrdf.com/index.rdf"
);
$sub_r = $api->add_data($args);
echo ($sub_r["error"]) ? $sub_r["error"] : $sub_r["result"];
Adding a single triple (the triple has to be encoded in Turtle):
$dc="http://purl.org/dc/elements/1.1/";
$args = array(
  "result_type"=>"array", // (plain|array|json|xml)
  "graph_iri"=>"http://localhost/tests/g1", // required
  "add_triple"=>'<http://arc.web-semantics.org/> <'.$dc.'creator> "Benjamin Nowack" .'
);
$tmp=$api->add_data($args);
Adding RDF/XML code directly:
$dc="http://purl.org/dc/elements/1.1/";
$args = array(
  "result_type"=>"array", // (plain|array|json|xml)
  "graph_iri"=>"http://localhost/tests/g1", // required
  "add_rdfxml"=>'
    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF ...>
    ...
    </rdf:RDF>
  '
);
$sub_r=$api->add_data($args);
- query
- SPARQL SELECT/ASK/DESCRIBE/CONSTRUCT
$q='
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT DISTINCT ?g ?p1_name ?p2_name WHERE {
    GRAPH ?g { ?p1 foaf:knows ?p2 } .
    ?p1 foaf:name ?p1_name .
    ?p2 foaf:name ?p2_name .
    FILTER(REGEX(?p2_name, "^D")).
  }
  LIMIT 50
';
$args=array(
  "result_type"=>"rows", // (rows|json|xml|single|rows_n_count|row_count|sql)
  "query"=>$q
);
$qr=$api->query($args);
if($rows=$qr["result"]){
  foreach($rows as $row){
    echo $row["g"]." ";
    echo $row["p1_name"]." ";
    echo $row["p2_name"];
    echo "\n";
  }
}
- delete_data
- Deleting triples from a given graph:
$args = array(
  "result_type"=>"array", // (sql|array|plain|xml|json)
  "graph_iri"=>"http://localhost/tests/g1"
);
$sub_r=$api->delete_data($args);
Deleting triples matching a given pattern:
$args = array(
  "del_s"=>"http://arc.web-semantics.org/",
  "graph_iri"=>"http://localhost/tests/g1"
);
$sub_r=$api->delete_data($args);
Possible pattern elements:
Any combination of del_s, del_p, del_o, del_o_lang, del_o_dt, graph_iri.
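How a combination of pattern elements selects triples can be sketched in plain PHP. The matcher below is a hypothetical illustration, not part of ARC: it assumes that an omitted element acts as a wildcard and that all supplied elements must match, and the quad field names (s, p, o, o_lang, o_dt, g) are made up for the example.

```php
<?php
// Hypothetical matcher: returns true if the quad matches every supplied
// pattern element; elements that are not supplied match anything (assumption).
function sketch_matches_pattern(array $quad, array $pattern) {
    $map = array(
        "del_s" => "s",
        "del_p" => "p",
        "del_o" => "o",
        "del_o_lang" => "o_lang",
        "del_o_dt" => "o_dt",
        "graph_iri" => "g"
    );
    foreach ($map as $param => $field) {
        if (!isset($pattern[$param])) {
            continue; // element not supplied: no constraint
        }
        if (!isset($quad[$field]) || $quad[$field] !== $pattern[$param]) {
            return false;
        }
    }
    return true;
}

$quad = array(
    "s" => "http://arc.web-semantics.org/",
    "p" => "http://purl.org/dc/elements/1.1/creator",
    "o" => "Benjamin Nowack",
    "g" => "http://localhost/tests/g1"
);
$hit = sketch_matches_pattern($quad, array(
    "del_s" => "http://arc.web-semantics.org/",
    "graph_iri" => "http://localhost/tests/g1"
));
$miss = sketch_matches_pattern($quad, array(
    "del_s" => "http://arc.web-semantics.org/",
    "graph_iri" => "http://localhost/tests/g2"
));
```

Under these assumptions, a pattern containing only graph_iri matches every triple of that graph, which corresponds to the first delete_data example.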
- update_data
- Combines delete_data and add_data in a single call:
$dc="http://purl.org/dc/elements/1.1/";
$args = array(
  "graph_iri"=>"http://localhost/tests/g1",
  "del_s"=>"http://arc.web-semantics.org/",
  "del_p"=>$dc."creator",
  "add_triple"=>'<http://arc.web-semantics.org/> <'.$dc.'creator> "Benjamin Nowack" .'
);
$sub_r=$api->update_data($args);
- move_duplicates
- Moves quads that only differ in their graph value (i.e. s, p, and o are redundant) to a separate duplicates table. Calling this method keeps the main triple table smaller and avoids combinatorial explosions in the SQL engine. The basic+ and the split stores still allow GRAPH queries after duplicate removal.
$tmp=$api->move_duplicates();
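What counts as a duplicate can be shown with an in-memory sketch. This is only an illustration of the idea; ARC performs the operation on its SQL tables, and the function name and quad layout below are hypothetical.

```php
<?php
// Hypothetical illustration: the first quad with a given s/p/o stays in the
// main list; later quads that differ only in their graph are moved aside.
function sketch_move_duplicates(array $quads) {
    $seen = array();
    $main = array();
    $duplicates = array();
    foreach ($quads as $q) {
        // serialize() gives an unambiguous key for the s/p/o combination
        $key = serialize(array($q["s"], $q["p"], $q["o"]));
        if (isset($seen[$key])) {
            $duplicates[] = $q;
        } else {
            $seen[$key] = true;
            $main[] = $q;
        }
    }
    return array("main" => $main, "duplicates" => $duplicates);
}

$quads = array(
    array("s" => "_:a", "p" => "foaf:name", "o" => "Alice", "g" => "http://localhost/tests/g1"),
    array("s" => "_:a", "p" => "foaf:name", "o" => "Alice", "g" => "http://localhost/tests/g2"),
    array("s" => "_:b", "p" => "foaf:name", "o" => "Bob",   "g" => "http://localhost/tests/g1")
);
$split = sketch_move_duplicates($quads);
```

In this example the main list keeps two quads and the g2 copy of the first statement ends up in the duplicates list.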
- restore_duplicates
- Moves duplicates back to the main triple table(s).
$tmp=$api->restore_duplicates();
- consolidate_resources
- Consolidates resources based on functional properties or inverse functional properties. The ARC store supports incremental "smushing"; the process gets faster when called multiple times.
$owl="http://www.w3.org/2002/07/owl#";
$foaf="http://xmlns.com/foaf/0.1/";
$args=array(
  "fp"=>$owl."sameAs",
  "ifps"=>array($foaf."mbox", $foaf."homepage", $owl."sameAs")
);
$sub_r=$api->consolidate_resources($args);
Possible parameters:
fp or fps, ifp or ifps
- undo_resource_consolidation
- Tries to "un-merge" a smushed resource (needs reversible_consolidation to be set to true). It may make sense to restore the triple duplicates first.
$args=array(
  "resource_id"=>"_:bnode27" // either a resource IRI or a bnode id
);
$sub_r=$api->undo_resource_consolidation($args);
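To make the terms concrete: consolidation via an inverse functional property treats two resources that share the same value for that property as one resource. The sketch below is a hypothetical, single-pass, in-memory illustration and not ARC's algorithm; it only records which resource would be merged into which, which is roughly the kind of information an undo operation would need (an assumption, as this page does not describe the extra columns).

```php
<?php
// Hypothetical illustration of smushing by inverse functional properties.
// Returns a map: merged resource => resource it is merged into.
function sketch_smush(array $triples, array $ifps) {
    $canonical = array(); // (property, value) => first subject seen
    $merged = array();
    foreach ($triples as $t) {
        if (!in_array($t["p"], $ifps, true)) {
            continue; // not an IFP, cannot identify a resource
        }
        $key = serialize(array($t["p"], $t["o"]));
        if (!isset($canonical[$key])) {
            $canonical[$key] = $t["s"];
        } elseif ($canonical[$key] !== $t["s"]) {
            $merged[$t["s"]] = $canonical[$key];
        }
    }
    return $merged;
}

$foaf = "http://xmlns.com/foaf/0.1/";
$triples = array(
    array("s" => "_:a", "p" => $foaf."mbox", "o" => "mailto:x@example.com"),
    array("s" => "_:b", "p" => $foaf."mbox", "o" => "mailto:x@example.com"),
    array("s" => "_:c", "p" => $foaf."name", "o" => "C")
);
$merged = sketch_smush($triples, array($foaf."mbox", $foaf."homepage"));
```

Here _:b is mapped to _:a because both share the same foaf:mbox value, while _:c is left alone because foaf:name is not in the list of inverse functional properties.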
- remove_unlinked_ids
- Cleans up the id2val table.
$sub_r=$api->remove_unlinked_ids();
- optimize_tables
- Should not be needed as ARC uses a fixed table layout.
$tmp=$api->optimize_tables();
