API (v1)

This class provides convenient access to ARC's classes and methods.

Setup

Simply include the class file:
include_once("path/to/arc/ARC_api.php");

Instantiation

The class can be instantiated with an array of parameters:
  • inc_path (the path to the ARC class files, relative to the calling PHP script)
  • config (an array containing the detailed configuration parameters)
  • config_path (an alternative to passing a config array directly: the path to a file from which ARC tries to load the configuration on demand. The file at config_path should be a PHP script that provides a function arc_get_api_config(), which returns the array of configuration parameters)
e.g.
$args = array(
  "inc_path"=>"code/arc/",
  "config"=>array(
    /* db */
    "db_host"=>"localhost",
    "db_name"=>"my_db1",
    "db_user"=>"user",
    "db_pwd"=>"secret",

    /* store prefix */
    "prefix"=>"intranet1",

    /* store */
    "store_type"=>"basic+",
    "id_type"=>"hash_int", 
    "reversible_consolidation"=>true,
    "index_type"=>"advanced",
    "index_graph_iris"=>true,
    "index_words"=>false
  )
);
$api = new ARC_api($args);
or
$args = array(
  "inc_path"=>"path/to/arc/",
  "config_path"=>"sys/arc_config.php"
);
$api = new ARC_api($args);
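As a minimal sketch, the file referenced by config_path could look like the following. Only the function name arc_get_api_config() and the parameter names come from this page; the file location and all values are placeholders taken from the examples here, so adjust them to your setup.

```php
<?php
// sys/arc_config.php (placeholder path, matching the config_path example above)

// Returns the configuration array that ARC is described as loading on demand.
// All values below are example values, not defaults.
function arc_get_api_config() {
  return array(
    /* db */
    "db_host" => "localhost",
    "db_name" => "my_db1",
    "db_user" => "user",
    "db_pwd" => "secret",

    /* store prefix */
    "prefix" => "intranet1",

    /* store */
    "store_type" => "basic+",
    "id_type" => "hash_int",
    "reversible_consolidation" => true,
    "index_type" => "advanced",
    "index_graph_iris" => true,
    "index_words" => false
  );
}
```

Keeping the configuration in a separate script like this means the database credentials stay out of the calling page.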
Here is an example config array with commented store parameters:
$config = array(
  /* db */
  "db_host"=>"localhost",
  "db_name"=>"my_db1",
  "db_user"=>"user",
  "db_pwd"=>"secret",

  /* store prefix */
  "prefix"=>"intranet1",

  /* store */
  "store_type"=>"basic+",            // (basic|basic+|split)
  "id_type"=>"hash_int",             // (hash_int|hash_md5|hash_sha1|incr_int)
  "reversible_consolidation"=>false, // adds additional columns
  "index_type"=>"advanced",          // (basic|advanced)
  "index_graph_iris"=>true,          // add graph columns to indexes
  "index_words"=>false,              // creates FULLTEXT index on values
  "charset"=>"utf8"                  // for MySQL, if supported
);
The possible parameters in more detail:
  • A basic store uses a single triple table. Although it offers means to move triple duplicates (from different graphs) to a separate duplicates table, it is not possible to query those tables in a single query. The basic store does not use MySQL's MERGE storage engine. It can be used on servers running an older MySQL version (e.g. 4.0.18) where MERGE tables are still too experimental, or when duplicates are not an issue.
  • A basic+ store uses the same table layout as the basic store (a triple table, and a duplicates table), but will create MERGE tables on the two triple tables to allow GRAPH queries across both tables.
  • A split store uses several triple tables to increase query and insert speed. By default, datatype properties are separated from object properties. Additionally, it's possible to specify so-called prop-tables to split out selected properties. Below is an example which specifies two prop-tables:
    $rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    $foaf="http://xmlns.com/foaf/0.1/";
    $pim="http://example.com/pim/pim#";
    
    $config["prop_tables"] = array(
      array(
        "name"=>"type",
        "prop_type"=>"obj",
        "props"=>array($rdf."type")
      ),
      array(
        "name"=>"private",
        "prop_type"=>"obj",
        "props"=>array($foaf."mbox", $pim."fax")
      )
    );
    
    The prop_tables feature can speed up store operations significantly. For example, if you are building a social networking app, a dedicated table for a group of relation properties (foaf:knows, rel:friendOf, etc.) will lead to faster path queries. For an OWL or SKOS editor, you could specify a prop_table for rdfs:subClassOf, or for skos:broader and skos:narrower, to accelerate the recursive generation of (sub-)tree structures.
    However, this feature is not well tested in MySQL. If you want to stay on the safe side, it's best to use the basic store only. If you use the split store, it is recommended to add a script that calls the __destruct() method to drop and re-create the MERGEd tables once in a while.
  • The id_type defines if and how hashes for the normalized table layout are created: hash_int uses integers created from a subset of a value's md5 hash, hash_md5 uses a 21-char version of a full md5 hash, hash_sha1 uses a 26-char version of a full sha1 hash, and incr_int does not use hashes but incremented integers to identify values. The non-integer ids lead to a larger index size but reduce the probability of colliding ids. incr_int avoids hash collisions completely but leads to slower inserts, as a table look-up is needed for each id creation.
  • The advanced index_type uses some additional indexes to increase query performance.
  • Reversible consolidation makes it possible to undo a smushing operation (in case some resource descriptions that were meant to be distinct were merged).
Please note that the store configuration cannot be changed after data has been added to the store.
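The trade-off between hash-based and incremented ids described for id_type can be illustrated with a small sketch. The two helper functions below are hypothetical and are not ARC's implementation; in particular, taking the first 8 hex digits of the md5 hash is only an assumption about what "an md5 subset" might mean.

```php
<?php
// Illustration only: hypothetical helpers, not ARC code.

// Hash-style id: derived from the value alone, so no table look-up is needed,
// but two different values can in principle end up with the same id.
function sketch_hash_int_id($value) {
  return hexdec(substr(md5($value), 0, 8)); // assumed subset: first 8 hex digits
}

// Incremented id: collision-free, but every call has to consult the look-up
// table first (here a plain array stands in for a database table).
function sketch_incr_int_id($value, array &$lookup) {
  if (!isset($lookup[$value])) {
    $lookup[$value] = count($lookup) + 1;
  }
  return $lookup[$value];
}

$lookup = array();
$iri = "http://xmlns.com/foaf/0.1/name";
echo sketch_hash_int_id($iri), "\n";
echo sketch_incr_int_id($iri, $lookup), "\n";
```

The hash variant can be computed independently for every insert, which is why it is faster; the incremented variant pays for its collision safety with the extra look-up.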

Methods

db_connect
$api->db_connect();
__destruct
Drops the MERGEd tables. (PHP 5 calls this automatically.)
$api->__destruct();
db_disconnect
$api->db_disconnect();
store_exists
echo ($api->store_exists()) ? "it exists" : "it doesn't exist";
create_store
Creates tables, if they don't exist already.
$success=$api->create_store();
delete_store
Deletes all tables.
$success=$api->delete_store();
reset_store
Truncates the tables.
$success=$api->reset_store();
add_data
Streaming insert from the Web:
$args = array(
  "result_type"=>"json", // (plain|array|json|xml)
  "graph_iri"=>"http://www.planetrdf.com/index.rdf"
);
$sub_r = $api->add_data($args);
echo ($sub_r["error"]) ? $sub_r["error"] : $sub_r["result"];
Adding a single triple (the triple has to be encoded in Turtle):
$dc="http://purl.org/dc/elements/1.1/";
$args = array(
  "result_type"=>"array", // (plain|array|json|xml)
  "graph_iri"=>"http://localhost/tests/g1", // required
  "add_triple"=>'<http://arc.web-semantics.org/> <'.$dc.'creator> "Benjamin Nowack" .' 
);
$tmp=$api->add_data($args);
Adding RDF/XML code directly:
$dc="http://purl.org/dc/elements/1.1/";
$args = array(
  "result_type"=>"array", // (plain|array|json|xml)
  "graph_iri"=>"http://localhost/tests/g1", // required
  "add_rdfxml"=>'
    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF ...>
      ...
    </rdf:RDF>
  '
);
$sub_r=$api->add_data($args);
query
SPARQL SELECT/ASK/DESCRIBE/CONSTRUCT
$q='
 PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
 SELECT DISTINCT ?g ?p1_name ?p2_name
 WHERE {
   GRAPH ?g { ?p1 foaf:knows ?p2 } .
   ?p1 foaf:name ?p1_name .
   ?p2 foaf:name ?p2_name .
   FILTER(REGEX(?p2_name, "^D")).
 }
 LIMIT 50
';

$args=array(
  "result_type"=>"rows", // (rows|json|xml|single|rows_n_count|row_count|sql)
  "query"=>$q
);

$qr=$api->query($args);

if($rows=$qr["result"]){
  foreach($rows as $row){
    echo $row["g"]." ";
    echo $row["p1_name"]." ";
    echo $row["p2_name"];
    echo "\n";
  }
}
delete_data
Deleting triples from a given graph:
$args = array(
  "result_type"=>"array", // (sql|array|plain|xml|json)
  "graph_iri"=>"http://localhost/tests/g1"
);
$sub_r=$api->delete_data($args);
Deleting triples matching a given pattern:
$args = array(
  "del_s"=>"http://arc.web-semantics.org/",
  "graph_iri"=>"http://localhost/tests/g1"
);
$sub_r=$api->delete_data($args);
Possible pattern elements:
Any combination of del_s, del_p, del_o, del_o_lang, del_o_dt, graph_iri.
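When patterns are assembled dynamically, it can help to check the keys before calling delete_data. The helper below is a hypothetical sketch and not part of ARC; it only accepts the pattern elements listed above and rejects anything else, such as a misspelled key.

```php
<?php
// Hypothetical helper, not part of ARC: validates the keys of a delete pattern.
function build_delete_args(array $pattern) {
  $allowed = array("del_s", "del_p", "del_o", "del_o_lang", "del_o_dt", "graph_iri");
  $args = array();
  foreach ($pattern as $key => $value) {
    if (!in_array($key, $allowed, true)) {
      throw new InvalidArgumentException("Unknown pattern element: " . $key);
    }
    $args[$key] = $value;
  }
  return $args;
}

$dc = "http://purl.org/dc/elements/1.1/";
$args = build_delete_args(array(
  "del_s" => "http://arc.web-semantics.org/",
  "del_p" => $dc . "creator",
  "graph_iri" => "http://localhost/tests/g1"
));
// $args can now be passed on, e.g. $sub_r = $api->delete_data($args);
// A "result_type" entry, as used in the first example, can be added afterwards.
```

Failing early on an unknown key avoids sending a pattern that is broader than intended to the store.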
update_data
Combines delete_data and add_data in a single call:
$dc="http://purl.org/dc/elements/1.1/";
$args = array(
  "graph_iri"=>"http://localhost/tests/g1",
  "del_s"=>"http://arc.web-semantics.org/",
  "del_p"=>$dc."creator",
  "add_triple"=>'<http://arc.web-semantics.org/> <'.$dc.'creator> "Benjamin Nowack" .' 
);
$sub_r=$api->update_data($args);
move_duplicates
Moves quads that only differ in their graph value (i.e. s, p, and o are redundant) to a separate duplicates table. Calling this method keeps the main triple table smaller and avoids combinatorial explosions in the SQL engine. The basic+ and the split stores still allow GRAPH queries after duplicate removal.
$tmp=$api->move_duplicates();
restore_duplicates
Moves duplicates back to the main triple table(s).
$tmp=$api->restore_duplicates();
consolidate_resources
Consolidates resources based on functional properties or inverse functional properties. The ARC store supports incremental "smushing": the process gets faster when called multiple times.
$owl="http://www.w3.org/2002/07/owl#";
$foaf="http://xmlns.com/foaf/0.1/";
$args=array(
  "fp"=>$owl."sameAs",
  "ifps"=>array($foaf."mbox", $foaf."homepage", $owl."sameAs")
);
$sub_r=$api->consolidate_resources($args);
Possible parameters: fp or fps, ifp or ifps
undo_resource_consolidation
Tries to "un-merge" a smushed resource (requires reversible_consolidation to be set to true). It may make sense to move the triple duplicates back first.
$args=array(
  "resource_id"=>"_:bnode27" // either a resource IRI or a bnode id
);
$sub_r=$api->undo_resource_consolidation($args);
remove_unlinked_ids
Cleans up the id2val table.
$sub_r=$api->remove_unlinked_ids();
optimize_tables
Should not be needed as ARC uses a fixed table layout.
$tmp=$api->optimize_tables();