This parser creates an array of triples from RDF/XML. It passes each of the 128 positive parser tests.
The structure of a single triple is described under "Triples array structure" below.
Setup
Simply include the parser class:
include_once("path/to/arc/ARC_rdfxml_parser.php");
Instantiation
The parser can be instantiated with an array of parameters with the following (all optional) keys:
- base
- bnode_prefix (custom bnode prefix)
- encoding
- proxy_host
- proxy_port
- user_agent (custom User-Agent string)
- headers (an array of HTTP headers)
- save_data (parsed RDF/XML chunks will be stored in a variable during parsing)
$args = array(
  "bnode_prefix" => "genid",
  "base" => ""
);
$parser = new ARC_rdfxml_parser($args);
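The parameter array can also be assembled programmatically. The following is a minimal sketch, not part of ARC: build_parser_args() is a hypothetical helper, the defaults are this sketch's own choice (taken from the example above), and the proxy host, port and User-Agent values are placeholders. Whether proxy_port is expected as an integer or a string is an assumption here.

```php
<?php
// Hypothetical helper, not part of ARC: merges caller options over this
// sketch's own defaults and drops keys the parser does not document.
function build_parser_args(array $overrides = array()) {
    $defaults = array("base" => "", "bnode_prefix" => "genid");
    $documented = array(
        "base", "bnode_prefix", "encoding", "proxy_host",
        "proxy_port", "user_agent", "headers", "save_data"
    );
    $args = array_merge($defaults, $overrides);
    // Keep only the keys listed in the documentation.
    return array_intersect_key($args, array_flip($documented));
}

$args = build_parser_args(array(
    "proxy_host" => "proxy.example.com", // placeholder value
    "proxy_port" => 8080,                // placeholder value, type assumed
    "user_agent" => "my-crawler/0.1",    // placeholder value
    "save_data"  => true,
    "unknown"    => 1                    // not documented, will be dropped
));

// With the parser class included, the array would be passed on like this:
// $parser = new ARC_rdfxml_parser($args);
```

Filtering against the documented keys is a design choice of the sketch; it only guards against typos in option names.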
Parsing
There are three different methods for parsing:
- parse_web_file($url)
- parse_file($path)
- parse_data($data)
The parse_web_file method sends an "Accept: application/rdf+xml" header and follows up to 4 HTTP redirects. Here is an example for parsing an RDF/XML file from the Web:
$url = "http://www.example.com/data.rdf";
$result = $parser->parse_web_file($url);
if (is_array($result)) {
  echo count($result) . " triples found";
}
else {
  echo "couldn't parse " . $url . ": " . $result;
}
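All three parsing methods return either an array of triples or an error string, so the same check works for each of them. The sketch below illustrates this with parse_data and an inline RDF/XML string. describe_parse_result() is a hypothetical helper, not part of ARC, and the document content is made up for illustration. The parser call is guarded so that the snippet also runs when the class has not been included.

```php
<?php
// Hypothetical helper, not part of ARC: turns a parse result
// (array of triples or error string) into a readable message.
function describe_parse_result($result, $source) {
    if (is_array($result)) {
        return count($result) . " triples found in " . $source;
    }
    return "couldn't parse " . $source . ": " . $result;
}

// Made-up RDF/XML document for illustration.
$rdfxml = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    . ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    . '<rdf:Description rdf:about="http://www.example.com/doc">'
    . '<dc:title>Example</dc:title>'
    . '</rdf:Description>'
    . '</rdf:RDF>';

// Only runs if ARC_rdfxml_parser.php has been included beforehand.
if (class_exists("ARC_rdfxml_parser")) {
    $parser = new ARC_rdfxml_parser(array("base" => "http://www.example.com/"));
    echo describe_parse_result($parser->parse_data($rdfxml), "inline data") . "\n";
}
```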
Triples array structure
The triples array returned by the parser is a flat array of associative arrays. It can be processed with a simple loop:
$triples = $parser->parse_web_file($url);
for ($i = 0, $i_max = count($triples); $i < $i_max; $i++) {
  $triple = $triples[$i];
  echo 'triple ' . $i . ': ';
  print_r($triple);
}
A single triple is structured as follows:
$triple = array(
's' => array(
'type' => 'uri|bnode',
'uri|bnode_id' => '...' // subject value
),
'p' => '...', // property URI
'o' => array(
'type' => 'uri|bnode|literal',
'uri|bnode_id|val' => '...', // object value
'dt' => '...', // datatype URI
'lang' => '...', // language
)
);
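To illustrate how this structure can be consumed, here is a rough sketch that renders a triple as an N-Triples-like line. It rests on two assumptions that the description above does not spell out: the notation 'uri|bnode_id|val' is read as "the key is 'uri', 'bnode_id' or 'val', depending on 'type'", and 'dt' and 'lang' are taken to be empty or absent when they do not apply. render_node() and triple_to_line() are hypothetical helpers, not part of ARC, and the sample triple is made up.

```php
<?php
// Hypothetical helper: renders a subject or object node as an N-Triples-like term.
function render_node(array $node) {
    switch ($node['type']) {
        case 'uri':
            return '<' . $node['uri'] . '>';
        case 'bnode':
            return '_:' . $node['bnode_id'];
        case 'literal':
            // Escape backslashes, quotes and line breaks in the literal value.
            $escaped = strtr($node['val'], array(
                "\\" => "\\\\", "\"" => "\\\"", "\n" => "\\n", "\r" => "\\r"
            ));
            $term = '"' . $escaped . '"';
            // Assumption: 'dt' and 'lang' are empty or missing when unused.
            if (!empty($node['dt'])) {
                $term .= '^^<' . $node['dt'] . '>';
            } elseif (!empty($node['lang'])) {
                $term .= '@' . $node['lang'];
            }
            return $term;
        default:
            throw new InvalidArgumentException('unknown node type: ' . $node['type']);
    }
}

// Hypothetical helper: renders one triple array as a single line.
function triple_to_line(array $triple) {
    return render_node($triple['s']) . ' <' . $triple['p'] . '> '
        . render_node($triple['o']) . ' .';
}

// Made-up sample triple following the structure described above.
$sample = array(
    's' => array('type' => 'uri', 'uri' => 'http://www.example.com/doc'),
    'p' => 'http://purl.org/dc/elements/1.1/title',
    'o' => array('type' => 'literal', 'val' => 'Example', 'dt' => '', 'lang' => 'en'),
);
$line = triple_to_line($sample);
echo $line . "\n";
// prints: <http://www.example.com/doc> <http://purl.org/dc/elements/1.1/title> "Example"@en .
```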
Methods
- set_base($base)
  - expects a URL for $base.
- init()
  - resets the parser and re-processes the array of parameters that were passed when the parser was instantiated.
- parse_web_file($url)
  - expects the URL of an RDF/XML document for $url and returns an array of triples or an error string. This method considers proxy settings and additional headers.
- parse_file($path)
  - expects the path to an RDF/XML document for $path and returns an array of triples or an error string. $path can be a URL.
- parse_data($data)
  - expects an RDF/XML string for $data and returns an array of triples or an error string.
- get_triples()
  - returns the current array of triples.
- get_target_encoding()
  - returns the value of the parser's target encoding option ("UTF-8", "ISO-8859-1" or "US-ASCII").
- get_data()
  - returns the parsed RDF/XML if the parser was initialized with "save_data" => true.
- get_result_headers()
  - returns an array of HTTP headers if the method parse_web_file() was called.
