ARC Release Notes and Change Log

2008-08-04

  • Addition: ARC2_JSONParser
  • Addition: ARC2_CBJSONParser
  • Addition: ARC2_StoreCBJSONLoader
  • Addition: getSGAJSONParser and getCBJSONParser methods
  • Addition: getStructType method
  • Addition: getPreferredFormat method
  • Addition: ARC2_LegacyXMLSerializer
  • Addition: ARC2_LegacyJSONSerializer
  • Addition: ARC2_POSHRDFSerializer
  • Addition: ARC2_HatomExtractor
  • Addition: toUTF8 method
  • Addition in ARC2_getFormat: Support for CrunchBase JSON
  • Addition in ARC2_Reader: basic support for fixing chunked responses despite sending an HTTP 1.0 request
  • Addition in ARC2_Reader: added ping_only config option for background processes
  • Addition in ARC2_Class: toLegacyXML, toLegacyJSON, toLegacyHTML, toHTML methods
  • Addition in ARC2_RDFParser: CrunchBase API JSON support
  • Addition in ARC2_RDFParser: reset method to free memory
  • Addition in ARC2_StoreLoadQueryHandler: CrunchBase API JSON support
  • Addition in ARC2_TurtleParser: Placeholders are allowed as triple blocks
  • Addition in ARC2_TurtleParser: toUTF8 call in xString (Thanks to Alexandre Passant)
  • Addition in ARC2_SPARQLParser: added Placeholder check to xGroupGraphPattern method
  • Addition in ARC2_SemHTMLParser: Support for hAtom
  • Addition in ARC2_SPARQLScriptParser: support for varMerge
  • Addition in ARC2_SPARQLScriptProcessor: getArraySerialization method
  • Addition in ARC2_SPARQLScriptProcessor: sparqlscript_max_operations and sparqlscript_max_queries config options
  • Addition in ARC2_SPARQLScriptProcessor: processVarMergeAssignmentBlock
  • Addition in ARC2_Store: getDomains method
  • Fix in ARC2_SPARQLScriptParser: last query block was not correctly extracted
  • Fix in ARC2_StoreEndpoint: No headers for DUMP queries, added Vary:Accept header
  • Fix in ARC2_NTriplesSerializer: set but empty lang/datatype was serialized (Thanks to Keith Alexander)
  • Fix in ARC2_TurtleSerializer: set but empty lang/datatype was serialized (Thanks to Keith Alexander)
  • Fix in ARC2_LegacyXMLParser: Reader wasn't unset before ISO switch after character error
  • Tweak in ARC2: using in_array for triple dupe checks in getSimpleIndex (Thanks to Rob Styles)
  • Tweak in ARC2_Reader: including a snippet from the response body in the error message in case of 4xx responses
  • Tweak in ARC2_StoreEndpoint: HTML layout changes, more fine-grained snippet methods
  • Tweaks in ARC2_StoreDeleteQueryHandler: improved cleanTableReferences method (~3 times faster) + necessity check
  • Tweak in ARC2_RDFXMLParser: moved createBnodeID method to parent class
  • Tweak in ARC2_SGAJSONParser: moved re-usable code to new ARC2_JSONParser
  • Tweak in ARC2_SPARQLScriptProcessor: excluding unused prefix declarations during query construction
  • Tweak in ARC2_RemoteStore: added "query_type" info to query result
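
The ARC2_Reader entry above mentions fixing chunked responses that arrive even though an HTTP 1.0 request was sent. As a rough illustration of what decoding such a body involves, here is a minimal sketch in Python (not ARC's PHP code). The helper name dechunk is hypothetical, and trailers are simply ignored.

```python
def dechunk(body: bytes) -> bytes:
    """Decode a chunked transfer-encoded body (hypothetical helper, not ARC code)."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            raise ValueError("missing chunk size line")
        # The size line is hexadecimal and may carry ';ext' chunk extensions.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; any trailers are ignored in this sketch
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk")
        out += chunk
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)
```

A reader would only apply this when the response headers announce "Transfer-Encoding: chunked"; otherwise the body is passed through untouched.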

2008-07-15

  • Addition: ARC2_SGAJSONParser
  • Addition: ARC2_StoreSGAJSONLoader
  • Addition in ARC2_getFormat: Support for JSON and SG API JSON
  • Addition in ARC2_RDFParser: SG API JSON support
  • Addition in ARC2_StoreLoadQueryHandler: SG API JSON support
  • Fix in ARC2_Reader: readStream considers content-length header and adjusts d_size now
  • Tweak in ARC2_SPARQLParser: '==' is now an allowed operator in xRelationalExpression, for SPARQLScript
  • Addition in ARC2_SPARQLScriptProcessor: Placeholders, Placeholder property paths, Var assignments, FORBlock, Output templates
  • Addition in ARC2_SPARQLScriptParser: IFBlock, FORBlock, String
  • Tweak in ARC2_Store: added "query_type" info to query result
  • Addition in ARC2_TurtleParser: xPlaceholder method for SPARQLScript
  • Tweak in ARC2_RemoteStore: added "query_type" info to query result

2008-07-04


2008-07-02

  • Addition: ARC2_StoreDumper
  • Addition: ARC2_SPOGParser
  • Addition: ARC2_StoreSPOGLoader (Thanks to Morten Frederiksen for idea and code)
  • Addition in ARC2_Store: dump method and support for DUMP query
  • Addition in ARC2_Store: renameTo method
  • Addition in ARC2_Store: replicateTo method
  • Addition in ARC2_Store: createBackup method
  • Addition in ARC2_StoreEndpoint: SPOG export via dump method or DUMP query
  • Addition in ARC2_StoreLoadQueryHandler: sparqlxml/SPOG support
  • Addition in ARC2_RDFParser: sparqlxml/SPOG support
  • Tweak in ARC2_TurtleParser: added setDefaultPrefixes method to support config-set prefixes
  • Tweak in ARC2_SPARQLParser: switching to shared setDefaultPrefixes method to support config-set prefixes
  • Tweak in ARC2_SPARQLScriptParser: switching to shared setDefaultPrefixes method to support config-set prefixes
  • Fix in ARC2_getFormat: colon in xmlns check is now optional (Thanks to Morten Frederiksen)

2008-07-01

  • General change: Talis platform structure alignments (s/iri/uri/ and normalized "literal" type)
  • Tweak in ARC2_TurtleParser: Added some accelerators: xString, xPN_LOCAL, xPN_PREFIX
  • Fix in ARC2_TurtleParser: unescapeNtripleUTF skipped standard escapes, lang literals were not parsed correctly (Thanks to Arto Bendiken)
  • Tweak in ARC2_RDFExtractor: Added support for "keep_cdata_whitespace" config option
  • Tweak in ARC2_SemHTMLParser: Added support for "keep_cdata_whitespace" config option
  • Fix in ARC2_RdfaExtractor: Improved xmlns and xml:lang declarations in XML literals
  • Fix in ARC2_RDFJSONSerializer: Proper jsonEscape method (Thanks to Keith Alexander)
  • Fix in ARC2_StoreEndpoint: Proper jsonEscape method (Thanks to Keith Alexander)
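
The ARC2_TurtleParser fix above concerns unescapeNtripleUTF skipping standard escapes. The sketch below shows, in Python and purely as an illustration, one way to resolve \uXXXX, \UXXXXXXXX and the standard single-character escapes in a single pass, so that an escaped backslash is never mistaken for the start of a \u sequence. The function name unescape_ntriples is an assumption and does not mirror ARC's implementation.

```python
import re

# Standard single-character escapes used in N-Triples strings.
_SIMPLE = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}

# One pattern for all escape forms, so every backslash is consumed exactly once.
_ESCAPE = re.compile(r'\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|[tnr"\\])')

def unescape_ntriples(text: str) -> str:
    """Resolve \\uXXXX, \\UXXXXXXXX and the standard escapes in one pass."""
    def repl(match):
        token = match.group(1)
        if token[0] in "uU":
            # Numeric escape: the hex digits give the code point.
            return chr(int(token[1:], 16))
        return _SIMPLE[token]
    return _ESCAPE.sub(repl, text)
```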

2008-05-30

  • General change: Talis platform structure alignment (s/dt/datatype/ and s/val/value/)
  • Addition: ARC2_HcalendarExtractor (very early)
  • Addition: ARC2_SPARQLScriptParser (very early)
  • Addition: ARC2_SPARQLScriptProcessor (very early)
  • Addition: ARC2_RemoteStore (1st version, inspired by and based on Morten Frederiksen's RemoteEndpoint Plugin)
  • Addition in ARC2_Class: camelCase method
  • Fix in ARC2_ErdfExtractor: Anchors didn't generate label triples (Thanks to Robert Goené)
  • Tweak in ARC2_MicroformatsExtractor: Added cal namespace
  • Fix in ARC2_RDFXMLParser: rdf:type on property arcs was setting a broken s_type (Thanks to Tuukka Hastrup and Keith Alexander)
  • Addition in ARC2_SemHTMLParser: hcalendar hook
  • Fix in ARC2_SPARQLParser: CONSTRUCT regex was too strict
  • Addition in ARC2_TurtleParser: unescapeNtripleUTF method for N-Triples parsing
  • Fix in ARC2_RDFXMLSerializer: typed/lang'd literals were not escaped
  • Fix in ARC2_StoreSelectQueryHandler: incorrect UNION ALL syntax (Thanks to Alexandre Passant)
  • Tweak in ARC2_StoreSelectQueryHandler: switching from MEMORY to MyISAM to reduce memory requirements

2008-04-09

  • Tweak in ARC2_RdfaExtractor: xhtml namespace is added to tags in XMLLiterals
  • Tweak in ARC2_RdfaExtractor: s/instanceof/typeof/
  • Fix in ARC2_MicroformatsExtractor: document URL was set to base URL (Thanks to Morten Frederiksen)
  • Fix in ARC2_RDFExtractor: document URL was set to base URL (Thanks to Morten Frederiksen)
  • Fix in ARC2_DcExtractor: base URL (not doc URL) was used for annotations (Thanks to Morten Frederiksen)
  • Addition in ARC2_LegacyXMLParser: doc URL is added to node index entries
  • Addition in ARC2_SemHTMLParser: doc_url is set

2008-04-08

  • Tweak in ARC2_Store: Changed locking approach from "LOCK TABLE" to "GET LOCK"
  • Tweak in ARC2_StoreInferencer: Changed locking approach from "LOCK TABLE" to "GET LOCK"
  • Tweak in ARC2_StoreHelper: Changed locking approach from "LOCK TABLE" to "GET LOCK"
  • Tweak in ARC2_StoreTableManager(!): skipping triple backup table for now
  • Tweak in ARC2_StoreTableManager(!): Changed several KEY types to UNIQUE, esp g2t
  • Addition in ARC2_StoreTableManager(!): KEY for "misc" column in triple tables for faster DELETEs
  • Tweak in ARC2_StoreDeleteQueryHandler: Changed locking approach from "LOCK TABLE" to "GET LOCK"
  • Tweak in ARC2_StoreLoadQueryHandler: skipping triple backups for now
  • Tweak in ARC2_StoreLoadQueryHandler: Changed locking approach from "LOCK TABLE" to "GET LOCK"
  • Tweak in ARC2_StoreSelectQueryHandler: Changed temp table approach from "CREATE SELECT / ALTER" to "CREATE (TEMPORARY) / INSERT SELECT"
  • Fix in ARC2_StoreSelectQueryHandler: getLeftJoins method didn't send "LEFT" to getRequiredSubJoinSQL (Thanks to Masahide Kanzaki)
  • Addition in ARC2_ErdfExtractor: format detection
  • Addition in ARC2_RdfaExtractor: format detection
  • Addition in ARC2_SemHTMLParser: Format detection for RDFa and eRDF (eRDF profile / RDFa doctype is now required)

2008-03-26

  • Fix in ARC2_StoreLoadQueryHandler: getTermID forced full mysql table scans (Thanks to Steve Ducat)
  • Fix in ARC2_Store: Moved BINARY operator in getTermID() from column to constant to avoid full table scans (Thanks to Steve Ducat)

2008-03-20

  • Fix in ARC2_getFormat: xml wasn't properly detected when the opening tag was very large, added "xmlns:" as xml trigger
  • Fix in ARC2_RDFXMLParser: Entities in XMLLiteral CData sections were decoded (Thanks to Keith Alexander)
  • Fix in ARC2_RdfaExtractor: Avoid triple generation from non-curie rel values
  • Fix in ARC2_RDFExtractor: cdata was sometimes truncated in getPlainContent method
  • Tweak in ARC2_SemHTMLParser: added rel-tag-dc to default extractors
  • Tweak in ARC2_Reader: scheme check before http socket is opened
  • Tweak: inc method: support for third party classes
  • Tweak: getComponent method: support for empty config and third party classes

2008-02-25

  • Fix in ARC2_StoreSelectQueryHandler: Optimized JOIN order (Thanks to Dan Brickley)
  • Tweak in ARC2_RdfaExtractor: Updated to latest Syntax Document
  • Tweak in ARC2_SemHTMLParser: added support for container tags that should be empty
  • Tweak in ARC2_SemHTMLParser: re-added RDFa to default extractors

2008-02-15

  • Fix in ARC2_StoreConstructQueryHandler: auto-adding DISTINCT to avoid unnecessary duplicates (Thanks to Dan Brickley)
  • Fix in ARC2_StoreSelectQueryHandler: lang() test for an empty language didn't work with datatyped literals (Thanks to Viliam Simko)
  • Fix in ARC2_StoreDeleteQueryHandler: Table lock code was broken (Thanks to Morten Frederiksen)
  • Addition in ARC2_RDFParser: Support for RSS Parser
  • Addition in ARC2_StoreLoadQueryHandler: Support for RSS Parser
  • Tweak in ARC2_SemHTMLParser: HTML entities in objects are now decoded
  • Addition: ARC2_RSSParser

2008-02-08

  • Fix in ARC2_RDFSerializer: getPName method created invalid leading digits in local name parts (Thanks to Morten Frederiksen)
  • Fix in ARC2_TurtleParser: large prologues were not supported (Thanks to Morten Frederiksen)
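
The getPName fix above is about local name parts with invalid leading digits (an XML QName local part, for example, cannot start with a digit). The following Python sketch illustrates the general idea under stated assumptions: the function split_pname, its prefix table argument, and the conservative local-name pattern are all hypothetical and are not taken from ARC. When no safe abbreviation exists, the caller would fall back to writing the full URI.

```python
import re

# Conservative local-name pattern: a letter or underscore first, then word chars or '-'.
_LOCAL_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")

def split_pname(uri: str, prefixes: dict):
    """Return 'prefix:local' for uri, or None if no safe abbreviation exists."""
    # Split after the last '#' or '/', whichever comes later.
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    namespace, local = uri[:cut], uri[cut:]
    for prefix, ns in prefixes.items():
        if ns == namespace and _LOCAL_NAME.match(local):
            return f"{prefix}:{local}"
    return None  # e.g. the local part starts with a digit
```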

2008-02-07

  • Fix in ARC2_Store: Trigger results removed from all query type results for backward compatibility
  • Addition in ARC2_Store: optimizeTables method
  • Fix in ARC2_StoreSelectQueryHandler: Too greedy graph table join condition in getRequiredSubJoinSQL
  • Fix in ARC2_getFormat: Comments were not supported, RSS detection bug (Thanks to Morten Frederiksen)
  • Tweak in ARC2_SemHTMLParser: extractRDF is now called automatically, unless "auto_extract" option is set to false
  • Tweak in ARC2_StoreDeleteQueryHandler: Tables are auto-optimized after ~50 DELETEs
  • Tweak in ARC2_StoreLoadQueryHandler: Tables are auto-optimized after ~50 LOADs

2008-02-03

  • Fix in ARC2_Store: Trigger results broke ASK query results, they are only added to other query type results now
  • Fix in ARC2_StoreSelectQueryHandler: Wrong equality check against '0' in getRequiredSubJoinSQL

2008-02-01

  • Addition: "triggers" directory
  • Tweak in ARC2_StoreTableManager: made t in triple table UNIQUE, testing
  • Addition in ARC2_Store: Support for "infos" result format
  • Tweak in ARC2_Store: Removed @ in front of mysql_connect for better error reporting
  • Addition in ARC2_Store: Support for Triggers
  • Addition in ARC2_Store: new consolidateIFP method
  • Fix in ARC2_getFormat: Multiple xml directives were not supported (Thanks to Morten Frederiksen)
  • Addition in ARC2_RDFParser: countTriples method
  • Addition in ARC2_SemHTMLParser: countTriples method
  • Addition in ARC2_RDFXMLParser: countTriples method
  • Addition in ARC2_TurtleParser: countTriples method
  • Fix in ARC2_TurtleParser: '0's in prefixes were ignored, *ahem* (Thanks to Morten Frederiksen)
  • Fix in ARC2_RdfaExtractor: Multi-line plain literals were typed as XMLLiterals
  • Tweak in ARC2_StoreInferencer: Improved IFP Consolidator
  • Addition in ARC2_StoreQueryHandler: check for allow_extension_functions option
  • Addition in ARC2_StoreEndpoint: check store_allow_extension_functions options
  • Tweak in ARC2_StoreSelectQueryHandler: Improved post-consolidation queries
  • Fix in ARC2_StoreSelectQueryHandler: aggregate results were incorrectly typed
  • Addition in ARC2_StoreSelectQueryHandler: Support for (MySQL) function expressions (Cheers to Morten Frederiksen)

2008-01-19

  • Addition in ARC2_Store: getResourceLabel() method
  • Addition in ARC2_StoreSelectQueryHandler: Allow aggregate aliases in ORDER BY
  • Fix in ARC2_StoreLoadQueryHandler: ID buffers were incorrectly reset (Thanks to Jesper Larsen-Ledet)

2008-01-17

  • Fix in ARC2_TurtleSerializer: Quotation mark guessing was still not correct
  • Tweak in ARC2_RDFExtractor: Accept @alt as plain node content
  • Fix in ARC2_RDFXMLParser: XML parser should only be invoked with standard charset (Thanks to Masahide Kanzaki)
  • Fix in ARC2_LegacyXMLParser: XML parser should only be invoked with standard charset (Thanks to Masahide Kanzaki)
  • Tweak in ARC2_Reader: allow overriding of default val in getEncoding
  • Addition in ARC2_SemHTMLParser: getEncoding method
  • Addition in ARC2_RDFParser: getEncoding method for sub-parser
  • Addition in ARC2_DcExtractor: Support for dc:format via meta

2008-01-16

  • Addition in ARC2_RDFXMLParser: Support for encodings detected by the XML parser
  • Addition in ARC2_LegacyXMLParser: Support for encodings detected by the XML parser
  • Tweak in ARC2_StoreLoadQueryHandler: N-Triples are supported now
  • Tweak in ARC2_Reader: Improved charset header detection
  • Addition in ARC2_Reader: Support for POST (or other) request methods
  • Addition in ARC2_Reader: Support for custom header code

2008-01-15

  • Fix in ARC2: getSimpleIndex method dropped object type information in triples coming from a SPARQL query (Thanks to Ivan Garcia Tora)
  • Tweak in ARC2_Class: Serializer calls (toTurtle etc.) use namespace information defined in the configuration now
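
The getSimpleIndex fix above is about keeping object type information when triples are grouped into a resource index. The Python sketch below shows the general shape of such an index (subject, then predicate, then a list of objects) with a duplicate check. The triple keys s, p, o and o_type and the flatten_objects switch are assumptions made for illustration, not a statement of ARC's API.

```python
def simple_index(triples, flatten_objects=True):
    """Group triples as {subject: {predicate: [objects]}} without duplicates."""
    index = {}
    for t in triples:
        if flatten_objects:
            obj = t["o"]
        else:
            # Keep the object's type information alongside its value.
            obj = {"value": t["o"], "type": t.get("o_type")}
        objects = index.setdefault(t["s"], {}).setdefault(t["p"], [])
        if obj not in objects:  # membership test, analogous to PHP's in_array
            objects.append(obj)
    return index
```

With flatten_objects disabled, a URI object and a literal with the same lexical value stay distinguishable, which is the information a flattened index loses.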

2008-01-14

  • Addition in ARC2: getCleanedIndex method for triple removal from a resource index

2008-01-11

  • Fix in ARC2_RDFXMLParser: parseType handling was still broken (Thanks to Ivan Garcia Tora)
  • Addition in ARC2_StoreEndpoint: Support for API read/write keys
  • Addition in ARC2_StoreEndpoint: go() shortcut method

2008-01-09

  • Fix in ARC2_StoreTableManager: TEXT columns should not have DEFAULT values, leads to errors in certain MySQL versions (Thanks to Masahide Kanzaki)
  • Fix in ARC2_RdfaExtractor: xml:base on the root node was ignored (Thanks to Peter Krantz)
  • Addition in ARC2_RDFParser: auto-detected NTriples format is forwarded to Turtle parser now
  • Tweak in ARC2_getFormat: Improved NTriples detection (Thanks to Morten Frederiksen)
  • Tweak in ARC2_StoreEndpoint: Removed comments from JSON results (Thanks to Alexandre Passant)
  • Tweak in ARC2_StoreDescribeQueryHandler: label auto-detection is now optional

2008-01-08

  • Addition in ARC2_RDFParser: extractRDF() can now be called on main RDFParser Class
  • Tweak in ARC2_SemHTMLParser: Removed RDFa from the default extractor formats for now
  • Fix in ARC2_NTriplesSerializer: Datatype overrides language, not the other way round
  • Fix in ARC2_TurtleSerializer: Datatype overrides language, not the other way round
  • Tweak in ARC2_TurtleParser: Allow \u \U escaping in literals
  • Fixes in ARC2_Class: Better URI calculation
  • Tweak in ARC2_StoreSelectQueryHandler: mysql_real_escape_string was called without a DB connection
  • Fixes in ARC2_RDFXMLParser: Several fixes, esp. collection and container handling (Thanks to Ivan Garcia Tora)
  • Tweak in ARC2: Support for ARC Plugins
  • Tweak in ARC2_StoreEndpoint: There is now an empty result message for htmltab output
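
The two serializer fixes above state that a datatype overrides a language tag. A minimal Python sketch of that precedence rule follows. The function name literal_suffix is hypothetical, and treating empty strings as "not set" is an assumption of this sketch.

```python
def literal_suffix(lang="", datatype=""):
    """Return the N-Triples/Turtle suffix for a literal; datatype wins over language."""
    if datatype:  # empty strings count as "not set" in this sketch
        return f"^^<{datatype}>"
    if lang:
        return f"@{lang}"
    return ""
```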

2008-01-07

  • Tweak in SPARQL Endpoint: "output" is now default parameter for custom result formats
  • Fix in SPARQL Endpoint: JSON literals were truncated (Thanks to Dan Brickley)
  • Addition to SPARQL Endpoint: get/post method switch
  • Fix in Turtle Serializer: incorrect literal serialization when both ' and " occurred
  • Fix in N-Triples Serializer: final linebreak was missing (Thanks to Morten Frederiksen)
  • Tweak in SPARQL+ Parser: CONSTRUCT keyword is now optional (Hat tip to Tim Berners-Lee)

2008-01-04


2008-01-03

  • Fix in ARC2_StoreSelectQueryHandler: Graph join conditions were not correctly ordered (Thanks to Dan Brickley)
  • Tweak in ARC2_StoreEndpoint: Improved error reporting, added HTML Table output (Thanks to Dan Brickley)
  • Fix: Lost data in XML Literals in RDF/XML Parser (Thanks to Dan Brickley)
  • Several minor tweaks and fixes

2007-12-19

  • Addition to ARC2_StoreLoadQueryHandler: auto-REPAIR of triple tables
  • Fix in ARC2_StoreDeleteQueryHandler: Delete without specifying a target dataset didn't work correctly

2007-12-17

  • Fix: incomplete addT() call during RDF/XML Collection parsing (Thanks to Dave Langley)
  • Fix: missing isset() check during RDF/XML Collection parsing (Thanks to Dave Langley)

2007-12-14

  • Tweak: updated URL to SPARQL+ documentation in SPARQL Endpoint class
  • Tweak: added format selector to SPARQL Endpoint form

2007-12-10

  • Fix: incorrect Turtle literal serialization when both ' and " occurred in a single-line object (Thanks to Eric Hanson)
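
The fix above concerns literals that contain both ' and ". One way to sidestep the problem, sketched here in Python, is to always delimit with double quotes and escape any embedded ones instead of searching for an unused quote character. This is an illustrative alternative under that assumption, not a description of how ARC's serializer was changed; the name turtle_string is hypothetical.

```python
def turtle_string(value: str) -> str:
    """Quote a string for Turtle by escaping rather than guessing a free quote character."""
    # Backslashes must be escaped first so the quote escapes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    if "\n" in value or "\r" in value:
        # Long form keeps line breaks readable; quotes are still escaped.
        return '"""' + escaped + '"""'
    return '"' + escaped + '"'
```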

2007-12-01

  • Fix: graph query bug (unknown graphs were ignored in dataset restriction)

2007-11-30

  • Fix: DB Connection has to be established before the query is rewritten to SQL (Thanks to Dan Brickley)

2007-11-28

  • Addition: toNTriples method was missing in ARC2_Class.php

2007-11-26

  • Fix: getScriptURI() didn't support CLI-initiated requests (Thanks to Keith Alexander and Stéphane Corlosquet)

2007-11-20

  • Fix: RDF/XML Parser called non-existing init() method (Thanks to Keith Alexander)

2007-11-19

  • Release: ARC2 preview

2006-10-24 (ARC1)

  • Fix (RDF store ADD handler): XML Literals weren't handled correctly by the Turtle converter (Thanks to Martynas Jusevicius for note and test data).
  • Feature (RDF store): (Tables for) store variables are now optional.
  • Tweak (general): Removed various PHP notices.
  • Fix (RDF/XML parser): Better handling of non-XML CDATA.
  • Fix (SPARQL2SQL rewriter): Syntax error in CAST function.

2006-06-11 (ARC1)

  • Feature (SPARQL2SQL rewriter): Variables can now be compared via FILTERs.
  • Tweak (RDF/XML parser): Removed a couple of PHP notices.

2006-06-06 (ARC1)

  • Fix (SPARQL parser): Hyphens in QNames and blank node ids were not parsed correctly.

2006-06-02 (ARC1)

  • Fix (SPARQL2SQL rewriter): unsupported filters led to invalid SQL
  • Feature (SPARQL2SQL rewriter): Added support for language queries
    (?s ?p "foo"@en // FILTER (lang(?o) = "en") // FILTER langMatches(lang(?o), "en") ).
  • Feature (SPARQL2SQL rewriter): Added support for datatype queries
    (?s ?p "true"^^xsd:boolean).

2006-06-01 (ARC1)

  • Fix (SPARQL2SQL rewriter): Added support for an unlimited number of non-triple pattern siblings.
  • Tweak (SPARQL2SQL rewriter): Removed a couple of PHP notices.

2006-05-30 (ARC1)

  • Fix (eRDF parser): Schemas/prefixes can be defined after meta tags in the head section now
  • Fix (eRDF parser): Nested literals work with PHP5 now

2006-05-29 (ARC1)

  • Feature (eRDF parser): Added support for typing class values (e.g. class="-foaf-Person")
  • Fix (eRDF parser): Removed IRI-generation bug

2006-05-28 (ARC1)

  • Tweak (SPARQL2SQL rewriter): Another PHP notice removed
  • Component: Added ARC Embedded RDF Parser

2006-05-23 (ARC1)

  • Fix (SPARQL2SQL rewriter): SQL rewriter adds backticks now. (Thanks to Dan Brickley for note and pointer)
  • Feature (SPARQL2SQL rewriter): SQL rewriter accepts REGEX expressions where the 2nd argument is a var now.
  • Tweak (RDF store SELECT handler): "xml" is now the default result type for SELECT queries.