Mailing list ARC-DEV: Archives

[arc-dev] ARC2_SemHTMLParser parse selected DOM nodes

From: "Wagner, Claudia" 
Subject: [arc-dev] ARC2_SemHTMLParser parse selected DOM nodes
Date: Sat, 14 Feb 2009 17:54:38 +0100


OK, parsing individual DOM nodes now works fine for me, thanks!

I also need to serialize the triples found in individual DOM nodes as
RDF/XML.
To do that, I changed the toRDFXML method of class ARC2_Class to accept a
third parameter, which indicates whether the header and footer should be
included in the serialization result:
toRDFXML($v, $ns = '', $raw = 0)

This parameter is used later on anyway, so why not allow the caller to
set it when starting the serialization?
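
To illustrate what such a flag changes, here is a minimal, self-contained sketch. It is not ARC2 code: the function name serializeTriples, the triple layout (subject, predicate QName, literal object) and the boolean $raw argument are all assumptions made for this example. With $raw set, only the rdf:Description elements are returned; otherwise they are wrapped in the XML declaration and the rdf:RDF element.

```php
<?php
// Hypothetical helper (NOT part of ARC2): serializes simple triples as RDF/XML.
// Each triple is assumed to be [subject URI, predicate QName, literal object].
function serializeTriples(array $triples, array $ns, bool $raw = false): string
{
    $body = '';
    foreach ($triples as [$s, $p, $o]) {
        $body .= '  <rdf:Description rdf:about="'
               . htmlspecialchars($s, ENT_XML1 | ENT_QUOTES) . '">' . "\n"
               . '    <' . $p . '>' . htmlspecialchars($o, ENT_XML1)
               . '</' . $p . '>' . "\n"
               . "  </rdf:Description>\n";
    }

    // Raw mode: return only the description elements, no header or footer.
    if ($raw) {
        return $body;
    }

    // Full mode: add the XML declaration and the namespaced rdf:RDF wrapper.
    $decls = ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"';
    foreach ($ns as $prefix => $uri) {
        $decls .= ' xmlns:' . $prefix . '="'
                . htmlspecialchars($uri, ENT_XML1 | ENT_QUOTES) . '"';
    }
    return '<?xml version="1.0" encoding="UTF-8"?' . ">\n"
         . '<rdf:RDF' . $decls . ">\n" . $body . "</rdf:RDF>\n";
}

// Usage: the same triples, once as a bare fragment and once as a document.
$triples  = [['#URI', 'foaf:name', 'Claudia Wagner']];
$ns       = ['foaf' => 'http://xmlns.com/foaf/0.1/'];
$fragment = serializeTriples($triples, $ns, true);
$document = serializeTriples($triples, $ns);
```

Fragments produced in raw mode for several DOM nodes could then be concatenated and wrapped in a single header and footer by the caller.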

Cheers, Claudia

-----Original Message-----
From: Wagner, Claudia [mailto:claudia.wagner@joanneum.at]
Sent: Thu 12.02.2009 13:59
To: arc-dev
Subject: [arc-dev] ARC2_SemHTMLParser parse selected DOM nodes



I just tried the new ARC2_SemHTMLParser feature, which I understood to
parse selected DOM nodes instead of the whole document, but it does not
seem to work.



What I need to do is:

$test_data = '<div xmlns:foaf="http://xmlns.com/foaf/0.1/">
                <div typeOf="foaf:Person" about="#URI">
                  <span property="foaf:name">Claudia Wagner</span>
                </div>
              </div>';

$parser->processData($test_data);

But no triples are found. Any idea what the problem is?



Thanks, Claudia



BTW, the second optional parameter of parse() is called data. Could you
give an example of how to use it?

I thought it could be an XPath expression pointing to the DOM node where
the parser should start parsing.
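
Whether the data parameter accepts an XPath expression is not settled in this message. Independently of that, a node can be selected with XPath by the caller and turned back into a markup string, which is a form that can be handed to a parser as data. The sketch below uses only PHP's bundled DOM extension, not ARC2; the XPath expression and variable names are chosen for illustration.

```php
<?php
// Sketch: select one node with XPath and serialize it back to markup.
// Uses PHP's DOM extension only; no ARC2 calls are made here.
$markup = '<div xmlns:foaf="http://xmlns.com/foaf/0.1/">'
        . '<div typeOf="foaf:Person" about="#URI">'
        . '<span property="foaf:name">Claudia Wagner</span>'
        . '</div>'
        . '</div>';

$doc = new DOMDocument();
$doc->loadXML($markup);   // the test snippet is well-formed XML

// Find the element describing the resource "#URI".
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@about="#URI"]');

// Serialize just the selected node (and its children) to a string.
// Note: namespace declarations made on ancestor elements, such as
// xmlns:foaf here, are not necessarily part of the fragment and may
// have to be re-declared before the fragment is parsed on its own.
$selected = $nodes->item(0);
$fragment = $doc->saveXML($selected);
```

The prefix declaration matters for RDFa in particular, because foaf:Person and foaf:name can only be resolved when the foaf prefix is known to the parser.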



___________________________________

Claudia WAGNER
Institut für vernetzte Medien, JOANNEUM RESEARCH
Elisabethstraße 20, A-8010 Graz, Austria
Tel. +43 316 876 2617   Fax. +43 316 876 1403
http://www.joanneum.at/inm
email: Claudia {dot} Wagner {at} joanneum {dot} at






""" ;
         ns1:returnPath "<claudia.wagner@joanneum.at>" ;
         ns1:xOriginalTo "arc-dev@semsol.org" ;
         ns1:deliveredTo "web11p1@p15192371.pureserver.info" ;
         ns1:received """from RZJC2EX.jr1.local (rzjs027.joanneum.ac.at [143.224.71.151])
	by rzjgate1.joanneum.ac.at (8.14.2/8.14.2) with ESMTP id n1EGsdDY438045
	for <arc-dev@semsol.org>; Sat, 14 Feb 2009 17:54:39 +0100 (MET)""" ;
         ns1:dKIMSignature """v=1; a=rsa-sha256; c=relaxed/relaxed; d=joanneum.at;
	s=rzjgate; t=1234630480; bh=ffW2j00CxF4MDIOFtHfn3qYrwY6eY3zbhH66jdY
	hxjg=; h=MIME-Version:Content-Type:Subject:Date:Message-ID:
	 References:From:To; b=ZVySasIBEtNW8DjzXBkH0NaVgtRqI+Hf8/zYlR0lXwIP
	5unIE99pvlZgoxplZ2NDURE+eH3gDF5cJpfH5+X/TOYXizRWmT+AKd/A3M98aRBHZQv
	YyMLOP4tQsH/R5AToi3+x36JeLQRIgHoyIoApXAqFYAJukU7OraR3ORUe6J4=""" ;
         ns1:xMimeOLE "Produced By Microsoft Exchange V6.5" ;
         ns1:contentClass "urn:content-classes:message" ;
         ns1:mIMEVersion "1.0" ;
         ns1:contentType '''multipart/mixed;
	boundary="----_=_NextPart_001_01C98EC4.EC8750DA"''' ;
         ns1:subject "[arc-dev] ARC2_SemHTMLParser parse selected DOM nodes" ;
         ns1:date "Sat, 14 Feb 2009 17:54:38 +0100" ;
         ns1:messageID "<3477C5E5CA395A4F897F6E3D5DE8091A076C82AB@RZJC2EX.jr1.local>" ;
         ns1:xMSHasAttach "" ;
         ns1:xMSTNEFCorrelator "<3477C5E5CA395A4F897F6E3D5DE8091A076C82AB@RZJC2EX.jr1.local>" ;
         ns1:threadTopic "[arc-dev] ARC2_SemHTMLParser parse selected DOM nodes" ;
         ns1:threadIndex "AcmNEbvPTxgb5LrOTcS1t5uWCu1K+ABsIsMs" ;
         ns1:references "<3477C5E5CA395A4F897F6E3D5DE8091A076C82A8@RZJC2EX.jr1.local>" ;
         ns1:from '"Wagner, Claudia" <claudia.wagner@joanneum.at>' ;
         ns1:to '"arc-dev" <arc-dev@semsol.org>' ;
         ns1:xSpamCheckerVersion """SpamAssassin 2.64 (2004-01-11) on 
	p15192371.pureserver.info""" ;
         ns1:xSpamLevel "" ;
         ns1:xSpamStatus """No, hits=-3.9 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE 
	autolearn=ham version=2.64