Re: [arc-dev] LOAD data - but data is an authentication protected
file ... how 2 manage?
From: Fabio Ricci
Date: Tue, 26 Jan 2010 10:30:16 +0100
Dear Benjamin,
thank you very much for your advice! It works fine this way.
Regarding your question: I think the information on how to set up
authentication is given in the parser section. If you ask me, I would add
another three lines to the docs, in the first section covering $config,
which would have saved me some time. I know auth is not the main issue
here, but that may change.
In http://arc.semsol.org/docs/v2/getting_started :
Instead of writing:
-----------------------
$config = array(
/* db */
'db_host' => 'localhost', /* default: localhost */
'db_name' => 'my_db',
'db_user' => 'user',
'db_pwd' => 'secret',
/* store */
'store_name' => 'arc_tests',
/* network */
'proxy_host' => '192.168.1.1',
'proxy_port' => 8080,
/* parsers */
'bnode_prefix' => 'bn',
/* sem html extraction */
'sem_html_formats' => 'rdfa microformats',
);
One could write:
-----------------------
$config = array(
/* db */
'db_host' => 'localhost', /* default: localhost */
'db_name' => 'my_db',
'db_user' => 'user',
'db_pwd' => 'secret',
/* store */
'store_name' => 'arc_tests',
/* network */
'arc_reader_credentials' => array(
'http://twitter.com/' => 'USER:PASS',
'http://api.talis.com/stores/STORENAME/meta' => 'USER::PASS'
),
/* parsers */
'bnode_prefix' => 'bn',
/* sem html extraction */
'sem_html_formats' => 'rdfa microformats',
);
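
To make the single-colon versus double-colon convention from Benjamin's quoted message below concrete, here is a minimal sketch of how such an 'arc_reader_credentials' array could be interpreted. The function pick_credentials() is a hypothetical helper written for this illustration and is not part of ARC; treating each URI key as a plain prefix of the requested URL is also an assumption, since ARC's own matching rules may differ.

```php
<?php
// Hypothetical helper, NOT part of ARC: find the credentials entry whose
// URI key is a prefix of the requested URL and work out the auth scheme.
// Assumption: URI keys are matched as plain string prefixes.
function pick_credentials($credentials, $url)
{
    foreach ($credentials as $prefix => $cred) {
        if (strpos($url, $prefix) !== 0) {
            continue; // this entry does not cover the URL
        }
        // A double colon marks Digest Auth; a single colon means Basic Auth.
        $scheme = (strpos($cred, '::') !== false) ? 'digest' : 'basic';
        $sep = ($scheme === 'digest') ? '::' : ':';
        // array_pad() guards against an entry that has no separator at all.
        list($user, $pass) = array_pad(explode($sep, $cred, 2), 2, '');
        return array('scheme' => $scheme, 'user' => $user, 'pass' => $pass);
    }
    return null; // no entry matches: send the request without credentials
}

$config = array(
    'arc_reader_credentials' => array(
        'http://twitter.com/' => 'USER:PASS',
        'http://api.talis.com/stores/STORENAME/meta' => 'USER::PASS',
    ),
);

$match = pick_credentials(
    $config['arc_reader_credentials'],
    'http://api.talis.com/stores/STORENAME/meta?about=x'
);
echo $match['scheme'], "\n";
```

With the two entries above, a URL under the Talis store key resolves to the Digest entry and a URL under http://twitter.com/ resolves to the Basic entry; any other URL gets no credentials.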
Cheers
Fabio
Benjamin Nowack wrote:
> Hi Fabio,
>
> This is probably going to need some tweaks in the ARC core code and
> as I can't rebuild your setup here, I'd suggest we approach it step
> by step and see how far we get ;)
>
> So, let me see if I understand correctly:
> * the ARC-hosting server can access MySQL with the typical creds
> (all fine here)
> * In order to access data, ARC needs to go through a
> password-protected proxy.
> * In order to LOAD RDF data, ARC would need to provide user/pass
> credentials which differ from the proxy creds.
>
> Is that the setup, or am I missing something?
>
> In general, ARC's HTTP client supports proxies: you can specify a
> "proxy_host" and a "proxy_port" parameter. For auth stuff, there
> are separate configuration options. Right now, ARC supports Basic
> Auth and, as of a few weeks ago, also Digest Auth. Proxy Auth would
> still need to be added.
>
> Here is an example $config snippet from an app that needs Basic
> Auth to post to Twitter, and Digest Auth for a Talis platform store:
>
> ...
> 'arc_reader_credentials' => array(
> 'http://twitter.com/' => 'USER:PASS',
> 'http://api.talis.com/stores/STORENAME/meta' => 'USER::PASS',
> ),
> ...
>
> A double colon between USER and PASS tells the ARC Reader to be
> prepared for Digest Auth, otherwise it will use Basic Auth. The
> URI keys tell ARC which URL patterns require authentication. I don't
> know how Proxy Auth works spec-wise, to be honest. Is it as
> simple as Basic Auth, or does it use a nonce and require a handshake?
> In the former case, I could maybe simply add the required headers
> if:
> * a "proxy_host" is specified
> * the "arc_reader_credentials" contain an entry that matches the
> "proxy_host" value.
>
> Do you know of a simple example similar to [1] that illustrates
> a proxy-plus-auth request? Then I could possibly extend the
> Reader to also support proxy authentication.
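
On the spec question: in HTTP, a proxy that wants authentication answers with status 407 and a Proxy-Authenticate challenge, and the client repeats the request with a Proxy-Authorization header. The Basic scheme works there just as it does for a normal 401/Authorization exchange, although a proxy may also demand Digest, which does involve a nonce. Below is a minimal sketch of the simple Basic case, assuming both the proxy and the RDF resource accept Basic Auth. The function build_auth_headers() and the PROXYUSER:PROXYPASS value are hypothetical and not part of ARC.

```php
<?php
// Hypothetical helper, NOT part of ARC: build the request headers for a
// Basic Auth protected resource that is fetched through a Basic Auth proxy.
// Each argument is a 'USER:PASS' string, or null if no auth is needed.
function build_auth_headers($proxyCred, $resourceCred)
{
    $headers = array();
    if ($proxyCred !== null) {
        // Credentials for the proxy itself (answers a 407 challenge).
        $headers[] = 'Proxy-Authorization: Basic ' . base64_encode($proxyCred);
    }
    if ($resourceCred !== null) {
        // Credentials for the origin server (answers a 401 challenge).
        $headers[] = 'Authorization: Basic ' . base64_encode($resourceCred);
    }
    return $headers;
}

$headers = build_auth_headers('PROXYUSER:PROXYPASS', 'USER:PASS');
echo implode("\r\n", $headers), "\r\n";
```

The two headers are independent of each other, which matches the setup described in this thread: one user/password pair for the proxy and a different one for the data.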
>
> Cheers,
> Benji
>
> [1] http://en.wikipedia.org/wiki/Basic_access_authentication
>
>
> On 19.01.2010 22:52:24, Fabio Ricci wrote:
>
>> Hello everybody
>>
>> I started working with ARC today - thank you for the simplicity!
>> I could already load and query some RDF data LOCALLY on my laptop.
>>
>> BUT - when I deployed the whole package (ARC, my software) to another server
>> in a DMZ, problems arose...
>>
>> The next problem is probably authorisation.
>>
>> On this server there is no need to authenticate to reach the internal MySQL
>> database,
>> but requests have to go through a proxy server that requires user
>> authentication,
>> and the RDF data on the server is itself protected
>> (to access this data, one needs another user/password
>> authentication, which is different from the one for the proxy).
>>
>> My question: how does the $config array have to be configured in order to get
>> through all these authentication layers?
>>
>> Thanks in advance
>> Fabio
>>
>>
>>
>>
>
>
>
>
""" ;
ns1:returnPath "<fabio.ricci@ggaweb.ch>" ;
ns1:xOriginalTo "arc-dev@semsol.org" ;
ns1:deliveredTo "web11p1@p15192371.pureserver.info" ;
ns1:received """from Jupiter.local (gprs01.swisscom-mobile.ch [193.247.250.1])
by popeye1.ggamaur.net (8.13.7/8.13.7/Submit) with ESMTP id o0Q9UMF0003603
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
for <arc-dev@semsol.org>; Tue, 26 Jan 2010 10:30:26 +0100 (CET)
(envelope-from fabio.ricci@ggaweb.ch)""" ;
ns1:messageID "<4B5EB628.1020307@ggaweb.ch>" ;
ns1:date "Tue, 26 Jan 2010 10:30:16 +0100" ;
ns1:from "Fabio Ricci <fabio.ricci@ggaweb.ch>" ;
ns1:userAgent "Thunderbird 2.0.0.23 (Macintosh/20090812)" ;
ns1:mIMEVersion "1.0" ;
ns1:to "arc-dev <arc-dev@semsol.org>" ;
ns1:subject """Re: [arc-dev] LOAD data - but data is an authentication protected
file ... how 2 manage?""" ;
ns1:references "<4B562998.80301@ggaweb.ch> <PM-GA.20100125123524.CD33E.3.1D@semsol.com>" ;
ns1:inReplyTo "<PM-GA.20100125123524.CD33E.3.1D@semsol.com>" ;
ns1:contentType '''multipart/alternative;
boundary="------------050001010601050506030108"''' ;
ns1:xScannedBy "MIMEDefang 2.64 on 213.160.40.60" ;
ns1:xSpamCheckerVersion """SpamAssassin 2.64 (2004-01-11) on
p15192371.pureserver.info""" ;
ns1:xSpamLevel "" ;
ns1:xSpamStatus """No, hits=-0.7 required=5.0 tests=BAYES_20,HTML_FONTCOLOR_BLUE,
HTML_FONTCOLOR_RED,HTML_MESSAGE,HTML_TITLE_EMPTY,RCVD_IN_SORBS_WEB
autolearn=no version=2.64