Mailing list ARC-DEV: Archives

Re: [arc-dev] LOAD data - but data is an authentication protected file ... how 2 manage?

From: Fabio Ricci 
Subject: Re: [arc-dev] LOAD data - but data is an authentication protected
 file ... how 2 manage?
Date: Tue, 26 Jan 2010 10:30:16 +0100


Dear Benjamin

Thank you very much for your advice! It works fine this way.
Regarding your question: I think the information on how to set up 
authentication is given in the parser section. If you ask me, I would add 
another 3 lines to the docs, in the first section that covers $config; 
that would have saved me some time. I know auth is not the main issue here, 
but ... that may change.

In http://arc.semsol.org/docs/v2/getting_started :
Instead of writing:
-----------------------
$config = array(
  /* db */
  'db_host' => 'localhost', /* default: localhost */
  'db_name' => 'my_db',
  'db_user' => 'user',
  'db_pwd' => 'secret',
  /* store */
  'store_name' => 'arc_tests',
  /* network */
  'proxy_host' => '192.168.1.1',
  'proxy_port' => 8080,
  /* parsers */
  'bnode_prefix' => 'bn',
  /* sem html extraction */
  'sem_html_formats' => 'rdfa microformats',
);

One could write:
-----------------------
$config = array(
  /* db */
  'db_host' => 'localhost', /* default: localhost */
  'db_name' => 'my_db',
  'db_user' => 'user',
  'db_pwd' => 'secret',
  /* store */
  'store_name' => 'arc_tests',
  /* network */
  'arc_reader_credentials' => array(
    'http://twitter.com/' => 'USER:PASS',
    'http://api.talis.com/stores/STORENAME/meta' => 'USER::PASS',
  ),
  /* parsers */
  'bnode_prefix' => 'bn',
  /* sem html extraction */
  'sem_html_formats' => 'rdfa microformats',
);
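
For anyone reading this in the archive, here is a minimal sketch of how the two credential formats can be told apart. It only illustrates the convention from Benjamin's quoted mail (single colon = Basic Auth, double colon = Digest Auth, array keys = URL patterns). The function pick_credentials() is hypothetical and not part of ARC, and the prefix matching is my assumption about what "URL patterns" means; ARC's real matching may differ.

```php
<?php
// Hypothetical helper, NOT part of ARC: it only illustrates the
// 'USER:PASS' (Basic) vs. 'USER::PASS' (Digest) convention.
function pick_credentials(array $credentials, string $url): ?array
{
    foreach ($credentials as $prefix => $value) {
        // Assumption: an entry applies to every URL that starts with its key.
        if (strpos($url, $prefix) !== 0) {
            continue;
        }
        // A double colon marks Digest Auth.
        if (strpos($value, '::') !== false) {
            list($user, $pass) = explode('::', $value, 2);
            return array('scheme' => 'digest', 'user' => $user, 'pass' => $pass);
        }
        // A single colon means Basic Auth.
        list($user, $pass) = explode(':', $value, 2);
        return array('scheme' => 'basic', 'user' => $user, 'pass' => $pass);
    }
    return null; // no matching entry: the request goes out without credentials
}
```

With the array from the $config above, pick_credentials($config['arc_reader_credentials'], 'http://twitter.com/statuses/update.xml') would return the 'basic' entry, and a URL below the Talis store key would return the 'digest' entry.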


Cheers
Fabio





Benjamin Nowack wrote:
> Hi Fabio,
>
> This is probably going to need some tweaks in the ARC core code and
> as I can't rebuild your setup here, I'd suggest we approach it step
> by step and see how far we get ;)
>
> So, let me see if I understand correctly:
> * the ARC-hosting server can access MySQL with the typical creds
>   (all fine here)
> * In order to access data, ARC needs to go through a 
>   password-protected proxy.
> * In order to LOAD RDF data, ARC would need to provide user/pass
>   credentials which differ from the proxy creds.
>
> Is that the setup, or am I missing something?
>
> In general, ARC's HTTP client supports proxies; you can specify a
> "proxy_host" and a "proxy_port" parameter. For auth stuff, there
> are separate configuration options. Right now, ARC supports Basic 
> Auth and, as of a few weeks ago, also Digest Auth. Proxy Auth would 
> still need to be added.
>
> Here is an example $config snippet from an app that needs Basic
> Auth to post to Twitter, and Digest Auth for a Talis platform store:
>
> ...
> 'arc_reader_credentials' => array(
>   'http://twitter.com/' => 'USER:PASS',
>   'http://api.talis.com/stores/STORENAME/meta' => 'USER::PASS',
> ),
> ...
>
> A double colon between USER and PASS tells the ARC Reader to be 
> prepared for Digest Auth; otherwise it will use Basic Auth. The
> URI keys tell ARC which URL patterns require authentication. I don't
> know how Proxy Auth works spec-wise, to be honest. Is it as
> simple as Basic Auth, or does it use a nonce and require a handshake?
> In the former case, I could maybe simply add the required headers
> if:
> * a "proxy_host" is specified
> * the "arc_reader_credentials" contain an entry that matches the
>   "proxy_host" value.
>
> Do you know of a simple example similar to [1] that illustrates 
> a proxy-plus-auth request? Then I could possibly extend the
> Reader to also support proxy authentication.
>
> Cheers,
> Benji
>
> [1] http://en.wikipedia.org/wiki/Basic_access_authentication
>
>
> On 19.01.2010 22:52:24, Fabio Ricci wrote:
>   
>> Hello everybody
>>
>> I started working with ARC today - thank you for the simplicity!
>> I could already load and query some RDF data LOCALLY on my laptop.
>>
>> BUT - when I deployed the whole package (ARC, my software) to another server 
>> in a DMZ, problems arose...
>>
>> The next problem is probably authorisation.
>>
>> On this server there is no need to authenticate to reach an internal MySQL 
>> database,
>> but requests have to go through a proxy server that requires user 
>> authentication,
>> and the RDF data situated on the server is protected by its own authentication 
>> (to access this data, one needs another user/password 
>> combination, which is different from the one for the proxy).
>>
>> My question: how does the $config array have to be configured in order to get 
>> through all these authentication steps?
>>
>> Thanks in advance
>> Fabio
>>
>>
>>
>>     
>
>
>
>   
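
On the question in the quoted mail above (is Proxy Auth as simple as Basic Auth?): with the Basic scheme there is no nonce. The proxy answers with "407 Proxy Authentication Required" plus a "Proxy-Authenticate" header, and the client repeats the request with a "Proxy-Authorization" header, encoded the same way as "Authorization" in [1]. A proxy may also ask for Digest, which does involve a nonce. Below is a minimal sketch of the Basic case only. The function name is hypothetical, and how the ARC Reader would attach the header is not covered here.

```php
<?php
// Hypothetical helper, NOT an ARC function: builds the request header
// for proxy authentication with the Basic scheme.
function proxy_basic_auth_header(string $user, string $pass): string
{
    // Same encoding as the Basic "Authorization" header: base64("user:pass").
    return 'Proxy-Authorization: Basic ' . base64_encode($user . ':' . $pass);
}

// Example with the well-known sample credentials from the Basic Auth spec:
echo proxy_basic_auth_header('Aladdin', 'open sesame'), PHP_EOL;
// Proxy-Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

The header is sent in addition to any "Authorization" header for the protected RDF resource itself, so the proxy credentials and the data credentials stay separate.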


""" ;
         ns1:returnPath "<fabio.ricci@ggaweb.ch>" ;
         ns1:xOriginalTo "arc-dev@semsol.org" ;
         ns1:deliveredTo "web11p1@p15192371.pureserver.info" ;
         ns1:received """from Jupiter.local (gprs01.swisscom-mobile.ch [193.247.250.1])
	by popeye1.ggamaur.net (8.13.7/8.13.7/Submit) with ESMTP id o0Q9UMF0003603
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for <arc-dev@semsol.org>; Tue, 26 Jan 2010 10:30:26 +0100 (CET)
	(envelope-from fabio.ricci@ggaweb.ch)""" ;
         ns1:messageID "<4B5EB628.1020307@ggaweb.ch>" ;
         ns1:date "Tue, 26 Jan 2010 10:30:16 +0100" ;
         ns1:from "Fabio Ricci <fabio.ricci@ggaweb.ch>" ;
         ns1:userAgent "Thunderbird 2.0.0.23 (Macintosh/20090812)" ;
         ns1:mIMEVersion "1.0" ;
         ns1:to "arc-dev <arc-dev@semsol.org>" ;
         ns1:subject """Re: [arc-dev] LOAD data - but data is an authentication protected
 file ... how 2 manage?""" ;
         ns1:references "<4B562998.80301@ggaweb.ch> <PM-GA.20100125123524.CD33E.3.1D@semsol.com>" ;
         ns1:inReplyTo "<PM-GA.20100125123524.CD33E.3.1D@semsol.com>" ;
         ns1:contentType '''multipart/alternative;
 boundary="------------050001010601050506030108"''' ;
         ns1:xScannedBy "MIMEDefang 2.64 on 213.160.40.60" ;
         ns1:xSpamCheckerVersion """SpamAssassin 2.64 (2004-01-11) on 
	p15192371.pureserver.info""" ;
         ns1:xSpamLevel "" ;
         ns1:xSpamStatus """No, hits=-0.7 required=5.0 tests=BAYES_20,HTML_FONTCOLOR_BLUE,
	HTML_FONTCOLOR_RED,HTML_MESSAGE,HTML_TITLE_EMPTY,RCVD_IN_SORBS_WEB 
	autolearn=no version=2.64