Information on resource 'purx Proxy Registry'

purx lets you register your services without having to run a full OAI-PMH endpoint, while still letting you control the VOResource XML programmatically.

What's this?

A proxy publishing registry – this lets you simply put up Virtual Observatory resource records on a plain webserver or generate them programmatically and then enroll their URLs at this service. We will then make sure the VO Registry will see your records. That's something you want because then your service will show up in popular VO clients like TOPCAT or Aladin.

How do I use it?

First, prepare a registry record. There are several scenarios:

  • DaCHS users: DaCHS can produce resource records for all your services and tables. The URL is:

    <base uri>/getRR/<rdid>/<resourcename>
    

    (as in .../getRR/__system__/tap/run for the TAP service); you can also see a link to that way down on service info pages under “VOResource XML”.

  • Other toolkits: Perhaps your toolkit can produce VOResource already. In particular, if you're publishing through TAP, you already have the difficult pieces, capabilities and tables, as they are available on your VOSI endpoints (for TAP 1.0, that's the .../capabilities and .../tables children of your access URL). You can simply include that XML in the registry record as described below, adjusting the root elements (which can safely be done using regular expressions). Toolkit authors: Let us know what to write here.

  • Write registry records from scratch
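
As a small illustration of the URL patterns mentioned in the list above, here is a minimal Python sketch. The function names are made up for this example, and http://dc.example.org is a placeholder, not a real service; only the path layout is taken from the text.

```python
def getrr_url(base_uri: str, rd_id: str, resource_name: str) -> str:
    """Build <base uri>/getRR/<rdid>/<resourcename> for a DaCHS resource."""
    return f"{base_uri.rstrip('/')}/getRR/{rd_id.strip('/')}/{resource_name}"


def vosi_urls(access_url: str) -> dict:
    """Return the TAP 1.0 VOSI children (capabilities, tables) of an access URL."""
    root = access_url.rstrip("/")
    return {"capabilities": f"{root}/capabilities", "tables": f"{root}/tables"}


if __name__ == "__main__":
    # Placeholder host; the rd id and resource name are the TAP example from the text.
    print(getrr_url("http://dc.example.org", "__system__/tap", "run"))
    print(vosi_urls("http://dc.example.org/tap"))
```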

Unless your toolkit already gives you a URL, make the registry record built in this way available under an http(s) URL (http is preferred because there's less that can fail).

It is highly recommended to make sure the file is being served with last-modified headers and support for HTTP if-modified-since (that's the case if you use a plain file on a capable web server like Apache). That way, our checks for updates (which we do every 80 ks, i.e. roughly every 22 hours) will consume almost no resources at all.
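
To see what such a conditional request looks like from the client side, here is a minimal sketch using only the Python standard library. This is not purx's actual code; the function names are invented for illustration. You could use something like it to check that your server answers 304 Not Modified when the file has not changed.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import format_datetime


def conditional_request(url, last_seen):
    """Build a GET request carrying If-Modified-Since.

    last_seen must be a timezone-aware datetime in UTC.
    """
    request = urllib.request.Request(url)
    request.add_header("If-Modified-Since", format_datetime(last_seen, usegmt=True))
    return request


def fetch_if_modified(url, last_seen):
    """Return the document bytes, or None if the server answers 304 Not Modified."""
    try:
        with urllib.request.urlopen(conditional_request(url, last_seen), timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None  # unchanged since last_seen; nothing was transferred
        raise
```

If fetch_if_modified returns None for a timestamp later than your last edit, your server supports if-modified-since and the periodic checks will be cheap.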

The system will work even without last-modified. You must, however, manage the updated attribute on your ri:Resource element. purx will not update the record unless that date changes.
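
If you generate or edit the record with a script, bumping the updated attribute can be automated. The following is a rough sketch under several assumptions: the root element is written with the ri: prefix, the attribute already exists, and an ISO 8601 UTC timestamp with a trailing Z is acceptable (check this against the VOResource schema). It deliberately uses a regular expression on the text rather than an XML library, so that namespace prefixes in the file stay exactly as they are.

```python
import re
from datetime import datetime, timezone

# Matches the start tag of ri:Resource up to and including updated=" (assumed prefix: ri).
UPDATED_ATTR = re.compile(r'(<ri:Resource\b[^>]*?\supdated=")[^"]*(")')


def bump_updated(record_xml, now=None):
    """Return record_xml with the updated attribute of ri:Resource set to now (UTC).

    now, if given, must be a timezone-aware datetime.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    new_xml, count = UPDATED_ATTR.subn(
        lambda match: match.group(1) + stamp + match.group(2), record_xml, count=1
    )
    if count != 1:
        raise ValueError("no updated attribute found on an ri:Resource start tag")
    return new_xml
```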

Then submit the URL to the purx enrollment service. This will validate your resource record and, if everything checks out, generate an IVOA identifier for it. It will then send a mail to the address given as contact in the resource record with something like an activation URL (which is valid for at least 150 ks, i.e. about 42 hours). Once that URL is retrieved, you should see your record in the common registries within a day or so (these first need to harvest the record from purx's OAI-PMH interface; the Registry works by pull rather than push).

You can always check what purx knows about your service by entering your access URL at the purx status service.

What Identifier Will I Get?

One of the big advantages of purx is that you don't have to think of an authority and claim it. The downside is that we will assign the identifier. We're trying to be accommodating, though.

Just set the identifier element with an arbitrary (ignored by us) authority – ivo://ignored/generic/service, say. We will then take the path part (generic/service in this case; obviously, you should pick something descriptive here) and glue that together with the authority. That gives ivo://purx/generic/service, which would be your identifier.

If that identifier is already taken by someone else, we take the longest element of the host part of the URL we got the XML from, and stick that right behind the authority. For instance, if we got the document from http://ari.uni-heidelberg.de/doc.xml, the resulting identifier would be ivo://purx/uni-heidelberg/generic/service.

If that still clashes, we try to disambiguate by appending numbers, but I'd suspect somebody is trolling us if that happens.
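
The rules above can be sketched in a few lines of Python. This is an illustration of the described procedure, not purx's actual code: the function name, the taken set, and in particular the way numbers are appended in the last step are assumptions, since the text does not spell them out.

```python
from urllib.parse import urlsplit

AUTHORITY = "purx"


def mint_identifier(declared_ivoid, source_url, taken):
    """Derive an ivoid from the declared identifier and the URL the XML came from.

    taken is the set of ivoids that are already assigned.
    """
    # Step 1: ignore the declared authority, keep only the path part.
    path = urlsplit(declared_ivoid).path.strip("/")
    candidate = f"ivo://{AUTHORITY}/{path}"
    if candidate not in taken:
        return candidate

    # Step 2: insert the longest element of the host name after the authority.
    host = urlsplit(source_url).hostname or ""
    longest = max(host.split("."), key=len)
    candidate = f"ivo://{AUTHORITY}/{longest}/{path}"
    if candidate not in taken:
        return candidate

    # Step 3 (numbering scheme assumed): append the first free number.
    counter = 1
    while f"{candidate}{counter}" in taken:
        counter += 1
    return f"{candidate}{counter}"


if __name__ == "__main__":
    print(mint_identifier(
        "ivo://ignored/generic/service",
        "http://ari.uni-heidelberg.de/doc.xml",
        {"ivo://purx/generic/service"},
    ))  # prints ivo://purx/uni-heidelberg/generic/service
```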

Also note that the URL-ivoid relationship is fixed once the identifier is minted. Even if you change the identifier at your end, the identifier assigned by purx will remain.

Getting Out of purx

Purx will regularly re-retrieve your file. If that fails or if the document becomes invalid, purx will, after a couple of weeks, drop your service record (technically, it will henceforth publish a “deleted record” for it). It will send two mails explaining the situation to the contact person given in the record that it last saw.

If you want to immediately get rid of the record, just arrange for your webserver to return a 403 Forbidden HTTP status code to purx.

You can always resuscitate your record by resubmitting it to the purx enrollment service. It will receive the ivoid it had before.

Write Registry Records from Scratch

That's a bit of work, in particular if you want to provide table metadata (which is highly recommended). Before starting you should have the following specs handy (in the sense of: searchable; don't even try to actually read them):

  • VOResource (basic definitions)
  • VODataService (definitions of services and table metadata elements)
  • SimpleDALRegExt (metadata for cone search and S*AP; the equivalent for TAP is TAPRegExt, but you don't need that because your TAP server already spits out ready-made content for that from capabilities).

You can use these to look up explanations for elements you don't understand in the sample records. There's also http://docs.g-vo.org/schemadoc/, giving a javadoc-like reference for the schema files used in the VO.

Now figure out which kind of service (Browser, SCS, TAP, SIAP...) you'd like to publish and grab a matching sample record below. Format it with whatever tool you prefer (our recommendation: xmlstarlet) and then just edit it, replacing content as appropriate. Quite a few elements can also be deleted if you absolutely cannot find something to put in there. Plan for a couple of rounds of validation (any XSD validator should do, or just use purx, though we're not putting any effort into presenting the diagnostics nicely yet).
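
Before going through full XSD validation, a few cheap sanity checks can catch the most common slips. The sketch below uses only the Python standard library and is no substitute for a real validator. It assumes that the child elements of the record (identifier, curation, contact, email) are written without a namespace prefix, as in typical VOResource records; the function name and the selection of checks are ours.

```python
import xml.etree.ElementTree as ET


def preflight(record_xml):
    """Return a list of problems found; an empty list means the cheap checks passed."""
    try:
        root = ET.fromstring(record_xml)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]

    problems = []
    # ElementTree writes namespaced tags as {namespace-uri}localname.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "Resource":
        problems.append(f"root element is {local_name}, expected Resource")
    if not root.get("updated"):
        problems.append("root element has no updated attribute")
    identifier = (root.findtext("identifier") or "").strip()
    if not identifier.startswith("ivo://"):
        problems.append("identifier is missing or does not start with ivo://")
    # The activation mail goes to the contact address, so make sure there is one.
    if not (root.findtext("curation/contact/email") or "").strip():
        problems.append("no curation/contact/email found")
    return problems
```

Call preflight with the text of your record and fix whatever it reports before handing the URL to a proper validator or to purx.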

Oh, don't sweat the VOSI interfaces that you'll see in the sample records. If you don't have them, just remove the corresponding capabilities; most of the records below also have “auxiliary” capabilities (the standardID attribute has an aux in the fragment part). Unless you know that you want such capability declarations, just remove them.

Here are links to some sample records:
