Table information for 'purx.sources'

General

Table Description: A table of source URLs and their metadata.

This table is not available for ADQL queries or through the TAP endpoint.

Resource Description: purx lets you register your services without having to run a full OAI-PMH endpoint, while still letting you programmatically control the VOResource XML.

For a list of all services and tables belonging to this table's resource, see Information on resource 'purx Proxy Registry'

Citing this table

To cite the table as such, we suggest the following BibTeX entry:

@MISC{vo:purx_sources,
  year=2017,
  title={purx Proxy Registry},
  author={Demleitner, M.},
  url={http://dc.zah.uni-heidelberg.de/tableinfo/purx.sources},
  howpublished={{VO} resource provided by the {GAVO} Data Center}
}

Resource Documentation

What's this?

A proxy publishing registry – this lets you simply put up Virtual Observatory resource records on a plain webserver or generate them programmatically and then enroll their URLs at this service. We will then make sure the VO Registry will see your records. That's something you want because then your service will show up in popular VO clients like TOPCAT or Aladin.

How do I use it?

First, prepare a registry record. There are various different scenarios:

Unless your toolkit already gives you a URL, make the registry record built in this way available under an http(s) URL (http is preferred because there's less that can fail).

It is highly recommended to make sure the file is served with Last-Modified headers and support for HTTP If-Modified-Since (that's the case if you use a plain file on a capable web server like Apache). That way, our checks for updates (which we do every 80 ks, i.e., roughly every 22 hours) will consume almost no resources at all.
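To illustrate why conditional requests are so cheap, here is a minimal sketch of how a harvester could turn a stored modification date into an If-Modified-Since request. This is not purx's actual code; the function names and the choice of Python's standard library are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def conditional_headers(last_modified_epoch):
    """Build headers for a conditional GET from a stored Unix timestamp."""
    # HTTP dates are always expressed in GMT.
    return {"If-Modified-Since": formatdate(last_modified_epoch, usegmt=True)}


def fetch_if_changed(url, last_modified_epoch):
    """Return the new body, or None if the server answers 304 Not Modified.

    Hypothetical helper: a 304 response carries no body, which is why
    checking an unchanged file costs almost nothing on either side.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(last_modified_epoch))
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None
        raise
```

A server that does not support If-Modified-Since will simply answer 200 with the full document each time, which still works but transfers the whole record on every check.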

The system will work even without Last-Modified. You must, however, manage the updated attribute on your ri:Resource element. purx will not update the record unless that date is updated.
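As a rough sketch of what "managing the updated attribute" means in practice, the following compares the updated attribute of a record against the date seen at the previous harvest. The sample record, the function names, and the assumption that the timestamps are plain ISO strings in UTC are all illustrative; purx's real logic may differ.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def record_updated(xml_text):
    """Return the 'updated' attribute of the root element as a datetime."""
    root = ET.fromstring(xml_text)
    stamp = root.attrib["updated"]
    # Timestamps are treated as UTC here; drop an optional trailing "Z" so
    # that datetime.fromisoformat accepts them on older Python versions too.
    return datetime.fromisoformat(stamp.rstrip("Z"))


def needs_republication(xml_text, last_seen_updated):
    """True if the record claims to be newer than what was last harvested."""
    return record_updated(xml_text) > last_seen_updated


# Hypothetical, heavily abbreviated record (not valid VOResource as it stands).
SAMPLE_RECORD = """<ri:Resource
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
    created="2017-01-10T09:00:00" updated="2017-03-01T12:00:00"
    status="active">
  <title>Hypothetical sample service</title>
</ri:Resource>"""
```

With this sample, a harvester that last saw an updated value of 2017-01-10 would re-publish, while one that already saw 2017-03-01T12:00:00 would not. The practical consequence: if you edit the record but forget to bump updated, the change stays invisible.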

Then submit the URL to the purx enrollment service. This will validate your resource record and, if all checks out, generate an IVOA identifier for it. It will then send a mail to the address given as contact in the resource record with something like an activation URL (which is valid for at least 150 ks, i.e., a bit under two days). Once that URL is retrieved, you should see your record in the common registries within a day or so (these first need to harvest purx's OAI-PMH interface; the Registry works by pull rather than push).

You can always check the idea purx has of your service by entering your access URL at the purx status service.

What Identifier Will I Get?

One of the big advantages of purx is that you don't have to think of an authority and claim it. The downside is that we will assign the identifier. We're trying to be accommodating, though.

Just set the identifier element with an arbitrary (ignored by us) authority – ivo://ignored/generic/service, say. We will then take the path part (generic/service in this case; obviously, you should pick something descriptive here) and glue that together with the authority. That gives ivo://purx/generic/service, which would be your identifier.

If that identifier is already taken by someone else, we take the longest element of the host part of the URL we got the XML from, and stick that right behind the authority. For instance, if we got the document from http://ari.uni-heidelberg.de/doc.xml, the resulting identifier would be ivo://purx/uni-heidelberg/generic/service.

If that still clashes, we try to disambiguate by appending numbers, but I'd suspect somebody is trolling us if that happens.
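The rules above can be sketched as a small function. This is an illustration of the described procedure, not purx's implementation: the function name, the taken set, the tie-breaking when two host elements have the same length, and the exact format of the appended number are assumptions.

```python
from urllib.parse import urlsplit

AUTHORITY = "purx"


def mint_ivoid(declared_id, source_url, taken):
    """Propose an identifier following the rules described above.

    declared_id -- identifier from the record; its authority is ignored
    source_url  -- URL the record was retrieved from
    taken       -- set of identifiers that are already assigned
    """
    path = urlsplit(declared_id).path.strip("/")
    candidate = f"ivo://{AUTHORITY}/{path}"
    if candidate not in taken:
        return candidate

    # Clash: insert the longest dot-separated element of the host name.
    host = urlsplit(source_url).hostname
    longest = max(host.split("."), key=len)
    base = f"ivo://{AUTHORITY}/{longest}/{path}"
    if base not in taken:
        return base

    # Still a clash: append a number (the exact format is an assumption).
    counter = 1
    while f"{base}{counter}" in taken:
        counter += 1
    return f"{base}{counter}"
```

Called with ivo://ignored/generic/service, http://ari.uni-heidelberg.de/doc.xml and an empty taken set, this yields ivo://purx/generic/service; if that one is already in taken, it yields ivo://purx/uni-heidelberg/generic/service, matching the examples in the text.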

Also note that the URL-ivoid relationship is fixed once the identifier is minted. Even if you change the identifier at your end, the identifier assigned by purx will remain.

Getting Out of purx

Purx will regularly re-retrieve your file. If that fails or if the document becomes invalid, purx will, after a couple of weeks, drop your service record (technically, it will henceforth publish a „deleted record” for it). It will send two mails explaining the situation to the contact person given in the record that it last saw.

If you want to immediately get rid of the record, just arrange for your webserver to return a 403 Forbidden HTTP status code to purx.
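To make the lifecycle a bit more tangible, here is a sketch of how the status values from the column table (PENDING, OK, FAILn, DROPPED) might evolve from one re-retrieval to the next. Everything beyond the text is an assumption: "a couple of weeks" is taken to mean 14 days, the failure limit is derived from that and the 80 ks check interval, and PENDING and DROPPED records are assumed to change only through (re-)enrollment.

```python
import re

CHECK_INTERVAL_S = 80_000       # from the text: checks happen every 80 ks
GRACE_PERIOD_S = 14 * 86_400    # assumption: "a couple of weeks" = 14 days
# Number of consecutive failed checks fitting into the grace period (15).
MAX_FAILURES = GRACE_PERIOD_S // CHECK_INTERVAL_S


def next_status(current, http_status, document_valid=True):
    """Return the status a record might move to after one re-retrieval."""
    if current in ("PENDING", "DROPPED"):
        # Assumption: these only change through (re-)enrollment.
        return current
    if http_status == 403:
        # From the text: a 403 Forbidden drops the record immediately.
        return "DROPPED"
    if http_status in (200, 304) and document_valid:
        return "OK"
    # Any other outcome counts as one more consecutive failure.
    match = re.fullmatch(r"FAIL(\d+)", current)
    failures = int(match.group(1)) + 1 if match else 1
    return "DROPPED" if failures >= MAX_FAILURES else f"FAIL{failures}"
```

For example, next_status("OK", 500) gives FAIL1, next_status("FAIL3", 404) gives FAIL4, a successful retrieval from any FAILn state returns to OK, and next_status("OK", 403) gives DROPPED right away.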

You can always resuscitate your record by resubmitting it to the purx enrollment service. It will receive the ivoid it had before.

Write Registry Records from Scratch

That's a bit of work, in particular if you want to provide table metadata (which is highly recommended). Before starting you should have the following specs handy (in the sense of: searchable; don't even try to actually read them):

You can use these to look up explanations for elements you don't understand in the sample records. There's also http://docs.g-vo.org/schemadoc/, giving a javadoc-like reference for the schema files used in the VO.

Now figure out which kind of service (Browser, SCS, TAP, SIAP...) you'd like to publish and grab a matching sample record below. Format it with whatever tool you prefer (our recommendation: xmlstarlet) and then just edit it, replacing content as appropriate. Quite a few elements can also be deleted if you absolutely cannot find something to put in there. Plan for a couple of rounds of validation (any XSD validator should do, or just use purx, though we're not putting any effort into a nice presentation of the diagnostics yet).

Oh, don't sweat the VOSI interfaces that you'll see in the sample records. If you don't have them, just remove the corresponding capabilities; most of the records below also have „auxiliary” capabilities (the standardID attribute has an aux in the fragment part). Unless you know that you want such capability declarations, just remove them.
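If you would rather script that clean-up than do it by hand, something like the following sketch could work. It assumes the capability elements are direct, unqualified children of the resource element; the function name and the abbreviated sample record are made up for illustration.

```python
import xml.etree.ElementTree as ET


def strip_aux_capabilities(root):
    """Remove capability children whose standardID fragment contains 'aux'.

    Returns the number of removed elements.
    """
    removed = 0
    # findall returns a list, so removing while looping is safe.
    for capability in root.findall("capability"):
        standard_id = capability.get("standardID", "")
        _, _, fragment = standard_id.partition("#")
        if "aux" in fragment:
            root.remove(capability)
            removed += 1
    return removed


# Hypothetical, heavily abbreviated record (not valid VOResource as it stands).
SAMPLE_WITH_AUX = """<ri:Resource
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
  <title>Hypothetical sample service</title>
  <capability standardID="ivo://ivoa.net/std/SIA"/>
  <capability standardID="ivo://ivoa.net/std/TAP#aux"/>
</ri:Resource>"""

resource = ET.fromstring(SAMPLE_WITH_AUX)
removed_count = strip_aux_capabilities(resource)
```

One caveat if you serialize the result with ElementTree: it may rename namespace prefixes and drop declarations that are only used inside attribute values such as xsi:type, so run the output through a validator before publishing it.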

Here are links to some sample records:

Columns

Sorted by DB column index.

Name | Table Head | Description | Unit | UCD
source_url | Source URL | URL to pull the VOResource from | N/A | meta.ref.url
rectimestamp | Updated | Date of harvest of xml_source (compare against this on OAI-PMH requests). | N/A | time.epoch
upstream_update | Upstream Update | Date updated according to the upstream record (compare against this when determining if re-publication is necessary). | N/A | time.epoch
modification_date | Modified | Date in upstream Modified header (use this to create HTTP if-modified-since headers) | s | time.epoch
ivoid | ivoid | IVOA identifier allocated | N/A | meta.ref.ivoid
status | Status | Status of record; one of PENDING, OK, FAILn (n number of consecutive failures when trying to retrieve data), DROPPED. | N/A | meta.code
access_code | Activation | Activation code to move record from PENDING to OK. | N/A | N/A
contact_email | Contact | Last known contact e-mail (used when something goes wrong). | N/A | N/A
title | Title | Title of the resource as given in VOResource. | N/A | N/A
xml_source | Source | Resource record in utf-8 ready for inclusion into OAI-PMH | N/A | N/A


Other

The following services may use the data contained in this table:

VOResource

VO nerds may sometimes need VOResource XML for this table.