Basic Interaction Model
All interactions with a Trove database are mediated through HTML pages
viewed in a Web browser.
Users
Users (people looking for packages that match their requirements to
download) are presented with a search form. The search form allows
them to enter keyword search terms. The keywords may be selected with
buttons from a controlled vocabulary defined by site policy, or entered
as `roll-your-owns' in a text field.
A search would yield all targets in the intersection of the
controlled-keyword hits (that is, packages matching every selected
controlled keyword), unioned with all hits from a search for
roll-your-own keywords in package text descriptions. There is a
more detailed proposal for the handling of controlled-vocabulary keywords.
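As a minimal sketch of this rule in Python (the language planned for the
Trove CGIs), the function below computes the two groups of hits that a
catalog listing would show. The package names, the dictionary layout, and
the choice to count a free-text hit when any one roll-your-own keyword
appears in the description are assumptions made for illustration; none of
them is fixed by the design.

```python
# Invented sample data: name -> controlled keywords and free-text description.
PACKAGES = {
    "mailfetch": {"keywords": {"mail", "network"},
                  "description": "Remote mail retrieval utility"},
    "textmail": {"keywords": {"mail", "console"},
                 "description": "Small text-mode mail reader"},
    "netgrab": {"keywords": {"network"},
                "description": "Retrieve files over HTTP and FTP"},
}


def search(packages, controlled, free_text):
    """Return controlled-keyword hits and free-text hits as two sorted lists."""
    wanted = set(controlled)
    # Intersection: a package must carry every selected controlled keyword.
    controlled_hits = set()
    if wanted:
        controlled_hits = {name for name, pkg in packages.items()
                           if wanted <= pkg["keywords"]}
    # Assumption: a description hit needs only one roll-your-own keyword.
    words = [w.lower() for w in free_text]
    free_hits = {name for name, pkg in packages.items()
                 if any(w in pkg["description"].lower() for w in words)}
    # Union of the two groups, kept apart so the catalog can label each section.
    return {"controlled": sorted(controlled_hits),
            "free_text": sorted(free_hits - controlled_hits)}
```

For example, search(PACKAGES, ["mail", "network"], ["ftp"]) puts mailfetch
in the controlled-keyword section and netgrab in the free-text section.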
The result of a search is a generated HTML catalog
listing. The body of a catalog listing consists of a series
of one-line entries each beginning with a package-name hotlink and
including a one-line package summary. The catalog has section headers
indicating which lines are controlled-keyword hits and which are
free-text hits.
Users looking at a catalog listing may either refine the search or
look at individual entries that interest them (by chasing the
package-name hotlinks). An individual entry displays all package
metadata contained in the Trove database, possibly including resource
links to a local cache of package resource files.
When an individual entry is selected, a user may take one of several
actions:
-
Chase a resource hotlink on the package metadata display (such as the
package home page URL, or a mailto URL for the package contact
person).
-
Download package resources (e.g. by chasing FTP hotlinks on the
package metadata display).
-
Subscribe to or unsubscribe from the package's heads-up list (that is,
the list of people automatically notified by email whenever package
metadata or resources are changed). Unauthenticated users will not be
allowed to unsubscribe; this is to prevent bad guys from masquerading
as good guys in order to suppress notifications.
-
Attach a review annotation to the package. (This is a future feature
and has not yet been designed into the database schema.)
Contributors
Contributors (people updating package metadata or uploading new
associated resources such as source tarballs) use a Web form to
create or edit the metadata, and possibly to upload
package resource copies to the site's local FTP cache.
Description fields are interpreted according to the following rules:
Text is plain text. Paragraphs are separated by one or more blank
lines. No HTML tags are recognized; >, & and < mean
themselves. Normal paragraphs are word-filled. Indented text is
treated as-is and converted to <PRE>...</PRE> in HTML
(tabs should be expanded to spaces here). A single word between
*asterisks* means <b>bold</b> and a single word in
_underscores_ means <i>italics</i> (even in indented
text). Any text that looks sufficiently like a URL
(e.g. http://www.python.org) is turned into a hyperlink with an
<A...>...</A> tag pair (even in indented text).
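The rules above could be implemented roughly as follows. This is a sketch
under stated assumptions, not the Trove renderer: the function names, the
<P> wrapper for normal paragraphs, the 72-column fill width, the URL
pattern, and the rule that a paragraph counts as indented when its first
line starts with a space are all choices made here for illustration.

```python
import html
import re
import textwrap

# Assumed notion of "looks sufficiently like a URL"; trailing punctuation is left out.
URL_RE = re.compile(r"((?:https?|ftp)://[^\s<>\"]*[\w/])")
BOLD_RE = re.compile(r"(?<!\w)\*(\w+)\*(?!\w)")       # a single word in *asterisks*
ITALIC_RE = re.compile(r"(?<!\w)_([^\W_]+)_(?!\w)")   # a single word in _underscores_


def split_paragraphs(text):
    """Group non-blank lines into paragraphs separated by blank lines."""
    paragraphs, current = [], []
    for line in text.expandtabs().splitlines():   # tabs become spaces
        if line.strip():
            current.append(line.rstrip())
        elif current:
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


def render_inline(raw):
    """Escape >, & and <, then apply bold, italics, and hyperlinks."""
    out = []
    # Splitting on a capturing group leaves the URLs at the odd indices.
    for i, piece in enumerate(URL_RE.split(raw)):
        if i % 2:
            url = html.escape(piece, quote=True)
            out.append('<A HREF="%s">%s</A>' % (url, url))
        else:
            piece = html.escape(piece, quote=False)
            piece = BOLD_RE.sub(r"<b>\1</b>", piece)
            piece = ITALIC_RE.sub(r"<i>\1</i>", piece)
            out.append(piece)
    return "".join(out)


def render_description(text, width=72):
    """Convert a plain-text description field to an HTML fragment."""
    blocks = []
    for lines in split_paragraphs(text):
        if lines[0].startswith(" "):
            # Indented text is kept as-is inside <PRE>.
            blocks.append("<PRE>%s</PRE>" % render_inline("\n".join(lines)))
        else:
            # Normal paragraphs are word-filled; long URLs are never broken.
            filled = textwrap.fill(" ".join(l.strip() for l in lines), width,
                                   break_long_words=False, break_on_hyphens=False)
            blocks.append("<P>%s</P>" % render_inline(filled))
    return "\n".join(blocks)
```

URLs are located in the raw text before escaping so that an escaped
character next to a URL cannot be mistaken for part of it, and so that
underscores inside a URL are not turned into italics.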
Administrators
Trove site administrators can use a web form to view a catalog of
recently added entries, and delete or modify them if there appears to
be some problem.
Administrators are also responsible for watching logs of roll-your-own
keyword entries and noticing when keywords should be migrated into the
core set described in site policy.
Security and Authentication
There are two levels of protection in the Trove design. Which one
applies depends on whether a contributor is authenticated.
How to Authenticate Users
To be authenticated, a user must first register a PGP public key with a
Trove site. The user then asks Trove to issue a challenge and returns
the challenge encrypted (in effect, signed) with the matching private
key.
On success, Trove issues a timed cookie to the user. While
the cookie remains valid, the user is authenticated.
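The exchange might be sketched as below. The PGP check itself is left as a
caller-supplied verify function, since no particular PGP tool is named in
this design; the cookie format (user name, expiry time, and an HMAC under a
site secret), the one-hour lifetime, and all function names are assumptions
made for the sake of a runnable example.

```python
import hashlib
import hmac
import secrets
import time

SERVER_SECRET = secrets.token_bytes(32)   # hypothetical per-site secret
COOKIE_LIFETIME = 3600                    # seconds; the lifetime is an assumption

_pending = {}                             # user -> outstanding challenge


def issue_challenge(user):
    """Create and remember a fresh random challenge for this user."""
    challenge = secrets.token_hex(16)
    _pending[user] = challenge
    return challenge


def _mac(payload):
    return hmac.new(SERVER_SECRET, payload.encode(), hashlib.sha256).hexdigest()


def authenticate(user, response, verify, now=None):
    """Return a timed cookie if verify(user, challenge, response) accepts.

    verify stands in for checking the response against the user's
    registered PGP public key. Each challenge can be used only once.
    """
    challenge = _pending.pop(user, None)
    if challenge is None or not verify(user, challenge, response):
        return None
    expires = int((time.time() if now is None else now) + COOKIE_LIFETIME)
    payload = "%s|%d" % (user, expires)
    return "%s|%s" % (payload, _mac(payload))


def cookie_user(cookie, now=None):
    """Return the authenticated user, or None if the cookie is bad or expired."""
    try:
        user, expires, mac = cookie.rsplit("|", 2)
        expires = int(expires)
    except ValueError:
        return None
    expected = _mac("%s|%d" % (user, expires))
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return None
    if (time.time() if now is None else now) > expires:
        return None
    return user
```

Signing the cookie lets the site check its validity without storing session
state; a server-side session table would serve equally well.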
Security through Visibility
The contributor who creates a package entry, and anyone who changes
the package metadata or resources after the fact, will be put on the
package's heads-up list. Every time the package metadata is modified
after that, everybody on the heads-up list (now including the updating
contributor) will be notified.
The intent of this feature (and the prohibition on unsubscribing from
a heads-up list unless you're validated) is to make sure that all
metadata and resource changes are visible to everybody with a stake
in the package. In particular, any modifications an unauthorized
person succeeds in making will be visible to the real package owners.
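The bookkeeping described in this section is small enough to sketch
directly. The class name, its methods, and the send callback (a stand-in
for whatever actually delivers the email) are hypothetical; only the
behavior follows the text above.

```python
class HeadsUpList:
    """Heads-up list for one package; send(address, message) delivers mail."""

    def __init__(self, creator, send):
        self.members = [creator]      # the creator is on the list from the start
        self.send = send

    def subscribe(self, address):
        if address not in self.members:
            self.members.append(address)

    def unsubscribe(self, address, authenticated):
        # Only authenticated users may leave, so that an intruder cannot
        # silence notifications by unsubscribing the real owners.
        if not authenticated:
            raise PermissionError("must be authenticated to unsubscribe")
        if address in self.members:
            self.members.remove(address)

    def record_change(self, contributor, summary):
        # Whoever made the change joins the list, then everybody is notified.
        self.subscribe(contributor)
        for address in self.members:
            self.send(address, summary)
```

Because the contributor is added before notifications go out, an
unauthorized change always reaches the people already on the list.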
Security through Authentication
Either a resource or a package may be locked. When an
item (resource or package) is locked, you must be validated as
an owner to modify it.
Here are the rules of ownership:
-
The keeper of an item is the person who can add and delete
owners.
-
The keeper of an item is automatically an owner of the item.
-
The person who creates an item is its first keeper.
-
The keeper may pass the keeper role to another validated user.
-
Any owner of an item (package or resource) can modify or delete the item.
-
The owners of a package may delete associated resources.
-
The owners of a package can modify its sticky bit:
-
If the sticky bit is off, anyone can attach resources to a package.
-
If the sticky bit is on, only owners of the package can attach
resources to the package. (Owners of other attached resources
have no automatic privilege.)
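One way to read these rules is as the small permission model below. It is
a sketch, not the Trove schema: the class and method names are invented,
users are plain strings assumed to be authenticated already, and two points
the rules leave open are settled by assumption (a keeper who passes on the
role stays an owner, and a resource's own owners may delete it from a
package).

```python
class Item:
    """A package or resource with a keeper, a set of owners, and a lock."""

    def __init__(self, creator, locked=False):
        self.keeper = creator          # the creator is the first keeper
        self.owners = {creator}        # the keeper is automatically an owner
        self.locked = locked

    def _require_keeper(self, actor):
        if actor != self.keeper:
            raise PermissionError("only the keeper may do this")

    def add_owner(self, actor, user):
        self._require_keeper(actor)
        self.owners.add(user)

    def remove_owner(self, actor, user):
        self._require_keeper(actor)
        if user != self.keeper:        # the keeper always remains an owner
            self.owners.discard(user)

    def pass_keeper(self, actor, new_keeper):
        self._require_keeper(actor)
        self.keeper = new_keeper
        self.owners.add(new_keeper)    # assumption: the old keeper stays an owner

    def can_modify(self, user):
        # Unlocked items rely on visibility alone; locked items need an owner.
        return not self.locked or user in self.owners


class Package(Item):
    def __init__(self, creator, locked=False):
        super().__init__(creator, locked)
        self.sticky = False
        self.resources = []

    def set_sticky(self, actor, value):
        if actor not in self.owners:
            raise PermissionError("only owners may change the sticky bit")
        self.sticky = value

    def attach(self, user, resource):
        if self.sticky and user not in self.owners:
            raise PermissionError("sticky package: only owners may attach")
        self.resources.append(resource)

    def can_delete_resource(self, user, resource):
        # Package owners may delete attached resources; so may resource owners.
        return user in self.owners or user in resource.owners
```

Note that owning a resource attached to a sticky package grants no right to
attach further resources: attach checks only the package's owners.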
Package Authentication
To be specified. Perhaps base it on the
JAR approach suggested by Jeremy Hylton?
Sketch of Implementation
How will all this be done? Essentially, by replacing the
meta-information now stored in LSMs with a Web-accessible database.
A Trove site would consist of two parts:
-
The metadata. Metadata (defined by the
schema) would be stored in a database.
The metadata would include pointers to...
-
The resources. Resources are files that live in a site-local
FTP tree. Some will be created by contributors; some will be generated
by the Trove code itself (for example, each time the metadata is
changed it will pre-generate the HTML so that subsequent metadata
display is faster).
The CGIs constituting the web-accessible front end of the database
will be written in Python. A major open issue is what
database to use as the back end.
Eric S. Raymond <esr@snark.thyrsus.com>