Becoming a ProtoGENI Site

This page describes the process of integrating a working Emulab installation into the ProtoGENI federation. Many of the pieces are already in place, and we are working on making the process entirely scripted. When that process is ready for use, we will post the detailed set of instructions here. For now, the following notes describe what the process will look like, as well as some of the requirements.

Notes as of November 24, 2008

Emulab Software

You must update to the latest Emulab software release. Unfortunately, that requires your boss and ops to be updated to FreeBSD 6.3. Our new Trac wiki has extensive upgrade and installation instructions.

The ProtoGENI subsystem has its own installation script that creates various certificates and registers them at the clearinghouse. Local resources are also registered at the clearinghouse. The script also installs some new ports and modifies some configuration files. It's mostly automatic, with a few minor tasks left to the testbed administrator, and takes about 5 minutes.

Trust Model

The trust model is a web of trusted roots. Each site has the complete set of root certificates from all of the members of the federation. The bottom line is that anyone in the federation implicitly trusts any certificate (or credential) from anyone else in the federation. Rights to do things are defined in the actual credentials. As new sites are added, a daemon (not yet implemented) running on each boss will download a new set of root certificates.
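Since the daemon does not exist yet, the following is only a sketch of what its merge step might look like, assuming the root certificates arrive as PEM text. The function names are hypothetical, and the fingerprint here is a simple hash of the PEM text used for de-duplication, not a real X.509 fingerprint.

```python
import hashlib


def fingerprint(pem_text):
    """Hash the whitespace-normalized PEM text; used only as a dedup key."""
    normalized = "\n".join(
        line.strip() for line in pem_text.strip().splitlines()
    )
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def merge_roots(local_roots, downloaded_roots):
    """Return (merged, added): the combined root set and the new entries."""
    merged = {fingerprint(pem): pem for pem in local_roots}
    added = []
    for pem in downloaded_roots:
        key = fingerprint(pem)
        if key not in merged:
            merged[key] = pem
            added.append(pem)
    return list(merged.values()), added
```

The merged list could then be written out as a single bundle file for the XMLRPC servers to use as their set of trusted roots. How the download itself is authenticated is left open here.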

Interface to ProtoGENI

Each boss runs two trusted XMLRPC servers. One implements the Slice Authority (SA) API and the other implements the Component Manager (CM) API. The CM is actually an aggregate component manager.

Each user who wants to use the GENI interfaces creates a password-protected SSL certificate via the Emulab web interface. Note that only registered Emulab users at one of the federation sites can use the GENI APIs. You then use your favorite XMLRPC client to talk to the servers; we prefer Python because it makes writing a client program really easy. The source code includes a number of test programs that demonstrate how to do this and how to use the APIs.
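As a rough illustration of the client side, here is a minimal sketch in Python 3 using only the standard library. It is not taken from the ProtoGENI source; the helper name and its parameters are assumptions, and the test programs in the source tree remain the authoritative examples.

```python
import ssl
import xmlrpc.client


def make_proxy(url, certfile=None, passphrase=None, cafile=None):
    """Build an XML-RPC proxy that presents a client certificate, if given.

    certfile is assumed to be a PEM file holding both the certificate and
    its encrypted private key; passphrase unlocks the key. cafile, if given,
    is a bundle of root certificates used to verify the server.
    """
    context = ssl.create_default_context(cafile=cafile)
    if certfile is not None:
        context.load_cert_chain(certfile, password=passphrase)
    # No network connection is made until a method is actually called.
    return xmlrpc.client.ServerProxy(url, context=context)
```

A call would then look like make_proxy(url, certfile="mycert.pem", passphrase="...").SomeMethod(args), where the URL, the file name, and the method name are placeholders; the real method names and arguments come from the SA and CM APIs.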

Slivers

A "sliver" can consist of local cluster nodes, including VLAN links between them; cluster nodes at different sites with GRE tunnels between them; and PlanetLab nodes.

Unlike Emulab experiments, raw GENI nodes are just that: raw. None of the usual Emulab setup is done on those nodes, such as initializing experimental interfaces with IP addresses, building accounts for other project members, or starting programs automatically with the event system. This will eventually be supported in the "cooked" interface.

Clearinghouse

The clearinghouse runs at Utah. It has a record of all of the slices created and a list of all resources, but does not currently record sliver creations or implement other clearinghouse functions (which are actually not well defined in the spec yet).

Policies

We have implemented a few simple policies that federates can apply to remote users. It is ultimately up to a federate whether they want to grant a specific ticket (promise of resources); the policies we have implemented so far are by no means a complete set, and it's possible for federates to implement their own policies if they wish.

  • Individual nodes may be prevented from being viewed and allocated by remote users
  • All access for remote users can be turned off (e.g., during times of high local need, such as near the end of a semester)
  • The total number of nodes allocated to remote users can be limited
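
To make the three policies above concrete, here is a small Python sketch of how a federate might check a remote user's ticket request against them. The class, field, and function names are hypothetical and do not reflect how the Emulab code actually implements these checks.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RemotePolicy:
    """Hypothetical container for the three policies listed above."""
    remote_access_enabled: bool = True
    hidden_nodes: Set[str] = field(default_factory=set)
    max_remote_nodes: Optional[int] = None  # None means no limit


def may_grant_ticket(policy, requested_nodes, remote_nodes_in_use):
    """Return (granted, reason) for a remote user's ticket request."""
    if not policy.remote_access_enabled:
        return False, "remote access is turned off"
    hidden = policy.hidden_nodes.intersection(requested_nodes)
    if hidden:
        names = ", ".join(sorted(hidden))
        return False, "nodes not available to remote users: " + names
    if policy.max_remote_nodes is not None:
        total = remote_nodes_in_use + len(requested_nodes)
        if total > policy.max_remote_nodes:
            return False, "remote node limit exceeded"
    return True, "ok"
```

A site-specific policy would fit the same shape: one more check that can refuse the ticket before the final return.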

Pieces that need more work

  • The SA does not register new users yet. Only existing Emulab users can use the federation.
  • Credentials are currently all or nothing; there is no way to set or use the privilege bits that are in the credential.
  • No credential delegation.
  • Resource discovery is primitive: simply a ptop (Rob defines this) XML file of free nodes and their network interfaces, plus the features list for each node.
  • There is no emergency shutdown yet.
  • It's a prototype of a prototype, so it's all very fragile. We have to do a lot more robustness work.
  • Very little logging and statistics gathering.
  • Slice update is only partially done.
  • Firewalled slivers.