
Architecture of SORMA - a first draft

This is a first draft of a technical architecture for SORMA provided by Lars Rasmusson from SICS. An architectural task force within SORMA is currently developing a consolidated architecture.

At the top of the layered architecture are user agents: software components that create resource specifications encoded in an SLA language. The SLA language is also understood by brokers, software that can act to create the requested service. A broker needs to understand only a subset of the SLA language; for instance, one broker may only be able to understand explicit requests for SUNgrid resources. Users and brokers register their requests and capabilities in a messaging framework, the Open Grid Market, which can be either a centralized database or a peer-to-peer system. The grid market matches the requests with the advertised capabilities and reports back to the requesting user.

Figure 1. The layered architecture. User agents and brokers speak the expressive SLA language. The Open Grid Market matches their requests and puts them in contact. Brokers and resource fabrics speak some fabric-specific language that allows brokers to acquire resources. The running nodes are monitored through a shared information gathering layer, which simplifies the brokers’ tasks of discovering nodes and keeping track of their status.

As an example (the numbers correspond to the numbers in Figure 1):

(1) A user submits a request for nodes that can run gcc2.96.

(2) A broker declares that it can provide nodes that run linux2.4 with a base redhat 8.1 distribution, including gcc2.96, and more. (1+2) The grid market forwards the user’s request to the broker, and forwards the broker’s id to the user agent.

(3) The user agent sends a request to the broker, who replies with an offer, which could either be ignored or followed by an accept from the user.

(4) The resource fabrics register their presence in the Information Gathering Infrastructure, which is a kind of bus where status messages and logs are aggregated.

(5) The brokers collect information from the bus about the available resources, and their status. Different resource fabrics can advertise different information, and it is up to the brokers to be able to interpret the information correctly.

(6) A broker whose offer has been accepted (see message 3 above) talks to the resource fabric’s market/reservation service in order to get resources in accordance with the SLA.

In this architecture there is no centralized bank or currency. This makes it easier to plug in other, already existing frameworks, because we do not have to convert between currencies. So, at least initially, the currency is resource fabric specific. A user could, for instance, only request Tycoon nodes if it has money in the Tycoon bank.

The architecture sketch here does not yet include details about security, nor does it specify which protocols should be used. These issues will be addressed by iteratively upgrading the protocols based on the use case requirements.

Example:

An initial simple SLA language can only talk about applications (yes, this is VERY simplified!).

SLA-version: 0.1

Action: (Tell|Ask)

Application: <name>

Broker-IP: <ip>:<port>

Host: <ip>:<port>
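To make the message format concrete, here is a minimal sketch of how such a message could be parsed and written. It assumes that every message is a sequence of `Field: value` lines and that address fields hold an IP address and a port separated by a colon, as in the examples in this text. The function names `parse_message`, `format_message` and `split_endpoint` are hypothetical and not part of any SORMA component.

```python
def parse_message(text):
    """Parse 'Field: value' lines into a dict, skipping blank lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so 'ip:port' values stay intact.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def format_message(fields):
    """Serialize a dict back into 'Field: value' lines."""
    return "\n".join(f"{name}: {value}" for name, value in fields.items())


def split_endpoint(value):
    """Split an 'ip:port' value into an (ip, port) pair."""
    ip, _, port = value.rpartition(":")
    return ip, int(port)


announcement = parse_message("""
SLA-version: 0.1
Action: Tell
Application: gcc4.0.2
Broker-IP: 193.10.66.141:7685
""")
print(announcement["Application"])                # gcc4.0.2
print(split_endpoint(announcement["Broker-IP"]))  # ('193.10.66.141', 7685)
```

Splitting on the first colon keeps the parser independent of which fields a given protocol version defines, so a later field such as Transport: would parse without changes.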

A broker registers at the Grid Market by sending, for example:

SLA-version: 0.1

Action: Tell

Application: gcc4.0.2

Broker-IP: 193.10.66.141:7685

A user queries the Grid Market by sending, for example:

SLA-version: 0.1

Action: Ask

Application: gcc4.0.2

The Grid Market has saved all the broker announcements, and does a simple string match on the SLA-version: and Application: fields to determine which brokers can serve the request. In our example, the Grid Market replies to the user with

SLA-version: 0.1

Action: Tell

Application: gcc4.0.2

Broker-IP: 193.10.66.141:7685
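The matching step described above can be sketched in a few lines. This is only an illustration under assumptions: messages are represented as already-parsed dictionaries, the market lives in memory in a single process, and the class name `GridMarket` and its `handle` method are hypothetical.

```python
class GridMarket:
    """In-memory sketch of the version 0.1 Grid Market."""

    def __init__(self):
        self.announcements = []  # every Tell message received so far

    def handle(self, message):
        """Store a Tell; answer an Ask with all matching announcements."""
        action = message.get("Action")
        if action == "Tell":
            self.announcements.append(dict(message))
            return []
        if action == "Ask":
            # Plain string equality on the two fields, nothing smarter.
            return [
                saved for saved in self.announcements
                if saved.get("SLA-version") == message.get("SLA-version")
                and saved.get("Application") == message.get("Application")
            ]
        raise ValueError(f"unknown Action: {action!r}")


market = GridMarket()
market.handle({
    "SLA-version": "0.1",
    "Action": "Tell",
    "Application": "gcc4.0.2",
    "Broker-IP": "193.10.66.141:7685",
})
replies = market.handle({
    "SLA-version": "0.1",
    "Action": "Ask",
    "Application": "gcc4.0.2",
})
print(replies[0]["Broker-IP"])  # 193.10.66.141:7685
```

Because the match is exact, a request for gcc4.0.1 would return nothing even if a gcc4.0.2 broker is registered; anything more flexible is left to later versions of the SLA language.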

The user can now connect directly to the broker and send its query again:

SLA-version: 0.1

Action: Ask

Application: gcc4.0.2

to which the broker replies

SLA-version: 0.1

Action: Tell

Application: gcc4.0.2

Broker-IP: 193.10.66.141:7685

Host: 193.10.66.20:8483

This means that the user can connect to a gcc service at that host. We still have to specify whether the interface is SOAP, RPC, ssh, or something else. For version 0.1, however, we are satisfied with a simple interface where one telnets to the host, pipes a tarball of files to stdin, and reads the tarballed result from stdout. Note that the 0.1 protocol has no provision for any other transport protocol. We will leave that for later versions, perhaps by adding a Transport: field to the SLA.
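A client for this tarball-in, tarball-out interface could look roughly like the sketch below. It rests on assumptions the text does not settle: that the service speaks raw TCP, that it treats the end of the input stream as the end of the job submission, and that it closes the connection once the result has been written. The helper names `pack_files`, `unpack_files` and `run_job` are hypothetical.

```python
import io
import socket
import tarfile


def pack_files(files):
    """Build an in-memory tarball from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack_files(tarball):
    """Read a tarball back into a {name: bytes} mapping."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r") as archive:
        for member in archive.getmembers():
            if member.isfile():
                files[member.name] = archive.extractfile(member).read()
    return files


def run_job(host, port, tarball, timeout=30.0):
    """Send a tarball to the service and return the tarball it sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(tarball)
        # Assumption: closing our sending side tells the service that the
        # input is complete, much like end-of-file on stdin.
        conn.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # the service closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

With the Host: value from the broker's reply, a job would then be submitted as `unpack_files(run_job("193.10.66.20", 8483, pack_files({"hello.c": source})))`, where `source` holds the bytes of the file to compile.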

Anyway, the point is that the user can now easily get jobs running. More importantly, the protocol is so simple, and so deliberately specific, that everyone can follow what is happening so far.

Now, how did the broker know that there was a resource fabric host that had a gcc service running? The answer is that it talked to the information gathering infrastructure. In its first incarnation, it is very similar to the grid market. It understands messages of the format:

RESOURCE-version: 0.1

Action: (Tell|Ask)

Application: <name>

Host: <ip>:<port>

Poll: <ip>:<port>

CPU-Load: <load>

Date: <timestamp>

Just like a broker, a resource fabric node registers with the information gathering infrastructure, which then starts polling the node every 10 seconds for its current CPU-Load. It records this information so that brokers can ask for load information, for instance if they want to recommend a lightly loaded node to their users. The Poll: field gives the address to which one should telnet to get updated info.

The resource node sends

RESOURCE-version: 0.1

Action: Tell

Application: gcc4.0.2

Host: 193.10.66.20:8483

Poll: 193.10.66.20:8484

CPU-Load: 0.2

Date: 34857934

to the information gathering infrastructure. The broker queries the information gathering infrastructure with

RESOURCE-version: 0.1

Action: Ask

Application: gcc4.0.2

CPU-Load: 0.5

Date: 34857924

which in protocol version 0.1 means that we want one registration where Date is greater than 34857924, CPU-Load is less than 0.5, and Application is exactly gcc4.0.2. In our example, the broker gets back the message that the resource fabric node sent to the information gathering infrastructure. Again, more sophisticated queries such as subscriptions, ranges, etc. are left for later versions of the protocol.
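The query semantics just described can be written down as a small predicate. The sketch below assumes that messages are already-parsed dictionaries with string values, that CPU-Load is a decimal number and that Date is an integer; the function names `matches` and `answer_ask` are hypothetical.

```python
def matches(registration, query):
    """Version 0.1 semantics: exact Application, lower load, later date."""
    return (
        registration["Application"] == query["Application"]
        and float(registration["CPU-Load"]) < float(query["CPU-Load"])
        and int(registration["Date"]) > int(query["Date"])
    )


def answer_ask(registrations, query):
    """Return the first registration that satisfies the query, or None."""
    for registration in registrations:
        if matches(registration, query):
            return registration
    return None


registered = [{
    "RESOURCE-version": "0.1",
    "Action": "Tell",
    "Application": "gcc4.0.2",
    "Host": "193.10.66.20:8483",
    "Poll": "193.10.66.20:8484",
    "CPU-Load": "0.2",
    "Date": "34857934",
}]
query = {
    "RESOURCE-version": "0.1",
    "Action": "Ask",
    "Application": "gcc4.0.2",
    "CPU-Load": "0.5",
    "Date": "34857924",
}
print(answer_ask(registered, query)["Host"])  # 193.10.66.20:8483
```

Both comparisons are strict, so a node reporting a load of exactly 0.5 would not be returned for this query.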

A simple version 0.1 implementation of the Information Gathering Infrastructure will simply keep a list of all the Tell messages it has received in the last hour, and it will poll all registered resource nodes once every 10 seconds by telnetting to the Poll address and reading the output. Of course, other implementations are possible. One could implement another infrastructure that caps the number of polls at one per second, etc. The point is that it can be done without changing the protocols or any of the other components.
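The bookkeeping described above might be sketched as follows. Several details are assumptions, since the text leaves them open: the reply to a poll is taken to be just the current CPU-Load value, the network access is hidden behind an injected `poll_node` callable, and the clock is injectable so the one-hour retention can be exercised without waiting. The class name `InfoGatherer` and the constant names are hypothetical.

```python
import time

RETENTION_SECONDS = 3600    # keep Tell messages for one hour
POLL_INTERVAL_SECONDS = 10  # poll every registered node this often


class InfoGatherer:
    """In-memory sketch of the version 0.1 information gathering layer."""

    def __init__(self, poll_node, clock=time.time):
        self.poll_node = poll_node  # callable: Poll address -> CPU-Load string
        self.clock = clock
        self.registrations = []     # list of (received_at, message) pairs

    def tell(self, message):
        """Record a Tell message together with its arrival time."""
        self.registrations.append((self.clock(), dict(message)))

    def prune(self):
        """Forget registrations that are older than the retention period."""
        cutoff = self.clock() - RETENTION_SECONDS
        self.registrations = [
            (received_at, message)
            for received_at, message in self.registrations
            if received_at >= cutoff
        ]

    def poll_once(self):
        """Refresh the CPU-Load of every registration still retained."""
        self.prune()
        for _, message in self.registrations:
            message["CPU-Load"] = self.poll_node(message["Poll"])

    def run(self, rounds):
        """Poll the given number of times, pausing between rounds."""
        for _ in range(rounds):
            self.poll_once()
            time.sleep(POLL_INTERVAL_SECONDS)
```

An implementation with a different polling policy, such as the cap of one poll per second mentioned above, would only change `run` and leave the message handling as it is.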

So with this simple first version, we can now define which aspects of the protocol should be changed in the next versions. For instance, if it is necessary to use SOAP-based messages, we should define a SOAP-based interface for some later version of the protocol. Again, it should not be feature complete from the start, but the transition from where we currently are to where we want to go should be absolutely clear.
