Time plays a vital role when checking the validity of certificates. Consequently, the DataGrid machines must be clients of a reliable time server.
If the machines at your site are not already synchronized, then you may use the xntp3 package distributed with the other external packages used by EDG (see 1.4). This package implements the network time protocol (NTP) and allows a machine to act as a time client (as well as a time server).
If you use the xntp3 package, configuring your machine as a time client is straightforward. You must add at least one time server reference to the NTP configuration file /etc/ntp.conf and configure the machine to run the NTP daemon. The detailed steps are:
/usr/sbin/ntpdate ip-time-1.cern.ch

replacing the CERN time server with the one you have chosen.
/sbin/hwclock --systohc
/sbin/chkconfig xntpd on
/sbin/service xntpd start

(or reboot the machine). You can check the status of the time server by using the "/usr/sbin/ntpq" command.
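The time server reference in /etc/ntp.conf mentioned above might look like the following minimal sketch; it reuses the CERN server from the example, so substitute the server you have chosen:

server ip-time-1.cern.ch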
The xntp3 package tries to be rather gentle with the system when readjusting the time. One result is that if the time is too far wrong (more than 1000 s by default), the daemon will simply refuse to reset the clock and will die. This is a common problem if you forget to run the ntpdate command above.
It is extremely important that the hardware clock is synchronized to the system clock. If not, an unsynchronized time will be reloaded at the next boot and you risk having the time synchronization daemon stop.
If you have a large number of machines, you may wish to create a local time server. Refer to the local documentation of the xntp3 package for instructions.
Cryptographic certificates are used to attest to the identity of a user or machine to the extent specified in the issuing certification authority's (CA) policy documents. Users accessing DataGrid resources must have a valid certificate; similarly, hosts offering services within the testbed must also have one.
The EDG-approved CAs have service areas which cover most of Europe and the United States. (Consult the current list on the web.) If a user or site is not covered by an existing CA's service area, then one must either negotiate with a CA to extend its service area or start a new CA.
It has been agreed that the CA operated in Lyon is responsible for users without access to a CA.
To use the Globus security infrastructure you must have your certificate in PEM format. Follow the instructions below if you need to convert a P12-formatted certificate into a PEM-formatted one. You should then place the two files "usercert.pem" and "userkey.pem" into a ".globus" directory in your home area. The file permissions for the userkey file should be 0700; for the usercert file, 0755 is appropriate.
Optionally, you may place your certificate and key in a non-standard location. In this case you must define the two environment variables X509_USER_CERT and X509_USER_KEY to point to your certificate and key, respectively.
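For example, in a bash-like shell (the directory name here is purely illustrative):

export X509_USER_CERT=$HOME/security/usercert.pem
export X509_USER_KEY=$HOME/security/userkey.pem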
Host certificate/key pairs should be installed into the directory /etc/grid-security/. The host key must be readable only by root (chmod 0400 hostkey.pem); the host certificate can be world readable (chmod 0444 hostcert.pem).
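A minimal sketch of the installation, assuming the certificate and key files are in the current directory:

cp hostcert.pem hostkey.pem /etc/grid-security/
chmod 0444 /etc/grid-security/hostcert.pem
chmod 0400 /etc/grid-security/hostkey.pem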
These certificates may be installed in non-standard locations by setting the values X509_GATEKEEPER_CERT and X509_GATEKEEPER_KEY to the fully-qualified location of the host certificate and key, respectively.
Many of the certificate authorities deliver certificates through a web browser. To use these certificates with Globus, they must be exported from the browser and then reformatted for Globus. Exporting is browser-specific so you will need to follow the help provided with your browser. Once you have extracted the certificate you should have a file with a p12 extension. This is in the PKCS12 format; you will need to change this to PEM format. If the edg-utils package is installed on your machine, simply executing
/opt/edg/bin/pkcs12-extract

will create appropriate certificate and key files and place them in the standard location. This is a convenience method for the following:
openssl pkcs12 -nocerts \
    -in cert.p12 \
    -out ~user/.globus/userkey.pem
openssl pkcs12 -clcerts -nokeys -in cert.p12 -out ~user/.globus/usercert.pem

The first command gives you your private key; this file must be readable only by you. The second command gives your public certificate. The "~user" should be replaced by the path to your home area. The ".globus" directory is the standard place to put your certificates.
Popular browsers typically use certificates in PKCS12 format. Consequently you will need to modify the format of the PEM certificates used for Globus to use them within a browser. To change a certificate from PEM format into PKCS12 format (on a machine with edg-utils installed), just issue the following command:
/opt/edg/bin/grid-mk-pkcs12

Again, this is a convenience method for the following:
openssl pkcs12 -export \
    -out file_name.p12 \
    -name "My certificate" \
    -inkey ~user/.globus/userkey.pem \
    -in ~user/.globus/usercert.pem

where file_name.p12 is the name of the PKCS12 certificate, and the "~user" in the last two lines should be replaced by the path to your home area. You must then import the certificate into your browser.
Having current certificate revocation lists (CRLs) is an extremely important aspect of the security framework. These lists identify certificates which have been revoked because the user no longer uses them, or they have been compromised. The CRLs can be updated with the command edg-fetch-crl. There is an associated daemon (edg-crl-upgraded) which can be started automatically to retrieve the CRLs periodically. It can be manipulated like all SysV daemon scripts.
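For example, assuming the init script carries the same name as the daemon, it can be set to start automatically and started immediately in the usual SysV way:

/sbin/chkconfig edg-crl-upgraded on
/sbin/service edg-crl-upgraded start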
Note: if the CRLs are out-of-date, certificates from the associated CA will not be accepted.
The current list of virtual organizations can be found on the web. If you did not register with a virtual organization when you signed the EDG Usage Guidelines (or wish to change your VO membership), then you must contact the VO manager directly.
Note: With the Testbed 1 software, membership in more than one virtual organization is not supported. When grid mapfiles are generated, the organization with which you are actually associated depends on the order in which the virtual organizations are listed in a site's mkgridmap configuration file. There is no mechanism by which the user can indicate which virtual organization should be used.
If you really need different roles in the Testbed 1 context, you should request multiple certificates (with slightly different subject names) and register the different subject names with different virtual organizations.
Most of the virtual organizations currently require that some VO-specific software be preinstalled at sites supporting that virtual organization. The list of VO-specific software is published into the information systems from the /opt/edg/info/mds/etc/ldif/ce-static.ldif file by setting one or more RunTimeEnvironment attributes.
The list of RPMs can be obtained from the edg repository.
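As an illustration, the relevant entries in ce-static.ldif might look like the following sketch; the attribute name comes from the text above, but the tag values are hypothetical and should match the software actually installed for your supported VOs:

RunTimeEnvironment: ATLAS-3.2.1
RunTimeEnvironment: CMS-1.1.0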
Grid users are given access to a site's resources based on a local unix account. The Globus system uses a grid mapfile to map a user's certificate subject into a local account. The grid mapfile is generated from information contained in various virtual organization (VO) membership lists and a local configuration file.
The configuration file allows for three different strategies for creating the local user accounts, each with advantages and disadvantages. The first option is to create a unique local account for every grid user. This allows the environment for each user to be specifically tailored and allows detailed accounting of resource usage through standard mechanisms. The disadvantage is that this involves a lot of maintenance by the system administrator and may require a large number of accounts to be created.
The second option is for all members of a particular virtual organization to be mapped into a shared account. Administratively this is the easiest solution as it usually involves only setting up one account per virtual organization. However, all detailed accounting information is lost, detailed access control is more difficult, and there are possible resource conflicts between multiple users at the same site.
The third option is creating pooled accounts. It is similar to the last option but instead pools of identical accounts are created and at any given time only one user (identified by subject name) is using one account. For example, for the Atlas VO a site may create a pool of accounts atlas001, atlas002, etc. This has the advantages that the accounts are easier to maintain and allow detailed accounting. However, there is a need to specify a policy for local resources when a given user stops using a pooled account. (E.g. how long local files are maintained, will the user get the same account when she/he returns, etc.)
To create a pool of accounts, you must set up individual unix user accounts whose names have a common prefix and a numeric suffix. For example, "atlas001", "atlas002", etc. To map users into this pool the prefix must be specified preceded by a dot, i.e. ".atlas".
In addition, a gridmapdir must be created; its default location is /etc/grid-security/gridmapdir, but it may be set to a different location in the globus.conf file. An empty file must exist in the gridmapdir for each pooled account; the name of the file must match the account name exactly, including both prefix and numeric suffix.
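A minimal sketch of setting up the gridmapdir for a pool of 50 Atlas accounts, assuming the unix accounts atlas001 through atlas050 have already been created:

mkdir -p /etc/grid-security/gridmapdir
for i in $(seq 1 50); do
    touch /etc/grid-security/gridmapdir/$(printf "atlas%03d" $i)
done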
The mapping between a subject name and an individual account is based on the time stamps of the account entries in the gridmapdir and additional files named according to the URL-encoded subject names of the users.
Note: this mapping is fixed until the subject name entry is deleted. Currently this is not done automatically, and if the account pool is exhausted, users will get the same error as if they were not authorized to use the resource.
One important aspect for using pooled accounts is that the grid-mapfile and the /etc/grid-security/gridmapdir directory must be shared between all of the nodes in a site. If this is not done, then it is possible that the mapping will be done inconsistently depending on how a given machine is accessed.
The mkgridmap script generates a gridmap file based on user information in the LDAP servers of various virtual organizations.
The behaviour of the script can be highly customized via a configuration file located in /opt/edg/etc/mkgridmap.conf. In its simplest form, it simply lists the appropriate virtual organizations, the accounts to map these users to, and an auth directive to check that the users have signed the EDG Usage Guidelines.
The following example file (appropriate for a computing element) maps users from the specified virtual organizations to pooled accounts with the given prefix.
group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org .alice
group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org .atlas
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org .cms
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org .lhcb
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=biomedical,dc=eu-datagrid,dc=org .biome
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=earthob,dc=eu-datagrid,dc=org .eo
group ldap://marianne.in2p3.fr/ou=ITeam,o=testbed,dc=eu-datagrid,dc=org .iteam
group ldap://marianne.in2p3.fr/ou=wp6,o=testbed,dc=eu-datagrid,dc=org .wpsix
auth ldap://marianne.in2p3.fr/ou=People,o=testbed,dc=eu-datagrid,dc=org

This also checks the generated list of users against those who have signed the EDG Usage Guidelines. An example appropriate for a resource broker
group ldap://marianne.in2p3.fr/ou=guidelines,o=testbed,dc=eu-datagrid,dc=org dguser
auth ldap://marianne.in2p3.fr/ou=People,o=testbed,dc=eu-datagrid,dc=org

checks only the group of users who have signed the EDG Usage Guidelines and maps them into the user which runs the broker daemons.
Table 8.1 lists those ports used by various parts of the testbed software. Temporary ports used by Globus can be restricted to a particular range. Nearly all services can be configured to run on non-standard ports, if necessary.
Port | Service |
80 | HTTP server for Network Monitoring |
123 | Network Time Protocol |
2119 | Globus Gatekeeper |
2135 | MDS info port |
2169 | FTree info port |
2170 | Information Index |
2171 | FTree info port |
2811 | GSI ftp server |
3147 | RFIO |
7771 | Resource Broker |
7846 | Logging & Bookkeeping |
8080 | Tomcat Server (R-GMA, SpitFire) |
8881 | Job Sub. Service (client) |
9991 | Job Sub. Service (server) |
There is at least one additional port needed for a two-phase commit job submission. This port has not yet been identified; in the meantime, opening all ports above 1024 will work.
The client and server programs gsiklog and gsiklogd allow you to obtain an AFS token by presenting a Grid proxy rather than a Kerberos password.
This software has been produced by Doug Engert of Argonne, with some testing and bug fixes by Helmut Heller and Andrea Parrini.
The source code is available from Argonne National Laboratory, and we have produced Linux RPMs, built with the Testbed 1 Globus 2.0 distribution, available from the EDG software repository.
For the client, installation from RPM is very straightforward, with no post-install configuration if the machine is already running as an AFS client. (gsiklog uses the existing AFS configuration files of the afsd cache daemon.)
Once configured, AFS tokens can be acquired in a gsiklogd-enabled cell by simply using the grid-proxy-init and then gsiklog commands. (gsiklog -help lists additional options, including specifying the remote AFS username and remote cell.)
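A typical session might look like the following sketch; the tokens command is the standard AFS utility for listing the tokens currently held:

grid-proxy-init
gsiklog
tokens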
The gsiklogd daemon can most easily be installed on an existing AFS authentication server, as it needs access to the Kerberos key /usr/afs/etc/KeyFile for its cell.
It must also be provided with a Grid key and certificate pair in /etc/grid-security called afskey.pem and afscert.pem, and the distinguished name must end in CN=afs/CELL, where CELL is the AFS cell name.
Finally, a file /etc/grid-security/afsgrid-mapfile must exist, with the same format as a gatekeeper grid-mapfile, but specifying local AFS usernames rather than unix usernames.
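An entry in afsgrid-mapfile follows the usual grid-mapfile layout; for example (the subject name and AFS username below are hypothetical):

"/O=Grid/O=CERN/OU=cern.ch/CN=Jane Doe" jdoe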
The daemon supports a SysV interface and can be started, stopped, and set to autostart in the customary way (see 3.2).
For more information on GRM see GRM - Grid Application Monitor Users Manual. For more information on PROVE, see PROVE-Visualisation tool for Grid Applications.
For Linux, installing the RPM performs all necessary configuration.
For other operating systems, the following must be done. Replace the terms 'linux' and 'LINUX' in the "grm.spec" and "prove.spec" files with appropriate terms from 8.2. The 'linux' term signifies the architecture; 'LINUX' is the name of the (sub)directory that will contain the binary files for that platform.
A configuration file with the same name ('linux.def') should be present in the conf/ directory of the source. The conf/ directory also contains configuration files for the irix and solaris operating systems as examples (irix-6-cc.def and solaris-2.6-gcc.def).
The software specific to various applications is available from the EDG package repository. You should install all of the application software necessary to support the users authorized to use your site.
When installing application software, be sure to update the RunTimeEnvironment flags in /opt/edg/info/mds/etc/ldif/ce-static.ldif and restart the information systems. This publishes, via the information systems, the fact that you have installed the given set of software.
In order for the job broker to find resources on which to run a job, and storage elements on which to store data, an information provider needs to be set up. The GIIS, or Grid Index Information Service, is the type of information provider used to locate resources in Testbed 1. The GIIS is based on LDAP.
MDS is the LDAP-based information provider which is part of Globus. WP3 has written its own LDAP-type information provider, in which the backend is cached in memory. This was written because performance tests indicated that the performance of Globus MDS was not adequate. The LDAP-based information provider provided by WP3 is called ftree, and it is integrated with OpenLDAP2, not with MDS.
WP3 has also delivered schema files, which define the information to be published by both ftree and MDS. The same schema files and information providers are used by both, which allows comparative tests between ftree and MDS to take place while providing the same information.
For further information on LDAP and MDS deployment, along with a description of the schema files see `MDS Deployment - Testbed 1.'
These documents are available in the documentation area of the WP6 website for Testbed 1.
The configuration instructions for the LDAP-based information providers which follow apply regardless of whether Globus MDS or ftree is used. Where they differ slightly, this is indicated.
The following instructions explain how to configure the information providers. The installation is largely carried out by installing the appropriate RPMs for the type of machine (site cache/GIIS, SE (storage element), CE (computing element), or netmon (network monitor)). In addition to the Globus RPMs, an RPM needs to be installed for the ftree information service. An RPM, edg-info-main-*.rpm, is provided to help configure the information providers, and three RPMs are provided to install the information provider scripts; the ones to install depend on the machine type. After installing the RPMs, copy /etc/edg/info-mds.conf.in to /etc/edg/info-mds.conf. Then edit info-mds.conf; the variables prefixed with a hash (#) must be edited and the hash removed.
Install
openldap-ftree-*.rpm
edginfo-main-*.rpm
Common settings for all configurations:
For all configurations, set the values in /etc/globus.conf to
GRIDMAP=/etc/grid-security/grid-mapfile
GATE_KEEPER_PORT=2119
GLOBUS_LOCATION=/opt/globus/
#GRID_INFO_USER= - This should NOT be root; set it to a non-privileged user
GRID_INFO_GRIS=yes
GRID_INFO_EDG=yes

and in /etc/edg/info-mds.conf to
WP3_DEPLOY=/opt/edg/info/mds - The directory in which the WP3 software is installed. If it was installed from the RPMs, this does not need to be changed
FTREE_INFO_PORT=2171 - The port number for the ftree information server
FTREE_DEBUG_LEVEL=0 - The debug level for ftree; useful settings are 255 and 256
SITE_DN=Mds-Vo-name=local,o=grid - This should not contain any spaces and should end in o=grid; if left blank it will default to the host's domain components (dc=...,dc=...). For use with MDS2 use Mds-Vo-name=local,o=grid
Set the following variables within /etc/globus.conf
#GRID_INFO_GIIS_1=ral - The site name
#GRID_INFO_REG_GIIS=uk - The country
#GRID_INFO__REG_HOST=gppmds.gridpp.rl.ac.uk - The country host
Set the following variables within /etc/edg/info-mds.conf
SITE_INFO=yes
NETMON_PRESENT=no
CE_PRESENT=no
SE_PRESENT=no
#SITE_NAME=RAL - The site name
#SITE_INSTALLATION_DATE=20011115123410Z - This is in the format yyyymmddhhmmssZ
SITE_SYSADMIN_CONTACT=grid.sysadmin@hostname
SITE_USER_SUPPORT_CONTACT=grid.support@hostname
SITE_SECURITY_CONTACT=grid.security@hostname
SITE_DATAGRID_VERSION=1
#SITE_SE_HOSTS=gppse01.gridpp.rl.ac.uk,gppse02.gridpp.rl.ac.uk - A comma-separated list (no spaces) of the host names of the SEs
#SITE_CE_HOSTS=gppa.gridpp.rl.ac.uk - A comma-separated list (no spaces) of the host names of the CEs
#SITE_NETMON_HOST=gppnet.gridpp.rl.ac.uk - The host name of the network monitor information provider
Install
edg-info-netmon-*.i386.rpm
Set the following variables within /etc/globus.conf
#GRID_INFO_GIIS_1=netmon - The GIIS name
#GRID_INFO_REG_GIIS=ral - The site name
#GRID_INFO__REG_HOST=gppmds.gridpp.rl.ac.uk - The site host
Set the following variables within /etc/edg/info-mds.conf
SITE_INFO=no
NETMON_PRESENT=yes
CE_PRESENT=no
SE_PRESENT=no
#NETMON_PINGER_HOST=network.rl.ac.uk - The machine on which the edg-pinger-*.i386.rpm is installed
Install
edg-info-se-*.i386.rpm
perl-Filesys-DiskFree-*.rpm
Set the following variables within /etc/globus.conf
#GRID_INFO_GIIS_1=se - The GIIS name
#GRID_INFO_REG_GIIS=ral - The site name
#GRID_INFO__REG_HOST=gppmds.gridpp.rl.ac.uk - The site host
Set the following variables within /etc/edg/info-mds.conf
SITE_INFO=no
NETMON_PRESENT=no
CE_PRESENT=no
SE_PRESENT=yes
#SE_ID=gppse01.gridpp.rl.ac.uk - May be set manually; if left blank it will default to the local hostname
#SE_SIZE=500 - The size of the storage element in MB
SE_CONTACT=grid.support@hostname
SE_TYPE=disk
#SE_FILESYSTEMS=/dev/hda2,/dev/hda4 - A comma-separated list (no spaces); these values are used with df to obtain the free space of the SE
#SE_CLOSE_CE=gppa.gridpp.rl.ac.uk - A comma-separated list (no spaces) of the host names of the close computing elements
SE_PROTOCOLS=gridftp,rfio,file - A comma-separated list (no spaces) of the protocols supported by the storage element
SE_PROTOCOL_PORTS=2811,3147, - A comma-separated list (no spaces); these values must relate to the corresponding SE_PROTOCOLS
Install
CEInformationProviders-*.i386.rpm
Set the following variables within /etc/globus.conf
GRID_INFO_GIIS_1=ce - The GIIS name
GRID_INFO_REG_GIIS=ral - The site name
GRID_INFO__REG_HOST=gppmds.gridpp.rl.ac.uk - The site host
Set the following variables within /etc/edg/info-mds.conf
SITE_INFO=no
NETMON_PRESENT=no
SE_PRESENT=no
CE_PRESENT=yes
#CE_HOST=gppa.gridpp.rl.ac.uk - May be set manually; if left blank it will default to the local hostname
#CE_BATCHSYSTEM=pbs - Supported systems are pbs and lsf; bqs will be added shortly
#CE_CLUSTER_BATCH_SYSTEM_BIN_PATH=/usr/pbs/bin - The path to the directory containing the queue management commands
#CE_QUEUE=short,long - A comma-separated list (no spaces) of the queue names of the computing element
#CE_CLOSE_SE_ID=gppse01.gridpp.rl.ac.uk,gppse02.gridpp.rl.ac.uk,gppse03.gridpp.rl.ac.uk - A comma-separated list (no spaces) of the names of close storage elements
#CE_CLOSE_SE_MOUNT_POINT=usr/atlas,,usr/cms - A comma-separated list (no spaces) of the mount points of close storage elements; these values must relate to the corresponding CLOSE_SE_ID's
The CEInformationProviders-*.rpm also installs the file /opt/edg/info/mds/etc/ldif/ce-static.ldif.in. This file has to be copied to /opt/edg/info/mds/etc/ldif/ce-static.ldif, and its contents changed to reflect the computing element environment. If required, each queue can be customised using an individual static LDIF file; if ce-static-queuename.ldif exists, it will be used in place of ce-static.ldif.
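For example, assuming a hypothetical queue named short, a queue-specific file could be created by copying the generic one and then editing it:

cp /opt/edg/info/mds/etc/ldif/ce-static.ldif \
   /opt/edg/info/mds/etc/ldif/ce-static-short.ldif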
The servers can be started and stopped via SysV scripts named edginfo-mds and globus-mds and can be set to autostart with the chkconfig command (see 3.2).
Install the edg-info-main RPM. There are three files of interest:
etc/info-vo.conf
etc/rc.d/init.d/edginfo-vo
opt/edg/info/mds/etc/testbed1-vo.ldif
The only file that should need editing is testbed1-vo.ldif; it needs to contain entries for the sites within the VO/country. An example entry is given for RAL.
The ftree vo/country server is now ready to roll:

/etc/rc.d/init.d/edginfo-vo start
There is a requirement for some sites to run more than one server. Where a site has two or more sets of resources used by different VOs, a server will have to be run for each VO. If one or more VOs share the resources, then only one server is required. Hence the need for a .conf file separate from globus.conf. A copy of the info.conf will be required for each server, as will copies of the edginfo script and of the contents of the /opt/edg/info/mds directory.
To set up another server, copies of the wp3-testbed-mds directory (e.g. wp3-testbed-mds-atlas), the edginfo script (e.g. edginfo-atlas), and the info.conf file (e.g. info-atlas.conf) will have to be made.
The value for the WP3_DEPLOY in info-atlas.conf will have to be set to wp3-testbed-mds-atlas; the value for INFO_CONFIG in edginfo will have to be set to info-atlas.conf.
Finally, any other site- or VO-specific values will also have to be set in info-atlas.conf.
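A sketch of the copies described above, for a hypothetical atlas server; the exact locations of the wp3-testbed-mds directory, the edginfo script, and the info.conf file depend on your installation, so adjust the paths accordingly:

cp -r wp3-testbed-mds wp3-testbed-mds-atlas
cp edginfo edginfo-atlas
cp info.conf info-atlas.conf
# then set WP3_DEPLOY to wp3-testbed-mds-atlas in info-atlas.conf
# and INFO_CONFIG to info-atlas.conf in edginfo-atlas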
The relational-database-based information provider has also been written by WP3. It is known as R-GMA, short for Relational Grid Monitoring Architecture. It is again possible to display information using the same schema as that published via LDAP and MDS. Again, its performance is being tested for comparison with the LDAP and MDS approach.
For further information on R-GMA see `R-GMA Relational Information Monitoring and Management System User Guide'
The R-GMA package consists of seven RPMs and depends on a number of external packages. These components can be obtained from the package repository. Each RPM is described below.
The external packages on which R-GMA depends are listed below. They can also be obtained from the above-mentioned website.
In Tomcat's server.xml file, locate the lines

<!-- Tomcat Root Context -->
<!-- <Context path="" docBase="ROOT" debug="0"/> -->

and below that add the following set of lines:
<!-- R-GMA servlets directory -->
<Context path="/R-GMA/ProducerServlet" docBase="/opt/edg/info/servlets/ProducerServlet" debug="0" reloadable="true"/>
<Context path="/R-GMA/DBProducerServlet" docBase="/opt/edg/info/servlets/DBProducerServlet" debug="0" reloadable="true"/>
<Context path="/R-GMA/ConsumerServlet" docBase="/opt/edg/info/servlets/ConsumerServlet" debug="0" reloadable="true"/>
<Context path="/R-GMA/ArchiverServlet" docBase="/opt/edg/info/servlets/ArchiverServlet" debug="0" reloadable="true"/>

To run the demo, or when you need to run a RegistryServlet and a SchemaServlet for a VO, also add:

<Context path="/R-GMA/SchemaServlet" docBase="/opt/edg/info/servlets/SchemaServlet" debug="0" reloadable="true"/>
<Context path="/R-GMA/RegistryServlet" docBase="/opt/edg/info/servlets/RegistryServlet" debug="0" reloadable="true"/>
The first part ensures that applications can use the produce, consume, and archive services of the respective servlets (persistently, in the case of the DatabaseProducer). See the next section for more information. To be able to stream information from producers to consumers, the allowChunking attribute of the Connector has to be set to false, as in the following:
<Connector className="org.apache.catalina.connector.http.HttpConnector"
           port="8080" minProcessors="5" maxProcessors="75"
           allowChunking="false" enableLookups="true"
           redirectPort="8443" acceptCount="10" debug="0"
           connectionTimeout="60000"/>
To set up R-GMA for a virtual organization, one has to run one RegistryServlet and one SchemaServlet. These make use of a database to store information about producers (RegistryServlet) and tables (SchemaServlet). These servlets will initially be run at RAL and the URLs of those servlets will be made available elsewhere. In the future each virtual organisation will run its own RegistryServlet and SchemaServlet.
To be able to produce information, one has to run at least one ProducerServlet; many producers can use the same ProducerServlet to publish data for them. In the same way, one has to run at least one ConsumerServlet to be able to consume data that has been published by a producer. For every servlet the web.xml file describing the web application has to be configured, and for the Consumer, Producer, DataBaseProducer, and Archiver a properties file has to be configured. Each API class needs to know the location of the respective servlet which services it. The properties files are located in $RGMA_HOME and currently have to be copied into the home directory of the user running the application code that uses the API class in order to be found. The scripts to run the demos and the sensors do this automatically for you. If you are running Tomcat on the same machine as the application code that uses the producer/consumer/archiver APIs, the default values for the ServletLocations suffice. The idea behind this setup is that one can run a sensor that publishes information on each node of a cluster and have one ProducerServlet running on a head node to handle all the requests from consumers.
Each servlet has a number of init parameters that are set at the beginning of the servlet life cycle and are now discussed in turn. The sections about the SchemaServlet and RegistryServlet and Tools are only relevant if you need to build your own VO. The web.xml files have to be configured before Tomcat is started up since they are read at startup time only.
registryServletLocation is the URL of the RegistryServlet. The ProducerServlet has to be able to contact the Registry to register Producers.
registryServletLocation is the URL of the RegistryServlet. The DBProducerServlet has to be able to contact the Registry to register DataBaseProducers.
registryServletLocation is the URL of the RegistryServlet. The ConsumerServlet has to be able to contact the Registry to find out about Producers.
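As an illustration, such an init parameter appears in a servlet's web.xml roughly as follows; the URL shown is a hypothetical example, so use the RegistryServlet location published for your VO:

<init-param>
    <param-name>registryServletLocation</param-name>
    <param-value>http://rgma.example.org:8080/R-GMA/RegistryServlet</param-value>
</init-param>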
schemaDatabaseLocation is a JDBC URL for the location of the Schema database, see the documentation of your database for more information. The default setting is for a mysql database running on localhost. It probably makes sense to run the database on the same host as the SchemaServlet, but it is not mandatory.
schemaDatabaseUserName is the database user name of the schema database. The default is schema.
schemaDatabasePassword is the clear text password for the above user. The default is info.
registryDatabaseLocation is a JDBC URL for the location of the Registry database, see the documentation of your database for more information. The default setting is for a mysql database running on localhost. It probably makes sense to run the database on the same host as the RegistryServlet, but it is not mandatory.
registryDatabaseUserName is the database user name of the registry database. The default is registry.
registryDatabasePassword is the clear text password for the above user. The default is info.
schemaServletLocation is the URL of the SchemaServlet. The default is to run the SchemaServlet on the same host as the RegistryServlet, in which case the same database can hold both the registry and schema database.
To populate the Schema database with a set of known tables and to bring the Registry database into a clean state with no registered producers, the build file in $RGMA_HOME/tools/dbases has to be run. Since soft-state registration is not yet implemented, the Registry can get into an inconsistent state. The administration of the registry database will be moved into the RegistryServlet in a future release.
The /opt/edg/info/demo directory contains two demos to illustrate the use of R-GMA. To run the demos, just install the RPMs and do not configure the servlets or properties files. Also include the RegistryServlet and SchemaServlet in the server.xml file. Run a MySQL server with no password for the root user. Starting up Tomcat then makes all these services available. In order to run the demos you need to make the files named run in SimpleDemo and SimpleDemo/etc executable, e.g.
find . -name run -exec chmod 0755 {} \;

Each subdirectory of the demo directory contains a script called run that takes the name of the respective demo (SimpleDemo or ClusterLoad) as an argument. The README in each subdirectory explains briefly what is happening.
In this section the available sensors that have been implemented using the R-GMA approach are discussed, with the emphasis upon how the sensors are used.
The purpose of the MDS Producer sensor is to publish all the information available from a Globus GRIS server (or in fact from any LDAP server) into R-GMA and to permit a consumer to access this information via the normal R-GMA approach. Each site that runs a GRIS/LDAP server should run an MDSProducer.
The Globus GRIS server publishes information about the status of the Grid and its components, such as available CPU nodes, available service types and the status of batch queues. The server is implemented using the LDAP protocol, with the Grid information stored in a hierarchical LDAP directory structure. Each piece of information is associated with an attribute, with the permitted attributes being defined and grouped by an LDAP schema or 'object class'. The context of the information is given by its position within the directory structure.
There are currently 6 schema defined in the Globus2 release:
globusBenchmarkInformation globusNetworkInterface globusQueue globusServiceJobManager globusSoftware grADSoftware

Furthermore, EDG publishes information according to a number of objectclasses which are republished into the following set of R-GMA tables:
NetMonHostLoss NetMonHostRTT NetMonHostThroughput NetMonLossPacketSize NetMonRTTPacketSize NetMonThroughputBufferSize NetMonTooliperfER NetMonToolpingER SiteInfo StorageElement StorageElementProtocol StorageElementStatus
For the Globus GRIS there is exactly one table in R-GMA for each of the objectclasses. Since each schema consists of a number of attributes, these attributes form the column names of the relational table. An additional column is added to each table, giving the LDAP distinguished name (DN), or the context, of the entry. The way the EDG LDAP schemas are used is more complicated, especially for the networking information, but there is a correspondence between a table and a certain combination of objectclasses. R-GMA cannot currently republish information about the FileElement objectclass because this information is not permanently held in the LDAP server but is created dynamically, requiring knowledge of a local filename. The MDSProducer is completely generic and only assumes knowledge of the names of the objectclasses.
The MDS Producer is implemented in Java, in the class MDSProducer. The class is supplied with a properties file which points it to a particular LDAP server and contains a list of table names and corresponding search filters to publish. The properties file, MDSProducer.props, consists of 5 properties in the standard Java key=value format, with each property on a new line.
The MDS Producer will then request all entries of each of the specified search filters, starting the search at the given base DN. The information from each of the entries found is then published, along with the DN of the entry, in the appropriate table.
The MDS Producer is likely to run near the LDAP server that it is polling, possibly on the same machine, although this does not have to be the case since the LDAP servers are polled using the standard LDAP wire protocol. Currently there will be one MDS Producer for every GRIS server. It would be easy to implement a system to provide aggregate information about one or more sites. This would involve a simple Consumer-Producer model where the Consumer side subscribes to all the site MDSProducers and then publishes the aggregate information in some suitable format.
The MDSProducer class includes a main() method that runs the pollGRIS method of the MDSProducer class in an infinite loop. The time between subsequent polls is given as a command-line argument in milliseconds. There is currently a bug, which we do not understand, when accessing the ComputingElement objectclass; it prevents us from republishing this information.
We assume that a ProducerServlet is deployed and properly configured (registryServletLocation points to your VO's RegistryServlet), Tomcat is up and running, the MDSProducer.props file points to an LDAP server that runs the EDG information provider scripts, the property schemaServletLocation points to the VO's SchemaServlet, and the file "run" in /opt/edg/info/sensors is executable. Now run the command "run MDSProducer 10000", which starts the MDSProducer and polls the LDAP server every 10000 milliseconds.
A GSI-enabled daemon must run on any node which needs to serve its local file system to remote users via GridFTP (i.e. via the client globus_url_copy which uses gridftp as the transport protocol). This includes the gatekeeper, resource broker, and storage element nodes.
Incoming requests are authorized via the grid-mapfile mechanism. Consequently, machines running the FTP daemon must have a full security installation: the machine must have a host certificate and key installed, a grid-mapfile, and all of the security RPMs which contain the Certificate Authority certificates and Certificate Revocation List URLs. The daemons which update the grid-mapfile (23) and CRLs (8.2.2) should also be running.
The FTP daemon is configured via the /etc/globus.conf file. Table 8.3 lists the relevant parameters, their default values, and their descriptions.
Parameter | Default | Description |
GLOBUS_LOCATION | /opt/globus | Installation root of Globus software. |
GLOBUS_GSIWUFTPD_PORT | 2811 | Port to use for GSI-enabled FTP. |
GLOBUS_GSIWUFTPD_LOG | /var/log/globus-gsi_wuftpd.log | Location of log file. |
X509_GSIWUFTPD_CERT | /etc/grid-security/hostcert.pem | Location of host certificate. |
X509_GSIWUFTPD_KEY | /etc/grid-security/hostkey.pem | Location of host key. |
GRID_GSIWUFTPD_USER | root | User to run FTP daemon. |
GLOBUS_GSIWUFTPD_OPTIONS | unspecified | Additional FTP daemon options. |
GLOBUS_TCP_PORT_RANGE | unspecified | Range of TCP ports (e.g. "30000,31000") |
GLOBUS_UDP_PORT_RANGE | unspecified | Range of UDP ports (as above) |
This daemon is controlled via a standard init.d-style script which supports the start, stop, restart, and status directives. (See 3.2 for more details.)