The various computers involved in building a site of the DataGrid testbed fall into just a few functional categories. A typical site sets up a Gatekeeper node which acts as the portal for the site and as the front-end of the local batch system controlling a set of Worker Nodes. Together these comprise a Computing Element (CE). Most sites operating a CE will also provide persistent storage through a Storage Element (SE), which is the interface to the storage devices. To access the testbed for job submission a User Interface (UI) machine is required.
Some larger sites will offer additional services such as the Job Submission Service, which requires installing a Resource Broker (RB). If an RB is set up, the site also has to run a BDII server, an LDAP server that acts as an information index.
To allow users to run jobs whose lifetime exceeds the lifetime of their proxy certificate, a MyProxy machine providing the proxy renewal service has to be set up.
Again, this is not necessary for the typical site.
Inside the testbed, a replica catalog (RC) and a VO server have to be set up and maintained for each VO.
Table 4.1 lists the different machine types and the services which must run on each.
Daemon | UI | IS | CE | WN | SE | RC | RB | MProxy | BDII |
Globus or EDG Gatekeeper | - | - | XX | - | XX | - | - | - | - |
Globus Rep. Cat. | - | - | - | - | - | XX | - | - | - |
GSI-enabled FTPD | - | - | XX | - | XX | - | XX | - | - |
Globus MDS | - | XX | XX | - | XX | - | - | - | - |
Info-MDS | - | XX | XX | - | XX | - | - | - | - |
Broker | - | - | - | - | - | - | XX | - | - |
Job submission serv. | - | - | - | - | - | - | XX | - | - |
Info. Index | - | - | - | - | - | - | - | - | XX |
Logging & Bookkeeping | - | - | - | - | - | - | XX | - | - |
Local Logger | - | - | XX | - | XX | - | XX | - | - |
CRL Update | - | - | XX | - | XX | - | XX | - | - |
Grid mapfile Update | - | - | XX | - | XX | - | XX | - | - |
RFIO | - | - | - | - | XX | - | - | - | - |
GDMP | - | - | - | - | XX | - | - | - | - |
MyProxy | - | - | - | - | - | - | - | XX | - |
Before a site is set up, a few things common to all nodes should be mentioned. Before you start, please read the section about time synchronization (8.1). If you install your site using LCFG this service is handled by the tool. If you opt for manual configuration you have to install, configure and start the service on every node. If you are using AFS, make sure that your local AFS time server is in sync with the rest of the grid.
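As a minimal sketch of the manual case (assuming the NTP package shipped with Red Hat 6.2, where the service is usually called xntpd; on later systems it is ntpd), the per-node setup could look like this:

# Point the NTP daemon at the site time server defined in site-cfg.h
# (SITE_NTP_HOSTNAME.SITE_NTP_DOMAIN, e.g. ip-time-1.cern.ch).
echo "server ip-time-1.cern.ch" >> /etc/ntp.conf
# Enable and start the service (use "ntpd" instead of "xntpd" where applicable).
/sbin/chkconfig xntpd on
/etc/rc.d/init.d/xntpd start
# Verify synchronization after a while.
/usr/sbin/ntpq -p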
To get a better understanding of the security model used by EDG and Globus, read section 8.2.
In case you need more information about a given service have a look at the sections in the Appendix.
Before the steps needed to configure individual nodes are described, a walkthrough of the main configuration file (site-cfg.h) is given. For details about using LCFG follow the references given in the introduction (1).
In several places in this guide verbatim text from configuration files is used. In some cases the original lines were too long to be reproduced here and have been split. Whenever this was done, a \ character was placed at the end of the line. If you use the sample code given here, please join these lines again.
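For example, the SITE_CE_HOSTS_ define from the sample site-cfg.h below is printed on two lines:

#define SITE_CE_HOSTS_ CE_HOSTNAME:2119/jobmanager-pbs-short,\
CE_HOSTNAME:2119/jobmanager-pbs-infinite

and should be joined again before use:

#define SITE_CE_HOSTS_ CE_HOSTNAME:2119/jobmanager-pbs-short,CE_HOSTNAME:2119/jobmanager-pbs-infinite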
The example site-cfg.h file has been taken from CERN. To keep the information compact we assume that only two VOs are supported.
/* site-cfg.h ==================================================
   SITE SPECIFIC CONFIGURATION */

/* COMMON LCFG DEFINITIONS ------------------------------------------------ */
#define LCFGSRV            lxshare0371.cern.ch
#define URL_SERVER_CONFIG  http://lxshare0371.cern.ch

You have to set this to the full name of your LCFG server.
/* SOURCE TREE LOCATIONS -------------------------------------------------- */
/* Define the root locations of the Globus tree and the EDG tree. These are
   used in many configuration files and for setting the ld.so.conf libraries.
   NOTE: the underscore at the end of the define. Used to avoid confusion with
   the GLOBUS_LOCATION and EDG_LOCATION tags in configuration files. */
#define GLOBUS_LOCATION_  /opt/globus
#define EDG_LOCATION_     /opt/edg

/* COMMON GRID DEFINITIONS ------------------------------------------------ */
/* CE AND SE HOST NAMES. These are defined here because they are used in some
   of the site definitions. */
/* ComputingElement hostname */
#define CE_HOSTNAME  lxshare0227.cern.ch
/* StorageElement hostname */
#define SE_HOSTNAME  lxshare0393.cern.ch
Handling multiple CEs and SEs is possible but requires some modifications in the configuration files; this is beyond the scope of this guide.
/* COMMON SITE DEFINITIONS ------------------------------------------------ */
#define LOCALDOMAIN            cern.ch
#define SITE_MAILROOT          SITE_MANAGERS_MAIL_ADDRESS@YOURSITE.ch
#define SITE_GATEWAYS          137.138.1.1
/* Allowed networks (useful for tcpwrappers) */
#define SITE_ALLOWED_NETWORKS  127.0.0.1, 137.138., 128.141.
#define SITE_NAMESERVERS       137.138.16.5 137.138.17.5

Please note that some lists are comma separated while others, like the SITE_NAMESERVERS, are separated by a single white space!
/* The netmask */
#define SITE_NETMASK       255.255.0.0
/* NTP server (domain and hostname) */
#define SITE_NTP_DOMAIN    cern.ch
#define SITE_NTP_HOSTNAME  ip-time-1
/* The time zone */
#define SITE_TIMEZONE      Europe/Paris
/* Site name */
#define SITE_NAME_         CERN-PRO-1-4

This name must be unique inside the whole grid! Make sure you coordinate your choice with the other site administrators.
/* Site EDG version */
#define SITE_EDG_VERSION         v1_4_3
/* Site installation date year month day time */
#define SITE_INSTALLATION_DATE_  20021118120000Z
/* Site distinguished name. */
#define SITE_DN_                 "dc=cern, dc=ch, o=Grid"

You can find this information in the host certificate of your CE node.
/* All the WN (used by /etc/export configuration of /home NFS mount,
   e.g. testbed*.lnl.infn.it. Needed by ComputingElement.h) */
#define SITE_WN_HOSTS  lxshare*.cern.ch,tbed0*.cern.ch,adc*.cern.ch
/* All the SE hosts (comma separated list) */
#define SITE_SE_HOSTS_ SE_HOSTNAME
/* List (comma separated) of the Computing Element(s) of your site */
#define SITE_CE_HOSTS_ CE_HOSTNAME:2119/jobmanager-pbs-short,\
CE_HOSTNAME:2119/jobmanager-pbs-infinite

This is the list of CEs and their local resource managers. 2119 is the port of the gatekeeper.
/* The default configuration of MDS is that there is a GRIS running on each
   functional node (CE, SE). There is a single site-level GIIS running by
   default on the CE. This site-level GIIS then registers to the top-level
   GIIS for the production or development testbed. The details are handled
   via the globuscfg configuration object. */
/* Usually use a name like nikhefpro or nikhefdev for the production or
   development testbeds. */
#define SITE_GIIS          cern
#define SITE_GIIS_HOSTNAME CE_HOSTNAME
/* These point to the next highest level in the MDS hierarchy. Ask to find
   out the parameters for this. At time of tagging these were:
     edgdev on lxshare0372.cern.ch for DEVELOPMENT Testbed
     edgpro on lxshare0373.cern.ch for PRODUCTION (Application) Testbed
   but DO ask to be sure. */
#define TOP_GIIS          edgpro
#define TOP_GIIS_HOSTNAME lxshare0373.cern.ch

For this information you should contact the integration team. Contact information is provided on the WP6 web page.
/* COMMON DEFAULT VALUES -------------------------------------------------- */
/* This defines the default location for the host certificates. If this is
   different for your site define the new value here. If you need to change
   it for the CE or SE separately, see below. */
#define SITE_DEF_HOST_CERT  /etc/grid-security-local/hostcert.pem
#define SITE_DEF_HOST_KEY   /etc/grid-security-local/hostkey.pem
#define SITE_DEF_GRIDMAP    /etc/grid-security/grid-mapfile
#define SITE_DEF_GRIDMAPDIR /etc/grid-security/gridmapdir/

/* DATA MGT PARAMETERS FOR SEVERAL NODE TYPES ---------------------------- */
/* These variables define which VOs your site supports. At least one must be
   defined. For each line the RC and GDMP configurations will be done and on
   the SE a GDMP server will be configured. It also will create 50 accounts
   for each defined VO. You must define the associated password for each of
   the supported VOs. Contact the site administrators to obtain the
   passwords. */
#define SE_VO_ALICE
#define SE_GDMP_REP_CAT_ALICE_PWD ALICE_PASSWORD
#define SE_VO_ATLAS
#define SE_GDMP_REP_CAT_ATLAS_PWD ATLAS_PASSWORD

For this information contact the integration team or the VO managers.
/* COMPUTING ELEMENT DEFINITIONS ------------------------------------------ */
/* Subject of the certificate */
#define CE_CERT_SBJ    "/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0227.cern.ch"
/* Some site and host information (it goes in globus.conf) */
#define CE_HOST_DN     "hn=lxshare0227.cern.ch, dc=cern, dc=ch, o=Grid"
/* Full path of the certificate */
#define CE_CERT_PATH   SITE_DEF_HOST_CERT
/* Full path of the secret key */
#define CE_SECKEY_PATH SITE_DEF_HOST_KEY
/* System administrator e-mail */
#define CE_SYSADMIN    SITE_MAILROOT
/* Space separated job manager list (e.g. fork, pbs, lsf), part of
   globus.conf. NOTE: To support the standard globus commands (in particular
   the globus-job-get-output command) the fork job manager must be listed
   first! I.e. the fork job manager must be the default. */
#define CE_JOBMANAGERS "fork pbs"

Note that fork must not only be first in the list, it must always be present in the list.
/* Batch system adopted by CE (this info goes in info-mds.conf) */
#define CE_BATCHSYSTEM_          pbs
/* Binaries path of the batch system */
#define CE_BATCHSYSTEM_BIN_PATH  /usr/pbs/bin
/* Local queue names */
#define CE_QUEUE_                short,infinite
/* List (comma separated, no spaces) of StorageElement(s) close to this CE */
#define CE_CLOSE_SE_ID_          SE_HOSTNAME
/* Mount point(s) of the SE(s) close to this CE */
#define CE_CLOSE_SE_MOUNTPOINT   /flatfiles/SE00

More information on mount points is given in 4.3.1, where the layout of the shared file system is explained in more detail.
/* Disk description */
#define CE_DISK_DESC 15GB-EIDE
/* CPU description */
#define CE_CPU_DESC DUAL-PIII-800
/* CE InformationProviders: MinPhysMemory */
#define CE_IP_MINPHYSMEM 512
/* CE InformationProviders: MinLocalDiskSpace */
#define CE_IP_MINLOCDISK 2048
/* CE InformationProviders: NumSMPs */
#define CE_IP_NUMSMPS 26
/* CE InformationProviders: MinSPUProcessors */
#define CE_IP_MINSPUPROC 2
/* CE InformationProviders: MaxSPUProcessors */
#define CE_IP_MAXSPUPROC 2
/* CE InformationProviders: MaxSI00. See some examples of SpecInt at
   http://www.specbench.org/osg/cpu2000/results/cint2000.html */
#define CE_IP_MAXSI00 380
/* CE InformationProviders: MinSI00 */
#define CE_IP_MINSI00 380
/* CE InformationProviders: AverageSI00 */
#define CE_IP_AVRSI00 380
/* CE InformationProviders: AFSAvailable */
#define CE_IP_AFS_AFSAVAILABLE FALSE
/* CE InformationProviders: OutboundIP */
#define CE_IP_OUTBOUNDIP TRUE
/* CE InformationProviders: InboundIP */
#define CE_IP_INBOUNDIP FALSE
/* CE InformationProviders: RunTimeEnvironment (1) */
#define CE_IP_RUNTIMEENV1 ATLAS-3.2.1
/* CE InformationProviders: RunTimeEnvironment (2) */
#define CE_IP_RUNTIMEENV2 ALICE-3.07.01
/* CE InformationProviders: RunTimeEnvironment (10) */
/* #define CE_IP_RUNTIMEENV10   ! define it if you need it! */
/* This must be defined for your CE; it indicates that your site is running
   but hasn't yet been certified. Change this to EDG-CERTIFIED once your site
   has been tested by the ITeam. */
#define CE_IP_RUNTIMEENV15 EDG-TEST
/* #define CE_IP_RUNTIMEENV15 EDG-CERTIFIED */

By default 15 runtime environment variables can be defined. It is possible to add more by modifying the CE specific configuration file. It is important that you initially set EDG-TEST. For details about getting certified for the EDG testbed contact the integration team.
/* The mountpoint on the CE of the SE exported area via NFS */
#define CE_MOUNTPOINT_SE_AREA CE_CLOSE_SE_MOUNTPOINT
/* Uncomment the line below if you want to collect and publish data from a
   network monitor */
/* #define NETMON_HOST_ gppnm06.gridpp.rl.ac.uk */

/* STORAGE ELEMENT DEFINITIONS -------------------------------------------- */
/* Full path of the certificate */
#define SE_CERT_PATH   SITE_DEF_HOST_CERT
/* Full path of the secret key */
#define SE_SECKEY_PATH SITE_DEF_HOST_KEY
/* Subject of the SE certificate */
#define SE_CERT_SBJ    "/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0393.cern.ch"
/* Some site and host information (it goes in globus.conf) */
#define SE_HOST_DN     "hn=lxshare0393.cern.ch, dc=cern, dc=ch, o=Grid"
/* System administrator e-mail */
#define SE_SYSADMIN    SITE_MAILROOT
/* List (comma separated without spaces) of ComputingElement(s) close to the SE. */
#define SE_CLOSE_CE_   SITE_CE_HOSTS_
/* The value of SE_SIZE in info-mds.conf */
#define SE_DISKSIZE    15

The SE_DISKSIZE and SE_FILESYSTEMS_ should be set in the node configuration file to reflect the actual available space and configuration.
/* comma separated list without spaces, values used in df to obtain freespace */
#define SE_FILESYSTEMS_ /dev/hda2
/* Disk description */
#define SE_DISK_DESC 15GB-EIDE
/* CPU description */
#define SE_CPU_DESC DUAL-PIII-800
/* SE protocols */
#define SE_PROTOCOLS_ gridftp,rfio,file

Note that the file protocol is only available if you use a shared file system between the SE and the WNs.
/* SE protocols ports */
/* Note that although the IANA port for rfio is 3147, the software by default
   runs on 5001. */
#define SE_PROTOCOL_PORTS_ 2811,5001,
/* GDMP area */
#define SE_GDMP_AREA /flatfiles/SE00
/* List of the supported VO. Add/remove the VO name for each VO that you
   support/do not support in both of the following defines. */
#define SE_GDMP_VOS alice,atlas
#define SE_VO_ alice:SE_GDMP_AREA/alice,atlas:SE_GDMP_AREA/atlas

/* WORKER NODE DEFINITIONS ------------------------------------------------ */
/* The mountpoint on the WN of the SE exported area via NFS. It should be the
   same used for the SE */
#define WN_MOUNTPOINT_SE_AREA CE_MOUNTPOINT_SE_AREA

/* USER INTERFACE DEFINITIONS --------------------------------------------- */
/* Resource broker */
#define UI_RESBROKER lxshare0380.cern.ch
/* Logging and Bookkeeping URL */
#define UI_LOGBOOK https://lxshare0380.cern.ch:7846

If you only want to install and configure a UI, then UI_RESBROKER and UI_LOGBOOK are the only two defines you need to change.
This section should be read before moving on to the following sections describing the installation of individual nodes. Many things will be unclear on a first reading, but they will become clearer later.
The current setup is based on 3 RAID disk servers using EIDE disks. Each server is configured as an NFS server exporting 5 100GB partitions.
All partitions are inserted in a common file system naming schema which follows the pattern /shift/<server>/<disk>, e.g. /shift/lxshare072d/data02.
All client nodes (UI,CE,SE,WN) mount the needed partitions using the standard name as a mount point. Ad hoc links are then created on the nodes to point to these paths (examples of this later).
As we are managing several different testbeds, we added an extra path layer which specifies the testbed which is using a particular section of a file system e.g. /shift/lxshare072d/data02/site_pro-1.3 for EDG 1.3 production site.
For a given testbed, the following disk areas are located on the disk server (in parentheses the nodes mounting the area):
1. User home directories: this area is mounted on all UI nodes, independently of the testbed they belong to. In this way users have a unique working space. This unique home directory structure is associated with a unique user account system based on a NIS server (more on this later). On all hosts the link for this area is (note the absence of the "testbed" path component):
/home -> /shift/lxshare072d/data01/UIhome
2. GRID security directory: the main reason to share this directory tree is that the grid-mapfile and the CA CRL files are regularly updated. Having each host do this independently increases the strain on the servers providing the update information and is prone to misalignments; e.g. a user can start a job on a CE but the job then cannot access the local SE because the user's certificate is not yet accepted by that node. Sharing this directory requires the following setup:
On all CE, SE and WN nodes belonging to the application testbed we have the link:
/etc/grid-security -> /shift/lxshare072d/data02/site_pro-1.3/grid-security

In some cases this directory cannot be shared. At CERN this happens on the RBs and on a special SE node, lxshare0384. In both cases this is because the grid-mapfile has to be different from the standard one: on the RB all authorized certificates must be mapped to the dguser account (the gridmapdir directory is also not needed here), while on the special SE we wanted to limit access to a predefined set of users. On these special nodes the grid-security directory is local, all ca_<site> RPMs are installed, and the update cron jobs are executed.
3. VO users home directories: sharing of these directories between CE and WNs is mandatory (see XX). There is no real need to share this directory with the SE, but since a gatekeeper is also running on that node, doing so reduces the number of directories to keep under control. The corresponding link on the application testbed is:
/home -> /shift/lxshare072d/data02/site_pro-1.3/CEhome
4. SE storage area: sharing of this area between the SE and the CE/WNs allows the activation of the "file" access protocol on the SE. Due to the limitation on partition size on the CERN disk servers, this area spans more than one mounted partition. On each of the nodes sharing this area we created a local /flatfiles directory tree. Within this tree we created links to the actual disk partitions. The structure of the directory tree must of course be rigorously identical on all nodes sharing it.
As an example, this is the content of the /flatfiles on the application testbed at CERN:
[root@lxshare0393]# ls -l /flatfiles/SE00
alice -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/alice
atlas -> /shift/lxshare072d/data03/site_pro-1.3/flatfiles/SE00/atlas
biome -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/biome
cms -> /shift/lxshare072d/data05/site_pro-1.3/flatfiles/SE00/cms
dzero -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/dzero
eo -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/eo
flatfiles -> ..
iteam -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/iteam
lhcb -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/lhcb
tutor -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/tutor
wpsix -> /shift/lxshare072d/data04/site_pro-1.3/flatfiles/SE00/wpsix
From this we see that all VOs share the same disk partition, /shift/lxshare072d/data04, with the exception of atlas and cms, which use /shift/lxshare072d/data03 and /shift/lxshare072d/data05 respectively (note: this layout was chosen as atlas and cms were planning some production tests). Also note the flatfiles -> .. link: this must always be present to allow correct LFN-to-PFN mapping on all nodes.
nfsmount-cfg.h

/*
 * Common nfsmount configuration file
 */
EXTRA(nfsmount.nfsmount) l072d01 l072d02 l072d03 l072d04 l072d05
nfsmount.nfsdetails_l072d01 /shift/lxshare072d/data01 edg-nfs00.cern.ch:\
/shift/lxshare072d/data01 rw,bg,intr,hard
nfsmount.nfsdetails_l072d02 /shift/lxshare072d/data02 edg-nfs00.cern.ch:\
/shift/lxshare072d/data02 rw,bg,intr,hard
nfsmount.nfsdetails_l072d03 /shift/lxshare072d/data03 edg-nfs00.cern.ch:\
/shift/lxshare072d/data03 rw,bg,intr,hard
nfsmount.nfsdetails_l072d04 /shift/lxshare072d/data04 edg-nfs00.cern.ch:\
/shift/lxshare072d/data04 rw,bg,intr,hard
nfsmount.nfsdetails_l072d05 /shift/lxshare072d/data05 edg-nfs00.cern.ch:\
/shift/lxshare072d/data05 rw,bg,intr,hard
Creation of the node-specific links cannot, to our knowledge, be handled by LCFG, so we created node-type specific scripts to be executed on each node after installation.
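As an illustration only (the actual CERN scripts are not reproduced here), a link script for a CE/WN on the application testbed could look roughly like the following sketch, using the paths from the examples above; adjust it to your own layout:

#!/bin/bash
# Hypothetical sketch of a post-installation link script for CE/WN nodes.
BASE=/shift/lxshare072d/data02/site_pro-1.3

# Replace the local /home with a link to the shared VO home area.
if [ -d /home ] && [ ! -L /home ]; then
    mv /home /home.local
fi
ln -sfn ${BASE}/CEhome /home

# Replace /etc/grid-security with a link to the shared security directory.
if [ -d /etc/grid-security ] && [ ! -L /etc/grid-security ]; then
    mv /etc/grid-security /etc/grid-security.local
fi
ln -sfn ${BASE}/grid-security /etc/grid-security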
As not all users participating in the EDG project are CERN users, we set up a user account system independent from the standard CERN one, which is based on AFS. To keep it simple, we used a standard NIS server installed on the disk server which also hosts the user home directory disk: this allows us to create users and home directories with a single command.
We then configured all UIs as NIS clients for the "edg-tb" NIS domain. This is only partially handled by LCFG. A full configuration requires the inclusion of a NIS client specific configuration file in the LCFG set up and then the execution of a script on each node.
The LCFG part is:
nisclient-cfg.h:

+auth.nsswitch ignore
+update.ypserver edg-nfs00.cern.ch
EXTRA(boot.services) nsswitch
EXTRA(boot.run) nsswitch
EXTRA(profile.components) nsswitch
+nsswitch.mods_passwd compat
+nsswitch.mods_shadow files nis
+nsswitch.mods_group files nis
+nsswitch.mods_hosts files dns [NOTFOUND=return]
+nsswitch.mods_netgroup files nis

This only creates the /etc/nsswitch.conf and /etc/ypserv.conf files.
The script to be executed on each node takes care of enabling and starting the ypbind server, also configuring the system DOMAINNAME variable:
/scripts/setup_enableNIS.sh:

#!/bin/bash
# Set the NISDOMAIN once and for all
domainname edg-tb
sed -e "s/^NISDOMAIN=.*$/NISDOMAIN=edg-tb/" /etc/sysconfig/network > /etc/sysconfig/network.new
mv -f /etc/sysconfig/network /etc/sysconfig/network.old
mv -f /etc/sysconfig/network.new /etc/sysconfig/network
# Add the NIS entries to passwd and group
#echo "ypserver edg-nfs00.cern.ch" >> /etc/yp.conf
echo "+::::::" >> /etc/passwd
echo "+:::" >> /etc/group
# all is ready: start the ypbind daemon
/etc/rc.d/init.d/ypbind start
/sbin/chkconfig ypbind on
# Once NIS client is up, groups exist and the auth object can be executed
cd /etc/obj
bin/runobj auth start
The User Interface Machine contains the client software necessary to communicate with the Resource Broker as described in the Users Guide document.
The list of RPM packages to be installed for each version of EDG software can be viewed and downloaded at the following address: http://datagrid.in2p3.fr/autobuild/rh6.2/rpmlist
The RPMs for each EDG component (CE, SE, ...) are divided into several categories (CA, Globus, EDG, ...); this allows you to install only the required components.
If you are upgrading a machine where a previous version of EDG is already installed, it is strongly recommended to uninstall the EDG software first.
To install the EDG and Globus software you need super-user privileges. All the commands listed below are assumed to be executed as root.
Add the following library directories to /etc/ld.so.conf:

/opt/globus/lib
/opt/edg/lib
/opt/globus-24/lib

Then run /bin/ldconfig.
These 5 steps are common to most nodes. Steps 6 and 7 are specific to a UI node.
Modify the site-cfg.h file in the source directory of your LCFG server. Apart from general settings you have to change the defines that represent the resource broker and the L&B server:
/* Resource broker */
#define UI_RESBROKER lxshare0380.cern.ch
/* Logging and Bookkeeping URL */
#define UI_LOGBOOK https://lxshare0380.cern.ch:7846
Start the update or installation of your UI node as described in the LCFG guide. No additional manual intervention is needed.
A user may wish to customize the UI configuration file, for example to use a different resource broker or to change the sandbox location. This can be done by copying the standard UI configuration file, editing it appropriately, and then setting the environment variable EDG_WL_UI_CONFIG_PATH to point to the new configuration file.
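A minimal sketch of this, assuming the standard UI configuration file lives under /opt/edg/etc (the file name used here is an assumption; check your installation):

# Copy the standard UI configuration file to a private location and edit it.
mkdir -p ~/.edg
cp /opt/edg/etc/UI_ConfigENV.cfg ~/.edg/my_ui.cfg   # file name is an assumption
vi ~/.edg/my_ui.cfg                                 # e.g. change the RB or sandbox path

# Point the workload management tools at the customized file.
export EDG_WL_UI_CONFIG_PATH=~/.edg/my_ui.cfg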
A rudimentary test of the user interface is to submit a "Hello World" example. Put the following into a file called hello.jdl:
Executable = "/bin/echo";
Arguments = "Hello";
StdOutput = "hello.out";
StdError = "hello.err";
OutputSandbox = {"hello.out","hello.err"};
Rank = other.MaxCpuTime;

and submit this job with
dg-job-submit hello.jdl

The status of the job can be obtained with dg-job-status using the job identifier returned from the submit command. The output can be retrieved with dg-job-get-output, again with the job identifier. The hello.out file should contain the word "Hello".
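A typical session could therefore look like the following sketch (the job identifier and output directory shown are made up):

dg-job-submit hello.jdl
# ... returns a job identifier, e.g. https://lxshare0380.cern.ch:7846/... (made up)
dg-job-status 'https://lxshare0380.cern.ch:7846/...'
dg-job-get-output 'https://lxshare0380.cern.ch:7846/...'
# The output is written to a local directory reported by the command:
cat <output-directory>/hello.out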
For a better introduction to using edg consult the EDG User Guide which can be found on WP6's web page.
A computing element consists of a gatekeeper and optionally a set of worker nodes joined by a local resource management system (batch system). If the computing element contains worker nodes, then the home areas of all of the accounts must be on a common shared file system with the gatekeeper node.
Follow the steps 1-5 described for the UI machine, then:
GLOBUS_LOCATION=/opt/globus-24
GLOBUS_CONFIG=/etc/globus2.conf
setenv GLOBUS_LOCATION /opt/globus-24
$GLOBUS_LOCATION/sbin/globus-initialization.sh
/etc/globus.conf
/etc/edg/info-mds.conf

Examples for these files will be given later. The file /etc/globus2.conf is used for configuring the EDG information system on your site. Below you will find the file from the CE host ccgridli03.in2p3.fr, which runs a GRIS and a GIIS for the site cc-in2p3 and registers to the EDG applications testbed information system edgpro on host lxshare0373.cern.ch. The user edginfo is used to run the LDAP daemons needed by the information system.
[common]
X509_USER_CERT=/etc/grid-security/hostcert.pem
X509_USER_KEY=/etc/grid-security/hostkey.pem
GRIDMAP=/etc/grid-security/grid-mapfile
[mds]
user=edginfo
[mds/gris/provider/gg]
provider=globus-gris
[mds/gris/provider/ggr]
provider=globus-gram-reporter
[mds/gris/provider/edg]
[mds/gris/registration/cc-in2p3]
regname=cc-in2p3
reghn=ccgridli03.in2p3.fr
[mds/giis/cc-in2p3]
name=cc-in2p3
[mds/giis/cc-in2p3/registration/edgpro]
regname=edgpro
reghn=lxshare0373.cern.ch
[gridftp]
/sbin/chkconfig globus-gatekeeper on
/etc/rc.d/init.d/globus-gatekeeper start
/sbin/chkconfig globus-mds on
/etc/rc.d/init.d/globus-mds start
/sbin/chkconfig globus-gsincftp on
/etc/rc.d/init.d/globus-gsincftp start
/sbin/chkconfig locallogger on
/etc/rc.d/init.d/locallogger start
echo 480000 > /proc/sys/fs/inode-max
echo 120000 > /proc/sys/fs/file-max
cp -f /etc/rc.d/rc.local /etc/rc.d/rc.local.orig
cat >> /etc/rc.d/rc.local <<EOD
# Increase some system parameters to improve EDG CE scalability
if [ -f /proc/sys/fs/inode-max ]; then
    echo 480000 > /proc/sys/fs/inode-max
fi
if [ -f /proc/sys/fs/file-max ]; then
    echo 120000 > /proc/sys/fs/file-max
fi
EOD
# This creates one empty file per pool account (user names of the form <vo>NNN
# in /etc/passwd); it is presumably meant to be run inside the gridmapdir
# directory (/etc/grid-security/gridmapdir).
touch `egrep "[a-z]+[0-9][0-9][0-9]" /etc/passwd | cut -d ":" -f 1`
echo "lxshare0227.cern.ch" > /usr/spool/PBS/server_name
lxshare0378.cern.ch np=2 edgpro

"edgpro" is an arbitrary name which has been used to configure the server. "np" sets the number of concurrent jobs which can be run on the node.
/etc/rc.d/init.d/pbs stop
/etc/rc.d/init.d/pbs start

Make the daemon startup persistent with:
/sbin/chkconfig pbs on
/usr/pbs/bin/qmgr < /usr/spool/PBS/pbs_server.conf

A sample configuration supporting only two queues is given here:
#
# Create queues and set their attributes.
#
#
# Create and define queue short
#
create queue short
set queue short queue_type = Execution
set queue short resources_max.cput = 00:15:00
set queue short resources_max.walltime = 02:00:00
set queue short enabled = True
set queue short started = True
#
# Create and define queue infinite
#
create queue infinite
set queue infinite queue_type = Execution
set queue infinite resources_max.cput = 72:00:00
set queue infinite resources_max.walltime = 240:00:00
set queue infinite enabled = True
set queue infinite started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = False
set server managers = root@lxshare0227.cern.ch
set server operators = root@lxshare0227.cern.ch
set server default_queue = short
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server default_node = edgpro
set server node_pack = False
The Local Centre Authorization Service (LCAS) handles authorization requests to the local computing fabric.
In this release the LCAS is a shared library, which is loaded dynamically by the globus gatekeeper. The gatekeeper has been slightly modified for this purpose and will from now on be referred to as edg-gatekeeper.
The authorization decision of the LCAS is based upon the user's certificate and the job specification in RSL (JDL) format. The certificate and RSL are passed to (plug-in) authorization modules, which grant or deny access to the fabric. Three standard authorization modules are provided by default.
For installation and configuration instructions for the edg-gatekeeper and the LCAS modules, see the LCAS website.
The Gatekeeper must have a valid host certificate and key installed in the /etc/grid-security directory. These are usually links to files in the /etc/grid-security-local directory.
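A minimal sketch of this setup, assuming the certificate and key have been placed in /etc/grid-security-local as in the CERN examples (the same links are shown later for the Resource Broker):

# Link the host certificate and key into the standard location and make
# sure the private key is only readable by root.
ln -s /etc/grid-security-local/hostcert.pem /etc/grid-security/hostcert.pem
ln -s /etc/grid-security-local/hostkey.pem /etc/grid-security/hostkey.pem
chmod 644 /etc/grid-security-local/hostcert.pem
chmod 400 /etc/grid-security-local/hostkey.pem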
The Gatekeeper must have all of the security RPMs installed. In addition, the daemon which updates the certificate revocation lists (see 8.2.2) and the one which updates the grid mapfile (see 23) must also be running. An example mkgridmap configuration file can be found on the Testbed website. The example maps users into pooled accounts based on membership in a virtual organization. These tasks are done by cron jobs.
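A sketch of the corresponding /etc/crontab entries (the same cron scripts are used later in this guide for the Resource Broker; the exact schedule is up to the site):

# Periodically regenerate the grid-mapfile and fetch fresh CRLs.
53 1,7,13,19 * * * root /opt/edg/etc/cron/mkgridmap-cron
53 1,7,13,19 * * * root /opt/edg/etc/cron/edg-fetch-crl-cron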
Here is an example for a /etc/globus.conf file as it is used on the CERN application testbed. The node is lxshare0227.cern.ch.
GLOBUS_LOCATION=/opt/globus
GLOBUS_GATEKEEPER_SUBJECT="/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0227.cern.ch"
GLOBUS_HOST_DN="hn=lxshare0227.cern.ch, dc=cern, dc=ch, o=Grid"
GLOBUS_ORG_DN="dc=cern, dc=ch, o=Grid"
GLOBUS_GATEKEEPER_HOST="lxshare0227.cern.ch"
GATEKEEPER_PORT=2119
GATEKEEPER_LOG=/var/log/globus-gatekeeper.log
X509_CERT_DIR=/etc/grid-security/certificates
X509_GATEKEEPER_CERT=/etc/grid-security-local/hostcert.pem
X509_GATEKEEPER_KEY=/etc/grid-security-local/hostkey.pem
GRIDMAP=/etc/grid-security/grid-mapfile
GLOBUS_JOBMANAGERS="fork pbs"
GSIWUFTPPORT=2811
GSIWUFTPDLOG=/var/log/gsiwuftpd.log
GLOBUS_FLAVOR_NAME=gcc32dbg
GRID_INFO_EDG=yes
GRIDMAPDIR=/etc/grid-security/gridmapdir/
GRID_INFO_GRIS_REG_GIIS=cern
GRID_INFO_GRIS_REG_HOST=lxshare0227.cern.ch
GLOBUS_GATEKEEPER=/opt/edg/sbin/edg-gatekeeper
GLOBUS_GATEKEEPER_OPTIONS="-lcas_dir /opt/edg/etc/lcas -lcasmod_dir /opt/edg/lib/lcas/"
GLOBUS_GSIWUFTPD_UMASK=002
GRID_INFO_GRIS=yes
GRID_INFO_USER=edginfo
X509_GSIWUFTPD_CERT=/etc/grid-security-local/hostcert.pem
X509_GSIWUFTPD_KEY=/etc/grid-security-local/hostkey.pem
GLOBUS_GRAM_JOB_MANAGER_QSUB=/usr/pbs/bin/qsub
GLOBUS_GRAM_JOB_MANAGER_QDEL=/usr/pbs/bin/qdel
GLOBUS_GRAM_JOB_MANAGER_QSTAT=/usr/pbs/bin/qstat
GLOBUS_GRAM_JOB_MANAGER_MPIRUN=/usr/pbs/bin/qrun
GLOBUS_GRAM_JOB_MANAGER_QSELECT=/usr/pbs/bin/qselect
To configure the /etc/edg/info-mds.conf start with what is present in /etc/edg/info-mds.conf.in.
The example given is taken from the CERN CE lxshare0227.cern.ch. Read the section about the site-cfg.h file and the part about installing via LCFG to get more information about the parameters.
WP3_DEPLOY=/opt/edg/info/mds
FTREE_INFO_PORT=2171
FTREE_DEBUG_LEVEL=0
SITE_DN=Mds-Vo-Name=local,o=Grid
SITE_INFO=yes
SITE_NAME=CERN-PRO-1-4
SITE_INSTALLATION_DATE=20021118120000Z
SITE_CPU_RESOURCE_DESCRIPTION=DUAL-PIII-800
SITE_DISK_RESOURCE_DESCRIPTION=15GB-EIDE
SITE_SYSADMIN_CONTACT=hep-project-grid-cern-testbed-managers@cern.ch
SITE_USER_SUPPORT_CONTACT=hep-project-grid-cern-testbed-managers@cern.ch
SITE_SECURITY_CONTACT=hep-project-grid-cern-testbed-managers@cern.ch
SITE_DATAGRID_VERSION=v1_4_3
SITE_SE_HOSTS=lxshare0393.cern.ch
SITE_CE_HOSTS=lxshare0227.cern.ch:2119/jobmanager-pbs-short,\
lxshare0227.cern.ch:2119/jobmanager-pbs-infinite
NETMON_PRESENT=no
NETMON_PINGER_HOST=lxshare0227.cern.ch
CE_PRESENT=yes
CE_HOST=lxshare0227.cern.ch
CE_BATCHSYSTEM=pbs
CE_CLUSTER_BATCH_SYSTEM_BIN_PATH=/usr/pbs/bin
CE_STATIC_LDIF=/opt/edg/info/mds/etc/ldif/ce-static.ldif
CE_QUEUE=medium,long,short,infinite
CE_CLOSE_SE_ID=lxshare0393.cern.ch
CE_CLOSE_SE_MOUNT_POINT=/flatfiles/SE00
GRID_INFO_USER=edginfo
SITE_NETMON_HOST=no
SITE_NETMON_HOSTS=none
Next copy the file /opt/edg/info/mds/etc/ldif/ce-static.ldif.in to /opt/edg/info/mds/etc/ldif/ce-static.ldif and modify it to reflect the local environment. SpecInt2000 benchmarks can be found at the SPEC website. (For valid tags for RunTimeEnvironment see 23.)
The file has been taken from the CE on lxshare0227.cern.ch. The nodes are dual 800MHz PIII nodes. The text on the right hand side is not part of the configuration; it has been put here to provide some description of the parameters.
Architecture: intel          The architecture of the hosts composing the CE
OpSys: RH 6.2                The operating system of the hosts composing the CE
MinPhysMemory: 512           Minimum value of the physical memory of any WN
MinLocalDiskSpace: 2048      The minimum local disk footprint
NumSMPs: 26                  Number of SMP hosts
MinSPUProcessors: 2          The minimum number of SPU processors (for SMP hosts)
MaxSPUProcessors: 2          The maximum number of SPU processors (for SMP hosts)
AverageSI00: 380             Average of the SpecInt2000 benchmark of the WNs
MinSI00: 380                 Minimum value of the SpecInt2000 benchmark of the WNs
MaxSI00: 380                 Maximum value of the SpecInt2000 benchmark of the WNs
AFSAvailable: FALSE          Defines if AFS is installed
OutboundIP: TRUE             Defines if outbound connectivity is allowed
InboundIP: FALSE             Defines if inbound connectivity is allowed
RunTimeEnvironment: CMS-1.1.0
RunTimeEnvironment: ATLAS-3.2.1
RunTimeEnvironment: ALICE-3.07.01
RunTimeEnvironment: LHCb-1.1.1
RunTimeEnvironment: IDL-5.4
RunTimeEnvironment: CERN-MSS
RunTimeEnvironment: CMSIM-125
RunTimeEnvironment: CERN-PRO-1-4
RunTimeEnvironment: CMS-STRESSTEST-1.0.0
RunTimeEnvironment: EDG-TEST
Once the globus-mds service is started, to see which information is published you can use:
ldapsearch -LLL -x -H \ ldap://<CE-hostname>:2135 -b "mds-vo-name=local,o=grid" "(objectClass=*)"
After making the appropriate configuration changes to site-cfg.h, the file ComputingElement-cfg.h might need minor changes. These could be in the area of the LCAS object configuration and the parts that refer to the NFS configuration used. Check the users.h file and add or remove the virtual users appropriate for the VOs that you support.
Another configuration file that should be checked for current information is rc-cfg.h. Make sure that the information about the replica catalogs given for the different supported VOs is correct. This information can be obtained from the VO managers or the Integration Team.
If you plan to use PBS, include pbs-cfg.h in the node-specific configuration file after ComputingElement-cfg.h; this order is needed due to dependencies. If you plan to use your CE additionally as a WN, replace pbs-cfg.h with pbsexechost-cfg.h and configure it as described in the section about setting up a WN.
Install the CE using LCFG. A few minor manual changes are then required: follow the steps described in the Manual Configuration section from item 6 to the end. Reboot the machine.
Check that the following services are running: pbs, globus-gsi_wuftpd, globus-gatekeeper, globus-mds and locallogger.
Check the information published by the node:
ldapsearch -LLL -x -H ldap://<CE-hostname>:2135 -b "mds-vo-name=local,o=grid" "(objectClass=*)"
Follow the instructions given in the Users Guide and run a globus-job-run command from a UI node.
In a typical configuration all of the authorization for the worker node is handled by the associated gatekeeper. FTP transfers to and from the node are usually allowed, but they must be initiated by the worker node as it does not run an FTP daemon.
As a consequence, the machine does not need to have a host certificate/key, a grid-mapfile, or the security RPMs installed. On the other hand, the /etc/grid-security/certificates directory must exist to allow the WNs to verify user and host certificates.
Note: A Worker Node cannot be a Storage Element as well and cannot host a Resource Broker or the Logging and Bookkeeping services.
Perform steps 1-5 of the UI installation, then configure the GDMP client in the same way as on the UI.
Install the node(s). Check that the rc object has been run and has set the access privileges for the /opt/edg/etc/<VO> directories correctly. They have to be rwxr-xr-x and the owner must be set to root:<VOgrp>.
Access control to files is managed in the following way. On the SE a group ID per VO has been created. All users belonging to a given VO have their certificates mapped in the grid-mapfile of the SE to a local user with the VO's group ID. The directory used by GDMP to replicate files is group writable by the VO group ID only and has the group sticky bit set (see the details further in these instructions). This prevents users belonging to other VOs from writing to or using the directory or the files it contains.
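As a sketch, using the alice VO and the CERN storage area as an example (group name, mode and paths may differ at your site):

# Make the alice storage/GDMP directory owned by root and the alice group,
# group-writable, with the setgid ("group sticky") bit so new files inherit the group.
chown root:alice /flatfiles/SE00/alice
chmod 2775 /flatfiles/SE00/alice
# Check the result; the directory should show up as drwxrwsr-x root alice.
ls -ld /flatfiles/SE00/alice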
The storage element must also run a gatekeeper and an FTP daemon. See the appropriate sections for the proper configuration of these daemons.
To run a storage element a host certificate is required.
If you want to provide the ``file'' access protocol you have to use a shared file system and make the storage area accessible from the WNs. On most systems this area is at the location /flatfiles and contains directories for the various VOs.
If you want to provide RFIO access, install the castor-rfio and castor-rfio-devel RPMs. To use them, make sure that the $PATH variable includes the path to the RFIO commands.
There is a script "rfiod.scripts" in the rfio sub-directory of the castor distribution that can be used to start/stop/restart/check the presence of rfiod (i.e. to start it, run rfiod.scripts start). This works for all Unix platforms.
It is not advisable to use inetd to start rfiod, as the requests can come very quickly and cause inetd to think the process is looping and refuse to start a new rfiod.
[common]
GLOBUS_LOCATION=/opt/globus-24
globus_flavor_name=gcc32dbg
X509_USER_CERT=/etc/grid-security-local/hostcert.pem
X509_USER_KEY=/etc/grid-security-local/hostkey.pem
GRIDMAP=/etc/grid-security/grid-mapfile
GRIDMAPDIR=/etc/grid-security/gridmapdir/
[mds]
user=edginfo
[mds/gris/provider/gg]
provider=globus-gris
[mds/gris/provider/ggr]
provider=globus-gram-reporter
[mds/gris/provider/edg]
[mds/gris/registration/site]
regname=cern
reghn=lxshare0227.cern.ch
[gridftp]
GDMP_SHARED_CONF=/opt/edg/etc/gdmp.shared.conf
GDMP_SERVICE_NAME=host/lxshare0393.cern.ch
GDMP_VIRTUAL_ORG=alice
GDMP_CONFIG_DIR=/opt/edg/etc/alice
GDMP_VAR_DIR=/opt/edg/var/alice
GDMP_TMP_DIR=/opt/edg/tmp/alice
GDMP_GRID_MAPFILE=/opt/edg/etc/alice/grid-mapfile
GDMP_SERVER_PROXY=/opt/edg/etc/gdmp_server.proxy
GDMP_PRIVATE_CONF=/opt/edg/etc/alice/gdmp.private.conf
GDMP_STORAGE_DIR=/flatfiles/SE00/alice
GDMP_STAGE_FROM_MSS=/opt/edg/alice/bin/stage_from_mss.sh
GDMP_STAGE_TO_MSS=/opt/edg/alice/bin/stage_to_mss.sh

and for /opt/edg/etc/alice/gdmp.private.conf:
GDMP_REP_CAT_HOST=ldap://grid-vo.nikhef.nl:10489
GDMP_REP_CAT_NAME=AliceReplicaCatalog
GDMP_REP_CAT_MANAGER_CN=Manager
GDMP_REP_CAT_MANAGER_PWD=THE PASSWORD
GDMP_REP_CAT_CN=dc=eu-datagrid,dc=org
GDMP_REP_CAT_FILE_COLL_NAME=Alice WP1 Repcat
GDMP_REP_CAT_MANAGER_DN=cn=${GDMP_REP_CAT_MANAGER_CN},rc=${GDMP_REP_CAT_NAME},\
${GDMP_REP_CAT_CN}
GDMP_REP_CAT_URL=${GDMP_REP_CAT_HOST}/rc=${GDMP_REP_CAT_NAME},${GDMP_REP_CAT_CN}
GDMP_REP_CAT_FILE_COLL_URL=${GDMP_REP_CAT_HOST}/lc=${GDMP_REP_CAT_FILE_COLL_NAME},\
rc=${GDMP_REP_CAT_NAME},${GDMP_REP_CAT_CN}
GDMP_REP_CAT_OBJECTIVITY_COLL_URL=${GDMP_REP_CAT_HOST}/lc=${GDMP_REP_CAT_OBJYFILE_COLL_NAME},\
rc=${GDMP_REP_CAT_NAME},${GDMP_REP_CAT_CN}

Note that this file contains the password for the VO specific replica catalog, which you can get from the VO manager or the Integration Team.
/sbin/chkconfig globus-gatekeeper on
/etc/rc.d/init.d/globus-gatekeeper start
/sbin/chkconfig globus-mds on
/etc/rc.d/init.d/globus-mds start
/sbin/chkconfig globus-gsincftp on
/etc/rc.d/init.d/globus-gsincftp start

Note: the GDMP server is started by 'inetd'.
There are now a few steps that the manual and LCFG based installation have in common.
# As on the CE, create one empty file per pool account (presumably run inside
# the gridmapdir directory).
touch `egrep "[a-z]+[0-9][0-9][0-9]" /etc/passwd | cut -d ":" -f 1`
After the initial installation and configuration make sure that a correct /opt/edg/etc/mkgridmap.conf file has been created.
Apart from the VO specific lines giving the LDAP addresses of the VOs, on the SE this file has to contain the special storage element VO.
To get some orientation have a look at the file used at CERN.
#### GROUP: group URI [lcluser]
#
# EDG Standard Virtual Organizations
group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org .alice
group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org .atlas
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org .cms
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org .lhcb
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=biomedical,dc=eu-datagrid,dc=org .biome
group ldap://grid-vo.nikhef.nl/ou=tb1users,o=earthob,dc=eu-datagrid,dc=org .eo
group ldap://marianne.in2p3.fr/ou=ITeam,o=testbed,dc=eu-datagrid,dc=org .iteam
group ldap://marianne.in2p3.fr/ou=wp6,o=testbed,dc=eu-datagrid,dc=org .wpsix
group ldap://grid-vo.nikhef.nl/ou=testbed1,o=dzero,dc=eu-datagrid,dc=org .dzero
group ldap://marianne.in2p3.fr/ou=EDGtutorial,o=testbed,dc=eu-datagrid,dc=org .tutor
#
# Other Virtual Organizations
#group ldap://grid-vo.cnaf.infn.it/ou=testbed1,o=infn,c=it .infngrid
#group ldap://vo.gridpp.ac.uk/ou=testbed,dc=gridpp,dc=ac,dc=uk .gridpp
#group ldap://babar-vo.gridpp.ac.uk/ou=babar,dc=gridpp,dc=ac,dc=uk .babar
#
# Following group is to get SE (GDMP) host certs ...
#group ldap://grid-vo.nikhef.nl/ou=devtb,o=gdmpservers,dc=eu-datagrid,dc=org gdmp
group ldap://grid-vo.nikhef.nl/ou=apptb,o=gdmpservers,dc=eu-datagrid,dc=org gdmp

#### Optional - DEFAULT LOCAL USER: default_lcluser lcluser
#default_lcluser .

#### Optional - AUTHORIZED VO: auth URI
auth ldap://grid-vo.nikhef.nl/ou=People,o=gdmpservers,dc=eu-datagrid,dc=org
auth ldap://marianne.in2p3.fr/ou=People,o=testbed,dc=eu-datagrid,dc=org

#### Optional - ACL: deny|allow pattern_to_match
#allow *INFN*

#### Optional - GRID-MAPFILE-LOCAL
#gmf_local /opt/edg/etc/grid-mapfile-local
The grid-mapfile-local file contains a list of certificates which will be included in addition to the items added during the periodic update of the file.
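The entries use the usual grid-mapfile format, one certificate subject per line mapped to a local account or account pool. A hedged example (the user DN is invented; the host DN is taken from the CERN SE used throughout this guide):

# Hypothetical entries in /opt/edg/etc/grid-mapfile-local
"/O=Grid/O=CERN/OU=cern.ch/CN=Some User" .alice
"/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0393.cern.ch" gdmp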
Here the differences between manual and LCFG based installation are marginal.
For Testbed 1, the resource broker machine contains the resource broker itself, the job submission service, and a logging and bookkeeping server. The information index has been moved to a different node and is replaced by the BDII, which is based on a standard LDAP server using the schemas previously used by the II. Each of these must be configured, as well as some external software upon which they depend.
The resource broker machine must also be running a grid-ftp daemon.
For full functionality, sendmail must be available on the resource broker machine and must be in the path of the user running the various daemons.
Due to some limitations of the RB, multiple RBs have to be deployed in the application testbed. This means that if your site runs many UIs with active users, you should consider setting up additional RBs.
The resource broker must have a valid host certificate and key installed in the /etc/grid-security directory. In addition, copies of these files must be in the .hostcert subdirectory of the account running the resource broker daemons, usually dguser.
The resource broker must have all of the security RPMs installed. In addition, the daemon which updates the certificate revocation lists (see 8.2.2) and that which updates the grid mapfile (see 23) must also be running. An example mkgridmap configuration file can be found on the EDG documentation web page.
The resource broker relies on CondorG and the ClassAds from the Condor team. The RPMs for these packages must be installed and can be obtained from the EDG package repository.
CondorG runs several daemons under an unprivileged account. You must create this account before installing CondorG. The recommended name is "dguser".
First the procedures will be given that are unique to the manual part. After this the LCFG based procedure will be described and then the common, additional manual configuration steps are given.
Then configure /etc/globus.conf following the example given here:
GLOBUS_LOCATION=/opt/globus
GLOBUS_HOST_DN="hn=lxshare0227.cern.ch, dc=cern, dc=ch, o=Grid"
GLOBUS_ORG_DN="dc=cern, dc=ch, o=Grid"
GRIDMAP=/etc/grid-security/grid-mapfile
GRIDMAPDIR=/etc/grid-security/gridmapdir/
GSIWUFTPPORT=2811
GSIWUFTPDLOG=/var/log/gsiwuftpd.log
GLOBUS_FLAVOR_NAME=gcc32dbg
X509_GATEKEEPER_CERT=/etc/grid-security-local/hostcert.pem
X509_GATEKEEPER_KEY=/etc/grid-security-local/hostkey.pem
X509_GSIWUFTPD_CERT=/etc/grid-security-local/hostcert.pem
X509_GSIWUFTPD_KEY=/etc/grid-security-local/hostkey.pem
Create the required local users: mysql, postgres and dguser.
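A minimal sketch of this step (home directories and shells are assumptions; the mysql and postgres accounts may already have been created by the respective RPMs, and at CERN dguser had UID/GID 2002, cf. the CONDOR_IDS setting further below):

# Create the accounts only if they do not exist yet.
id mysql    >/dev/null 2>&1 || /usr/sbin/useradd -r -d /var/lib/mysql -s /bin/false mysql
id postgres >/dev/null 2>&1 || /usr/sbin/useradd -r -d /var/lib/pgsql -s /bin/bash  postgres
id dguser   >/dev/null 2>&1 || /usr/sbin/useradd -m -d /home/dguser   -s /bin/bash  dguser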
On the RB there are several services that require access to a valid proxy. The proxies are generated by the SysV startup script with a default time of 24 hours. Add
57 2,8,14,20 * * * root service broker proxy
57 2,8,14,20 * * * root service jobsubmission proxy
57 2,8,14,20 * * * root service lbserver proxy
57 2,8,14,20 * * * root service locallogger proxy

to /etc/crontab. In addition the grid-mapfile and the CRL files have to be updated on a regular basis, so also add:
53 1,7,13,19 * * * root /opt/edg/etc/cron/mkgridmap-cron
53 1,7,13,19 * * * root /opt/edg/etc/cron/edg-fetch-crl-cron
# Increase some default system parameters for our greedy RB
echo 480000 > /proc/sys/fs/inode-max
echo 120000 > /proc/sys/fs/file-max
echo 1024 7999 > /proc/sys/net/ipv4/ip_local_port_range

# To make these modifications permanent, we add them to rc.local
cp -f /etc/rc.d/rc.local /etc/rc.d/rc.local.orig
cat >> /etc/rc.d/rc.local <<EOD
# Increase some system parameters to improve EDG RB scalability
if [ -f /proc/sys/fs/inode-max ]; then
    echo 480000 > /proc/sys/fs/inode-max
fi
if [ -f /proc/sys/fs/file-max ]; then
    echo 120000 > /proc/sys/fs/file-max
fi
if [ -f /proc/sys/net/ipv4/ip_local_port_range ]; then
    echo 1024 7999 > /proc/sys/net/ipv4/ip_local_port_range
fi
EOD
ln -s /etc/grid-security-local/hostkey.pem /etc/grid-security/hostkey.pem
ln -s /etc/grid-security-local/hostcert.pem /etc/grid-security/hostcert.pem
mkdir /home/dguser/.hostcert
cp /etc/grid-security-local/* /home/dguser/.hostcert/
chown -R dguser:dguser /home/dguser/.hostcert
mv /opt/edg/etc/mkgridmap.conf /opt/edg/etc/mkgridmap.conf.orig
cat > /opt/edg/etc/mkgridmap.conf.rb <<EOD
group ldap://marianne.in2p3.fr/ou=guidelines,o=testbed,dc=eu-datagrid,dc=org dguser
auth ldap://marianne.in2p3.fr/ou=People,o=testbed,dc=eu-datagrid,dc=org
gmf_local /opt/edg/etc/grid-mapfile-local
EOD
cp /opt/edg/etc/mkgridmap.conf.rb /opt/edg/etc/mkgridmap.conf
su dguser /opt/CondorG/setup.sh
# .bashrc
# User specific aliases and functions
if [ -f ~/workload_setup.sh ]; then
    . ~/workload_setup.sh
fi
# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

and workload_setup.sh:
# Point to the CondorG installation path and configuration file.
CONDORG_INSTALL_PATH=/home/dguser/CondorG
export CONDORG_INSTALL_PATH
CONDOR_CONFIG=$CONDORG_INSTALL_PATH/etc/condor_config
export CONDOR_CONFIG

# Replica catalog API is needed by resource broker.
GDMP_INSTALL_PATH=/opt/edg
export GDMP_INSTALL_PATH

# Setup the user and database area for the postgresql database.
# This is used by the resource broker.
PGSQL_USER=postgres
export PGSQL_USER
PGDATA=/opt/data
export PGDATA
PGSQL_INSTALL_PATH=/usr/bin/psql
export PGSQL_INSTALL_PATH

# Add paths to the shared library path.
for p in \
    "${CONDORG_INSTALL_PATH}/lib" \
    "${GDMP_INSTALL_PATH}/lib"
do
    if ! printenv LD_LIBRARY_PATH | grep -q "${p}"; then
        if [ -n "${LD_LIBRARY_PATH}" ]; then
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${p}"
        else
            LD_LIBRARY_PATH="${p}"
        fi
    fi
done
export LD_LIBRARY_PATH

# Add condor binaries to the path.
for p in \
    "$CONDORG_INSTALL_PATH/sbin" \
    "$CONDORG_INSTALL_PATH/bin" \
    "/usr/sbin"
do
    if ! printenv PATH | grep -q "${p}"; then
        PATH="${p}:${PATH}"
    fi
done
export PATH

# MUST add the libraries for the 2.95.2 run time libraries.
for p in \
    "/usr/local/lib"
do
    if ! printenv LD_LIBRARY_PATH | grep -q "${p}"; then
        if [ -n "${LD_LIBRARY_PATH}" ]; then
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${p}"
        else
            LD_LIBRARY_PATH="${p}"
        fi
    fi
done
export LD_LIBRARY_PATH
SKIP_AUTHENTICATION = YES
AUTHENTICATION_METHODS = CLAIMTOBE
DISABLE_AUTH_NEGOTIATION = TRUE
GRIDMANAGER_CHECKPROXY_INTERVAL = 600
GRIDMANAGER_MINIMUM_PROXY_TIME = 180

(changing the hostname to the host of your resource broker) and modify the following parameters to have the given values:
CRED_MIN_TIME_LEFT = 0
GLOBUSRUN = $(GLOBUS_LOCATION)/bin/globusrun

You may also wish to modify the CONDOR_ADMIN parameter to set the recipient of email to something other than the dguser account.
cat /etc/grid-security/certificates/*.signing_policy \ > /etc/grid-security/certificates/ca-signing-policy.conf
mkdir /opt/data
chown postgres:postgres /opt/data
su postgres
initdb -D /opt/data
exit
# Use EDG data location
export PGDATA=/opt/data

in the start() section just before:
# Check for the PGDATA structure

Then change in the line:
su -l postgres -s /bin/sh -c "/usr/bin/pg_ctl -D \ $PGDATA -p /usr/bin/postmaster start > /dev/null 2>&1" < /dev/nullthe output file from /dev/null to
/var/tmp/postgres.log 2>&1

As a last change, add in the stop() section just before the su -l postgres:
# Use EDG data location
export PGDATA=/opt/data

Now to start it:
/sbin/chkconfig postgresql on
/etc/rc.d/init.d/postgresql start
su postgres
createuser <<EOD
dguser
y
n
EOD
exit
[
MDS_contact = "lxshare0225.cern.ch";
MDS_port = 2170;
MDS_timeout = 60;
MDS_gris_port = 2135;
MDS_basedn = "mds-vo-name=local,o=grid";
MDS_multi_attributes = { "AuthorizedUser", "RunTimeEnvironment", "CloseCE" };
LB_contact = "lxshare0380.cern.ch";
LB_port = 7846;
JSS_contact = "lxshare0380.cern.ch";
JSS_client_port = 8881;
JSS_server_port = 9991;
JSS_backlog = 5;
UI_backlog = 5;
UI_server_port = 7771;
RB_pool_size = 512;
RB_notification_queue_size = 32;
RB_purge_threshold = 600000;
RB_cleanup_threshold = 3600;
RB_sandbox_path = "/tmp";
RB_logfile = "/var/tmp/RBserver.log";
RB_logfile_size = 512000000;
RB_logfile_level = 7;
RB_submission_retries = 3;
MyProxyServer = "lxshare0375.cern.ch";
SkipJobSubmission = false;
]
/sbin/chkconfig broker on
/etc/rc.d/init.d/broker start
[
Condor_submit_file_prefix = "/var/tmp/CondorG.sub";
Condor_log_file = "/var/tmp/CondorG.log";
Condor_stdoe_dir = "/var/tmp";
Job_wrapper_file_prefix = "/var/tmp/Job_wrapper.sh";
Database_name = "template1";
Database_table_name = "condor_submit";
JSS_server_port = 8881;
RB_client_port = 9991;
Condor_log_file_size = 64000;
]
mv /opt/edg/etc/wl-jss_rb-env.sh /opt/edg/etc/wl-jss_rb-env.sh.orig
cat /opt/edg/etc/wl-jss_rb-env.sh.orig | \
  sed -e "s/CONDOR_IDS=/CONDOR_IDS=\${CONDOR_IDS\:\-2002\.2002}/" \
  > /opt/edg/etc/wl-jss_rb-env.sh.rb
cp /opt/edg/etc/wl-jss_rb-env.sh.rb /opt/edg/etc/wl-jss_rb-env.sh
touch /var/tmp/CondorG.log
chown dguser:dguser /var/tmp/CondorG.log
/sbin/chkconfig jobsubmission on
/etc/rc.d/init.d/jobsubmission start
On a Resource Broker a MySQL server has to be run under a non-privileged account. The following steps walk you through the required configuration. You have to choose a password for the server; in this example the password is globus_admin.
mkdir /var/lib/mysql
mkdir /var/lib/mysql/test
mkdir /var/lib/mysql/mysql
chown -R mysql /var/lib/mysql
/usr/bin/mysql_install_db
chown -R mysql /var/lib/mysql
chmod -R og-rw /var/lib/mysql/mysql
/sbin/chkconfig mysql on
/etc/rc.d/init.d/mysql start
/usr/bin/mysqladmin -u root password 'globus_admin'

Then use this password for the next commands to set up the default tables for logging and bookkeeping:
/usr/bin/mysqladmin -u root -p create lbserver
/usr/bin/mysql -u root -p -e \
  'grant create,drop,select,insert,update,delete on lbserver.* to lbserver@localhost'
/usr/bin/mysql -u lbserver lbserver < /opt/edg/etc/server.sql

To ensure that MySQL is started before the logging and bookkeeping servers, change the entry in /etc/rc.d/rc3.d:
mv /etc/rc.d/rc3.d/S90mysql /etc/rc.d/rc3.d/S85mysql
/sbin/chkconfig lbserver on
/sbin/chkconfig locallogger on
/etc/rc.d/init.d/lbserver start
/etc/rc.d/init.d/locallogger start
To quickly check that the Postgres installation worked, you can create a dummy database as the user running the resource broker daemons:
su - dguser
createdb test
psql test

There should be no errors from these two commands.
A quick check to see if the server is responding is the following:
openssl s_client -connect lxshare0380.cern.ch:7846 -state -debug

This should respond verbosely with information about the SSL connection. Any error indicates a problem with the certificates. You will have to interrupt this command to get back to the command line.
For long-lived jobs there is the possibility that the job will outlive the validity of its proxy causing the job to fail. To avoid this, the workload management tools allow a proxy to be automatically renewed via a MyProxy server. The MyProxy server manages a long-lived proxy generated by a user and gives updated proxies to properly authenticated processes on behalf of the user.
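For context, a user would typically store a long-lived credential on the MyProxy server before submitting long jobs. A hedged sketch using the standard MyProxy client (the server name is the one used in the RB configuration example above; check the flags against your MyProxy version):

# Run on the UI as the user: store a long-lived proxy on the MyProxy server.
# -s names the MyProxy host, -d uses the certificate subject as the MyProxy
# user name, -n allows renewal without a passphrase.
myproxy-init -s lxshare0375.cern.ch -d -n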
The usual configuration is to have one MyProxy server per resource broker machine; the MyProxy server itself should run on a separate, well-secured machine.
The MyProxy server must have a valid host certificate and key installed in the /etc/grid-security directory.
The MyProxy server must have all of the security RPMs installed. In addition, the daemon which updates the certificate revocation lists (see 8.2.2) must also be running.
There is a single configuration file /opt/edg/etc/edg-myproxy.conf which should be filled with the subject names of associated resource brokers.
The SysV initialization script remakes the configuration file from the information in the edg-myproxy.conf and from the "signing policy" files in /etc/grid-security/certificates. This is done every time the daemon is started, so all changes are reflected in the running daemon when it is restarted.
/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0380.cern.ch
/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0383.cern.ch
/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0382.cern.ch
/O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0381.cern.ch
/C=IT/O=INFN/OU=host/L=CNAF/CN=grid010g.cnaf.infn.it/Email=sitemanager@cnaf.infn.it
/C=IT/O=INFN/OU=host/L=CNAF/CN=grid004f.cnaf.infn.it/Email=sitemanager@cnaf.infn.it
/C=IT/O=INFN/OU=www server/L=Catania/CN=genius.ct.infn.it/Email=falzone@ct.infn.it,roberto.barbera@ct.infn.it
/C=IT/O=INFN/OU=User Interface/L=Catania/CN=grid008.ct.infn.it/Email=patrizia.belluomo@ct.infn.it
/C=IT/O=INFN/OU=www server/L=Catania/CN=grid009.ct.infn.it/Email=falzone@ct.infn.it
/C=IT/O=INFN/OU=gatekeeper/L=PD/CN=grid012.pd.infn.it/Email=Marco.Verlato@padova.infn.it
/C=IT/O=INFN/OU=datagrid-genius/L=Pisa/CN=genius.pi.infn.it/Email=livio.salconi@pi.infn.it
/C=IT/O=INFN/OU=GRID UI/L=CNAF Bologna/CN=genius.cnaf.infn.it/Email=stefano.zani@cnaf.infn.it
/C=IT/O=INFN/OU=gatekeeper/L=CA/CN=grid004.ca.infn.it/Email=daniele.mura@ca.infn.it
To start the server:
/sbin/chkconfig myproxy on
/etc/rc.d/init.d/myproxy start
+myproxy.trusted /O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0380.cern.ch \
    /O=Grid/O=CERN/OU=cern.ch/CN=host/lxshare0383.cern.ch

Copy the host certificate to the locations given in /etc/globus.conf. Start the services as described in the previous section.
3,13,23,33,43,53 * * * * EDG_LOCATION_/etc/cron/bdii-cron 1>/dev/null 2>&1

to the /etc/crontab file.
Follow the steps that are common with the LCFG based installation.
To generate a new BDII_PASSWD(_PLAIN) pair you can use the /opt/openldap/sbin/slappasswd command which works just like passwd for Unix.
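A hedged sketch of this step (the secret and the resulting hash shown are made up; check which of the two defines takes the plain value and which the hash in your BDII configuration file):

# Generate a hashed password for the BDII LDAP server.
/opt/openldap/sbin/slappasswd -s 'my_bdii_secret'
# Example output (made up): {SSHA}Xc1l6PK0rcvXjAbCdEf1234567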
If you are installing the BDII node for a testbed different from the EDG application testbed, then you also need to change MDS_HOST which is the node highest in the MDS hierarchy.
cp /opt/edg/etc/init.d/bdii /etc/rc.d/init.d/bdii /sbin/chkconfig bdii on /etc/rc.d/init.d/bdii start
globuscfg.gris alledg
globuscfg.giis edgpro

No host certificate is needed for this host. Install the node.
Since there is only a single Top-MDS node in EDG and this node has been set up using LCFG, no information on manual installation is given.
System administrators should register their site with the appropriate GIIS at the next highest level.
Empty