A large part of user tasks on the grid consists of accessing data and managing the files that contain it. Most users will use the Replica Manager command line interface and API to perform data management tasks on the grid. The Replica Manager interacts with the Replica Location Service (RLS), the Replica Metadata Catalog (RMC), the Replica Optimization Service (ROS) and the Storage Elements (SE) to provide high-level functionality while shielding users from the tedious details of direct RLS and SE interaction. Nonetheless, some details concerning the RLS and SE help users understand how the Replica Manager performs its job.
Jargon unfortunately permeates the descriptions of the data management middleware. The following terms, each shown with an example, will help you understand the typical terminology:
GUID (Grid Unique Identifier): guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70
LFN (Logical File Name): lfn:HiggsMonteCarlo.dat
SURL (Storage URL): srm://grid02.lal.in2p3.fr/iteam/higgsCandidate.dat
TURL (Transport URL): rfio://grid02.lal.in2p3.fr/iteam/higgsCandidate.dat
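The four example names above differ in their scheme prefix, which makes it possible to tell them apart mechanically. The following is a minimal sketch, not part of the middleware: the function name classify_name is invented here, and treating every other "scheme://" name as a TURL is an assumption made for illustration.

```shell
#!/bin/sh
# classify_name: hypothetical helper that labels a grid file name
# by its scheme prefix. Not part of edg-replica-manager.
classify_name() {
    case "$1" in
        guid:*)  echo GUID ;;
        lfn:*)   echo LFN ;;
        srm://*) echo SURL ;;
        # Assumption: any other scheme://... name is a transport URL.
        *://*)   echo TURL ;;
        *)       echo unknown ;;
    esac
}
```

For instance, calling classify_name with lfn:HiggsMonteCarlo.dat prints LFN, and calling it with the rfio:// name above prints TURL.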
The Replica Location Service (RLS) consists of two services: the Local Replica Catalog (LRC) and the Replica Location Index (RLI). The RLI allows the RLS to be geographically distributed; however, for EDG 2.0, this is not deployed. Therefore, a single LRC instance per Virtual Organization acts as a global registry for that VO's files. (Technically, the LRC contains the GUID-to-SURL mapping as well as some metadata concerning the physical file. See the definitions above for an explanation of the terminology.)
A service closely related to the RLS is the Replica Metadata Catalog (RMC). The RMC contains metadata about a VO's files. (Technically, it holds the LFN-to-GUID mapping and the metadata tied to the GUID.)
The Replica Optimization Service (ROS) allows the Replica Manager to choose the ``closest'' file in terms of total transfer time.
Finally, the Storage Element (SE) provides a uniform interface to data storage. It provides a web service interface for management functions and typically allows for several types of direct access to data stored on the SE. The GridFTP protocol is supported by all SEs and can be used for wide-area access to the data. Typically ``file'' (i.e. standard POSIX access) and ``rfio'' access to the data are also provided to a ``close'' Storage Element. An SE and a CE are in fact defined to be ``close'' if file access to the SE's data is possible from the CE.
The Replica Manager allows one to copy files into grid storage, register files, replicate files between SEs, delete individual replicas, and delete all replicas of a particular file, among other things. All of these operations are available via the edg-replica-manager command or its abbreviated version edg-rm. Two general options to this command that are absolutely vital to correct operation are the --vo and --insecure options. The --vo option takes the name of your VO as an argument.
A good first test is to execute the following on a User Interface machine:
    edg-replica-manager --vo iteam --insecure printInfo
substituting the name of your VO for ``iteam''. This prints a lot of information about exactly which services the Replica Manager command will use; the information is pulled from R-GMA. If there are problems with the Replica Manager commands, this output is often useful for debugging.
The subcommands for the Replica Manager also have shortened forms; for example the ``printInfo'' in the above command could have been replaced with ``pi''. A full list of the abbreviations is available from the command's usage obtained with the help option. The examples in this chapter will use the long forms for clarity.
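Because every invocation in this chapter repeats the same two general options, a small shell wrapper can save typing. The sketch below is only an illustration: the function name edg_rm and the variables EDG_VO and EDG_RM_DRY_RUN are invented here and are not provided by the Replica Manager. Only the --vo and --insecure options and the printInfo subcommand come from the text.

```shell
#!/bin/sh
# edg_rm: hypothetical wrapper that prepends the general options.
# EDG_VO (default "iteam") and EDG_RM_DRY_RUN are names made up
# for this sketch; they are not read by edg-replica-manager itself.
edg_rm() {
    if [ -n "${EDG_RM_DRY_RUN:-}" ]; then
        # Dry run: print the command line instead of executing it.
        echo edg-replica-manager --vo "${EDG_VO:-iteam}" --insecure "$@"
    else
        edg-replica-manager --vo "${EDG_VO:-iteam}" --insecure "$@"
    fi
}
```

With this defined, "edg_rm printInfo" runs the first test shown earlier, and setting EDG_RM_DRY_RUN=1 beforehand shows the full command line without contacting any service.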
Frequently used subcommands are:
The examples show typical data management cases and highlight the commands described above.
Often data files are first created in temporary scratch space or on computers outside of the grid. To make these data grid-accessible, they must be moved to a Storage Element; usually one wants to register these files in a VO's catalog as well. This example demonstrates how to do this.
First create a fresh proxy with grid-proxy-init. Although the registration is not currently secured, the file transfer is; therefore, valid credentials will be needed.
Create an empty local file to work with:
    touch empty-local-file
and now perform a copyAndRegisterFile with the Replica Manager to copy this to a Storage Element and register the file.
    >> edg-replica-manager --vo iteam --insecure \
         copyAndRegisterFile file:`pwd`/empty-local-file \
         --destination-file gppse05.gridpp.rl.ac.uk \
         --logical-file-name lfn:my-demo-2003-10-01-1600
    guid:b793f080-f417-11d7-b584-857330072702
To check that this file exists, use the listReplicas command:
    >> edg-replica-manager --vo iteam --insecure \
         listReplicas guid:b793f080-f417-11d7-b584-857330072702
    srm://gppse05.gridpp.rl.ac.uk/iteam/generated/2003/10/01/fileb16684bf...
Either the logical file name or the GUID can be used as the argument; both forms return the same SURL (truncated here) for the replica, and this replica is indeed on the specified SE.
To delete this file, one can use the subcommand deleteFile, specifying the SURL to be deleted.
    >> edg-replica-manager --vo iteam --insecure \
         deleteFile \
         srm://gppse05.gridpp.rl.ac.uk/iteam/generated/2003/10/01/fileb16684bf...
Note that deleting the last replica of a file will also remove the GUID and LFN from the catalog. If you wish to remove all replicas, you can use the all option while specifying a GUID.
As the brokering system does not yet perform automatic replication of data files for jobs, it is often necessary to make several replicas of a file on different Storage Elements. To demonstrate this, repeat the previous example to bring a file ``lfn:my-second-demo-2003-10-01-1600'' onto the grid, but this time fill the file with the string ``Hello There''. To verify that it exists:
    >> edg-replica-manager --vo iteam --insecure \
         listGUID lfn:my-second-demo-2003-10-01-1600
    guid:a3ac7647-f418-11d7-a57b-e4d5c9608efc
which lists the GUID associated with this file. One could also have used listReplicas again:
    >> edg-replica-manager --vo iteam --insecure \
         listReplicas lfn:my-second-demo-2003-10-01-1600
    srm://gppse05.gridpp.rl.ac.uk/iteam/generated/2003/10/01/file9de8efe6...
which shows that the file is on the gppse05.gridpp.rl.ac.uk SE.
You can use the edg-rgma command to find another SE. Now replicate the file to that Storage Element:
    >> edg-replica-manager --vo iteam --insecure \
         replicateFile --destination se001.fzk.de \
         lfn:my-second-demo-2003-10-01-1600
    srm://se001.fzk.de/iteam/generated/2003/10/01/file42a1d2b2...
which returns the SURL of the copy. Using listReplicas again shows the two distinct replicas:
    >> edg-replica-manager --vo iteam --insecure \
         listReplicas lfn:my-second-demo-2003-10-01-1600
    srm://gppse05.gridpp.rl.ac.uk/iteam/generated/2003/10/01/file9de8efe6...
    srm://se001.fzk.de/iteam/generated/2003/10/01/file42a1d2b2...
Leave these files on the grid for the next example.
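When a file has several replicas, it is often handy to see only which Storage Elements hold them. The following sketch extracts the host part of each SURL read from standard input. The function name surl_host is invented for illustration, and the sketch assumes that listReplicas prints one srm:// SURL per line, as in the output above.

```shell
#!/bin/sh
# surl_host: hypothetical filter that reads SURLs on stdin (one per
# line) and prints the Storage Element host of each. Lines that do
# not start with srm:// are silently skipped.
surl_host() {
    sed -n 's%^srm://\([^/]*\)/.*%\1%p'
}
```

Under these assumptions, piping the listReplicas output of the example above into surl_host would print gppse05.gridpp.rl.ac.uk and se001.fzk.de on separate lines.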
The previous example showed how to bring data onto the grid and move it around. This one now demonstrates how to read the data from a job using the ``file'' protocol. It uses getBestFile to get the SURL of the local copy (making a copy if necessary) and then transforms that SURL into a filename which can be opened. The script calculates the checksum of the file.
Put the following JDL into a file called ``ReadData.jdl'':
    Executable = "script.sh";
    InputData = {"lfn:my-second-demo-2003-10-01-1600"};
    DataAccessProtocol = {"file","gridftp","rfio"};
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"script.sh"};
    OutputSandbox = {"std.out","std.err"};
and put the following into a file script.sh:
    #!/bin/sh

    # Get SURL of local replica (making one if necessary).
    surl=`edg-replica-manager --vo iteam --insecure \
         getBestFile lfn:my-second-demo-2003-10-01-1600`
    echo SURL: $surl

    # Get TURL of the local replica.
    turl=`edg-replica-manager --vo iteam --insecure \
         getTurl $surl file`
    echo TURL: $turl

    # Strip off URL's scheme and fix multiple slashes.
    fname=`echo $turl | sed -r 's%/+%/%g' | sed s%file:%%`
    echo FILE: $fname

    # Get the check sum of this file.
    cksum $fname
Checking the matching with edg-job-list-match should return the Computing Elements at the two sites that have replicas of this file. Actually submitting the job should return the correct checksum of the file in the std.out file.
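The step in script.sh that turns a ``file'' TURL into a local filename can be isolated and tried out without access to the grid. The following is a minimal sketch of that step. The function name turl_to_path and the sample path are made up for illustration; it uses sed -E rather than the script's sed -r, and it rejects TURLs of any other scheme, which the original script does not do.

```shell
#!/bin/sh
# turl_to_path: hypothetical helper mirroring the sed pipeline in
# script.sh. It strips the "file:" scheme and collapses runs of
# slashes into one. Returns 1 for TURLs that are not file: URLs.
turl_to_path() {
    case "$1" in
        file:*) ;;
        *) echo "not a file TURL: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "$1" | sed -E -e 's%^file:%%' -e 's%/+%/%g'
}
```

For example, turl_to_path file:///data//iteam/f.dat prints /data/iteam/f.dat, a name that can then be passed to cksum or opened directly.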
More information on the Replica Manager and the underlying services discussed in this chapter can be found in the Users' Guides for the Replica Manager (http://cern.ch/edg-wp2/replication/docu/edg-replica-manager-userguide.pdf), LRC (http://cern.ch/edg-wp2/replication/docu/edg-lrc-userguide.pdf), RMC (http://cern.ch/edg-wp2/replication/docu/edg-rmc-userguide.pdf), and ROS (http://cern.ch/edg-wp2/replication/docu/edg-ros-userguide.pdf).