Starting up a Globus Virtual Workspace with STAR’s image.

The steps:

1) Log in to stargrid01.

2) Check that your SSH public key is at $HOME/.ssh/id_rsa.pub. This is the key the client package copies to the gatekeeper and client nodes under the root account, allowing password-free login as root, which you will need in order to install the grid host certs.

a. Note: the file name and location must be exactly as above, or you must modify the path and name in the client configuration at ./workspace-cloud-client-009/conf/cloud.properties (more on this later).

b. If you are using a PuTTY-generated SSH public key, it will not work directly. You can simply edit it with a text editor to get it into the correct format. Below is an example of the right format A and the wrong format B; if the key body spans multiple lines, it is in the wrong format. A conversion sketch follows the two examples.

Right format A:

ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAIEAySIkeTLsijvh1U01ass8XvfkBGocUePTkuG2F8TwRilq1gIcuTP5jBFSCF0eYXOpfNcgkujIsRj/+xS1QqM7c5Fs0hrRyLzyxgZrCKeXojVUFYfg9QuokqoY2ymgjxAdwNABKXI2IKMvM0UGBtmxphCuxUSUpMzNfmWk9H4HIrE=

Wrong format B:

---- BEGIN SSH2 PUBLIC KEY ----
Comment: "imported-openssh-key"
AAAAB3NzaC1yc2EAAAABJQAAAIEAySIkeTLsijvh1U01ass8XvfkBGocUePTkuG2
F8TwRilq1gIcuTP5jBFSCF0eYXOpfNcgkujIsRj/+xS1QqM7c5Fs0hrRyLzyxgZr
CKeXojVUFYfg9QuokqoY2ymgjxAdwNABKXI2IKMvM0UGBtmxphCuxUSUpMzNfmWk
9H4HIrE=
---- END SSH2 PUBLIC KEY ----
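If you prefer not to edit the key by hand, OpenSSH's ssh-keygen can do the conversion: its -i option reads an SSH2/RFC4716-format public key (the wrong format B above) and prints the one-line OpenSSH form. A minimal sketch, assuming the exported PuTTY key is in putty_key.pub (back up any existing id_rsa.pub before overwriting it):

# convert an SSH2-format public key into the one-line OpenSSH format
ssh-keygen -i -f putty_key.pub > $HOME/.ssh/id_rsa.pub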

 

3) Get the grid client by copying the folder /star/u/lbhajdu/ec2/workspace-cloud-client-009 to your area (an example command follows below). It is recommended that you execute your commands from inside the workspace-cloud-client-009 directory; the manual describes all commands and paths relative to this directory, and I will do the same.

a. This grid client is almost the same as the one you can download from Globus, except that it includes ./samples/star1.xml, which is configured to load STAR's custom image.
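For example (the destination directory is your choice; ~/ec2 here is just an assumption to match the paths shown later):

# copy the pre-configured cloud client into your own area
mkdir -p ~/ec2
cp -r /star/u/lbhajdu/ec2/workspace-cloud-client-009 ~/ec2/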

4) cd to the workspace-cloud-client-009 directory and type:

./bin/grid-proxy-init.sh  -hours 100

The output should look like this:

[stargrid01] ~/ec2/workspace-cloud-client-009/> ./bin/grid-proxy-init.sh
(Overriding old GLOBUS_LOCATION '/opt/OSG-0.8.0-client/globus')
(New GLOBUS_LOCATION: '/star/u/lbhajdu/ec2/workspace-cloud-client-009/lib/globus')
Your identity: DC=org,DC=doegrids,OU=People,CN=Levente B. Hajdu 105387
Enter GRID pass phrase for this identity:
Creating proxy, please wait...
Proxy verify OK
Your proxy is valid until Fri Aug 01 06:19:48 EDT 2008
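If you want to double-check the proxy before moving on, the standard grid-proxy-info command from the OSG client environment (it is not part of the cloud client itself, so this assumes it is on your PATH) will report who the proxy is for and how long it remains valid:

# print proxy details, then verify it is still good for at least 12 more hours
grid-proxy-info
grid-proxy-info -exists -valid 12:00 && echo "proxy valid for 12+ hours"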


5) To start the cluster, type:

./bin/cloud-client.sh --run --hours 1 --cluster samples/star1.xml

Two very important things you will want to note from this output are the cluster handle (which usually looks something like “cluster-025”) and the gatekeeper name. It will take about 10 minutes to launch this cluster. The cluster will have one gatekeeper and one worker node. The maximum lifetime of the cluster is set on the command line; more parameters are in the xml file (you will want to check with Tim before changing these).
If the command gives up very quickly (after about a minute) and says something like “terminating cluster”, it usually means there are not enough free slots to run the cluster.

It should look something like this:

[stargrid01] ~/ec2/workspace-cloud-client-009/> ./bin/cloud-client.sh --run --hours 1 --cluster samples/star1.xml
(Overriding old GLOBUS_LOCATION '/opt/OSG-0.8.0-client/globus')
(New GLOBUS_LOCATION: '/star/u/lbhajdu/ec2/workspace-cloud-client-009/lib/globus')
SSH public keyfile contained tilde:
- '~/.ssh/id_rsa.pub' --> '/star/u/lbhajdu/.ssh/id_rsa.pub'
SSH known_hosts contained tilde:
- '~/.ssh/known_hosts' --> '/star/u/lbhajdu/.ssh/known_hosts'
Requesting cluster.
- head-node: image 'osgheadnode-012', 1 instance
- compute-nodes: image 'osgworker-012', 1 instance
Workspace Factory Service:
https://tp-grid3.ci.uchicago.edu:8445/wsrf/services/WorkspaceFactoryService
Creating workspace "head-node"... done.
- 2 NICs: 128.135.125.29 ['tp-x009.ci.uchicago.edu'], 172.20.6.70 ['priv070']
Creating workspace "compute-nodes"... done.
- 172.20.6.25 [ priv025 ]
Launching cluster-025... done.
Waiting for launch updates.
- cluster-025: all members are Running
- wrote reports to '/star/u/lbhajdu/ec2/workspace-cloud-client-009/history/cluster-025/reports-vm'
Waiting for context broker updates.
- cluster-025: contextualized
- wrote ctx summary to '/star/u/lbhajdu/ec2/workspace-cloud-client-009/history/cluster-025/reports-ctx/CTX-OK.txt'
- wrote reports to '/star/u/lbhajdu/ec2/workspace-cloud-client-009/history/cluster-025/reports-ctx'
SSH trusts new key for tp-x009.ci.uchicago.edu [[ head-node ]]
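The later steps need the gatekeeper host name (tp-x009.ci.uchicago.edu in the output above) and the cluster handle (cluster-025 above). A purely optional convenience is to stash them in shell variables in the terminal you will keep using; the variable names below are only an illustration:

# values taken from the launch output above; yours will differ
GATEKEEPER=tp-x009.ci.uchicago.edu
HANDLE=cluster-025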

 

6) But hold on, you can't submit yet: even though the grid map file has our DNs in it, the gatekeeper is not yet trusted. We will need to install an OSG host cert on the other side. Not just anybody can do this; Doug and Leve can at least (and I am assuming Wayne can as well). Open another terminal and log into the newly instantiated gatekeeper as root. Example:

[lbhajdu@rssh03 ~]$ ssh root@tp-x009.ci.uchicago.edu
The authenticity of host 'tp-x009.ci.uchicago.edu (128.135.125.29)' can't be established.
RSA key fingerprint is e3:a4:74:87:9e:69:c4:44:93:0c:f1:c8:54:e3:e3:3f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'tp-x009.ci.uchicago.edu,128.135.125.29' (RSA) to the list of known hosts.
Last login: Fri Mar 7 13:08:57 2008 from 99.154.10.107

 

7) Create a .globus directory:

[root@tp-x009 ~]# mkdir .globus

8) Go back to the stargrid01 node and copy over your grid cert and key:

[stargrid01] ~/.globus/> scp usercert.pem root@tp-x009.ci.uchicago.edu:/root/.globus
usercert.pem 100% 1724 1.7KB/s 00:00

[stargrid01] ~/.globus/> scp userkey.pem root@tp-x009.ci.uchicago.edu:/root/.globus
userkey.pem 100% 1923 1.9KB/s 00:00
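Globus tools require the private key to be readable only by its owner. scp normally preserves the file modes, but if anything looks off you can reset them on the gatekeeper (these are the standard Globus permissions for user credentials):

chmod 644 /root/.globus/usercert.pem
chmod 600 /root/.globus/userkey.pem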


9) Move over to /etc/grid-security/ on the gatekeeper:

cd /etc/grid-security/

10) Create a host cert here:

[root@tp-x009 grid-security]# cert-gridadmin -host 'tp-x009.ci.uchicago.edu' -email lbhajdu@bnl.gov -affiliation osg -vo star -prefix tp-x009
checking script version, V2-4, This is ok. except for gridadmin SSL_Server bug. Latest version is V2-6.
Generating a 2048 bit RSA private key
.......................................................................................................+++
.................+++
writing new private key to './tp-x009key.pem'
-----
osg
OSG
OSG:STAR
The next prompt should be for the passphrase for your
personal certificate which has been authorized to access the
gridadmin interface for this CA.
Enter PEM pass phrase:
Your new certificate and key files are ./tp-x009cert.pem ./tp-x009key.pem
move and rename them as you wish but be sure to protect the
key since it is not encrypted and password protected.

 

11) Set the permissions on the credentials:

[root@tp-x009 grid-security]# chmod 644 tp-x009cert.pem
[root@tp-x009 grid-security]# chmod 600 tp-x009key.pem

12) Delete the old host credentials:

[root@tp-x009 grid-security]# rm hostcert.pem
[root@tp-x009 grid-security]# rm hostkey.pem


13) Rename the credentials:

[root@tp-x009 grid-security]# mv tp-x009cert.pem hostcert.pem
[root@tp-x009 grid-security]# mv tp-x009key.pem hostkey.pem
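As an optional sanity check before testing from stargrid01, list the renamed files; hostcert.pem should be mode 644 (rw-r--r--) and hostkey.pem mode 600 (rw-------):

[root@tp-x009 grid-security]# ls -l hostcert.pem hostkey.pem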

 

14) Check grid functionality back on stargrid01:

[stargrid01] ~/admin_cert/> globus-job-run tp-x009.ci.uchicago.edu /bin/date
Thu Jul 31 18:23:55 CDT 2008
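As an extra, optional check you can run /usr/bin/id the same way; its output shows which local account the grid map file maps your DN to:

globus-job-run tp-x009.ci.uchicago.edu /usr/bin/id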

15) Do your grid work.

16) When it's time for the cluster to go down (if there is unused time remaining), run the command below. Note that you will need the cluster handle reported by the command used to bring up the cluster.

./bin/cloud-client.sh --terminate --handle cluster-025
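If your cloud client version supports a status query (an assumption; check ./bin/cloud-client.sh --help for the exact option), it is a simple way to confirm that nothing of yours is still running afterwards:

./bin/cloud-client.sh --status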

 

If there are problems:

If there are problems, try this web page:
http://workspace.globus.org/clouds/cloudquickstart.html
If there are still problems, try this mailing list:
workspace-user@globus.org
If there are still problems, contact Tim Freeman (tfreeman at mcs.anl.gov).