Bracing against the wind
Wednesday, December 08, 2010
Here's what I had to do to make Condor work. So far it's fine.
1. If you install Condor as an RPM, it will default to running everything as the user "condor". This prevents the Condor machinery from smartly forking jobs as the user who submitted them. On a private network especially, that makes no sense, so open it up with CONDOR_IDS = 0.0 (the value is uid.gid, so this means root).
2. Lots of machines mounting the same share? This one was useful:
LOCAL_CONFIG_FILE = /mnt/blah/condor/$(HOSTNAME).local
Now I can keep a config for each machine without having to worry about condor_config_val -set. I use symbolic links to create "machine classes" that I add machines to when I bring up new nodes.
3. condor_config_val -set ... seems nice, but don't bother. You really need to keep your configs well groomed. I wasted time getting it to work... and then deleted everything I had set. It must be nice on heterogeneous public networks where the config file isn't on some shared mount. (In fact, I just disabled it.)
4. Trust everyone... let the firewall protect you. TRUST_UID_DOMAIN = True. This is good for getting started, so you're not bumping into roadblocks all the time. Afterwards, tinker with password auth. Kerberos and the like are a bit much; basic password authentication is fine, IMO, and will work everywhere.
5. GET RID of the stuff that the RPM puts in your local config. Things like "SUSPEND = False" and "START = True" are in there... and they are bad. Since your machines don't have a keyboard, ditch the keyboard stuff and go with "START = $(CPUIdle)". Since you don't want your jobs to get randomly killed, go with "SUSPEND = $(CPUBusy)". (You can just delete the stuff in the local config and go with the UWCS defaults... they are really good enough for most things.) Your global config should be the SAME on all boxes. Your local config should be minimal... just the stuff for that machine.
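The "machine classes" trick from item 2 is easy to script. What follows is only a sketch of how it could look, not what I actually run: the function name add_host_to_class and the ".class" file naming are made up for illustration, and it assumes the link name matches whatever $(HOSTNAME) expands to on the node.

```python
import os
import tempfile


def add_host_to_class(config_dir, hostname, machine_class):
    """Make <hostname>.local a symlink to the shared <machine_class>.class file.

    The ".class" suffix is a hypothetical naming scheme; use whatever
    names your shared config directory already has.
    """
    target = machine_class + ".class"
    if not os.path.isfile(os.path.join(config_dir, target)):
        raise FileNotFoundError("no class config: " + target)

    link = os.path.join(config_dir, hostname + ".local")
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)

    # A relative target keeps the link valid wherever the share is mounted.
    os.symlink(target, tmp)
    # Renaming over the old link swaps it in one step.
    os.replace(tmp, link)
    return link


if __name__ == "__main__":
    # Demo in a scratch directory instead of the real shared mount.
    demo_dir = tempfile.mkdtemp()
    with open(os.path.join(demo_dir, "compute.class"), "w") as f:
        f.write("# settings shared by every compute node\n")
    link = add_host_to_class(demo_dir, "node01", "compute")
    print(link, "->", os.readlink(link))
```

Moving a node to another class is then just a second call with a different class name, and the per-machine file that LOCAL_CONFIG_FILE points at never has to be edited by hand.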
1. condor_run has to be run *from a shared directory* or it won't work. It says that in the docs, but I didn't read it.
2. Likewise, for vanilla jobs, make sure the paths to all your files are the same on every machine you run on. Trying to sort it out in a script is a mess. Fix paths first. Then run.
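For the "fix paths first" part, a tiny check run on each node before submitting saves a lot of head scratching. This is a rough sketch under my own assumptions: missing_paths is a made-up helper, and it only tests the machine it runs on, so you still have to run it on every node (over ssh, or as a quick job).

```python
import os
import socket
import sys


def missing_paths(paths):
    """Return the paths that are relative or do not exist on this machine."""
    return [p for p in paths if not os.path.isabs(p) or not os.path.exists(p)]


if __name__ == "__main__":
    # Pass every file the job needs as arguments.
    host = socket.gethostname()
    for path in missing_paths(sys.argv[1:]):
        print(host + ": missing or relative path: " + path)
```

Relative paths are flagged on purpose: they depend on where the job happens to start, which is exactly the mess you are trying to avoid.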