Fermilab


Introduction


The Runjob Project was created in February of 2004, following a successful attempt to isolate and re-engineer the common components of the existing (and related) DZero Runjob software and CMS MCRunjob software. The common core software was named "Shahkar" during the pilot phase of the project, but its proper name is the Runjob Core software. The component diagram below is explained more fully in the architecture section. The basic idea is that a component called a Configurator is responsible for the configuration of an individual service and/or application. Each service or application is described with metadata, and the Configurators implement special framework call handlers that either modify the metadata or perform real work.
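As a rough illustration of the Configurator idea, the sketch below pairs a metadata dictionary with handlers keyed by framework call name. It is not the Runjob Core API: the class layout, the method names (`add_handler`, `handle`), the call names (`configure`, `make_job`), and the `simulate` executable are all assumptions made up for this example.

```python
class Configurator:
    """Hypothetical stand-in for a Runjob Configurator (not the real API)."""

    def __init__(self, name, **metadata):
        self.name = name
        self.metadata = dict(metadata)  # describes one service or application
        self.handlers = {}              # framework call name -> handler function

    def add_handler(self, call, func):
        """Register a handler for a named framework call."""
        self.handlers[call] = func

    def handle(self, call):
        """Run the handler for a framework call; ignore calls with no handler."""
        handler = self.handlers.get(call)
        if handler is None:
            return None
        return handler(self)


def set_defaults(cfg):
    # A handler that only modifies metadata.
    cfg.metadata.setdefault("events", 100)


def make_command(cfg):
    # A handler that performs "real work": here, building a command line.
    return "{executable} --events {events}".format(**cfg.metadata)


# "simulate" is an invented executable name used only for illustration.
sim = Configurator("simulation", executable="simulate")
sim.add_handler("configure", set_defaults)
sim.add_handler("make_job", make_command)

sim.handle("configure")
command = sim.handle("make_job")
print(command)
```

Keeping the metadata separate from the handlers means one handler can fill in values that a later handler consumes, which is the pattern the paragraph above describes.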

The container for Configurators is called the Linker. It is responsible for maintaining dependencies among the Configurators, facilitating linkage among metadata elements of different Configurators, and emitting the framework calls that cause the Configurators to do work. Historically, the framework call handlers have been used for job building. For example, one Configurator may contact a parameter DB such as SAM in DZero or the RefDB in CMS. This may be followed by a handler that configures a real application and packages it in a ScriptObject as a job, and then by a submission service that submits the ScriptObject to some arbitrary submit resource.
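The Linker's three duties (ordering by dependency, linking metadata across Configurators, and emitting framework calls) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Runjob implementation: every class, method, and call name here is hypothetical, the parameter lookup returns a made-up value instead of contacting SAM or the RefDB, and a plain string stands in for a ScriptObject.

```python
class Configurator:
    """Hypothetical Configurator: metadata plus framework call handlers."""

    def __init__(self, name, depends_on=()):
        self.name = name
        self.depends_on = tuple(depends_on)
        self.metadata = {}

    def handle(self, call, linker):
        # Dispatch a framework call to a method named on_<call>, if present.
        method = getattr(self, "on_" + call, None)
        if method is not None:
            method(linker)


class ParameterLookup(Configurator):
    def on_configure(self, linker):
        # Stand-in for a parameter DB query; the value is invented.
        self.metadata["events"] = 500


class Application(Configurator):
    def on_configure(self, linker):
        # Link to a metadata element owned by another Configurator.
        self.metadata["events"] = linker.lookup("params", "events")

    def on_make_job(self, linker):
        # A plain string plays the role of the packaged job script.
        self.metadata["script"] = "simulate --events %d" % self.metadata["events"]


class Linker:
    """Hypothetical container that orders Configurators and emits calls."""

    def __init__(self):
        self.configurators = {}

    def add(self, cfg):
        self.configurators[cfg.name] = cfg

    def lookup(self, name, key):
        return self.configurators[name].metadata[key]

    def ordered(self):
        """Return Configurators with every dependency before its dependent."""
        done, order = set(), []

        def visit(name, stack=()):
            if name in done:
                return
            if name in stack:
                raise ValueError("circular dependency at " + name)
            for dep in self.configurators[name].depends_on:
                visit(dep, stack + (name,))
            done.add(name)
            order.append(self.configurators[name])

        for name in self.configurators:
            visit(name)
        return order

    def emit(self, call):
        """Send one framework call to every Configurator in dependency order."""
        for cfg in self.ordered():
            cfg.handle(call, self)


linker = Linker()
linker.add(Application("app", depends_on=["params"]))  # added first on purpose
linker.add(ParameterLookup("params"))

linker.emit("configure")
linker.emit("make_job")
script = linker.lookup("app", "script")
print(script)
```

Although the application is registered before the parameter lookup, the depth-first ordering ensures the lookup handles `configure` first, so the value it stores is available when the application asks for it.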

Though we are currently forbidden by the project definition from including actual submitters in the Runjob project code itself, submitters exist for SAMGrid, FBS, LSF, PBS, Condor, Condor-G, and LCG in various experiment-specific projects that depend upon the Runjob code. Furthermore, a threaded runtime extension package called ShREEK (also forbidden from inclusion in the Runjob core code by the project definition) could be used in conjunction with real framework calls to implement a real workflow planner.

Architecture


This section describes the architecture of the Runjob Core code in more detail.

Strategy


This section describes the strategy of the Runjob project and its plans for the period after the originally planned end date of February 2005.

Developer's Guides


This section contains quick pointers to Developer Guides and documentation that mostly exist on the Wiki.

User's Guides


This section contains quick pointers to User Guides and documentation that mostly exist on the Wiki.

RunJob Pilot Project

A RunJob pilot project was initiated in the Spring of 2003 to explore the feasibility of a common project. The RunJob pilot produced a common core package, code-named ShahKar (from the Urdu for "Great Job"), that was integrated with the CMS version of MCRunjob in the Fall of 2003.