Summary of data management meeting
Monday, July 5th, 2004 16:30 Bat 2-R030
==============================================================================

Present: N.Brook, P.Charpentier, J.Closier, C.Cioffi, M.Frank, R.Graciani,
         M.Sanchez, J.Blouw

1) Updating the bookkeeping database

a) We will reschedule jobs that ran for only a minimal amount of time
   (these jobs are obvious failures)
b) We will reschedule jobs which never started
c) We will reschedule jobs with a clear disk-space problem
d) Otherwise we deal with the following table of failures

Jobs with files in:

Information content:
BkI:    Bookkeeping provenance information
BkR:    Bookkeeping replica information
FC:     Alien File Catalog information
Castor: File in castor @ cern
Log:    Information in log files present @ cern

Actions:
(1)   Nothing to do, but no log files for history
(2)   Try to recover from local SE using Alien
(2.1) Try to recover from local SE using Bookkeeping
(3)   Recover Alien information from bookkeeping
(4)   Recover bookkeeping information from Log-files
(4.1) Recover bookkeeping replica information from Alien
(4.3) Recover bookkeeping replica information from castor file
(5)   Recover bookkeeping information from prod db
(6)   Recover Alien from bookkeeping
(6.1) Recover Alien using castor file

Steps with one failure:
=======================
BkI  BkR  FC   Castor  Log
Y    Y    Y    Y       N     (1)
Y    Y    Y    N       Y     (2)
Y    Y    N    Y       Y     (3)
Y    N    Y    Y       Y     (4)
N    Y    Y    Y       Y     ---- non existing by definition

Steps with 2 failures:
======================
BkI  BkR  FC   Castor  Log
Y    Y    Y    N       N     (1) + (2)
Y    Y    N    Y       N     (1) + (6)
Y    N    Y    Y       N     (4.1)
N    Y    Y    Y       N     ---- not existent
Y    Y    Y    N       N     (1) + (2)
Y    Y    N    Y       N     (1) + (6)
Y    N    Y    Y       N     (1) + (4.1)
N    Y    Y    Y       N     ---- not existent
Y    Y    N    N       Y     (2.1) + (3)
Y    N    Y    N       Y     (2) + (4.1)
N    Y    Y    N       Y     ---- not existent
Y    Y    Y    N       N     (1) + (2.1)
Y    N    N    Y       Y     (4.3) + (6)
N    Y    N    Y       Y     ---- not existent
Y    Y    N    Y       N     (1) + (6)
Y    Y    N    N       Y     (6) + (2.1)
N    N    Y    Y       Y     (4) + (4.1)
Y    N    Y    Y       N     (1) + (4.1)
Y    N    Y    N       Y     (4.1) + (2)
Y    N    N    Y       Y     (4) + (4.3) + (6.1)
N    Y    Y    Y       N     ---- non existent
N    Y    Y    N       Y     ---- non existent
N    Y    N    Y       Y     ---- non existent
N    N    Y    Y       Y     (4) + (4.1)

Steps with 3 failures:
======================
BkI  BkR  FC   Castor  Log
N    N    N    Y       Y     (5) + (4.3) + (6.1)
N    N    Y    N       Y     (4) + (4.1) + (2.1)
N    Y    N    N       Y     ---- non existent
Y    N    N    N       Y     ***NOT RECOVERABLE***
N    N    N    Y       Y     (4) + (4.3) + (6)
N    N    Y    Y       N     (4) + (4.1) + (2.1)
N    Y    N    Y       N     ---- non existent
Y    N    N    Y       N     (1) + (6.1) + (4.1)
N    N    Y    N       Y     (4) + (4.1) + (2.1)
N    Y    Y    N       N     (1) + (5) + (2)
Y    N    Y    N       N     (1) + (4.1) + (2)
N    Y    Y    N       N     ---- non existent
N    Y    Y    N       N     (1) + (5) + (2)
Y    Y    N    N       N     (1) + (4.1) + (2)
N    Y    N    N       Y     (4) + (3) + (2)
N    Y    N    Y       N     ---- non existent
Y    N    N    N       Y     ***NOT RECOVERABLE***
Y    N    N    Y       N     (1) + (6.1) + (4.1)
Y    N    Y    N       N     (1) + (4.1) + (2)
Y    Y    N    N       N     (1) + (6) + (2)

Steps with 4 failures:
======================
BkI  BkR  FC   Castor  Log
N    N    N    N       Y     ***NOT RECOVERABLE***
N    N    N    Y       N     (1) + (5) + (4.3) + (6)
N    N    Y    N       N     (1) + (5) + (4.1) + (1)
N    Y    N    N       N     ---- non existent
Y    N    N    N       N     ***NOT RECOVERABLE***

Pretty impressive...isn't it?
But it is not yet complete, because it only includes replicas @ cern.
Probably we have to find some iterative procedure.

2) Discussion of the EGEE architecture document

Major discussion items:
- The use of interfaces within the EGEE architecture and
  the possibility to replace components
- The possibility to register files with externally created GUIDs
- The possibility to update files
- The description of the pull mechanism

Minutes: Markus Frank
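
Note on item 1 d): the failure tables can be read as a lookup from the five
presence flags (BkI, BkR, FC, Castor, Log) to a list of recovery actions.
The following is only a minimal sketch of that reading, restricted to the
"Steps with one failure" table. All names (ACTIONS, SINGLE_FAILURE, flags,
recovery_plan) are hypothetical and were not discussed in the meeting.

```python
# Hypothetical sketch: identifiers are illustrative, not from the meeting.
# Action texts are taken from the "Actions" list in item 1 d).
ACTIONS = {
    "1": "Nothing to do, but no log files for history",
    "2": "Try to recover from local SE using Alien",
    "3": "Recover Alien information from bookkeeping",
    "4": "Recover bookkeeping information from Log-files",
}

# Key: presence flags in the order BkI, BkR, FC, Castor, Log.
# Value: list of action codes, or None for "non existing by definition".
SINGLE_FAILURE = {
    "YYYYN": ["1"],
    "YYYNY": ["2"],
    "YYNYY": ["3"],
    "YNYYY": ["4"],
    "NYYYY": None,
}


def flags(bki, bkr, fc, castor, log):
    """Encode five booleans as a Y/N string, e.g. 'YYYNY'."""
    return "".join("Y" if present else "N"
                   for present in (bki, bkr, fc, castor, log))


def recovery_plan(bki, bkr, fc, castor, log):
    """Return the action texts for a single-failure state.

    Returns None for a state the table marks as non existing, and raises
    KeyError for states outside the single-failure table.
    """
    key = flags(bki, bkr, fc, castor, log)
    if key not in SINGLE_FAILURE:
        raise KeyError("state %s is not in the single-failure table" % key)
    codes = SINGLE_FAILURE[key]
    if codes is None:
        return None
    return [ACTIONS[code] for code in codes]


if __name__ == "__main__":
    # Castor copy missing, everything else present.
    print(recovery_plan(True, True, True, False, True))
```

The tables for 2, 3 and 4 failures could be added to the same dictionary once
the duplicate rows with differing actions have been settled.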