GridPP PMB Meeting 703 (01.04.19)
Present: Pete Clarke (Chair), Tony Cass, David Colling, Alastair Dewhurst, Tony Doyle, Pete Gronbech, Jon Hays, Roger Jones, Steve Lloyd, Andrew McNab, Gareth Roy, Louisa Campbell (Minutes).

Apologies: Dave Britton, Dave Kelsey, Andrew Sansum.

1) STFC HW Status
Additional funding – GR reported that Glasgow has received the equipment, Jonathan confirmed theirs has arrived, and Matt advised their equipment should arrive soon. AM confirmed Manchester's equipment has been received and the paperwork is being processed. At Imperial the equipment arrived and was authorised during March, but the award letter for the second set of equipment is still awaited. If this is still outstanding at Imperial next week, DC may discuss it with Charlotte.

2) GridPP6 Response
DB circulated questions from STFC. The PMB agreed the response should reiterate that GridPP differs from other projects they are accustomed to.

  1. JeS forms deadline 8/4

Deadline for return is 8 April. Instructions will be provided on the JeS forms to make it a ‘linked’ proposal. Imperial has to make some changes to indirect costs, and Edinburgh has to remove a non-funded Co-I and then resubmit. Glasgow must attach everything and resubmit. GR circulated the code and Manchester has already resubmitted. Imperial's form has not yet been returned, but 9 others have; GR can see when these have been actioned.

  2. Named Posts

GR has updated the table with AD’s Tier-1 input, so there is now a completed table with all effort and nominal names attached. There are two tables: the name table, and the updated financial table (Table 20 in the proposal), which must match the JeS forms. Imperial needs to resubmit so that DB can update it.

Risk Register will be re-submitted.

  3. Metrics & Milestones

There is an incomplete set of metrics and milestones that requires input from the PMB. The PMB agreed that the wording explaining that GridPP is not a construction project like others normally considered by the PPRP is useful, and that the metrics have been very effective for previous iterations of GridPP.

The Metrics table was discussed. AD will check and add a few more milestones.

  • WP1b is fine.
  • WP1c is just metrics and is fine.
  • WP2 requires input, particularly for CMS, which needs numbers (DC will input, e.g. Alice). The timing for reporting also needs clarification: GR will amend it to quarterly for most, for consistency, with a few monthly for reviewing (AD will make appropriate suggestions).
  • WP3 requires some working up, and David Crooks will add some Security actions.
  • WP4 was questioned: it provides services to the LHC, and other service providers are not asked to provide milestones 4 years in advance. This supports our suggestion that milestones are not appropriate for all aspects of GridPP (e.g. unknown development requirements), except for items that can be planned, such as tape storage and cloud.
  • WP5 is management.

PC will progress with input from AD where relevant.

ACTION 703.1: DC to provide figures for WP2 numbers.

3) Q4/18 reports missing

GR confirmed these are all required and written up on the web.

  • Tier-1 – Darren has completed but needs the finance table from AD
  • Atlas – Tim will complete and AD will prioritise
  • CMS – Katie will pull together and send to DC for review
  • Operations – Matt will try to pull something together
  • NorthGrid – awaiting Manchester narrative, but otherwise complete

ACTION 703.2: AD will contact Darren (Tier-1), Tim (Atlas) and Katie (CMS) for Q4 reports.

DC has been filling in metrics; some in the current quarterly sections involve other Tier-1s, which is more challenging to express, and DC asked for clarity on how best to refer to other sites. GR suggested it should be fine to refer to other sites, e.g. by comparing efficiency percentages with those of other sites.

5) Standing Items

SI-0 Bi-Weekly Report from Technical Group (DC)
AD hosted a meeting on Friday regarding HTCondor. Liverpool is up and running, and there will be a further meeting in 3 months to discuss this, other sites (some of which are in testing), and the associated documentation to improve and simplify the accounting process. There was discussion about having HTCondor running at larger sites before other jobs are attempted.

SI-1 ATLAS Weekly Review and Plans (RJ)
RJ recently reported that RAL jobs were falling over and confirmed this has now been fixed. The RAL Echo quota for Atlas is being increased, with an associated migration of some services.

SI-2 CMS Weekly Review and Plans (DC)
DC noted that the offline computing week is this week; DC will be attending IRIS instead but will join some discussions remotely.

SI-3 LHCb Weekly Review and Plans (PC)
Nothing to report.

SI-4 Production Manager’s report (JC)
No report submitted.

SI-5 Tier-1 Manager’s Report (AD)
– The Tier-1 has met its 2019 pledge for CPU and Disk. The disk capacity is available on Echo so LHCb and ALICE will only see the increase once they migrate. Tape capacity is provided as required.

– gdss733 had a double disk failure in Castor. It was out of production from 27th to 29th March.

– CMS CPU efficiency has been poor all last week (25% – 40%). Efficiency started dropping around the 20th March. Tier-1 CPU usage is attached.

– We fixed a problem that prevented LSST jobs from running (HTCondor had not been updated with the new VOMS server). We will tweak the fairshares today to allow them to run quite a bit more for the next few days.

SI-6 LCG Management Board Report of Issues (DB)
Meeting planned for tomorrow. DUNE is now informally part of the WLCG Management Board.

SI-7 External Contexts (PC)
PC briefly mentioned UKRI roadmaps.


644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress. 08/10/18 – Leicester are producing a PO for tapes and will send to AD to produce an invoice). Ongoing.

702.1: DC to identify an LZ presentation for GridPP42. Ongoing.

702.2: ALL to contribute paragraphs based on economic impact and evidence of GridPP working with industrial partners (Update: PC has sent a version). Done.

702.3: GR to update Table 20 to bring numbers in line with the returned JeS forms. Ongoing.

702.4: GR & DB to update risk section. Done.

702.5: PC to draft a set of milestones for WP4. (Update: AD will add content where necessary) Ongoing.

702.6: PC & DB to add some additional text to bring things together (WP1c). Ongoing.

ACTIONS AS OF 01.04.19

644.4: AD will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY? AD will now progress. 08/10/18 – Leicester are producing a PO for tapes and will send to AD to produce an invoice). Ongoing.

702.1: DC to identify an LZ presentation for GridPP42. Ongoing.

702.3: GR to update Table 20 to bring numbers in line with the returned JeS forms. Ongoing.

702.5: PC to draft a set of milestones for WP4. Ongoing.

702.6: PC & DB to add some additional text to bring things together (WP1c). Ongoing.

703.1: DC to provide figures for WP2 numbers.

703.2: AD will contact Darren (Tier-1), Tim (Atlas) and Katie (CMS) for Q4 reports.