GridPP PMB Meeting 660 (12.02.18)
Present: Pete Gronbech (Chair), Tony Cass, Jeremy Coles, David Colling, Roger Jones, Dave Kelsey, Steve Lloyd, Andrew Sansum, Louisa Campbell (Minutes).

Apologies: Dave Britton, Pete Clarke, Tony Doyle, Andrew McNab, Gareth Smith.

1. OSC Documents
CMS input should be submitted today.
PG uploaded copies of OSC documents, all of which require input and editing. Members were reminded of the timetable – final check by 19 February for submission on 26 February.
Introduction section (DB) needs to be updated following input on other sections.
Wider Context (PC) has text and a diagram (there was some discussion on whether the diagram should be removed); AS and DK will update the information, including the EOS hub.
GridPP5 – PG is finalising.
Project Map – awaiting Q4 reports for completion.
Risk Register – recently reviewed and updated, has been included and summarised in the report.
Tier1 Status – now complete.
Deployment Status – JC will complete by tomorrow/Wednesday (some comment from PG and DC may be useful, especially on the Tier2 Evolution section). Some input is required from Duncan on the London Grid and Other VOs; DC will prompt him.
Atlas Report – now complete.
CMS Report – DC is completing.
LHCb Report – now complete.
Other VOs – follows on from LHCb reports and PG will tweak.
PG reminded members that all contributions are now required as a matter of urgency to meet the timetable. A copy of the latest finances for the Financial Document would also be useful.

2. Questionnaire
The STFC questionnaire about the balance of programmes has been circulated and discussed by members over the last week. SL has completed sections and DB circulated some draft text expressing some concerns. PC has circulated additional comments which may be merged with DB’s text. Researchfish is mentioned in the questionnaire – the submission period has started – PG has uploaded c. 100 Atlas and 100 CMS papers and has asked DC for input to ensure all relevant papers are included. This topic will be discussed further via email and perhaps at next week’s PMB.
The usual request was made for updates to other sections of Researchfish, e.g. positions held in external bodies. For example, DC has joined new committees since last year, so these will be updated.

3. GridPP40 Agenda Items
PG provided the link to the draft GridPP40 agenda and asked for input/suggestions for inclusion. This is shaping up very well:
Session 1 Status – is well developed.
Session 2 Accountancy & Efficiency – is well developed.
Session 3 Storage – well developed, perhaps to include hybrid infrastructure from Jens and Dan (some clarification on this would be useful). IPv6 should be included – perhaps Duncan can speak. Data science infrastructure is a suggestion from Jens’ email that could perhaps be better defined.
Session 4 Network & Security – DC may ask someone from Network Services to contribute. Perhaps some input from Janet’s perspective, re issues discussed in Japan – the next LHCONE/LHCOPN meeting is taking place in the UK in March and may result in some interesting topics to discuss. DK cannot chair as he will not be attending.
Session 5 Operations – needs to be developed. SL7 is included – containers should perhaps be covered (Alessandra has volunteered to cover).
Session 6 TBC – needs to be developed. Future directions/status/requirements from other experiments may be of interest. This will be covered again this week.


5. Standing Items

SI-0 Bi-Weekly Report from Technical Group (DC)
There has not been a meeting – this is due on Friday.

SI-1 ATLAS Weekly Review and Plans (RJ)
RJ reported that the consolidated grant submission is imminent.

SI-2 CMS Weekly Review and Plans (DC)
Nothing significant to report.

SI-3 LHCb Weekly Review and Plans (PC)
Nothing significant to report.

SI-4 Production Manager’s report (JC)
There are no pressing updates from the operations area. Patching due to Spectre/Meltdown continues. Some items that may be of interest:

1. There was a pre-GDB on 2nd February looking at HPC utilization:

Driving questions were:

* Which resource share comprises HPC now and in the future?

* Are HPC systems part of the pledge (will they become part of the pledge)?

* What sort of experiment tasks can/should we run on supercomputers?

* Are there uncommon infrastructure pieces/features in the middleware that HPC systems critically rely on?

* Do the HEP software stacks need to assimilate, e.g. MPI-based job scheduling?

2. Tomorrow (13th February) there is a discussion about GPU utilisation, with a lot of UK participation.

3. The February GDB is on Wednesday 14th; the agenda includes an update on benchmarking results following Spectre and Meltdown patching.

4. The EGI RP/RC A/R Report for January 2018 is available and shows the UK as green with 95% overall. Results were pulled down by issues on 18th/19th January.

DC noted the CERN VM images had not yet been patched and this may be an issue. JC noted there had been some mention of VMs being patched, but this may not be for CERN. JC will check the meeting overview and circulate.

SI-5 Tier-1 Manager’s Report (GS)
GS was not in attendance, no report submitted.
AS noted:
The Tier1 Manager position advert closed on 11 February; shortlisting will commence now, with interviews in 2-4 weeks.
There was a small glitch on the Echo storage system, but this was not significant (it ran out of active ports on the firewall, which can be expanded).
A CPU order went out early last week and the other is imminent if it has not already been submitted (AS will circulate a summary).
Closing dates for the Castor disk service are currently being mapped out, taking the h/w into account – the last generation of Castor h/w will expire at the end of March 2019. Consideration is being given to setting a firm turn-off of the service at the end of FY18. No h/w will be operational after March 2019, so there is sufficient time now to ensure this cut-off date is circulated. PG asked what proportion of data remained on Castor – CMS has ½ PB now but Chris aims to complete the migration by July; Atlas is being slightly held up and wants more disk to complete their migration. At the moment the plan is to deploy the remaining disk in Echo; LHCb do not require it yet and there is another tranche of h/w due soon. This is currently going well and Echo remains in good shape.
LHCb are undertaking some tests now – the Echo project meeting on Monday went round each experiment seeking specific input and this should be further clarified at the LHCb meeting this afternoon.

SI-6 LCG Management Board Report of Issues (DB)
DB was not in attendance and no report was submitted.

SI-7 External Contexts (PC)
AS noted the final UKT0 proposal (£4M for the next 4 years) is due for submission in the next few days. Preliminary review was positive and the Executive Board are satisfied – Dave Corney will attend BEIS at the beginning of March and the decision should be received on the same day. Feedback from BEIS to STFC appears to suggest there is funding available if a convincing case can be made.
UKT0 activity is gearing up on the technical roadmap (AS is temporarily chairing the group with a meeting in Manchester last Thursday and a planned meeting on Monday 19th on implementing the AAII infrastructure). DC and AM are involved so it should align with GridPP aspirations.
There will be a UKT0 collaboration meeting 14-16 March.

ACTIONS AS OF 12.02.18
644.3: AS put together a starting plan for staff ramp-down. (Update: a draft will be produced in January). Ongoing.
644.4: AS will progress capture of funds for Dirac with Mark Wilkinson. (Update: funding from DIRAC. AS has emailed Mark. They are now using it more heavily. Could use the money for tape, but have to be careful not to buy tape we won’t use. May be better charging later rather than during this FY?) Ongoing.
OSC documents MUST be done and submitted to PG this week.
649.2: PC will write Wider Context of OSC documents. Ongoing.
649.4: GS and AS will write the Tier1 Status section of OSC documents. Ongoing.
649.5: JC will write Deployment Status section of OSC documents with input from PG. Ongoing.
649.6: RJ, DC and AM will write LHC section of User Reports in OSC documents. Ongoing.
649.7: JC will write Other Experiments section of User Reports in OSC documents with input from DC and PG. Ongoing.
655.3: PG to consider the agenda and date for Tier1 review and include disaster recovery plans. (UPDATE: appropriate dates are being considered with AS). Ongoing.
656.1: DK will report before the end of February on any actions GridPP should take to comply with GDPR. Ongoing.
656.2: DC will report on CPU efficiencies. Ongoing.
657.2: DC to report on the CMS taskforce. Ongoing.