Difference between revisions of "EPSCI Group Meeting Apr. 6, 2021"
(Created page with " The meeting time is 10:00am. === Connection Info: === <div class="toccolours mw-collapsible mw-collapsed"> You can connect using [https://bluejeans.com/253300597 BlueJeans V...") |
|||
(2 intermediate revisions by the same user not shown) | |||
Line 34: | Line 34: | ||
#: | #: | ||
# Announcements | # Announcements | ||
− | #* Torri Jeske | + | #* Welcome to Torri Jeske |
− | #* David | + | #* David gone Thursday 4/8 through Friday 4/16 (back on Monday 4/19) |
− | #** | + | #** No EPSCI Group meeting Monday April 12th |
− | + | #* [[Fortnight Papers|Fortnight paper]] for May. 1st: [https://www.sciencedirect.com/science/article/abs/pii/S0010465521000151 HEP-Frame: Improving the efficiency of pipelined data transformation & filtering for scientific analyses] (delayed?) | |
− | #* [[Fortnight Papers|Fortnight paper]] for | ||
#: | #: | ||
#: | #: | ||
# Conferences and Workshops | # Conferences and Workshops | ||
− | |||
− | |||
#* [https://autonomous-discovery.lbl.gov/ Autonomous Discovery in Science and Engineering workshop] (April 20-22) | #* [https://autonomous-discovery.lbl.gov/ Autonomous Discovery in Science and Engineering workshop] (April 20-22) | ||
#* [https://indico.cern.ch/event/948465/page/21488-bulletin-1 vCHEP2021] (May 17-21) | #* [https://indico.cern.ch/event/948465/page/21488-bulletin-1 vCHEP2021] (May 17-21) | ||
Line 56: | Line 53: | ||
#* Scientific Software support | #* Scientific Software support | ||
#** JLab Common Environment (CE) + SPACK | #** JLab Common Environment (CE) + SPACK | ||
− | #*** | + | #*** EPSCI are now responsible for ROOT builds on CUE |
+ | #*** CentOS8 support | ||
#*** ServiceNow [https://jlab.servicenowservices.com/nav_to.do?uri=%2Fincident.do%3Fsys_id%3D8443178a1b782450f0b4dc6ce54bcb80%26sysparm_record_target%3Dincident%26sysparm_record_row%3D2%26sysparm_record_rows%3D3%26sysparm_record_list%3Dactive%3Dtrue%5Ecaller_id%3Djavascript:gs.getUserID()%5EORu_affected_user%3Djavascript:gs.getUserID()%5EORwatch_listCONTAINSjavascript:gs.getUserID()%5EORDERBYDESCopened_at (mapmanager, fputil, fpack, bos, bankdef)] | #*** ServiceNow [https://jlab.servicenowservices.com/nav_to.do?uri=%2Fincident.do%3Fsys_id%3D8443178a1b782450f0b4dc6ce54bcb80%26sysparm_record_target%3Dincident%26sysparm_record_row%3D2%26sysparm_record_rows%3D3%26sysparm_record_list%3Dactive%3Dtrue%5Ecaller_id%3Djavascript:gs.getUserID()%5EORu_affected_user%3Djavascript:gs.getUserID()%5EORwatch_listCONTAINSjavascript:gs.getUserID()%5EORDERBYDESCopened_at (mapmanager, fputil, fpack, bos, bankdef)] | ||
#** EIC | #** EIC | ||
+ | #*** Collaboration with ANL | ||
+ | #**** Gaudi -> JANA2 | ||
#*** ACTS | #*** ACTS | ||
− | |||
#** Offline frameworks (CLARA, JANA2) | #** Offline frameworks (CLARA, JANA2) | ||
#: | #: | ||
#* Data Transport | #* Data Transport | ||
− | #** | + | #** Meeting with ESnet this afternoon |
+ | #** Status of proposal | ||
#: | #: | ||
#* DAQ systems | #* DAQ systems | ||
Line 72: | Line 72: | ||
#: | #: | ||
#* A.I. | #* A.I. | ||
− | #** Multiple [https://docs.google.com/presentation/d/1lYenr970yuYyzvPz8MXb_pmHxvB8GX2DLPn6Lny8T1U/edit?usp=sharing FOAs] | + | #** Multiple [https://docs.google.com/presentation/d/1lYenr970yuYyzvPz8MXb_pmHxvB8GX2DLPn6Lny8T1U/edit?usp=sharing FOAs] + JLab LDRD |
− | #*** | + | #*** Collaboration with Theory on MCGen project [https://science.osti.gov/-/media/grants/pdf/foas/2021/SC_FOA_0002493.pdf DE-FOA-0002493] |
− | #*** | + | #*** Collaboration with BNL on AI scheduling [https://science.osti.gov/-/media/grants/pdf/foas/2021/SC_FOA_0002493.pdf DE-FOA-0002482] |
− | #** | + | #*** Collaboration with INDRA-ASTRA |
− | + | #*** Collaboration with Sergey F. on AI + FPGA | |
+ | #*** Surrogate Models proposal (NP, ASCR, LDRD?) | ||
+ | #*** Amplitude Analysis Inverse Problem (LDRD) | ||
#** Jupyterhub + GPU | #** Jupyterhub + GPU | ||
#** Experimental Controls | #** Experimental Controls | ||
Line 86: | Line 88: | ||
#** OSG | #** OSG | ||
# AOT | # AOT | ||
+ | |||
+ | <div class="toccolours mw-collapsible mw-collapsed"> | ||
+ | Message from Bob Michaels officially handing over ROOT responsibilities to EPSCI <font size="-3">(Click "Expand" to the right for details -->):</font> | ||
+ | <div class="mw-collapsible-content"> | ||
+ | <pre> | ||
+ | BTW, I'm officially passing this job (building and maintaining ROOT) to you, now, David. | ||
+ | If you need some help, let me know. Of course, I can answer questions and help resolve | ||
+ | problems with the old builds. | ||
+ | |||
+ | yours | ||
+ | Bob | ||
+ | |||
+ | Dr. Robert Michaels | ||
+ | Staff Scientist, Jefferson Lab | ||
+ | </pre> | ||
+ | </div> | ||
+ | </div> | ||
<hr> | <hr> | ||
Line 91: | Line 110: | ||
=== Minutes: === | === Minutes: === | ||
− | + | Attendees: David L., Carl T., Nathan B., Kishan R., Vardan G., Thomas B., Mike G., Torri J. | |
+ | |||
+ | * SPACK | ||
+ | ** Still some work needed for fully functional deployment | ||
+ | ** EPSCI has now taken over responsibility for building ROOT on CUE from Bob Michaels | ||
+ | *** Need a testing procedure to verify builds, since ROOT is more important than most software packages | ||
+ | ** CentOS8 has a very limited support window | ||
+ | *** We should drop spack support for CentOS8 and replace it with another OS based on what SciComp Ops is thinking | ||
+ | |||
+ | * EIC | ||
+ | ** Met w/ Dmitry last week to discuss merging of efforts with ANL | ||
+ | *** Nathan looking at clarifying scope of project to convert ANL code from GAUDI to JANA2 | ||
+ | *** Discussed need for additional personpower for supporting this effort. Request sent to upper management | ||
+ | ** ACTS | ||
+ | *** Nathan working on implementing ACTS examples with JANA2 to learn more about system. | ||
+ | |||
+ | * JANA2 | ||
+ | ** Nathan working on integrating with CLARA as a microservice | ||
+ | *** Some differences with basic data/execution flow between JANA and CLARA that need to be worked out | ||
+ | |||
+ | * CLARA | ||
+ | ** Issue with occasional truncation of files (<1%) when processing multiple files | ||
+ | ** With Raphaella's help, ran ~100 farm jobs and deciphered the cause from the log files. | ||
+ | ** The cause was lost synchronization in one thread. A subsequent thread, launched to process the next file in the list, killed the thread where the original issue developed, masking it. | ||
+ | |||
+ | * Data Transport -> EJFAT | ||
+ | ** EJFAT = ESnet/JLab + FPGA + Accelerated Transport (pronounced "Edge Fat" = fat data pipe from the edge) | ||
+ | ** Meeting today to discuss data format | ||
+ | |||
+ | * SRO | ||
+ | ** Monday meeting had only a few participants, and technical issues limited discussion | ||
+ | ** Some discussion of EVIO format of transient data (Dave A., Carl T., Vardan G.) | ||
+ | ** Another test run by Vardan using software source: | ||
+ | *** 12GB RAM, 15 cores, 2.2GB/s | ||
+ | ** Some work with object pools | ||
+ | ** David challenged Carl to learn how to reproduce Vardan's performance tests independently | ||
+ | |||
+ | * CODA | ||
+ | ** Carl continues work on EVIO-6 event viewer GUI | ||
+ | ** two minor user requests: | ||
+ | *** More verbose info from user scripts run during transitions | ||
+ | *** Support for setting more environment variables in COOL | ||
+ | |||
+ | * AI | ||
+ | ** FOAs + LDRD | ||
+ | *** Many discussions last week. We have potential involvement in several. Primary authorship on 1. | ||
+ | *** Thomas working on LDRD proposal to support work related to Early Career Award | ||
+ | ** Jupyterhub | ||
+ | *** Kishan tested running training on GPU via Jupyterhub and the epsci-notebook. Some library errors. | ||
+ | **** Communication with Wes led to a working system: the secret libs directory (.singularity.d/libs) was added to LD_LIBRARY_PATH, coupled with the CUDA installation in /apps. | ||
+ | ** Experimental Controls | ||
+ | *** Torri met with Noami yesterday, who pointed her to some software and gave a tour of the DB. | ||
+ | *** Able to run plugins over raw data and generate ROOT files. Next step is to examine contents. | ||
+ | |||
+ | * Offsite Computing | ||
+ | ** GlueX is working on a revised XSEDE proposal (due April 15th) | ||
+ | ** OSG | ||
+ | *** Job queue has been steadily catching up since removing lustre mounts from scosg16 | ||
+ | *** Some issues arose in the last day that appear to have caused a slowdown. They are being investigated. | ||
+ | *** Changes made to monitoring that make it appear as though it is updating faster |
Latest revision as of 16:50, 6 April 2021
The meeting time is 10:00am.
Connection Info:
You can connect using BlueJeans Video conferencing (ID: 253 300 597).
Meeting URL: https://bluejeans.com/253300597?src=join_info
Meeting ID: 253 300 597
Want to dial in from a phone? Dial one of the following numbers:
+1.888.240.2560 (US Toll Free) (see all numbers - https://www.bluejeans.com/premium-numbers)
Enter the meeting ID and passcode followed by #
Connecting from a room system? Dial: bjn.vc or 199.48.152.152 and enter your meeting ID & passcode
Agenda:
- Previous meeting
- Announcements
- Welcome to Torri Jeske
- David gone Thursday 4/8 through Friday 4/16 (back on Monday 4/19)
- No EPSCI Group meeting Monday April 12th
- Fortnight paper for May 1st: HEP-Frame: Improving the efficiency of pipelined data transformation & filtering for scientific analyses (delayed?)
- Conferences and Workshops
- Autonomous Discovery in Science and Engineering workshop (April 20-22)
- vCHEP2021 (May 17-21)
- Thomas, Kishan: Hydra
- Vardan, Nathan, David (+Hall-B, Fast Electronics, and TriDAS groups): TriDAS + JANA2 SRO
- David: HOSS!
- ACAT2021 (Nov. 29 - Dec. 3)
- Ongoing Activities
- Scientific Software support
- JLab Common Environment (CE) + SPACK
- EPSCI are now responsible for ROOT builds on CUE
- CentOS8 support
- ServiceNow (mapmanager, fputil, fpack, bos, bankdef)
- EIC
- Collaboration with ANL
- Gaudi -> JANA2
- ACTS
- Offline frameworks (CLARA, JANA2)
- Data Transport
- Meeting with ESnet this afternoon
- Status of proposal
- DAQ systems
- SRO
- SAMPA + ERSAP + JANA2 + INDRA-ASTRA = April 1st + 2 weeks
- CODA (CODA3 support, EVIO-6)
- A.I.
- Multiple FOAs + JLab LDRD
- Collaboration with Theory on MCGen project DE-FOA-0002493
- Collaboration with BNL on AI scheduling DE-FOA-0002482
- Collaboration with INDRA-ASTRA
- Collaboration with Sergey F. on AI + FPGA
- Surrogate Models proposal (NP, ASCR, LDRD?)
- Amplitude Analysis Inverse Problem (LDRD)
- Jupyterhub + GPU
- Experimental Controls
- Offsite Computing
- NERSC, PSC, IU
- XSEDE application for PSC bridges-2 being updated for resubmission (due April 15th)
- OSG
- AOT
Message from Bob Michaels officially handing over ROOT responsibilities to EPSCI:
BTW, I'm officially passing this job (building and maintaining ROOT) to you, now, David.
If you need some help, let me know. Of course, I can answer questions and help resolve
problems with the old builds.

yours
Bob

Dr. Robert Michaels
Staff Scientist, Jefferson Lab
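With ROOT builds now an EPSCI responsibility, the minutes note the need for some procedure to verify builds. A minimal sketch of what a smoke test might look like is given here, assuming the freshly built ROOT is already on PATH. The list of checks and the names SMOKE_CHECKS and run_smoke_checks are hypothetical and are not an agreed EPSCI procedure; root-config --version and root -l -b -q are standard ROOT entry points, and the PyROOT import only succeeds if the build enabled Python bindings.

```python
import shutil
import subprocess

# Hypothetical set of checks (an assumption, not an agreed procedure).
SMOKE_CHECKS = [
    ["root-config", "--version"],        # build is installed and reports a version
    ["root", "-l", "-b", "-q"],          # interpreter starts in batch mode and exits
    ["python3", "-c", "import ROOT"],    # PyROOT bindings load (if they were built)
]


def run_smoke_checks(checks=SMOKE_CHECKS, runner=subprocess.run, which=shutil.which):
    """Run each check and return a list of (command, passed) pairs.

    A check fails if its executable is not found on PATH or if it exits
    with a non-zero return code.
    """
    results = []
    for cmd in checks:
        if which(cmd[0]) is None:
            results.append((cmd, False))
            continue
        proc = runner(cmd, capture_output=True, text=True)
        results.append((cmd, proc.returncode == 0))
    return results
```

Calling run_smoke_checks() after sourcing a build's environment returns one pass/fail entry per command. The runner and which arguments exist only so the logic can be exercised without ROOT installed.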
Minutes:
Attendees: David L., Carl T., Nathan B., Kishan R., Vardan G., Thomas B., Mike G., Torri J.
- SPACK
- Still some work needed for fully functional deployment
- EPSCI has now taken over responsibility for building ROOT on CUE from Bob Michaels
- Need a testing procedure to verify builds, since ROOT is more important than most software packages
- CentOS8 has a very limited support window
- We should drop spack support for CentOS8 and replace it with another OS based on what SciComp Ops is thinking
- EIC
- Met w/ Dmitry last week to discuss merging of efforts with ANL
- Nathan looking at clarifying scope of project to convert ANL code from GAUDI to JANA2
- Discussed need for additional personpower for supporting this effort. Request sent to upper management
- ACTS
- Nathan working on implementing ACTS examples with JANA2 to learn more about system.
- JANA2
- Nathan working on integrating with CLARA as a microservice
- Some differences with basic data/execution flow between JANA and CLARA that need to be worked out
- CLARA
- Issue with occasional truncation of files (<1%) when processing multiple files
- With Raphaella's help, ran ~100 farm jobs and deciphered the cause from the log files.
- The cause was lost synchronization in one thread. A subsequent thread, launched to process the next file in the list, killed the thread where the original issue developed, masking it.
- Data Transport -> EJFAT
- EJFAT = ESnet/JLab + FPGA + Accelerated Transport (pronounced "Edge Fat" = fat data pipe from the edge)
- Meeting today to discuss data format
- SRO
- Monday meeting had only a few participants, and technical issues limited discussion
- Some discussion of EVIO format of transient data (Dave A., Carl T., Vardan G.)
- Another test run by Vardan using software source:
- 12GB RAM, 15 cores, 2.2GB/s
- Some work with object pools
- David challenged Carl to learn how to reproduce Vardan's performance tests independently
- CODA
- Carl continues work on EVIO-6 event viewer GUI
- two minor user requests:
- More verbose info from user scripts run during transitions
- Support for setting more environment variables in COOL
- AI
- FOAs + LDRD
- Many discussions last week. We have potential involvement in several. Primary authorship on 1.
- Thomas working on LDRD proposal to support work related to Early Career Award
- Jupyterhub
- Kishan tested running training on GPU via Jupyterhub and the epsci-notebook. Some library errors.
- Communication with Wes led to a working system: the secret libs directory (.singularity.d/libs) was added to LD_LIBRARY_PATH, coupled with the CUDA installation in /apps.
- Experimental Controls
- Torri met with Noami yesterday, who pointed her to some software and gave a tour of the DB.
- Able to run plugins over raw data and generate ROOT files. Next step is to examine contents.
- Offsite Computing
- GlueX is working on a revised XSEDE proposal (due April 15th)
- OSG
- Job queue has been steadily catching up since removing lustre mounts from scosg16
- Some issues arose in the last day that appear to have caused a slowdown. They are being investigated.
- Changes made to monitoring that make it appear as though it is updating faster
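
The Jupyterhub item in the AI section describes fixing GPU library errors by adding the container's libs directory to LD_LIBRARY_PATH. A minimal sketch of that idea follows, assuming the fix is applied to the environment of a child process (changing LD_LIBRARY_PATH inside an already running Python process does not affect libraries the dynamic linker has already resolved). The helper name with_library_dirs is hypothetical, and the exact directories used on the JLab system are not given in the minutes beyond .singularity.d/libs and a CUDA installation somewhere under /apps.

```python
def with_library_dirs(env, extra_dirs):
    """Return a copy of env with extra_dirs prepended to LD_LIBRARY_PATH.

    Existing entries are kept after the new ones, and duplicates are dropped
    so that repeated calls do not grow the variable.
    """
    new_env = dict(env)
    existing = [p for p in new_env.get("LD_LIBRARY_PATH", "").split(":") if p]
    merged = []
    for path in list(extra_dirs) + existing:
        if path not in merged:
            merged.append(path)
    new_env["LD_LIBRARY_PATH"] = ":".join(merged)
    return new_env
```

For example, the returned dictionary could be passed as the env argument of subprocess.run when launching a training script, with extra_dirs listing the container libs directory and whichever CUDA library directory applies locally.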