EPSCI Group Meeting Oct. 5, 2020

 
Latest revision as of 16:00, 5 October 2020

The meeting time is 10:00am.

Connection Info:

You can connect using BlueJeans video conferencing (ID: 253 300 597):

Meeting URL
 https://bluejeans.com/253300597?src=join_info

Meeting ID
253 300 597

Want to dial in from a phone?

Dial one of the following numbers:
+1.888.240.2560 (US Toll Free)
(see all numbers - https://www.bluejeans.com/premium-numbers)

Enter the meeting ID and passcode followed by #

Connecting from a room system?
Dial: bjn.vc or 199.48.152.152 and enter your meeting ID & passcode

Agenda:

  1. Previous meeting
  2. Announcements
  3. Future Trends in Nuclear Physics Computing
  4. Ongoing Activities
    • Offsite Computing
      • BigRed3 at IU (David)
      • NERSC (David)
        • AY21 Request submitted
        • Job bundling to start mid-late Oct.
      • OSG (Thomas)
    • EVIO-6 (Carl)
    • JANA2 (Nathan)
      • GlueX port
      • LDRDFest 3pm Friday (BlueJeans test 9:30am Thurs.)
    • A.I.
      • JLab AI webpage (Kishan)
      • Hydra (Kishan/Thomas)
      • Experimental Controls (David)
    • SRO (Vardan)
    • JLab Common Environment (CE) + SPACK (Thomas)
  5. GUI for Calorimeter calibration scripts (Hall-D Request) (Thomas)
  6. Publications
    • publishable projects
  7. Coding Standards
  8. AOT



Minutes:

Attendees: David L., Carl T., Nathan B., Thomas B., Kishan R.

Future Trends Conference

  • Common theme at many such conferences: lengthy discussion of cooperation and standards, but little effect on industry
  • Significant discussion on Research Software Engineer as a career path

Offsite Computing

  • NERSC
    • AY2021 request submitted for GlueX
    • Agreed to return 10M of 50M unit AY2020 allocation to George Fai to distribute to other NP projects
    • Job bundling work anticipated to start next week
  • IU - Big Red 3
    • Work starting this week to get GlueX jobs running on IU Big Red 3
  • OSG
    • Some small issues recently, but not due to the JLab end
    • Making JLab SciComp available for OSG jobs
      • Need to set up a Compute Element, a process that runs on a head node at JLab and turns OSG requests into submissions to our SLURM system
      • Examples and documentation are identified, but implementation work needs to be done
      • Hopefully completed in Oct.
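
As a rough illustration of the translation step a Compute Element performs, the sketch below renders a generic job request as a SLURM batch script. The request fields (name, cpus, memory_mb, walltime_minutes, command) and the function name are hypothetical and are not taken from OSG software or the JLab configuration; only the #SBATCH options are standard SLURM.

```python
def render_sbatch_script(request):
    """Render a hypothetical job request (a plain dict) as a SLURM batch script.

    The dict keys are illustrative assumptions, not an OSG or JLab format.
    """
    # SLURM accepts wall time as hours:minutes:seconds
    hours, minutes = divmod(request["walltime_minutes"], 60)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={request['name']}",
        f"#SBATCH --cpus-per-task={request['cpus']}",
        f"#SBATCH --mem={request['memory_mb']}M",
        f"#SBATCH --time={hours:02d}:{minutes:02d}:00",
        request["command"],
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical request, for illustration only
    example_request = {
        "name": "gluex_test",
        "cpus": 4,
        "memory_mb": 2000,
        "walltime_minutes": 90,
        "command": "echo hello",
    }
    print(render_sbatch_script(example_request), end="")
```

A real Compute Element also handles authentication, job status reporting, and cleanup, none of which is shown here.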

EVIO-6

  • Numerous bugs identified and squashed in buffer reading code
    • Inherited code was initially written to read from files, but was modified to read from buffers
  • Started looking into ReadTheDocs for package documentation
    • Proof-of-principle test performed to auto-generate documentation from GitHub source
    • Most existing documentation is in MS Word format; ReadTheDocs uses Sphinx
    • Some tools exist to convert MS Word into Sphinx input, but they need to be investigated to see whether the format conversion produces something appropriate
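
One candidate for the Word-to-Sphinx conversion is pandoc, which reads docx and writes reStructuredText. The minutes do not name a tool, so pandoc is an assumption here, and the helper name and file paths below are hypothetical. The sketch only builds the command line; whether the converted output is acceptable is exactly what still needs to be investigated.

```python
from pathlib import Path


def pandoc_docx_to_rst_command(docx_path, output_dir):
    """Build a pandoc command that converts one .docx file to .rst.

    Assumes pandoc is the conversion tool; the minutes do not specify one.
    """
    docx_path = Path(docx_path)
    # Keep the base name and change the extension to .rst
    rst_path = Path(output_dir) / (docx_path.stem + ".rst")
    return [
        "pandoc",
        "--from=docx",
        "--to=rst",
        f"--output={rst_path}",
        str(docx_path),
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only
    command = pandoc_docx_to_rst_command("users_guide.docx", "docs")
    print(" ".join(command))
    # With pandoc installed, the command could be executed with:
    #   subprocess.run(command, check=True)
```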

JANA2

  • 25/28 modules now ported. (Latest was ANALYSIS, which was particularly complicated.)
  • Currently tackling event sources and plugins (event processors). The latter are more straightforward, but the former are preventing linking and testing from beginning.
    • Nearly have JEventSource_EVIOpp compiling without errors (primary EVIO event source class for GlueX)

AI

  • Kishan is running on the sciml GPUs without the out-of-memory issues seen a few weeks ago. Assume the problem is fixed.
  • Web page
    • Kishan has prepared some slides to show regarding a proposed content/format/management plan
    • Kishan, Thomas, and David will meet Tuesday to go over them. (Chris T. is out of town this week.)
  • Hydra
    • Working now with TensorFlow 2.3
    • Some issues getting MirroredStrategy to work with flow_from_* methods (preloading images runs out of memory)
    • Common problem that does not seem to have a well-known solution at the moment.
    • May not be worth pursuing right now since training times have been reduced to a few hours
  • DOE Report on AI town halls
    • Thomas read through it and was inspired to think about a general-use AI report class
  • Accounts set up for the Experimental Controls projects. Job postings will be entered into the system this week.

SPACK

  • Thomas has started reviewing the documentation again in order to pick up where he left off
  • A group has already been set up on the CUE
  • Permissions on some directories created by Wouter were changed to make them group-writable

SRO

  • Vardan is out of town, so no SRO report this week