Online Recon - Expert
Servers
Currently everything in this section is running on clonfarm11, but that may change.
Software Installations
The core HPS java software builds for online, multi-threaded reconstruction live in ~hpsrun/online_recon and were installed using the instructions on confluence (https://confluence.slac.stanford.edu/display/hpsg/Online+Reconstruction+Tools).
We may decide the hps-java installation there should be a symlink to a standard one elsewhere, but note that it requires an extra installation step and (currently) a non-master git branch.
If the tomcat servlet needs updating, it will need to be rebuilt and redeployed.
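As a rough sketch only, a rebuild and redeploy might look like the following. The checkout location under online_recon, the branch name, and the war name HPSRecon.war (guessed from the client URL below) are all assumptions; the authoritative steps are the confluence instructions linked above.

 # hypothetical paths and names; consult the confluence instructions for the real steps
 cd ~hpsrun/online_recon/hps-java                     # assumed checkout location of hps-java
 git checkout <online-recon-branch>                   # currently a non-master branch
 mvn clean install -DskipTests                        # hps-java builds with Maven
 sudo cp <path-to>/HPSRecon.war /var/lib/tomcat/webapps/   # redeploy the servlet; war name guessed from the client URL
 sudo systemctl restart tomcat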
Operations
Everything operates as user=hpsrun and startup originates in $EPICS/apps/iocBoot/procServ.conf. There are 4 components, started sequentially, named:
- dqm_et
- just an ET ring, offline only
- dqm_evio2et
- pipes an EVIO file to the ET ring, offline only
- dqm_server
- receives instructions from the client below
- dqm_client
- instructs the server on parameters, e.g. the number of threads and the steering file
Each component is spawned automatically in procServ and is accessible via a telnet session by running
- softioc_console NAME
Once connected, individual components can be interactively killed, paused, or restarted using the ctrl-X/T sequences (which are printed when you connect). The dqm_client process can also be used interactively within telnet. To disconnect from any telnet session, ctrl-] and then type quit.
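For illustration, a console session might look like the following; the exact control-key sequences are printed in the banner when you connect, so treat these as an example rather than a reference:

 softioc_console dqm_client     # attach to the dqm_client console (telnet underneath)
 #  ctrl-X / ctrl-T             # kill or toggle the child process, per the banner printed on connect
 #  ctrl-]                      # escape to the telnet prompt
 telnet> quit                   # disconnect, leaving the component running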
Configuration files for all components are currently in their startup directory, $EPICS/apps/iocBoot/dqm.
A script that performs a full teardown and restart of all components, including the tomcat server, is in user=hpsrun's $PATH:
- hps-dqm-restart.sh
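Typical usage, as user=hpsrun on the machine hosting the components (no arguments are documented here, so assume it is run bare):

 ssh hpsrun@clonfarm11        # the components currently run on clonfarm11
 hps-dqm-restart.sh           # tears down and restarts all four dqm components plus tomcat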
Clients
Both clients should work from any clon machine:
- web browser @ http://clonfarm2.jlab.org:8080/HPSRecon/
- this does not (currently) update automatically and requires a manual refresh; a quick reachability check follows this list
- clicking on a detector displays many plots simultaneously
- jas3-->Tools-->Connect
- this has the advantage that it refreshes automatically
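A quick way to confirm the servlet is reachable before starting either client (from any clon machine; only the URL above is assumed):

 curl -sI http://clonfarm2.jlab.org:8080/HPSRecon/ | head -1    # expect an HTTP 200 status line if tomcat and the servlet are up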
Miscellaneous
To Do
- The dqm_server configuration will need updating for the online ET ring.
- The dqm_client configuration will need updating for different steering files and run numbers.
- The stop/start functionality in the server appears to be unreliable, and the tomcat servlet also appears to need a restart whenever the recon server is restarted. Both will be needed in order to zero the histograms, which will happen every run and every time the beam is away for a significant amount of time.
- Meanwhile, a script to do a full teardown and restart is available and may well be sufficient.
Setup
To set up a new clon machine, the following may be necessary:
- yum install tomcat tomcat-admin-webapps ksh telnet
- edit /etc/tomcat/tomcat-users.xml and enable the "manager-gui" role and "admin" user
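An illustrative snippet for /etc/tomcat/tomcat-users.xml; the username and password below are placeholders, and the stock file ships with commented-out examples to copy from:

 <role rolename="manager-gui"/>
 <user username="admin" password="CHANGE_ME" roles="manager-gui"/>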
To enable user=hpsrun to restart the tomcat server from a batch script (as the aforementioned restart script does):
- visudo on the machine running the tomcat server and add these lines:
%onliners ALL=NOPASSWD:/usr/bin/systemctl stop tomcat
%onliners ALL=NOPASSWD:/usr/bin/systemctl start tomcat
%onliners ALL=NOPASSWD:/usr/bin/systemctl restart tomcat
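With those entries in place, a script running as user=hpsrun (assuming hpsrun is a member of the onliners group) can bounce tomcat without a password prompt, for example:

 sudo /usr/bin/systemctl restart tomcat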
An earlier problem where tomcat was unhappy on clonfarm3, apparently a permissions issue that was never fully tracked down, is probably an instance of the more general issue below.
On some machines, tomcat installations suffer from a runtime directory permissions issue. It appears possibly related to JLab's LDAP changes in the previous year, in which groups with the same name as a user had -grp appended. The current workaround is to modify the tomcat service unit in /usr/lib/systemd/system/ so that it runs as root; a minimal sketch follows.
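A minimal sketch of that workaround, assuming the stock unit file is /usr/lib/systemd/system/tomcat.service and contains User=tomcat / Group=tomcat lines (a drop-in override created with systemctl edit tomcat would also work and survives package updates):

 # edit /usr/lib/systemd/system/tomcat.service and change the service account:
 #   User=root
 #   Group=root
 # then pick up the change and restart tomcat:
 sudo systemctl daemon-reload
 sudo systemctl restart tomcat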