Difference between revisions of "Discussion of: DOE/NSF Workshop on Correctness in Scientific Computing"


Revision as of 14:39, 17 January 2024

David:

  • Bottom of page 5: "The overall approach taken to achieve end-to-end correctness will typically involve the use of uncertainty quantification (e.g., to account for noisy data), checks of agreement with real-world observations (e.g., pointwise validation against real-world measurements), and statistical methods (e.g., similar to those used to provide projections for hurricane trajectories)."

  • Top of page 7: "The opportunity costs of scientific computing software bugs can be arbitrarily high. If a bug surfaces during an experiment that cannot be repeated (e.g., processing a rare natural phenomenon), the world community stands to lose. This is one reason why we absolutely must have hardened and trustworthy components at our disposal. Unfortunately, practical experience (e.g., inability to handle “difficult matrices [132]”) suggests that we are really not prepared in this manner."

  • Middle of 3rd paragraph on page 7: "... novel types of interconnect technologies ..."
  • "last-mile neglect"
  • "... floating-point details of GPUs—especially Tensor Cores—can vary in subtle ways with respect to their rounding behaviors..."
  • "Ideally, “compatible” must mean well below—meaning, the numerical errors must ideally be a small fraction of the modeling error. While this fact is somewhat well-known, studies/tools to check whether such relationships hold are lacking..."
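
To make the last quoted point concrete, here is a minimal sketch of what a "well below" check might look like. It is not from the report: the helper names (to_f32, sum_f32, numerical_error, is_well_below), the 10% threshold, and the choice of a plain summation as the computation under test are all assumptions made for illustration. The numerical error is estimated by comparing a sum carried out with binary32 rounding against a correctly rounded reference.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values):
    """Left-to-right sum, rounding every intermediate result to binary32.

    This emulates single-precision accumulation using only the stdlib.
    """
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def numerical_error(values):
    """Absolute gap between the binary32 sum and a reference sum.

    math.fsum returns the correctly rounded binary64 sum of its inputs.
    """
    return abs(sum_f32(values) - math.fsum(values))


def is_well_below(num_err, model_err, fraction=0.1):
    """Hypothetical criterion: numerical error is at most `fraction` of the
    modeling error. The default of 0.1 is an assumption, not a standard."""
    return num_err <= fraction * model_err


if __name__ == "__main__":
    samples = [0.1] * 10000
    err = numerical_error(samples)
    # Example modeling-error budgets (made up for the demonstration).
    for model_err in (1.0e3, 1.0e-6):
        print(model_err, err, is_well_below(err, model_err))
```

In a real study the modeling error would come from validation against observations or from uncertainty quantification, and the reference would be a higher-precision or otherwise trusted run of the actual code rather than a toy sum.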