Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Bless: Transparently logging program outputs

Bless: Transparently logging program outputs

From https://scientificcomputing.rs/2024/talks/goswami.html

We introduce bless, a tool geared towards multiple runs of codes in flux. The key idea is to use Rust's logging libraries to transparently annotate outputs and additionally use a MongoDB database to store the output and metadata. We find this raises interesting problems while handling the outputs of MPI commands asynchronously. Additionally we provide alternatives to (PROG 2>&1) | tee blah.out, with the ability to use a configuration file and prepare a tar.gz of outputs and inputs. This is invaluable when working with minor changes to multiple codebases, as is typical in the early stages of a scientific project.

Rohit Goswami

July 21, 2024
Tweet

More Decks by Rohit Goswami

Other Decks in Programming

Transcript

  1. + Prof. Dr. - Ing. Morris Riedel + Prof. Birgir

    Hrafnkelsson CECAM Workshop :: Electronic Structure Software Development: Advancing the Modular Paradigm
  2. Reproducible Research (1) Community, T. T. W.; Arnold, B.; Bowler,

    L.; Gibson, S.; Herterich, P.; Higman, R.; Krystalli, A.; Morley, A.; O’Reilly, M.; Whitaker, K. The Turing Way: A Handbook for Reproducible Data Science, 2019. https://doi.org/10.5281/zenodo.3233986 (accessed 2020-09-20).
  3. Workflow engines • Snakemake • Fireworks • AiiDA (1) Pizzi,

    G.; Cepellotti, A.; Sabatini, R.; Marzari, N.; Kozinsky, B. AiiDA: Automated Interactive Infrastructure and Database for Computational Science. Computational Materials Science 2016, 111, 218–230. https://doi.org/10.1016/j.commatsci.2015.09.013.
  4. Registering workers • Plugin ecosystems are present • Tasks are

    composable But rely on STABLE underlying codes! (1) Rüßmann, P.; Bertoldo, F.; Blügel, S. The AiiDA-KKR Plugin and Its Application to High-Throughput Impurity Embedding into a Topological Insulator. npj Comput Mater 2021, 7 (1), 1–9. https://doi.org/10.10 38/s41524-020-004 82-5.
  5. Workflow limitations • Development code is typically iterated over very

    quickly ◦ No stable API • Debugging via stdout is common • Logging is unreliable • Outputs can be variable • Timing information may / may not be provided
  6. Life without bless • 1 is stdout • 2 is

    stderr blah > foo.out Pipe into tee to view while writing ./a.out > i_swear_this_is_an_important_labeled_run …
  7. Desiderata • Better logging • Labeling runs • Unified output

    • Compression? • File (.gz) output • MongoDB output • Labels and UUID bless Feature
  8. File logging • Additional information ◦ trace! ◦ UUID, Label,

    Time • Time stamped stdout ◦ info! • bless errors ◦ error! • Time stamped stderr ◦ warn! bless --label "orca_nopar" -- $(which orca) orca.inp # Generates export FNAME='orca_nopar_cd523354-35f5-4d1d-a627-082139d243 ef.log.gz' zcat $FNAME | head -n5 [2024-07-17T12:57:20Z TRACE] Label: orca_nopar UUID: cd523354-35f5-4d1d-a627-082139d243ef zcat $FNAME | tail -n 3 [2024-07-17T12:57:20Z INFO] ****ORCA TERMINATED NORMALLY**** [2024-07-17T12:57:20Z INFO] TOTAL RUN TIME: 0 days 0 hours 0 minutes 0 seconds 197 msec [2024-07-17T12:57:20Z TRACE] orca orca.inp took 206ms 450us 233ns to complete.
  9. MongoDB Backend • No additional metadata in the .gz •

    Binary stores warn! and info! #[derive(Serialize, Deserialize, Clone, Debug)] pub struct SaveGzipBlobParams<'a> { pub cmd: &'a str, pub args: &'a str, pub label: &'a str, pub duration: &'a str, pub uuid: &'a str, pub file_path: &'a Path, #[serde(with = "bson_datetime_as_rfc3339_string")] pub start_time: DateTime, #[serde(with = "bson_datetime_as_rfc3339_string")] pub end_time: DateTime, }
  10. Using mongo { _id: ObjectId('66975007f17cad7b36c2430c'), cmd: '/local/orca', args: 'orca.inp', label:

    'orca_nopar_mongod', run_uuid: 'c58d557d-9783-4c48-8f5d-f1c56508730f', duration: '185ms 385us 646ns', start_time: ISODate('2024-07-17T05:00:55.231Z'), end_time: ISODate('2024-07-17T05:00:55.417Z') }, npx mongsh use local # Show all entries db.commands.find() # Suppress blob data db.commands.find({}, { gzip_blob: 0 }) python scripts/get_db_gzip.py --db-name local --collection-name commands --query-field args --query-value orca.inp
  11. Timing Commands let stdout_handle = std::thread::spawn(move || { for line

    in stdout_reader.lines() { match line { Ok(line) => info!("{}", line), Err(e) => error!("Error reading stdout: {}", e), } } }); let start_time = std::time::SystemTime::now(); if let Err(e) = run_command(command, args).await { error!("Failed to run command: {} {}", command, args.join(" ")); error!("Error: {}", e); } let end_time = std::time::SystemTime::now(); Process timer Time-stamping stdout
  12. Benchmarks Command Mean [ms] Min [ms] Max [ms] Relative MONGO

    run 203.7 ± 10.3 187.3 227.8 1.08 ± 0.07 orca orca.inp > b 188.7 ± 8.2 179.8 205.5 1.00 FileWriter run 193.0 ± 8.6 180.4 204.7 1.02 ± 0.06 MONGODB_URI="mongodb://localhost:27017/" bless --label "orca_nopar_mongod" --use-mongodb -- $(which orca) orca.inp bless --label "orca_nopar" -- $(which orca) orca.inp
  13. Takeaways • Near zero overhead ◦ Better development logging •

    Uses less space • Adds context • Provenance, beyond git https://github.com/haozeke/bless Contributors welcome!! v0.1.0