XAS Data Interchange: A file format for a single XAS spectrum
This is my presentation on XDI for the XAFS16 Satellite meeting on Data acquisition, treatment, storage – quality assurance in XAFS spectroscopy at DESY on August 21 2015.
A file format for a single XAS spectrum
Bruce Ravel (1), Matt Newville (2)
(1) NIST and NSLS-II, (2) University of Chicago
Data acquisition, treatment, storage – quality assurance in XAFS spectroscopy
DESY, 21-22 August 2015
PDF of this talk: https://goo.gl/Nv1l3G
1 / 10 XAS Data Interchange
the XAS world
Even small data volumes and the simplest XAS experiments have persistent problems:
1. Data archaeology – have you ever tried to extract data from the Farrel Lytle archive at IIT?
2. Moving data from the beamline to the data analysis package
3. Sharing data between different analysis packages
4. Submitting supplemental data with a publication
5. Building data-centered applications for the web, the desktop, and the palmtop (e.g. an editable archive of standards)
6. Extracting XAS data from a multispectral data set
formats
They require additional processing in order to display µ(E), including:
  - Conversion to energy
  - Dead-time or other corrections
  - Merging of 10s, 100s, or 1000s of scans and/or detectors
Ambiguous metadata, for instance:
  - How is the beamline identified?
  - What constitutes a user comment?
  - What describes the condition of the source or the beamline?
XAS data analysis software and plotting software may have difficulty importing and interpreting the data.
This data is probably not appropriate for submission to a journal as supplemental material.

Data interchange
A standard for the interchange of µ(E) data would help address most of these concerns.
data interchange format
The smallest unit of currency is the µ(E) spectrum.
1. Be easy for a human to read. Be easy for a computer to read.
2. Establish a common language for transferring µ(E) spectra between XAS experimenters, data analysis packages, web applications, journals and anything else that needs to process XAS data. This will enhance the user experience.
3. Increase the relevance and longevity of experimental data by reducing the amount of data archaeology future interpretations of that data will require.
4. Provide a mechanism for extracting and preserving a single XAS or XAS-like data set from a multispectral experiment or from a complex data structure.
5. Be a building block for hierarchical or relational data structures.
Interchange
XDI is an ad hoc format loosely based on the format of e-mail and structured in a way that looks like a familiar column data file.

    # XDI/1.0 MX/2.0
    # Beamline.name: 10ID
    # Beamline.harmonic_rejection: flat Rh-coated mirror
    # Facility.name: APS
    # Facility.xray_source: undulator A
    # Facility.energy: 7.00 GeV
    # Mono.name: Si 111
    # Mono.d_spacing: 3.1356 Angstrom
    # Element.symbol: Fe
    # Element.edge: K
    # Scan.edge_energy: 7112.00 eV
    # Scan.start_time: 2005-03-08T20:08:57
    # Column.1: energy eV
    # Column.2: mutrans
    # Column.3: i0
    # MX.Num-regions: 1
    # MX.SRB: 6900
    # MX.SRSS: 0.5
    # MX.SPP: 0.1
    # MX.Settling-time: 0
    # MX.Offsets: 11408.00 11328.00 13200.00 10774.00
    # MX.Gains: 8.00 7.00 7.00 9.00
    #///
    # Fe K-edge, Lepidocrocite powder on kapton tape, RT
    # 4 layers of tape
    # exafs, 20 invang
    #---
    # energy mutrans i0
    6899.9609 -1.3070486 149013.70
    6900.1421 -1.3006104 144864.70
    6900.5449 -1.3033816 132978.70
    6900.9678 -1.3059724 125444.70
    6901.3806 -1.3107085 121324.70
    (....etc....)
The data table is clearly organized into columns of numbers, with the abscissa (energy, in this case) as the left-most column. The non-data part of the file is clearly demarcated. This file can be imported as is into most common plotting and data processing tools (such as Excel, Origin, KaleidaGraph, and many others∗).

∗ such as that silly thing...
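As a rough illustration of why the demarcated layout is easy to import, here is a minimal Python sketch that reads the data table by skipping every line that starts with `#`. The function name `read_table` and the shortened sample text are assumptions made for this example; this is not the XDI library's API.

```python
# Shortened copy of the example file: header lines start with '#',
# data rows are whitespace-separated numbers.
SAMPLE = """\
# XDI/1.0 MX/2.0
# Column.1: energy eV
# Column.2: mutrans
# Column.3: i0
#---
# energy mutrans i0
6899.9609 -1.3070486 149013.70
6900.1421 -1.3006104 144864.70
6900.5449 -1.3033816 132978.70
"""


def read_table(text):
    """Return the data rows as lists of floats, ignoring '#' lines."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # header, comment, or blank line
        rows.append([float(token) for token in stripped.split()])
    return rows


rows = read_table(SAMPLE)
energy = [row[0] for row in rows]   # abscissa is the left-most column
mutrans = [row[1] for row in rows]
```

The same skip-the-`#`-lines rule is what most plotting tools apply when they treat `#` as a comment character.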
Metadata is clearly identified and grouped into useful “namespaces”. The data columns are identified and, where appropriate, units are given. For the programmers in the audience, XDI headers map directly onto an associative array. (Other programming languages call this a dictionary, symbol table, hash, or map.) The metadata dictionary defines 8 families: the six shown here (Beamline, Facility, Mono, Element, Scan, Column) plus Detector and Sample.
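The mapping onto an associative array can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression, the function name `parse_headers`, and the nested family/item layout are choices made for this example, not the behavior of the XDI C library or its bindings.

```python
import re

# Matches lines of the form "# Family.item: value" (pattern is an assumption
# based on the example file, not taken from the XDI specification).
HEADER = re.compile(r"^#\s*([A-Za-z][\w-]*)\.([\w-]+):\s*(.*)$")

SAMPLE = """\
# XDI/1.0 MX/2.0
# Mono.name: Si 111
# Mono.d_spacing: 3.1356 Angstrom
# Element.symbol: Fe
# Element.edge: K
# Scan.start_time: 2005-03-08T20:08:57
# Column.1: energy eV
# MX.Num-regions: 1
#///
# Fe K-edge, Lepidocrocite powder on kapton tape, RT
#---
# energy mutrans i0
6899.9609 -1.3070486 149013.70
"""


def parse_headers(text):
    """Return {family: {item: value}} for the metadata header lines."""
    meta = {}
    for line in text.splitlines():
        if line.startswith("#///") or line.startswith("#---"):
            break  # user comments and data follow; stop reading headers
        match = HEADER.match(line)
        if match:
            family, item, value = match.groups()
            meta.setdefault(family, {})[item] = value.strip()
    return meta


meta = parse_headers(SAMPLE)
```

With this layout, `meta["Element"]["symbol"]` gives `"Fe"` and `meta["Mono"]["d_spacing"]` gives the string `"3.1356 Angstrom"`, with the unit still attached to the value.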
Three pieces of metadata are required to be in the XDI file. The monochromator d-spacing is required if a correction needs to be made to the energy axis of the data. The symbol and edge of the element are required to unambiguously identify the data. For example, both Cr K and Ba LI have tabulated energies of 5989 eV, while Se K and Tl LIII are both at 12658 eV.
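A check for the three required items might look like the following Python sketch. It assumes exact, case-sensitive header names as they appear in the example file, and the function name `missing_required` is hypothetical; the validation in the real XDI library may differ.

```python
# The three required items named in the talk, spelled as in the example file.
REQUIRED = ("Element.symbol", "Element.edge", "Mono.d_spacing")

GOOD = """\
# XDI/1.0 MX/2.0
# Mono.d_spacing: 3.1356 Angstrom
# Element.symbol: Fe
# Element.edge: K
"""

# Same header without the monochromator d-spacing.
BAD = """\
# XDI/1.0 MX/2.0
# Element.symbol: Fe
# Element.edge: K
"""


def missing_required(text):
    """Return the required header names that do not appear in text."""
    present = set()
    for line in text.splitlines():
        if not line.startswith("#") or ":" not in line:
            continue
        # Everything between '#' and the first ':' is the header name.
        name = line[1:].split(":", 1)[0].strip()
        present.add(name)
    return [name for name in REQUIRED if name not in present]


good_missing = missing_required(GOOD)
bad_missing = missing_required(BAD)
```

Here `good_missing` is empty, while `bad_missing` reports `Mono.d_spacing`, the item a reader would need in order to correct the energy axis.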
Extension metadata – metadata specific to a beamline, a data acquisition system, or a data processing program – is specified by “extension headers”. These use the same format as standard metadata headers, but with a domain-specific “namespace”. In the example file, the MX. lines are extension headers.
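Because extension headers share the syntax of standard headers, software can separate the two by namespace alone. The Python sketch below assumes the eight family names listed in this talk and matches them case-sensitively; the function name `split_headers` is hypothetical and is not part of the XDI API.

```python
# The eight standard families named in the talk; anything else is treated
# as an extension namespace (for example "MX" in the sample file).
STANDARD_FAMILIES = {
    "Beamline", "Facility", "Mono", "Element",
    "Scan", "Column", "Detector", "Sample",
}

LINES = [
    "# XDI/1.0 MX/2.0",
    "# Beamline.name: 10ID",
    "# Scan.start_time: 2005-03-08T20:08:57",
    "# MX.SRB: 6900",
    "# MX.Gains: 8.00 7.00 7.00 9.00",
]


def split_headers(lines):
    """Return (standard, extension) dicts keyed by 'Family.item'."""
    standard, extension = {}, {}
    for line in lines:
        if not line.startswith("#") or ":" not in line:
            continue
        name, value = line[1:].split(":", 1)
        name = name.strip()
        if "." not in name:
            continue  # not a Family.item header
        family = name.split(".", 1)[0]
        target = standard if family in STANDARD_FAMILIES else extension
        target[name] = value.strip()
    return standard, extension


standard, extension = split_headers(LINES)
```

A reader that does not understand the MX namespace can simply carry the `extension` dictionary along unchanged, which is the point of keeping the two kinds of header in one format.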
User-supplied comments (typically, but not exclusively, entered at the time of data acquisition) are clearly demarcated by a line of slashes and a line of dashes. White space must be preserved; that is, a user comment like this must be preserved faithfully:

    --    --
   /  \  /  \
  /    \/    \
  \   Best   /
   \ Sample /
    \ Ever /
     \ !! /
      \  /
       \/
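Extracting the comment block while keeping its white space can be sketched as follows in Python. The delimiters `#///` and `#---` are taken as they appear in the example file, and dropping the `#` plus at most one following space is an assumption of this sketch; the function name `user_comments` is hypothetical.

```python
SAMPLE = """\
# Element.symbol: Fe
# Element.edge: K
#///
# Fe K-edge, Lepidocrocite powder on kapton tape, RT
#    4 layers of tape
#---
# energy mutrans i0
6899.9609 -1.3070486 149013.70
"""


def user_comments(text):
    """Return the comment lines between '#///' and '#---', spacing intact."""
    comments = []
    inside = False
    for line in text.splitlines():
        if line.startswith("#///"):
            inside = True       # comment block starts after the slashes
            continue
        if line.startswith("#---"):
            break               # dashes end the comment block
        if inside and line.startswith("#"):
            body = line[1:]
            if body.startswith(" "):
                body = body[1:]  # drop only the single separator space
            comments.append(body)
    return comments


comments = user_comments(SAMPLE)
```

Only one separator space is removed, so the extra indentation in front of "4 layers of tape" survives, which is what ASCII art like the comment above depends on.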
1. A specification
2. A dictionary of metadata families and items with guidelines for appropriate values
3. An API written in C
4. An API written in Fortran
5. Bindings to the C API in Python and Perl
6. A test suite of valid and invalid XDI data
7. Dynamic analysis with Valgrind – no memory leaks in the C library!

XDI is ready for use
We hope it will be picked up by authors of data acquisition and data analysis software.
1. Dictionary definitions for metadata related to non-monochromatic sources, such as dispersive optics or plasma sources.
2. Dictionary items related to grating monochromators could be stronger.
3. Bindings for many popular languages: Matlab, R, IDL, LabView, Ruby, Lua, Mathematica, ... and whatever language you like best.

Contribute new bindings!
Fork the repository, add more language bindings, make a pull request. We’ll take all comers!