Difference between revisions of "ToF-AMS Analysis Software"

From Jimenez Group Wiki

Revision as of 19:42, 27 October 2009

ToF-AMS Analysis Software Wiki

All the text from BEGIN to END will be removed by the 2009 User's meeting.

BEGIN

A copy of the pdf sent to Pika 1.06J beta testers can be downloaded here

Pika 1.06J beta testers:

  • Upgrade to Igor 6.11

To use a new Pika template:

  • Download the new .pxt here


To upgrade an existing experiment:

  • Upgrade Squirrel ipfs to 1.47H. Download the zipped file of ipfs here.
  • Upgrade Pika ipfs to 1.06J. Note that there are two new ipfs in 1.06J: PK_frag_xx and Pk_elemAnal_xx. There is now a total of 7 PK_yy_xx_ipfs. Download the zipped file of ipfs here.
  • Create a data folder called HR_frag and load all waves from the file HR_fragWaves_1_06J.itx into it. Download the zipped itx file here.
  • Kill and recreate the main Squirrel panel and the main Pika panel.

(After this step it would be a good idea to save your experiment as version 1_06J or something else).

  • Play!


Initial comments from Doug, Manjula, Jose, roughly transcribed by Donna:

  • It is important to keep the HR species and SQ species synchronized. At the few places where the default SQ frag table accounts for an ion and the HR species do not, or vice versa, we should reconcile the two.
  • The 'converted' HR frag table will likely be too confusing for typical users, as they aren't supposed to edit anything here. So this table should not have a button on the panel.
  • Jose suggested that perhaps there should be an option for adding in the squirrel sticks beyond the max HR fit m/z to the HR organic. Response from Donna: Folks can do that 'by hand' by making a new, duplicate Squirrel frag_organic that is blank up to the max HR m/z. Users can then manually add this extra organic to the HR organic time series, or the HR organic average stick spectra to plots themselves. For now, I think it a good thing that we keep HR organic as what we explicitly define it to be.
  • Add a small APES picture in the HR Results tab, Elemental Analysis group box?


Details:

Users should be aware that a few sulfate entries in the default list of all ions are incorrect. Either the entry for the isotopic abundance is missing, or the parent ion is not the same formula as is given in the all ions table, or the entry refers to a wrong parent ion. I really hope the All HR ion group of Puneet, Delphine, Qi, and Niall will have this all straightened out!

END


Contents

Squirrel (ToF_AMS Unit Resolution Analysis software)

General FAQ

What do I need to run Squirrel?

You need:

  1. Version 5.0.5.7 or 6.11 (or the latest update) of Igor.
  2. The HDF5 xop, placed in your Igor Extensions folder. Igor Extensions are modular bits of code that extend Igor's functionality. The Igor Extensions folder resides in your Igor Pro Folder, and everything in this folder gets loaded automatically every time you open Igor. The Igor extension you need is called HDF5.xop and in fresh installations of Igor resides in the "Igor Pro Folder/More Extensions/File Loaders" folder. Simply move or copy this file, along with its matching help file, HDF5 Help.ihf, into the Igor Extensions folder.
  3. The latest Squirrel software, a packed Igor template, downloadable from this link.

Squirrel is tested mainly on PCs using Windows XP, but should work on Macintosh OS X systems.

Why Squirrel?

The ToF-AMS can generate large data sets (tens of gigabytes or more) very quickly. The need for a somewhat standard user interface for analyses of these often large data sets was identified, and Squirrel was born. Squirrel is a software tool using Igor on HDF files for analyzing ToF-AMS data. Squirrel is an ongoing, collaborative effort between researchers using the ToF-AMS instrument. Its development to date has been led by the University of Manchester; the University of Colorado, Boulder; the Max Planck Institute, Mainz; and Aerodyne Research, Inc. It is free and is covered by the GNU General Public License, which means we want to keep it free and give all users the freedom to improve and redistribute the software. An older software package that has some of the same functionality is called TADA. TADA will eventually not be supported. Squirrel's foundation on HDF files allows for analysis and manipulation of data sets larger than TADA can handle.

What is Squirrel?

SQUIRREL (SeQUential Igor data RetRiEvaL) is a data management utility for Igor designed around the random access of AMS data in HDF (version 5) files and within memory. It is a task-based system that exists as a layer between the function calls and the data processing functions; the idea is that all data processing is done through SQUIRREL. The SQUIRREL name may or may not really come from a White Stripes song.

What are the advantages of SQUIRREL over the existing methods?

First and foremost, it eliminates the limits imposed by having to load all the data into memory. Previously, with the Q-AMS data, we were limited to experiment files of less than a gigabyte in size, which is going to hamstring us even further with the advent of the ToF-AMS instruments. But using this system, the binary data is kept on a hard drive as much as possible. Also, as the data is accessed selectively, you only grab as little or as much as is needed, greatly speeding up the processing times for simple tasks while still being able to perform the big ones. Finally, because it adopts a pseudo-object-oriented approach, it should make the development of new analysis methods that can access the data much easier.

How does SQUIRREL work?

Instead of simply performing a data analysis task on a wave in memory, a call to SQUIRREL is made. The call consists of a ‘to-do’ list, the operation that you wish to perform on the data, some operation-specific parameters and a list of the data types that the functions operate on. SQUIRREL takes a look at the to-do list, retrieves the data and passes it to the function. In the case that there is too much data to analyse in one go (e.g. when analysing PTOF data), the task is broken into chunks (known internally as acorns) which results in multiple function calls. In order to access the data in the HDF files, an index needs to be built, which is handled by a separate function, assumed to have been run before the function call.
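The call pattern described above can be sketched outside Igor. The Python below is only an illustration of the idea, not Squirrel's actual code: the names chunk_todo, squirrel_call, load, FAKE_STORE and sum_signal are all hypothetical, and an in-memory dictionary stands in for the indexed HDF files.

```python
from typing import Callable, Dict, Iterator, List, Sequence

def chunk_todo(todo: Sequence[int], max_runs: int) -> Iterator[List[int]]:
    """Yield successive chunks ("acorns") of at most max_runs run numbers."""
    if max_runs < 1:
        raise ValueError("max_runs must be at least 1")
    for start in range(0, len(todo), max_runs):
        yield list(todo[start:start + max_runs])

def squirrel_call(todo: Sequence[int], operation: Callable, data_types: Sequence[str],
                  load: Callable, max_runs: int, **params) -> list:
    """Retrieve the requested data chunk by chunk and pass each chunk to operation."""
    results = []
    for acorn in chunk_todo(todo, max_runs):
        # Fetch only the data types and runs this chunk needs.
        data: Dict[str, list] = {dt: [load(run, dt) for run in acorn] for dt in data_types}
        results.append(operation(acorn, data, **params))
    return results

# Hypothetical stand-in for the indexed HDF files: data type -> run number -> spectrum.
FAKE_STORE = {"MSOpen": {102: [1.0, 2.0], 103: [3.0, 4.0], 104: [5.0, 6.0],
                         105: [7.0, 8.0], 106: [9.0, 10.0]}}

def load(run: int, data_type: str) -> List[float]:
    return FAKE_STORE[data_type][run]

def sum_signal(runs, data, scale=1.0):
    """Example operation: total signal in the chunk, with one operation-specific parameter."""
    return scale * sum(sum(spectrum) for spectrum in data["MSOpen"])

# Five runs with at most two runs per chunk give three function calls.
chunk_totals = squirrel_call([102, 103, 104, 105, 106], sum_signal, ["MSOpen"], load, max_runs=2)
```

The operation never sees the whole data set at once, which is the point of breaking a large task into chunks.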

What are the very basic steps in analyzing a data set with Squirrel?

Analysis steps are generally placed top to bottom, left to right in the ams panel. Briefly:

  1. Gather all the hdf files you wish to analyze in one folder.
  2. Press the Get Index button. A prompt asks you to identify the folder of your hdf files. This function identifies runs and gathers basic information about the data set. The program will search the folder you chose and all subfolders (and subfolders...).
  3. Press the Pre-Process button. This may take some time to complete. This function generates more handy, organized versions of the data, called intermediate hdf files. A prompt will ask you for a location to put these intermediate hdf files and will save the experiment (will prompt you for a name and location of this experiment).
  4. Go to the MS and PToF tabs and generate graphs of your choosing, such as time series and size distributions.

What does Squirrel have to do with HieDI?

HieDI is a software tool for converting ToF-AMS .itx and .bin files into .hdf files. The .itx and .bin files are generated using older acquisition software (mostly pre-2007). Analysis of the .itx and .bin files can be done through an Igor software package called TADA. Eventually TADA will not be supported and everything will be done through Squirrel and .hdf files.

What is an HDF file?

HDF5 is a general purpose library and file format for storing scientific data. In Igor, one can browse the contents of an hdf file via the Data - Load Waves - New HDF Browser feature. Non-Igor tools for browsing hdf files can be found at this link.

Are there problems with Squirrel?

There are fewer problems as time goes by. The essential tasks of generating time series and average mass spectra for species and the conversion to ug/m3 units are considered to be robust.

Technical FAQ

Todo Waves

What is a to-do wave?

The to-do list is simply an integer wave containing a list of all the runs that are to be operated on. For example, if you wanted to average all the runs between 102 and 105, the todo wave would contain four points, equal to 102, 103, 104 and 105. If you wanted to generate a time series for the entire dataset, using all of the runs, it would be a list of every single run in the experiment. To-do lists can also be generated based on mask waves, so you can selectively process data based on an inlet condition, wire position, meteorological conditions or anything else. Several to-do waves are automatically generated and updated, 'all' and 'new' for example. In the code and notes these waves are spelled as 'ToDo' waves.
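As a rough illustration of the two ways of building a to-do list mentioned above, here is a small Python sketch. It is not Squirrel code; the function names and the inlet_is_ambient mask are hypothetical, and plain lists stand in for Igor waves.

```python
from typing import List, Sequence

def todo_from_range(first_run: int, last_run: int) -> List[int]:
    """To-do list holding every run number from first_run to last_run, inclusive."""
    return list(range(first_run, last_run + 1))

def todo_from_mask(runs: Sequence[int], mask: Sequence[bool]) -> List[int]:
    """To-do list holding only the runs whose mask point is true."""
    if len(runs) != len(mask):
        raise ValueError("runs and mask must have the same number of points")
    return [run for run, keep in zip(runs, mask) if keep]

all_runs = todo_from_range(100, 107)   # every run in this toy experiment
between = todo_from_range(102, 105)    # the four-point example from the text

# Hypothetical mask: True where the inlet was sampling ambient air.
inlet_is_ambient = [True, True, False, False, True, True, False, True]
ambient = todo_from_mask(all_runs, inlet_is_ambient)
```

Here between holds 102, 103, 104 and 105, and ambient holds only the runs where the mask is true.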

What's some good advice regarding ToDo waves?

Don't use Igor 'liberal' names (don't use spaces, begin with an alphabetic character, etc.). Don't use the reserved ToDo wave names all, new, and blacklist. Several graphs use default colors which depend on the wave name, such as green for those waves with "Org" in the name. If you make a todo wave using these default names, the colors for graphs will be off. So use "HighNitrate" instead of "HighNO3", or "HydrocarbonPlume" instead of "OrgPlume".

How can I convert a 'regular' wave to a ToDo wave?

Often users have found it handy to create their own wave of run numbers, and they want to know how to make this wave appear in the todo wave drop-down menu in the panel. In general, I recommend that a user duplicate an existing todo wave, such as the all todo wave. One can then delete all points in this duplicate wave, and then fill it in with run numbers of their choosing. Then simply select the 'Get List' option and the new todo wave should appear. The other way is straightforward but technical. A todo wave needs to be a 32-bit unsigned integer type; the wave type can be changed in the redimension window. Also, the todo wave must have this text in its wave note: "TYPE:todo" (without the quotes). One can add a wave note via the info area in the Data Browser window or by using the Igor Note command (Note mywave,"TYPE:todo").

Can I un-blacklist a run?

Kind of. First, save your experiment before you try this. Then make a table of the blacklist wave. Delete rows containing the run numbers you want to un-blacklist. Do not attempt to insert run numbers here, just remove them. When you are finished, press Get-Index again. This will go through some todo wave and indexing functions; the 'all' todo wave will now have the un-blacklisted runs. Unfortunately user-defined todo waves will NOT have the newly-unblacklisted runs inserted. Depending on where you are in your analysis, you may have to re-preprocess for downstream values to appear.

How can I remove a ToDo wave that's no longer needed from the ToDo wave list?

If you don't need the wave again, you can simply kill the wave and select 'Get List' from the todo wave drop-down menu. If you want to prevent it from appearing in the todo drop down menu, you can change the wave type to something other than a 32-bit integer or you can remove the todo wave note (See 'How can I convert a 'regular' wave to a ToDo wave'? above).

How can I make todo waves based on the DAQ menu numbers?

The menu numbers themselves are not saved in any parVal or infoVal setting, as the menu numbers themselves are meaningless. A user could switch menus 1 and 3, for example, and the menu numbers would not help in determining what kind of data a particular run has. Instead, there are 3 waves that should be able to sort out the needed settings for any particular run: tofType (c = 0, v = 1, w = 2), ionizationType (EI, sEI, etc., given by numbers), and DAQSamplingType (parVal #162). DAQSamplingType is a new wave created and used in version 1.45. The idea is that any unique combination of these 3 waves should uniquely identify each original menu.
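One way to turn these three waves into per-menu to-do lists is to group runs by their combination of settings. The Python below is a minimal sketch of that grouping, not part of Squirrel: the function name is hypothetical, lists stand in for waves, and the ionizationType and DAQSamplingType values are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

def group_runs_by_settings(runs: Sequence[int], tof_type: Sequence[int],
                           ionization_type: Sequence[int],
                           daq_sampling_type: Sequence[int]) -> Dict[Tuple[int, int, int], List[int]]:
    """Map each unique (tofType, ionizationType, DAQSamplingType) combination to its runs."""
    groups: Dict[Tuple[int, int, int], List[int]] = defaultdict(list)
    for run, key in zip(runs, zip(tof_type, ionization_type, daq_sampling_type)):
        groups[key].append(run)
    return dict(groups)

runs = [1, 2, 3, 4, 5, 6]
tof_type = [1, 2, 1, 2, 1, 2]           # 1 = v, 2 = w
ionization_type = [0, 0, 0, 0, 1, 1]    # illustrative numbers only
daq_sampling_type = [0, 0, 0, 0, 0, 0]  # illustrative numbers only

menu_todos = group_runs_by_settings(runs, tof_type, ionization_type, daq_sampling_type)
```

Each value in menu_todos is a list of run numbers that could then be copied into its own todo wave.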

File and Experiment Organization

What are intermediate files?

The intermediate files are essential components of your Igor experiment. If you were to move the intermediate files to another location and reopen the igor experiment, you will get a prompt asking for their location. Intermediate files can grow large. One good strategy is to create a separate, dedicated folder to house them, and locate this intermediate data folder in the same folder as your experiment. If you are in doubt as to what intermediate files are attached to an experiment, display, in a table, the wave root:index:file_index. But don't try to edit or monkey with this wave, or any other waves in the index folder. That would be a big no-no.

On a Mac OSX system how do I change the font size so that the text fits the buttons in the panel?

In the command line enter DefaultGuiFont button={"Geneva",9,0}. Thanks to Pete DeCarlo for the tip.

What can I do when a pxp file goes missing/bad?

From a user: An HR-Squirrel pxp of mine recently disappeared. I think my computer crashed while it was unsaved, which is very sad. But the intermediate files are still there - is there any useful information that can be easily mined from them, or is it best if I just start over? The response: Without the root:index:squirrel_index wave, which 'lives' in the pxp file, the intermediate files are pretty much worthless.

Default Settings

Why is the default RIE value for the nitrate species 1.1 instead of 1?

The RIE is defined relative to the sum of the nitrate ions at m/z 30 and 46:

$$\mathrm{RIE}_x = \frac{\mathrm{Ions}_x / \mathrm{Molecule}_x}{\mathrm{Ions}_{30+46} / \mathrm{Molecule}_{30+46}} \cdot \frac{MW_{\mathrm{NO_3}}}{MW_x}$$

Thus, if one were to sum ALL nitrate ions (including m/z 63 for nitric acid, 14 for nitrogen, isotope ions due to 15N and 18O, etc.), one gets ~1.1 * (sum of 30+46). Another way of saying this is that (14+30+46+63+...)/(30+46) ~ 1.1.

The 'Get Index' Step

What's the difference between *_series and *_index waves?

The waves run_index, rn_series, time_index and t_series will always have the same number of points. The _index waves track each other and indicate a simple listing of when a run was identified as being under consideration, being in an hdf file. The _series waves also track each other, and they have the same information in them as the _index waves, but the series waves are in chronological and run-number increasing order. Data is processed in squirrel in increasing order, that is, using the _series waves. You should always use the _series waves for plotting and such. I can't think of a reason why a user would need to look at the run_index wave. It may be confusing because often the rn_series and the run_index waves are identical, and one may get used to looking at the *_index wave. But you should always use the *_series waves.
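The relationship between the two sets of waves can be pictured with a short Python sketch. This is an assumption-laden illustration rather than Squirrel's own routine: build_series is a hypothetical name, lists stand in for waves, and the sort key is the run number on the assumption that run numbers increase with time.

```python
from typing import List, Sequence, Tuple

def build_series(run_index: Sequence[int],
                 time_index: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Return (rn_series, t_series): the index waves reordered by increasing run number."""
    if len(run_index) != len(time_index):
        raise ValueError("index waves must have the same number of points")
    # Sort run/time pairs together so the two series keep tracking each other.
    pairs = sorted(zip(run_index, time_index), key=lambda pair: pair[0])
    rn_series = [run for run, _ in pairs]
    t_series = [t for _, t in pairs]
    return rn_series, t_series

# Runs listed in the order they were found in the hdf files (made-up values).
run_index = [105, 102, 104, 103]
time_index = [350.0, 50.0, 250.0, 150.0]

rn_series, t_series = build_series(run_index, time_index)
```

When the files happen to be found in run order, the _index and _series versions come out identical, which matches the potential confusion described above.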

How does squirrel handle 'fast mode' data?

Here are some things to consider when dealing with fast mode data: In the Get Index step, squirrel should automatically create allFastOpen and allFastClosed todo waves. For fast mode runs, it is good to blacklist the first and/or last closed spectra for each fast mode cycle. Squirrel finds these edge runs. You can then blacklist the closed Edge runs. You then want to recalculate your sticks to get Squirrel to get newly interpolated closed sticks across your fast open runs. The first and/or last open runs in your fast mode cycle may also need to be trimmed. You won't want to report these smeared runs as ambient data. The default m/z calibration settings use MSClosed, not MSOpen (and not MSOpen_p and MSClosed_p), so for fast mode open runs, you will need to be prepared for this and handle as you wish. Lastly, often aircraft measurements require different m/z calibration results for open and closed. I am not clear on why this trend seems to be true, but anecdotally this is the case. This has nothing to do with Fast mode data, but it is something to be aware of.

Working with MS data

When generating an average mass spectrum what does the 'Truncate sticks to 0' checkbox do?

Traditionally one plots average mass spectra of different species on the same graph as stacked sticks. That is, the value of one species is shown above the value of the previously plotted species. But this type of graph only works when all entries are greater than zero. When the 'Truncate sticks to 0' checkbox is checked, the negative values in individual species waves (such as mssd_todowave_org) are set to 0, immediately before this wave is plotted but after the sum is calculated. If you sum this wave yourself, you do not get the same value as indicated in the legend, because you are adding only the values > 0; all the values < 0 have been replaced with zeros.
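The order of operations matters here, and a few lines of Python make the discrepancy concrete. This is only a sketch with made-up stick values; the function name is hypothetical and a list stands in for the species wave.

```python
from typing import List, Sequence, Tuple

def sum_then_truncate(sticks: Sequence[float]) -> Tuple[float, List[float]]:
    """Compute the legend sum first, then clip negative sticks to 0 for plotting."""
    legend_sum = sum(sticks)                        # sum taken before truncation
    plotted = [max(value, 0.0) for value in sticks]  # negatives replaced with zeros
    return legend_sum, plotted

org_sticks = [0.5, -0.25, 1.0, -0.125]  # made-up values with two negative sticks
legend_sum, plotted = sum_then_truncate(org_sticks)
sum_of_plotted = sum(plotted)
```

With these values the legend sum is 1.125 while the truncated wave sums to 1.5, so the two numbers differ whenever any stick is negative.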

How does the airbeam correction work if I run the AMS with v/w mode switching?

You need to choose only one airbeam region, that is, one set of airbeam reference runs. If the AMS was operating in both modes during this region, Squirrel will automatically calculate the airbeam average for the v mode only and the w mode only for these reference runs and then combine these to generate the airbeam correction factor. If you select a value from the ToF Type drop-down box (in the corrections - airbeam tab) you can view the airbeam average for each of the modes.

The Frag and Batch Table

How do I make a time-dependent entry in a frag wave?

All you need to do is enter the name of the wave in the frag wave. Be sure that the wave is in the root folder and that any mapping to get the data onto the ams time wave has already been done. If there is a nan in the wave, the resulting frag value will be nan. It is always a good idea to first test this feature and syntax using a dummy wave that has a constant value in it. For example, at m/z 44 in the frag_CO2 wave the default entry is

  0.00037*1.36*1.28*1.14*frag_air[28]

If you have a wave called myCO2 with the gas phase CO2 amounts in it (being careful of units), you could change the entry to a time-dependent frag entry that looks like

  myCO2*1.36*1.28*1.14*frag_air[28]
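The arithmetic of the two entries can be mimicked in Python to show what changes when a constant is replaced by a wave. This sketch only mirrors the multiplication in the entry; it does not reproduce how Squirrel parses frag waves. The function name and the values of frag_air_28 and my_co2 are invented for the example, and lists stand in for waves on the ams time base.

```python
import math
from typing import List, Sequence

FACTORS = 1.36 * 1.28 * 1.14  # the constant multipliers from the default entry

def frag_co2_at_44(co2: Sequence[float], frag_air_28: Sequence[float]) -> List[float]:
    """Evaluate co2 * 1.36 * 1.28 * 1.14 * frag_air[28] for every run."""
    if len(co2) != len(frag_air_28):
        raise ValueError("co2 and frag_air_28 must have the same number of points")
    return [c * FACTORS * air for c, air in zip(co2, frag_air_28)]

frag_air_28 = [100.0, 100.0, 100.0]       # made-up frag_air[28] values for three runs

constant_co2 = [0.00037] * 3              # the default entry: same value for every run
my_co2 = [0.00037, 0.00040, math.nan]     # hypothetical myCO2 wave with one missing point

default_result = frag_co2_at_44(constant_co2, frag_air_28)
time_dependent = frag_co2_at_44(my_co2, frag_air_28)
```

The first run agrees with the default entry, the second is larger because myCO2 is larger, and the third is nan because the nan in myCO2 propagates through the product.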

Help

Are tutorials available?

Before using Squirrel, one should be familiar with the basic concepts of Igor. Several good Igor tutorials are available in the Wavemetrics/Igor Pro Folder/Learning Aids folder. You should know how to create, edit and find waves, and create, modify and find graphs.

Currently, there are 3 Squirrel presentations which can be downloaded:

  1. Squirrel overview PowerPoint presentation. The original form of this ppt was presented at the 2006 User's Meeting, and has been updated a few times since then.
  2. Squirrel pre-preprocessing PowerPoint presentation. This covers the m/z calibration and baseline fitting options. It is available as a ppt (4.4 MB).
  3. Squirrel post-preprocessing PowerPoint presentation. This covers the airbeam correction factor, DVa and DC marker calculations. It is available as a ppt (2.2 MB).

How do I report a problem with the ToF-AMS Analysis software? (Squirrel, Pika, Apes, etc)

There are a few places you should look before reporting any problems.

  • The Technical FAQ
  • The release notes of the version of Squirrel you are using and previous version release notes, if appropriate.
  • The coding to do list, because the feature you are dealing with may not be implemented yet or will be updated.

There are a few things you should try before reporting any problems:

  • Make sure the problem is replicable.
  • Isolate the problem to the simplest, smallest case where the problem still occurs.
  • Gather the following (as appropriate):
    • the operating system of the computer, version of Igor, error message and symptoms
    • a screen shot of the problem or "Save Graph" if appropriate (e.g. if a graph looks odd)
    • the Igor experiment if it is small, or an example DAQ hdf if the problem is in regard to a specific file
  • Then put the files into a folder you create on the Aerodyne FTP site. This ftp site is ftp.aerodyne.com/ToF-AMS/SquirrelProblems/ and uses the same username and password as for downloading the ToF AMS software. Then fire off an email to Donna Sueper (sueper 'at' colorado.edu).

Please be aware that the latest versions of Squirrel will often have bugs fixed. Users should try upgrading to the latest Igor and ToF-AMS analysis software version to see if the problem persists.

How do I analyze AMS data with Squirrel?

The best resource is the wiki at http://cires.colorado.edu/jimenez-group/wiki/index.php/Field_Data_Analysis_Guide

How do I analyze AMS data with Pika?

The best resource is the wiki at http://cires.colorado.edu/jimenez-group/wiki/index.php/High_Resolution_ToF-AMS_Analysis_Guide