Sq pk programming

From Jimenez Group Wiki
Revision as of 11:38, 13 February 2009 by DonnaS (talk | contribs) (Done in latest Beta release)

APES

To Do

  • Delphine wanted to spit out the elemental mass spectra without normalizing them all to average total mass.
  • Get Jose's readme and incorporate it into the wiki
  • Allow users to select HR spectra from non-root data folders

Done

  • Ken had an issue when he chose the tag wave instead of the text wave as the chemical mass formula wave. Perhaps there should be a warning that appears when x number of mass fragments are flagged as having issues instead of simply printing them to history. (done in 1.02)
  • Add a new table on the web page for downloading ipf (done in 1.01)
  • Add ability to not require the fit of CO and/or CO2 (done in 1.01)
  • Add the ability to do the analysis on user selected HR spectra, such as PMF results (done in 1.01)


Pika

Extreme Priority

Final Peak Shape

  • In final peak shape graph add ability to:
    • truncate at 3.5 sigma
    • trim wings to ensure monotonicity
  • In final peak shape algorithm remove/modify final peak shape smoothing and/or make it user editable
  • Allow peak width to change as a function of time (smooth).
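
The truncation and wing-trimming items above could be sketched as follows. This is a minimal Python illustration, not the Pika implementation (Pika itself is Igor code): the function name, the array layout (abscissa in units of sigma), and the use of a running minimum to enforce monotonic wings are all assumptions made for the example.

```python
import numpy as np

def trim_peak_shape(x_sigma, y, n_sigma=3.5):
    """Truncate a peak shape at +/- n_sigma and make each wing monotonic.

    x_sigma: abscissa in units of sigma, ascending (hypothetical layout)
    y:       peak-shape intensity at each x_sigma point
    """
    x = np.asarray(x_sigma, dtype=float)
    out = np.asarray(y, dtype=float).copy()

    # Truncate: zero everything outside the +/- n_sigma window.
    out[np.abs(x) > n_sigma] = 0.0

    apex = int(np.argmax(out))
    # Right wing: running minimum walking away from the apex.
    out[apex:] = np.minimum.accumulate(out[apex:])
    # Left wing: same idea, walking from the apex towards lower x.
    out[:apex + 1] = np.minimum.accumulate(out[:apex + 1][::-1])[::-1]
    return out

x = np.arange(-4, 5)
y = [0.1, 0.05, 0.2, 0.5, 1.0, 0.5, 0.6, 0.1, 0.2]
trimmed = trim_peak_shape(x, y)
```

A running minimum only clamps bumps in the wings down to the last lower value; a smoother correction (for example a constrained fit) may be preferable in practice.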

HR One Spectra plot

  • When different open/closed m/z fitting parameters are used, use these parameters in displaying raw data in HR_peakHeights graph
  • When different open/closed m/z fitting parameters are used, calculate and display sensible 'blue lines' (fits) for the difference spectra in HR_PeakHeights graph

HR Fragment choices

  • Do away with old 'family' derivations and require users to choose which HR fragment belongs to which family.
  • Begin to implement the ability for certain HR fragments to belong to more than one family with the introduction of an "HR frag table". For example CO2 would belong partially to the CHO2 family and partially to the air family.
  • Begin to implement a batch table, that is, have users create "species" with their own RIE, CE. Note that this is a slightly different concept than the family groupings.

Misc

  • Change the default behavior from fitting difference spectra to only fitting open and closed. James' point at the users' meeting is that we should always be looking at the HR_OpenSticks - HR_ClosedSticks.
  • m/z calibration - We need to subtract the mass of an electron from all our masses.
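
As a rough illustration of the electron-mass item above: a positive ion is lighter than its neutral by the mass of each electron removed. The sketch below is Python rather than Igor code, and the function name is hypothetical; the electron mass is the CODATA value in unified atomic mass units.

```python
ELECTRON_MASS_U = 0.000548579909  # electron rest mass in unified atomic mass units

def cation_mz(neutral_mass_u, charge=1):
    """m/z of a positive ion: remove `charge` electrons, then divide by the charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_u - charge * ELECTRON_MASS_U) / charge

# Neutral CO2 (12C 16O2) has a monoisotopic mass of about 43.98983 u.
co2_plus = cation_mz(43.98983)
```

At m/z 44 the correction is roughly 12 ppm, which is why it matters for HR peak identification.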

To Do Priority

  • In Samara test case the redo fit button on the peak heights graph wasn't working. (it went to get the raw spectra)
  • In peak shape panel, make sure when we update graphs that the window within the panel is called. (If you close the panel, other graph axes get rescaled.)
  • Allow for the selection of different todo waves from the one used in peak width functionality
  • Begin to automatically create uncertainty values generated from peak shape and/or m/z calibration knowledge.
  • Generate 'Neil's graph', in which we plot, as sticks from zero, the amount of C and H mass as a function of the number of oxygens. Do a similar thing for the number of C's (the amount of H and O as a function of the number of C's).
  • Deal with CO tag (Cobalt in usage in tags) and (C2HN?)
  • The legend for the peak width graph doesn't update when defaults are changed (is this true?).
  • Checkbox for truncating sticks to zero in stacked graphs.
  • Many parameters panel
    • add small raw spectra insets to left and right
    • add graph showing spectra-like data with window highlighting where you are
    • have run numbers update with cursor move
  • In Peak shape panel the first run number of index should be put into “run number” field (it defaults to 0)
  • Add ability to duplicate PeakHeights graph and functionality
  • In the m/z calibration panel, currently the width and accuracy vs m/z graphs don't show a summary of all the runs. E.g. the summary of all the peak widths for all the runs with error bars in the peak shape panel very nicely summarizes this information.
    • (Jose) This could be done with a radio button on top of the graph that selects SR/AR (meaning single run / all runs)
  • We need to add camels, dromedaries, and vw "bug" ions to the lists. We need some notation for these ions, let me suggest vw28, vw23, d28, d32, and c28, c32, with the comment column describing what they are. It is important that those are lowercase letters, so people don't confuse these ions with vanadium (V), tungsten (W), and deuterium (D).


To Do (Less Urgent)

  • Create other methods for the user to select masses to fit beyond the simple default option.
  • Consolidate the three sets of m/z lists that exist into a single one. We currently have the list of m/z possibly used in fits, plus (1) the ones actually selected for HR sticks, (2) the list of m/z used for m/z calibration, and (3) the list of isolated ions. Each list is a subset of the previous one, so we can work with a single big list and a series of masks to select or deselect the various m/z for the different purposes.
  • If user only does the m/z calibration and does not play with the HR panel, the code calculating the HR sticks may go awry (maybe - I need to check).
  • Incorporate some of Mike Q's functions that deal with sq frag vs pika sticks (Donna: Mike Q's July 2008 email) Also think about NH4 measure vs predicted and RIE for NH4 (Donna: Jose's July 2008 email).
  • Make clear what limitations there are about the total number of masses that can be fit at any m/z value.
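
The "single big list plus masks" idea from the m/z list consolidation item above might look like the following. This is a Python sketch only: the ion names, the mask names, and which ions are flagged are placeholder choices, not the actual Pika lists.

```python
import numpy as np

# Hypothetical master list; the real one would come from the ion tables.
ion_names = np.array(["C2H3", "N2", "C3H7", "CO2"])

# One boolean mask per purpose, all the same length as the master list.
# The True/False flags below are arbitrary placeholders.
masks = {
    "hr_sticks": np.array([True, True, True, True]),
    "mz_cal":    np.array([False, True, False, True]),
    "isolated":  np.array([False, True, False, False]),
}

def select(purpose):
    """Return the ions selected for the given purpose."""
    return ion_names[masks[purpose]]

def is_nested(outer, inner):
    """True if every ion selected in `inner` is also selected in `outer`."""
    return bool(np.all(masks[outer] | ~masks[inner]))

cal_ions = list(select("mz_cal"))
```

A check like `is_nested` would let the code verify the "each list is a subset of the previous one" property whenever a user edits a mask.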


To Discuss

  • Change saving of the HR sticks matrix from masses that were selected to fit to all known masses.


Tasks for next Beta release

  • In the peak shape panel, the code does not like it if you don't first examine one species before doing the todo wave.
  • In the peak shape panel, include a string that indicates what species are used but not displayed.
  • Make the pop button work for examining one spectra.


Done in latest Beta release

  • Change the error that says "no can do!" to something more specific (done in 1.04H at least)
  • In the tables that indicate pthalate279 use the formula C16H23O4 (done in 1.04H at least)
  • In Peak shape panel, add a "live update" button similar to the one in the m/z calibration panel. Similarly, don't clear all previous fits when one presses the 'begin peak width' button. (done in 1.04H at least)
  • In Peak Heights graph, the x axis should be centered on the gray area. (done in 1.04H at least)
  • In peak shape panel, add the ability to plot real data onto final peak shape graph (show scatter of real points on 'wings') (done in 1.04H at least)
  • In peak width/shape panel, if todo wave is v mode and user selects a w mode peak shape, an error message should pop up right away. (done in 1.04H at least)
  • Add the ability to plot residuals on the raw spectra axes in the HR_peakHeights graph (not only on the top axis). (done in 1.04H at least)
  • In final peak width/shape add ability to subtract sq baseline from peak shape algorithm (instead of baselines being constant) (done in 1.04H at least)


Done

  • Step 4 of pika: maybe rather than two options (one run or Todo wave) and one button, we could just have two buttons? Something like “Calculate HR Sticks for averaged Todo wave” and “Calculate HR Sticks for one run”? (Done in 1.04)
  • In step 4 options, change the peak height test from "less than 0" to "less than a user-defined value" (done in 1.03D)
  • Warn clearly that changing selected masses is done globally (done in 1.03D)
  • Add a button in PeakHeights graph to redo pika fits (Keep all user settings from step 4) (done in 1.03D)
  • Partition Avg ability for todo waves (especially v only, for Qi Chen - done in 1.03A)
  • In Param graph add radio buttons for raw and percent (done in 1.03D)
  • In Param graph add print all cursors (done in 1.03D)
  • Add button for web link in credits panel. (done in 1.03E)
  • In the graph for one spectra HR fit, move the redo fits button so it is farther away than the arrows. (done in 1.03E)
  • Double check m/z parameters (so they are not nan) before trying to fit (done in 1.03E)
  • Alex had an issue with the show inset button being clicked on, but the insets weren't showing up. Make sure that this toggle value keeps up with the display. (done in 1.03E)

Squirrel

Extreme Priority

  • Rework the entire m/z calibration fitting routine so that we fit all plausible peaks (getting peak centers and widths) at the beginning. Then we only need to select which masses to use for the m/z calibration equation.
  • Add Doug's thresholding diagnostics and correction factor.
  • Incorporate errors into time series and average mass spectra plots!


To Do Priority

  • Incorporate the CE algorithms as identified by Tim O, Roya, etc
  • Make the error units clear.
  • Manjula thought that when the user puts more hdf files in the data folder, and presses the Get Index button, that the code does not handle new v/w switching and mode changes well. (For example, at first the DAQ is set up to do all v then does v/w switching.) Donna's comment: Right now I cannot reproduce this problem but there was a problem with the diagnostic plot not updating correctly, and this was fixed in 1.43F.
  • Mike's pcurser in todo wave names
  • Mike Q's issue with m/z calibration with fast mode data
  • In some data sets, many small intermediate files get written instead of a few big ones. (Some of Mike C's Arctas)
  • In the preprocess step, make clear the distinction between using DAQ sticks and recalcing sticks
  • In average mass spec plots, review how negative values can be 'hidden' by other traces.
  • In average mass spec plots, a better system for expanding/shrinking is needed.
  • We have problems with averaging PToF runs when the PToF settings change between runs. Have squirrel automatically create different todo waves for each PToF setting.
  • Review existing code that deals with fast mode data.
  • Change the field analysis to warn users to check for large CO influence in AB correction; provide a sample pre and post AB correction from Tim to demonstrate
  • In average mass spec graph, check the display of negative values. (Doug thought that it wasn't working right.)
  • In frag check tab we need a button to update plots instead of replotting every time.
  • In frag diagnostic plots we need to change the 'display gaps' setting.
  • Doug really likes to have time series graphs be such that grids are on midnight of every day, and labels have day of week inserted.
  • We need to be able to select some ions to be used in m/z cal and peak shape for V or W only.
  • In the m/z calibration panel, think about the ability to tweak settings in conditions of high loadings (Mike C).
  • Add somewhere in the help/faq something about the averaging of raw spectra and the units conversion (Donna - see James email of 11/2007).
  • Doug wants the diagnostic of Closed/Diff sticks. Anything that sticks out as being very high has a different source (CO2, O, K, C).
  • In Tim's Ron Brown data set (91,000 runs) the m/z calibration panel updates very slowly because the right side plots also update.
  • In Tim's Ron Brown data it might be good to write MSSDiff_p matrix to memory to free up memory - make this an option?
  • TimO made the suggestion that there be a little button for saving the m/z calibration table with a todo suffix.


To Do (Less Urgent)

  • Deal with the popup menus that deal with long strings by using 'reflection'.
  • Doug wants to rename 'AB reference runs' to something else.
  • Heikki had out-of-memory issues with the organic matrices.
  • Make more consistent the todo wave creation scheme whereby users can input "todowave and not 1xxx and not 2yyy" for the todo wave formula
  • Change PToF code so that users have an option to plot the legend (for size distribution graphs).
  • For PToF size distributions, often the size range extends to smaller values than are useful. Add controls so that user can quickly zoom in and out of interested size ranges.
  • Should we allow users to enter a negative value for the PToF vl parameter? (Liz, John J)
  • Add the ability to calculate diameter mean and median in PToF data (Manjula).
  • Occasionally after preprocessing, the status bar ends saying “Estimating space requirements” rather than “Done”
  • For DC marker corrections, Manjula wants to be able to enter nitrate, and then have squirrel figure out what m/zs this corresponds to.
  • In average mass spec graph, when checking linear scale have the default go to 0.
  • In average mass spec graph, perhaps add a 'magnification' drop down menu - x 2, x 10, x 25, etc.
  • Doug really likes for tables to have columns be the minimum width that is sensible.
  • Re-look at IE/AB calculations (Donna - see Tim O's email of Jan 2008).
  • In the diagnostic plots, perhaps the PToF airbeam wave should be nanned whenever the run doesn't have PToF data.
  • In m/z calibration panel, check the possible bug when interpolating across a todo wave.
  • In Alice's diagnostic plots, change legends that give integer m/z values to molecular species.
  • Review todo wave name length limitations.
  • In the m/z calibration panel, make more clear what the nan param button does.
  • From Tim O: During manual (F3) saves, the mass spectral data (e.g. MSOpen_V, MSSOpen_V, etc.) are saved as simple data arrays with the m/z in the rows direction. During autosaving, even if only for one run, these waves are always saved with the m/z in the column (run number in the row dimension). It turns out that Squirrel can read and process either, just not at the same time. Thus, while he can process the data separately, he cannot ask Squirrel to load and process autosaved data with nonautosaved data interspersed.
  • Incorporate some peak finding code as per Jesse's request.
  • Deal with Fast runs not finishing a fast mode cycle.
  • Update squirrel web site.
  • From Ken: In the baseline fitting panel, when scrolling through the baseline fits for all runs in a todo list, add the ability to pause on a chosen run. It would also be great to select the direction in which the run baselines are scrolled - forward or backward.
  • Allow users to go > 500m/z in baseline panel.
  • Create Frag table default fragment text waves (Donna would be good for the manual)
  • Tim O's issue with somehow getting the squirrel_index matrix messed up and the stick matrix data set ending up as wide (columns) as the raw spectra matrices. (Donna - is this replicable?)
  • Review the use of t_series_old
  • Review the use of custom time series - are sanity checks in place?
  • Review the use of the todo graph.... add checkboxes to remove todo waves?
  • Add code that calculates the mean/median for size distributions (Manjula)
  • Add an upper limit to when we generate normalization factors (current usage is to have users create custom size interval).
  • When a user has applied an AB correction and a new ion_eff has been calculated, then adds new hdf files (and indexes, etc) the ion_eff wave may not grow to the needed size. This adding more hdf files scenario could also cause problems when new ionization types (sEI) may appear. (Donna: email from Alberto Presto July 2008 - he used sEI 'mode' when it wasn't soft EI, he just wanted to clearly delineate modes, and we now have sample types to address this).
  • Amewu gets (1) strange wave stats error in baseline panel (2) strange interpolate2 after pre-processing and noticed (3) Frag checks color-by-f(z) problems.
  • Achim wants the ability to calculate sticks beyond 1000m/z for SP2. (It is a simple code change; Achim knows how to do it. It is not clear how this could be incorporated into squirrel for other users.)
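
For the two PToF items above that ask for the mean and median diameter of a size distribution, one possible calculation is sketched below in Python. It assumes bin-centre diameters and a per-bin signal as inputs, clips negative bins to zero, and estimates the median by interpolating a cumulative distribution evaluated at bin centres; all of these choices and the function name are assumptions for illustration, not Squirrel behaviour.

```python
import numpy as np

def weighted_mean_median(diam_nm, signal):
    """Signal-weighted mean and median diameter of a binned size distribution."""
    d = np.asarray(diam_nm, dtype=float)
    # Assumption: negative (noise) bins are ignored rather than subtracted.
    w = np.clip(np.asarray(signal, dtype=float), 0.0, None)
    total = w.sum()
    if total <= 0:
        return float("nan"), float("nan")

    mean = float((d * w).sum() / total)
    # Cumulative fraction at each bin centre: everything below plus half the bin.
    cdf = (np.cumsum(w) - 0.5 * w) / total
    median = float(np.interp(0.5, cdf, d))
    return mean, median

mean_d, median_d = weighted_mean_median([100, 200, 300], [1, 2, 1])
```

Whether the weighting should be by mass, number, or dM/dlogDp is a separate decision that the to-do items leave open.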

Done in latest Beta release

  • Puneet discovered a small error in the m/z calibration panel. If he adds a new species but doesn't use it to fit, the section of code that deals with the updating of individual graphs throws an error. (fixed in 1.44B)
  • Bug found with Sanna: when using a user defined time base but minutes = 0, we get an out of memory error. (fixed in 1.44B)
  • Write intermediate files as compressed HDF files? Check to see if/how this is best done for updating experiments. (Jose's suggestion only NEW files are compressed) (done in 1.44B - only new files are gzip compressed)
  • Tim O et al made the suggestion for using the daq values as input for the m/z calibration fit (done in 1.44B - this seems to make the m/z calibration routine 10 - 20% faster)
  • Error that Samara found when trying to get error matrices... The dimensions of the stick matrices were different for open and closed (done in 1.44B)
  • In mz_reSizeMzCalWaves() recode it so that it does not nan the Accuracy_XX_yy waves when one fits only one run. (fixed in 1.44B)
  • There are still some instances where if the user presses a button when not in the root folder, squirrel is ungraceful (Niall). (fixed many in 1.44B)
  • When doing an m/z cal fitting, put runs with bad fits into a separate todo wave. (done in 1.44B)
  • Deal with Helsinki/Doug's unique PToF data (grouping last 3 of 4 choppers). Done but only distributed for use in Mikael's data
  • Incorporate speccorr_list (Batch table) functionality to make species specific time series corrections via batch table. (Done in earlier versions. Ken entered this item but was not clear on its usage)
  • The 'resolution' parameters for determining peak integration area need to be printed to the history window whenever we recalc sticks. (Done in 1.44B)
  • Pre-process: when recalculating sticks with new m/z parameters, the error related to some m/z parameters being NaNs often shows up. A few times at the beginning, and then halfway through. Is there any way to deal with this more effectively? (done in 1.44B - only tested for fast mode situations but other situations should equally apply)
  • When overwriting a todo wave the message that gets printed to history can be incorrect - it says that the current todo wave was overwritten. (Done in 1.44B)
  • From James: If you get it to do a batch of diurnal plots, it calculates all of the time series afresh before plotting each graph. It's just a function call on the wrong side of a for statement, so it should be pretty trivial to fix. It also plots a graph of the time series each time around as well, which is really annoying because it ends up creating loads of excess graphs. (Done in 1.44B)
  • In the baseline subpanel, make the mass defect default values more clear. (Done in 1.44B)
  • In the modify SI table, also include ionization and tof type waves so users can more easily see how to change SI values. (Done in 1.44B)
  • Add the LS diagnostics to the set of waves that get loaded. (finished in 1.43I, checked in 1.44B)
  • Load in menu information from DAQ hdfs and make menu info more readily available to user. (kinda finished in 1.44B) - we have loaded in the tofType and ionizationType since ~ 5 versions ago. This is all that most will ever need. I have added DAQsamplingType for users like Doug who may have set up funky menus. We have chosen to avoid having menu numbers as indicators/mask waves because they really don't provide the details we need - the toftype and ionizationtype.
  • Make squirrel handle situations whereby the sEI is not really sEI but a different menu. (done via use of samplingType - see note above, done in 1.44B)
  • Add 2 new ionization types to code (see Joel's email of March 2008). (done in 1.44B)
  • In m/z calibration panel we need a readout that tells ionization type (sEI, EI) as well as ToF type (v/w) (done in 1.44B)
  • In baseline panel we need a readout that tells ionization type (sEI, EI) as well as ToF type (v/w)(done in 1.44B)
  • Add code that somehow simply tells the user whether the sticks were recalced or not, and if recalced, the parameters used. (done in 1.44B)
  • When a user enters runs and the first is greater than the last (e.g. 45-40), a better error message should be generated. (Now it generates a todowave with 0 points, done in 1.44B)
  • From Ken: In the baseline fitting panel make the main plot in the popped window automatically scale the x axis to that in the main window. (done in pre-144B? If the user uses the marquee instead of the panel settings for setting the axis range, the x,y scales won't be the same. But if they use the panel variables for x and y scales, it works.)
  • In Dva popped table we should also see run numbers and whether or not a run has PToF data. (done in 1.44B)
  • Ask user to save experiment before blacklisting (done in 1.44B)
  • Added the possible creation of an allMS todo wave so that users can be clear when there is no MS data (only PToF).
  • James would like the ability to enlarge the stick integration region for one m/z. Currently the code only allows you to make the stick integration region smaller. (done in 1.44B)
  • In the baseline panel allow users to go higher than m/z 500. (done in 1.44B)

To Discuss

  • Normalize w mode to v mode - One diagnostic: the ratio of v and w sticks with the x axis as m/z.
  • Change frag table of sulfate according to suggestion from Ann
    • Current frag_sulphate -> frag_sulphate_old
    • New frag_sulphate doesn't depend on frag_SO3 and frag_H2SO4 (Ann will work up this new table)
    • Frag_SO3 and H2SO4 are kept in Squirrel for reference and if someone wants to put them back to use, but not used by default
  • change frag_organics as per Allison's paper
  • The sq_verifyTodoForPToF function aborts when every run in the todo wave has PToF. Current workaround: comment out the abort call to get SQ to pre-process. MJC 1/14/09

Done

  • Check the labels for graphs generated using x vs tof and dxdlogdp (Done in 1.43I).
  • Possibly rename the checks to be done before preprocessing as 'stick adjustments'. (Done in 1.43I)
  • Change defaults in Frag checks tab to be 'all' and colored by timewave.
  • Fix bug that pops up in Frag checks plots. (Done in 1.43I)
  • Make sure that when doing DC marker corrections squirrel changes todo wave to be todo wave AND allPToF (Donna note: This was done primarily to avoid creation of blanks in the preprocess step but was also added in the corrections tab, done in 1.43I).
  • Add checkbox in Misc - graphs section that sets the default displays to not show gaps. (Done in 1.43I)
  • When preprocessing PToF sticks generate a todo wave that has only PToF runs in it instead of generating dire warning. (Done in 1.43I)
  • Change the preprocess checkbox settings so that the default is to apply dc markers. (Done in 1.43I)
  • An error pops up when trying to create normalization factors for user defined species (As in Pb for Leah). (The normalization routine was overhauled in 1.43G and this error should no longer appear.)
  • A general warning should pop up when users try to create todo wave names with > x characters. (Done in 1.43E)
  • Add button to generate diagnostic plot if user killed it. (Done in 1.43I)
  • Change default for average mass spec so that only sticks (not sticks + raw) are calculated. (Done in 1.43I)
  • Bug found by Mike Q about checking modifications of all frag waves, not just species selected in average mass spec (fixed in 1.43C-ish)
  • In the m/z calibration panel, make the live option checked automatically when viewing one run. (Done in 1.43I)
  • Bug found by Carly whereby the MS time series for the species all can replace the 'all' todo wave (Done in 1.32I)
  • For PToF size distributions, the total loading for each species should be displayed (somewhat similar to the legends that get displayed with average mass spec). (Total MS and normalization factors are now printed to history - Done in 1.43I)
  • When an intermediate file was not accessed successfully change the code from the existing error message that gets printed to history, to an abort command (Manjula). Done in 1.43I

Time Trace

Not Urgent

  • Make the quick view program instructions more clear.


LS

Priority

To Do

Wait for requests and updates from Eben, Tim, Jay, etc.

EC

Priority

  • Wait for request from Delphine - nothing to do yet.