spike.plugins package¶
Submodules¶
spike.plugins.Fitter module¶
Set of functions for the peak fitter.
First functional version - not finished!
Requires the Peaks plugin to be installed.
July 2016 M-A Delsuc
-
class
spike.plugins.Fitter.
FitTests
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
Test for fitter, assumes Peaks plugin is loaded
-
spike.plugins.Fitter.
Lor
(Amp, Pos, Width, x)[source]¶ One Lorentzian line, with amplitude Amp, position Pos and width Width, evaluated on the coordinate axis x
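The exact parameterization used by Lor is not spelled out here; a common Lorentzian form consistent with amplitude/position/width parameters is sketched below (the normalization is an assumption, not taken from the source):

```python
import numpy as np

def lor(amp, pos, width, x):
    """One Lorentzian line evaluated on the coordinate axis x.
    amp is the height at the maximum, pos the position, width the
    half-width parameter (assumed parameterization)."""
    return amp / (1.0 + ((x - pos) / width) ** 2)

x = np.arange(100.0)
line = lor(10.0, 50.0, 3.0, x)   # one line centered at index 50
```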
-
spike.plugins.Fitter.
Spec
(Param, x)[source]¶ x is the spectral coordinate axis; Param contains, in sequence, Amp_i, Pos_i, Width_i for each line; all coordinates are in index
-
spike.plugins.Fitter.
dSpec
(Param, x, y=None)[source]¶ Param contains, in sequence, Amp_i, Pos_i, Width_i
-
spike.plugins.Fitter.
display_fit
(npkd, **kw)[source]¶ displays the result of the fit; accepts the same arguments as display()
-
spike.plugins.Fitter.
fit
(npkd, zoom=None)[source]¶ fits the 1D npkd data-set to Lorentzian line-shapes; the current peak list is used as initial values for the fit; only peaks within the zoom window are fitted
- the fit is constrained around the initial values:
intensity is not allowed to change by more than x0.5 to x2
positions by more than 5 points
width by more than x5
(constraints work only for scipy version >= 0.17)
It may help to use centroid() to pre-optimize the peak list before calling fit(), or to call fit() twice (slower)
-
spike.plugins.Fitter.
residu
(Params, x, y)[source]¶ The residual function; returns the vector Ycalc(Params) - y_experimental; can be used by leastsq
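residu() follows the standard scipy least-squares convention (model minus data). A self-contained sketch of the same pattern, with an assumed Lorentzian model and scipy.optimize.least_squares standing in for the plugin's fitter:

```python
import numpy as np
from scipy.optimize import least_squares

def spec(param, x):
    # param is a flat sequence Amp_i, Pos_i, Width_i, as in the plugin
    y = np.zeros_like(x)
    for i in range(0, len(param), 3):
        amp, pos, width = param[i:i + 3]
        y += amp / (1.0 + ((x - pos) / width) ** 2)
    return y

def residu(param, x, y):
    # residual vector Ycalc(param) - y_experimental, as expected by leastsq/least_squares
    return spec(param, x) - y

x = np.arange(200.0)
truth = [100.0, 80.0, 4.0]                 # one line: Amp, Pos, Width
y = spec(truth, x)                         # noiseless synthetic spectrum
fit = least_squares(residu, [80.0, 78.0, 6.0], args=(x, y))
```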
spike.plugins.Linear_prediction module¶
plugin providing the Linear Prediction algorithms for NPKData
-
class
spike.plugins.Linear_prediction.
LinpredicTests
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
-
spike.plugins.Linear_prediction.
lpext
(npkd, final_size, lprank=10, algotype='burg')[source]¶ extends a 1D FID, or a 2D FID along F1, up to final_size, using lprank coefficients and the algotype algorithm
spike.plugins.Peaks module¶
Set of functions for peak detection and display - 1D and 2D.
First functional version - not finished!
- Peak1D and Peak2D are simple objects
with attributes like Id, label, intens(ity), pos(ition), or width; the only added method is report() (returns a string)
- Peak1DList and Peak2DList are python lists, with a few added methods
report (to stdout or to a file)
- largest: sorts in decreasing order of intensity
- other sorts can simply be done with peaklist.sort(key=lambda p: p.XXX)
where XXX is any peak attribute (see the largest code)
Example of usage:
    # assuming d is a 2D NPKData / 1D will be just as simple
    d.pp()   # computes a peak picking over the whole spectrum, using 3 x standard_deviation(d)
    # This is just a detection of all local maxima
    # We can be more specific:
    d.pp(threshold=5E5, zoom=((700,1050),(300,350)))   # zoom is always in the currently active unit, defined with d.unit
    # this attaches the peak list to the dataset as d.peaks,
    # it is a list of Peak2D objects, with some added properties
    print("number of detected peaks: %d" % len(d.peaks))
    p0 = d.peaks[0]      # peaks have label, intensity and position attributes
    print(p0.report())   # and a report method
    # report has an additional format parameter which enables control of the output
    # we can call centroid to improve the accuracy and move the position to the center of a fitted (2D) parabola
    d.centroid()
    # The peak list can be displayed on screen as simple crosses
    d.display_peaks()
    # The labels can be modified for specific purposes:
    for p in d.peaks:
        if 150 < p.posF2 < 1500:
            p.label = "%.2f x %.f" % (p.posF1, p.posF2)   # for instance changing to the coordinates in a certain zone
        else:
            p.label = ""                                  # and removing them elsewhere
    d.display_peaks(peak_label=True)
    # peak lists can also be reported
    d.report_peak()
    # but also as a formatted stream, redirected to a file:
    output = open("my_peak_list.csv", "w")                # open the file
    output.write("# LABEL, INTENSITY, F1, Width, F2, width")
    d.report_peak(file=output, format="{1}, {4:.2f}, {2:.7f}, {5:.2f}, {3:.7f}, {6:.2f}")
    # the argument order is id, label, posF1, posF2, intensity, widthF1, widthF2
    output.close()
Sept 2015 M-A Delsuc
-
class
spike.plugins.Peaks.
Peak
(Id, label, intens)[source]¶ Bases:
object
a generic class to store peaks. Defines:
Id: a unique integer
intens: the intensity (the height of the largest point)
area: the area/volume of the peak
label: a string
intens_err, area_err: the uncertainties of the previous values
-
class
spike.plugins.Peaks.
Peak1D
(Id, label, intens, pos, pos_err=0.0, width=0.0, width_err=0.0)[source]¶ Bases:
spike.plugins.Peaks.Peak
a class to store a single 1D peak. Defines, in addition to Peak:
pos: position of the peak, in index, relative to the typed (real/complex) buffer
width: width of the peak, in index
pos_err, width_err: uncertainties of the previous values
-
full_format
= '{}, {}, {}, {}, {}, {}, {}, {}, '¶
-
report
(f=<function _identity>, format=None)[source]¶ print the peak list. f is a function used to transform the coordinate; the identity function is the default; for instance you can use something like peaks.report(f=s.axis1.itop) to get ppm values on an NMR dataset. Order is "id, label, position, intensity"
parameters are: Id, label, position, intens, width, intens_err, pos_err, width_err, in that order.
By default only the first 4 fields are returned, with 2 digits, but the format keyword can change that. format values:
- None or "report": the standard value is used: "{}, {}, {:.2f}, {:.2f}"
(so only the first four parameters are shown)
- "full": all parameters at full resolution ("{}; " * 8)
- any other string following the format syntax will do.
You can use any formatting syntax. So for instance the format "{1} : {3:.2f} F1: {2:.7f} +/- {4:.2f}" will remove the Id, show the position with 7 digits after the decimal point, and show the width.
You can change the report and full default values by setting the class attributes pk.__class__.report_format and pk.__class__.full_format
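Since these are plain str.format templates, their behavior can be checked with ordinary Python (the peak values below are made up for illustration):

```python
# positional fields, in the documented order: 0=Id, 1=label, 2=position, 3=intensity, 4=width
fields = (1, "pk1", 104.9332151, 5234.2, 3.1415)

default = "{}, {}, {:.2f}, {:.2f}".format(*fields[:4])          # the "report" template
custom = "{1} : {3:.2f} F1: {2:.7f} +/- {4:.2f}".format(*fields)  # the custom example from the doc
```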
-
report_format
= '{}, {}, {:.2f}, {:.2f}'¶
-
-
class
spike.plugins.Peaks.
Peak1DList
(*arg, **kwds)[source]¶ Bases:
spike.plugins.Peaks.PeakList
store a list of 1D peaks. Contains the array version of the Peak1D objects: self.pos is the numpy array of the positions of all the peaks, and self[k] is the kth Peak1D object of the list
-
display
(peak_label=False, peak_mode='marker', zoom=None, show=False, f=<function _identity>, color='red', marker='x', markersize=6, figure=None, scale=1.0, NbMaxPeaks=1000, markerdict=None, labeldict=None)[source]¶ displays 1D peaks. zoom is in index. peak_mode is either "marker", "bar", or "None" (to be used to just add labels). NbMaxPeaks is the maximum number of peaks to display in the zoom window (shows only the largest). f() should be a function which converts from points to the current display scale - typically npk.axis1.itoc
-
peakaggreg
(distance, maxdist=None, method='max')[source]¶ aggregates 1D peaks in the peak list if peaks are closer than a given distance in pixels; check peak_aggreg() for detailed doc
-
property
pos
¶ returns a numpy array of the positions in index
-
report
(f=<function _identity>, file=None, format=None, NbMaxPeaks=1000)[source]¶ print the peak list. f is a function used to transform the coordinates; the identity function is the default; for instance you can use something like d.peaks.report(f=d.axis1.itop) to get ppm values on an NMR dataset
check documentation for Peak1D.report() for details on output format
-
-
class
spike.plugins.Peaks.
Peak2D
(Id, label, intens, posF1, posF2)[source]¶ Bases:
spike.plugins.Peaks.Peak
a class to store a single 2D peak. Defines, in addition to Peak:
posF1, posF2: positions in F1 and F2 of the peak, in index, relative to the typed (real/complex) axes
widthF1, widthF2: widths of the peak, in index
posF1_err, posF2_err: uncertainties of the previous values
widthF1_err, widthF2_err: …
-
full_format
= '{}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, '¶
-
report
(f1=<function _identity>, f2=<function _identity>, format=None)[source]¶ print the peak list. f1 and f2 are two functions used to transform the coordinates in F1 and F2 respectively; the identity function is the default; for instance you can use something like peaks.report(f1=s.axis1.itop, f2=s.axis2.itop) to get ppm values on an NMR dataset. Order is "id, label, posF1, posF2, intensity, widthF1, widthF2"
printed parameters are: Id, label, posF1, posF2, intens, widthF1, widthF2, posF1_err, posF2_err, intens_err, widthF1_err, widthF2_err
By default only the first 5 fields are returned, with 2 digits, but the format keyword can change that. format values:
- None or "report": the standard value is used: "{}, {}, {:.2f}, {:.2f}, {:.2f}"
(so only the first five parameters are shown)
- "full": all parameters at full resolution ("{}; " * 12)
- any other string following the format syntax will do.
You can use any formatting syntax. So for instance the format "{1} : {4:.2f} F1: {2:.7f} +/- {5:.2f} X F2: {3:.7f} +/- {6:.2f}" will remove the Id, show the positions with 7 digits after the decimal point, and show the widths
-
report_format
= '{}, {}, {:.2f}, {:.2f}, {:.2f}'¶
-
-
class
spike.plugins.Peaks.
Peak2DList
(*arg, **kwds)[source]¶ Bases:
spike.plugins.Peaks.PeakList
store a list of 2D peaks. Contains the array version of the Peak2D objects: self.posF1 is the numpy array of the positions of all the peaks, and self[k] is the kth Peak2D object of the list
-
display
(axis=None, peak_label=False, zoom=None, show=False, f1=<function _identity>, f2=<function _identity>, color=None, markersize=6, figure=None, NbMaxPeaks=1000, markerdict=None, labeldict=None)[source]¶ displays the 2D peak list. zoom is in index. f1 and f2 should be functions which convert from points to the current display scale - typically npk.axis1.itoc and npk.axis2.itoc
-
property
posF1
¶ returns a numpy array of the F1 positions in index
-
property
posF2
¶ returns a numpy array of the F2 positions in index
-
report
(f1=<function _identity>, f2=<function _identity>, file=None, format=None, NbMaxPeaks=1000)[source]¶ print the peak list. f1 and f2 are two functions used to transform the coordinates in F1 and F2 respectively; the identity function is the default; for instance you can use something like d.peaks.report(f1=s.axis1.itop, f2=s.axis2.itop) to get ppm values on an NMR dataset. The file keyword allows redirecting the output to a file object
check documentation for Peak2D.report() for details on output format
-
-
class
spike.plugins.Peaks.
PeakList
(*arg, **kwds)[source]¶ Bases:
collections.UserList
the class generic to all peak lists
-
property
intens
¶ returns a numpy array of the intensities
-
property
label
¶ returns an array of the labels
-
spike.plugins.Peaks.
center
(x, xo, intens, width)[source]¶ the centroid definition, used to fit the spectrum. x can be a nparray. FWHM is sqrt(2) x width.
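The note that FWHM = sqrt(2) x width is consistent with a parabolic model intens * (1 - ((x - xo)/width)**2); the sketch below uses that assumed form (it is an inference from the FWHM remark, not code taken from the plugin):

```python
def center(x, xo, intens, width):
    # parabolic centroid model: half maximum is reached at |x - xo| = width / sqrt(2),
    # hence FWHM = sqrt(2) * width, matching the documented relation
    return intens * (1.0 - ((x - xo) / width) ** 2)
```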
-
spike.plugins.Peaks.
center2d
(yx, yo, xo, intens, widthy, widthx)[source]¶ the 2D centroid, used to fit 2D spectra - see center(). yx is [x_0, y_0, x_1, y_1, …, x_n-1, y_n-1] - 2*n long for n points; returns [z_0, z_1, …, z_n-1]
-
spike.plugins.Peaks.
centroid1d
(npkd, npoints=3, reset_label=True, cure_outliers=True)[source]¶ from a peak list determined by peak picking, performs a centroid fit of the peak summit and width, using npoints values around the center (npoints has to be odd); computes the full width at half maximum and updates the peak list in the data
reset_label: when True (default), resets the labels of FTMS datasets
cure_outliers: restores peaks with pathological parameters
TODO: update uncertainties
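A minimal 3-point parabolic interpolation - the classical way to refine a maximum position from samples around the summit - illustrates what such a centroid step computes (a sketch, not the plugin code):

```python
def refine3(y_m1, y0, y_p1):
    """Vertex offset (in points, between -0.5 and +0.5) of the parabola
    through three equidistant samples around a local maximum."""
    denom = y_m1 - 2.0 * y0 + y_p1
    return 0.5 * (y_m1 - y_p1) / denom

# exact parabola with its summit 0.3 points to the right of the central sample
f = lambda t: 10.0 - (t - 0.3) ** 2
```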
-
spike.plugins.Peaks.
centroid2d
(npkd, npoints_F1=3, npoints_F2=3)[source]¶ from a peak list determined by peak picking, performs a centroid fit of the peak summit and width; computes the full width at half maximum and updates the peak list in the data
TODO: update uncertainties
-
spike.plugins.Peaks.
display_peaks
(npkd, peak_label=False, peak_mode='marker', zoom=None, show=False, color=None, markersize=6, figure=None, scale=1.0, NbMaxPeaks=1000, markerdict=None, labeldict=None)[source]¶ display the content of the peak list. peak_mode is either "marker" (default) or "bar" (1D only); zoom is in the current unit.
-
spike.plugins.Peaks.
peak_aggreg
(pklist, distance, maxdist=None, method='max')[source]¶ aggregates 1D peaks in the peak list if peaks are closer than a given distance in pixels
distance: if two consecutive peaks are less than distance apart (in points), they are aggregated
maxdist: if not None, the maximal distance to the largest peak
method: either 'max' or 'mean'
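A self-contained sketch of this kind of aggregation - grouping consecutive positions closer than distance and keeping, per group, either the largest peak ('max') or the mean position with summed intensity ('mean'). The exact merging rules of peak_aggreg() are not given here, so this is an illustration of the idea only:

```python
def aggregate(peaks, distance, method="max"):
    """peaks: list of (position, intensity) tuples, sorted by position.
    Consecutive peaks closer than `distance` points are merged into one."""
    groups, current = [], [peaks[0]]
    for p in peaks[1:]:
        if p[0] - current[-1][0] < distance:
            current.append(p)          # still within the running group
        else:
            groups.append(current)     # close the group, start a new one
            current = [p]
    groups.append(current)
    if method == "max":
        return [max(g, key=lambda p: p[1]) for g in groups]
    # 'mean': mean position, summed intensity
    return [(sum(p[0] for p in g) / len(g), sum(p[1] for p in g)) for g in groups]

pk = [(10, 5.0), (12, 8.0), (40, 3.0)]   # first two peaks are 2 points apart
```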
-
spike.plugins.Peaks.
peakpick
(npkd, threshold=None, zoom=None, autothresh=3.0, verbose=False)[source]¶ performs a peak picking of the current experiment
threshold: the level above which peaks are picked
None (default) means that autothresh*(noise level of dataset) will be used - using d.robust_stats() as a proxy for the noise level
zoom: defines the region in which detection is done
zoom is in the current unit (same syntax as in display); None means the whole data
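The underlying detection - local maxima above a threshold - can be sketched in a few lines of numpy (an illustration of the principle, not the plugin's code):

```python
import numpy as np

def peaks1d_sketch(buf, threshold):
    """indices of strict local maxima above threshold (interior points only)"""
    above = buf[1:-1] > threshold
    local_max = (buf[1:-1] > buf[:-2]) & (buf[1:-1] > buf[2:])
    return np.nonzero(above & local_max)[0] + 1   # +1 restores original indexing

x = np.zeros(50)
x[10], x[30] = 5.0, 1.0   # two isolated "peaks"
```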
-
spike.plugins.Peaks.
peaks1d
(npkd, threshold, zoom=None)[source]¶ math code for NPKData 1D peak picker
-
spike.plugins.Peaks.
pk2pandas
(npkd, full=False)[source]¶ exports an extract of the current peak list to a pandas DataFrame - in the current unit. If full is False (default), the uncertainties are not listed. Uses the nmr or ms version depending on data_type
-
spike.plugins.Peaks.
pk2pandas_ms
(npkd, full=False)[source]¶ exports an extract of the current peak list to a pandas DataFrame, for MS datasets
-
spike.plugins.Peaks.
pk2pandas_nmr
(npkd, full=False)[source]¶ exports an extract of the current peak list to a pandas DataFrame, for NMR datasets
-
spike.plugins.Peaks.
report_peaks
(npkd, file=None, format=None, NbMaxPeaks=1000)[source]¶ print the content of the peak list, using the current unit
file should be an already opened, writable file stream; if None, output goes to stdout
for documentation, check Peak1D.report() and Peak2D.report()
spike.plugins.bcorr module¶
Set of functions for baseline correction.
First version - not finished!
improved July 2016
-
spike.plugins.bcorr.
autopoints
(npkd, Npoints=8, modulus=True)[source]¶ computes Npoints (default 8) positions for a spline baseline correction
-
spike.plugins.bcorr.
bcorr
(npkd, method='spline', xpoints=None, nsmooth=0, modulus=True)[source]¶ recapitulates all baseline correction methods; only 1D so far
- method is either
- auto:
uses bcorr_auto, an automatic determination of the baseline; does not work with negative peaks.
- linear:
a simple 1D correction
- spline:
a cubic spline correction
Both linear and spline use an additional list of pivot points 'xpoints' used to calculate the baseline. If xpoints is absent, pivots are estimated automatically; if xpoints is an integer, it determines the number of computed pivots (default is 8 if xpoints is None); if xpoints is a list of integers, these are used as pivots.
If nsmooth > 0, the buffer is smoothed by a moving average over 2*nsmooth+1 positions around the pivots. If the dataset is complex, the xpoints are computed on the modulus spectrum, unless modulus is False.
The default is spline with automatic detection of 8 baseline points
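A minimal sketch of the spline variant - evaluate the signal at the pivot indices and subtract a cubic spline through them (using scipy.interpolate; parameter handling and smoothing are omitted, so this only illustrates the principle):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def bcorr_spline_sketch(buf, xpoints):
    """subtract a cubic-spline baseline anchored at the pivot indices xpoints"""
    xpoints = np.asarray(xpoints)
    baseline = CubicSpline(xpoints, buf[xpoints])(np.arange(len(buf)))
    return buf - baseline

# synthetic data: a pure slow polynomial baseline, pivots spread across it
t = np.arange(100.0)
signal = 0.001 * t ** 2 + 0.05 * t
corrected = bcorr_spline_sketch(signal, [0, 30, 60, 99])
```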
-
spike.plugins.bcorr.
bcorr_auto
(npkd, iterations=10, nbchunks=40, degree=1, nbcores=2, smooth=True)[source]¶ applies an automatic baseline correction
Finds the baseline by using a low norm value and then a high norm value to attract the baseline toward the small values. Parameters:
iterations: number of iterations for convergence toward the small values
nbchunks: number of chunks on which the minimization is done; typically, each chunk must be larger than the peaks
degree: degree of the polynomial used for approximating each signal chunk
nbcores: number of cores used for minimizing in parallel over many chunks (if not None)
smooth: if True, applies a final Savitzky-Golay smoothing
-
spike.plugins.bcorr.
get_ypoints
(buff, xpoints, nsmooth=0)[source]¶ from buff and xpoints, returns ypoints = buff[xpoints], optionally smoothed by a moving average over 2*nsmooth+1 positions
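The documented behavior can be sketched directly - pick buff[x] at each pivot, averaged over 2*nsmooth+1 points (edge handling is an assumption):

```python
import numpy as np

def get_ypoints_sketch(buff, xpoints, nsmooth=0):
    """buff[x] for each pivot x, optionally averaged over 2*nsmooth+1 points,
    clipping the averaging window at the buffer edges"""
    ys = []
    for x in xpoints:
        lo, hi = max(0, x - nsmooth), min(len(buff), x + nsmooth + 1)
        ys.append(buff[lo:hi].mean())
    return np.array(ys)

b = np.array([0.0, 3.0, 6.0, 9.0, 12.0])
```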
spike.plugins.fastclean module¶
A utility to set to zero all points below a ratio
-
spike.plugins.fastclean.
fastclean
(npkd, nsigma=2.0, nbseg=20, axis=0)[source]¶ sets to zero all points below nsigma times the noise level. This allows the corresponding data-set, once stored to file, to compress considerably better.
- nsigma: float
the ratio used, typically 1.0 to 3.0 (higher compression)
- nbseg: int
the number of segments used for noise evaluation, see util.signal_tools.findnoiselevel
- axis: int
the axis on which the noise is evaluated, default is fastest varying dimension
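The operation reduces to estimating a noise level and zeroing everything below nsigma times it. A simplified one-pass version, with a crude global standard deviation standing in for the segment-wise util.signal_tools.findnoiselevel estimate:

```python
import numpy as np

def fastclean_sketch(buf, nsigma=2.0):
    """zero all points whose magnitude is below nsigma * estimated noise level"""
    noise = np.std(buf)            # crude global noise proxy (stand-in assumption)
    out = buf.copy()
    out[np.abs(out) < nsigma * noise] = 0.0
    return out

buf = np.full(100, 0.1)   # small "noise" floor
buf[50] = 100.0           # one strong peak
out = fastclean_sketch(buf)
```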
spike.plugins.gaussenh module¶
Gaussian enhancement apodisation
d.gaussenh(width, enhancement=1.0, axis=0)
applies a Gaussian enhancement; width is in Hz; enhancement is the strength of the effect; axis is either F1 or F2 in 2D, 0 is the default axis. Multiplies by gauss(width) * exp(-enhancement*width)
Created by DELSUC Marc-André on February 2019 Copyright (c) 2019 IGBMC. All rights reserved.
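One common realization of Gaussian enhancement multiplies the FID by a rising exponential times a Gaussian envelope; the exact window used by gaussenh is not spelled out above, so the sketch below only illustrates the shape of such a window (the functional form and all constants are assumptions):

```python
import numpy as np

def gaussenh_window(npoints, dt, width, enhancement=1.0):
    """illustrative line-narrowing window: rising exponential times a Gaussian.
    width in Hz, dt is the dwell time in s - form chosen for illustration only."""
    t = np.arange(npoints) * dt
    return np.exp(enhancement * width * t) * np.exp(-(width * t) ** 2)

w = gaussenh_window(1024, 1e-3, 10.0)
```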
spike.plugins.rem_ridge module¶
removes ridges in 2D
Created by Marc-André on 2011-08-15. Copyright (c) 2011 IGBMC. All rights reserved.
spike.plugins.sane module¶
plugin for Sane denoising
This plugin implements the SANE denoising algorithm. SANE is inspired by the urQRd algorithm, but is improved in several respects:
- faster on vector lengths != 2**n
- much more efficient on weak signals
- requires fewer iterations and less overestimation of the rank
However, a non-productive iteration is always performed, so the processing time for I iterations of SANE should be compared with I+1 iterations of urQRd.
associated publications - Bray, F., Bouclon, J., Chiron, L., Witt, M., Delsuc, M.-A., & Rolando, C. (2017).
Nonuniform Sampling Acquisition of Two-Dimensional Fourier Transform Ion Cyclotron Resonance Mass Spectrometry for Increased Mass Resolution of Tandem Mass Spectrometry Precursor Ions. Analytical Chemistry, acs.analchem.7b01850. http://doi.org/10.1021/acs.analchem.7b01850
Chiron, L., van Agthoven, M. A., Kieffer, B., Rolando, C., & Delsuc, M.-A. (2014). Efficient denoising algorithms for large experimental datasets and their applications in Fourier transform ion cyclotron resonance mass spectrometry. PNAS , 111(4), 1385–1390. http://doi.org/10.1073/pnas.1306700111
-
spike.plugins.sane.
sane_plugin
(npkd, rank, orda=None, iterations=1, axis=0, trick=True, optk=False, ktrick=False)[source]¶ Applies "sane" denoising to the data. rank is about 2 x number_of_expected_lines. Manages real and complex cases, and handles the hypercomplex case, e.g. for denoising of 2D FTICR.
sane algorithm - the name stands for Support Selection for Noise Elimination. From a data series, returns a denoised series.
data: the series to be denoised - a (normally complex) numpy buffer
rank: the rank of the analysis
orda: the order of the analysis
internally, a Hankel matrix (M,N) is constructed, with M = orda and N = len(data)-orda+1; if None (default), orda = (len(data)+1)/2
iterations: the number of times the operation should be repeated
optk: if set to True, calculates the rank giving the best recovery for an automatically estimated noise level
trick: enhances the denoising by using a cleaned signal as the projective space ("Support Selection")
ktrick: if a value is given, changes the rank on the second pass.
The idea is that for the first pass a rank large enough is used to compensate for the noise, while for the second pass a lower rank can be used.
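SANE itself relies on randomized projections and the "support selection" trick; the core idea it shares with urQRd - build a Hankel matrix from the series, truncate it to low rank, and average the anti-diagonals back into a series - can be sketched with a plain SVD (a naive stand-in, far slower than the real algorithm):

```python
import numpy as np

def hankel_denoise(data, rank, orda=None):
    """naive rank-truncation denoising of a 1D series via its Hankel matrix"""
    if orda is None:
        orda = (len(data) + 1) // 2
    M, N = orda, len(data) - orda + 1
    # Hankel matrix: H[i, j] = data[i + j]
    H = np.array([data[i:i + N] for i in range(M)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Hr = (U[:, :rank] * s[:rank]) @ Vt[:rank]      # best rank-`rank` approximation
    # average each anti-diagonal back into a 1D series
    out = np.zeros(len(data), dtype=Hr.dtype)
    counts = np.zeros(len(data))
    for i in range(M):
        out[i:i + N] += Hr[i]
        counts[i:i + N] += 1
    return out / counts

rng = np.random.default_rng(0)
t = np.arange(256)
clean = np.cos(2 * np.pi * 0.05 * t)               # a single line: Hankel rank 2
noisy = clean + 0.3 * rng.standard_normal(256)
den = hankel_denoise(noisy, rank=4)
```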
spike.plugins.sg module¶
Set of functions for Savitzky-Golay smoothing
-
spike.plugins.sg.
sg
(npkd, window_size, order, deriv=0, axis=0)[source]¶ applies a Savitzky-Golay filter of the given order to the data
- window_size: int
the length of the window; must be an odd integer
- order: int
the order of the polynomial used in the filtering; must be less than window_size - 1
- deriv: int
the order of the derivative to compute (default = 0 means smoothing only)
- axis: int
the axis on which the filter is to be applied; default is the fastest varying dimension
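scipy ships the same filter as scipy.signal.savgol_filter; a quick check of its defining property - a Savitzky-Golay filter of order p reproduces any polynomial of degree <= p exactly (a sketch, not the plugin code):

```python
import numpy as np
from scipy.signal import savgol_filter

x = np.arange(50, dtype=float)
y = 0.5 * x ** 2 - 3.0 * x + 1.0     # exactly a degree-2 polynomial
# window of 11 points, polynomial order 2: the smoothing leaves y unchanged
smoothed = savgol_filter(y, window_length=11, polyorder=2)
```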
-
spike.plugins.sg.
sg2D
(npkd, window_size, order, deriv=None)[source]¶ applies a 2D Savitzky-Golay filter of the given order to the data
- window_size: int
the length of the square window; must be an odd integer
- order: int
the order of the polynomial used in the filtering; must be less than window_size - 1
- deriv: None, 'col', or 'row' ('both' mode does not work)
the direction of the derivative to compute (default = None means smoothing only)
can only be applied to 2D data.
spike.plugins.test module¶
Test procedure for plugins
spike.plugins.urQRd module¶
plugin for the urQRd denoising method
spike.plugins.zoom3D module¶
Module contents¶
Plug-ins for the Spike package
All the plugin files located in the spike/plugins folder are loaded automatically when importing spike for the first time.
The variable spike.plugins.plugins contains the list of the loaded plugin modules.
It is always possible to load a plugin afterwards, by importing the plugin definition at a later time during run-time.
Each plugin file should define the needed functions:
- def myfunc(npkdata, args):
    "myfunc doc"
    ...do whatever, assuming npkdata is a NPKData
    return npkdata    # THIS is important - that is the standard NPKData mechanism
and register them into NPKData as follows:
    NPKData_plugin("myname", myfunc)
then, any NPKData will inherit the myname() method
For the moment, only NPKData plugins are handled.
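The registration mechanism amounts to attaching the function as a method of the data class; a toy stand-in of the pattern (DummyData and npkdata_plugin below are hypothetical; only the NPKData_plugin("myname", myfunc) calling convention is taken from the text above):

```python
class DummyData:
    """toy stand-in for NPKData, for illustration only"""
    def __init__(self, buffer):
        self.buffer = buffer

def npkdata_plugin(name, func, cls=DummyData):
    # attach func as a method of the data class, as NPKData_plugin("myname", myfunc) does
    setattr(cls, name, func)

def myfunc(npkdata, scale=1.0):
    "myfunc doc"
    npkdata.buffer = [scale * v for v in npkdata.buffer]
    return npkdata        # returning the dataset allows method chaining

npkdata_plugin("myname", myfunc)
d = DummyData([1.0, 2.0]).myname(scale=2.0)   # the new method is now available
```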
-
spike.plugins.
load
(debug=True)[source]¶ the load() function is called at initialization, and loads all files found in the plugins folders, typically:
the plugins folder in the distribution
$HOME/Spike/plugins
-
spike.plugins.
loadfolder
(folder, debug=True)[source]¶ loads plugins from a given folder; imports all python code found there, except files whose names start with a _