spike.Algo package¶
Submodules¶
spike.Algo.BC module¶
This module contains several utilities for baseline correction of spectra
Created by Marc-Andre on 2015-03-26.
Modification by Lionel 2015-07-10
-
spike.Algo.BC.
baseline0
(y, degree=2, power=1, method='Powell', chunksize=2000, nbcores=None, ratiocov=0.7)[source]¶ compute a piece-wise baseline on y using fitpolyL1 degree : is the degree of the underlying polynome power : norm for the approximation chunksize : defines the size of the pieces. nbcores : number of cores used for parallelization of the calculations. ratiocov : covering ratio of the chunks y - baseline(y) produces a baseline corrected spectrum
-
spike.Algo.BC.
baseline1
(y, degree=2, chunksize=2000)[source]¶ compute a piece-wise baseline on y using fitpolyL1 degree is the degree of the underlying polynome chunksize defines the size of the pieces a cosine roll-off is used to smooth out chunks junctions
y - baseline(y) produces a baseline corrected spectrum
-
spike.Algo.BC.
bcL1
(y, degree=2, power=1, method='Powell')[source]¶ compute a baseline on y using fitpolyL1
-
spike.Algo.BC.
bcL1_paral
(args)[source]¶ compute a baseline on y using fitpolyL1 for the parallel case.
-
spike.Algo.BC.
correctbaseline
(y, iterations=1, nbchunks=100, firstpower=0.3, secondpower=7, degree=1, chunkratio=1.0, interv_ignore=None, method='Powell', nbcores=None, debug=False, choiceBL=0, ratiocov=0.7)[source]¶ Find baseline by using low norm value and then high norm value to attract the baseline on the small values. Parameters : iterations : number of iterations for convergence toward the small values. nbchunks : number of chunks on which is done the minimization. Typically, each chunk must be larger than the peaks. firstpower : norm used for the first iterate secondpower : norm used for attracting the curve toward the lowest values. firstdeg : degree used for the first minimization degree : degree of the polynome used for approaching each signal chunk. chunkratio : ratio for changing the chunksize inside main loop interv_ignore : ignore a given intervall in the spectrum (eg : avoids issues with water pick) method : Algorithm used for minimization on each chunk nbcores : number of cores used for minimizing in parallel on many chunks (if not None) debug : if debug is set to True, the dictionary bls is built ratiocov : covering ratio of the chunks. High recovering ratios seem to give better results. By default ratiocov = 0.7
spike.Algo.CS_transformations module¶
Authors: Marc-André Last modified: 2011/11/18.
adapted by Lionel on 2013-3-6. Copyright (c) 2011 IGBMC. All rights reserved.
-
spike.Algo.CS_transformations.
read_data
(F)[source]¶ Reads data from the sampling file, used by sampling_load()
-
spike.Algo.CS_transformations.
read_param
(F)[source]¶ Reads the sampling parameters. used by sampling_load()
-
spike.Algo.CS_transformations.
sampling_load
(addr_sampling_file)[source]¶ Loads a sampling protocole from a list of indices stored in a file named addr_sampling_file returns an nparray with the sampling scheme.
i.e. if b is a full dataset, b[sampling] is the sampled one
-
class
spike.Algo.CS_transformations.
transformations
(size_image, size_mesure, sampling=None, debug=0)[source]¶ Bases:
object
this class contains methods which are tools to generate transform and ttransform. ttrans form data to image trans from image to data.
spike.Algo.Cadzow module¶
cadzow.py
Created by Marc-André on 2010-03-18. Copyright (c) 2010 IGBMC. All rights reserved.
-
class
spike.Algo.Cadzow.
CadzowTest
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
Testing Cadzow mathematics -
-
spike.Algo.Cadzow.
cadzow
(data_arg, n_of_line=5, n_of_iter=5, orda=10)[source]¶ apply the cadzow procedure to remove noise from the current FID. should be followed with an FT or an LP-SVD analysis
will return a dataset of the same type (float or cpx) as the input
will do nothing if the data is null
-
spike.Algo.Cadzow.
cadzow1d
(d1D, n_of_line=5, n_of_iter=5, orda=100)[source]¶ applies the cadzow procedure to a 1D dataset
-
spike.Algo.Cadzow.
cadzow2d
(d2D, n_of_line=5, n_of_iter=5, orda=100, mp=True, N_proc=None, verbose=1)[source]¶ applies the cadzow procedure to all columns of the 2D
if mp, does it in a multiprocessing fashion using multiprocessing.Pool() if N_proc is None, finds the optimum numer itself. verbose = 1 is minimum output, verbose 0 is no output
-
spike.Algo.Cadzow.
cadzow2dmp
(d2D, n_of_line=5, n_of_iter=5, orda=100, mp=True, N_proc=None, verbose=1)¶ applies the cadzow procedure to all columns of the 2D
if mp, does it in a multiprocessing fashion using multiprocessing.Pool() if N_proc is None, finds the optimum numer itself. verbose = 1 is minimum output, verbose 0 is no output
-
spike.Algo.Cadzow.
dt2svd
(data, orda=5)[source]¶ computes SVD at order ‘orda’ from current 1D data (FID) data should be a numpy complex 1D array
Calculates the rectangular matrix X(size-order,order) from the data and perform its singular value decomposition. This is the first step of the LP-SVD spectral analysis method. The same singular value decomposition can be used to perform forward and backward analysis.
data is untouched
returns (U, S, Vh)
see also : svd2dt svdlist svdclean1 svd2ar
-
spike.Algo.Cadzow.
svd2dt
(U, S, V)[source]¶ Recalculate the data which would produce the SVD decomposition U S V, Does this by approximating the Hankel data matrix x from the svd and u,v
U S V are untouched,
return data
see also : dt2svd svdclean svd2ar
-
spike.Algo.Cadzow.
svdclean
(svd, keep=0, threshold=0, remove=1)[source]¶ removes noise related features from a svd spectrum two methods available (uses whichever is available (eventually both)): keep != 0 : the number of svd to keep threshold != 0 : the minimum value to keep when remove == 1, the mean power of the removed SVD is removed from the remaining ones
svd is untouched
spike.Algo.Cadzow_mpi module¶
spike.Algo.Cadzow_mpi2 module¶
spike.Algo.Laplace module¶
spike.Algo.Linpredic module¶
Adaptation of code from :
file CollombBurg.py author/translator Ernesto P. Adorio
UPDEPP (UP Clark) ernesto.adorio@gmail.com
Version 0.0.1 jun 11, 2010 # first release. References Burg’s Method, Algorithm and Recursion, pp. 9-11
Created by Lionel on 2011-09-18. Removed the “for loops” so as to speed up using numpy capabilities. Copyright (c) 2010 IGBMC. All rights reserved.
-
class
spike.Algo.Linpredic.
LinpredTests
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
Testing linear prediction , Burg algorithm-
-
spike.Algo.Linpredic.
burgc
(lprank, data)[source]¶ Based on Gifa code Burgs Method, algorithm and recursion
lprank - number of lags in autoregressive model. data - complex data vector to approximate.
returns ar, pow1, error
-
spike.Algo.Linpredic.
burgr
(m, x)[source]¶ Based on Collomb’s C++ code, pp. 10-11 Burgs Method, algorithm and recursion
m - number of lags in autoregressive model. x - real data vector to approximate.
spike.Algo.SL0 module¶
SL0 from http://ee.sharif.ir/~SLzero/
Authors: Massoud Babaie-Zadeh and Hossein Mohimani Version: 1.3 Last modified: 4 August 2008.
adapted to numpy by Marc-André on 2011-11-18. Copyright (c) 2011 IGBMC. All rights reserved.
-
spike.Algo.SL0.
SL0
(A, x, sigma_min, sigma_decrease_factor=0.5, mu_0=2, L=3, A_pinv=None, true_s=None)[source]¶ SL0(A, x, sigma_min, sigma_decrease_factor, mu_0, L, A_pinv, true_s)
Returns the sparsest vector s which satisfies underdetermined system of linear equations A*s=x, using Smoothed L0 (SL0) algorithm. Note that the matrix A should be a ‘wide’ matrix (more columns than rows). The number of the rows of matrix A should be equal to the length of the column vector x.
The first 3 arguments should necessarily be provided by the user. The
other parameters have defult values calculated within the function, or may be provided by the user.
- Sequence of Sigma (sigma_min and sigma_decrease_factor):
- This is a decreasing geometric sequence of positive numbers:
The first element of the sequence of sigma is calculated
automatically. The last element is given by ‘sigma_min’, and the change factor for decreasing sigma is given by ‘sigma_decrease_factor’.
The default value of ‘sigma_decrease_factor’ is 0.5. Larger value
gives better results for less sparse sources, but it uses more steps on sigma to reach sigma_min, and hence it requires higher computational cost.
There is no default value for ‘sigma_min’, and it should be
provided by the user (depending on his/her estimated source noise level, or his/her desired accuracy). By `noise’ we mean here the noise in the sources, that is, the energy of the inactive elements of ‘s’. For example, by the noiseless case, we mean the inactive elements of ‘s’ are exactly equal to zero. As a rule of tumb, for the noisy case, sigma_min should be about 2 to 4 times of the standard deviation of this noise. For the noiseless case, smaller ‘sigma_min’ results in better estimation of the sparsest solution, and hence its value is determined by the desired accuracy.
- mu_0:
The value of mu_0 scales the sequence of mu. For each vlue of
sigma, the value of mu is chosen via mu=mu_0*sigma^2. Note that this value effects Convergence.
The default value is mu_0=2 (see the paper).
- L:
number of iterations of the internal (steepest ascent) loop. The
default value is L=3.
- A_pinv:
is the pseudo-inverse of matrix A defined by A_pinv=A’*inv(A*A’).
If it is not provided, it will be calculated within the function. If you use this function for solving x(t)=A s(t) for different values of ‘t’, it would be a good idea to calculate A_pinv outside the function to prevent its re-calculation for each ‘t’.
- true_s:
is the true value of the sparse solution. This argument is for
simulation purposes. If it is provided by the user, then the function will calculate the SNR of the estimation for each value of sigma and it provides a progress report.
Authors: Massoud Babaie-Zadeh and Hossein Mohimani Version: 1.4 Last modified: 4 April 2010. Web-page: ——————
Code History:
MODIF MAD : FT if true, spacity is evaluated in Fourier space Version 2.0: 4 April 2010
Doing a few small modifications that enable the code to work also for complex numbers (not only for real numbers).
- Version 1.3: 13 Sep 2008
Just a few modifications in the comments
Version 1.2: Adding some more comments in the help section
- Version 1.1: 4 August 2008
Using MATLAB’s pseudo inverse function to generalize for the case the matrix A is not full-rank.
Version 1.0 (first official version): 4 July 2008.
First non-official version and algorithm development: Summer 2006
-
spike.Algo.SL0.
SL0FT
(A, x, sigma_min, sigma_decrease_factor=0.5, mu_0=2, L=3, A_pinv=None, true_s=None)[source]¶ Simplistic implementation of FT in SL0 same argument as SL0,
A is a sampling/scrambling matrix if A is identity, trunctation or random sampling, then A_pinv should be provided as A.T
-
class
spike.Algo.SL0.
SL0_Tests
(methodName='runTest')[source]¶ Bases:
unittest.case.TestCase
-
test_igor1
()[source]¶ from http://nuit-blanche.blogspot.com/2011/11/how-to-wow-your-friends-in-high-places.html code from Igor Carron
-
-
spike.Algo.SL0.
SL0gene
(transf, ttransf, x, sigma_min, sigma_decrease_factor=0.5, mu_0=2, L=3, true_s=None)[source]¶ general implementation of SL0 most arguments as SL0
transform and ttransform are generalized linear operation simplistic implementation could be : def trans(s):
“transform s to x” return np.dot(A, fft.ifft(s) )
- def ttrans(x):
“ttransform x to s” return fft.fft(np.dot(A_pinv, x))
just write your own (eventually algorithmic) or used provided ones
-
class
spike.Algo.SL0.
transformations
(size_image, size_mesure, debug=0)[source]¶ Bases:
object
this class contains methods which are tools to generate transform and ttransform functions needed by convergence algorithms
spike.Algo.SL0_test module¶
spike.Algo.maxent module¶
maxent.py
Created by Marc-André on 2012-03-06. Copyright (c) 2012 IGBMC. All rights reserved.
-
class
spike.Algo.maxent.
ExpData
(data)[source]¶ Bases:
object
Implement an experimental data to be analysed Combined with the definition of the transform ( TransferFunction ), defines the scene for the MaxEnt solver
-
class
spike.Algo.maxent.
MaxEnt
(transferfunction, expdata, iterations=10, debug=0)[source]¶ Bases:
object
implements the Maximum Entropy solver
given M finds the minimum of Q = S(M) - lamb*Chi2(M) with lamb such that Chi2(M) is normalized
-
Q
(im)[source]¶ the Q function returns the value of -Q for a given image -Q so that it can be minimized
-
do_chisquare
(image, mode='derivative')[source]¶ computes chi2 (scalar) and its derivative (array), computed between the current image
and self.expdata which is the experimental data
returned as (chi2, dchi2) if mode != ‘derivative’ then only chi2 is computed, and returned
-
do_entropy
(image, mode='derivative')[source]¶ computes the entropy S (scalar) and its derivative (array), computed on image returned as (S, dS) if mode != ‘derivative’ then only S is computed, and returned
-
drive_conv
(dchi, dS)[source]¶ returns (conv, lambda_optim) given the derivative dchi and dS, and the current settings
-
report_convergence
()[source]¶ returns a set of array, which describe the way convergence was handled returns self.iter The number of iteration performed so far self.image The final image obtained on the last iteration self.chi2 The chi2 obtained on the last iteration self.S The entropy obtained on the last iteration self.sum The integral of self.image self.lamb The lambda obtained on the last iteration self.dchi2 The derivative of chi2 obtained on the last iteration self.dS The derivative of S obtained on the last iteration (following are [] if self.debug==0 ) self.lchi2 The list of chi2 values since the beginning self.lentropy The list of S values since the beginning self.llamb The list of lambda values since the beginning
-
setup
(mode='default')[source]¶ Set-up internal MaxEnt parameters so that a kind of optimal state is given
- mode can be (case insensitive)
“Descent” “Newton” “Gifa” “default” : currently sets to Descent
- modifies (M the MaxEnt instance):
M.algo M.exptang M.lambdacont M.lambdamax M.lambdasp M.miniter M.stepmax
-
solve
()[source]¶ main entry point for MaxEnt after having been initialised, and all attributes set, calling solve will realize the MaxEnt convergence.
after having called solve(), the following fields are updated : (see self.report_convergence()) self.iter The number of iteration performed so far self.image The final image obtained on the last iteration self.chi2 The chi2 obtained on the last iteration self.S The entropy obtained on the last iteration self.sum The integral of self.image self.lamb The lambda obtained on the last iteration self.dchi2 The derivative of chi2 obtained on the last iteration self.dS The derivative of S obtained on the last iteration the following will be populated only if self.debug>0 : self.lchi2 The list of chi2 values since the beginning self.lentropy The list of S values since the beginning self.llamb The list of lambda values since the beginning
-
-
class
spike.Algo.maxent.
TransferFunction
[source]¶ Bases:
object
defines the transfert functions for a given problem to be solved by MaxEnt data <– transform – image The experimental transfer function data – t_transform –> image The transpose of `transform’
- must implement
transform() // defined above t_transform() init_transform() norm // the norm of the transform operator, defined as ||tr(x)|| / ||x||
-
spike.Algo.maxent.
d_entropy_Gifa
(F, S, sc)[source]¶ compute the derivative of entropy of the ndarray a, given entropy S and scaling value sc Gifa : dS/dFi = -1/sc (S+log(Fi/A))
-
spike.Algo.maxent.
d_entropy_Skilling
(F, sc)[source]¶ compute the derivative of entropy of the ndarray a, given entropy S and scaling value sc Skilling : dS/dFi = -1/sc (1+log(Fi/A))
-
spike.Algo.maxent.
d_entropy_prior
(F, A)[source]¶ compute the derivative of entropy of the ndarray F, given a prior knowledge in ndarray A
-
spike.Algo.maxent.
entropy
(F, sc)[source]¶ compute the entropy of the ndarray a, given scaling value sc S = sum of -Z(i)*LOG(Z(i)), where Z(i) = a(i)/sc )
-
spike.Algo.maxent.
entropy_prior
(F, A)[source]¶ compute the entropy of the ndarray F, given a prior knowledge in ndarray A S = - sum [F(i) - A(i) + F(i)*LOG(F(i)/A(i)) ]
spike.Algo.rQRd module¶
rQRd.py¶
Algorithm for denoising time series, named rQRd (standing for random QR denoising)
main function is rQRd(data, rank) data : the series to be denoised rank : the rank of the analysis
Copyright (c) 2013 IGBMC and CNRS. All rights reserved.
Marc-Andr’e Delsuc <madelsuc@unistra.fr> Lionel Chiron <lionel.chiron@gmail.com>
This software is a computer program whose purpose is to compute rQRd denoising.
This software is governed by the CeCILL license under French law and abiding by the rules of distribution of free software. You can use, modify and/ or redistribute the software under the terms of the CeCILL license as circulated by CEA, CNRS and INRIA at the following URL “http://www.cecill.info”.
As a counterpart to the access to the source code and rights to copy, modify and redistribute granted by the license, users are provided only with a limited warranty and the software’s author, the holder of the economic rights, and the successive licensors have only limited liability.
In this respect, the user’s attention is drawn to the risks associated with loading, using, modifying and/or developing or reproducing the software by the user in light of its specific status of free software, that may mean that it is complicated to manipulate, and that also therefore means that it is reserved for developers and experienced professionals having in-depth computer knowledge. Users are therefore encouraged to load and test the software’s suitability as regards their requirements in conditions enabling the security of their systems and/or data to be ensured and, more generally, to use and operate it in the same conditions as regards security.
The fact that you are presently reading this means that you have had knowledge of the CeCILL license and that you accept its terms.
Created by Lionel Chiron and Marc-Andr’e on 2012-04-04. version 1.0
-
spike.Algo.rQRd.
dt2Hankel
(data, orda)[source]¶ constructs the Hankel H matrix from the data Build the matrix by sticking shifted column vectors.
-
spike.Algo.rQRd.
rQRd
(data, k, orda=None, iterations=1)[source]¶ rQRd algorithm. Name stands for random QR denoising. From a data series return a denoised series denoised data : the series to be denoised - a (normally complex) numpy buffer k : the rank of the analysis orda : is the order of the analysis
internally, a Hankel matrix (M,N) is constructed, with M = orda and N = len(data)-orda+1 if None (default) orda = (len(data)+1)/2
iterations : the number of time the operation should be repeated
values are such that orda <= (len(data)+1)/2 k < orda N = len(data)-orda+1 Omega is (N x k)
spike.Algo.sane module¶
sane.py
This version is GPU accelerated, using the cuda library. Speed-up ranges from 2 to 10 depending on board and system. Depending on the size of your inboard GPU memory, the size of the data-set may be limited
main function is sane(data, rank) data : the series to be denoised rank : the rank of the analysis
Copyright (c) 2015 IGBMC. All rights reserved. Marc-Andr’e Delsuc <madelsuc@unistra.fr> Lionel Chiron <lionel.chiron@gmail.com>
This software is a computer program whose purpose is to compute sane denoising.
This software is governed by the CeCILL license under French law and abiding by the rules of distribution of free software. You can use, modify and/ or redistribute the software under the terms of the CeCILL license as circulated by CEA, CNRS and INRIA at the following URL “http://www.cecill.info”.
As a counterpart to the access to the source code and rights to copy, modify and redistribute granted by the license, users are provided only with a limited warranty and the software’s author, the holder of the economic rights, and the successive licensors have only limited liability.
In this respect, the user’s attention is drawn to the risks associated with loading, using, modifying and/or developing or reproducing the software by the user in light of its specific status of free software, that may mean that it is complicated to manipulate, and that also therefore means that it is reserved for developers and experienced professionals having in-depth computer knowledge. Users are therefore encouraged to load and test the software’s suitability as regards their requirements in conditions enabling the security of their systems and/or data to be ensured and, more generally, to use and operate it in the same conditions as regards security.
The fact that you are presently reading this means that you have had knowledge of the CeCILL license and that you accept its terms.
First Created by Lionel Chiron and Marc-Andr’e on september 2015. CUDA implementation by Laura Duciel on March 2019
associated publications - Bray, F., Bouclon, J., Chiron, L., Witt, M., Delsuc, M.-A., & Rolando, C. (2017).
Nonuniform Sampling Acquisition of Two-Dimensional Fourier Transform Ion Cyclotron Resonance Mass Spectrometry for Increased Mass Resolution of Tandem Mass Spectrometry Precursor Ions. Analytical Chemistry, acs.analchem.7b01850. http://doi.org/10.1021/acs.analchem.7b01850
-
spike.Algo.sane.
Cu_FastHankel_prod_mat_vec
(gene_vect, prod_vect, k=0)[source]¶ Compute product of Hankel matrix (gene_vect) by vector prod_vect. H is not computed M is the length of the result k =0 initializes cuda for gene_vect
-
spike.Algo.sane.
Cu_FastHankel_prod_mat_vec_2dt
(prod_vect, k, Q)[source]¶ Compute product of Hankel matrix (gene_vect) by vector prod_vect. H is not computed M is the length of the result k =0 initializes cuda for gene_vect
-
spike.Algo.sane.
Cu_Fast_Hankel2dt
(Q, QH)[source]¶ returning to data from Q and QstarH Based on FastHankel_prod_mat_vec.
-
spike.Algo.sane.
Cu_vec_mean
(M, L)[source]¶ Vector for calculating the mean from the sum on the antidiagonal. data = vec_sum*vec_mean
-
spike.Algo.sane.
FastHankel_prod_mat_mat
(gene_vect, matrix)[source]¶ Fast Hankel structured matrix matrix product based on FastHankel_prod_mat_vec
-
spike.Algo.sane.
FastHankel_prod_mat_vec
(gene_vect, prod_vect, k=0)¶ Compute product of Hankel matrix (gene_vect) by vector prod_vect. H is not computed M is the length of the result
-
spike.Algo.sane.
Fast_Hankel2dt
(Q, QH)¶ returning to data from Q and QstarH Based on FastHankel_prod_mat_vec.
-
class
spike.Algo.sane.
OPTK
(fid, orda, prec=0.9, debug=False)[source]¶ Bases:
object
Class for finding the best rank for classical sane. The rank is calculated so as to permit the retrieving of the signal power at high enough level. Passed parameters are the Fid “fid”, the estimated number of lines “estim_nbpeaks” and the order “orda”
-
spike.Algo.sane.
SFastHankel_prod_mat_vec
(gene_vect, prod_vect, k=0)[source]¶ Compute product of Hankel matrix (gene_vect) by vector prod_vect. H is not computed M is the length of the result
-
spike.Algo.sane.
SFast_Hankel2dt
(Q, QH)[source]¶ returning to data from Q and QstarH Based on FastHankel_prod_mat_vec.
-
spike.Algo.sane.
sane
(data, k, orda=None, iterations=1, trick=True, optk=False, ktrick=False)[source]¶ sane algorithm. Name stands for Support Selection for Noise Elimination. From a data series return a denoised series denoised data : the series to be denoised - a (normally complex) numpy buffer k : the rank of the analysis orda : is the order of the analysis
internally, a Hankel matrix (M,N) is constructed, with M = orda and N = len(data)-orda+1 if None (default) orda = (len(data)+1)/2
iterations : the number of time the operation should be repeated optk : if set to True will calculate the rank giving the best recovery for an automatic estimated noise level. trick : permits to enhanced the denoising by using a cleaned signal as the projective space. “Support Selection” ktrick : if a value is given, it permits to change the rank on the second pass.
The idea is that for the first pass a rank large enough as to be used to compensate for the noise while for the second pass a lower rank can be used.
values are such that orda <= (len(data)+1)//2 k < orda N = len(data)-orda+1 Omega is (N x k)
Sane is based on the same idea than urQRd, however, there is a much clever selection of the basis on which the random projection is performed. This allows a much better recovery of small signals buried into the noise, compared to urQRd.
the flags trick, optk, ktrick control the program when all are false, sane folds back to urQRd algorithm. Optimal is certainly trick = True, optk = True, ktrick = True but not fully tested yet.
sane uses a new trick for performing a better denoising. A rank a little above the number of peaks as to be given. this permit to make a filtering matrix containing the signal quasi only after a first pass. On a second pass the full signal is projected on a new filtering subspace done from preceding denoising. A higher number of iterations will decrease even more the smallest amplitudes. ##########
-
spike.Algo.sane.
test_sane
(lendata=10000, rank=100, orda=4000, noise=200.0, iterations=1, noisetype='additive')[source]¶ ============== example of use of Sane on a synthetic data-set ===============
spike.Algo.sane_old module¶
sane.py¶
Algorithm for denoising time series, named sane (standing for “Support Selection for Noise Elimination”)
main function is sane(data, rank) data : the series to be denoised rank : the rank of the analysis
Copyright (c) 2015 IGBMC. All rights reserved. Marc-Andr’e Delsuc <madelsuc@unistra.fr> Lionel Chiron <lionel.chiron@gmail.com>
This software is a computer program whose purpose is to compute sane denoising.
This software is governed by the CeCILL license under French law and abiding by the rules of distribution of free software. You can use, modify and/ or redistribute the software under the terms of the CeCILL license as circulated by CEA, CNRS and INRIA at the following URL “http://www.cecill.info”.
As a counterpart to the access to the source code and rights to copy, modify and redistribute granted by the license, users are provided only with a limited warranty and the software’s author, the holder of the economic rights, and the successive licensors have only limited liability.
In this respect, the user’s attention is drawn to the risks associated with loading, using, modifying and/or developing or reproducing the software by the user in light of its specific status of free software, that may mean that it is complicated to manipulate, and that also therefore means that it is reserved for developers and experienced professionals having in-depth computer knowledge. Users are therefore encouraged to load and test the software’s suitability as regards their requirements in conditions enabling the security of their systems and/or data to be ensured and, more generally, to use and operate it in the same conditions as regards security.
The fact that you are presently reading this means that you have had knowledge of the CeCILL license and that you accept its terms.
First Created by Lionel Chiron and Marc-Andr’e on september 2015.
associated publications - Bray, F., Bouclon, J., Chiron, L., Witt, M., Delsuc, M.-A., & Rolando, C. (2017).
Nonuniform Sampling Acquisition of Two-Dimensional Fourier Transform Ion Cyclotron Resonance Mass Spectrometry for Increased Mass Resolution of Tandem Mass Spectrometry Precursor Ions. Analytical Chemistry, acs.analchem.7b01850. http://doi.org/10.1021/acs.analchem.7b01850
-
spike.Algo.sane_old.
FastHankel_prod_mat_mat
(gene_vect, matrix)[source]¶ Fast Hankel structured matrix matrix product based on FastHankel_prod_mat_vec
-
spike.Algo.sane_old.
FastHankel_prod_mat_vec
(gene_vect, prod_vect)[source]¶ Compute product of Hankel matrix (gene_vect) by vector prod_vect. H is not computed M is the length of the result
-
spike.Algo.sane_old.
Fast_Hankel2dt
(Q, QH)[source]¶ returning to data from Q and QstarH Based on FastHankel_prod_mat_vec.
-
class
spike.Algo.sane_old.
OPTK
(fid, orda, prec=0.9, debug=False)[source]¶ Bases:
object
Class for finding the best rank for classical sane. The rank is calculated so as to permit the retrieving of the signal power at high enough level. Passed parameters are the Fid “fid”, the estimated number of lines “estim_nbpeaks” and the order “orda”
-
spike.Algo.sane_old.
sane
(data, k, orda=None, iterations=1, trick=True, optk=False, ktrick=False)[source]¶ sane algorithm. Name stands for Support Selection for Noise Elimination. From a data series return a denoised series denoised data : the series to be denoised - a (normally complex) numpy buffer k : the rank of the analysis orda : is the order of the analysis
internally, a Hankel matrix (M,N) is constructed, with M = orda and N = len(data)-orda+1 if None (default) orda = (len(data)+1)/2
iterations : the number of time the operation should be repeated optk : if set to True will calculate the rank giving the best recovery for an automatic estimated noise level. trick : permits to enhanced the denoising by using a cleaned signal as the projective space. “Support Selection” ktrick : if a value is given, it permits to change the rank on the second pass.
The idea is that for the first pass a rank large enough as to be used to compensate for the noise while for the second pass a lower rank can be used.
values are such that orda <= (len(data)+1)//2 k < orda N = len(data)-orda+1 Omega is (N x k)
Sane is based on the same idea than urQRd, however, there is a much clever selection of the basis on which the random projection is performed. This allows a much better recovery of small signals buried into the noise, compared to urQRd.
the flags trick, optk, ktrick control the program when all are false, sane folds back to urQRd algorithm. Optimal is certainly trick = True, optk = True, ktrick = True but not fully tested yet.
sane uses a new trick for performing a better denoising. A rank a little above the number of peaks as to be given. this permit to make a filtering matrix containing the signal quasi only after a first pass. On a second pass the full signal is projected on a new filtering subspace done from preceding denoising. A higher number of iterations will decrease even more the smallest amplitudes. ##########
-
spike.Algo.sane_old.
test_sane
(lendata=10000, rank=100, orda=4000, noise=200.0, iterations=1, noisetype='additive')[source]¶ ============== example of use of Sane on a synthetic data-set ===============
spike.Algo.savitzky_golay module¶
Smooth (and optionally differentiate) data with a Savitzky-Golay filter.
- CODE from http://www.scipy.org/Cookbook/SavitzkyGolay
adapted by M-A Delsuc, august 2011
-
spike.Algo.savitzky_golay.
savitzky_golay
(y, window_size, order, deriv=0)[source]¶ Smooth (and optionally differentiate) data with a Savitzky-Golay filter. The Savitzky-Golay filter removes high frequency noise from data. It has the advantage of preserving the original shape and features of the signal better than other types of filtering approaches, such as moving averages techniques. Parameters ———- y : array_like, shape (N,)
the values of the time history of the signal.
- window_sizeint
the length of the window. Must be an odd integer number.
- orderint
the order of the polynomial used in the filtering. Must be less than window_size - 1.
- deriv: int
the order of the derivative to compute (default = 0 means only smoothing)
- ysndarray, shape (N)
the smoothed signal (or it’s n-th derivative).
The Savitzky-Golay is a type of low-pass filter, particularly suited for smoothing noisy data. The main idea behind this approach is to make for each point a least-square fit with a polynomial of high order over a odd-sized window centered at the point. Examples ——– t = np.linspace(-4, 4, 500) y = np.exp( -t**2 ) + np.random.normal(0, 0.05, t.shape) ysg = savitzky_golay(y, window_size=31, order=4) import matplotlib.pyplot as plt plt.plot(t, y, label=’Noisy signal’) plt.plot(t, np.exp(-t**2), ‘k’, lw=1.5, label=’Original signal’) plt.plot(t, ysg, ‘r’, label=’Filtered signal’) plt.legend() plt.show() References ———- .. [1] A. Savitzky, M. J. E. Golay, Smoothing and Differentiation of
Data by Simplified Least Squares Procedures. Analytical Chemistry, 1964, 36 (8), pp 1627-1639.
- 2
Numerical Recipes 3rd Edition: The Art of Scientific Computing W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery Cambridge University Press ISBN-13: 9780521880688
-
spike.Algo.savitzky_golay.
savitzky_golay2D
(z, window_size, order, derivative=None)[source]¶ realises Savitzky-Golay smoothing in 2D. see savitzky_golay() for more information
-
spike.Algo.savitzky_golay.
sgolay_coef
(window_size, order, deriv=0)[source]¶ compute savistki-golay coefficients
spike.Algo.urQRd module¶
urQRd.py¶
Algorithm for denoising time series, named urQRd (standing for “uncoiled random QR denoising”) main function is urQRd(data, rank) data : the series to be denoised rank : the rank of the analysis
Copyright (c) 2013 IGBMC. All rights reserved.
Marc-Andr’e Delsuc <madelsuc@unistra.fr> Lionel Chiron <lionel.chiron@gmail.com>
This software is a computer program whose purpose is to compute urQRd denoising.
This software is governed by the CeCILL license under French law and abiding by the rules of distribution of free software. You can use, modify and/ or redistribute the software under the terms of the CeCILL license as circulated by CEA, CNRS and INRIA at the following URL “http://www.cecill.info”.
As a counterpart to the access to the source code and rights to copy, modify and redistribute granted by the license, users are provided only with a limited warranty and the software’s author, the holder of the economic rights, and the successive licensors have only limited liability.
In this respect, the user’s attention is drawn to the risks associated with loading, using, modifying and/or developing or reproducing the software by the user in light of its specific status of free software, that may mean that it is complicated to manipulate, and that also therefore means that it is reserved for developers and experienced professionals having in-depth computer knowledge. Users are therefore encouraged to load and test the software’s suitability as regards their requirements in conditions enabling the security of their systems and/or data to be ensured and, more generally, to use and operate it in the same conditions as regards security.
The fact that you are presently reading this means that you have had knowledge of the CeCILL license and that you accept its terms.
Created by Lionel Chiron and Marc-Andr’e on 2013-10-13.
version 2.0 28/oct/2013
-
spike.Algo.urQRd.
FastHankel_prod_mat_mat
(gene_vect, matrix)[source]¶ Fast Hankel structured matrix matrix product based on FastHankel_prod_mat_vec
-
spike.Algo.urQRd.
FastHankel_prod_mat_vec
(gene_vect, prod_vect)[source]¶ Compute product of Hankel matrix (gene_vect) by vector prod_vect. H is not computed M is the length of the result
-
spike.Algo.urQRd.
Fast_Hankel2dt
(Q, QH)[source]¶ returning to data from Q and QstarH Based on FastHankel_prod_mat_vec.
-
spike.Algo.urQRd.
test_urQRd
(lendata=10000, rank=100, orda=4000, noise=200.0, iterations=1, noisetype='additive')[source]¶ ============== example of use of urQRd on a synthetic data-set ===============
-
spike.Algo.urQRd.
urQRd
(data, k, orda=None, iterations=1)[source]¶ urQRd algorithm. Name stands for uncoiled random QR denoising. From a data series return a denoised series denoised data : the series to be denoised - a (normally complex) numpy buffer k : the rank of the analysis orda : is the order of the analysis
internally, a Hankel matrix (M,N) is constructed, with M = orda and N = len(data)-orda+1 if None (default) orda = (len(data)+1)/2
iterations : the number of time the operation should be repeated
values are such that orda <= (len(data)+1)/2 k < orda N = len(data)-orda+1 Omega is (N x k)