function group_pval=group_pvt_alpha(D) % group_pvt_alpha returns the p-value of a group of Poisson % variability tests on a group of spike count data sets, % using the (grouping) test described in A. Amarasingham (2004; PhD thesis; Section 2.2.4) % D is assumed to be an Mx3 matrix, where there are M data sets, % with the following entries: % (i,1)= number of trials (n) in data set i % (i,2)= total number of spikes across all trials (N) in data set i % (i,3)= sum of squares of number of spikes across all trials (S) in data set i % % Individual tests are evaluated using standard 5% significance levels % (i.e., \alpha=.05, specified in first line of code, see sec 2.2.4) % % Note that the code here relies among other things % on extensive, exact computation (by dynamic programming) of the % Poisson variability test, and is computationally intensive. % (No attempt is made here to speed up the code, for example by using % Monte Carlo or chi square approximations, et cetera). As a consequence, % it will not be feasible to use this code with all data sets. % % 2005, by Asohan Amarasingham, send comments to asohan@dam.brown.edu alpha=0.05; % sets alpha value (significance level..) if size(D,2) ~= 3, 'error: input matrix has wrong dimensions', return, end f_values=zeros(size(D,1),2); % f_values(i,1) will contain the maximal value of sum of squares (S) needed % for rejecting H_0, for data set i (Poisson var. test) % f_values(i,2) will contain the r^* value ("actual alpha") of Amarasingham (2004) % get f_values: for cc=1:size(D,1) [min_bound,actual_alpha]=f( D(cc,1),alpha,D(cc,2) ); f_values(cc,1)=min_bound; f_values(cc,2)=actual_alpha; end % get group pvalue % from f_values(:,2) we want to compute distribution of sum of M % (inhomogeneous) bernoullis with those probabilities % and calculate distribution of Y=sum x_i, where x_i's are independent % Bernoulli r.v.'s with probabilities f_values(i,2) dist=zeros( size(D,1)+1,1 ); dist(1)=1; % index to dist shifted by 1 for i=1:size(D,1) new_dist=zeros( length(dist),1 ); new_dist(1)=dist(1)*( 1 - f_values(i,2) ); for j=2:i+1 new_dist(j) = dist(j-1)* f_values(i,2) + dist(j)*(1 - f_values(i,2) ); end dist=new_dist; end % get number of actual of rejections num_rej = sum( D(:,3) <= f_values(:,1) ); % now compute p-value group_pval = sum( dist(num_rej+1:length(dist)) ); return %****************************************************************************************** function [min_bound,actual_alpha] = f(n,alpha,num_spikes,max_bound) % f returns [min_bound alpha_actual] % f(n,alpha,num_spikes,max_bound) returns the maximal value of \sum_{i=1}^n which will reject the % null hypothesis of the Poisson variability test % min_bound is the maximal values (of sum of squares) % alpha_actual is the pvalue associated with the maximal value (the actual alpha) % if % n = number of trials % alpha = significance of test % num_spikes = the total number of spikes across all trials (\sum_{i=1}^n m_i) % max_bound = a guaranteed upper bound on f, leave out if there is no such bound.. % multprob(N,m,r) -- Computes the multinomial probability P( \sum_{i=1}^m X_i^2 <= r ) % where X_1, X_2, ..., X_m, are distributed multinomially as ~ M( N; 1/m, 1/m, ..., 1/m ) % use dynamic programming % prob is the probability, prob_up an upper bound on numerical accuracy.. % need max { r: multprob(mu*n,n,r) <= alpha } min_bound=floor(n*(num_spikes/n)^2); fl=floor(num_spikes/n); min_bound= ( n*( fl+1 ) - num_spikes )*( fl^2 ) + ( num_spikes - n*fl )*( fl+1 )^2; % see thesis for derivation of min_bound: the most reliable outcome.. %max_bound=n^2*mu^2; this is the true max_bound if isempty(who('max_bound')) %%%% THIS IS THE WRONG WAY TO DO IT.. START SMALL & EXPAND in powers of 2, say.. SINCE small is faster if multprob(num_spikes,n,min_bound)>alpha actual_alpha=0; return end % start small and expand (in powers of 2).. max_bound=min_bound+1; while 1 if multprob(num_spikes,n,max_bound)<=alpha max_bound=(max_bound-min_bound)*2 + min_bound; else break end end end % now have a max_bound, whittle down.. while max_bound ~= min_bound j=ceil((max_bound-min_bound)/2 + min_bound); pr=multprob(num_spikes,n,j); if pr > alpha max_bound=j-1; else min_bound=j; end end actual_alpha=multprob(num_spikes,n,min_bound); return %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% function [prob] = multprob(N,m,r,clock_max) % multprob(N,m,r) -- Computes the multinomial probability P( \sum_{i=1}^m X_i^2 <= r ) % where X_1, X_2, ..., X_m, are distributed multinomially as ~ M( N; 1/m, 1/m, ..., 1/m ) % use dynamic programming % prob is the probability, prob_up an upper bound on numerical accuracy.. % idea is to compute \sum_{n : \sum n_i=N, \sum n_i^2 <= r } multchoose(N;n_1,..,n_m)*(1/m)^n % using dynamic programming in expanded space (s_j,t_j) where s_j=\sum^j n_i, t_j=\sum^j n_i^2 % (see thesis of A. Amarasingham).. if m==1 % if m=1 then then sum of squares=N (w.p.1) if r>=N prob=1; else prob=0; end return end if m==2 % if m=2, compute from binomial probability % for m=2, note x_1^2 + x_2^2 <= k is equivalent to n/2-k' <= x_1 <= n/2+k', for some k' % i.e. find smallest integer c s.t. (x+c)^2 + (N- (x+c) )^2 > r x=N/2; if round(N/2)~=N/2, j=.5; else, j=0, end for c=0:ceil(1+N/2) if (x+j+c)^2 + (N - (x+j+c))^2 > r break end end if c==0 prob=0; else c=c-1; leftprob=binocdf(x+c+j,N,.5); if x-c-j-1<0 rightprob=0; else rightprob=binocdf(x-c-j-1,N,.5); end prob=leftprob-rightprob; end return end % initial loop.. alpha=sparse(N+1,r+1); % index is shifted by 1 to accomodate 0 indices.. for s1=0:N if s1^2 > r break else for s2=s1:N if s1^2 + (s2-s1)^2 > r % ie if t2>r break else t2=s1^2 + (s2-s1)^2; alpha(s2+1,t2+1) = alpha(s2+1,t2+1) + coeff( s2,s1,1/m,s2 ); end end end end % induction loop for t=2:m-2 beta=sparse(N+1,r+1); % again, index shifted by 1.. beta matrix % calculated by alpha.. % find non-empty values of alpha matrix.. % notation in mind: (sm,tm) is (s_m,t_m).. (sm1,tm1) is (s_{m+1},t_{m+1}).. [i,j]=find(alpha); for cc=1:length(i) sm=i(cc)-1; tm=j(cc)-1; % remember the index shift.. for sm1=sm:N tm1=tm + (sm1-sm)^2; if tm1>r break else beta( sm1+1,tm1+1 ) = beta(sm1+1,tm1+1) + coeff(sm1,sm,1/m,sm1-sm)*alpha( i(cc),j(cc) ); end end end alpha=beta; end % final loop beta=sparse(N+1,r+1); [i,j]=find(alpha); for cc=1:length(i) sm=i(cc)-1; tm=j(cc)-1; % remember the index shift.. tm1=tm + (N-sm)^2; if tm1 <= r beta( N+1,tm1+1 )= beta(N+1,tm1+1) + coeff(N,sm,1/m,N-sm)*alpha( i(cc),j(cc) ); end end % sum out t_m and you're done prob=sum(sum(beta)); prob=full(prob); return %-------------------------------- function val = coeff(N,k,p,P) % computes the coefficient nchoosek(N,k)*(p)^P.. % (using log gamma for more efficient numerics..) % use gammaln to keep numbers of same order, then exponentiate.. val = exp( gammaln(N+1)-gammaln(k+1)-gammaln(N-k+1) + P*log(p) ); return