MATLAB
Codes and PSP Manual
The
Matlab codes of PSP consist of two MATLAB functions: psp
and optsetpsp. Commented source files can be
downloaded:
Provided
below is the manual of psp. We believe they are
sufficiently detailed for using the algorithm in practice.
psp
Find data patterns a
computational model can possibly generate. The PSP (Parameter Space
Partitioning) algorithm finds discrete data patterns the given model could
produce by identifying their corresponding regions in the unknown partition of
the model's parameter space.
Syntax
Y = psp(model,x0,xbound)
Y = psp(model,x0,xbound,options)
Y = psp(model,x0,xbound,options,p1,p2,...)
[Y,xmcv] = psp(...)
[Y,xmcv,optjmp] = psp(...)
[Y,xmcv,optjmp,stime] = psp(...)
Description
Y = psp(model,x0,xbound) searches the parameter space
defined by xbound for the data patterns of model, with starting point(s), x0, and
returns discovered data patterns in Y.
Y = psp(model,x0,xbound,options) runs with the default search
options replaced by values in the structure options.
Use optsetpsp to set these options.
Y =
psp(model,x0,xbound,options,p1,p2,...) passes the problem-dependent parameters p1,p2,... directly to the function model.
In psp, model is
evaluated in the form: feval(model,x,p1,p2,...). Use options=[]
as a place holder if no options are set.
[Y,xmcv] = psp(...) returns information about the
partition of parameter space in a cell array xmcv.
[Y,xmcv,optjmp] = psp(...) returns the results of the
adaptation (i.e., optimal size of jumping distribution) of the Markov chain in
each discovered region.
[Y,xmcv,optjmp,stime] = psp(...) returns the information of elapsed
search time before finding each data pattern.
Input Arguments
model The function
whose input space is to be searched for its output patterns. model is a MATLAB
function that accepts parameter values in a column vector x and returns
a pattern of data in a row vector y. The function model can be
specified as a function handle for an M-file function:
Y = psp(@myfun,x0,xbound)
where
myfun is a MATLAB function such as
function y = myfun(x)
y
= ... % Compute a data pattern at x
The function must always return a
row vector of the same length for any given parameter point x ranged within xbound.
x0,xbound x0 must be an NDIM-by-M
matrix where NDIM is the number of model parameters and M is the number
of starting points--the user may supply multiple starting points. xbound is an NDIM-by-2
matrix whose first column sets the lower bounds of parameters, and second
column, the upper bounds. xbound must be finite (i.e., Inf or -Inf
is not allowed). In case some unconstrained parameters are unavoidable (i.e.,
plausible data patterns could be generated from their extreme values), it is
advisable to reparameterize model, making use of log, inverse logistic, or other
transformation function.
Options Search options used in psp are MaxPSP,
IniJump, SmpSize, VolEst, VolSmpSize, Display, and
DispInterval. For
details, see Options below.
Output Arguments
Y Y stores discovered data patterns in
rows.
xmcv xmcv is a 1-by-4 cell array containing the information about the
resultant partition of the algorithm run. Each cell element contains:
xmcv {1} First element of xmcv is an NDIM-by-N matrix that consists of N
column vectors of parameter values that can generate each of N discovered
data patterns.
xmcv {2} Second
element is a an NDIM-by-N matrix that contains N column
vectors of mean parameter values, each corresponding to each of the regions
identified in the partition.
xmcv {3} Third
element is a 3-dimensional matrix in which the covariance matrix of parameter
values over the uniform distribution on each region stacks up along the 3rd
dimension.
xmcv {4} fourth
element is a row vector of approximated volumes of the discovered regions on a
log scale.
optjmp A 1-by-N
vector optjmp stores the
results of adapting the Markov chain in each discovered region. The size of
adapted jumping distribution (radius of a sphere) for the i-th region can
be retrieved by IniJump*2^optjmp(i).
stime stime is a 2-by-(N+1)
matrix whose first row lists (cumulative) time points at which each of N
data patterns is discovered, and the time of terminating the algorithm in the
last column. The second row lists basically the same information except that
the unit is the number of search trials made before finding each new pattern.
Options
Search options used by psp. You can use optsetpsp to set or change the values of these fields in the
options structure options. Each
of them is explained below:
MaxPSP Maximum number of search cycles allowed before termination.
The psp search stops when the number of
search cycles of every Markov chain (in every discovered region) reaches MaxPSP. A cycle consists of a certain number of search trials
(i.e., MCMC sample points). The cycle size is set by the 2nd element of SmpSize (see below). The counting of cycles does not begin until
the adaptive stage of every Markov chain enters the stage of a regular MCMC
sampling. The default value is 6. Setting MaxPSP to a low value may cause an
incomplete search or unreliable information on the partition of parameter
space.
IniJump Initial size of jumping distribution with which the Markov chain in a newly found region starts. The jumping distribution of the MCMC process is a uniform density over a hyper-sphere. IniJump specifies the initial value for the sphere's radius. The regular MCMC sampling process in each discovered region does not begin until the size of jumping distribution is adapted so that it meets a certain rate of sample acceptance. IniJump does not depend on the scale of parameters because the range of every parameter is rescaled to [0 1] in psp. Even so, IniJump is problem-dependent and should affect the efficiency of psp. The default value is 0.1.
SmpSize Sizes of MCMC samples used to adapt the Markov chain of each
discovered region. The adaptation is performed in two stages each of which
consists of a few sampling cycles. SmpSize specifies the size of MCMC sample
in one cycle. SmpSize is a row vector of two integers,
e.g., [smpsz1 smpsz2], where smpsz1 defines the size of one cycle in the Level 1 stage, and smpsz2 in the Level 2 stage. The former must be smaller than the
latter (e.g., smpsz1 is half of smpsz2) since the Level 1 adaptation is designed to be fast but
coarse-grained whereas the Level 2 adaptation to be slow but fine-grained. smpsz2 also serves as the size of regular MCMC cycles following
the adaptation, which are counted to determine the termination of psp (see MaxPSP above). The default setting in psp is ceil([100 200]*1.2^NDIM).
VolEst Method of volume estimation of
dicovered regions.
‘Ellipsoid’ (default) estimates the volume through ellipsoid
approximation using eigen decomposition of the estimated covariance matrix from
each region's MCMC sample. Although this method does not require additional
computing time, it causes a bias toward overestimation because an ellipsoid
fills up the, if any, concavity of the region.
‘HitMiss’ uses a methodinspired by the hit-or-miss Monte Carlo
integration, which corrects the bias due to ellipsoid approximation. This
method is the same as the regular hit-or-miss integration method except that
the estimated ellipsoid is employed as a base sampling domain, which may
exclude some portion of the region and may thus result in underestimation. This
method, however, will give better estimates in general. Additional computing
time is required after the completion of search process.
VolSmpSize Size
of a sample used to estimate a region's volume when VolEst is set as 'HitMiss.'
Default is ceil(500*1.2^NDIM).
Display Level of display.
'off' displays no output;
'final' displays just the final output (number of patterns found
& elapsed time);
'min' displays the region index each time a new data
pattern is discovered;
'all' (default) displays all the information regarding the
status of search and MCMC adaptation.
DispInterval Length of intervals (in search
trials) at which psp displays elapsed time while no new
pattern is found. By default, psp does so every 30,000 search trials.
optsetpsp
Creates a psp search options structure options
in which the named parameters have the specified values.
Syntax
options =
optsetpsp('param1',value1,'param2',value2,...)
Notes
Any unspecified
parameters are set to [] (parameters with value [] indicate to use the default value
for that parameter when options is passed to psp). It is sufficient to type only the leading characters that
uniquely identify the parameter. Case is ignored for parameter names. See psp above for details of search options.