jLab Help Files

 POLYSMOOTH  Mapping using local polynomial fitting, also known as loess.  
 
    POLYSMOOTH generates a map from scattered data in two dimensions using
    a locally weighted least squares fit to a polynomial.
 
    This method is variously known as local polynomial fitting, local
    polynomial smoothing, multivariate locally weighted least squares 
    regression, lowess (originally for LOcally WEighted Scatterplot
    Smoothing), and loess.  All of these are essentially synonyms.
 
    POLYSMOOTH has the support for all of the following options:
 
        --- Cartesian or spherical geometry
        --- a constant, linear, or quadratic fit in space
        --- an additional linear or quadartic fit in time
        --- fixed-bandwidth or fixed population algorithms
        --- prescribed spatially-varying bandwidth or population
        --- multiple choices of weighting function or kernel
        --- additional datapoint weighting factors, e.g. by confidence
        --- the median-based robustification of Cleveland (1979)
 
    POLYSMOOTH is implemented using a numerically efficient algorithm that
    avoids using explicit loops. The data are pre-sorted so that different
    mapping parameters can be tried out at little computational expense.
 
    For algorithm details, see Lilly and Lagerloef (2018).
    __________________________________________________________________
 
    Local polynomial fitting on the plane
 
    Let's say we have an array Z of data is at locations X,Y.  X,Y, and Z
    can be arrays of any size provided they are all the same size.
 
    The problem is to obtain a mapped field ZHAT on some regular grid 
    specified by the vectors XO and YO.
 
    Calling POLYSMOOTH is a two-step process: 
 
        [DS,XS,YS,ZS]=TWODSORT(X,Y,Z,XO,YO,CUTOFF);
        ZHAT=POLYSMOOTH(DS,XS,YS,[],ZS,[],RHO,H);
 
    The empty arrays mark locations of optional arguments described later.
 
    In the first step, one calls TWODSORT which returns ZS, a 3D array of 
    data values at each grid point, sorted by increasing distance DS, and 
    the corresponding positions XS and YS.  See TWODSORT for more details.
   
    CUTOFF determines the maximum distance included in the sorting and 
    should be chosen to be greater than H.  
 
    In the second step, POLYSMOOTH fits a RHOth order spatial polynomial at 
    each gridpoint within a neighborhood specified by the "bandwidth" H.
 
    The fit is found by minimizing the weighted mean squared error between 
    the fitted surface and the observations.  The bandwidth H sets the 
    decay of this weighting function, described in more detail shortly.
 
    The fit order RHO may be chosen as RHO=0 (fit to a constant), RHO=1
    (fit to a plane), or else RHO=2 (fit to a parabolic surface). 
 
    The data locations (X,Y) and grid point locations (X0,Y0) shoud have
    the same units as the bandwidth H (e.g., kilometers).
 
    Note that H may either be a scalar, or a matrix of size M x N to
    specify an imposed spatially-varying bandwidth. 
 
    The dimensions of XO and YO are M x N x J, where J is the maximum
    number of data points within bandwidth cutoff at any grid point 
    location.  Then ZHAT is matrix of dimension M x N.
    __________________________________________________________________
 
    Choice of weighting function or kernel
 
    POLYSMOOTH(DS,XS,YS,[],ZS,[],RHO,{H,KERN}) weights the data points in
    the vicinity of each grid point by some decaying function of distance 
    called the kernel, specified by KERN. 
 
         KERN = 'uni' uses the uniform kernel, K=1
         KERN = 'epa' uses the Epanechnikov (parabolic) kernel K=1-(DS/H)^2
         KERN = 'bis' uses the bisquare kernel K=(1-(DS/H)^2)^2
         KERN = 'tri' uses the tricubic kernel K=(1-(DS/H)^3)^3
         KERN = 'gau' used the Gaussian kernel, K=EXP(-1/2*(3*DS/H)^2)
 
    Note that all choices of weighting function are set to vanish for DS>H.
 
    The default behavior is STR='gau' for the Gaussian kernel, which is 
    specified to have a standard deviation of H/3.
 
    KERN may also be an integer, in which the standard deviation of the 
    Gaussian is set to H/KERN, with the default corresponding to KERN = 3.
 
    KERN may also be a vector (of length greater than one) defining a 
    custom kernel, with KERN(1) corresponding to DS/H=0 and KERN(end) to 
    DS/H=1.  Kernel values at any distance are then linearly interpolated.
    __________________________________________________________________
 
    Inclusion of temporal variability
 
    It may be that the data contributing to the map is taken at different
    times, and that in constructing the map it is important to take into
    account temporal variability of the underlying field.
 
    In this case, data points with values Z are taken at locations X,Y and
    also at times T. POLYSMOOTH is then called as follows:
 
        [DS,XS,YS,TS,ZS]=TWODSORT(X,Y,T,Z,XO,YO,CUTOFF);
        ZHAT=POLYSMOOTH(DS,XS,YS,TS,ZS,[],[RHO MU],H,TAU);
 
    which will fit the data to the sum of spatial polynomial of bandwidth H
    and order RHO, and a temporal polynomial of bandwidth TAU and order MU.
 
    TAU, like H, may either be a scalar or an M x N matrix.  Its units
    should be the same as those of the times T.  
 
    Note that the times T should be given relative to the center of the 
    time window, that is, time T=0 should correspond to the time at which
    you wish to construct the map. 
    
    By default the Gaussian kernal is used in time.  One can also employ
    a cell array, as in POLYSMOOTH(..,[RHO MU],H,{TAU,KERN}), to specify 
    other behaviors for the time kernel, as described above.
    __________________________________________________________________
 
    Additional output arguments
 
    [ZHAT,BETA,AUX]=POLYSMOOTH(...) returns two additional arguments.
 
    BETA contains the estimated field, the same as ZHAT, together with 
    estimates of all spatial derivatives of the field up to RHOth order:
  
         BETA(:,:,1) = ZHAT     --- The field z
         BETA(:,:,2) = ZXHAT    --- The first derivative dz/dx
         BETA(:,:,3) = ZYHAT    --- The first derivative dz/dy
         BETA(:,:,4) = ZXXHAT   --- The second derivative d^2z/dx^2
         BETA(:,:,5) = ZXYHAT   --- The second derivative d^2z/dxdy 
         BETA(:,:,6) = ZYYHAT   --- The second derivative d^2z/dy^2
 
    The length of the third dimension of BETA is set by the total number of
    derivatives of order RHO or less.  This number, called Q, is equal to
    Q = 1, 3, and 6 for RHO = 0, 1, and 2 respectively. 
 
    After these, in the case that TS is input, the estimated time 
    derivatives are returned up to order MU:
 
        BETA(:,:,Q+1) = ZT   --- The first time derivative dz/dt
        BETA(:,:,Q+2) = ZTT  --- The second time derivative d^2z/dt^2
    
    AUX is an M x N x 5 array of auxiliary fields associated with the fit.
 
         AUX(:,:,1) = P  --- The total number of datapoints employed
         AUX(:,:,2) = H  --- The bandwidth used at each point
         AUX(:,:,3) = E  --- The rms error between the data and the fit
         AUX(:,:,4) = W  --- The total weight used 
         AUX(:,:,5) = R  --- The weighted mean distance to the data points
         AUX(:,:,6) = V  --- The weighted standard deviation of data values
         AUX(:,:,7) = C  --- The matrix condition number
 
    P, called the population, is the total number of data points within one
    bandwidth distance H from each of the (M,N) grid points. 
 
    The root-mean-squared error E and standard deviation V are both 
    computed using the same weighted kernal applied to the data.
 
    The condition number C arises because a matrix must be inverted at each
    (M,N) location for the RHO=1 or RHO=2 fits. C is equal to 1 for RHO=0. 
 
    C is computed by COND.  At (M,N) points where C is large, the least 
    squares solution is unstable, and one should consider using a lower-
    order fit RHO or a larger value of the bandwidth H.
    __________________________________________________________________
 
    Fixed population
 
    POLYSMOOTH(DS,XS,YS,[],ZS,[],RHO,P,'population') varies the spatial 
    bandwidth H to be just large enough at each grid point to encompass P 
    points. This is referred to here as the "fixed population" algorithm.
 
    Note that the argument P relaces the bandwidth H.  So, if one chooses 
    to specify a kernel, one use POLYSMOOTH(...,RHO,{P,KERN},'population'). 
 
    When employed with the option for including a temporal fit, the fixed
    population algorithm only applies to the spatial kernel.  The temporal 
    kernel remains specified in terms of a temporal bandwidth. 
 
    The fixed population algorithm can give good results when the data
    spacing is uneven, particularly when used with a higher-order fit.
 
    When using this method, the length of the third dimension of the fields
    output by TWODSORT or SPHERESORT must be at least P. If it is greater 
    than P, it may be truncated to exactly P, thus reducing the size of 
    those fields and speeding up calculations.  This is done internally,
    and also can be done externally by calling POLYSMOOTH_PRESORT.
    __________________________________________________________________
 
    Weighted data points
 
    POLYSMOOTH can incorporate an additional weighting factor on the data
    points. Let W be an array of positive values the same size as the data 
    array Z.  One may form a map incorporating these weights as follows:
 
        [DS,XS,YS,ZS,WS]=TWODSORT(X,Y,Z,W,XO,YO,CUTOFF);           
        ZHAT=POLYSMOOTH(DS,XS,YS,[],ZS,WS,RHO,H);
 
    The weights W could represent the confidence in the measurement values,
    or an aggregation of invididual measurements into clusters.  The latter 
    approach may be used to condense large datasets to a managable size.
    __________________________________________________________________
 
    Smoothing on the sphere
 
    POLYSMOOTH supports a local polynomial fit on the sphere, as described
    in Lilly and Lagerloef (2018).  As before this is a two-step process:
 
        [DS,XS,YS,ZS]=SPHERESORT(LAT,LON,Z,LATO,LONO,CUTOFF);
        ZHAT=POLYSMOOTH(DS,XS,YS,[],ZS,[],RHO,H);
 
    The only different is that one firstly one calls SPHERESORT, the
    analogue of TWODSORT for the sphere.  See SPHERESORT for more details.
 
    The bandwidth in this case should have units of kilometers. 
 
    Note that SPHERESORT and POLYSMOOTH both assume the sphere to be the 
    radius of the earth, as specified by the function RADEARTH.
 
    The derivatives appearing in BETA are now the derivatives given in a
    local tangent plane.  These can be converted into derivatives in terms
    of latitude and longitude following Lilly and Lagerloef (2018).
    _________________________________________________________________
 
    One grid, many fields
 
    It is often the case that the field to be mapped, Z, consists of many 
    repeated sets of observations at the same (X,Y) points. 
 
    For example, X and Y could be mark the locations of measurements that
    are repeated at different times (as in satellite altimetry), or else
    there could be multiple fields Z that are measured simultaneously.  
 
    There is a simple way to handle this situation without needing to 
    resort the grid.  First one calls TWODSORT or SPHERESORT as follows:
 
      [DS,XS,YS,INDEX]=TWODSORT(X,Y,XO,YO,CUTOFF);
      --- or ---
      [DS,XS,YS,INDEX]=SPHERESORT(LAT,LON,LATO,LONO,CUTOFF);
 
    INDEX is now an index into the sorted datapoint locations, such that
 
       ZS=POLYSMOOTH_INDEX(SIZE(DS),INDEX,K);
 
    returns sorted values of Z that can be passed to POLYSMOOTH.  
   
    The virtue of this approach is that one only has to call TWODSORT or 
    SPHERESORT once, no matter how many variable are to be mapped.
    __________________________________________________________________
 
    Robustification 
 
    ZHAT=POLYSMOOTH(...,'robust',NI) implements the median-based iterative 
    robust algorithm of Cleveland (1979), p 830--831, using NI iterations.  
 
    This can be useful when outliers are present in the data.   Typically
    a single iteration is sufficient to remove most outliers. 
 
    ZHAT will in this case have NI+1 entries along its third dimension.
    The iterative estimates are stored in reserve order, with the last 
    iteration in ZHAT(:,:,1) and the original estimate in ZHAT(:,:,NI+1).
    __________________________________________________________________
 
    Parallelization
 
    POLYSMOOTH(...,'parallel') parallelizes the computation using a PARFOR
    loop, by operating on each latitude (or matrix row) separately.   This 
    requires that Matlab's Parallel Computing toolbox be installed.  
 
    POLYSMOOTH will then using an existing parallel pool, or if one does 
    not exist, a pool will be created using all availabale workers.
 
    POLYSMOOTH(...'parallel',Nworkers) alternately specifies the number of
    workers to use. If you run into memory constraints, reduce Nworkers.
 
    If you are working on multiple maps simultaneously, depending on the 
    size of your problem, it may be faster to use an exterior PARFOR loop,
    rather than calling POLYSMOOTH with the 'parallel' flag. 
    __________________________________________________________________
 
    Quiet option
 
    By default, POLYSMOOTH displays a status message saying what row it is 
    working on.  POLYSMOOTH(...,'quiet') suppresses this message.
    ___________________________________________________
 
    'polysmooth --t' runs some tests.
    'polysmooth --f' generates some sample figures.
 
    Usage:  [ds,xs,ys,zs]=twodsort(x,y,z,xo,yo,cutoff);  
            zhat=polysmooth(ds,xs,ys,[],zs,[],rho,H);
            [zhat,beta,aux]=polysmooth(ds,xs,ys,[],zs,[],rho,H);
    --or--
            [ds,xs,ys,zs,ws]=spheresort(lat,lon,z,w,lato,lono,cutoff); 
            [zhat,beta,aux]=polysmooth(ds,xs,ys,[],zs,[],rho,H);
    __________________________________________________________________
    This is part of JLAB --- type 'help jlab' for more information
    (C) 2008--2020 J.M. Lilly --- type 'help jlab_license' for details
POLYSMOOTH is the jMap module of jLab.

contents | allhelp | index