Recurrence Motifs Probability Distributions
RecurrenceMicrostatesAnalysis.jl
aims to be a user-friendly library with powerful capabilities. It can be used through simple function calls or more advanced configurations that offer greater control. We will begin with the simpler usage, explaining its arguments and settings, and gradually move toward more complex configurations throughout this discussion.
One-dimensional data
This section presents a run similar to the one shown on the quick start page, but with a more detailed explanation. For one-dimensional problems, such as the logistic map or the generalized Bernoulli shift map (Beta-X), you can use a vector of positions along the trajectory as input. To illustrate this, let's consider a uniform distribution:
julia> using Distributions
julia> data = rand(Uniform(0, 1), 3000)
3000-element Vector{Float64}: 0.3582584346421953 0.15937950287511737 0.08538841252367568 0.9736158016416954 0.10003297805296296 0.22789101656304311 0.09804447235697589 0.7541301564198961 0.33452789661144744 0.6490796830467139 ⋮ 0.762177820901682 0.24135011626989977 0.7377077219758958 0.11996259788942776 0.7706878092853523 0.8731135123890762 0.5118402666066991 0.4597438444133971 0.8041048316254743
Computing the recurrence motif distribution is straightforward once the threshold
and n
(motif size) parameters are defined. A good value for threshold
can be estimated using the find_parameters
function, which we recommend using in most cases.
julia> using RecurrenceMicrostatesAnalysis
julia> th, s = find_parameters(data, 3)
(0.27182640088724275, 5.735233889351002)
julia> dist = distribution(data, th, 3)
512-element Vector{Float64}: 0.0164797507788162 0.0028415665331553183 0.0027503337783711616 0.0032354250111259458 0.0027770360480640854 0.0030729862038273255 0.0033110814419225632 0.01241878059635069 0.002650200267022697 0.0032465509568313307 ⋮ 0.0019581664441477528 0.010979083222073876 0.0016221628838451268 0.001606586559857588 0.001802403204272363 0.0016777926123720517 0.0018558077436582109 0.001862483311081442 0.01013351134846462
We do not recommend the use of find_parameters
inside a loop, as it needs to compute several distributions to find the threshold
value that maximizes recurrence entropy, which can significantly reduce the library's performance. For this reason, we have not created an overload of the distribution
function that automatically calculates the threshold
. Instead, we suggest using an average threshold
value computed from a few representative snippets of your dataset using the find_parameters
function.
The distribution
function includes several keyword arguments for configuration. Before moving on to the next section, we will discuss these arguments, as they apply to every call of the distribution function.
Motif constrained shape
There are variations in motif constraint shapes proposed in the literature, such as the triangular motif. Supporting these shape generalizations is one of the goals of RecurrenceMicrostatesAnalysis.jl
, and it is also a computational challenge. Adapting the conversion of motifs with a generic shape from a binary structure to a decimal value can be a very complex problem, and to support this in the library, we need to adapt the pipeline that converts a motif for each specific shape.
Currently, RecurrenceMicrostatesAnalysis.jl
supports five shapes: square, triangle, diagonal, line, and pair. The way the library converts these motifs constrained shapes to decimal values is detailed on the motifs page. You can change the shape using the kword shape
, which can be set to :square
, :triangle
, :diagonal
, :line
, or :pair
. By default, the library uses :square
as the default shape.
julia> dist = distribution(data, th, 3; shape = :triangle)
64-element Vector{Float64}: 0.03278148642634624 0.027636849132176233 0.01330663106364041 0.01300178015131286 0.013996439697374278 0.011793502447708056 0.024018691588785047 0.02154650645304851 0.0142456608811749 0.012147307521139296 ⋮ 0.008184245660881174 0.023377837116154872 0.01818647085002225 0.007986203827325322 0.007400979083222074 0.00805518469069871 0.00764797507788162 0.01968179795282599 0.019902091677792614
The shape :pair
doesn't require a value of n
, since it always uses n=2
. However, it is still necessary to informe a value to this parameter, that will be interpreted as the separation between two points in a diagonal.
julia> dist = distribution(data, th, 6; shape = :pair)
4-element Vector{Float64}: 0.28127027587452424 0.24850111816919027 0.24947101567220586 0.22075759028407962
When workign with shape :pair
, we recommend you to use the full structure of distribution
function.
julia> structure = [3, 9]
2-element Vector{Int64}: 3 9
julia> dist = distribution(data, data, th, structure; shape = :pair)
4-element Vector{Float64}: 0.2807440786085204 0.24962932078189345 0.24935730355116265 0.2202692970584235
Here, structure
defines the position of the second element based on the random position of the first element.
Motifs sampling
The sampling mode defines how RecurrenceMicrostatesAnalysis.jl
selects motifs from a recurrence space. Currently, the library supports four sampling modes: full, random, columnwise and triangle up. You can learn more about them on the motifs page, where we discuss how each mode works. The sampling mode can be configured using the keyword argument sampling_mode
, which can be set to :full
, :random
, :columnwise
, :columnwise_full
, or :triangleup
. By default, the library uses :random
as the default sampling mode.
julia> dist = distribution(data, th, 3; sampling_mode = :full)
512-element Vector{Float64}: 0.016602128793000093 0.0028825087305257096 0.0027643512397190744 0.0031998205608275206 0.002850020983524262 0.0031153746705052648 0.0032357573494626837 0.012516015791715268 0.0027643512397190744 0.0031998205608275206 ⋮ 0.0018565857336067051 0.010852687649004161 0.0016059182884208775 0.0015992427239685252 0.0017943917247922898 0.0016227184589592972 0.0018565857336067051 0.0018778362804466931 0.010374272196585582
Not all sampling modes are compatible with certain motif constrained shapes, and the following table illustrates the compatibility between them.
:full | :random | :columnwise | :columnwise_full | :triangleup | |
---|---|---|---|---|---|
:square | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ |
:triangle | $\checkmark$ | $\checkmark$ | |||
:diagonal | $\checkmark$ | ||||
:time | $\checkmark$ | ||||
:pair | $\checkmark$ | $\checkmark$ |
Run mode
RecurrenceMicrostatesAnalysis.jl
has two run modes that results in a different output type. The run mode :vect
allocates all required memory in beginning of the process, and return the distribution as a vector. This is the default configuration of the library for $n < 6$.
julia> dist = distribution(data, th, 4; run_mode = :vect)
65536-element Vector{Float64}: 0.002284569138276553 0.00019149409930973058 0.00016922734357604097 0.0001224671565352928 0.0001736806947227789 0.00013137385882876866 0.0001224671565352928 0.00025384101536406146 0.00019372077488309953 0.00016700066800267202 ⋮ 0.00014696058784235136 0.00012469383210866178 6.234691605433089e-5 8.238699621465152e-5 9.797372522823424e-5 6.457359162769984e-5 0.00011356045424181696 0.0001002004008016032 0.0011222444889779559
The run mode :dict
uses dictionaries to allocate memory just when needed. The total allocation of dictionary mode can be greater than when using vectors, but the real memory allocation is smaller.
julia> dist = distribution(data, th, 4; run_mode = :dict)
Dict{Int64, Float64} with 37577 entries: 63746 => 6.68003e-6 11950 => 1.33601e-5 45120 => 1.55867e-5 1703 => 4.45335e-6 37100 => 8.9067e-6 60623 => 8.68403e-5 7685 => 2.00401e-5 18374 => 2.22668e-6 64460 => 1.55867e-5 61392 => 6.68003e-6 3406 => 2.89468e-5 47756 => 6.68003e-6 28804 => 2.00401e-5 28900 => 6.68003e-6 27640 => 4.45335e-6 23970 => 4.45335e-6 62961 => 7.12536e-5 63743 => 8.9067e-6 1090 => 1.11334e-5 ⋮ => ⋮
It is important to note that the shapes :diagonal
, :line
, and :pair
are not compatible with run mode :dict
. Additionally, sampling modes :columnwise
and :columnwise_full
return a matrix in which each column represents a probability distribution for a specif time value.
julia> dist = distribution(data, th, 2; sampling_mode = :columnwise)
16×2999 Matrix{Float64}: 0.133333 0.346667 0.1 … 0.0566667 0.166667 0.04 0.103333 0.0366667 0.136667 0.0433333 0.00333333 0.0433333 0.0366667 0.0 0.0933333 0.0766667 0.0133333 0.04 0.116667 0.176667 0.0 0.05 0.206667 0.04 0.08 0.0533333 0.136667 0.05 0.0266667 0.0766667 0.02 0.00333333 0.126667 … 0.0266667 0.00333333 0.11 0.00666667 0.0 0.0966667 0.0766667 0.0 0.116667 0.0766667 0.0166667 0.0 0.0766667 0.0133333 0.06 0.0166667 0.0 0.0833333 0.0933333 0.0233333 0.0566667 0.0133333 0.0 0.13 0.08 0.00666667 0.0733333 0.00666667 0.0 0.0966667 … 0.113333 0.00333333 0.0833333 0.03 0.0 0.0 0.06 0.03 0.0266667 0.126667 0.203333 0.0 0.0433333 0.223333 0.0366667 0.0766667 0.0333333 0.0 0.05 0.0266667 0.1 0.0433333 0.0 0.0 0.0733333 0.0333333 0.0566667 0.113333 0.13 0.0 … 0.03 0.22 0.04
julia> dist = distribution(data, th, 2; sampling_mode = :columnwise_full)
16×2999 Matrix{Float64}: 0.135712 0.316772 0.108703 … 0.06002 0.17039 0.0343448 0.072024 0.0383461 0.120707 0.0513505 0.015005 0.0706902 0.0333444 0.0 0.104702 0.0926976 0.0263421 0.0466822 0.124708 0.206069 0.0 0.0446816 0.195065 0.0376792 0.0726909 0.0416806 0.128376 0.0593531 0.0203401 0.0633545 0.0363454 0.006002 0.137046 … 0.0483494 0.00433478 0.122374 0.0176726 0.0 0.101701 0.0743581 0.00200067 0.0923641 0.0686896 0.024008 0.0 0.0353451 0.0253418 0.0713571 0.0286762 0.0 0.0973658 0.0903635 0.0186729 0.0563521 0.015005 0.0 0.10937 0.0730243 0.003001 0.0883628 0.00500167 0.0 0.0920307 … 0.126042 0.00366789 0.0726909 0.0343448 0.0 0.0 0.0676892 0.0340113 0.0520173 0.128376 0.202401 0.0 0.039013 0.197066 0.0353451 0.072024 0.0276759 0.0 0.0443481 0.0296766 0.0683561 0.027009 0.0 0.0 0.0643548 0.0273424 0.0573525 0.128376 0.137046 0.0 … 0.0290097 0.227743 0.0306769
Number of samples
With exception of sampling modes :full
and :columnwise_full
, all sampling modes take motifs randomly in a recurrence space. The kword num_samples
defines the number of samples that will be used by the library, it can be either an integer value that specifies the exact number, or a decimal value interpreted as the percentage of samples taken from the entire available population. By default, RecurrenceMicrostatesAnalysis.jl
uses $5\%$.
julia> dist = distribution(data, th, 3; num_samples = 0.1)
512-element Vector{Float64}: 0.01650311526479751 0.0029094348019581663 0.0028304405874499334 0.003126390743213173 0.0028204272363150868 0.0031130396083667113 0.0032053849577214064 0.012585669781931465 0.0027903871829105475 0.003224299065420561 ⋮ 0.0019158878504672897 0.01079884290164664 0.001595460614152203 0.0015765465064530484 0.0017712505562972852 0.001593235425011126 0.001860258121940365 0.0018658210947930574 0.010471740097908322
julia> dist = distribution(data, th, 3; num_samples = 50000)
512-element Vector{Float64}: 0.01694 0.00288 0.0024 0.00332 0.00262 0.00288 0.00304 0.01216 0.00268 0.00318 ⋮ 0.00182 0.01124 0.00166 0.00192 0.00204 0.00148 0.00194 0.00206 0.01036
Threads
RecurrenceMicrostatesAnalysis.jl
is highly compatible with CPU asynchronous jobs, that can increase significantly the computational performance of the library. The kword threads
defines if the library will use threads or not, being true
by default. The number of threads used is equal to the number of threads available to Julia, being it configured by the environment variable JULIA_NUM_THREADS
, or by the running argument --threads T
in Julia initiation: For example, using julia --threads 8
.
julia> using BenchmarkTools
julia> @benchmark distribution(data, th, 4; sampling_mode = :full, threads = false)
BenchmarkTools.Trial: 4 samples with 1 evaluation per sample. Range (min … max): 1.520 s … 1.559 s ┊ GC (min … max): 0.00% … 0.00% Time (median): 1.528 s ┊ GC (median): 0.00% Time (mean ± σ): 1.534 s ± 17.402 ms ┊ GC (mean ± σ): 0.00% ± 0.00% █ █ █ █ █▁▁▁▁▁▁▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁ 1.52 s Histogram: frequency by time 1.56 s < Memory estimate: 1.02 MiB, allocs estimate: 27.
julia> @benchmark distribution(data, th, 4; sampling_mode = :full, threads = true)
BenchmarkTools.Trial: 17 samples with 1 evaluation per sample. Range (min … max): 269.315 ms … 366.376 ms ┊ GC (min … max): 0.00% … 0.00% Time (median): 293.891 ms ┊ GC (median): 0.00% Time (mean ± σ): 304.865 ms ± 27.972 ms ┊ GC (mean ± σ): 0.20% ± 0.80% ▁ ▁▁ ▁▁█ █▁▁ ▁ ▁ ▁ ▁ ▁ ▁ █▁▁▁██▁▁▁▁▁███▁███▁▁▁▁▁▁▁▁▁▁▁█▁█▁▁▁█▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁█ ▁ 269 ms Histogram: frequency by time 366 ms < Memory estimate: 5.03 MiB, allocs estimate: 145.
RecurrenceMicrostatesAnalysis.jl
allocates memory for each thread, so how many threads you use, more memory the library will allocate. It is done to increase the performance, and avoid the memory concurrency.
Metrics
RecurrenceMicrostatesAnalysis.jl
uses the library Distances.jl to simplify the configuration of metrics, and increase the computation performance. With it, modify the metric is a easy process that can be done with the kword metric
.
julia> using Distances
julia> my_metric = KLDivergence()
Distances.KLDivergence()
julia> dist = distribution(data, th, 2; metric = my_metric)
16-element Vector{Float64}: 0.06476984656437625 0.017126973537914165 0.018009784300644874 0.07498332221480987 0.01578830331331999 0.05889704247275962 0.0 0.06324883255503669 0.016842339337336 0.0 0.05823437847453858 0.06378029797642873 0.07692461641094063 0.06179452968645764 0.06270180120080053 0.3468979319546364
The default recurrence functions were configured to metrics with two arguments, like euclidean(x, y)
, so if you need to use another type of metric, it is needed to define a new recurrence function, see Recurrence functions page to know more about it.
Recurrence functions
A recurrence function defines if two points of a trajectory recurr or not. Actually the library have two recurrence functions available
- Standard recurrence: $R(\mathbf{x}, \mathbf{y})=\Theta(\varepsilon - \|\mathbf{x}-\mathbf{y}\|)$
- Recurrence with corridor threshold: $R(\mathbf{x}, \mathbf{y})=\Theta(\|\mathbf{x}-\mathbf{y}\| - \varepsilon_{min}) \cdot \Theta(\varepsilon_{max} - \|\mathbf{x}-\mathbf{y}\|)$
RecurrenceMicrostatesAnalysis.jl
automatically change between them with the type of parameters, so if you use as parameter a Float64
, the library will apply the standard recurrence, or, if you use a Tuple{Float64, Float64}
, the library will apply the recurrence with corridor threshold.
julia> dist = (distribution(data, th, 2))'
1×16 adjoint(::Vector{Float64}) with eltype Float64: 0.110547 0.0468045 0.047634 0.0848499 … 0.0313342 0.0309629 0.0816922
julia> dist = (distribution(data, (0.0, th), 2))'
1×16 adjoint(::Vector{Float64}) with eltype Float64: 0.110142 0.0478741 0.047952 0.0848343 … 0.0313342 0.0313409 0.0811252
It is possible to write your own recurrece function, we talk more about it in the Recurrence functions page.
High-dimensionality data
If you are working with a dynamical system or a data time serie with two or more dimensions, it is important to note that RecurrenceMicrostatesAnalysis.jl
effectively not works with vectors, but matrices. In this situation, each row of the matrix will represent a coordinate, and each column a set of coordinates along a trajectory. For example, if we want a uniform distribution with three dimension and 3,000 points, we will have something like:
julia> using Distributions
julia> data = rand(Uniform(0, 1), 3, 3000)
3×3000 Matrix{Float64}: 0.106559 0.790753 0.0473227 0.156753 … 0.0584074 0.142849 0.442479 0.726704 0.140109 0.485255 0.44631 0.0517217 0.199714 0.00114974 0.946425 0.436349 0.215763 0.774433 0.726645 0.838822 0.778131
This format of data is effectvely what the library uses. In the case of previous section, when we are working with vectors, RecurrenceMicrostatesAnalysis.jl
converts it to a matrix $1\times 3,000$ but when we are working which data with a dimensionality different than one, it is necessary to use the proper format.
julia> using RecurrenceMicrostatesAnalysis
julia> th, s = find_parameters(data, 3)
(0.6592558133646942, 6.127219750894992)
julia> dist = distribution(data, th, 3)
512-element Vector{Float64}: 0.009221183800623053 0.003495772140631954 0.0032932799287939477 0.003159768580329328 0.0033333333333333335 0.003186470850022252 0.003202047174009791 0.005456163773920783 0.0034646194926568757 0.003119715175789942 ⋮ 0.0034579439252336447 0.005614152202937249 0.003164218958611482 0.003108589230084557 0.003593680462839341 0.0030818869603916332 0.003611481975967957 0.003662661326212728 0.009997774810858923
Continuous problems
Continuous problems means numerically integrate a differential equation problem and take the values as input to RecurrenceMicrostatesAnalysis.jl
. Thinking in it, we make the library compatible with a powerful tool to solve these problems in Julia: the library DifferentialEquations.jl. The way to apply this kind of data in the library is similar with the other two cases discussed before, as we will demonstrate in this section.
The code of Lorenz system used in these examples was get from Example 2 of DifferentialEquations.jl documentation
julia> function lorenz!(du, u, p, t) du[1] = 10.0 * (u[2] - u[1]) du[2] = u[1] * (28.0 - u[3]) - u[2] du[3] = u[1] * u[2] - (8 / 3) * u[3] end
lorenz! (generic function with 1 method)
julia> using DifferentialEquations
julia> u0 = [1.0; 0.0; 0.0]
3-element Vector{Float64}: 1.0 0.0 0.0
julia> tspan = (0.0, 1000.0)
(0.0, 1000.0)
julia> prob = ODEProblem(lorenz!, u0, tspan)
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true Non-trivial mass matrix: false timespan: (0.0, 1000.0) u0: 3-element Vector{Float64}: 1.0 0.0 0.0
julia> sol = solve(prob)
retcode: Success Interpolation: 3rd order Hermite t: 13309-element Vector{Float64}: 0.0 3.5678604836301404e-5 0.0003924646531993154 0.003262408731175873 0.009058076622686189 0.01695647090176743 0.027689960116420883 0.041856352219618344 0.060240411865493296 0.08368541210909924 ⋮ 999.4684386984246 999.5461655578678 999.6379068125303 999.7253820792902 999.811462486767 999.8851689329069 999.9428080600619 999.9973353539323 1000.0 u: 13309-element Vector{Vector{Float64}}: [1.0, 0.0, 0.0] [0.9996434557625105, 0.0009988049817849058, 1.781434788799189e-8] [0.9961045497425811, 0.010965399721242457, 2.1469553658389193e-6] [0.9693591548287857, 0.0897706331002921, 0.00014380191884671585] [0.9242043547708632, 0.24228915014052968, 0.0010461625485930237] [0.8800455783133068, 0.43873649717821195, 0.003424260078582332] [0.8483309823046307, 0.6915629680633586, 0.008487625469885364] [0.8495036699348377, 1.0145426764822272, 0.01821209108471829] [0.9139069585506618, 1.442559985646147, 0.03669382222358562] [1.0888638225734468, 2.0523265829961646, 0.07402573595703686] ⋮ [0.4932515482755346, 0.8715606700919747, 10.516359581195958] [0.9404189451381124, 1.7670046363918026, 8.612476815089584] [2.1798699336043352, 4.244219726761523, 7.112313060328521] [5.047985066511083, 9.898098798343868, 7.577131003969779] [11.054231264129331, 20.013355756332004, 15.432985026875338] [17.030266923750112, 21.643364279228603, 34.339086610002006] [16.197155332736422, 8.52886663217251, 43.855823473225534] [10.191550116138744, -2.3118400709623317, 39.71841399721481] [9.858747856008915, -2.613711553766342, 39.371604354846596]
With the data computed, it is easy to apply to RecurrenceMicrostatesAnalysis.jl
, with a simply memory access given by DifferentialEquations.jl
.
julia> using RecurrenceMicrostatesAnalysis
julia> data = sol[:, :]
3×13309 Matrix{Float64}: 1.0 0.999643 0.996105 0.969359 … 16.1972 10.1916 9.85875 0.0 0.000998805 0.0109654 0.0897706 8.52887 -2.31184 -2.61371 0.0 1.78143e-8 2.14696e-6 0.000143802 43.8558 39.7184 39.3716
julia> th, s = find_parameters(data, 3)
(19.0781489114414, 3.3427340364604166)
julia> dist = distribution(data, th, 3)
512-element Vector{Float64}: 0.1550994080289936 0.010941275915955749 0.0003625556991722887 0.005588327378083022 0.025834408952889444 0.0001562039040359113 0.006630138521125138 0.01767024192517302 0.00035995794805672405 0.005586068464069488 ⋮ 0.001032549595586624 0.01328602866200457 0.017670806653676405 0.00065327793271418 0.027383120400568704 0.0016136552255683766 0.0010080403785397748 0.00397693106652818 0.2594232856988606
Although it is possible to compute the distribution as demonstrated above, we strongly advise against doing so in this way.
We recommend you to apply a transient into your data and take a correct time resolution while doing the process of discretization, it is needed to maximize the information available. RecurrenceMicrostatesAnalysis.jl
has a utilitary function to help with this process.
julia> prepared_data = prepare(sol, 0.2; transient = 10000, K = 1000)
3×1000 Matrix{Float64}: -10.719 -6.7762 -3.09341 -12.7639 … -2.68218 -7.09519 -10.68 -15.3216 -1.71365 -4.16461 -17.1177 -2.00772 -11.6349 -3.69038 23.3945 30.9561 17.7139 27.2908 21.6974 16.2567 36.4084
julia> th, s = find_parameters(prepared_data, 3)
(18.20265022026126, 5.416470371032311)
julia> dist = distribution(prepared_data, th, 3)
512-element Vector{Float64}: 0.0007028112449799197 0.0022690763052208834 0.0023092369477911647 0.005823293172690763 0.001244979919678715 0.004779116465863454 0.0034136546184738957 0.0011646586345381525 0.0022891566265060242 0.005522088353413655 ⋮ 0.002188755020080321 0.009578313253012049 0.0016465863453815261 0.00028112449799196787 0.0030923694779116466 0.00012048192771084337 0.0024096385542168677 0.0016867469879518072 0.03255020080321285
If you have the threshold parameter, it is also possible to simplify the call using:
julia> dist = distribution(sol, th, 3, 0.2; transient = 10000, K = 1000)
512-element Vector{Float64}: 0.0005622489959839357 0.0025502008032128516 0.0019477911646586345 0.006124497991967871 0.0014257028112449799 0.004839357429718876 0.0030120481927710845 0.0013052208835341366 0.0021485943775100404 0.0064859437751004015 ⋮ 0.0021485943775100404 0.0092570281124498 0.0016465863453815261 0.00014056224899598393 0.0037751004016064256 8.032128514056225e-5 0.0026104417670682733 0.0019879518072289156 0.03251004016064257
Spatial data
RecurrenceMicrostatesAnalysis.jl
is compatible with generalised recurrence plot analysis for spatial data proposed by Marwan, Kurths and Saparin at 2006 [9]. It allow the library to calculate a probability distribution of motifs in a tensorial recurrence space, for example, to images the recurrence space have four dimensions.
Since this is an open research field, the library is designed to support exploration and estimation for research purposes. We don’t recommend applying it in production environments 😉
The application of RecurrenceMicrostatesAnalysis.jl
to spatial data is very similar to the others presented before, but the input format is more complex. Instead to matrices we need to use abstract arrays with dimension $D$, where the first dimension will be interpreted as a coordinate dimension (such as for high-dimensionaly data), and rest of the dimensions will be the spatial data dimensionality. To illustrate it, let an image with RGB. It can be represented as an abstract array with 3 dimensions, where the first dimension will have a length 3, being each element a color value (red, blue and green), and the others two dimensions are relative to each pixel that compose the image. We will demonstrate it using a uniform distribution, where each position can be interpreted as a RGB pixel for an image 100x100.
julia> using Distributions
julia> data = rand(Uniform(0, 1), 3, 100, 100)
3×100×100 Array{Float64, 3}: [:, :, 1] = 0.000994302 0.96917 0.537133 … 0.221113 0.911737 0.941228 0.350146 0.634496 0.645865 0.211006 0.180946 0.0233403 0.0648645 0.279628 0.00495735 0.979841 0.118277 0.102487 [:, :, 2] = 0.866067 0.194661 0.639266 0.678209 … 0.593766 0.715739 0.197021 0.238985 0.109844 0.893717 0.783467 0.734497 0.820914 0.129985 0.171055 0.194803 0.550102 0.650473 0.376448 0.983477 0.121822 [:, :, 3] = 0.472401 0.410875 0.72591 0.976852 … 0.230291 0.991491 0.309363 0.825634 0.47538 0.642128 0.966311 0.572159 0.980597 0.894135 0.355797 0.244324 0.499927 0.282063 0.325088 0.0693037 0.0882074 ;;; … [:, :, 98] = 0.416043 0.18592 0.233755 0.73409 … 0.36302 0.0893145 0.122896 0.462909 0.813512 0.0448407 0.178228 0.808904 0.879995 0.129945 0.09689 0.699377 0.178228 0.572462 0.17386 0.983835 0.814822 [:, :, 99] = 0.993668 0.0726666 0.0399119 0.188741 … 0.593808 0.125076 0.695159 0.575391 0.67835 0.87844 0.0254418 0.191598 0.216431 0.434663 0.492249 0.489359 0.80068 0.220772 0.609854 0.273957 0.522254 [:, :, 100] = 0.591141 0.233691 0.296233 0.824574 … 0.152652 0.975687 0.337592 0.504007 0.850705 0.0921761 0.315545 0.128025 0.423022 0.474114 0.9405 0.168162 0.472189 0.0801102 0.587236 0.304734 0.166603
When we work with spatial data is necessarity to use the complete structure of distribution
function, defining a vector structure where each value represents the length of a motif constrained side. For example, to a square tensorial motif constrained with side 2, we can use:
julia> using RecurrenceMicrostatesAnalysis
julia> dist = distribution(data, data, 0.5, [2, 2, 2, 2])
65536-element Vector{Float64}: 0.8074478761102483 0.004181362404174075 0.003998559227812733 0.0003666473730892071 0.003931725720281992 0.00035290590425111077 0.0003645653323561622 2.3527060283407386e-5 0.004095374121899321 0.0004547176960970064 ⋮ 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Since the recurrence space has four dimensions, in this examples, it is necessary for structure
has the same number of elements, where each element will represent the motif' side lenght for each dimension.