Performance tips for RecurrenceMicrostatesAnalysis.jl
One of the main goals of RecurrenceMicrostatesAnalysis.jl is computational performance: the library is designed to be fast and light. To that end, RecurrenceMicrostatesAnalysis.jl manages memory carefully, allocating only what it needs, and adapts well to multi-threaded workloads by splitting the work among all available threads. We therefore recommend always passing threads = true to the distribution function and running Julia with more than one thread, either by setting the environment variable JULIA_NUM_THREADS or by starting Julia with julia --threads 8.
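Before benchmarking, it can help to confirm that the session really has more than one thread. The snippet below is a minimal sketch that uses only `Threads.nthreads()` from Julia's standard library. The variable name `n_threads` is illustrative, and the commented call reuses the arguments from the benchmarks on this page.

```julia
# Start Julia with several threads first, for example:
#   julia --threads 8
# or set JULIA_NUM_THREADS=8 in the environment before launching Julia.

n_threads = Threads.nthreads()  # number of threads this session was started with

if n_threads == 1
    @warn "Julia is running with a single thread, so there is no parallelism to exploit."
else
    println("Julia is running with $n_threads threads.")
end

# With more than one thread available, the recommended call would look like:
#   distribution(data, 0.27, 3; threads = true)
```

The thread count is fixed when Julia starts, so the check has to be paired with one of the two launch options above rather than changed from inside the session.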
Note that the larger the dataset, the longer the library takes to compute a recurrence motif distribution; the number of samples also affects the run time.
With respect to memory consumption, RecurrenceMicrostatesAnalysis.jl performs even better, being extremely light. The library allocates only the memory needed to store its results, such as a vector holding the count of each motif found in a recurrence space. The following graphic compares the library's memory usage with that of the standard approach.
RecurrenceMicrostatesAnalysis.jl allocates memory for each thread, so increasing the number of available threads makes the library allocate more memory to avoid concurrent writes to shared data. More memory is also needed as the motif size n grows, because the allocation depends on the motif area $\sigma$ (a hypervolume in the spatial generalization), so larger motifs need more memory per thread.
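To get a feel for how quickly the per-thread memory grows with n, the following back-of-the-envelope sketch may help. It rests on two assumptions that this page does not state: that a square motif of side n has $\sigma = n^2$ binary cells, giving $2^{\sigma}$ possible motifs, and that each motif count is stored as one `Int64`. The helpers `motif_count` and `counter_bytes` are hypothetical names and are not part of the library.

```julia
# Hypothetical helpers for illustration; they are not part of the library.

# Number of distinct motifs for a square motif of side n, assuming every one
# of the σ = n^2 cells is binary (recurrent or not recurrent).
motif_count(n::Integer) = 2^(n^2)

# Bytes for one counter vector, assuming one counter of type T per motif.
counter_bytes(n::Integer; T::Type = Int64) = motif_count(n) * sizeof(T)

for n in 2:4
    per_thread = counter_bytes(n)
    total = per_thread * Threads.nthreads()
    println("n = $n: $(motif_count(n)) motifs, ",
            per_thread / 1024, " KiB per thread, ",
            total / 1024, " KiB across $(Threads.nthreads()) thread(s)")
end
```

Under these assumptions the counter storage grows exponentially in the motif area, which is why moving from n = 3 to n = 4 is expected to raise memory use far more than adding a few threads does. The library's actual allocations may differ from this estimate.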
These measurements were made with the BenchmarkTools.jl library.
julia> using Distributions, RecurrenceMicrostatesAnalysis, BenchmarkTools
julia> data = rand(Uniform(0, 1), 10000);
julia> @benchmark distribution(data, 0.27, 3)
BenchmarkTools.Trial: 34 samples with 1 evaluation per sample.
 Range (min … max):  138.217 ms … 182.323 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     145.746 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   149.646 ms ±  10.452 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃ ██ █▃█ ▃ ▃
  █▁▇▇▁██▇███▇▁█▇▁▇▁▇█▁▁▁▁▁▇▁▁▁▁▇▁▁▁▁▁▁▁▁▇▇▇▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  138 ms           Histogram: frequency by time          182 ms <

 Memory estimate: 144.02 KiB, allocs estimate: 146.
julia> @benchmark distribution(data, 0.27, 3; sampling_mode = :full)
BenchmarkTools.Trial: 2 samples with 1 evaluation per sample.
 Range (min … max):  2.271 s …    2.789 s  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.530 s               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.530 s ± 366.094 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █ █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  2.27 s          Histogram: frequency by time          2.79 s <

 Memory estimate: 144.00 KiB, allocs estimate: 145.
julia> @benchmark distribution(data, 0.27, 4)
BenchmarkTools.Trial: 14 samples with 1 evaluation per sample.
 Range (min … max):  266.599 ms … 558.195 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     339.976 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   374.496 ms ±  97.831 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁ █ ▁ █ ▁ ▁ ▁ ▁ ▁ ▁ ▁▁
  █▁█▁▁▁▁█▁█▁▁█▁▁▁▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█▁█▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁██ ▁
  267 ms           Histogram: frequency by time          558 ms <

 Memory estimate: 5.10 MiB, allocs estimate: 146.
julia> @benchmark distribution(data, 0.27, 4; sampling_mode = :full)
BenchmarkTools.Trial: 2 samples with 1 evaluation per sample.
 Range (min … max):  3.469 s …    3.943 s  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.706 s               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.706 s ± 335.274 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █ █
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  3.47 s          Histogram: frequency by time          3.94 s <

 Memory estimate: 5.10 MiB, allocs estimate: 145.