

Faster evaluation in Analytica 5.0

Lonnie Chrisman 11 Oct 2017 Analytica 5.0

Since the October 30, 2017 release of Analytica 5.0 is fast approaching, it seemed like a good time to re-run the benchmark timing tests from my earlier blog posting, "Faster evaluation in Analytica 4.6", to see how the upcoming release compares.

Speed enhancements can vary a lot across different models. I am already well aware that models with large arrays (including large Monte Carlo runs) often experience a sizeable speed-up from Analytica 5.0's new multithreaded evaluation capability. But models without large arrays usually don't benefit, since the overhead of dividing up a computation can easily outweigh the gains from utilizing multiple cores. Similarly, models that let array abstraction take care of iteration the way it is intended are likely to benefit, whereas code with explicit FOR loops, which circumvent automatic array abstraction, has less opportunity to benefit. Anecdotally, I have already seen individual cases ranging from no speed-up at all to a four-fold speed-up. To come up with an average speed-up, we need some sort of "representative mix" of real problems. That was the concept behind the benchmark suite first reported in the "Faster evaluation in Analytica 4.6" blog post.
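To make the overhead point concrete, here is a hypothetical Python sketch (not Analytica's implementation) of what "dividing up a computation" across threads looks like: the work of chunking the input, dispatching to workers, and gathering the results is pure overhead, which a small array never repays.

```python
from concurrent.futures import ThreadPoolExecutor

def square_all(xs, n_threads=4):
    """Square every element of xs, splitting the work across n_threads workers.

    The chunking and gathering below is coordination overhead; for a small
    xs it can easily cost more than the arithmetic it parallelizes.
    """
    chunk = max(1, (len(xs) + n_threads - 1) // n_threads)
    parts = [xs[i:i + chunk] for i in range(0, len(xs), chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = pool.map(lambda part: [x * x for x in part], parts)
    # Flatten the per-chunk results back into one list
    return [y for part in results for y in part]
```

This produces the same answer as a plain loop; whether it runs faster depends entirely on the size of the array relative to the fixed cost of splitting and joining, which is the tradeoff described above.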

I set aside those benchmark models several years ago for the sole purpose of benchmark testing. I excluded them from any profiling or speed measurements during code development, to prevent intentional or unintentional tuning to the benchmark suite. In fact, this suite of models has been hidden away gathering dust since I published that blog article, except that at some point one more benchmark model (ASB) was added to the suite. The models are all actual, substantial models, in most cases pretty large.

I ran all the benchmarks under four test conditions: Analytica 4.6, and then Analytica 5.0 with 1 thread, 4 threads and 8 threads. I then repeated all these tests 10 times and averaged the results. For the speed-up percentages, I compared to the Analytica 4.6 times. All timings were run on the same computer, which in fact is the same computer used for the tests in the previous article (Intel Core i7-2600 @ 3.4GHz, 4 cores, 8 logical processors, Windows 7). All tests used Analytica 64-bit.
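The speed-up percentages in the table below appear to be computed from the mean elapsed times as (t_4.6 / t_5.0 − 1) × 100; the published figures presumably round from the unrounded measurements. A minimal sketch of that arithmetic (my reconstruction, not code from the benchmark harness):

```python
def mean(times):
    """Average of repeated timing runs (each benchmark was run 10 times)."""
    return sum(times) / len(times)

def speedup_pct(t_old, t_new):
    """Percent speed-up of the new time relative to the old time."""
    return (t_old / t_new - 1.0) * 100.0

# e.g. CE3: 68.7 s under 4.6 vs 56.6 s under 5.0 with one thread
print(round(speedup_pct(68.7, 56.6)))  # -> 21
```

A value of 0% means no change, and a negative value (as with SS1 below) means the new release was slightly slower on that model.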


Benchmark timings
Benchmark   Elapsed time (sec)                  Percent speed-up
            4.6     5.0 (1)  5.0 (4)  5.0 (8)   5.0 (1)  5.0 (4)  5.0 (8)
AM1         24.0    23.2     22.9     23.0      3%       5%       4%
AT1         10.2    10.1     9.5      9.6       1%       7%       6%
CA1         17.0    15.4     15.6     15.6      11%      9%       9%
CE3         68.7    56.6     60.7     61.2      21%      13%      12%
ES1         60.4    56.9     56.6     56.7      6%       7%       6%
KI3         14.0    12.8     11.5     11.9      10%      22%      18%
RP1         41.4    41.3     29.3     29.2      0%       42%      42%
PO5         0.183
SE1         73.0    72.1     69.0     71.7      1%       6%       2%
SS1         11.4    11.6     11.7     11.7      -2%      -3%      -2%
ASB         63.3    60.4     32.4     29.1      5%       95%      117%
                             Ave (w/o PO5):     6%       20%      21%

For these parallelizable models, it is interesting to see a performance boost going from 1 to 4 threads, but not much from 4 to 8 threads. I suspect this is related to the fact that my computer has 4 cores and 8 logical processors; it seems the 4 physical cores may be the more important number.

The RP1 and ASB benchmarks show a strong improvement. I am aware that RP1 does a fair amount of Monte Carlo simulation (whereas I think Monte Carlo simulation may be a bit underrepresented by the other models in the suite), and ASB does a lot of arithmetic on very large arrays. So the speed-up on those is consistent with those types of models benefiting from the utilization of multiple cores.

My original automated run showed disappointing results for PO5; however, upon further inspection I discovered that this very old model (created in Analytica 3.1) was using the + operator for text concatenation, and this was responsible for the slower numbers. I went through the model and changed the + operators to & in the places they were being used to concatenate text and reran that benchmark. The original numbers are in gray, the new numbers are in black. The gray numbers are not a fair comparison for several reasons. In the 4.6 test, a lot of pre-computation of indexes was carried out during model load, before the timing test started, whereas in the 5.0 tests, because of errors due to the + operator, indexes had not been pre-computed, and hence, a lot more computation was carried out under the timer. The black numbers seem to be the fair comparison. 

I have not yet reviewed the other benchmark models to see if something similar occurs. Hopefully I will eventually get to that, since it would make for a fairer comparison (and would improve the 5.0 numbers even more).

The conclusion of these benchmarks is that Analytica 5.0 seems to be about 20% faster on average with multithreading, but with actual results highly dependent on the specifics of the model.

I am sure that future releases will continue to see improvements in evaluation speed, so I'll make an effort to repeat these benchmark measurements and report them in a blog posting with each new release.

Be sure to visit the What's new in Analytica 5.0 page. There's even a video there showcasing the huge number of new features!


Lonnie Chrisman

Lonnie Chrisman, PhD, is Lumina's Chief Technical Officer, where he heads engineering and development of Analytica®. He has authored dozens of refereed publications in the areas of machine learning, Artificial Intelligence planning, robotics, probabilistic inference, Bayesian networks, and computational biology. He was in eighth grade when he had his first paid programming job. He was awarded the Alton B. Zerby award, "Most Outstanding Electrical Engineering Student in the USA", in 1987. He has a PhD in Artificial Intelligence and Computer Science from Carnegie Mellon University, and a BS in Electrical Engineering from the University of California at Berkeley. Lonnie used Analytica for seismic structural analysis of an extension that he built to his own home, where he lives with his wife and raised four daughters: so he really trusts Analytica calculations!
