Efficient Lock-Free Removing and Compaction for the Cache-Trie Data Structure

Aleksandar Prokopec
Euro-Par 2018

Abstract

The recently proposed cache-trie data structure improves the performance of lock-free Ctries by maintaining an auxiliary data structure called a cache. The cache allows basic operations to run in expected O(1) time, instead of the previous O(log n) bound. While earlier work showed that cache-tries improve inserts and lookups by 1.5−5x on standard workloads, the remove operation was not previously examined. One of the main challenges of remove is to compact the trie - removing the elements should recycle the unused parts of the data structure.

In this paper, we describe a new non-compacting and two new compacting non-blocking variants of the remove operation for cache-tries. We ensure that each remove implementation runs in expected O(1) time. Compared to standard Ctries, performance improvements range between 10% and 35%, depending on the size of the data structure, the parallelism level and the hardware architecture.

[PDF] [BibTex] [Springer]

References

Areias, M., Rocha, R.: On the correctness and efficiency of lock-free expandable tries for tabled logic programs. In: Proceedings of the 16th International Symposium on Practical Aspects of Declarative Languages - Volume 8324. pp. 168–183. PADL 2014, Springer-Verlag New York, Inc., New York, NY, USA (2014)

Areias, M., Rocha, R.: A lock-free hash trie design for concurrent tabled logic programs. Int. J. Parallel Program. 44(3), 386–406 (Jun 2016)

Bagwell, P.: Ideal hash trees (2001)

Baskins, D.: The Judy array implementation. http://judy.sourceforge.net/ (2000)

Braginsky, A., Petrank, E.: Locality-conscious lock-free linked lists. In: Proceedings of the 12th International Conference on Distributed Computing and Networking. pp. 107–118. ICDCN’11, Springer-Verlag, Berlin, Heidelberg (2011), http://dl.acm.org/citation.cfm?id=1946143.1946153

Braginsky, A., Petrank, E.: A lock-free B+tree. In: Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures. pp. 58–67. SPAA ’12, ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2312005.2312016

Bronson, N.G., Casper, J., Chafi, H., Olukotun, K.: A practical concurrent binary search tree. SIGPLAN Not. 45(5), 257–268 (Jan 2010), http://doi.acm.org/10.1145/1837853.1693488

De La Briandais, R.: File searching using variable length keys. In: Papers Presented at the the March 3-5, 1959, Western Joint Computer Conference. pp. 295–298. IRE-AIEE-ACM ’59 (Western), ACM, New York, NY, USA (1959), http://doi.acm.org/10.1145/1457838.1457895

Ellen, F., Fatourou, P., Ruppert, E., van Breugel, F.: Non-blocking binary search trees. In: Proceedings of the 29th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing. pp. 131–140. PODC ’10, ACM, New York, NY, USA (2010), http://doi.acm.org/10.1145/1835698.1835736

Fredkin, E.: Trie memory. Commun. ACM 3(9), 490–499 (Sep 1960), http://doi.acm.org/10.1145/367390.367400

Georges, A., Buytaert, D., Eeckhout, L.: Statistically Rigorous Java Performance Evaluation. SIGPLAN Not. 42(10), 57–76 (Oct 2007), http://doi.acm.org/10.1145/1297105.1297033

Haller, P., Prokopec, A., Miller, H., Klang, V., Kuhn, R., Jovanovic, V.: Scala improvement proposal: Futures and promises (SIP-14) (2012), http://docs.scalalang.org/sips/pending/futures-promises.html

Joisha, P.G.: Sticky tries: Fast insertions, fast lookups, no deletions for large key universes. In: Proceedings of the 2014 International Symposium on Memory Management. pp. 35–46. ISMM ’14, ACM, New York, NY, USA (2014), http://doi.acm.org/10.1145/2602988.2602998

Lea, D.: Doug Lea’s workstation (2014), http://g.oswego.edu/

Li, X., Andersen, D.G., Kaminsky, M., Freedman, M.J.: Algorithmic improvements for fast concurrent cuckoo hashing. In: Proceedings of the Ninth European Conference on Computer Systems. pp. 27:1–27:14. EuroSys ’14, ACM, New York, NY, USA (2014), http://doi.acm.org/10.1145/2592798.2592820

Liang, F.M.: Word Hy-phen-a-tion by Com-pu-ter. Ph.D. thesis, Stanford University, Stanford, CA 94305 (Jun 1983), also available as Stanford University, Department of Computer Science Report No. STAN-CS-83-977

Liu, Y., Zhang, K., Spear, M.: Dynamic-sized nonblocking hash tables. In: Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing. pp. 242–251. PODC ’14, ACM, New York, NY, USA (2014), http://doi.acm.org/10.1145/2611462.2611495

Michael, M.M.: High performance dynamic lock-free hash tables and list-based sets. In: Proceedings of the Fourteenth Annual ACM Symposium on Parallel Algorithms and Architectures. pp. 73–82. SPAA ’02, ACM, New York, NY, USA (2002), http://doi.acm.org/10.1145/564870.564881

Oshman, R., Shavit, N.: The skiptrie: Low-depth concurrent search without rebalancing. In: Proceedings of the 2013 ACM Symposium on Principles of Distributed Computing. pp. 23–32. PODC ’13, ACM, New York, NY, USA (2013), http://doi.acm.org/10.1145/2484239.2484270

Prokopec, A., Petrashko, D., Odersky, M.: Efficient lock-free work-stealing iterators for data-parallel collections. In: 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing. pp. 248–252 (March 2015)

Prokopec, A.: Data Structures and Algorithms for Data-Parallel Computing in a Managed Runtime. Ph.D. thesis, IC, Lausanne (2014)

Prokopec, A.: Scalameter website (2014), http://scalameter.github.io

Prokopec, A.: SnapQueue: Lock-free queue with constant time snapshots. In: Proceedings of the 6th ACM SIGPLAN Symposium on Scala. pp. 1–12. SCALA 2015, ACM, New York, NY, USA (2015), http://doi.acm.org/10.1145/2774975.2774976

Prokopec, A.: Pluggable scheduling for the reactor programming model. In: Proceedings of the 6th International Workshop on Programming Based on Actors, Agents, and Decentralized Control. pp. 41–50. AGERE 2016, ACM, New York, NY, USA (2016), http://doi.acm.org/10.1145/3001886.3001891

Prokopec, A.: Accelerating by idling: How speculative delays improve performance of message-oriented systems. In: Rivera, F.F., Pena, T.F., Cabaleiro, J.C. (eds.) Euro-Par 2017: Parallel Processing. pp. 177–191. Springer International Publishing, Cham (2017)

Prokopec, A.: Analysis of Concurrent Lock-Free Hash Tries with Constant-Time Operations. ArXiv e-prints (Dec 2017)

Prokopec, A.: Encoding the building blocks of communication. In: Proceedings of the 2017 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software. pp. 104–118. Onward! 2017, ACM, New York, NY, USA (2017), http://doi.acm.org/10.1145/3133850.3133865

Prokopec, A.: Cache-tries: Concurrent lock-free hash tries with constant-time operations. In: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. PPoPP ’18, ACM, New York, NY, USA (2018), http://doi.acm.org/10.1145/3178487.3178498

Prokopec, A.: Efficient lock-free removing and compaction for the cache-trie data structure. https://doi.org/10.6084/m9.figshare.6369134 (2018)

Prokopec, A.: Reactors.io website (2018), http://reactors.io

Prokopec, A., Bagwell, P., Odersky, M.: Cache-Aware Lock-Free Concurrent Hash Tries. Tech. rep. (2011)

Prokopec, A., Bagwell, P., Odersky, M.: Lock-Free Resizeable Concurrent Tries, pp. 156–170. LCPC 2011, Springer Berlin Heidelberg, Berlin, Heidelberg (2011)

Prokopec, A., Bagwell, P., Rompf, T., Odersky, M.: A generic parallel collection framework. In: Proceedings of the 17th international conference on Parallel processing - Volume Part II. pp. 136–147. Euro-Par’11, Springer-Verlag, Berlin, Heidelberg (2011), http://dl.acm.org/citation.cfm?id=2033408.2033425

Prokopec, A., Bronson, N.G., Bagwell, P., Odersky, M.: Concurrent tries with efficient non-blocking snapshots. In: Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. pp. 151–160. PPoPP ’12, ACM, New York, NY, USA (2012), http://doi.acm.org/10.1145/2145816.2145836

Prokopec, A., Haller, P., Odersky, M.: Containers and aggregates, mutators and isolates for reactive programming. In: Proceedings of the Fifth Annual Scala Workshop. pp. 51–61. SCALA ’14, ACM, New York, NY, USA (2014), http://doi.acm.org/10.1145/2637647.2637656

Prokopec, A., Leopoldseder, D., Duboscq, G., W¨urthinger, T.: Making collection operations optimal with aggressive jit compilation. In: Proceedings of the 8th ACM SIGPLAN International Symposium on Scala. pp. 29–40. SCALA 2017, ACM, New York, NY, USA (2017), http://doi.acm.org/10.1145/3136000.3136002

Prokopec, A., Miller, H., Haller, P., Schlatter, T., Odersky, M.: FlowPools: A LockFree Deterministic Concurrent Dataflow Abstraction, Proofs. Tech. rep. (2012)

Prokopec, A., Miller, H., Schlatter, T., Haller, P., Odersky, M.: Flowpools: A lockfree deterministic concurrent dataflow abstraction. In: LCPC. pp. 158–173 (2012)

Prokopec, A., Odersky, M.: Near optimal work-stealing tree scheduler for highly irregular data-parallel workloads. In: Cascaval, C., Montesinos, P. (eds.) Languages and Compilers for Parallel Computing. pp. 55–86. Springer International Publishing, Cham (2014)

Prokopec, A., Odersky, M.: Isolates, channels, and event streams for composable distributed programming. In: 2015 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!). pp. 171–182. Onward! 2015, ACM, New York, NY, USA (2015), http://doi.acm.org/10.1145/2814228.2814245

Prokopec, A., Odersky, M.: Conc-Trees for Functional and Parallel Programming, pp. 254–268. Springer International Publishing, Cham (2016), http://dx.doi.org/10.1007/978-3-319-29778-1 16

Prokopec, A., Petrashko, D., Odersky, M.: On lock-free work-stealing iterators for parallel data structures p. 10 (2014)

Pugh, W.: Concurrent maintenance of skip lists. Tech. rep., College Park, MD, USA (1990)

Schlatter, T., Prokopec, A., Miller, H., Haller, P., Odersky, M.: Multi-lane flowpools: A detailed look p. 13 (2012)

Shafiei, N.: Non-blocking patricia tries with replace operations. In: 2013 IEEE 33rd International Conference on Distributed Computing Systems. pp. 216–225 (July 2013)

Steindorfer, M.J., Vinju, J.J.: Optimizing hash-array mapped tries for fast and lean immutable jvm collections. SIGPLAN Not. 50(10), 783–800 (Oct 2015), http://doi.acm.org/10.1145/2858965.2814312

Steindorfer, M.J., Vinju, J.J.: Towards a software product line of trie-based collections. In: Proceedings of the 2016 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences. pp. 168–172. GPCE 2016, ACM, New York, NY, USA (2016), http://doi.acm.org/10.1145/2993236.2993251

Sujeeth, A.K., Rompf, T., Brown, K.J., Lee, H., Chafi, H., Popic, V., Wu, M., Prokopec, A., Jovanovic, V., Odersky, M., Olukotun, K.: Composition and reuse with compiled domain-specific languages. In: Proceedings of the 27th European Conference on Object-Oriented Programming. pp. 52–78. ECOOP’13, SpringerVerlag, Berlin, Heidelberg (2013), http://dx.doi.org/10.1007/978-3-642-39038-8 3