blob: 2f30620a346f808aa0271b51a5b6ac14ed7273a0 [file] [log] [blame]
1.3.1
Support for huge pages
Bugfix to old size in aligned realloc and usable size for aligned allocs when alignment > 32
Use C11 atomics for non-Microsoft compilers
Remove remaining spin-lock like control for caches, all operations are now lock free
Allow large deallocations to cross thread heaps
1.3.0
Make span size configurable and all spans equal in size, removing span size classes and streamlining the thread cache.
Allow super spans to be reserved in advance and split up in multiple used spans to reduce number of system calls. This will not increase committed physical pages, only reserved virtual memory space.
Allow super spans to be reused for allocations of lower size, breaking up the super span and storing remainder in thread cache in order to reduce load on global cache and reduce cache overhead.
Fixed an issue where an allocation of zero bytes would cause a segmentation fault from indexing size class array with index -1.
Fixed an issue where an allocation of maximum large block size (2097120 bytes) would index the heap cache array out of bounds and potentially cause a segmentation fault depending on earlier allocation patterns.
Fixed an issue where memory pages at start of aligned span run was not completely unmapped on POSIX systems.
Fixed an issue where spans were not correctly marked as owned by the heap after traversing the global span cache.
Added function to access the allocator configuration after initialization to find default values.
Removed allocated and reserved statistics to reduce code complexity.
1.2.2
Add configurable memory mapper providing map/unmap of memory pages. Default to VirtualAlloc/mmap if none provided. This allows rpmalloc to be used in contexts where memory is provided by internal means.
Avoid using explicit memory map addresses to mmap on POSIX systems. Instead use overallocation of virtual memory space to gain 64KiB alignment of spans. Since extra pages are never touched this should have no impact on real memory usage and remove the possibility of contention in virtual address space with other uses of mmap.
Detect system memory page size at initialization, and allow page size to be set explicitly in initialization. This allows the allocator to be used as a sub-allocator where the page granularity should be lower to reduce risk of wasting unused memory ranges, and adds support for modern iOS devices where page size is 16KiB.
Add build time option to use memory guards, surrounding each allocated block with a dead zone which is checked for consistency when block is freed.
Always finalize thread on allocator finalization, fixing issue when re-initializing allocator in the same thread.
Add basic allocator test cases
1.2.1
Split library into rpmalloc only base library and preloadable malloc wrapper library.
Add arg validation to valloc and pvalloc.
Change ARM memory barrier instructions to dmb ish/ishst for compatibility.
Improve preload compatibility on Apple platforms by using pthread key for TLS in wrapper library.
Fix ABA issue in orphaned heap linked list
1.2
Dual license under MIT
Fix init/fini checks in malloc entry points for preloading into binaries that does malloc/free in init or fini sections
Fixed an issue where freeing a block which had been realigned during allocation due to alignment request greater than 16 caused the free block link to be written in the wrong place in the block, causing next allocation from the size class to return a bad pointer
Improve mmap 64KiB granularity enforcement loop to avoid excessive iterations
Fix undersized adaptive cache counter array for large block in heap structure, causing potential abort on exit
Avoid hysteresis in realloc by overallocating on small size increases
Add entry point for realloc with alignment and optional flags to avoid preserving content
Add valloc/pvalloc/cfree wrappers
Add C++ new/delete wrappers
1.1
Add four cache presets (unlimited, performance priority, size priority and no cache)
Slight performance improvement by dependent class index lookup for merged size classes
Adaptive cache size per thread and per size class for improved memory efficiency, and release thread caches to global cache in fixed size batches
Merged caches for small/medium classes using 64KiB spans with 64KiB large blocks
Require thread initialization with rpmalloc_thread_initialize, add pthread hooks for automatic init/fini
Added rpmalloc_usable_size query entry point
Fix invalid old size in memory copy during realloc
Optional statistics and integer overflow guards
Optional asserts for easier debugging
Provide malloc entry point replacements and automatic init/fini hooks, and a LD_PRELOAD:able dynamic library build
Improve documentation and additional code comments
Move benchmarks to separate repo, https://github.com/rampantpixels/rpmalloc-benchmark
1.0
Initial release