libzim 9.2.2
============

* Avoid crash scenario in case of invalid offset in clusters (@mgautierfr #895)
* Many improvements around testing (@mgautierfr #889 #897)
* Windows CI using GitHub Windows runner (@mgautierfr #894)
* Better error message when failing to open (split) ZIM file (@mgautierfr #884)

libzim 9.2.1
============

* Better handling of split ZIM files (@mgautierfr #879)
* Fix creation of shared_ptr in test (@mgautierfr #881)

libzim 9.2.0
============

* Allow opening an Archive with a set of (positioned) file descriptors (@mgautierfr #860)
* Introduce new (private) method `getEntryByPathWithNamespace` (@mgautierfr #859)
* Internally catch xapian exception and rethrow ZimFileFormatError instead (@mgautierfr #873)
* Fix compilation on Haiku (@Begasus #857)
* Fix macos mmap (@mgautierfr #867)
* Optimize checksum calculation (@aryanA101a #861)
* Introduce `Formatter` helper (@ShaopengLin #862)
* Rename all `*Url*` symbols to `*Path*` (@mgautierfr #869)
* Build script: Allow to disable compilation of test (@kelson42 #854)
* [CI] Use kiwix-build github's action to download deps archive (@mgautierfr #850)
* [CI] Build libzim on macos-14 (@kelson42 #856)

libzim 9.1.0
============

* New addAlias() method in the creator (@mgautierfr #833)
* Bump-up ZIM format minor version to 6.1.2 (@mgautierfr #847)

libzim 9.0.0
============

* getMediaCount() does not fail anymore if M/Counter is missing (@mgautierfr #827)
* Reintroduce optimization of Entry::getItem() (@kelson42 @mgautierfr #836)
* C++17 compatible code (@mgautierfr #819)
* Add support to recent googletest framework (@kelson42 #830)
* Multiple fixes for Apple macOS/iOS compilation & CI (@mgautierfr @kelson42 @rgaudin #832 #839)

libzim 8.2.1
============

* Better indexing of CJK content (@mgautierfr #806)
* Fix export of symbol on Windows dll (@xiaoyifang @mgautierfr #796 #807)
* Remove accents from the search query (@mgautierfr #804)
* Improved performance of accent removal when working on long strings (@mgautierfr
  #797)
* Various improvements and fixes of the CI (@mgautierfr @kelson42)

libzim 8.2.0
============

* Deprecate `SearchIterator::getSize()` method (@mgautierfr #774)
* Fix handling of search end iterator (@mgautierfr #774)
  There were cases when we could dereference an end iterator.
* Fix suggestions with titles containing punctuation (@veloman-yunkan #765)
* Correctly publish our public API in Windows's dll (@xiaoyifang #783)
* Fix various warnings and compilation errors when compiling with the latest xcode version (@mgautierfr #782)
* Fix faulty unit-test checking for async errors (@mgautierfr #776)
* Update subproject wrap zstd to version 1.5.4 (and use upstream wrap file.) (@mgautierfr #749)
* Move main branch from `master` to `main`
* Add CI to build on aarch64 (@mgautierfr #784)
* Various CI improvements (@kelson42)

libzim 8.1.1
============

* Revert an ABI breaking change introduced in 8.1.0 (Optimization of `Entry::getItem()`)

libzim 8.1.0
============

* Optimization of the first call to `zim::Archive::iterEfficient` (@veloman-yunkan #724)
* Add some documentation to `zim::writer::IndexData` (@mgautierfr #727)
* Correctly catch and rethrow exceptions thrown in worker threads at creation (@mgautierfr #496 #748)
* Optimization of `Entry::getItem()` (@veloman-yunkan #732)
* Fix declaration of `zim::setICUDataDirectory()` (@MohitMaliFtechiz #733)
* Add `zim::Archive::getMediaCount()` (@mgautierfr #730)
* Make compilation of examples optional (@mgautierfr #738)
* Add a CI for wasm (@mgautierfr #746)
* Make constructor of SuggestionItem public (@veloman-yunkan #740)

libzim 8.0.1
============

* Fix debian packaging (@mgautierfr)

libzim 8.0.0
============

* [API-BREAK] Remove lzma compression support in writer (@veloman-yunkan #718)
* Add new method `zim::Entry::getRedirectEntryIndex()` (@veloman-yunkan #716)
* Add new helper function `zim::setICUDataDirectory()` to help android wrapper compilation (@mgautierfr #722)
* Fix `std::call_once` usage (alpine bug) (@veloman-yunkan
  #708)
* Better xapian indexation (no transaction, better compact algorithm) (@mgautierfr #719)
* Reserve more space (1968B instead of 944B) for mimetype list (@mgautierfr #720)
* [CI] Fix android compilation in the CI (@veloman-yunkan @mgautierfr #713)
* [CI] Add CI for Alpine (@veloman-yunkan #710)
* [CI] Support checkout of tag in the CI (@teeks99 #696)
* [CI] Remove movebot (@kelson42 #704)
* [CI] Remove Impish and add Kinetic packages (@legoktm #715)
* Fix code factor report (@kelson42 #700)
* Fix readme (@kelson42 #701 #716)

libzim 7.2.2
============

* Change the way we generate search result snippets.
  We now ask xapian to generate "less" relevant snippets (even if, in practice, snippets are still good),
  but it now generates snippets far more quickly.
  On cold search, with no cache and low IO, search can go from 90s to 3s. (@mgautier #697)
* [CI] Update base images (@mgautier #695)

libzim 7.2.1
============

* Make suggestions diacritics insensitive (@veloman-yunkan #691)
* [Writer] Raise an exception when the user adds an invalid entry (duplicate path) instead of printing a
  message (which can be too easily missed) and being buggy (@mgautierfr #690)
* [Writer] Do not call `hasIndexData` and `getTitle` in the main thread when we add an entry (@mgautier #684)
* [Writer] Properly clean and stop the writer even if the user hasn't called `finishZimCreation`
  (The created zim file is still invalid) (@veloman-yunkan #666)
* Add a default argument value for mimetype of `creator::addMetadata` (@kelvinhammond #678)
* Use a more informative message in exception when we cannot open a file (@veloman-yunkan #667 #668)
* Use a generic dirent lookup to search by title (@veloman-yunkan #651)
* Various improvements:
  - CI, Packaging: Stop creating packages for Ubuntu Hirsute (@legoktm #664)
  - Update Readme (@TheDuchy #660)
  - Fix cross-compilation host machine detection (@kelson42 #665)
  - Fix macos/ios compilation (@mgautierfr #672)
  - Update documentation (@mgautierfr #677, @veloman-yunkan #682)

libzim 7.2.0
============

* Add methods to get/print (dependencies) versions (@kelson42, #452)
* Fix Emscripten compilation (@kelson42, @mossroy, #643)

libzim 7.1.0
============

* Fix dirent test on 32 bits architectures (@mgautierfr #632)
* Fix compilation on Alpine - with musl (@amirouche #649)
* Don't crash if the ZIM has no illustration nor X/W namespace (@mgautierfr #641)
* Switch default suggestion operator to AND (@maneeshpm #644)
* Add a new method Archive::getMetadataItem (@mgautierfr #639)
* Better indexing criteria (@mgautierfr #642)
* Avoid duplicated archives in the searcher (@veloman-yunkan #648)
* Fix random entry (@veloman-yunkan #650)
* Various improvements.
  - CI @mgautierfr #640, @kelson42 #638, @legoktm #654
  - Doc @rgaudin #646

libzim 7.0.0
============

Version 7.0.0 is a major release.
The API has been completely rewritten. The most notable change is that namespaces are now hidden.
The new API is described in the documentation, which includes a Transition Guide from v6.
ZIM files created with it use the new ZIM minor version (6.1 - see Header section of spec.)
Both backward and forward compatibility are kept.

Improvements
------------

* Rewrite creator and reader API
  This removes the namespace from the API. Articles are automatically put in the right namespace ('A')
  and the retrieval of content is made using a specific API. (@mgautier #454)
* Better handling of the conditional compilation without xapian.
  Before that, the search API was present (but returning empty results) if libzim was compiled without xapian.
  Now the API is not present anymore. User code must check if libzim is compiled with xapian or not
  by checking if LIBZIM_WITH_XAPIAN is defined or not. (@mgautierfr #465)
* Add a new specific listing in zim files to list entries considered as "front article".
  At creation, the wrapper MUST pass the hint `FRONT_ARTICLE` to correctly mark the entry.
  Search by title uses this list if present.
  (@mgautierfr #487)
* Store the wellknown entries in the `W` namespace (`W/mainPage`) (@mgautierfr #497)
* Rewrite Search API. Fix potential memory leak and allow correct reuse of a created search.
  (@mgautierfr #530)
* New suggestion search API. The API mimics the Search API but is specialized for suggestion (@maneeshpm #574)
* Add `zim::Archive` constructors to open an archive using an existing file descriptor.
  This API is not available on Windows. (@veloman-yunkan #449)
* Make zstd the default compression algorithm (@veloman-yunkan #480)
* The method `zim::Archive::checkIntegrity` now checks if the mimetypes indicated in the dirents are correct
  (@veloman-yunkan #505)
* Writer doesn't add a `.zim` extension to the given path. (@maneeshpm #503)
* Implement random entry picking. We are choosing an entry from the "front article" list if present.
  (@mgautierfr #476)
* Creator now creates the `M/Counter` metadata. (@mgautierfr)
* Better Illustration handling. Favicon is replaced by Illustration. Illustration can now have different
  sizes and scales (even if the API does not use this feature) (@mgautierfr #540)
* Search iterator now has a method `getZimId` to know the Id of the zim corresponding to the result
  (useful for multizim search) (@maneeshpm #557)

Bug fixes
---------

* The method `zim::Archive::checkIntegrity` now checks if the dirents are correctly sorted.
  (@veloman-yunkan #448)
* Handle large MIME-type list. Some zim files may have a pretty large mimetype list. (@veloman-yunkan #460)
* Fix handling of zim files containing items of size 0. (@mgautierfr #483)
* Better parsing of the entry paths to detect the namespace (@maneeshpm #479)
* Fix zim file creation on Windows (@mgautierfr #508)
* Better algorithm tuning for suggestion search (@maneeshpm #492)
* The default indexer now indexes html content only.
  (@mgautierfr #511)
* Better suggestion search: Don't use stopwords, use OP_PHRASE (@maneeshpm #501)
* Remove duplicates in the suggestion search (@maneeshpm #515)
* Remove the termlist from the xapian database, lower memory usage (@maneeshpm #528)
* Add an anchor in the suggestion search to search for terms at the beginning of the title (@maneeshpm #526)
* Make the suggestion search work with special characters (`&`, `+`) (@veloman-yunkan #534)
* Fix creator issue not detecting that cluster must be extended if it contains only 32-bit-sized content.
  (@veloman-yunkan #552)
* Correctly generate suggestion snippet. (@maneeshpm #545)
* Better cluster size configuration (@mgautierfr #555)
* Make search iterator `getTitle` return the real title of the entry and not the one stored in the
  xapian database (caseless) (@maneeshpm #586)
* Correctly close a zim creator to avoid a crash when the creator is destructed without being started
  (@mgautierfr #613)
* Reduce the creator memory usage by reducing the memory size of the dirent (@mgautier #616, #628)
* Write the cluster using a bigger chunk size for performance (@mgautierfr #506)
* Change the default cluster size to 2MiB (@mgautierfr #585)
* The default mimetype for metadata now includes the utf8 charset (@rgaudin #626)
* Improve the estimation of the number of search/suggestion results by forcing Xapian to evaluate at least
  10 results (@mgautier #625)

Other
-----

* Update xapian stopwords list. (@data-man #447)
* Remove direct pthread dependency (use c++11 thread library). (@mgautierfr #443)
  We still need the pthread library on linux and freebsd as C++11 is using it internally.
* [CI] Make the libzim CI compile libzim natively on Windows (@mgautierfr #453).
* [CI] Build libzim package for Ubuntu Hirsute and Impish (@legoktm #459, #580)
* Always create zim file using the major version 6. (@mgautierfr #512)
* Move the test data files out of the git repository.
  Now test files are stored in the `zim-testing-suite` repository and must be downloaded.
  (@mgautierfr #538, #535)
* Add search iterator unit test (@maneeshpm #547)
* Correctly fix search iterator method case to use camelCase everywhere (@maneeshpm #563)
* Add a cast to string operator on Uuid (@maneeshpm #582)
* Make unittest print the path of the missing zim file when something goes wrong (@kelson42 #601)
* Delete temporary data (index) after we called `finishZimCreation` instead of waiting for creator destruction.
  (@mgautierfr #603)
* Add basic user documentation (@mgautierfr #611)

Known bugs
----------

The suggestion system used in the current libkiwix doesn't work with new zim files created with this release
(and future ones). A new libkiwix version will be fixed and will work with new and old zim files.

libzim 6.3.2
============

This is a hotfix of 6.3.0:

* libzim now creates zim files with zstd compression level 19 instead of 22.
  So new libzim does not need to allocate 128MB per cluster at decompression time.
* At reading time, on 32 bits architectures, zstd clusters are not kept in cache.
  This avoids also keeping the decompression stream, which reserves 128MB of memory address space.

libzim 6.3.1
============

The release process of 6.3.1 was buggy. So, no 6.3.1.

libzim 6.3.0
============

* Rewrite internal reader structure to use stream decompression.
  This allows libzim to not decompress the whole cluster to get an article's content.
  This is a big performance improvement: it speeds up random access by a factor of 2, with a very small cost
  when doing "full" incremental reading (zim-check/zim-dump). (@veloman-yunkan)
* Better dirent lookup. Dirent lookup is the process of locating article data starting from the url or title.
  This improves reading of zim files by up to 10% (@veloman-yunkan)
* Add basic, first version of `validate` function to check internal structure of a zim file.
  (@veloman-yunkan, @MiguelRocha)
* Fix compilation of libzim without xapian (@mgautierfr)
* Remove zlib dependency (and support of very old files created using zlib compression) (@mgautierfr)
* New unit tests and various small fixes.

libzim 6.2.2
============

* Check blob index before accessing it in the cluster.
* Refactoring of the cluster reading.

libzim 6.2.1 (release process broken)
=====================================

* Update readme and add link to repology.org packages list.
* Fix compilation on windows.

libzim 6.2.0
============

* Fix compilation of libzim on freebsd.
* Rewrite unit tests to remove python based tests and use gtest all the time.
* Make libzstd mandatory.
* Support for meson 0.45.
* Fix multipart support on macos.
* Add a documentation system.
* Better cache system implementation (huge speed up).
* Various (and numerous) small refactorings.

libzim 6.1.8
============

* Increase default timeout for tests to 120 seconds/test
* Compression algorithm to use can be passed to `zim::writer::Creator`
* Add automatic debian packaging of libzim.
* Fix use of tmpdir (and now use env var TMPDIR) during tests.

libzim 6.1.7
============

* Do not assume urlPtrPos is just after the mimetype list.
* Fix compilation of compression test.
* Do not exit but throw an exception if an ASSERT is not fulfilled.

libzim 6.1.6
============

* Better (faster) implementation of the ordering of articles by cluster.
* Fix compression algorithm.

libzim 6.1.5
============

* [Writer] Remove unused declarations of classes. Those classes were not implemented nor used at all.

libzim 6.1.4
============

* [Writer] Fix excessive memory usage. Cluster data was cleaned at the end of the process
  rather than as soon as it was no longer needed.

libzim 6.1.3
============

* [Writer] Use a `.tmp` suffix and rename to `.zim` at the end of the write process.
* Add unit tests
* Do not include unnecessary `windows.h` headers in public zim's headers.
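
The `.tmp`-then-rename approach named in the 6.1.3 [Writer] entry can be sketched as follows. This is only a minimal illustration of the general pattern, not libzim's actual implementation: `writeThenRename` is a hypothetical helper, and the sketch assumes a platform where `std::rename` is allowed to target the final path (on Windows it fails if the destination already exists).

```cpp
#include <cstdio>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of libzim): write `data` to `<finalPath>.tmp`,
// then rename it to `finalPath` only once the write has fully succeeded.
// A reader therefore never sees a half-written file under the final name.
void writeThenRename(const std::string& finalPath, const std::string& data) {
  const std::string tmpPath = finalPath + ".tmp";

  std::ofstream out(tmpPath, std::ios::binary | std::ios::trunc);
  if (!out) {
    throw std::runtime_error("cannot open " + tmpPath);
  }
  out.write(data.data(), static_cast<std::streamsize>(data.size()));
  out.close();  // flushes; any write or close error leaves the stream in a failed state
  if (!out) {
    std::remove(tmpPath.c_str());  // do not leave a broken temporary file behind
    throw std::runtime_error("write failed for " + tmpPath);
  }

  // Publish the finished file under its final name.
  if (std::rename(tmpPath.c_str(), finalPath.c_str()) != 0) {
    std::remove(tmpPath.c_str());
    throw std::runtime_error("cannot rename " + tmpPath + " to " + finalPath);
  }
}
```

Calling `writeThenRename("demo.zim", payload)` leaves either a complete `demo.zim` or no `demo.zim` at all; an interrupted run leaves at most a stray `demo.zim.tmp`.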

libzim 6.1.2
============

* [CI] Fix codecov configuration
* [Writer] Fix threads synchronization at end of writing process.

libzim 6.1.1
============

* Fix bug around the find function

libzim 6.1.0
============

* Now compiles on OpenBSD
* [Test] Use the main function provided by gtest.
* [CI] Move the CI compilation to github actions.
* Add stopwords for 54 new languages.
* [Writer] Improve the way we are writing clusters at zim creation time.
  - Clusters are directly written in the zim file instead of using temporary files.
  - mimetypes are limited to 944 bytes.
* Add a new type of iterator to iterate over articles in a performant way, reducing decompression of clusters.
  This is now the new default iterator.
* Add support for zim files compressed with the zstd compression algorithm.
  It is not possible to use zstd to create zim files for now.

libzim 6.0.2
============

* Fix search suggestion parsing.

libzim 6.0.1
============

* Fix crash when trying to open an empty file.
* Ensure that pytest tests are run on the CI.

libzim 6.0.0
============

* [Writer] Index the articles in different threads. This is a huge speed improvement
  as the main thread is not blocked by indexing.
* Index the title only if `shouldIndex` returns true.

libzim 5.1.0
============

* Improve indexation of the title.
* Better pertinence of suggestions (only for new zim files)
* Improvement of the speed of Levenshtein distance for suggestions (for old zims)

libzim 5.0.2
============

* Improve README.
* Remove gtest as embedded subproject.
* Better lzma compression.
* Better performance of the Levenshtein algorithm (better suggestions performance)

libzim 5.0.1
============

* Update README.
* [Writer] Add debug information (print progress of the clusters writing).
* [Writer] Correctly print the url to the user.
* [CI] Add code coverage.

libzim 5.0.0
============

* Fix thread sleeping for win32 crosscompilation.
* Fix a potential invalid access when reading dirent.
* Fix memory leak in the decompression algorithm.
* [Writer] Fix a memory leak (cluster cleaning)
* [Writer] Write article data in a temporary cluster file instead of a temporary file per article.
* [Writer] Better algorithm to store the dirent while creating the zim file. Better memory usage.
* [Writer] [API Change] Url/Ns are now handled using the same struct Url.
* [Writer] [API Change] No more aid and redirectAid. A redirectArticle has to implement redirectUrl.
* [Writer] Use a memory pool to avoid multiple small memory allocations.
* [Writer] [API Change] Rename `ZimCreator` to `Creator`.
* [API Change] File's `search` and `suggestions` now return a unique_ptr instead of a raw pointer.

libzim 4.0.7
============

* Build libzim without rpath.

libzim 4.0.6
============

* Support zim files created with clusters not written sequentially.
* Remove a meson warning.

libzim 4.0.5
============

* Store the xapian database in the right url.
* Do not fail when reading very small zim files (<256b).
* Do not print messages on normal behavior.
* [BUILDSYSTEM] Be able to build a dynamic lib (libzim.so) but using static dependencies.
* [CI] Use last version of meson.
* [CI] Use the new deps archive xz

libzim 4.0.4
============

* Fix opening of multi-part zim.
* Fix conversion of path to wpath on Windows.

libzim 4.0.3
============

* Implement low level file manipulation using different backends

libzim 4.0.2
============

* [Windows] Fix opening of zim files bigger than 4GiB

libzim 4.0.1
============

* [Writer] Fix wrong redirection log message
* Make libzim compile natively on windows using MSVC
* Better message when failing to read a zim file.
* Make libzim on windows correctly open unicode paths.
* Add compilation option to use less memory (but more I/O).
  Useful on low memory devices (android)
* Small fixes

libzim 4.0.0
============

* [Writer] Remove a lot of memory copy.
* [Writer] Add xapian indexing directly in libzim.
* [Writer] Better API.
* [Writer] Use multi-threading to write clusters.
* [Writer] Ensure mimetype of article is not null.
* Extend test timeout for cluster's test.
* Less memory copy for cluster's test.
* Allow skipping tests using a lot of memory using env variable `SKIP_BIG_MEMORY_TEST=1`
* Explicitly use the icu namespace to allow using of packaged icu lib.
* Use a temporary file name as long as the ZIM writing process is not finished (#163)
* [Travis] Do not compile using gcc-5 (but the default trusty's one 4.8)

libzim 3.3.0
============

* Fix handling of big cluster (>4GiB) on 32 bits architecture. This is mainly done by:
  * Do not mmap the whole cluster by default.
  * MMap only the memory associated to an article.
  * If an article is > 4GiB, the blob associated to it is invalid (data==size==0).
  * Other information is still valid (directAccessInformation, ...)
* Fix writing of extended cluster in writer.
* Compile libzim on macos.
* Build libzim setting RPATH.
* Search result urls are now what is stored in the zim file. They should not start with a `/`.
  This is a revert of the change made in last release. (See kiwix/kiwix-lib#123)
* Spelling corrections in README.

libzim 3.2.0
============

* Support geo query if the xapian database has indexed localisation.
* Handle articles bigger than 4GB in the zim file (#110).
* Use AND operator between search terms.
* Fix compilation with recent clang (#95).
* Add method to get article's data localisation in the zim file.
* Be able to get only a part of an article (#77).
* Do not crash if we cannot open the xapian Database for some reason.
  (kiwix/kiwix-tools#153)
* Do not assume there is always a checksum in the zim file.
  (kiwix/kiwix-tools#150)
* Try to do some sanity checks when opening a zim file.
* Use pytest to do some tests (when cython is available).
* Use levenshtein distance to sort and have better suggestion results.
* Search result urls are now always absolute (start with a '/').
  (kiwix/kiwix-lib#110)
* Open the file readonly when checking the zim file (and so be able to check read only files).
* Accept absolute url starting with '/' when searching for article.
* Fix various bugs

libzim 3.1.0
============

* Lzma is not an optional dependency anymore.
* Better handle (report and not crash) invalid zim file.
* Embed source of gtest (used only if gtest is not available on the system)
* Move zimDump tools out of libzim repository to zim-tools
* ZimCreator tool doesn't read the command line to set options.

libzim 3.0.0
============

This is a major change of the libzim.
Expect a lot of new improvements and API changes.

* Add a suggestion mode to the search
* Fix licensing issues
* Fix wrong stemming of the query when searching
* Deactivate searching (and so crash) in the embedded database if the zim is split
* Rewrite the low level memory management of libzim when reading a zim file:
  * We use a buffer based entity to handle memory and reading file instead of reading file using stream.
  * MMap the memory when possible to avoid memory copy.
  * Use const when possible (API break)
* Move to googletest instead of cxxtools for unit-tests.
* Fix endianness bug on arm.
* Do not install private headers. Those headers declare private structures and should not be visible
  (API break)
* Compile libzim with `-Werror` and `-Wall` options.
* Make libzim thread safe for reading articles.
  The search part is not thread safe, and all search operations must be protected by a lock.
* Add method to get only a part of an article.
* Move some tools to zim-tools repository.

libzim 2.0.0
============

* Move to meson build system
  `libzim` now uses `meson` as build system instead of `autotools`
* Move to C++11 standard.
* Fulltext search in zim file.
  We have integrated the xapian fulltext search in libzim.
  So now, libzim provides an API to search in a zim containing an embedded fulltext index.
  This means that:
  * libzim needs xapian as an (optional) dependency (if you want to compile with xapian support).
  * The old and unused search API has been removed.
* Remove bzip2 support.
* Remove Symbian support.
* Few API changes
  * Make some header files private (not installed);
  * A `Blob` can now be cast to a `string` directly;
  * Change a lot of `File` methods to const methods.