Releases: kuzudb/kuzu
v0.11.2
v0.11.2 is a minor release with several bug fixes and improvements.
What's Changed
- parsing: fix i128 min check by @tgahunia05 in #5809
- Update setuptools by @mewim in #5822
- Remove `uv` dependency by @mewim in #5824
- Fix LSQB benchmark runs by @ray6080 in #5820
- Fix OnDiskGraph vertex scan during transaction by @ray6080 in #5823
- Clean up unused getEstimatedMemUsage functions by @ray6080 in #5827
- Implement casting between unions, fix issues with nested unions and union lists/arrays by @05st in #5764
- Fix detach delete in the local rel table by @ray6080 in #5830
- Implement fts update by @acquamarin in #5818
- Union casting code small cleanup by @05st in #5829
- Fix unwind binding by @acquamarin in #5838
- Cache prepared statement in client context by @andyfengHKU in #5846
- fsync() the WAL on open by @thisismiller in #5843
- Refactor JNI bindings to use camelCase by @mewim in #5851
- Add necessary APIs for front-end extensions by @acquamarin in #5848
- Fullfsync on macOS/iOS by @ray6080 in #5850
- Move generated system config header file to binary to avoid polluting src by @ray6080 in #5854
- Fix recovering twice from corrupted wal by @ray6080 in #5852
- Support wildcard patterns in FTS queries by @05st in #5842
- Add locks to local hash index and add basic concurrent insertion tests by @ray6080 in #5828
- Implement optional params handling in table function by @acquamarin in #5855
New Contributors
- @thisismiller made their first contribution in #5843
Full Changelog: v0.11.1...v0.11.2
Release 0.11.1
v0.11.1 is a minor release with several bug fixes and improvements.
What's Changed
- 8 Byte hash slot for AggregateHashTable by @benjaminwinger in #5683
- Add Swift package to README by @mewim in #5751
- Fix attempting to free mmapped memory when spilling to disk by @benjaminwinger in #5673
- extension(llm): use proper provider names by @tgahunia05 in #5752
- extension(llm): update test suite with proper provider names by @tgahunia05 in #5753
- Fix fd leaks of shadow files by @ray6080 in #5756
- Fix syntax highlighting for quotes by @05st in #5754
- Fix export database error when a column was dropped by @ray6080 in #5763
- Remove lock file by @ray6080 in #5744
- Remove null from count agg state by @acquamarin in #5760
- Fix `FlatTuple::toString` truncating with UTF-8 by @05st in #5758
- Fix empty unique_ptr dereference by @benjaminwinger in #5641
- Support `ignore_pattern` option in full text search by @acquamarin in #5765
- Create default values for union during copying / runtime by @05st in #5772
- Refactor batch insert operator by @andyfengHKU in #5456
- cli: create history file in home directory by @sdht0 in #5774
- Clean up `minUncommittedNodeOffsets` in Transaction and keep startOffset inside each LocalNodeTable by @ray6080 in #5781
- migration-testing: update script with option to split test files by @tgahunia05 in #5773
- Fix create empty node by @andyfengHKU in #5784
- Export symbols for external extensions by @andyfengHKU in #5783
- Remove `Transaction::getClientContext` interface by @ray6080 in #5786
- Add C API InternalID getter for rel value by @mewim in #5789
- Update DDL to be in line with recent Kuzu syntax by @prrao87 in #5785
- Fix types path in nodes_api package.json by @chrisvariety in #5793
- Align TS types with actual implementation by @chrisvariety in #5794
- Remove expression copy by @andyfengHKU in #5796
- export-import: implement parallel=false flag on export to fix import function by @tgahunia05 in #5770
- query profiler: segfault trying to profile export database by @tgahunia05 in #5803
- Move the ownership of Transaction from TransactionContext to TransactionManager by @ray6080 in #5798
- Fix Windows wheel by @mewim in #5812
- Bump version to 0.11.1 by @mewim in #5813
New Contributors
- @chrisvariety made their first contribution in #5793
Full Changelog: v0.11.0...v0.11.1
v0.11.0
Kuzu v0.11.0 is here! The team is excited to unveil this release, which is packed with a ton of new features and improvements:
- Single-file databases
- Vector index and FTS improvements
- Mutable indices
- Filtered vector search with arbitrary Cypher queries
- FTS performance improvements
- LLM extension
- Azure support
- Swift API
Besides these major features, we have also added several new Cypher features, including:
- Create table as
- Alter relationship table add/drop connection
Please check our release post for more details!
What's Changed
- Support scan pandas dict by @acquamarin in #5370
- Add gds document examples by @andyfengHKU in #5294
- Extension uninstall by @acquamarin in #5361
- cypher: Improve Empty List and Null Handling by @tgahunia05 in #5356
- Implement force-install and update extension by @acquamarin in #5368
- ci: modify existing benchmark comment instead of multiple new comments by @sdht0 in #5384
- Truncate free pages at the end of the file in finalizeCheckpoint by @royi-luo in #5353
- Remove planner enumeration logic and output single optimal plan by @andyfengHKU in #5388
- extension(neo4j): print as output the detected properties and types by @sdht0 in #5379
- ci: fix multiplatform builds by @sdht0 in #5382
- Ease the type check on WASM prepared statement by @mewim in #5394
- Skip gcs tests for now by @ray6080 in #5401
- More BM exception robustness fixes by @royi-luo in #5375
- cli: piping into queries leads to silent exit by @tgahunia05 in #5392
- stringFormat: pass args by reference instead of value by @sdht0 in #5399
- Unify semi mask used in GDS, HNSW index & Recursive join by @andyfengHKU in #5403
- Add `CREATE NODE TABLE ... AS` syntax by @05st in #5354
- stringFormat: optionally use std::format for compile-time checks by @sdht0 in #5402
- Revert "Add `CREATE NODE TABLE ... AS` syntax" by @ray6080 in #5409
- extension(neo4j): kleene star migration syntax by @tgahunia05 in #5405
- Revert "Skip gcs tests for now" by @mewim in #5415
- Fix undirected edge src&dst node incorrect order by @andyfengHKU in #5416
- Add `CREATE NODE TABLE AS` syntax by @05st in #5414
- Update Java API Exception Handling by @tgahunia05 in #5418
- Enable static linking for extensions when building for Swift by @mewim in #5420
- Move catalog entry related initialization from mapCopyRelFrom to RelBatchInsert::initGlobalStateInternal... by @05st in #5419
- Clean up testing artifacts from previous PR by @05st in #5423
- Move index and overflow files into data.kz by @ray6080 in #5386
- refactor(nodejs_api): enhance type definitions and add async/sync met… by @jasonviipers in #5426
- Add noop operator and fix planning for create table as by @andyfengHKU in #5429
- feat(nodejs_api): add ES module support by @jasonviipers in #5427
- Remove unnecessary copy of ResultSetDescriptor by @andyfengHKU in #5432
- updates the `README.md` (nodejs api) to enhance clarity by @jasonviipers in #5434
- Clear unnecessary source operator copy by @andyfengHKU in #5436
- Update setuptools by @mewim in #5438
- Fix unwind with null value by @acquamarin in #5441
- Fix export database with json type in schema by @acquamarin in #5440
- Reorganize files under storage by @ray6080 in #5447
- Move catalog and metadata into data.kz by @ray6080 in #5431
- tests: export from prev release and import into a new release by @tgahunia05 in #5449
- Remove the early checkpoint for empty primary key indices by @benjaminwinger in #5439
- Separate PrimaryKeyIndex constructor by @ray6080 in #5455
- Fix module paths in package.json for Node.js by @mewim in #5459
- Rework rel group by @acquamarin in #5280
- tests: add ability to automatically rewrite test files by @tgahunia05 in #5442
- Add ability to pass python dataframes as query parameters by @royi-luo in #5376
- Clear logical plan by @andyfengHKU in #5463
- Avoid reserializing catalog/storage metadata if there are no changes since last checkpoint by @royi-luo in #5462
- Add Azure extension by @05st in #5460
- Fix macos extension tests by @royi-luo in #5468
- Add generic filtered search for vector index. by @andyfengHKU in #5464
- python: add support for iteration and dict output by @sdht0 in #5469
- chore: remove all code warnings by @sdht0 in #5471
- Collect memory optimization by @royi-luo in #5467
- python: code cleanup by @sdht0 in #5470
- extension(llm): create embedding function by @tgahunia05 in #5466
- ci: auto format rust & python code by @sdht0 in #5472
- Add ADLS support to Azure extension by @05st in #5475
- Support detach delete from single direction rel table by @royi-luo in #5476
- Better error message for wasm extension by @acquamarin in #5480
- Remove file format option when scan from azure by @acquamarin in #5478
- Use malloc-based buffer manager for Swift by @mewim in #5482
- Refactor card estimator by @andyfengHKU in #5479
- Loadable index by @ray6080 in #5477
- Index insert interface by @ray6080 in #5485
- Implement alter rel group add node table pair by @acquamarin in #5486
- Assign max logical type for property with different types by @andyfengHKU in #5487
- Remove some dead and unnecessary code by @benjaminwinger in #5461
- Add node rel comparison rewrite by @andyfengHKU in #5490
- Make zone map work with casted string columns by @royi-luo in #5493
- Optimize stem function with static stemmer by @acquamarin in #5497
- Java API: add tests to check conversion of values in prepared statements by @05st in #5495
- Add `CREATE REL TABLE AS` syntax by @05st in #5437
- Move OnDiskGraph and TableEntry out of HNSWIndex by @ray6080 in #5500
- Implement FTS insertion by @acquamarin in #5491
- lints: check for braces around single line statements by @tgahunia05 in #5507
- A bit clean up of isInMemory and shouldLogToWAL by @ray6080 in #5505
- Java API: fix string encoding scheme mismatches by @05st in #5506
- Reduce HNSW index upper layer memory usage by @royi-luo in #5499
- Fix decimal datatype casting by @acquamarin in #5501
- Remove unused arrow aux buffer by @ray6080 in #5510
- Reclaim space for all structures in data.kz file by @ray6080 in #5443
- Support incremental insertions for HNSW by @ray6080 in #5509
- Extension llm create embedding function by @tgahunia05 in #5498
- Fix invalid regex expressions occurring due to nested parentheses by @05st in #5514
- build: allow system dependencies in cmake, refactor Makefile and CI by @sdht0 in #5474
- shell: add support for ENTER from middle of a query by @sdht0 in #5380
- Optimize default stopwords lookup by @acquamarin in #5518
- Add graph cypher projection by @andyfengHKU in #5502
- python: always check requirements by @sdht0 in #5526
- Implement fts deletion when nodes are delete...
v0.10.1
v0.10.1 is a minor release which includes multiple fixes and improvements for the Java API bindings.
Full Changelog: v0.10.0...v0.10.1
v0.10.0
We're happy to announce v0.10.0 of Kuzu. This release contains the following features:
- Graph algorithm extension, including
  - K-Core decomposition
  - PageRank
  - Louvain
  - Weakly connected components
  - Strongly connected components
  - Weighted shortest paths (supported as Cypher pattern matching)
- Neo4j migration extension
- Android support
- Scan compressed CSV files
Besides these new features, we have also made several performance related changes, including
- Free space management mechanism to reclaim space as you update the database
- Performance improvement for recursive queries
- Performance improvement for json scanning
Please check our release post for more details!
What's Changed
- Compress neighbour offsets for in-mem HNSW graph by @royi-luo in #5054
- Allow empty partial column in copy from clause by @acquamarin in #5180
- rust: tie result lifetime to the database by @sdht0 in #5182
- Fix stop threshold for BM evictions by @benjaminwinger in #5176
- Implement to_epoch_ms function by @acquamarin in #5187
- Add click benchmark by @ray6080 in #5170
- Fix windows import db path bug by @acquamarin in #5192
- Fix copy from subquery state bug by @acquamarin in #5193
- Process potential BufferManager eviction candidates in batches by @benjaminwinger in #5196
- Merge consecutive match clause by @andyfengHKU in #5179
- Fix copy from subquery type mismatch by @andyfengHKU in #5197
- Update click benchmark by @ray6080 in #5198
- Use case insensitive map when binding query by @andyfengHKU in #5200
- Fix json null by @acquamarin in #5202
- Fix json httpfs by @acquamarin in #5206
- Replace Dockerhub with GHCR for extension repo by @mewim in #5213
- Fix click benchmark q29 by @benjaminwinger in #5212
- Add TypeScript definitions for kuzu database API by @mewim in #5201
- expedite clang tidy check by @acquamarin in #5208
- Support DOUBLE column for vector index by @ray6080 in #5209
- Support s3 session token param by @acquamarin in #5217
- Fix scan pandas/polars type error by @acquamarin in #5218
- Fixes json type comparison casting error by @acquamarin in #5219
- Improve benchmark error reporting by @benjaminwinger in #5220
- Limit the max DB size for the rust tests by @benjaminwinger in #5224
- Remove enabled check from semi mask by @andyfengHKU in #5221
- Fix merge by @andyfengHKU in #5225
- Add schedule run for clickbench by @ray6080 in #5223
- Handle rel group in projected graph by @andyfengHKU in #5227
- First version of neo4j migration tool extension by @acquamarin in #5186
- Fix ClickBench report url conflict by @mewim in #5229
- Free space management initial PR by @royi-luo in #5205
- Improve benchmark tool by @mewim in #5230
- Fix sql-query escape character by @acquamarin in #5232
- rust: bump min rust version to 1.81, update CI by @sdht0 in #5185
- Replace CountZeros implementation with one based on std::countl_zero/countr_zero by @benjaminwinger in #5234
- Implement dynamic sparse/dense frontier switch for recursive joins by @andyfengHKU in #5214
- Fix ALP test by @royi-luo in #5244
- Improve json scan by @acquamarin in #5237
- Add sparse frontier threshold setting by @andyfengHKU in #5238
- Fix failure limit when doing BM Evictions by @benjaminwinger in #5236
- Avoid double-counting errored lines for CSV warning line number by @royi-luo in #5243
- Add negative weight check to weighted shortest path by @andyfengHKU in #5239
- Build Java bindings for Android ARMv8-A platform by @mewim in #5248
- Improve scc performance by @andyfengHKU in #5241
- Fix handling of SimpleAggregate non-distinct functions mixed with distinct ones by @benjaminwinger in #5253
- Implement gzip compression by @acquamarin in #5252
- Improve parallel scc, handle filtered multi-label graph by @andyfengHKU in #5255
- FSM: Reclaim pages when dropping columns/tables by @royi-luo in #5235
- Make the scan of relID optional for GDS algorithms by @ray6080 in #5249
- Replace comparisons with toLower/toUpper with caseInsensitiveEquals by @benjaminwinger in #5258
- Add sparse weight shortest destination by @andyfengHKU in #5260
- Make gds as an extension by @acquamarin in #5261
- Skip reconstructing line for compressed CSV errors by @royi-luo in #5257
- Add weighted shortest path cost function by @andyfengHKU in #5262
- Improve json `getFieldIdx` function by @acquamarin in #5266
- Use the FactorizedTable's InMemOverflowBuffer in MinMaxAggregateState by @benjaminwinger in #5247
- Throw error during import db by @ray6080 in #5272
- Fix export database with special property name by @andyfengHKU in #5279
- Make ChunkedCSRHeader's capacity in CSRNodeGroupScanState be 1 if random lookup by @ray6080 in #5278
- Test framework respects 'threads' setting unless 'PARALLELISM' is set by @royi-luo in #5282
- Reclaim pages for fully-deleted node groups by @royi-luo in #5259
- Change on disk graph rel scan to random lookup by @ray6080 in #5285
- Optimize timestamp parsing by @benjaminwinger in #5270
- Add error message when forget to load extension by @acquamarin in #5290
- Support max iterations in wcc, scc by @acquamarin in #5291
- Fix get_as_arrow with null array by @acquamarin in #5293
- Reduce prereserving of capacity for string offset/data chunks by @royi-luo in #5284
- Pass LogicalType to ColumnChunk by rvalue by @benjaminwinger in #5296
- Add doc example for compressed csv by @acquamarin in #5298
- Optimize RelTableScanState during rel table update/delete and detach delete by @ray6080 in #5297
- Flush + populate warnings after COPY FROM subquery by @royi-luo in #5300
- QUERY_VECTOR_INDEX + MATCH bind error by @ted-wq-x in #5292
- Implement json allocator by @acquamarin in #5256
- Json scan small field index optimization by @benjaminwinger in #5281
- Add page rank normalize config by @andyfengHKU in #5304
- Give error message when exporting to compressed file by @acquamarin in #5305
- gds: init louvain by @sdht0 in #5155
- Add doc example for neo4j extension by @acquamarin in #5299
- Fix date conversion overflow issue for Node.js and WASM by @mewim in #5311
- Handle null nodes and null properties in neo4j extension by @acquamarin in #5308
- Optimize set of nodeID for basic scan node table queries by @ray6080 in #5310
- Fix list-concat functions by @acquamarin in #5319
- gds(louvain): add optional configuration by @sdht0 in #5321
- Increase number of struct fields to 65536(2^16) by @acquamarin in #5323
- Add snap GDS tests by @andyfengHKU in #5233
- Gds rename by @acquamarin in #5327
- Fix top k parameter by @andyfengHKU in #5326
- Rename create_projected_graph to project_graph by @acquamarin in #5329
- Add GDS interrupt and progress by @andyfengHKU in https://github.c...
v0.9.0
We’re delighted to announce the release of Kuzu 0.9.0, whose most notable feature is a new vector extension that allows you to perform similarity search over vector data fully within Kuzu.
Other features include:
- Arbitrary SQL scans from Postgres databases
- WASM with bundled extensions
- Async Python API and Sync Node.js API
- Unity Catalog integration
- MCP server implementation
- G.V() integration
Besides new features, we've substantially improved the performance of aggregation and of FTS index creation.
Please check our release post for more details.
What's Changed
- Add tests with different node group sizes + fix bugs by @royi-luo in #4928
- Temporarily disable daily build until 0.8.2 release by @mewim in #4950
- Fix list-predicate functions by @acquamarin in #4947
- Implement hint for unnested subquery by @andyfengHKU in #4955
- Refactor scalar_func_exec_t to take in separate selection vectors by @royi-luo in #4948
- Revert "Temporarily disable daily build until 0.8.2 release" by @mewim in #4956
- Refactor semi mask by @andyfengHKU in #4940
- Skip inserting null keys into distinct hash tables by @benjaminwinger in #4949
- Fix compile warnings in function executors by @royi-luo in #4960
- Optimize embedding caching in vector index construction by @ray6080 in #4920
- Improve semi mask planning by @andyfengHKU in #4957
- Optimize InMemHNSWGraph::getNeighbors by @ray6080 in #4951
- Fix export database with official extension. by @acquamarin in #4961
- Fix label predicate in recursive pattern by @andyfengHKU in #4966
- Support EXISTS subquery in recursive pattern predicate by @andyfengHKU in #4969
- Fix json output format in shell by @acquamarin in #4968
- gds: init scc computation using kosaraju's algorithm by @sdht0 in #4893
- Fix extend cardinality by @royi-luo in #4843
- Parallel Distinct SimpleAggregate by @benjaminwinger in #4934
- Remove old recursive extend by @andyfengHKU in #4976
- Reuse scan state for multiple table scans by @ray6080 in #4975
- Support customized extension repo by @acquamarin in #4973
- Support lambdas on lists with size > DEFAULT_VECTOR_CAPACITY by @royi-luo in #4979
- Refactor parsed expr visitor by @andyfengHKU in #4977
- Refactor DDL operator by @andyfengHKU in #4984
- Fix Python build issue caused by shell_printer by @mewim in #4985
- Support ignore_errors option for copy from subquery by @royi-luo in #4988
- Update Kùzu to Kuzu for consistency with SEO discoverability by @prrao87 in #4965
- Only merge distinct aggregate hash tables into the global queues when full by @benjaminwinger in #4972
- Fix hash 256 function by @acquamarin in #4989
- Statistics update optimization by @benjaminwinger in #4980
- Fix deserialization of empty node groups by @royi-luo in #4987
- Update CI workflow to use Debian 12 for code coverage job by @mewim in #4996
- Refactor table scan state interfaces by @ray6080 in #4981
- Fix json output mode with meta-data by @acquamarin in #4993
- Fix json casting issue by @acquamarin in #4992
- Selection vector slicing with lightweight SelectionView by @benjaminwinger in #4998
- Support gds optional args by @acquamarin in #4999
- Add projected graph node filter by @andyfengHKU in #4990
- Split recursive join and gds at logical operator level by @andyfengHKU in #5003
- Enable dynamic dispatch for simsimd by @royi-luo in #5000
- Add is ready only field to function by @andyfengHKU in #5002
- Add squared distance function for arrays by @royi-luo in #5008
- Refactor gds frontier by @andyfengHKU in #5006
- refactor: Kùzu -> Kuzu by @sdht0 in #5005
- Tie rust QueryResult lifetime to that of the Database by @benjaminwinger in #5009
- Implement SQL_QUERY function by @acquamarin in #5010
- Fix nested decimal type casting by @acquamarin in #5018
- Split recursive extend and gds at binding level by @andyfengHKU in #5020
- Added rustdoc example for Connection::execute by @benjaminwinger in #5022
- Try to fix CI workflow for forked repo by @mewim in #5027
- Implement copy from table function by @acquamarin in #5023
- Split rec join and gds at physical level by @andyfengHKU in #5021
- Refactor gds output writer by @andyfengHKU in #5028
- gds: init parallel scc by @sdht0 in #5011
- Hide columns in table func bind data by @andyfengHKU in #5029
- Disable compression on floating point values in array/list by @ray6080 in #5035
- Separate SemiMask interface and implementation by @ray6080 in #5036
- Implement parameter casting for table functions by @acquamarin in #5034
- Fix incorrect src dst for undirected path by @andyfengHKU in #5041
- Revert "Try to fix CI workflow for forked repo" by @mewim in #5043
- Expose semi mask sub-plan in the logical plan tree by @ray6080 in #5037
- Swap tableName and indexName in hnsw functions by @ray6080 in #5038
- Allow loading from multiple files by @ray6080 in #5045
- Remove cardinality from tableFuncBindData by @acquamarin in #5039
- Refactor table function by @acquamarin in #5046
- Support handling null/deleted nodes during vector index creation by @royi-luo in #5014
- Improve SelectionVector::fromValueVectors by @ray6080 in #5052
- Make QueryResult::toString const by @benjaminwinger in #5013
- Make GDS table function by @andyfengHKU in #5048
- Change hnsw input parameter types by @acquamarin in #5042
- Use copyNullMask instead of looping during copies of nulls by @benjaminwinger in #5015
- Implement multi-labeled wcc by @andyfengHKU in #5057
- Implement synchronous APIs for Node.js bindings by @mewim in #5058
- Multi label page rank by @andyfengHKU in #5060
- Use correct offset to access vector index embeddings during creation by @royi-luo in #5063
- Add AsyncConnection for asynchronous query execution on Python API by @mewim in #5061
- Clear table function signatures by @andyfengHKU in #5059
- Filtered HNSW search by @ray6080 in #5019
- Unify vector index, gds & table function planning by @andyfengHKU in #5067
- Implement `internal_id` function to create `internal_id` literal by @acquamarin in #5071
- Throw exception when extension rewrite functions called in a multi-statement query by @acquamarin in #5072
- Add job for testing simsimd dynamic dispatch to nightly build-and-deploy workflow by @royi-luo in #5007
- Vector extension by @ray6080 in #5047
- Add option blind/directed upper sel threshold; rename distFunc to metric by @ray6080 in #5069
- Move OutputNodeMask output GDSComputeState by @andyfengHKU in #5077
- Optimize regex match execution by @acquamarin in #5079
- Add projected graph with table dropping tests by @andyfengHKU in #5073
- Rename hnsw functions by @ray6080 in #5078
- Support yield for QUERY_VECTOR_INDEX by @ray6080 in https://github.com/ku...
v0.8.2
v0.8.2 is a minor release to fix the distinct hash aggregate with NULL bug.
We're just a couple months into 2025, and we are happy to announce a new minor release: v0.8.2. This release is feature-packed, warranting its own blog post. One of the highlights is the introduction of the `unity_catalog` extension, which allows you to scan/copy from Delta Lake tables managed by Unity Catalog.
We've also improved our existing extensions. For those of you on Google Cloud, we have some exciting news! We now support scanning/copying from/writing to files hosted on Google Cloud Storage (GCS) filesystem. This update leverages our existing `httpfs` extension. Another useful new feature is that our CLI now explicitly excludes confidential information such as S3 access keys from being stored in the command history file. This helps prevent accidental leakage of sensitive data into your command line history and ensures your credentials remain secure.
Our full-text search extension now supports customizing the stopwords table used in full-text search, which can be helpful in your custom domains where specific words not in the default list need to be excluded from the index.
From a performance perspective, we’ve significantly improved our execution of distinct aggregation queries via a new parallel distinct hash aggregation mechanism.
Please check our release post for more details. Hope you enjoy this release!
Full Changelog: v0.8.1...v0.8.2
v0.8.1
We're just a couple months into 2025, and we are happy to announce a new minor release: v0.8.1. This release is feature-packed, warranting its own blog post. One of the highlights is the introduction of the `unity_catalog` extension, which allows you to scan/copy from Delta Lake tables managed by Unity Catalog.
We've also improved our existing extensions. For those of you on Google Cloud, we have some exciting news! We now support scanning/copying from/writing to files hosted on Google Cloud Storage (GCS) filesystem. This update leverages our existing `httpfs` extension. Another useful new feature is that our CLI now explicitly excludes confidential information such as S3 access keys from being stored in the command history file. This helps prevent accidental leakage of sensitive data into your command line history and ensures your credentials remain secure.
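The idea of keeping credentials out of a shell history can be sketched in a few lines of Python. This is only an illustration of the general technique, not Kuzu's implementation: the function names (`is_confidential`, `record_history`) are hypothetical, and the list of sensitive option names is an assumption that may not match what the CLI actually checks.

```python
import re

# Assumed list of sensitive option names; the real CLI may use a different list.
SENSITIVE_OPTIONS = ("s3_access_key_id", "s3_secret_access_key")
_PATTERN = re.compile("|".join(SENSITIVE_OPTIONS), re.IGNORECASE)


def is_confidential(statement: str) -> bool:
    """Return True if the statement mentions any sensitive option name."""
    return _PATTERN.search(statement) is not None


def record_history(history: list, statement: str) -> None:
    """Append the statement to the history unless it looks confidential."""
    if not is_confidential(statement):
        history.append(statement)


history = []
for stmt in [
    "MATCH (n) RETURN count(n);",
    "CALL s3_access_key_id='AKIA-EXAMPLE';",
    "CALL S3_SECRET_ACCESS_KEY='example-secret';",
    "LOAD FROM 's3://bucket/file.csv' RETURN *;",
]:
    record_history(history, stmt)

print(history)
```

Only the two statements that do not mention a sensitive option end up in `history`. Matching on option names rather than on values means a secret is skipped even when its value has no recognizable shape.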
Our full-text search extension now supports customizing the stopwords table used in full-text search, which can be helpful in your custom domains where specific words not in the default list need to be excluded from the index.
From a performance perspective, we’ve significantly improved our execution of distinct aggregation queries via a new parallel distinct hash aggregation mechanism.
What's Changed
- Include delta/iceberg loader in extension build cmake by @acquamarin in #4855
- Fix rust build threads in CI by @benjaminwinger in #4853
- Improve label pruning by @andyfengHKU in #4842
- Remove SimpleTableFunction and make TableFunction non-extendable by @ray6080 in #4840
- adjusted cli autocomplete ordering by @MSebanc in #4845
- Table function logical plan by @ray6080 in #4848
- Forward declare ExecutionContext in PhysicalOperator by @ray6080 in #4859
- Add more test cases on vector size by @ray6080 in #4240
- Update README.md by @semihsalihoglu-uw in #4866
- Support customization of stopwords in full text search by @acquamarin in #4864
- Table function physical plan by @ray6080 in #4862
- build: separate out test build and run by @sdht0 in #4844
- Speed up recompilation when changing compile-time config by @royi-luo in #4850
- Optimize recompile times by @royi-luo in #4863
- Add Catalog version by @ray6080 in #4869
- Rename clone to copy by @andyfengHKU in #4871
- Allow COPY FROM in manual transactions by @ray6080 in #4872
- Refactor gds framework by @andyfengHKU in #4860
- Rework rewriteFunc of table functions by @ray6080 in #4873
- Migrate API docs to standalone repo by @mewim in #4875
- Fix extension-only build by @royi-luo in #4876
- Fix lambda function list-size limitation by @acquamarin in #4879
- Fix non exist pk error message by @acquamarin in #4883
- Fix nested agg binding exception by @andyfengHKU in #4885
- Remove count from evaluator local state by @andyfengHKU in #4878
- Refactor path backtrack by @andyfengHKU in #4886
- Make weight shortest path cost as double by @andyfengHKU in #4887
- Fixes the `alias` issue of `struct_pack` function by @acquamarin in #4894
- Remove recursive extend binding by @andyfengHKU in #4896
- Parallel distinct hash aggregate by @benjaminwinger in #4881
- Fix export/import database by @acquamarin in #4900
- Fix `skipWhiteSpace()` function by @acquamarin in #4907
- Clean update info during node group checkpoint by @ray6080 in #4895
- Try to mitigate data race in the HashAggregate by @benjaminwinger in #4906
- Add empty columns to chunked node group if needed during COPY by @royi-luo in #4882
- Implement unity catalog extension by @acquamarin in #4890
- Disable test NodeUpdateTest.UpdateSameRowRedundtanly for in mem mode tests by @royi-luo in #4911
- Fix projection profiler by @andyfengHKU in #4908
- Separate hash aggregate finalization into its own operator by @benjaminwinger in #4913
- Break when error in the middle of multi statements by @ray6080 in #4914
- Allow create/drop hnsw index to run in manual transactions by @ray6080 in #4877
- Fix timing in the final QueryResult for create fts/hnsw index by @ray6080 in #4915
- Wsp track path by @andyfengHKU in #4898
- Fixes export database with stopwords by @acquamarin in #4917
- Split shortest path implementation to different files by @andyfengHKU in #4919
- Implement confidential statement by @acquamarin in #4910
- Update Kuzu logo image by @mewim in #4923
- Add predicate information per table to graph entry by @andyfengHKU in #4924
- Enable spilling during finalization of CreateHNSWIndex by @ray6080 in #4921
- Add GCS support by @royi-luo in #4892
Full Changelog: v0.8.0...v0.8.1
v0.8.0
We're kicking off the year 2025 with the exciting release of Kùzu 0.8.0, which brings two new features:
- Kùzu-WASM for in-browser graph analytics. You can now run your graph database while keeping all data and compute within your browser session!
- `fts` extension for full-text search. You can now run keyword-based search queries using BM25 in Kùzu.
In addition to these new features, we’ve streamlined the developer workflow during relationship table creation by unifying `CREATE REL TABLE GROUP` into a single, flexible `CREATE REL TABLE` syntax.
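As a rough illustration of the unified syntax, the sketch below contrasts the two DDL forms as plain strings. The table names (`User`, `City`, `Knows`), the `since` property, and the helper `unify` are made up for this example, and the exact grammar should be checked against the Kuzu documentation for your version. The commented-out lines assume the `kuzu` Python package is installed.

```python
# Hypothetical schema used only for illustration.
NODE_TABLES = [
    "CREATE NODE TABLE User(name STRING, PRIMARY KEY (name))",
    "CREATE NODE TABLE City(name STRING, PRIMARY KEY (name))",
]

# Legacy form: several FROM/TO pairs required the GROUP keyword.
OLD_DDL = (
    "CREATE REL TABLE GROUP Knows("
    "FROM User TO User, FROM User TO City, since INT64)"
)


def unify(ddl: str) -> str:
    """Rewrite a legacy statement into the unified form by dropping GROUP."""
    return ddl.replace("CREATE REL TABLE GROUP ", "CREATE REL TABLE ", 1)


NEW_DDL = unify(OLD_DDL)
print(NEW_DDL)

# To try the statements against a database (assumes `pip install kuzu`):
# import kuzu
# conn = kuzu.Connection(kuzu.Database("demo_db"))
# for stmt in NODE_TABLES + [NEW_DDL]:
#     conn.execute(stmt)
```

A statement with a single FROM/TO pair never used `GROUP`, so `unify` leaves it unchanged.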
Finally, we’ve significantly improved our execution of aggregation queries via a new parallel hash aggregation mechanism.
Please check our release post for more details. Hope you enjoy this release!
What's Changed
- Add check of test file name on `-`, fix doc_example.test of JSON extension by @SterlingT3485 in #4537
- Add GDS support for vertex property scanning by @benjaminwinger in #4453
- Implement list_has_all by @acquamarin in #4546
- Added Unicode \u and \U parsing to the cli by @MSebanc in #4492
- Migrate macOS CI to use both x86-64 and ARM64 runner by @mewim in #4542
- Avoid importing polars in arrow scan by @acquamarin in #4551
- GDS interface cleanup by @andyfengHKU in #4524
- Use a flag to determine if the SelectionVector uses INCREMENTAL_SELECTED_POS by @benjaminwinger in #4552
- Fix VersionInfo SelectionVector creation by @benjaminwinger in #4556
- Remove some redundant compiler flags on Windows rust builds by @benjaminwinger in #4555
- Using shared_mutex instead of mutex in CatalogSet by @ted-wq-x in #4533
- Implement full text search by @acquamarin in #4416
- Rel scan selection optimizations by @benjaminwinger in #4558
- Fix wrong sniff type when sample_size = 1 by @SterlingT3485 in #4565
- Alter table with index by @acquamarin in #4563
- Fix the type cast in nested struct by @SterlingT3485 in #4560
- Wasm buffer manager support by @benjaminwinger in #4523
- Implement optional args in full text search by @acquamarin in #4569
- Fix config parameters binding for C API by @mewim in #4574
- Deprecated Ubuntu 23 runners from multi-platform test by @mewim in #4576
- Add more explicit comparisons when checking test result output by @benjaminwinger in #4280
- Minor path writer refactor by @andyfengHKU in #4580
- Fix rollback during Node Table COPY by @royi-luo in #4467
- Rework build pipeline by @mewim in #4581
- Gds node predicate push down by @andyfengHKU in #4461
- Migrate macOS build docs to internal repo by @mewim in #4585
- Handle failed queries in test runner if expected output is tuples by @royi-luo in #4584
- Add missing include; fix clangd by @mewim in #4586
- Fix flat select bug by @andyfengHKU in #4590
- Fixed cli handling of escape sequences after autocompletion by @MSebanc in #4589
- Revert "Fixed cli handling of escape sequences after autocompletion" by @andyfengHKU in #4594
- Increase buffer pool size for LargeListTest by @royi-luo in #4596
- Delta extension by @acquamarin in #4587
- Add test case name check & Share build between test and extension-test by @SterlingT3485 in #4572
- Add a homeDir field in LocalFileSystem for removefile to check by @SterlingT3485 in #4543
- Avoid double updating linesPerBlock in SharedFileErrorHandler if an exception is thrown by @royi-luo in #4601
- Implement drop/add property with if exists by @acquamarin in #4598
- Refactors the compilation of extensions by @acquamarin in #4602
- Remove unnecessary logic in ProcessorTask::finalizeIfNecessary by @royi-luo in #4573
- Cardinality estimation on top of HLL by @ray6080 in #4433
- Add recursive join benchmark by @andyfengHKU in #4603
- Parallel init dense gds array by @andyfengHKU in #4588
- Add inturrupt to path writer by @andyfengHKU in #4609
- Add sparse frontier implementation by @andyfengHKU in #4557
- Implement `FORMAT` option in `LOAD FROM` clause. by @acquamarin in #4613
- Add e-notation double by @andyfengHKU in #4616
- Implement conjunctive full text search by @acquamarin in #4605
- Remove unnecessary lock in isVisibleNoLock check by @ray6080 in #4623
- Fix cost model for Extend by @ray6080 in #4429
- Add call function for bm info by @ray6080 in #4622
- Add exception when trying to load from a directory (or a file name with extension) by @SterlingT3485 in #4614
- Add setting `enable_plan_optimizer` by @royi-luo in #4619
- Implementing Parallel WCC (#4604) by @andyfengHKU in #4621
- Rename call function to simple table function by @andyfengHKU in #4626
- Refactor table bind func input by @andyfengHKU in #4627
- Add optimizer pass to repopulate cardinalities + combine cardinalities in Logical Plan/Operator by @royi-luo in #4606
- handle newline in front of profile/explain by @acquamarin in #4618
- Allow prepared statemenet parameter in CALL function by @acquamarin in #4628
- Add graph projection by @andyfengHKU in #4630
- refactor table bind func by @andyfengHKU in #4635
- Fix incorrect set of sequence val after exporting database by @ray6080 in #4636
- Add Ice Berg Extension by @SterlingT3485 in #4600
- Remove unnecessary join in fts by @andyfengHKU in #4637
- Support attach relational database with schema by @acquamarin in #4639
- Fix list-contains binding by @acquamarin in #4644
- Add iceberg metadata alter name test by @SterlingT3485 in #4645
- Trim Unnecessary Quote for CLI JSON output by @SterlingT3485 in #4643
- Implement `get_keys` function in fts by @acquamarin in #4647
- Remove offset from table func by @andyfengHKU in #4648
- Make iceberg test output to be stable by @SterlingT3485 in #4654
- Add include for `std::optional` in extension by @acquamarin in #4653
- Remove unnecessary sink in copy from subquery by @andyfengHKU in #4650
- Update PRODUCTION_RELEASES for extension by @mewim in #4659
- Add null checks to zone map by @royi-luo in #4642
- remove graph entry from fts input by @andyfengHKU in #4660
- Rollback hash index checkpoint by @royi-luo in #4559
- Fixed cli handling of escape sequences after autocompletion by @andyfengHKU in #4595
- Added type names to cli auto complete by @MSebanc in #4591
- Improve subquery planning by @andyfengHKU in #4651
- refactor table function constructor by @andyfengHKU in #4664
- Refactor scalar function constructor by @andyfengHKU in #4666
- Hash join flatten fix by @andyfengHKU in #4668
- Remove encoded join and enumeration flags from end-to-end testing framework by @ray6080 in #4669
- Skip query result from rewritten queries by @ray6080 in #4633
- Add more complete order by key types check by @ray6080 in #4671
- Refactor table func interface with const params by @ray6080 in #4670
- Add more clang tidy checks by @ray6080 in #4672
- Support IGNORE_ERRORS when scanning from pyarrow/pandas by @royi-luo in #4646
- Rename reader conf...
v0.7.1
We are excited to announce the release of two new extensions: Delta Lake and Iceberg. The Delta extension allows seamless scanning and copying from Delta Lake tables, while the Iceberg extension provides the same functionality for Apache Iceberg tables.
In addition to these new extensions, this release introduces several bug fixes and new features, including:
- The ability to attach to a specific schema in a relational database.
- Support for `ADD`/`DROP PROPERTY IF [NOT] EXISTS` commands.
- A new `list_has_all` function for enhanced list operations.
- Experimental support for Android armv8a platform.
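As a rough illustration of what a function named `list_has_all` is expected to compute, the sketch below models it in plain Python as "every element of the second list appears in the first". This is an assumption about the semantics rather than Kùzu's code, and it ignores details such as NULL handling and type casting.

```python
def list_has_all(values, required):
    """Return True when every element of `required` also appears in `values`.

    Pure-Python model for illustration only; not the Kùzu implementation.
    """
    return all(item in values for item in required)


contains_subset = list_has_all([1, 2, 3], [1, 3])      # both 1 and 3 are present
missing_element = list_has_all([1, 2], [1, 4])         # 4 is absent
empty_requirement = list_has_all([1, 2], [])           # nothing is required
```

In this model an empty second list is trivially satisfied, which follows from how `all` treats an empty sequence.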
Hope you enjoy the new release!
What's Changed
- Trim Unnecessary Quote for CLI JSON output #4643
- Fix list-contains binding #4644
- Support attach relational database with schema #4639
- Add Ice Berg Extension #4600
- Fix incorrect set of sequence val after exporting database #4636
- Add e-notation double #4616
- Implement FORMAT option in LOAD FROM clause. #4613
- Add inturrupt to path writer #4609
- Implement drop/add property with if exists #4598
- Delta extension #4587
- Fix flat select bug #4590
- Gds node predicate push down #4461
- Fix rollback during Node Table COPY #4467
- Fix the type cast in nested struct #4560
- Rel scan selection optimizations #4558
- Using shared_mutex instead of mutex in CatalogSet #4533
- Fix VersionInfo SelectionVector creation #4556
- Avoid importing polars in arrow scan #4551
- Added Unicode \u and \U parsing to the cli #4492
- Implement list_has_all #4546
Full Changelog: v0.7.0...0.7.1