@geruh commented Jun 10, 2025

Which issue does this PR close?

Fixes two bugs in the GlueCatalog's table creation that made it inconsistent with the Iceberg core and PyIceberg implementations:

  • StorageDescriptor.Location was incorrectly set to the metadata file path instead of the table base location
  • The iceberg.field.optional parameter logic was inverted: it was set to field.required instead of !field.required

Iceberg ref:

What changes are included in this PR?

Location

Before:

{
  "StorageDescriptor": {
    "Location": "s3://bucket/table/metadata/00000-uuid.metadata.json"
  },
  "Parameters": {
    "metadata_location": "s3://bucket/table/metadata/00000-uuid.metadata.json"
  }
}

After:

{
  "StorageDescriptor": {
    "Location": "s3://bucket/table"
  },
  "Parameters": {
    "metadata_location": "s3://bucket/table/metadata/00000-uuid.metadata.json"
  }
}
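
As a rough illustration of the location fix, the sketch below keeps the table base location and the metadata file path as two separate inputs. `GlueTableSketch` and `build_glue_table` are hypothetical names made up for this example; they are not the AWS SDK types or the actual iceberg-rust code.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a Glue table input (not the AWS SDK type).
#[derive(Debug)]
struct GlueTableSketch {
    /// Corresponds to StorageDescriptor.Location.
    storage_location: String,
    /// Corresponds to the table-level Parameters map.
    parameters: HashMap<String, String>,
}

/// Builds the sketch from the table base location and the metadata file path.
fn build_glue_table(table_location: &str, metadata_location: &str) -> GlueTableSketch {
    let mut parameters = HashMap::new();
    // The metadata file path belongs only in the "metadata_location" parameter.
    parameters.insert(
        "metadata_location".to_string(),
        metadata_location.to_string(),
    );

    GlueTableSketch {
        // The storage descriptor points at the table base location;
        // the bug stored the metadata file path here instead.
        storage_location: table_location.to_string(),
        parameters,
    }
}
```

Calling `build_glue_table("s3://bucket/table", "s3://bucket/table/metadata/00000-uuid.metadata.json")` yields the "After" shape: the base location in the storage descriptor and the metadata path only under `metadata_location`.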

Schema (with required field)

Rust Schema

 NestedField::required(1, "foo", Type::Primitive(PrimitiveType::String)).into(),

Before:

{
    "Name": "foo",
    "Type": "string",
    "Parameters": {
        "iceberg.field.current": "true",
        "iceberg.field.id": "2",
        "iceberg.field.optional": "true" <---
    }
}

After:

{
    "Name": "foo",
    "Type": "string",
    "Parameters": {
        "iceberg.field.current": "true",
        "iceberg.field.id": "2",
        "iceberg.field.optional": "false" <---
    }
}
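
The inverted flag can be sketched as follows. This is a minimal, standalone approximation: `FieldSketch` and `column_parameters` are hypothetical names for illustration, and the real code works on iceberg-rust's `NestedField` rather than this struct.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an Iceberg nested field, reduced to what is needed here.
struct FieldSketch {
    id: i32,
    required: bool,
}

/// Builds the Glue column parameters for a single field.
fn column_parameters(field: &FieldSketch, is_current: bool) -> HashMap<String, String> {
    let mut params = HashMap::new();
    params.insert("iceberg.field.id".to_string(), field.id.to_string());
    // "optional" is the negation of "required"; the bug wrote `field.required` here.
    params.insert(
        "iceberg.field.optional".to_string(),
        (!field.required).to_string(),
    );
    params.insert("iceberg.field.current".to_string(), is_current.to_string());
    params
}
```

For a required field, `iceberg.field.optional` comes out as `"false"`; for an optional field it comes out as `"true"`.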

Are these changes tested?

  • Repaired the existing test
  • Tested manually with GlueCatalog.

@Xuanwo left a comment

Thank you!

@Xuanwo merged commit df07537 into apache:main Jun 10, 2025
17 checks passed