-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-5449 - Do not attach invalid document in exception message #2539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
bson/errors.py
Outdated
|
||
@property | ||
def document(self) -> Any: | ||
"""The invalid document that caused the error.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a versionadded and changelog reference.
bson/errors.py
Outdated
return self._document | ||
|
||
@document.setter | ||
def document(self, value: Any) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see this being a useful behavior for users, so making this public seems reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I was suggesting we make it private, not public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning for making it private? Security concerns if users misuse it? I find having the full document very useful for debugging and I imagine users do too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the getter should be public but when would we expect a user to set the document? We try to be conservative with public apis and only add ones that have a strong justification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh whoops sorry I misread which method this was about. Totally agree, fixing.
goto cleanup; | ||
} | ||
PyObject *new_msg = PyUnicode_FromFormat("Invalid document %s | %s", dict_str_utf8, msg_utf8); | ||
PyObject *new_msg = PyUnicode_FromFormat("Invalid document: %s", msg_utf8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For an invalid field in a nested doc, does this yield errors like "Invalid document: Invalid document: Invalid document: Invalid document: Invalid document: ...."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it works as expected:
encode({"t": {"f": Wrapper(1)}})
# Produces
Invalid document: cannot encode object: 1, of type: <class 'test.test_bson.TestBSON.test_doc_in_invalid_document_error_as_property.<locals>.Wrapper'>
elements.append(_element_to_bson(key, value, check_keys, opts)) | ||
except InvalidDocument as err: | ||
raise InvalidDocument(f"Invalid document {doc} | {err}") from err | ||
raise InvalidDocument(f"Invalid document: {err}", doc) from err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this InvalidDocument except clause include the encoding of _id
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a pre-existing bug, no? Happy to fix if we identify why the code currently looks like this. Maybe we assume that _id
will be valid if it reaches this far?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct it would be a pre-existing bug. Could you open a ticket for it?
>>>> encode({'_id': set()})
Traceback (most recent call last):
File "<python-input-6>", line 1, in <module>
encode({'_id': set()})
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 1053, in encode
return _dict_to_bson(document, check_keys, codec_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 1006, in _dict_to_bson
elements.append(_name_value_to_bson(b"_id\x00", doc["_id"], check_keys, opts))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 980, in _name_value_to_bson
raise InvalidDocument(f"cannot encode object: {value!r}, of type: {type(value)!r}")
bson.errors.InvalidDocument: cannot encode object: set(), of type: <class 'set'>
vs the expected behavior:
>>>> encode({'_id': 1, 'foo': set()})
Traceback (most recent call last):
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 1010, in _dict_to_bson
elements.append(_element_to_bson(key, value, check_keys, opts))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 994, in _element_to_bson
return _name_value_to_bson(name, value, check_keys, opts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 980, in _name_value_to_bson
raise InvalidDocument(f"cannot encode object: {value!r}, of type: {type(value)!r}")
bson.errors.InvalidDocument: cannot encode object: set(), of type: <class 'set'>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<python-input-7>", line 1, in <module>
encode({'_id': 1, 'foo': set()})
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 1053, in encode
return _dict_to_bson(document, check_keys, codec_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/shane/git/mongo-python-driver/bson/__init__.py", line 1012, in _dict_to_bson
raise InvalidDocument(f"Invalid document {doc} | {err}") from err
bson.errors.InvalidDocument: Invalid document {'_id': 1, 'foo': set()} | cannot encode object: set(), of type: <class 'set'>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened PYTHON-5560 to track.
bson/_cbsonmodule.c
Outdated
goto cleanup; | ||
} | ||
// Add doc to the error class as a property. | ||
PyObject_SetAttrString(InvalidDocument, "document", dict); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be "_document" now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also highlights that this PR needs to add a test to ensure .document
is the expected dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly the tests fail if this is _document
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the two modified tests in this PR not sufficient to ensure that .document
behaves as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is setting a class attribute on the InvalidDocument class so the logic is incorrect, eg:
InvalidDocument.document = dict
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I misunderstood what you meant. You're right that _error
returns a class, not an instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Had to do some C API delving to figure this out 😅
} | ||
PyObject *new_msg = PyUnicode_FromFormat("Invalid document: %s", msg_utf8); | ||
// Add doc to the error instance as a property. | ||
PyObject *new_evalue = PyObject_CallFunctionObjArgs(InvalidDocument, new_msg, dict, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to check the new_msg == NULL
case before we call PyObject_CallFunctionObjArgs here.
This PR removes the full document from the error message and stores it in the
document
property on the error instead.