UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 1: invalid start byte #19

Open
pedroj84 opened this issue Dec 20, 2018 · 5 comments

@pedroj84

When you run the script with the following parameters:
python aws_inventory.py --profile my-profile-name --region my-region --debug -v

it breaks at the end with the following error, without writing the JSON file:

DEBUG:aws_inventory.store:Building the response store.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/pedroj84/aws-inventory/aws_inventory/invoker.py", line 99, in _probe_services
    self.write_results()
  File "/home/pedroj84/aws-inventory/aws_inventory/invoker.py", line 124, in write_results
    print self.store.get_response_store()
  File "/home/pedroj84/aws-inventory/aws_inventory/store.py", line 83, in get_response_store
    return json.dumps(self._response_store, cls=ResponseEncoder)
  File "/usr/lib/python2.7/json/__init__.py", line 251, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 1: invalid start byte
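
The last line of the traceback can be reproduced in isolation. Below is a minimal sketch, assuming Python 3 (the traceback above comes from Python 2.7, which reports the codec as 'utf8' rather than 'utf-8'). The byte string is made up for illustration and is not taken from an actual AWS response.

```python
# Hypothetical input: an ASCII byte followed by 0x82. In UTF-8, 0x82 is a
# continuation byte, so it cannot begin a character.
raw = b"a\x82"

error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the offending byte; exc.reason explains why.
    error = exc
    print(exc)
```

This only shows what kind of data produces the message; it does not identify which stored value caused it in the run above.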

@ncc-erik-steringer
Collaborator

Cleaning out issues. Based on the details given, I can't figure out exactly where the issue is. Were you able to pin this down further? If so, could you share what you found?

@jjpestacio

@ncc-erik-steringer I encountered the same issue. After some investigation, I learned that AWS sends some responses containing non-ASCII characters. Both the response and exception stores are pickled into bytes. JSON cannot load the stores after unpickling because the strings contain non-ASCII characters.

To fix this, all values in the response and exception stores must be UTF-8 encoded before pickling (or encoded after pickling and before JSON loading).
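
Before converting everything, it may help to find which stored values are affected. The following is a rough diagnostic sketch, assuming Python 3 and assuming the store is a nesting of dicts and lists. `find_undecodable` and the sample layout are hypothetical and not part of aws-inventory.

```python
def find_undecodable(value, path=()):
    """Yield (path, raw_bytes) for each byte string that is not valid UTF-8.

    Hypothetical helper: it checks values only, not dictionary keys.
    """
    if isinstance(value, dict):
        for key, item in value.items():
            yield from find_undecodable(item, path + (key,))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_undecodable(item, path + (index,))
    elif isinstance(value, bytes):
        try:
            value.decode("utf-8")
        except UnicodeDecodeError:
            yield path, value


# Made-up store layout, for illustration only.
store = {"svc": {"region": {"op": {"Items": [b"ok", b"a\x82"]}}}}
bad = list(find_undecodable(store))
print(bad)
```

Each reported path shows where in the nested structure the offending bytes sit, which could help narrow down the responsible API call.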

@ncc-erik-steringer
Collaborator

@jjpestacio, do you happen to know which responses from AWS do this? Also, have you checked whether the develop branch (Python 3 only) has the same behavior? https://github.com/nccgroup/aws-inventory/tree/develop

@jjpestacio

I'm not sure exactly which responses contain non-ASCII characters, but they were mostly language-related, e.g. "Sign-up using Español." I haven't verified whether the problem persists in the Python 3 version, but I think it will: regardless of Python 2 or Python 3, we're using the same botocore version, so the responses will be the same.
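
For reference, here is a small sketch of how the standard `json` module treats text and bytes under Python 3. It says nothing about whether the develop branch is actually affected. It only shows that a `str` with non-ASCII characters serializes cleanly, while a `bytes` value is rejected with a `TypeError` rather than being decoded implicitly.

```python
import json

# A str containing non-ASCII characters serializes without error; by default
# the non-ASCII characters are written as \uXXXX escapes.
text_result = json.dumps({"label": "Sign-up using Español."})

# A bytes value is not decoded implicitly; json.dumps raises TypeError.
bytes_error = None
try:
    json.dumps({"label": "Español".encode("utf-8")})
except TypeError as exc:
    bytes_error = exc

print(text_result)
print(bytes_error)
```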

@jjpestacio

jjpestacio commented Sep 30, 2020

Also, FWIW, here's the quick workaround I'm using:

aws_inventory/store.py

diff --git a/aws_inventory/store.py b/aws_inventory/store.py

+
+def convert(input):
+    if isinstance(input, dict):
+        return {convert(key): convert(value) for key, value in input.iteritems()}
+    elif isinstance(input, list):
+        return [convert(element) for element in input]
+    elif isinstance(input, unicode):
+        return input.encode('utf-8')
+    else:
+        return input
+
+

@@ -80,7 +90,7 @@ class ResultStore(object):
         :return: serialized response store in JSON format
         """
         LOGGER.debug('Building the response store.')
-        return json.dumps(self._response_store, cls=ResponseEncoder)
+        return json.dumps(convert(self._response_store), cls=ResponseEncoder)

     def dump_response_store(self, fp):
         """Pickle the response store.
@@ -88,7 +98,7 @@ class ResultStore(object):
         :param file fp: file to write to
         """
         LOGGER.debug('Writing the response store to file "%s".', fp.name)
-        pickle.dump(self._response_store, fp)
+        pickle.dump(convert(self._response_store), fp)

     def dump_exception_store(self, fp):
         """Pickle the exception store.
@@ -96,13 +106,15 @@ class ResultStore(object):
         :param file fp: file to write to
         """
         LOGGER.debug('Writing the exception store to file "%s".', fp.name)
-        pickle.dump(self._exception_store, fp)
+        pickle.dump(convert(self._exception_store), fp)
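
The patch above is Python 2 specific (`iteritems`, `unicode`). A rough Python 3 counterpart might look like the sketch below. It is an assumption, not code from the repository: under Python 3 the conversion goes the other way (bytes to `str`), `to_text` is a hypothetical name, and `errors="replace"` is one possible policy for bytes that are not valid UTF-8.

```python
import json


def to_text(value, encoding="utf-8"):
    """Recursively decode bytes to str so that json.dumps accepts the result."""
    if isinstance(value, dict):
        return {to_text(k, encoding): to_text(v, encoding) for k, v in value.items()}
    if isinstance(value, list):
        return [to_text(item, encoding) for item in value]
    if isinstance(value, bytes):
        # errors="replace" substitutes U+FFFD for undecodable bytes
        # instead of raising UnicodeDecodeError.
        return value.decode(encoding, errors="replace")
    return value


# Made-up sample data, for illustration only.
sample = {b"Name": [b"Espa\xc3\xb1ol", b"a\x82", 3]}
cleaned = to_text(sample)
serialized = json.dumps(cleaned, ensure_ascii=False)
print(serialized)
```

Replacing undecodable bytes loses information, so `errors="strict"` with explicit error handling may be preferable if the raw values matter.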
