Encoding issues with many popular sites #434

@pirate

Describe the bug

When running wayback in either --proxy or --proxy-record mode, many sites display gibberish, possibly because an incorrect encoding is being forced somewhere.

Working sites:

Broken sites:

I'm not sure what makes some sites work and not others.
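To make the symptom concrete, here is a minimal sketch of one way this kind of gibberish can arise. It assumes, without confirmation from this report, that a compressed response body is being decoded as text without first being decompressed. The helper `replacement_ratio` and the sample HTML are hypothetical and are not part of pywb; gzip merely stands in for whatever content encoding a site might use.

```python
import gzip

REPLACEMENT = "\ufffd"  # U+FFFD, the character browsers render as �


def replacement_ratio(body: bytes, charset: str = "utf-8") -> float:
    """Return the fraction of characters that become U+FFFD when body is decoded as text."""
    text = body.decode(charset, errors="replace")
    return text.count(REPLACEMENT) / len(text) if text else 0.0


# Hypothetical page body, used only for illustration.
html = ("<html><body>" + "Hello, readable page. " * 50 + "</body></html>").encode("utf-8")
compressed = gzip.compress(html)

# A plain UTF-8 body decodes cleanly.
print(f"plain body:      {replacement_ratio(html):.2f}")
# Compressed bytes treated as UTF-8 text contain invalid sequences.
print(f"compressed body: {replacement_ratio(compressed):.2f}")
# Decompressing first restores readable text.
print(f"after gunzip:    {replacement_ratio(gzip.decompress(compressed)):.2f}")
```

If a recorded response shows a nonzero ratio only before decompression, that would point at content-encoding handling rather than the character set, though this sketch does not establish that this is what happens here.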

Steps to reproduce the bug

  1. Use Python 3.7.2 and either pywb v2.1.0 or pywb-2.2.0.dev0 (latest develop branch)
  2. Using an empty config.yaml, run:
wb-manager init demo
wayback --proxy-record --proxy demo
  3. Open a page in a browser:
google-chrome --proxy-server=http://localhost:8080 --ignore-certificate-errors --disable-web-security https://cloudflare.com

Expected behavior

A readable site is recorded and replayed instead of ��������

Screenshots

image

Environment

  • OS: macOS 10.14
  • Browser: Google Chrome 72.0.3626.64 beta
  • Version: Python 3.7.2 and pywb v2.1.0
