ARCRecordingProxy times out #116

@youssefeldakar

This issue was initially brought up by Mohamed Elsayed on the openwayback-dev group:

https://groups.google.com/forum/#!topic/openwayback-dev/Kv57MEzOAqw

What follows is quoted from the post above:

Running under either OpenJDK IcedTea6 1.12.6 or Oracle JDK 1.8.0-b132, requests through the ARCRecordingProxy (tested through the ARCUnwrappingProxy) return 'HTTP 504 Gateway Timeout' on the first attempt after a Tomcat restart, as recorded in liveweb/arcs/live-.arc.gz. After that, all subsequent requests stay at "connecting" for a very long time.

This was tested in Tomcat 6.0.35 on Debian 7.

This is the live web configuration (LiveWeb.xml):

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-2.5.xsd">


  <bean name="8099" class="org.archive.wayback.liveweb.ARCRecordingProxy">
    <property name="arcCacheDir">
      <bean class="org.archive.wayback.liveweb.ARCCacheDirectory"
        init-method="init">

        <property name="arcDir" value="${wayback.basedir}/liveweb/arcs/" />
        <property name="arcPrefix" value="live" />
       </bean>
    </property>
    <property name="cacher">
      <bean class="org.archive.wayback.liveweb.URLtoARCCacher">
        <property name="recorderCacheDir" value="${wayback.basedir}/liveweb/tmp/" />
        <property name="backingFileBase" value="recorder-tmp" />
        <property name="userAgent" value="ia_archiver(OS-Wayback)" />
        <property name="connectionTimeoutMS" value="10000" />
        <property name="socketTimeoutMS" value="10000" />
      </bean>
    </property>
  </bean>
  <bean name="8098" class="org.archive.wayback.liveweb.ARCUnwrappingProxy">
    <property name="proxyHostPort" value="localhost:3128" />
  </bean>


  <bean id="proxylivewebcache"
      class="org.archive.wayback.liveweb.RemoteLiveWebCache">
    <property name="proxyHostPort" value="localhost:8099" />
<!--
    If you've set up a local squid/varnish to cache requests to the above
    ARCRecordingProxy, you should use the port for that, instead of 8099:
    <property name="proxyHostPort" value="localhost:3128" />
-->
  </bean>
  <bean id="excluder-factory-robot" class="org.archive.wayback.accesscontrol.robotstxt.RobotExclusionFilterFactory">
    <property name="maxCacheMS" value="86400000" />
    <property name="userAgent" value="ia_archiver" />
    <property name="webCache" ref="proxylivewebcache" />
  </bean>

</beans>

And this is the exclusion filter configuration (from wayback.xml):

  <import resource="LiveWeb.xml"/>
  <bean id="excluder-factory-robot" class="org.archive.wayback.accesscontrol.robotstxt.RobotExclusionFilterFactory">
    <property name="maxCacheMS" value="86400000" />
    <property name="userAgent" value="ia_archiver" />
    <property name="webCache" ref="proxylivewebcache" />
  </bean>


<!--
  The 'excluder-factory-static' bean defines an exclusionFactory object which
  consults a local text file containing either URLs or SURTs of content to
  block from the ResourceIndex. These URLs or SURTs are treated as prefixes:
     "http://www.archive.org/ima" will block anything starting with that string
     from being returned from the index.
-->
<!--
  <bean id="excluder-factory-static" class="org.archive.wayback.accesscontrol.staticmap.StaticMapExclusionFilterFactory">
    <property name="file" value="/var/tmp/os-cdx/exclusion-2008-09-22-cleaned.txt" />
    <property name="checkInterval" value="600000" />
  </bean>
-->

<!--
  The 'excluder-factory-composite' bean creates a single exclusionFactory
  which restricts from both a static list of URLs, and also by live web
  robots.txt documents.
-->
<!--
  <bean id="excluder-factory-composite" class="org.archive.wayback.accesscontrol.CompositeExclusionFilterFactory">
    <property name="factories">
      <list>
        <ref bean="excluder-factory-static" />
        <ref bean="excluder-factory-robot" />
      </list>
    </property>
  </bean>
-->
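
To check the behaviour outside a browser, a small client can send one request through the recording proxy and print the status code it gets back. This is only a rough sketch, not part of the original report: the class name `ProxyProbe`, the target URL `http://example.com/`, and the 30-second client timeout are assumptions made for illustration. The proxy address `localhost:8099` is taken from the `proxyHostPort` value in LiveWeb.xml above. The client timeout is set longer than the 10000 ms configured on `URLtoARCCacher`, so that a 504 from the proxy, if one is sent, should arrive before the client itself gives up.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

// Hypothetical helper for reproducing the timeout; not part of OpenWayback.
final class ProxyProbe {

    // Turns a "host:port" string (e.g. "localhost:8099") into an HTTP proxy descriptor.
    static Proxy parseProxy(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        String host = hostPort.substring(0, colon);
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Sends one GET for `target` through the proxy and returns the HTTP status code.
    // Throws IOException (e.g. SocketTimeoutException) if the proxy never answers.
    static int probe(String hostPort, String target, int timeoutMs) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(target).openConnection(parseProxy(hostPort));
        conn.setConnectTimeout(timeoutMs);
        conn.setReadTimeout(timeoutMs);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: proxy from LiveWeb.xml, an arbitrary target URL.
        String proxy = args.length > 0 ? args[0] : "localhost:8099";
        String target = args.length > 1 ? args[1] : "http://example.com/";
        long start = System.currentTimeMillis();
        int status = probe(proxy, target, 30000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("status=" + status + " elapsedMs=" + elapsed);
    }
}
```

Running it once right after a Tomcat restart and then a second time should show whether the first call returns 504 and the second one stalls, as described above. Passing `localhost:8098` as the first argument would send the request through the ARCUnwrappingProxy instead.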
