Producing PDF and ZIM locally

From EJP Documentation

https://github.com/ajithhub/education-justice-project


At the moment there is a problem using PediaPress to produce the ZIM and PDF files.


For now I am experimenting with producing them locally:

  virtualenv-2.6 mwlibenv
  source mwlibenv/bin/activate
  pip install -i http://pypi.pediapress.com/simple/ mwlib
  pip install -i http://pypi.pediapress.com/simple/ pil
  pip install -i http://pypi.pediapress.com/simple/ mwlib.rl
  pip install -i http://pypi.pediapress.com/simple/ mwlib.zim
  # this one is for table of contents preparation
  sudo yum install pdftk
  # this one is to silence the warnings about RTL
  pip install -i http://pypi.pediapress.com/simple/ pyfribidi
  mw-zip -c http://ejpdocs.cuforrent.com/ -o test2.zip \
    "Prepare Bootable EJP Recovery USB Drive" \
    "Entering the BIOS Setup" \
    "Configure HP Integrated Lights Out" \
    "Security Considerations" \
    "Windows Multipoint Installation" \
    "Change Unidentified Networks to Private" \
    "VSpace Server Installation"


  mw-render -c test2.zip -o test2.2.pdf -w rl
  mw-render -c test2.zip -o test2.2.zim -w zim
  cp *.{pdf,zim} /u/aantony/public_html/wikibooks/
  chmod a+r /u/aantony/public_html/wikibooks/*
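
The three build commands above can be generated from a single article list, which helps when the list changes often. The following is only a sketch, assuming Python 3 is available outside the Python 2.6 virtualenv; `build_commands` is a hypothetical helper name, and it uses only the `-c`, `-o` and `-w` options already shown above.

```python
import shlex


def build_commands(wiki_url, articles, basename):
    """Return the mw-zip and mw-render command lines as argument lists."""
    zip_path = basename + ".zip"
    # mw-zip fetches the named articles from the wiki into a zip collection.
    commands = [["mw-zip", "-c", wiki_url, "-o", zip_path] + list(articles)]
    # One mw-render call per output format: "rl" writes PDF, "zim" writes ZIM.
    for writer, extension in (("rl", "pdf"), ("zim", "zim")):
        output = basename + "." + extension
        commands.append(["mw-render", "-c", zip_path, "-o", output, "-w", writer])
    return commands


if __name__ == "__main__":
    cmds = build_commands(
        "http://ejpdocs.cuforrent.com/",
        ["Entering the BIOS Setup", "Security Considerations"],
        "test2",
    )
    for cmd in cmds:
        # Print a copy-pasteable line instead of running anything.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the commands, so they can be pasted into a shell where the mwlibenv virtualenv is active.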

The problem was query string length

The default Dreamhost limit on the length of a GET value was set to 512, which was too short for the MediaWiki API calls that fetch image metadata; those query strings end up being pretty large. I managed to override the value, so we can use PediaPress again.

 $ cat - > ~/.php/5.3/phprc
 [suhosin]
 ; 3/10/2013 override get max for mediawiki api
 suhosin.get.max_value_length = 10000
 ; or just put suhosin in simulation mode (violations are logged, not blocked)
 suhosin.simulation = On


 $ killall php53.cgi
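
To check whether a particular API request would have hit the 512 limit, one can measure its query values directly. This is a rough sketch in Python 3; `oversized_params` is a hypothetical helper, the example URL is made up, and lengths are measured after URL-decoding, which may not match exactly how Suhosin counts.

```python
from urllib.parse import parse_qsl, urlsplit

# The limit that was in effect before the override (see above).
OLD_LIMIT = 512


def oversized_params(url, limit=OLD_LIMIT):
    """Return {name: length} for query values longer than `limit` characters."""
    query = urlsplit(url).query
    too_long = {}
    for name, value in parse_qsl(query, keep_blank_values=True):
        if len(value) > limit:
            too_long[name] = len(value)
    return too_long


if __name__ == "__main__":
    # Hypothetical request: many image titles joined with "|" in one parameter.
    titles = "|".join("File:Image%d.png" % i for i in range(60))
    url = "http://example.org/w/api.php?action=query&titles=" + titles
    print(oversized_params(url))
```

Running the same check with `limit=10000` shows whether the new setting leaves enough headroom.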


Throttle requests

This wiki is presently hosted on a Dreamhost shared account, which is not super fast. By default the image download pool size is 10, but we need to turn it down to 1 to be more reliable:


 ./mwlibenv/lib/python2.6/site-packages/mwlib/net/fetch.py
 < self.image_download_pool = gevent.pool.Pool(10)
 > self.image_download_pool = gevent.pool.Pool(1)
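
Because a hand edit inside site-packages is lost whenever mwlib is reinstalled, the one-line change could be scripted. A minimal sketch in Python 3, assuming the assignment appears exactly once in fetch.py as quoted above; `set_image_pool_size` is a hypothetical helper, not part of mwlib.

```python
import re

# Matches the pool assignment quoted above, whatever size it currently has.
POOL_LINE = re.compile(
    r"(self\.image_download_pool\s*=\s*gevent\.pool\.Pool\()\d+(\))"
)


def set_image_pool_size(source, size=1):
    """Return `source` with the image download pool size replaced by `size`."""
    patched, count = POOL_LINE.subn(
        lambda m: m.group(1) + str(size) + m.group(2), source
    )
    if count != 1:
        # Refuse to guess if the line is missing or appears more than once.
        raise ValueError("expected exactly one pool assignment, found %d" % count)
    return patched


if __name__ == "__main__":
    before = "        self.image_download_pool = gevent.pool.Pool(10)\n"
    print(set_image_pool_size(before), end="")
```

To apply it, read fetch.py into a string, pass it through `set_image_pool_size`, and write the result back.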


Also, for some reason, lots of requests get a 500 status. It appears that merely retrying once is sufficient to fix them.

 ./mwlibenv/lib/python2.6/site-packages/mwlib/net/sapi.py
 def _fetch(self, url):
     try:
         f = self.opener.open(url)
     except Exception:
         print "ERROR in %s" % url
         print "Try again...."
         try:
             # a single retry is usually enough to get past the 500
             f = self.opener.open(url)
         except Exception:
             print "ERROR twice in %s" % url
             # bare raise keeps the original traceback
             raise
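
The same retry idea can be written as a small standalone helper, which avoids nesting one try block per retry. This is only a sketch in Python 3, not the patch applied to sapi.py; `open_with_retry` and its `retries` parameter are hypothetical names, and `open_url` stands for any callable such as `self.opener.open`.

```python
def open_with_retry(open_url, url, retries=1):
    """Call open_url(url), retrying up to `retries` more times on any exception."""
    attempt = 0
    while True:
        try:
            return open_url(url)
        except Exception:
            if attempt >= retries:
                # Out of retries: re-raise with the original traceback.
                raise
            attempt += 1
            print("ERROR in %s, trying again (%d/%d)" % (url, attempt, retries))


if __name__ == "__main__":
    calls = []

    def flaky(url):
        # Stand-in for a server that answers 500 on the first request only.
        calls.append(url)
        if len(calls) == 1:
            raise IOError("HTTP 500")
        return "response body"

    print(open_with_retry(flaky, "http://example.org/w/api.php"))
```

With `retries=1` this matches the behaviour described above: one failure is tolerated, a second one is raised to the caller.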