ruby on rails - Experience with web crawlers on Heroku


Does anyone have experience coding web crawlers with gems such as Anemone and deploying them to Heroku for their own personal use? Would such continuously running programs violate Heroku's ToS?

I don't have experience running web crawlers on Heroku (though I'd be interested in reading about it!), but here are some relevant points:

  1. Prohibited content. Illegal activity is prohibited (obviously), and since some sites expressly "prohibit" web crawlers and screen scrapers (IMDb, for example), crawling them could be considered illegal. Let's ignore that for now.

  2. Prohibited actions. Among other things, the following is prohibited:

    data mining any web property (including Heroku) to find email addresses or other user account information;

  3. There are usage limits:

    • Network bandwidth: 2 TB/month - soft
    • Shared DB processing: max 200 msec per second CPU time - soft
    • Dyno RAM usage: 512 MB - hard
    • Slug size: 200 MB - hard
    • Request length: 30 seconds - hard
  4. In the ToS, point 2.5, it's explained that:

    Repeated exceeding of hard or soft usage limits may lead to termination of your account.

Emphasis mine. Heroku gives each app 750 free dyno hours per month. As long as you don't abuse Heroku's services and don't use the crawler to gather personal info, I believe you're in the clear. I would suggest:

  1. Cap your web crawler somehow. Just as you would limit the rate of API requests, you should have the common courtesy to limit the speed of your crawler.

  2. Keep an eye on your dyno hours. You can check your usage from your Heroku dashboard.
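On the first point: Anemone itself accepts a `:delay` option (seconds to wait between requests), which is the simplest way to be polite. If you roll your own fetching loop instead, the same idea can be sketched as a tiny throttle class; this is a minimal illustration, not a production crawler, and the `RateLimiter` name is my own invention:

```ruby
# A minimal per-request throttle: successive calls to #throttle are spaced
# at least `interval` seconds apart. Anemone users can just pass
# :delay => n to Anemone.crawl instead; this shows the underlying idea.
class RateLimiter
  def initialize(interval)
    @interval = interval  # minimum seconds between requests
    @last = nil
  end

  # Wrap each HTTP fetch in limiter.throttle { ... }.
  def throttle
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    sleep(@interval - (now - @last)) if @last && now - @last < @interval
    @last = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  end
end

limiter = RateLimiter.new(0.5)  # at most ~2 requests per second
3.times { |i| limiter.throttle { puts "fetching page #{i}" } }
```

Besides keeping you inside Heroku's soft bandwidth limit, a delay like this (plus obeying robots.txt, which Anemone can also do via `:obey_robots_txt => true`) makes it much less likely that target sites block your crawler.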

