How to get the latest tweet ID using the python-twitter search API


I'm trying to find a way to avoid getting the same tweets twice when using the search API. This is what I'm doing:

  1. Make a request to Twitter
  2. Store the tweets
  3. Make another request to Twitter
  4. Store the tweets
  5. Compare the results of steps 2 and 4

Ideally the result of step 5 is 0, meaning no overlapping tweets were received, so I'm not asking the Twitter server for the same information more than once.

But I think I got stuck at step 3, where I have to make the second call. I'm trying to use the 'since_id' argument to get only the tweets after a certain point, but I'm not sure whether the value I'm using is correct.

Code:

import twitter


class test():

    def __init__(self):
        self.t_auth()
        self.hashtag = ['justinbieber']

        self.tweets_1 = []
        self.ids_1 = []
        self.created_at_1 = []
        self.tweet_text_1 = []
        self.last_id_1 = ''
        self.page_1 = 1

        self.tweets_2 = []
        self.ids_2 = []
        self.created_at_2 = []
        self.tweet_text_2 = []
        self.last_id_2 = ''
        self.page_2 = 1

        # First request: 15 pages of up to 100 results each
        for i in range(1, 16):
            self.tweets_1.extend(self.api.GetSearch(self.hashtag, per_page=100, since_id=self.last_id_1, page=self.page_1))
            self.page_1 += 1
        print len(self.tweets_1)
        for t in self.tweets_1:
            self.ids_1.insert(0, t.id)
            self.created_at_1.insert(0, t.created_at)
            self.tweet_text_1.insert(0, t.text)
            self.last_id_1 = t.id

        # Second request: start from the last id seen in the first one
        self.last_id_2 = self.last_id_1

        for i in range(1, 16):
            self.tweets_2.extend(self.api.GetSearch(self.hashtag, per_page=100, since_id=self.last_id_2, page=self.page_2))
            self.page_2 += 1
        print len(self.tweets_2)
        for t in self.tweets_2:
            self.ids_2.insert(0, t.id)
            self.created_at_2.insert(0, t.created_at)
            self.tweet_text_2.insert(0, t.text)
            self.last_id_2 = t.id

        print 'total number of tweets in test 1: ', len(self.tweets_1)
        print 'last id of test 1: ', self.last_id_1

        print 'total number of tweets in test 2: ', len(self.tweets_2)
        print 'last id of test 2: ', self.last_id_2

        print '##################################'
        print '#############overlaping###########'

        # Compare the two result sets
        ids_overlap = set(self.ids_1).intersection(self.ids_2)
        tweets_text_overlap = set(self.tweet_text_1).intersection(self.tweet_text_2)
        created_at_overlap = set(self.created_at_1).intersection(self.created_at_2)

        print 'ids: ', len(ids_overlap)
        print 'text: ', len(tweets_text_overlap)
        print 'created_at: ', len(created_at_overlap)

        print ids_overlap
        print tweets_text_overlap
        print created_at_overlap

    def t_auth(self):
        consumer_key = "xxx"
        consumer_secret = "xxx"
        access_key = "xxx"
        access_secret = "xxx"

        self.api = twitter.Api(consumer_key, consumer_secret, access_key, access_secret)
        self.api.VerifyCredentials()

        return self.api


if __name__ == "__main__":
    test()

In addition to 'since_id', you can use 'max_id'. From the Twitter API documentation:

Iterating in a result set: parameters such as count, until, since_id, and max_id allow you to control how you iterate through search results, since it could be a large set of tweets.

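By setting these values dynamically, you can restrict your search results so that they do not overlap. For example, with max_id set at 1100 and since_id set at 1000, you will get only the tweets with IDs between those two values (greater than 1000, and up to and including 1100).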
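As a rough illustration of that idea, here is a minimal sketch that runs without network access. The names Tweet, make_fake_search and fetch_window are hypothetical, and the fake search only imitates the documented behaviour (newest first, since_id exclusive, max_id inclusive); it is not the python-twitter API. With the real library you would presumably pass the same since_id and max_id values to your search call, but check the signature of the version you have installed.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical minimal tweet record, used only for this illustration.
    id: int
    text: str


def make_fake_search(all_tweets):
    """Return a stand-in for a real search call (an assumption, not a library API).

    It returns results newest first and treats since_id as exclusive and
    max_id as inclusive.
    """
    def search(since_id=None, max_id=None, count=100):
        hits = [
            t for t in all_tweets
            if (since_id is None or t.id > since_id)
            and (max_id is None or t.id <= max_id)
        ]
        hits.sort(key=lambda t: t.id, reverse=True)
        return hits[:count]
    return search


def fetch_window(search, since_id=None, count=100):
    """Collect every tweet newer than since_id, paging backwards with max_id."""
    collected = []
    max_id = None
    while True:
        page = search(since_id=since_id, max_id=max_id, count=count)
        if not page:
            break
        collected.extend(page)
        # max_id is inclusive, so step just below the oldest ID on this page.
        max_id = min(t.id for t in page) - 1
    return collected


# First request: 11 tweets exist, with IDs 1000..1010.
timeline = [Tweet(i, "tweet %d" % i) for i in range(1000, 1011)]
search = make_fake_search(timeline)
first = fetch_window(search, count=4)

# Remember the NEWEST ID seen (the maximum), not the last one iterated over.
newest_seen = max(t.id for t in first)

# Five new tweets arrive (IDs 1011..1015), then the second request is made.
timeline.extend(Tweet(i, "tweet %d" % i) for i in range(1011, 1016))
second = fetch_window(search, since_id=newest_seen, count=4)

overlap = {t.id for t in first} & {t.id for t in second}
print(len(first), len(second), len(overlap))
```

The point of the sketch is the choice of since_id: results come back newest first, so the ID of the last tweet you iterate over is the oldest one, and using it as since_id asks for almost everything again. Taking the maximum ID from the first batch keeps the second batch disjoint from it.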

