Since FriendFeed was bought by Facebook, it has been doomed, both for itself and for its users. Yes, there is still a team maintaining it and fixing issues, but it has become a zombie, I would say. I stayed until two months ago, when I removed all the services I had added and removed it from yjl.im. Fittingly for a zombie, it kept grabbing my stuff from a few sources for a few weeks afterwards. I didn't report it because I didn't care.
A few days ago, I decided to remove some entries. I had wanted to do that for a long time, ever since I saw many links from FriendFeed (to this blog) reported in Webmaster Tools. Of course I would have those links; I added my blog to FriendFeed myself. It's the same reason I often don't link to my blog posts in the screenshots I upload to Flickr: I feel I was spamming myself. Time to fix it.
I am not trying to remove all entries, only those that have never been commented on or liked. I also don't remove entries native to FriendFeed, for example, an entry you write or files or images you upload directly on FriendFeed. I want to keep everything original intact. Blog entries, YouTube favorites, Last.fm favorites, and so on will still be intact on their source websites after I remove them from my FriendFeed account, but likes and comments on FriendFeed are original content, so I keep the entries that have them.
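In other words, the marking rule boils down to a small predicate. Here is a sketch of the logic the script below applies (should_delete is just an illustrative name):

def should_delete(entry):
    # Entries created directly on FriendFeed have service id 'internal'; keep them.
    if entry['service']['id'] == 'internal':
        return False
    # Keep anything with comments or likes; those are FriendFeed originals too.
    return len(entry.get('comments', [])) + len(entry.get('likes', [])) == 0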
Unfortunately, there is an API rate limit on deletion (or on write operations in general). Of course I use the API; you don't expect me to delete 9,430 entries (out of 10,091 entries in my account) by mouse clicks, do you? I don't know the exact rate; it seems to be 100 requests per few hours, or per day, I am not sure. Conservatively speaking, it's a 100-day job. I emailed the API team for details about the rate limit, but I haven't gotten a response.
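That 100-day figure is just back-of-envelope arithmetic under the pessimistic reading of the limit, 100 requests per day:

to_delete = 9430
requests_per_day = 100  # assumed; the actual limit is undocumented
print '%d days' % (to_delete // requests_per_day + 1)  # 95 days; call it 100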
Since this is a one-time script, I'll just post the code here.
#!/usr/bin/env python
import datetime
import getpass
import re
import shelve
import sys
import time
from urllib2 import HTTPError

from friendfeed import FriendFeed

NUM = 100                # entries fetched per request
DELETION_INTERVAL = 30   # seconds between deletion requests
RE_TITLE = re.compile('(.*) - <a rel.*')


def print_eta(n, extra=0, est_only=False):
    # Silenced: without exact rate limit information, any estimate is meaningless.
    return
    eta = 3600 * n / 100 + extra
    est = datetime.datetime.now() + datetime.timedelta(seconds=eta)
    if est_only:
        print '[%s]' % est
    else:
        print 'Estimated time to complete: %d seconds, at %s' % (eta, est)


def main():
    ff = FriendFeed()
    nickname = raw_input('Your FriendFeed Nickname: ')
    # entries maps entry id -> (marked_for_deletion, already_deleted)
    data = shelve.open('%s.data' % nickname)
    if 'start' not in data:
        start = 0
    else:
        start = data['start']
    if start == -1:
        # Stage 2: retrieval finished on a previous run, delete marked entries.
        entries = data['entries']
        marked = len([True for v in entries.values() if v[0]])
        total = len(entries)
        del_queue = [entry for entry, value in entries.items()
                     if value[0] and not value[1]]
        print '%d out of %d entries marked for deletion.' % (marked, total)
        print '%d deleted, %d left to delete.' % (marked - len(del_queue),
                                                  len(del_queue))
        print
        if not del_queue:
            return
        print 'You can find your Remote Key at http://friendfeed.com/remotekey'
        print
        remote_key = getpass.getpass('Please enter your remote key [no echo]: ')
        ff = FriendFeed(nickname, remote_key)
        print
        print_eta(len(del_queue), extra=5)
        print 'Starting deletion (every %d seconds a request) in 5 seconds...' % DELETION_INTERVAL
        print
        time.sleep(5)
        del_count = 0
        try:
            while del_count < len(del_queue):
                e_id = del_queue[del_count]
                try:
                    result = ff._fetch('/api/entry/delete', {'entry': e_id})
                except HTTPError, e:
                    data['entries'] = entries
                    data.sync()
                    if e.code == 403 and 'limit-exceeded' in e.read():
                        print
                        print 'Failed to delete [%s], reached the rate limit.' % e_id
                        print_eta(len(del_queue) - del_count, extra=10*60)
                        print 'Sleeping for 10 minutes...'
                        time.sleep(10 * 60)
                        print
                        # Retry the same entry after backing off.
                        continue
                    raise
                if result['success']:
                    entries[e_id] = (True, True)
                else:
                    print
                    print 'Failed to delete [%s]: ' % e_id, result
                    print 'Continue, anyway.'
                sys.stdout.write('#')
                sys.stdout.flush()
                del_count += 1
                if del_count % 50 == 0 or del_count == len(del_queue):
                    sys.stdout.write(' %d \n' % del_count)
                    print_eta(len(del_queue) - del_count, extra=10*60)
                    data['entries'] = entries
                    data.sync()
                time.sleep(DELETION_INTERVAL)
        except Exception:
            # Persist progress before bailing out, so the next run can resume.
            data['entries'] = entries
            data.sync()
            raise
        print 'Done.'
    else:
        # Stage 1: retrieve entries and mark which ones to delete.
        entries = data.get('entries', {})
        while True:
            feed = ff.fetch_user_feed(nickname, start=start, num=NUM,
                                      maxcomments=1, maxlikes=1, hidden=1)
            ids = [entry['id'] for entry in feed['entries']]
            for e_id in ids:
                if e_id not in entries:
                    break
            else:
                # Every fetched entry is already recorded: we have caught up.
                print 'Retrieval is done.'
                break
            for entry in feed['entries']:
                if entry['id'] in entries:
                    continue
                if entry['service']['id'] == 'internal':
                    # Written directly on FriendFeed; don't delete.
                    entries[entry['id']] = (False, False)
                elif len(entry['comments']) + len(entry['likes']) == 0:
                    # No comments or likes: mark for deletion.
                    entries[entry['id']] = (True, False)
                else:
                    entries[entry['id']] = (False, False)
            print 'start=%d, entries=%d' % (start, len(entries))
            start += NUM
            data['entries'] = entries
            data['start'] = start
            data.sync()
            time.sleep(5)
        data['start'] = -1
        data.sync()
        marked = len([True for v in entries.values() if v[0]])
        total = len(entries)
        print '%d out of %d entries marked for deletion.' % (marked, total)
    data.close()


if __name__ == '__main__':
    main()
(I silenced print_eta() because I don't have exact rate limit information, so any ETA it gave would be meaningless.)
It's a two-stage design. The first run collects entries and marks the ones that should be deleted; you will only be asked for your FriendFeed nickname. The collected data is stored in a nickname.data file.
$ ./remove_lonely_entries.py
Your FriendFeed Username: livibetter
start=0, entries=100
start=100, entries=200
start=200, entries=300
start=300, entries=400
[...]
start=9600, entries=9691
start=9700, entries=9791
start=9800, entries=9891
start=9900, entries=9991
start=10000, entries=10091
Retrieval is done.
9430 out of 10091 entries marked for deletion.
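If you are curious about what ends up in the nickname.data file, it is a plain shelve database: 'start' is the resume offset and each entry id maps to a (marked_for_deletion, already_deleted) tuple. A quick sketch to peek at it (livibetter.data is just my file):

import shelve

data = shelve.open('livibetter.data')
entries = data.get('entries', {})
print 'start:', data.get('start')  # -1 means retrieval has finished
print 'marked:', len([1 for v in entries.values() if v[0]])
print 'deleted:', len([1 for v in entries.values() if v[1]])
data.close()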
The second stage deletes the entries; this time you will also be asked for your remote key. I still use API v1: API v2 uses OAuth, and I am not sure it supports three-legged OAuth. No need to trouble myself with that. Once you enter the key, deletion starts in five seconds, with one deletion request every 30 seconds. If the script gets a rate-limit-exceeded response, it sleeps for 10 minutes, then tries again.
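For the curious, the ff._fetch('/api/entry/delete', ...) call boils down to a single authenticated POST. Here is a standalone sketch using only the standard library; my understanding is that API v1 takes HTTP Basic authentication with your nickname and remote key, and ENTRY_ID and the key below are placeholders:

import base64
import urllib
import urllib2

nickname = 'livibetter'
remote_key = 'xxxxxxxx'  # placeholder; see http://friendfeed.com/remotekey
entry_id = 'ENTRY_ID'    # placeholder

req = urllib2.Request('http://friendfeed.com/api/entry/delete',
                      urllib.urlencode({'entry': entry_id}))
# Assumption: v1 Basic auth, nickname as user and remote key as password.
req.add_header('Authorization',
               'Basic ' + base64.b64encode('%s:%s' % (nickname, remote_key)))
print urllib2.urlopen(req).read()  # the script expects JSON with a 'success' field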
It should be safe to interrupt this script anytime you want; it will pick up where you forced it to leave off. There is no option to tell the script which stage to perform; it knows. Simply run it without options and follow the instructions, and it will be fine.
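The stage detection is nothing clever; it is just the start value persisted in the shelve file. Roughly:

start = data.get('start', 0)
if start == -1:
    pass  # retrieval finished on a previous run: go straight to deletion
else:
    pass  # resume retrieval from offset 'start'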
Meh, I still have 8,887 entries to delete!