Dumb questions

My profile page says I have 5584 journal entries posted and 26,095 comments received. I just ran ljdump and apparently it ran successfully. Does it create one file per journal entry and one file per comment? Because I only get 3570 files beginning with "L-" (presumably journal entries) and 4600 files beginning with "C-" (presumably comments).

bikergeek@linux:~/devel/ljdump-1.5.1/bikergeek$ ls L-* | wc -l
3570
bikergeek@linux:~/devel/ljdump-1.5.1/bikergeek$ ls C-* | wc -l
4600
bikergeek@linux:~/devel/ljdump-1.5.1/bikergeek$


How can I run some simple checks to make sure ljdump got everything?
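
One way to sanity-check a dump is to count what is actually on disk. The sketch below assumes (this is not confirmed anywhere in the post) that each `L-<id>` file holds one journal entry and that each `C-<id>` file holds all comments for one entry, stored as `<comment>` elements under a `<comments>` root. If that assumption holds, the number of `C-` files would be expected to be much smaller than the number of comments. The helper names `count_comments`, `missing_ids` and `summarize_dump` are made up for this example.

```python
import os
import re
import xml.etree.ElementTree as ET


def count_comments(xml_text):
    """Count <comment> elements in one C- file (assumed layout)."""
    root = ET.fromstring(xml_text)
    return sum(1 for _ in root.iter("comment"))


def missing_ids(ids):
    """Return the ids absent from the range min(ids)..max(ids)."""
    if not ids:
        return []
    present = set(ids)
    return [i for i in range(min(present), max(present) + 1) if i not in present]


def summarize_dump(directory):
    """Count entry files, comment files and comments in a dump directory."""
    entry_ids, comment_files, comments = [], 0, 0
    for name in sorted(os.listdir(directory)):
        match = re.fullmatch(r"([LC])-(\d+)", name)
        if not match:
            continue  # skip anything that is not an L-<n> or C-<n> file
        kind, number = match.group(1), int(match.group(2))
        if kind == "L":
            entry_ids.append(number)
        else:
            comment_files += 1
            with open(os.path.join(directory, name), "rb") as f:
                comments += count_comments(f.read())
    return {
        "entries": len(entry_ids),
        "comment_files": comment_files,
        "comments": comments,
        "entry_id_gaps": len(missing_ids(entry_ids)),
    }
```

Calling `summarize_dump("bikergeek")` from the ljdump directory would return the four counts as a dictionary. Gaps in the entry ids are not necessarily errors, since deleted entries may leave holes in the numbering.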
  • tcb

Comments broken again?

Hullo!

I made the changes calmingshoggoth suggested after ljdump barfed at entry 1000. It then processed all entries but was unable to grab the comments:

Fetching journal entry L-1494 (update)
Fetching journal comments for: tcb
*** Error fetching comment body, possibly not community maintainer?
*** not well-formed (invalid token): line 111, column 15
Fetching userpics for: tcb
1434 new entries, 0 new comments
Any ideas?
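
The error message points at line 111, column 15 of whatever the server returned. A small diagnostic sketch, assuming the raw response text is available as a string (for example, saved to a file while debugging), can show what sits at that position. It does not identify the cause; `locate_parse_error` is a hypothetical helper name.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def locate_parse_error(xml_text):
    """Return (line_number, column, offending_line), or None if the text parses."""
    try:
        xml.dom.minidom.parseString(xml_text)
    except ExpatError as err:
        lines = xml_text.split("\n")  # expat line numbers are 1-based
        line = lines[err.lineno - 1] if err.lineno <= len(lines) else ""
        return err.lineno, err.offset, line
    return None
```

Printing `repr()` of the returned line makes invisible characters, such as stray control bytes, easy to spot.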
  • aoeui21

How to dump another user's journal? Receiving "Don't have access to requested journal" error

hi.

I'm trying to dump another user's journal:
I specify user-name1 and password1 in "ljdump.config",
but set journal=otheruser2.

When I run ljdump.py, this error appears:

Fetching journal entries for: otheruser2
Traceback (most recent call last):
  File "./ljdump.py", line 368, in <module>
    ljdump(server, username, password, e.childNodes[0].data)
  File "./ljdump.py", line 175, in ljdump
    }, Password))
  File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1316, in single_request
    return self.parse_response(response)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1493, in parse_response
    return u.close()
  File "/usr/lib/python2.7/xmlrpclib.py", line 800, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault 300: "Client error:  Don't have access to requested journal: You don't have access">


Can ljdump dump another user's entries (I suppose only the public ones)?
I am using a freshly downloaded ljdump.py version 1.5.1 on Ubuntu 16.10.
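
The traceback ends in an XML-RPC `Fault` with code 300, which means the refusal comes from the server rather than from a crash inside ljdump. The sketch below does not grant any access; it only shows how such a fault might be caught and reported without a traceback. It is written for Python 3, where `xmlrpclib` is named `xmlrpc.client`, and `call_reporting_faults` is a hypothetical helper name.

```python
import xmlrpc.client


def call_reporting_faults(func, *args):
    """Call func(*args); on an XML-RPC Fault return (None, message) instead of raising."""
    try:
        return func(*args), None
    except xmlrpc.client.Fault as fault:
        message = "Server refused the request (fault %d): %s" % (
            fault.faultCode, fault.faultString.strip())
        return None, message
```

Wrapping the `getevents` call this way would let a script print the message and stop cleanly when the server refuses the request.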

I had some problems with ljdump version 1.5.1 and fixed them

The first problem was that LJ limits you to 1000 fetches per hour. I made the loop sleep for four seconds (60*60/1000 is 3.6, so I rounded up) between fetches and it doesn't seem to have that problem anymore.
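
The arithmetic and the sleep-between-fetches idea can be sketched as follows. This is an illustration in Python 3, not the poster's actual change; `min_interval` and `Throttle` are hypothetical names, and the clock and sleep functions are injectable only so the behaviour can be checked without waiting.

```python
import time


def min_interval(max_requests, window_seconds):
    """Smallest delay between requests that stays within the limit."""
    return window_seconds / max_requests


class Throttle:
    """Sleep as needed so calls to wait() are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

With `throttle = Throttle(4)`, calling `throttle.wait()` before each fetch keeps requests at least four seconds apart, slightly above the 3.6 seconds that `min_interval(1000, 60 * 60)` gives.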

Then I ran into a problem with comments. The first comment id in the journal I am backing up is in the mid four thousands. When ljdump fetched with an id of 1, it got back an empty set of comments (<comments></comments>). It then looped endlessly because it never changed the maxid. I changed the value in the .last file to 4000 and it fetched the first thousand or so comments, but then there was another big gap in the comment ids (they jumped to seven thousand something) and it again got stuck in an infinite loop.

Looking at the code I noticed that all of the ids are present in the comment.meta file, so I changed the code to grovel through that data structure instead of just blindly using maxid + 1 as the next id.

Here is the diff containing my changes:

[The diff was behind a "Read more" cut and is not included in this copy.]
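
The idea described in this post (walking the comment ids already known from the metadata instead of blindly trying maxid + 1) might look roughly like the sketch below. This is not the poster's diff. It assumes the known ids are available as a plain collection of integers, and `fetch_from` is a hypothetical callable that returns the list of comment ids the server sent back when asked to start at a given id.

```python
def next_start_id(known_ids, max_fetched):
    """Smallest known comment id above max_fetched, or None when nothing is left."""
    remaining = [i for i in known_ids if i > max_fetched]
    return min(remaining) if remaining else None


def fetch_all(known_ids, fetch_from, max_fetched=0):
    """Fetch comments across gaps in the id space without looping forever."""
    fetched = []
    while True:
        start = next_start_id(known_ids, max_fetched)
        if start is None:
            return fetched
        batch = fetch_from(start)
        if not batch:
            # Empty response for an id we expected: move past it instead of retrying.
            max_fetched = start
            continue
        fetched.extend(batch)
        max_fetched = max(max(batch), start)
```

Because `max_fetched` strictly increases on every pass, the loop ends even when the server returns an empty set for some starting id.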

Bug. String and number typing

I found a bug.

My post contains text with a "+", but in the XML created by your program the "+" is lost...

field

For example, the text of the post is "+887878"

but in the XML it becomes
887878

without the "+"!

Why?

I think the bug is in xmlrpclib,

because at this point in the function "dumpelement"

s = unicode(str(e[k]), "UTF-8")

e[k] is a NUMBER, not a string! And a number, of course, has no "+".

How can I fix this?

Thanks.
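
A possible explanation, offered as an assumption rather than something verified against LiveJournal's server: if a body that consists only of a sign and digits is sent as an XML-RPC `<int>`, the client decodes it to a Python integer and the "+" is gone before `dumpelement` ever sees it. The sketch below reproduces that with a hand-written response in Python 3, where `xmlrpclib` is named `xmlrpc.client`. `RESPONSE` and `decode_single_value` are made up for the example.

```python
import xmlrpc.client

# Hand-written response carrying the value as an <int> (assumed server behaviour).
RESPONSE = """<?xml version="1.0"?>
<methodResponse><params><param>
<value><int>+887878</int></value>
</param></params></methodResponse>"""


def decode_single_value(xml_text):
    """Decode an XML-RPC response and return its first parameter."""
    params, _method = xmlrpc.client.loads(xml_text)
    return params[0]


value = decode_single_value(RESPONSE)
print(type(value).__name__, value)  # prints: int 887878
```

If this is what happens, no conversion with `str()` on the client can restore the sign, which is why fetching the text by another route is one way around it.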

P.S.

As a temporary workaround, I wrote and tested code that downloads the entry through the editor page instead. It is also used when an entry contains an embed, so that the whole embed is saved instead of just a useless link.

print "Fetching journal entry %s (%s)" % (item['item'], item['action'])
try:
    e = server.LJ.XMLRPC.getevents(dochallenge(server, {
        'username': Username,
        'ver': 1,
        'selecttype': "one",
        'itemid': item['item'][2:],
        'usejournal': Journal,
    }, Password))
    if e['events']:

        #--------------added by EI 20140503
        # Assumes re, urllib2 and saxutils (from xml.sax) are imported, and that
        # authas and ljsession are already set elsewhere in ljdump.py.
        i = e['events'][0]['ditemid']
        tt = e['events'][0]['event']

        tt = unicode(str(tt), "UTF-8")

        # Re-fetch through the editor if the body has an lj-embed or is all digits.
        ro = re.compile('lj-embed', re.M | re.S | re.U)
        n_ro = re.compile(r'^\d+$', re.M | re.S | re.U)

        m = re.search(ro, tt)
        is_number = re.search(n_ro, tt)

        if m or is_number:
            r = urllib2.urlopen(urllib2.Request(Server+"/editjournal.bml?journal=%s&itemid=%d%s" % (Journal, i, authas), headers = {'Cookie': "ljsession="+ljsession}))
            meta = r.read()
            r.close()

            # Pull the raw entry text out of the editor's <textarea id="body">.
            ro = re.compile('<textarea[^>]+id="body"[^>]+>(.*?)</textarea>', re.M | re.S | re.U)
            m = re.search(ro, meta)
            if m:
                e['events'][0]['event'] = str(m.group(1))
                # unescape() handles &amp;, &lt; and &gt;; &quot; is mapped explicitly.
                e['events'][0]['event'] = saxutils.unescape(e['events'][0]['event'], {'&quot;':'"'})
        #-----------------
        writedump("%s/%s" % (Journal, item['item']), e['events'][0])
  • gyve

First attempt at Python 3 port

Here it goes: http://pastebin.com/Q35EmMyY

It skips this part (the pickle.dump calls are commented out) because of some sort of bytes/str serialization error that I don't understand:

f = codecs.open("%s/comment.meta" % Journal, "w", "UTF-8")
#pickle.dump(metacache, f)
f.close()

f = codecs.open("%s/user.map" % Journal, "w", "UTF-8")
#pickle.dump(usermap, f)
f.close()
  • Current Mood: exhausted
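
A likely cause of the bytes/str error, stated as an assumption since the port itself has not been tested here: in Python 3, `pickle.dump` writes bytes, while a file opened with `codecs.open(..., "w", "UTF-8")` expects text. Opening the file in binary mode avoids the mismatch. A minimal sketch, with `save_pickle` and `load_pickle` as hypothetical helper names:

```python
import pickle


def save_pickle(path, obj):
    """Write obj with pickle; the file must be opened in binary mode."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_pickle(path):
    """Read back an object written by save_pickle."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

In the port this would replace the commented-out calls with something like `save_pickle("%s/comment.meta" % Journal, metacache)` and `save_pickle("%s/user.map" % Journal, usermap)`.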

How can I view my journal locally?

Thanks for building this great tool. I've run it fine and seem to have downloaded everything. Now, how can I nicely view them?

I've flicked back through this forum and found a previous discussion of how non-l33t people like me might get a bit stuck, but there doesn't seem to be a nice answer there. It looks like I could import the whole thing into WordPress if I liked -- but I don't want to. I've moved on from this journal and just want to keep it for old time's sake.

So what's the best way to view or convert it? Thanks.
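
One low-tech option is to turn the dumped entries into a single HTML page that any browser can open. The sketch below assumes (not confirmed in this thread) that each `L-<id>` file is an XML document whose root contains `<subject>`, `<eventtime>` and `<event>` child elements, with the entry body in `<event>`. `entry_to_html` and `build_page` are hypothetical names, and comments are ignored.

```python
import html
import os
import re
import xml.etree.ElementTree as ET


def entry_to_html(xml_text):
    """Render one L- file as an HTML fragment (assumed element names)."""
    root = ET.fromstring(xml_text)
    subject = root.findtext("subject") or "(no subject)"
    when = root.findtext("eventtime") or ""
    body = root.findtext("event") or ""
    # The body is already HTML as posted, so it is inserted unescaped.
    return "<article><h2>%s</h2><p><small>%s</small></p><div>%s</div></article>" % (
        html.escape(subject), html.escape(when), body)


def build_page(directory, out_path):
    """Write all L-<number> files in directory to one HTML page, in id order."""
    names = [n for n in os.listdir(directory) if re.fullmatch(r"L-\d+", n)]
    names.sort(key=lambda n: int(n[2:]))
    parts = []
    for name in names:
        with open(os.path.join(directory, name), "rb") as f:
            parts.append(entry_to_html(f.read()))
    with open(out_path, "w", encoding="utf-8") as out:
        out.write("<!DOCTYPE html><html><head><meta charset='utf-8'>"
                  "<title>Journal archive</title></head><body>\n")
        out.write("\n<hr>\n".join(parts))
        out.write("\n</body></html>\n")
    return len(names)
```

Running `build_page("myjournal", "archive.html")` would write one page with every entry in id order. Because entry bodies are inserted as-is, this is only suitable for viewing your own journal locally.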