Lefora Free Forum

Threading/Blocking and avoiding the Beachball

regular - member
54 posts

I have been tinkering with the little Cocoa I have learned so far. I have put together a simple project that goes through a list of URLs one by one, puts each URL into the "currently processing" text field, gets the HTML for it, and makes sure there is no "error" text (post not found, article doesn't exist, etc.). That's it for now and it works... kind of. A short list seems to work. A long list does what it should, but the dreaded beachball appears. Also, the text field is not updated for each URL; it is only updated after the last URL is finished. I have done a bit of research and found that my threading or blocking may be incorrect?

I am not looking for a specific answer to my problem here, and there is way too much code to post. I am not looking for you guys to code this for me. All in all it is a throwaway project, or at best it will need to be recoded. What I am looking for is what I need to research more to avoid this problem. Maybe some common errors that cause this?
Here's basically what's going on:

1. button pressed sends 'startDownloading' to appController.
2. startDownloading has a loop inside that does the rest.

I have created a class to hold and parse the HTML and return bits of it.



Todd

superstar - member
233 posts

Todd

I'll happily look at this for you next week (if I don't go to England).  Can you zip your code and send it to me?  I won't steal your code - honest - I'm only trying to help.

I'm rather busy during the next few days with open source stuff which I have promised. Once I'm clear of that, I'll be happy to look at your stuff.

__________________
superstar - member
233 posts




Todd

Can I validate the specification here with you, please?  I've written a little script:

#!/bin/bash

count=0
while read -r line
do
    count=$((count+1))
    url="http://clanmills.com/${line}"
    echo url = $url lines = $(curl --silent "$url" | wc -l)
done < "$1"
echo "processed : $count files"

which reads the file:

BoydWedding.html
NicholasBirthday.html
arizona.html
dennis.html
eventsevents.html
gps.html
gpsfoo.html
popup.html
rollovermap.html
rollovermap1.html
statesonly.html
utah.html

and produces the output:

543 /home/rmills/temp$ readem.sh readem.txt
url = http://clanmills.com/BoydWedding.html lines = 10
url = http://clanmills.com/NicholasBirthday.html lines = 10
url = http://clanmills.com/arizona.html lines = 14
url = http://clanmills.com/dennis.html lines = 28
url = http://clanmills.com/eventsevents.html lines = 509
url = http://clanmills.com/gps.html lines = 83
url = http://clanmills.com/gpsfoo.html lines = 280
url = http://clanmills.com/popup.html lines = 32
url = http://clanmills.com/rollovermap.html lines = 104
url = http://clanmills.com/rollovermap1.html lines = 104
url = http://clanmills.com/statesonly.html lines = 10
url = http://clanmills.com/utah.html lines = 10
processed : 12 files

So, it's reading the names of files from readem.txt and building URLs which it then gets using curl. For simplicity, all I'm doing is a simple line count on each file as proof that all the files are different.
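
The line count stands in for the real check described in the first post (looking for "error" text in each page). A minimal sketch of that check in the same bash style might look like this. The function name has_error_text and the marker phrases are assumptions taken from the examples in the first post, not from Todd's code, and the sample pages are made up.

```shell
#!/bin/bash

# has_error_text: hypothetical helper, reads HTML on stdin.
# Exit status 0 if any marker phrase is present (case-insensitive), 1 otherwise.
# The marker phrases are assumptions based on the examples in the first post.
has_error_text() {
    grep -q -i -e 'post not found' -e "article doesn't exist"
}

# Made-up pages standing in for what curl would return.
good_page='<html><body><p>Hello</p></body></html>'
bad_page='<html><body><p>Post Not Found</p></body></html>'

for page in "$good_page" "$bad_page"; do
    if printf '%s\n' "$page" | has_error_text; then
        echo "error text found"
    else
        echo "page looks ok"
    fi
done
```

Inside the read loop, the same helper could take the place of the line count, for example: curl --silent "$url" | has_error_text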

I've written a command-line tool version of this in Obj/C++ (code below) and here's the output:

2010-03-14 20:57:28.599 readem[25593:a0f] http://clanmills.com/BoydWedding.html lines = 11
2010-03-14 20:57:28.697 readem[25593:a0f] http://clanmills.com/NicholasBirthday.html lines = 11
2010-03-14 20:57:28.794 readem[25593:a0f] http://clanmills.com/arizona.html lines = 15
2010-03-14 20:57:28.890 readem[25593:a0f] http://clanmills.com/dennis.html lines = 29
2010-03-14 20:57:29.091 readem[25593:a0f] http://clanmills.com/eventsevents.html lines = 510
2010-03-14 20:57:29.191 readem[25593:a0f] http://clanmills.com/gps.html lines = 84
2010-03-14 20:57:29.300 readem[25593:a0f] http://clanmills.com/gpsfoo.html lines = 281
2010-03-14 20:57:29.397 readem[25593:a0f] http://clanmills.com/popup.html lines = 33
2010-03-14 20:57:29.498 readem[25593:a0f] http://clanmills.com/rollovermap.html lines = 105
2010-03-14 20:57:29.601 readem[25593:a0f] http://clanmills.com/rollovermap1.html lines = 105
2010-03-14 20:57:29.697 readem[25593:a0f] http://clanmills.com/statesonly.html lines = 11
2010-03-14 20:57:29.793 readem[25593:a0f] http://clanmills.com/utah.html lines = 11
2010-03-14 20:57:29.984 readem[25593:a0f] http://clanmills.com/  lines = 212
2010-03-14 20:57:29.984 readem[25593:a0f] processed = 13 files

Let's ignore that the arithmetic isn't quite right: the Obj/C++ version seems to count an extra line per file (and reports 13 files instead of 12). Let's not worry about those details, which have to do with a trailing blank line both in readem.txt and in the text pulled from the internet.
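
The off-by-one can be reproduced without any network access. wc -l counts newline characters, whereas splitting a string on "\n" (which is what componentsSeparatedByString: does) returns one more component than there are separators, so text that ends in a newline gains an empty last component. A small sketch, where the sample text and the helper name split_count are made up for illustration:

```shell
#!/bin/bash

# Hypothetical sample: two lines of text, ending with a newline.
sample=$'first\nsecond\n'

# wc -l counts newline characters (tr strips the padding some wc versions add).
wc_lines=$(printf '%s' "$sample" | wc -l | tr -d ' ')

# split_count: hypothetical helper that splits its argument on "\n" with awk
# and prints the number of components, including a trailing empty one.
split_count() {
    awk 'BEGIN { print split(ARGV[1], parts, "\n") }' "$1"
}

echo "wc -l counts : $wc_lines"                  # 2
echo "split counts : $(split_count "$sample")"   # 3
```

The same effect on the list file itself would explain the extra 13th "file": the empty last component becomes the bare URL http://clanmills.com/.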

Just for grins, I duplicated all the lines in readem.txt (so there were 2600 URLs to be read).  It ran in about 2 seconds.  Kind of amazing, isn't it?  In fact, the console had the comment "*** process 12345 exceeded 500 messages per second limit, remaining messages this second discarded ***".
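
For anyone wanting to repeat the experiment, a longer list can be built by writing the input file out several times. This is only a sketch: the helper name repeat_file, the output file name and the repeat count of 200 are all assumptions, since the post doesn't say how the duplication was done.

```shell
#!/bin/bash

# repeat_file: hypothetical helper that writes FILE to stdout COUNT times.
# usage: repeat_file FILE COUNT
repeat_file() {
    local file=$1 count=$2 i
    for (( i = 0; i < count; i++ )); do
        cat "$file"
    done
}

# Example (the file names and the count of 200 are assumptions):
#   repeat_file readem.txt 200 > readem-long.txt
#   time ./readem.sh readem-long.txt
```
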
Obviously we can put a GUI on top of this little program, with a button to select the input file, a scrolling view gizmo to display the output, and so on. And maybe you could double-click on an output line and see the file... and all the other stuff that makes UI programming both fun and never-ending.
Can you confirm that I've understood the task? How long is a "long list" (100s, 1000s, 1,000,000s)?
I know you're using Xcode 2.4 and don't have the garbage collector stuff (which was thrown into my code by the Wizard in Xcode 3.2). However, that's a distraction from understanding the mission and the issue at hand.
If you're reading a lot of very long files from the internet, this could take rather a long time, and if you're not releasing the objects correctly, you could be eating up the computer's memory, which will probably cause the beachball to show up.
Robin

Here's the code:


#import <Foundation/Foundation.h>

// Count the components produced by splitting on "\n".
// Text that ends in a newline yields one more component than wc -l reports.
static NSUInteger lineCount(NSString* s)
{
    NSArray* listItems = [s componentsSeparatedByString:@"\n"];
    return [listItems count];
}

int main (int argc, const char * argv[])
{
    if ( argc != 2 ) {
        fprintf(stderr, "syntax: readem file\n");
        return 1;
    }

    NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
    NSStringEncoding encoding;
    NSError* error = nil;
    NSUInteger urlCount = 0;

    // get the lines from the input file
    NSString* fileName = [NSString stringWithUTF8String:argv[1]];
    NSString* fileString = [NSString
        stringWithContentsOfFile : fileName
        usedEncoding : &encoding
        error : &error
    ];
    if ( fileString == nil ) {
        NSLog(@"cannot read %@: %@", fileName, error);
        [pool drain];
        return 1;
    }
    NSArray* lines = [fileString componentsSeparatedByString:@"\n"];

    // run over the lines, convert them to URLs and read them from clanmills.com
    // (a trailing newline in the file produces one empty line, hence one extra URL)
    NSUInteger i;
    for ( i = 0 ; i < [lines count] ; i++ ) {
        urlCount++;
        NSString* urlString = [NSString stringWithFormat:@"http://clanmills.com/%@",[lines objectAtIndex:i]];
        NSURL* url = [NSURL URLWithString:urlString];

        NSString* s = [NSString
            stringWithContentsOfURL : url
            usedEncoding : &encoding
            error : &error
        ];
        // s is nil if the fetch fails; messaging nil then gives a count of 0
        NSLog(@"%@ lines = %lu", urlString, (unsigned long)lineCount(s));
    }
    NSLog(@"processed = %lu files", (unsigned long)urlCount);
    [pool drain];
    return 0;
}






__________________
regular - member
54 posts

Robin,
Thank you for the reply. I have been sick and didn't want to leave you hanging. I have a ton of stuff to catch up on, so I won't be able to read and comprehend your post until later today, although it does seem like you have the idea. A long list will likely be in the low 100s. I also got a copy of Hillegass's book and realized I could not subject you to my code... yet. I think I am doing things the hard way and not fully embracing Cocoa.

Thanks again,
Todd

superstar - member
233 posts

You've got the book.  That's the best step you can make in the right direction.

No matter the state of your code, I'd deal with it and never pass judgement. Please let me know if you need help; however, you can take as long as you like.

I was rather surprised by the speed of the code I wrote. I think it must simply be pulling the files from the cache: I can't believe it does about 1000 gets/second. However, I'm busy with some open source stuff and I might remember to revisit this. More likely, I'll forget.

However if you ask for help, I'll most certainly help you.

__________________